# Numpy I

## Arrays and Lists

Before we introduce the NumPy library, we need to be familiar with a widely used data structure called an *array*. An **array** is a collection of homogeneous elements, that is, elements of the same data type. An array can therefore be a collection of integers (*int* data type), a collection of fractional/decimal values (*float* data type), or a collection of characters (*char* data type); a sequence of characters is also referred to as a *string*.

A **list** is very similar to an array in structure. It differs only in that a list can hold heterogeneous elements, i.e. elements of various data types.

<img src="../images/arrayvslist.png" width="800"><br>

Elements of both arrays and lists are accessed in the same way, using an index, which refers to the position of the element within the collection. Indices begin at zero and end at n-1, where n is the length of the collection (the number of elements in the array or list).
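
As a small illustration of zero-based indexing, the sketch below reads the first and last elements of a list and of a NumPy array. The variable names (`my_list`, `my_array`, `first_item`, `last_item`) are made up for this example.

```python
import numpy as np

# A list may mix data types; an array holds a single data type.
my_list = [10, "twenty", 30.0]
my_array = np.array([10, 20, 30])

n = len(my_array)            # number of elements, here 3

first_item = my_list[0]      # index 0 is the first element
last_item = my_array[n - 1]  # index n-1 is the last element

print(first_item, last_item)
```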

## 1. NumPy Introduction

NumPy is a Python library that supports creating and operating on large, multidimensional arrays and matrices. It facilitates scientific and numeric computing with high-level mathematical functions that operate on these arrays.

A NumPy object is generally a multi-dimensional array. It is a table of elements with the same data type (int, float, char, etc.), indexed by a tuple of non-negative integers (also called *indices*). In NumPy, dimensions are called *axes*. The number of axes is the *rank*. (ref: scipy.org)

Consider a 3D space with x, y and z coordinates. A point in 3D space, [1.0, 3.0, 5.0], is an array of rank 1. If there are several such points, as shown below, they form an array with dimensions m by n. In the example below there are 4 rows and 3 columns (i.e., 3 values for each point/observation), which translates to a 2-dimensional array (rank 2) with a shape of 4 by 3.
```python
[[ 1.0, 0.3, 4.5],
 [ 0.5, 1.5, 2.3],
 [ 6.0, 4.6, 3.5],
 [ 4.5, 3.5, 6.3]]
```
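
The rank and shape described above can be checked directly. This is a minimal sketch that assumes the `ndim` and `shape` attributes of a NumPy array; the names `point` and `points` are chosen only for illustration.

```python
import numpy as np

# A single point in 3D space: one axis, three values.
point = np.array([1.0, 3.0, 5.0])

# Four points stacked as rows: two axes, 4 rows by 3 columns.
points = np.array([[1.0, 0.3, 4.5],
                   [0.5, 1.5, 2.3],
                   [6.0, 4.6, 3.5],
                   [4.5, 3.5, 6.3]])

print(point.ndim, point.shape)    # 1 (3,)
print(points.ndim, points.shape)  # 2 (4, 3)
```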
To use a library, we first need to import it. Any Python library can be referenced by an alias given in the import statement. For example, the NumPy library is most commonly imported under the short name np:
```python
# Importing the numpy library
import numpy as np
```

There are several ways to initialize an array in NumPy:
* a = np.array([0,1,2,3])   ...   creates an array of rank 1
* a = np.array([[0,1,2,3],[4,5,6,7]])   ...   creates a 2x4 matrix  
* a = np.ones((3,3))   ...   creates a 3x3 matrix with all 1s
* a = np.zeros((2,2))   ...   creates a 2x2 matrix with all 0s
* a = np.eye(3)   ...   creates a 3x3 matrix with 1s at the diagonal and 0s otherwise (i.e. an identity matrix)
  
Ref: http://www.numpy.org
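
The initializers listed above can be run together and their shapes compared. The sketch below uses illustrative variable names and prints only the shape of each result.

```python
import numpy as np

rank_one = np.array([0, 1, 2, 3])                  # rank 1, four elements
matrix = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])    # 2 rows, 4 columns
ones = np.ones((3, 3))                             # 3x3, all 1s
zeros = np.zeros((2, 2))                           # 2x2, all 0s
identity = np.eye(3)                               # 3x3 identity matrix

for name, arr in [("rank_one", rank_one), ("matrix", matrix),
                  ("ones", ones), ("zeros", zeros), ("identity", identity)]:
    print(name, arr.shape)
```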

We will look at the most widely used functions in the NumPy library and practice applying them.

## 2. Casting a non-array datatype into an array

The np.array() function can be used to convert a value of another data type into an n-dimensional array. By passing the variable as an argument to this function, we can convert an integer, a float, a list, or a pandas Series or DataFrame into an n-dimensional array.

For example:

```python
a = 5
type(a)
>>> int

a = np.array(a)
type(a)
>>> numpy.ndarray

b = [1.2, 3.4, 5.6]
type(b)
>>> list

b = np.array(b)
type(b)
>>> numpy.ndarray
```

### Exercise:

Given the list below, use the np.array() function to cast it into a NumPy array.


In [1]:
import numpy as np

# Cast the below given list into an array
a = ['Hello',4,5,17.32,25.21,'c']

# hint

Use np.array() method and pass the list as an argument.

In [2]:
a = np.array(a)
type(a)

numpy.ndarray

In [3]:
from refactored import unittest

ref_tmp_var = False

ref_tmp_var = unittest.test_value(np.array(a))

assert ref_tmp_var

<br>
### Wait! The above list was heterogeneous...

Remember that an array is a collection of homogeneous elements. However, in the above exercise the list 'a' was a collection of heterogeneous elements (values of different data types). The np.array() function worked anyway and converted the list into an array. It achieves this by converting all the list elements into strings. Another behavior to note is that, because the array holds a single data type, an individual element cannot be re-cast to an integer, float or other data type: any value assigned back into the array is converted to a string again.

That is, in the above example:

```python
print(a[2],type(a[2]))
>>> 5 <class 'numpy.str_'>

# Casting to 'int'
a[2] = int(a[2])
print(a[2],type(a[2]))
>>> 5 <class 'numpy.str_'>
```
Recasting an individual element does not change its data type within the array.
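
One way to get numbers back out of such a string array is to convert a whole slice at once rather than one element at a time. The sketch below assumes the `astype()` method of NumPy arrays; the names `mixed` and `numbers` are illustrative.

```python
import numpy as np

mixed = np.array(['Hello', 4, 5, 17.32, 25.21, 'c'])
print(mixed.dtype.kind)   # 'U' marks a Unicode string data type

# Assigning an int into the array converts it straight back to a string.
mixed[2] = int(mixed[2])
print(type(mixed[2]))

# Convert the numeric-looking entries together; this creates a new float array.
numbers = mixed[1:5].astype(float)
print(numbers, numbers.dtype)
```

Note that `astype()` returns a new array and leaves `mixed` unchanged. Converting an entry such as 'Hello' to float would raise an error, which is why only the slice of numeric-looking entries is converted here.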

## 3. Casting a list of lists into a 2-dimensional array

A list of lists can be converted into a 2-D array (or matrix) using the np.array() function.

```python
my_list2 = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
list_of_lists = np.array(my_list2)
print(list_of_lists,"\n",type(list_of_lists))
>>> [[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]
  [10 11 12]]
  <class 'numpy.ndarray'>
```

## 4. Checking the shape and data type of a numpy array

The shape of an array is the length of each of its dimensions. <br/>
For a 2-dimensional array, the shape is (number of rows, number of columns), denoted 'm x n' and read as 'm by n'.

<img src="../images/numpy_1-shape_of_array.png">

The shape attribute gives the dimensions of the array.

```python
print(list_of_lists.shape)
>>> (4, 3)
```

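The data type of the elements in an array can be identified with the 'dtype' attribute. The default integer type depends on the platform and NumPy version, so the output below may read int64 instead of int32.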

```python
my_list2 = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
list_of_lists = np.array(my_list2)
print(list_of_lists.dtype)
>>> int32
```
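
If a specific element type is needed regardless of platform, it can be requested when the array is created. This sketch assumes the `dtype` keyword argument of np.array(); the variable names are illustrative.

```python
import numpy as np

values = [[1, 2, 3], [4, 5, 6]]

# Request the element type explicitly instead of relying on the default.
as_int64 = np.array(values, dtype=np.int64)
as_float = np.array(values, dtype=np.float64)

print(as_int64.shape, as_int64.dtype)   # (2, 3) int64
print(as_float.dtype)                   # float64
print(as_int64.ndim, as_int64.size)     # 2 6
```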

### Exercise:

Find the shape of the array below, assign it to the variable two_d_shape, and print it.
Also find the data type of the elements in the array, assign it to the variable two_d_datatype, and print it.

```python
[[1, 0.3, 4.5]
 [2, 4.0, 6.0]]
```

In [4]:
import numpy as np

two_d_array = np.array([[ 1 ,  0.3,  4.5],
                        [2, 4.0, 6.0]])

# hint
<p>Use array_name.shape and array_name.dtype (both are attributes, so no parentheses are needed)</p>

In [5]:
two_d_shape = two_d_array.shape
two_d_datatype = two_d_array.dtype
print(two_d_shape,two_d_datatype)

(2, 3) float64


In [6]:
ref_tmp_var_1 = False

ref_tmp_var_1 = unittest.test_value(two_d_array.shape) and unittest.test_value(two_d_array.dtype)

assert ref_tmp_var_1


## 5. NumPy Array Operations
 
<br>
In this lesson we shall study various operations on numpy arrays. 

#### 5.1 Addition of two arrays

To add two arrays of the same shape, use np.add or the + operator:
```python
C = np.add(A, B)      # or equivalently: C = A + B
```

#### 5.2 Subtraction of two arrays

To subtract one array from another, use np.subtract or the - operator:
```python
D = np.subtract(A, B)  # or equivalently: D = A - B
```
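
A runnable sketch of element-wise addition and subtraction follows. The arrays `A` and `B` are example values chosen for illustration; both have shape (2, 2).

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

C = np.add(A, B)        # element-wise sum
D = np.subtract(B, A)   # element-wise difference

print(C)
print(D)

# The operators give the same results as the functions.
print(np.array_equal(C, A + B), np.array_equal(D, B - A))
```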

#### 5.3 Multiplication

Using __np.multiply__ (or the __* symbol__) you can

> 1) multiply the elements of an array by a constant
```python
K = 10
Y = np.multiply(K, [1, 2, 3, 4, 5])   # or equivalently: Y = K * np.array([1, 2, 3, 4, 5])

>>> array([10, 20, 30, 40, 50])
```
> 2) multiply an array by another array, which performs element-wise multiplication (the arrays must have the same shape, or shapes that NumPy can broadcast together)
``` python
A = np.array([1,2,3])
B = np.array([4,5,6])
C = np.multiply(A, B)                 # or equivalently: C = A * B

>>> array([ 4, 10, 18])
```

Using __np.dot__ you can perform

> 3) the dot product of two arrays, which for 1-D arrays is the sum of the products of corresponding elements (the arrays must have the same length)
```python
A = np.array([1,2,3])
B = np.array([4,5,6])
C = np.dot(A, B)

>>> 32
```
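
To connect the dot product with element-wise multiplication, the sketch below computes it both ways for the same two 1-D arrays. The names `elementwise`, `manual_dot` and `library_dot` are illustrative.

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

elementwise = A * B               # [ 4 10 18]
manual_dot = elementwise.sum()    # 4 + 10 + 18 = 32
library_dot = np.dot(A, B)        # 32

print(manual_dot, library_dot)
```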

Using __np.matmul__ you can perform

> 4) matrix multiplication (only if the number of columns in the first matrix equals the number of rows in the second)
``` python
A = [[1, 0], [0, 1]]
B = [[4, 1], [2, 2]]
C = np.matmul(A, B)

>>> array([[4, 1],
       [2, 2]])
```
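
The shape rule is easier to see with non-square matrices. In this sketch a (2, 3) matrix is multiplied by a (3, 2) matrix, and a mismatched pair is shown to fail; the `@` operator is used as an alternative spelling of np.matmul. The example values are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # shape (2, 3)
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])       # shape (3, 2)

C = np.matmul(A, B)          # columns of A (3) match rows of B (3)
print(C.shape)               # (2, 2)
print(np.array_equal(C, A @ B))

# A (2, 3) matrix cannot be multiplied by another (2, 3) matrix.
try:
    np.matmul(A, A)
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)
```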

#### Other Common Operations

Other common operations, such as the square root and the exponential function, are available as NumPy functions that are applied to every element of the array, under names familiar from most other languages:

```python
np.sqrt(B), np.exp(A)
```
As this lesson shows, most array operations needed in data science are available in NumPy under the same names used in other languages.
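
A short sketch of these element-wise functions follows, using example arrays chosen so the square roots are whole numbers.

```python
import numpy as np

A = np.array([0.0, 1.0, 2.0])
B = np.array([1.0, 4.0, 9.0])

roots = np.sqrt(B)   # square root of every element: 1, 2, 3
exps = np.exp(A)     # e**0, e**1, e**2

print(roots)
print(exps)
```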


### Exercise:

Given two arrays:
```python
A = [1, 2, 3, 4]
B = [2, 3, 4, 5]
```

- Initialize the above arrays as variables A and B.
- Perform a dot product of the two vectors and assign it to the variable C.
- Print C.

In [7]:
import numpy as np


<p>Create A and B with np.array().</p>

<p>Then pass both arrays to the <b>np.dot</b> function and assign the result to C.</p>

In [8]:
A = np.array([1, 2, 3, 4])
B = np.array([2, 3, 4, 5])

C=np.dot(A,B)
C

40

In [9]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(np.dot(A,B))

assert ref_tmp_var

## 6. Numpy special functions

### 6.1 arange() function

Python has a built-in function to generate a sequence of values: **range**(lower, upper, increment). It starts at the lower value and steps by the increment, up to the upper value (exclusive). <br>
NumPy's **arange**(lower, upper, increment) function works the same way, except that its output is an array containing the generated values.

For example:
```python
np.arange(1,5)
>>> array([1, 2, 3, 4])

np.arange(1,11,2)
>>> array([1, 3, 5, 7, 9])

np.arange(11,1,-2)
>>> array([11,  9,  7,  5,  3])
```
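
One practical difference is that np.arange accepts a fractional increment, while the built-in range accepts only integers. The sketch below illustrates this with made-up values.

```python
import numpy as np

as_list = list(range(0, 10, 2))      # built-in range, turned into a list
as_array = np.arange(0, 10, 2)       # NumPy array with the same values

fractional = np.arange(0, 1, 0.25)   # steps of 0.25, upper limit excluded

# The built-in range rejects a float increment with a TypeError.
try:
    range(0, 1, 0.25)
    range_accepts_floats = True
except TypeError:
    range_accepts_floats = False

print(as_list, as_array, fractional, range_accepts_floats)
```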

### Exercise

Create 2 arrays with values 1,2,3,4 and 5.
* first array using **range()** function
* second array using **arange()** function

In [10]:
array_one = []
array_two = []

# hint

Refer to the examples given.
* To create an array with **range** function generate a list and cast it into an np.array(). 
* To create an array with **arange** function use the np.arange() function.

In [11]:
array_one = []
for i in range(1,6):
    array_one.append(i)
array_one = np.array(array_one)
array_two = np.arange(1,6)

print(array_one, array_two)

[1 2 3 4 5] [1 2 3 4 5]


In [12]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(np.array([x for x in range(1,6)])) and unittest.test_value(np.arange(1,6)) 

assert ref_tmp_var

### 6.2 linspace() function

The linear space function creates an array of values that are equally spaced within specified limits. The function accepts a lower limit, an upper limit and the length of the array (say 'n'), and generates _'n'_ equally spaced elements from the lower limit to the upper limit (inclusive).

```python
np.linspace(1,5,5)
>>> array([ 1.,  2.,  3.,  4.,  5.])

np.linspace(0,1,10)
>>> array([ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
         0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ])

np.linspace(-6,6,5)
>>> array([-6., -3.,  0.,  3.,  6.])
```
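
Because both limits are included, the spacing between neighbouring values is (upper - lower) / (n - 1). The sketch below checks this using the `retstep=True` option of np.linspace, which returns the spacing along with the array; the limits used are example values.

```python
import numpy as np

lower, upper, n = 0, 10, 5

# n values including both ends leave n - 1 gaps between them.
expected_step = (upper - lower) / (n - 1)

values, step = np.linspace(lower, upper, n, retstep=True)

print(values)
print(step, expected_step)
```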

### Exercise

Initialize and print:
* a linearly spaced array with 5 values between 5 and 50

In [13]:
# Modify code below

lin_arr = []

# hint

Refer to the examples given.

In [14]:
lin_arr = np.linspace(5,50,5)
lin_arr

array([ 5.  , 16.25, 27.5 , 38.75, 50.  ])

In [15]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(np.linspace(5,50,5))

assert ref_tmp_var

### 6.3 Zeros, Ones and Eye

The **np.zeros()** and **np.ones()** functions are used to create n-dimensional arrays with all elements as zeros or ones respectively. Such arrays are useful in many numeric and mathematical operations, for example as starting values that are filled in later. The functions take the shape of the array to be created, given as a tuple, as the argument.

```python
np.zeros((1,5))
>>> array([[ 0.,  0.,  0.,  0.,  0.]])

np.ones((2,4))
>>> array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])
```

The **np.eye()** function creates a square n x n matrix in which all diagonal elements are ones and all off-diagonal elements are zeros. In mathematics, such a matrix is called the "Identity Matrix", because the matrix product of any matrix A and the appropriately sized identity matrix is always A itself.

``` python
np.eye(2)
>>> array([[1., 0.],
       [0., 1.]])
```
<img src="../images/numpy_1-zeroes_ones_eye.png">
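
The identity property can be verified with np.matmul. This is a minimal sketch with an example 2x2 matrix; the names `left` and `right` are illustrative.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [5.0, 7.0]])
I = np.eye(2)

left = np.matmul(I, A)    # identity on the left
right = np.matmul(A, I)   # identity on the right

print(np.array_equal(left, A), np.array_equal(right, A))
```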


### Exercise

Initialize three arrays:
* A zeros array of shape (3,3)
* A ones array of shape (4,4)
* A 3x3 Identity Matrix

In [16]:
# Modify code below

zeros_arr = []
ones_arr = []
eye_mat = []

# hint

Refer to the examples given.

In [17]:
zeros_arr = np.zeros((3,3))
ones_arr = np.ones((4,4))
eye_mat = np.eye(3)

In [18]:
ref_tmp_var = False

ref_tmp_var = unittest.test_value(np.zeros((3,3))) and unittest.test_value(np.ones((4,4))) and unittest.test_value(np.eye(3))

assert ref_tmp_var