# Introduction to NumPy


<div align="center"><img src="https://raw.githubusercontent.com/eitanlees/ISC-3313/master/Lectures/Week-04/images/automobile.gif" width="900"/></div>

By now you should be familiar with using the standard list object. 

Lists are very general and can hold heterogeneous data.

For example:

In [1]:
L = [1, 3.14, 'Hello', False]

While this generality is useful in some applications, in scientific computing we often want lists whose elements all share the same data type, usually numbers.

In [1]:
data = [1, 2, 3, 4]

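For homogeneous numerical data like this, we often want to apply a mathematical function to every element at once.

For example, suppose we want to take the `sin` of every element of our `data` list. Passing the list directly to `math.sin` raises a `TypeError`, because `math.sin` expects a single number: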

In [33]:
import math

# math.sin(data)  # raises TypeError: math.sin expects a single number, not a list

We are forced to loop over every element of the list:

In [3]:
new_data = []
for element in data:
    new_data.append(math.sin(element)) 
    
print(new_data)

[0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282]


Wouldn't it be nice if we could perform operations on each element of a list without having to loop?

NumPy addresses this issue.

In [5]:
import numpy as np

type(np.sin(data))

numpy.ndarray
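As a quick check, the sketch below compares the loop-based result with a single call to `np.sin`. The names `looped` and `vectorized` are illustrative and do not appear in the original notebook.

```python
import math
import numpy as np

data = [1, 2, 3, 4]

# Loop-based version: apply math.sin to one element at a time
looped = [math.sin(element) for element in data]

# Vectorized version: np.sin accepts the whole list and returns an ndarray
vectorized = np.sin(data)

# Both approaches should agree to floating-point precision
print(np.allclose(looped, vectorized))
```

The vectorized call returns an array with one entry per input element, so no explicit loop is needed.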

NumPy (short for Numerical Python) provides an efficient interface to store and operate on dense data buffers. 

In some ways, NumPy arrays are like Python's built-in list type, but NumPy arrays provide much more efficient storage and data operations as the arrays grow larger in size. 

NumPy arrays form the core of nearly the entire ecosystem of scientific computing tools in Python.

## Creating Arrays from Python Lists

First, we can use ``np.array`` to create arrays from Python lists:

In [5]:
# integer array:
np.array([1, 4, 2, 5, 3]) 

array([1, 4, 2, 5, 3])

Remember that, unlike Python lists, NumPy arrays are constrained to hold elements of a single type.

If the types do not match, NumPy will upcast if possible (here, integers are upcast to floating point):

In [6]:
np.array([3.14, 4, 2, 3])

array([3.14, 4.  , 2.  , 3.  ])
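One way to see the upcasting is to inspect the `dtype` attribute of each array. This is a minimal sketch; the names `ints` and `mixed` are illustrative, and the exact integer width reported depends on the platform.

```python
import numpy as np

ints = np.array([1, 4, 2, 5, 3])
mixed = np.array([3.14, 4, 2, 3])

# dtype reports the single element type shared by the whole array
print(ints.dtype)   # an integer type; the width depends on the platform
print(mixed.dtype)  # a floating-point type, because of the 3.14 entry

# np.result_type reports the type NumPy would choose when combining two types
print(np.result_type(ints.dtype, mixed.dtype))
```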

If we want to explicitly set the data type of the resulting array, we can use the ``dtype`` keyword:

In [7]:
np.array([1, 2, 3, 4], dtype='float32')

array([1., 2., 3., 4.], dtype=float32)

There are many different data types listed in the [NumPy docs](https://docs.scipy.org/doc/numpy/user/basics.types.html).

Finally, unlike Python lists, NumPy arrays can explicitly be multi-dimensional; here's one way of initializing a multidimensional array using a list of lists:

In [8]:
# nested lists result in multi-dimensional arrays
x = np.array([[1, 2, 3], 
              [4, 5, 6]])
x

array([[1, 2, 3],
       [4, 5, 6]])

Each array has attributes:
- ``ndim`` (the number of dimensions) 
- ``shape`` (the size of each dimension)
- ``size`` (the total size of the array):

In [9]:
print("x ndim: ", x.ndim)
print("x shape:", x.shape)
print("x size: ", x.size)

x ndim:  2
x shape: (2, 3)
x size:  6
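The three attributes are related: `ndim` is the length of `shape`, and `size` is the product of the entries of `shape`. The sketch below checks this on a 3-dimensional array built from nested lists; the array `y` is an illustrative example, not part of the original notebook.

```python
import numpy as np

# Three levels of nesting give a 3-dimensional array:
# 2 blocks, each with 2 rows and 3 columns
y = np.array([[[1, 2, 3], [4, 5, 6]],
              [[7, 8, 9], [10, 11, 12]]])

print("y ndim: ", y.ndim)
print("y shape:", y.shape)
print("y size: ", y.size)

# ndim is the length of shape; size is the product of its entries
print(y.ndim == len(y.shape))
print(y.size == np.prod(y.shape))
```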


NumPy arrays can be multidimensional.

<div align="center"><img src="https://raw.githubusercontent.com/eitanlees/ISC-3313/master/Lectures/Week-04/images/nd_array.png" width="900"/></div>

Like lists and strings, NumPy arrays support indexing and slicing.

In [22]:
a = np.array([1,2.0,3.2])
a[0] # first entry in a

1.0

In [10]:
b = np.array([[1, 2, 3.0], 
              [1.2, 2.2, 2]])
b[1] # second entry in b, each entry is a row

array([1.2, 2.2, 2. ])

When we have a multidimensional array, we can access the element at row `i` and column `j` by indexing twice, first by row and then within that row:

In [103]:
b[1][0] # first element in the second row of b

1.2

Or, equivalently, with a single comma-separated index:

In [104]:
b[1,0]

1.2
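The two forms can be compared directly. In this sketch the names `chained` and `direct` are illustrative; the last lines also show that negative indices count from the end, as they do for lists.

```python
import numpy as np

b = np.array([[1, 2, 3.0],
              [1.2, 2.2, 2]])

# b[1][0] first extracts the row b[1], then indexes into that row
chained = b[1][0]

# b[1, 0] looks up row 1, column 0 in a single indexing step
direct = b[1, 0]

print(chained == direct)

# Negative indices count from the end: last element of the last row
print(b[-1, -1])
```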

Slicing is done in exactly the same way.

In [11]:
a = np.array([[1,2,3],
              [4,5,6],
              [7,8,9],
              [10,11,12]])
print(a)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [12]:
print(a[0,0:2]) # first 2 entries of row 0

[1 2]


In [14]:
print(a[:2,:3]) # first 2 rows and first 3 columns

[[1 2 3]
 [4 5 6]]


The colon operator by itself means the entire row or column.

In [141]:
print(a[:, 1]) # entire second column

[ 2  5  8 11]
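A few more slicing patterns on the same array, as a sketch. Each dimension takes its own `start:stop:step` slice, and the variable names here are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])

middle_rows = a[1:3, :]      # rows 1 and 2, all columns
last_two_cols = a[:, 1:]     # all rows, columns 1 and 2
every_other_row = a[::2, :]  # rows 0 and 2, all columns

print(middle_rows)
print(last_two_cols)
print(every_other_row)
```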


## Exercise

Create a NumPy array representing the data:
$$ \begin{bmatrix} 1 & 2 & 3 & 4\\ 5 & 6 & 7 & 8\\ 9 & 10 & 11 & 12\\ 13 & 14 & 15 & 16\end{bmatrix}$$

Print the number of dimensions, the size of each dimension, and the total size of the array.

## Exercise

Extract the middle 2x2 array from the array in the previous exercise, i.e., use slicing to extract the array:
$$ \begin{bmatrix} 6 & 7\\ 10 & 11\end{bmatrix}$$

## Creating Arrays from Scratch

Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy.
Here are several examples:

In [17]:
# Create a length-5 integer array filled with zeros
np.zeros(5, dtype=int)

array([0, 0, 0, 0, 0])

In [13]:
# Create a 3x5 floating-point array filled with ones
np.ones((3, 5), dtype=float)

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

In [14]:
# Create a 3x5 array filled with 3.14
np.full((3, 5), 3.14) 

array([[ 3.14,  3.14,  3.14,  3.14,  3.14],
       [ 3.14,  3.14,  3.14,  3.14,  3.14],
       [ 3.14,  3.14,  3.14,  3.14,  3.14]])
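The sketch below inspects the shape and element type produced by each of these routines. The specific shapes and the fill value 7 are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3), dtype=int)  # shape given as a tuple, integer entries
ones = np.ones(4)                    # a bare integer gives a 1D array of floats
sevens = np.full((2, 2), 7)          # the dtype is inferred from the fill value

for name, arr in [("zeros", zeros), ("ones", ones), ("sevens", sevens)]:
    print(name, arr.shape, arr.dtype)
```

When no `dtype` is given, `np.zeros` and `np.ones` default to floating point, while `np.full` takes its type from the fill value.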

### np.arange

Generating sequences is a very common task. 

For example, suppose we want an array of evenly stepped values between 1 and 20.

NumPy provides a function `np.arange` that does just that.

In [19]:
a = np.arange(1,21, 2.5)
print(a)

[ 1.   3.5  6.   8.5 11.  13.5 16.  18.5]


The `arange` function works much like the `range` class we saw earlier. It takes up to three arguments: a starting value, an end value, and a step size.

It returns a 1D array that starts at the starting value and keeps adding the step size, stopping before it reaches the end value. The end value itself is excluded.

In [31]:
# Create an array filled with a linear sequence
# Starting at 0, stopping before 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
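The exclusion of the end value is easy to confirm. This short sketch reuses the call above and shows one way to include 20, by moving the end value past it.

```python
import numpy as np

evens = np.arange(0, 20, 2)

# The end value (20) is not part of the result
print(evens[-1])
print(20 in evens)

# With a single argument, arange counts from 0 up to, but not including, it
print(np.arange(5))

# To include 20, push the end value past it
evens_inclusive = np.arange(0, 21, 2)
print(evens_inclusive[-1])
```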

### np.linspace

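NumPy provides a separate function that creates a linearly spaced array of a specified length.

The `np.linspace` function takes a start value, an end value, and the number of points. Unlike `arange`, it includes the end value by default.

Unless you are dealing with integers, `linspace` is preferred over `arange` for generating equally spaced arrays, because floating-point round-off in a non-integer step can change how many points `arange` produces.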

In [32]:
a = np.linspace(0, 2.2, 5)
print(a)

[0.   0.55 1.1  1.65 2.2 ]


The last argument to `linspace` is the number of points desired in the array.
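Because the number of points is fixed, the spacing follows from it: `num` points leave `num - 1` gaps between the start and end values. The sketch below uses the `retstep=True` option of `np.linspace` to read back the spacing; the names `start`, `stop`, and `num` are illustrative.

```python
import numpy as np

start, stop, num = 0, 2.2, 5

# retstep=True makes linspace also return the spacing it used
points, step = np.linspace(start, stop, num, retstep=True)

print(points)
print(step)

# The spacing equals (stop - start) / (num - 1)
print(np.isclose(step, (stop - start) / (num - 1)))
```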

If the start value is greater than the end value, `linspace` automatically uses a negative step size.

In [33]:
a = np.linspace(5,1,5)
print(a)

[5. 4. 3. 2. 1.]


In [20]:
# Create an array of fifty values evenly spaced between 0 and 1
np.linspace(0, 1, 50)

array([0.        , 0.02040816, 0.04081633, 0.06122449, 0.08163265,
       0.10204082, 0.12244898, 0.14285714, 0.16326531, 0.18367347,
       0.20408163, 0.2244898 , 0.24489796, 0.26530612, 0.28571429,
       0.30612245, 0.32653061, 0.34693878, 0.36734694, 0.3877551 ,
       0.40816327, 0.42857143, 0.44897959, 0.46938776, 0.48979592,
       0.51020408, 0.53061224, 0.55102041, 0.57142857, 0.59183673,
       0.6122449 , 0.63265306, 0.65306122, 0.67346939, 0.69387755,
       0.71428571, 0.73469388, 0.75510204, 0.7755102 , 0.79591837,
       0.81632653, 0.83673469, 0.85714286, 0.87755102, 0.89795918,
       0.91836735, 0.93877551, 0.95918367, 0.97959184, 1.        ])
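To illustrate why `linspace` is usually the safer choice for non-integer spacing, the sketch below builds arrays over the same interval with both functions. The interval from 1 to 1.3 is an arbitrary choice, and no output is shown for `arange` because its length depends on floating-point round-off.

```python
import numpy as np

# linspace: the number of points and both endpoints are fixed up front
by_count = np.linspace(1, 1.3, 4)
print(len(by_count), by_count[0], by_count[-1])

# arange: the length is derived from (stop - start) / step, so round-off
# in that division can add or drop a point near the end value
by_step = np.arange(1, 1.3, 0.1)
print(len(by_step), by_step)
```

With `linspace` the length is always the number requested, which makes the result easier to reason about.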

There are other ways of quickly generating arrays that you might encounter.

In [35]:
# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))

array([[0.08822517, 0.77703301, 0.41913025],
       [0.78809797, 0.79295614, 0.29183254],
       [0.96343743, 0.65199014, 0.19567299]])
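Random values differ on every run, which can make results hard to reproduce. One option, sketched below, is a seeded generator from `np.random.default_rng`, assuming NumPy 1.17 or later. The seed value 42 is arbitrary and is not part of the original notebook.

```python
import numpy as np

# Assumption: NumPy >= 1.17, where default_rng is available
rng = np.random.default_rng(42)
sample = rng.random((3, 3))

# Values are drawn uniformly from the half-open interval [0, 1)
print(sample.shape)
print(sample.min() >= 0.0, sample.max() < 1.0)

# Re-creating the generator with the same seed reproduces the same values
repeat = np.random.default_rng(42).random((3, 3))
print(np.array_equal(sample, repeat))
```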

In [20]:
# Create a 3x3 identity matrix
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [21]:
# Create an uninitialized array of three floats (the default dtype)
# The values will be whatever happens to already exist at that memory location
np.empty(3)

array([ 1.,  1.,  1.])

## Reshaping of Arrays

Another useful type of operation is reshaping of arrays.
The most flexible way of doing this is with the ``reshape`` method.

For example, if you want to put the numbers 1 through 9 in a $3 \times 3$ grid, you can do the following:

In [32]:
grid = np.arange(1, 10).reshape((3,3))
print(grid)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


Note that for this to work, the size of the initial array must match the size of the reshaped array. 
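The size requirement can be seen in a short sketch: a matching shape works, `-1` asks NumPy to infer one dimension from the total size, and a mismatched shape raises a `ValueError`. The names `values` and `inferred` are illustrative.

```python
import numpy as np

values = np.arange(1, 10)  # 9 elements

# 9 elements fit a 3x3 grid
grid = values.reshape((3, 3))
print(grid.shape)

# Passing -1 lets NumPy infer that dimension from the total size
inferred = values.reshape((3, -1))
print(inferred.shape)

# 9 elements cannot fill a 2x5 grid, so reshape raises ValueError
try:
    values.reshape((2, 5))
except ValueError as err:
    print("reshape failed:", err)
```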

## Exercise

Using the linspace and reshape functions, create the array:
$$ A = \begin{bmatrix} 0.1 & 0.2 & 0.3 & 0.4 & 0.5\\ 0.6 & 0.7 & 0.8 & 0.9 & 1\end{bmatrix}.$$

[0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
