# 1.2 - Numpy

In [None]:
import numpy as np

### Why is NumPy necessary?
First, let's create some NumPy arrays

In [None]:
array_1 = np.array([2,3,5,7])
array_2 = np.array([4,9,25,49])
print('numpy array 1 =' ,array_1)
print('numpy array 2 =' ,array_2)

print(array_1[2])
print(array_2[3])

It actually looks like a python list! So why should I use a numpy array?

To understand why, let's create two Python lists and try some basic operations on them

In [None]:
list_A = [2,3,5,7]
list_B = [4,9,25,49]

In [None]:
print(list_A*2)
print(list_A+list_B)

In [None]:
print(list_A*list_B)

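As you can see, Python does not know how to multiply two lists element by element (it raises a `TypeError`), while `*2` repeats the list and `+` concatenates the two lists instead of adding their elements.
To get the element-wise product you have to write it out in a very inconvenient way.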

In [None]:
result = (list_A[0]*list_B[0], list_A[1]*list_B[1], list_A[2]*list_B[2], list_A[3]*list_B[3])
print(result)
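
A slightly less painful pure-Python version pairs the elements with `zip`. This is only a sketch to show what the list-based approach looks like in practice; the name `products` is introduced here for illustration.

```python
# Element-wise product of two plain Python lists
list_A = [2, 3, 5, 7]
list_B = [4, 9, 25, 49]

# zip pairs up the elements at the same position;
# the comprehension multiplies each pair
products = [x * y for x, y in zip(list_A, list_B)]
print(products)  # [8, 27, 125, 343]
```

It works, but every operation needs its own explicit loop.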

Doing the same with numpy is completely straightforward

In [None]:
print(array_1*2)
print(array_1+array_2)
print(array_1*array_2)

### Something more
There are several other advantages to using NumPy arrays instead of plain Python lists:
#### 1) Multi-dimensional arrays

In [None]:
a = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(a)
a.ndim

In [None]:
a.shape

In [None]:
a.reshape(6,2)
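
As a small sketch of how `reshape` behaves: it returns a new array object with the same elements and leaves the original shape untouched. Passing `-1` for one dimension asks NumPy to infer it. The names `b` and `c` are introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

b = a.reshape(6, 2)    # same 12 elements, now 6 rows and 2 columns
c = a.reshape(2, -1)   # -1 lets NumPy infer the remaining dimension (here 6)

print(a.shape)  # (4, 3): a itself keeps its shape
print(b.shape)  # (6, 2)
print(c.shape)  # (2, 6)
```

The total number of elements must stay the same, otherwise `reshape` raises a `ValueError`.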

#### 2) It's faster

In [None]:
import timeit

L = range(1000)
%timeit [i**2 for i in L]

a = np.arange(1000)
%timeit a**2
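
`%timeit` is an IPython magic, so it only works inside a notebook or an IPython shell. In a plain Python script, a rough equivalent can be sketched with the standard `timeit` module. The names `t_list` and `t_numpy` are introduced here for illustration, and the absolute numbers depend on the machine.

```python
import timeit
import numpy as np

L = range(1000)
a = np.arange(1000)

# Run each statement 1000 times and measure the total elapsed time
t_list = timeit.timeit(lambda: [i**2 for i in L], number=1000)
t_numpy = timeit.timeit(lambda: a**2, number=1000)

print(f"list comprehension: {t_list:.4f} s")
print(f"numpy array:        {t_numpy:.4f} s")
```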

#### 3) It requires less memory

In [None]:
import sys

print(sys.getsizeof(4)*len(L)) # byte size of one Python int multiplied by the length of the list

print(a.size*a.itemsize) # number of elements in the array multiplied by the byte size of one element
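
The estimate above ignores the memory used by the list object itself. A slightly more complete sketch is shown below. It is still only an approximation, because CPython shares small integer objects between lists. The names `py_list`, `np_arr`, `list_bytes` and `array_bytes` are introduced here for illustration.

```python
import sys
import numpy as np

py_list = list(range(1000))
np_arr = np.arange(1000)

# The list stores pointers; each int is a separate Python object
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# nbytes is the size of the array's data buffer (size * itemsize)
array_bytes = np_arr.nbytes

print(list_bytes, array_bytes)
```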

### Indexing, fancy indexing, slicing
Indexing and slicing in NumPy is a topic that could fill pages and pages. A single array element or a subset can be selected in multiple ways. The following will be a short reminder on array indexing and slicing in NumPy. In the case of one-dimensional arrays there are apparently no differences with respect to indexing in Python lists:

In [None]:
arr1D = np.arange(10)+0.5
arr1D

We obtain the sixth element of the array similarly to what one would do in a list:

In [None]:
arr1D[5]

The subset formed by the sixth, seventh, and eighth elements:

In [None]:
arr1D[5:8]  #as in lists [start:end:step] where first index is included, last index is omitted

We can modify the value of a single element or a subset of the vector:

In [None]:
arr1D[:3] = [11, 12, 13]
arr1D

If a scalar is assigned to a slice of the array, the value is _broadcast_:

In [None]:
arr1D[-3:] = 66 # a negative index counts from the end: this selects the last three elements
arr1D
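
To make the role of negative indices explicit, here is a short sketch on a fresh array (the name `v` is introduced here for illustration). A negative index counts from the end, while reversing the order requires a negative step.

```python
import numpy as np

v = np.arange(10) + 0.5   # 0.5, 1.5, ..., 9.5

print(v[-1])              # 9.5, the last element
print(v[-3:].tolist())    # [7.5, 8.5, 9.5], the last three elements
print(v[::-1][:3].tolist())  # [9.5, 8.5, 7.5], a negative step reverses the order

v[-3:] = 66               # the scalar is broadcast to all three selected positions
print(v[-3:].tolist())    # [66.0, 66.0, 66.0]
```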

Fancy indexing is a term adopted by NumPy to describe indexing using integer arrays.

To illustrate this indexing style, let us build a 7x5 matrix filled with random integers from 0 to 19 (the upper bound 20 is excluded).

In [None]:
arr2D = np.random.randint(0,20,35).reshape(7,5)
arr2D

In [None]:
arr2D[0::2,[0,3,4]] # rows 0, 2, 4, 6 (slice with step 2) and columns 0, 3, 4 (fancy index)

In [None]:
arr2D[[0,2]] #Fancy index is an array requesting rows 0 and 2

In [None]:
arr2D[0,2] #Pay attention to syntax: this is normal indexing!
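
The difference between these three forms is easier to see on a deterministic matrix. The sketch below uses a hypothetical array `m` built with `np.arange` instead of random numbers. It also shows that fancy indexing returns a copy, not a view.

```python
import numpy as np

m = np.arange(35).reshape(7, 5)   # row i holds the values 5*i ... 5*i+4

rows = m[[0, 2]]           # fancy indexing with a list: rows 0 and 2
elem = m[0, 2]             # two integers: the single element at row 0, column 2
sub = m[0::2, [0, 3, 4]]   # rows 0, 2, 4, 6 combined with columns 0, 3, 4

print(rows.shape, elem, sub.shape)  # (2, 5) 2 (4, 3)

# Fancy indexing returns a copy, so changing the result leaves m untouched
rows[0, 0] = -1
print(m[0, 0])  # 0
```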

### Exercise: Removing negative numbers with *Fancy Indexing*

We have an array that has small negative numbers and we want to remove them, converting them to zero. First, we generate a random array.

In [None]:
np.random.seed(3)
a = np.random.random((5,4))-.2
a

In [None]:
a[(1,2,2,4),(2,2,0,0)] = 0
a

### Exercise: Removing negative numbers with *Boolean mask*

In [None]:
np.random.seed(3) #This way we can reproduce the same "random" array as before.
a = np.random.random((5,4))-.2
a

In [None]:
mask = a < 0  # boolean array, True where the element is negative
a[mask]=0
a

Just to let you know, there's an alternative method that leaves the initial array `a` unmodified: `np.where` returns a new array

In [None]:
i = np.where(a<0,0,a)
i
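
The contrast between the two approaches can be sketched on a small hand-written array (the names `x`, `clipped` and `before` are introduced here for illustration): `np.where` builds a new array, while assignment through a boolean mask changes the array in place.

```python
import numpy as np

x = np.array([0.3, -0.1, 0.5, -0.2])

clipped = np.where(x < 0, 0, x)   # new array: 0 where x < 0, x elsewhere
before = x.tolist()               # x is still unchanged at this point
print(clipped.tolist())  # [0.3, 0.0, 0.5, 0.0]
print(before)            # [0.3, -0.1, 0.5, -0.2]

x[x < 0] = 0                      # boolean-mask assignment modifies x in place
print(x.tolist())        # [0.3, 0.0, 0.5, 0.0]
```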

### Basic numpy array operations

Generate arrays of numbers

In [None]:
arr = np.arange(0,101,10) # (first, last, step); last is excluded
arr

Generate an array of n elements and reshape it

In [None]:
arr = np.arange(12).reshape(3,4)
arr

Generate an array of n elements which are all zeros

In [None]:
zeroarr = np.zeros(50)
zeroarr

Generate an array of elements which are linearly spaced

In [None]:
linspace_arr = np.linspace(1,5,10)
linspace_arr

You can use the ravel function to flatten an array

In [None]:
arr.ravel()

Find the minimum, the maximum, and the sum of all the elements

In [None]:
print(arr.min())
print(arr.max())
print(arr.sum())

These operate over the whole array, but we can also apply them along a single axis

In [None]:
arr = np.arange(12).reshape(3,4)
arr

In [None]:
print(arr.sum(axis=0)) # axis=0: sum down the rows, giving one total per column
print(arr.sum(axis=1)) # axis=1: sum across the columns, giving one total per row
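
A quick way to remember the `axis` argument is that it names the dimension that disappears from the result. The sketch below checks this on the same 3x4 array; the names `col_sums` and `row_sums` are introduced here for illustration.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

col_sums = arr.sum(axis=0)  # collapses the 3 rows: one sum per column
row_sums = arr.sum(axis=1)  # collapses the 4 columns: one sum per row

print(col_sums.tolist())  # [12, 15, 18, 21]
print(row_sums.tolist())  # [6, 22, 38]
```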

Stack horizontally or vertically the elements of two arrays

In [None]:
a = np.arange(6).reshape(3,2)
b = np.arange(6,12).reshape(3,2)
a

In [None]:
b

In [None]:
np.vstack((a,b))

In [None]:
np.hstack((a,b))
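
The effect of the two functions on the shape can be summarised in a short sketch (the names `v` and `h` are introduced here for illustration). For 2D arrays, `vstack` adds rows and `hstack` adds columns.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)
b = np.arange(6, 12).reshape(3, 2)

v = np.vstack((a, b))  # rows of b placed below the rows of a
h = np.hstack((a, b))  # columns of b placed to the right of a

print(v.shape)        # (6, 2)
print(h.shape)        # (3, 4)
print(h[0].tolist())  # [0, 1, 6, 7]
```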

Likewise, we can horizontally or vertically split an array into n equally sized arrays

In [None]:
a = np.arange(30).reshape(2,15)
a

In [None]:
np.hsplit(a,3)

In [None]:
result = np.hsplit(a,3)
print(result[0])
print(result[1])
print(result[2])

In [None]:
result = np.vsplit(a,2)
print(result[0])
print(result[1])
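
The "equally sized" requirement is strict: `np.hsplit` raises a `ValueError` when the number of columns is not divisible by the number of parts. The sketch below shows this, together with `np.array_split`, which accepts uneven splits. The names `parts` and `uneven` are introduced here for illustration.

```python
import numpy as np

a = np.arange(30).reshape(2, 15)

parts = np.hsplit(a, 3)            # 15 columns split into 3 blocks of 5
print([p.shape for p in parts])    # [(2, 5), (2, 5), (2, 5)]

try:
    np.hsplit(a, 4)                # 15 columns cannot form 4 equal blocks
except ValueError as err:
    print("hsplit failed:", err)

uneven = np.array_split(a, 4, axis=1)   # array_split allows unequal blocks
print([p.shape[1] for p in uneven])     # [4, 4, 4, 3]
```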

# Homework
### Dot product

If __A__ is an _n × m_ matrix and __B__ is an _m × p_ matrix,


##### $ A_{n,m} = 
 \begin{pmatrix}
  a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\
  a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\
  \vdots  & \vdots  & \ddots & \vdots  \\
  a_{n,1} & a_{n,2} & \cdots & a_{n,m} 
 \end{pmatrix}
 $
 
 ##### $ B_{m,p} = 
 \begin{pmatrix}
  b_{1,1} & b_{1,2} & \cdots & b_{1,p} \\
  b_{2,1} & b_{2,2} & \cdots & b_{2,p} \\
  \vdots  & \vdots  & \ddots & \vdots  \\
  b_{m,1} & b_{m,2} & \cdots & b_{m,p} 
 \end{pmatrix}
 $
 
 The __matrix product AB__ (denoted without multiplication signs or dots) is defined to be the _n × p_ matrix
 
 ##### $ AB_{n,p} = 
 \begin{pmatrix}
  (ab)_{1,1} & (ab)_{1,2} & \cdots & (ab)_{1,p} \\
  (ab)_{2,1} & (ab)_{2,2} & \cdots & (ab)_{2,p} \\
  \vdots  & \vdots  & \ddots & \vdots  \\
  (ab)_{n,1} & (ab)_{n,2} & \cdots & (ab)_{n,p} 
 \end{pmatrix}
 $
 
 where each _i, j_ entry is given by multiplying the entries $A_{ik}$ (across row i of A) by the entries $B_{kj}$ (down column j of B), for k = 1, 2, ..., m, and summing the results over k:
 
$(AB)_{ij} = \displaystyle\sum_{k=1}^{m} A_{ik}B_{kj}$
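
As a sanity check of the formula, the sketch below computes a single entry by hand and compares it with NumPy's result. The matrices `A_small` and `B_small` are made up for illustration, and the indices are zero-based as usual in Python. Extending this to every entry is left to the exercise.

```python
import numpy as np

A_small = np.array([[1, 2, 3],
                    [4, 5, 6]])      # 2 x 3
B_small = np.array([[7, 8],
                    [9, 10],
                    [11, 12]])       # 3 x 2

i, j = 0, 1   # first row of A_small, second column of B_small

# Sum over k of A[i, k] * B[k, j]
entry = sum(A_small[i, k] * B_small[k, j] for k in range(A_small.shape[1]))

print(entry)                        # 64 = 1*8 + 2*10 + 3*12
print(A_small.dot(B_small)[i, j])   # 64
```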


### Exercise:
The dot product is so common that it is implemented in numpy. But if it were not there, could you code it?

In [None]:
A = np.random.randint(0,50,35).reshape(7,5)
B = np.random.randint(50,100,35).reshape(5,7)

In [None]:
# Your code

In [None]:
%timeit #Your code

In [None]:
%timeit A.dot(B)

## Useful
We can get a list of all the attributes and methods available on a Python object

In [None]:
print('Numpy array available methods:')
print(dir(array_1))

In [None]:
print('List available methods')
print(dir(list_A))

We can also get all the available functions for a package

In [None]:
import inspect

all_functions = inspect.getmembers(np, inspect.isfunction) # here the first argument np is for numpy

In [None]:
all_functions

#### Actually it doesn't give _all_ the available functions. 
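Trigonometric functions and some other mathematical ones are not included: they are implemented as universal functions (`np.ufunc` objects) rather than plain Python functions, so `inspect.isfunction` skips them.
To see the __complete__ list of names in the numpy package, I suggest the standard way:
type `np.` and press the TAB key.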
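
One likely reason for the missing entries is that functions such as `np.sin` are `np.ufunc` instances, which `inspect.isfunction` does not recognise. A possible way to list them is sketched below; the names `all_ufuncs` and `ufunc_names` are introduced here for illustration.

```python
import inspect
import numpy as np

print(inspect.isfunction(np.sin))    # False
print(isinstance(np.sin, np.ufunc))  # True

# Collect every top-level member of numpy that is a ufunc
all_ufuncs = inspect.getmembers(np, lambda obj: isinstance(obj, np.ufunc))
ufunc_names = [name for name, _ in all_ufuncs]
print(len(ufunc_names))
print('sin' in ufunc_names)          # True
```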