<div style="font-size: 200%; font-weight: bold; color: maroon;">301_Intro_to_numpy</div>


## A brief introduction to numpy

numpy (most frequently imported as `np`) is the core numerical array and linear algebra library for Python.

To benefit from vectorization, which is the basis of numpy's computational advantage, you first have to define a numpy object: an array (`np.ndarray`), which can represent a vector, a matrix or a higher-dimensional table.

Once you have defined an array, e.g. `x`, you can call vectorized functions on it using the `np.` syntax.

For instance:

- `np.exp(x)` works for any numpy array `x` and applies the exponential function to every element.

In summary, numpy has efficient built-in functions for array and matrix computations, and it is fast because it vectorizes them: the loops run in compiled code rather than in the Python interpreter.
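
As a minimal sketch of what "vectorized" means here, the cell below applies `np.exp` to a whole array in one call and compares the result with an explicit Python loop. The array values are made up for illustration.

```python
import math
import numpy as np

# A small example array (values chosen only for illustration)
x = np.array([0.0, 1.0, 2.0])

# Vectorized: one call, applied to every element of x
y = np.exp(x)

# Equivalent explicit loop using the standard library
looped = np.array([math.exp(v) for v in x])

print(y)
print(np.allclose(y, looped))
```

Both versions give the same numbers; the vectorized call is shorter and avoids the per-element overhead of the Python loop.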


The following is adapted from the great Andrew Ng's Deep Learning course on Coursera.

## Important numpy links / reference material 

numpy official user guide (good, but long):  https://numpy.org/doc/stable/user/

one simple / quick numpy cheatsheet : https://s3.amazonaws.com/dq-blog-files/numpy-cheat-sheet.pdf

This tutorial looks rather useful:
https://physics.nyu.edu/pine/pymanual/html/chap3/chap3_arrays.html

# Vectorization


In machine learning and deep learning, you deal with very large datasets. Hence, a computationally inefficient function can become a huge bottleneck in your algorithm and can result in a model that takes ages to run. To make sure that your code is computationally efficient, you will use vectorization. For example, try to tell the difference between the following implementations of the dot/outer/elementwise product.

In [None]:
import time
import random
import numpy as np


The basic object in numpy is the array. Arrays are multidimensional collections, that is, they can have between 1 and n dimensions.

## BASIC PROPERTIES OF ARRAYS:

- Each dimension of an array can be defined as a list, and in fact a numpy array is built from a list of lists (double/triple... `[]` brackets), but:
- All elements of the array must be of the same type (integer, float; it would be unusual, but they could also be strings or booleans).
- All lists at the same nesting level (the "columns" of that multidimensional array/table) must have the same number of elements. If a value is missing, that position should contain `np.nan` (NaN, not a number).
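
The properties above can be checked with a short sketch. The variable names (`mixed`, `nested`, `with_gap`) and the values are only illustrative.

```python
import numpy as np

# Mixed ints and floats are converted to a single common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# A list of equal-length lists becomes a 2-D array
nested = np.array([[1, 2, 3],
                   [4, 5, 6]])
print(nested.shape)  # (2, 3)

# A missing value is represented with np.nan (which makes the array float)
with_gap = np.array([[1.0, 2.0, 3.0],
                     [4.0, np.nan, 6.0]])
print(np.isnan(with_gap).sum())  # 1
```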

In [None]:
matriz1 = np.array([[1., 0., 0.],
                    [0., 1., 2.]])

In [None]:
matriz1.shape

In [None]:
matriz1.ndim

In [None]:
unos = np.ones((2, 3, 4), dtype=np.int16)
unos

In [None]:
aleat1 = np.random.rand(2, 3, 4)

In [None]:
aleat1

In [None]:
aleat1[0, 2, 1]

In [None]:
data = np.array([[1, 2], [5, 3], [4, 6]])

In [None]:
data

In [None]:
data.shape

Indexing and axes in numpy arrays differ from those of pandas objects (tables), and they are in fact a bit confusing. For a good explanation:

https://www.sharpsightlabs.com/blog/numpy-axes-explained/

In [None]:
data.max(axis=0)

In [None]:
data.max(axis=1)
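
As a quick illustration of how the `axis` argument behaves, the sketch below reuses the same `data` array: `axis=0` collapses the rows (one result per column), while `axis=1` collapses the columns (one result per row).

```python
import numpy as np

data = np.array([[1, 2], [5, 3], [4, 6]])  # shape (3, 2)

col_max = data.max(axis=0)  # maximum of each column -> [5 6]
row_max = data.max(axis=1)  # maximum of each row    -> [2 5 6]
col_sum = data.sum(axis=0)  # sum of each column     -> [10 11]

print(col_max.shape, row_max.shape)  # (2,) (3,)
```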

## Check the advantages of vectorization

In [None]:
x1 = [random.random() for e in range(10**4)]  # remember list comprehension
x2 = [random.random() for e in range(10**4)]

In [None]:
print(type(x1))

In [None]:
print(len(x1))

In the next example, we will perform basic matrix computations **without** numpy, that is, using classical for loops.

Later we will do the same using vectorization, i.e. numpy.

NOTE:

- `np.zeros((x, y))` produces an array of zeros with dimensions x, y (the shape is passed as a single tuple)

- `np.random.rand(x, y)` creates an array of random floats, drawn uniformly from [0, 1), with dimensions x, y
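
A small sketch of the two helpers mentioned in the note, with dimensions 2 and 3 picked arbitrarily. Note the difference in call style: `np.zeros` takes the shape as one tuple, whereas `np.random.rand` takes each dimension as a separate argument.

```python
import numpy as np

zeros = np.zeros((2, 3))     # shape given as a tuple
rand = np.random.rand(2, 3)  # dimensions given as separate arguments

print(zeros.shape, rand.shape)  # (2, 3) (2, 3)
```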

In [None]:
### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i]*x2[i]
toc = time.process_time()
print ("dot = " + str(dot))
print ("\n ----- Computation time dot product (for) = " + str(1000*(toc - tic)) + "ms")


### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i,j] = x1[i]*x2[j]
toc = time.process_time()
#print ("outer = " + str(outer))
print ("\n ----- Computation time outer prod (for)= " + str(1000*(toc - tic)) + "ms")

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i]*x2[i]
toc = time.process_time()
#print ("elementwise multiplication = " + str(mul))
print ("\n ----- Computation time elementwise (for)= " + str(1000*(toc - tic)) + "ms")

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3,len(x1)) # Random 3*len(x1) numpy array
tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i,j]*x1[j]
toc = time.process_time()
#print ("gdot = " + str(gdot))
print ("\n ----- Computation time dot prod general (for) = " + str(1000*(toc - tic)) + "ms")

# Exercise

Look for the np methods for:

1. dot product of vectors

2. outer product of vectors

3. elementwise multiplication

4. general dot product of W (previously generated) and x1


As you may have noticed, the vectorized implementation is much cleaner and more efficient. For bigger vectors/matrices, pure Python loops become impractically slow, so in practice you **cannot compute them without numpy**.

**Note** that `np.dot()` performs a matrix-matrix or matrix-vector multiplication (for two 1-D vectors, it returns their inner product). This is different from `np.multiply()` and the `*` operator (which is equivalent to `.*` in Matlab/Octave), which perform an element-wise multiplication.
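
To make the distinction in the note concrete, here is a minimal sketch on tiny arrays. The names `u`, `v` and `M` and their values are only illustrative.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 2.0]])

inner = np.dot(u, v)       # inner product of two vectors: 32.0
elementwise = u * v        # element-wise product: [ 4. 10. 18.]
same = np.multiply(u, v)   # same result as u * v
mat_vec = np.dot(M, u)     # matrix-vector product: [1. 8.]

print(inner, elementwise, mat_vec)
```

The element-wise product keeps the shape of its inputs, while `np.dot` sums over the shared dimension.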

# Exercise: Loss and cost functions for binary target in numpy
## i.e. logistic regression, or any model linking x and y through the sigmoid function

The sigmoid function in a generalized linear model maps any real value $z$ to a number between 0 and 1, which is interpreted as the probability that the output is 1 (rather than 0). Its formula is:
$$ \sigma(z) = \frac{1}{1 + e^{-z}}$$

- The loss is used to evaluate the performance of your model. The bigger your loss is, the more different your predictions ($ \hat{y} $) are from the true values ($y$). 
- In deep learning, you use optimization algorithms like Gradient Descent to train your model and to minimize the cost.
- For binary targets loss is defined as:

For one example $x^{(i)}$:
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
$$\hat{y}^{(i)} = a^{(i)} = \sigma(z^{(i)})\tag{2}$$
$$ \mathcal{L}(a^{(i)}, y^{(i)}) =  - y^{(i)}  \log(a^{(i)}) - (1-y^{(i)} )  \log(1-a^{(i)})\tag{3}$$

The cost is then computed by averaging the loss over all $m$ training examples:
$$ J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{6}$$
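
To see equations (1) to (3) at work on a single example, here is a minimal scalar sketch that uses only the standard `math` module, so it is not the numpy solution asked for in the exercise. The helper names (`sigmoid_scalar`, `loss_scalar`) and the values of `w`, `x`, `b` and `y` are hypothetical, chosen only for illustration.

```python
import math

def sigmoid_scalar(z):
    """Sigmoid of a single number (hypothetical helper, scalar only)."""
    return 1.0 / (1.0 + math.exp(-z))

def loss_scalar(a, y):
    """Loss of equation (3) for one prediction a and one true label y."""
    return -y * math.log(a) - (1 - y) * math.log(1 - a)

# Made-up weights, features, bias and label for one example
w = [0.5, -0.25]
x = [2.0, 4.0]
b = 1.0
y = 1

z = sum(wi * xi for wi, xi in zip(w, x)) + b  # equation (1): z = 1.0
a = sigmoid_scalar(z)                          # equation (2)
single_loss = loss_scalar(a, y)                # equation (3)

print(z, a, single_loss)
```

With `y = 1`, only the first term of the loss is active, so the loss shrinks towards 0 as `a` approaches 1.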


In [None]:
# COMPUTE THE SIGMOID FUNCTION WITH NUMPY

def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """

    ### START CODE HERE ### (1 line of code) REMEMBER USE np method!!!
    s = None  # placeholder so the cell runs; replace with your implementation
    ### END CODE HERE ###

    return s

In [None]:
print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))

**Expected Output**: 

<table>
  <tr>
    <td>**sigmoid([0, 2])**</td>
    <td> [ 0.5         0.88079708]</td> 
  </tr>
</table>

In [None]:
# LOSS FUNCTION

def loss(a, y):
    """
    Arguments:
    a -- vector of size m (predicted probabilities, i.e. yhat)
    y -- vector of size m (true labels)
    
    Returns:
    loss_values -- the value of the loss function defined above **for each target element**
    """
    
    # 1 line to compute loss
    loss_values = None  # placeholder so the cell runs; replace with your implementation
    
    return loss_values

In [None]:
a = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("loss = " + str(loss(a,y)))

**Expected Output**:

<table style="width:20%">
     <tr> 
       <td> **loss** </td> 
       <td> [0.10536052 0.22314355 0.10536052 0.91629073 0.10536052] </td> 
     </tr>
</table>

## Write the cost function

Once you have computed the loss function, you can easily compute the cost function with numpy: it is simply the average of the per-example losses over the whole target vector y.

The cost can be computed by averaging the loss over all $m$ training examples:
$$ J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{6}$$
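
As a worked check of this formula, take the five per-example losses listed in the expected output of the loss exercise above ($m = 5$). Averaging them gives the value your cost function should return for the same `a` and `y`, up to rounding:

$$ J = \frac{1}{5}\left(0.10536052 + 0.22314355 + 0.10536052 + 0.91629073 + 0.10536052\right) = \frac{1.45551584}{5} \approx 0.2911 $$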


In [None]:
# FUNCTION: compute cost

def cost(a, y):
    """
    Computes the cost function by averaging the loss over all training examples.
    
    Arguments:
    a -- numpy vector of size m (predicted probabilities)
    y -- numpy vector of size m (true labels)
    
    You must obtain
    m -- the size of y (use len(y))
    
    Return:
    cost_value -- Your computed cost.
    """
    
    ### START CODE HERE ### (≈ 2 lines of code)
    ### HINT : Define first m as a property of the input object y
    cost_value = None  # placeholder so the cell runs; replace with your implementation
    ### END CODE HERE ###
    
    return cost_value

In [None]:
a = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("loss = " + str(loss(a,y)))
print("cost = " + str(cost(a,y)))

# What to remember
- Vectorization is very important in deep learning. It provides computational efficiency and clarity.
- You have reviewed the loss and cost functions for binary targets
- You are familiar with several numpy functions, such as np.exp, np.dot, np.multiply and np.zeros.