## Lecture 3: Formalizing the problem

### by Long Nguyen

This homework notebook is supplemental to [Lecture 3](https://youtu.be/10NSz_zLEl4) of the series "Image Recognition with Neural Networks".

Many of these functions are implemented in the [video](https://youtu.be/10NSz_zLEl4).

In [1]:
from mnist_loader import load_data_wrapper
import numpy as np

In [2]:
training_data, validation_data, test_data = load_data_wrapper()

In [10]:
with open("parameters.npy", mode="rb") as r:
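    # allow_pickle=True is required if the four arrays were saved together as one object array
    parameters = np.load(r, allow_pickle=True)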
    W1, B1, W2, B2 = parameters

### Vectorization

In the lecture, we discussed vectorization. Suppose we have $m$ images:

$$\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\ldots,(x^{(m)},y^{(m)})\}$$

where $x^{(i)}\in\mathbb{R}^{784}$ are the images and $y^{(i)}\in\mathbb{R}^{10}$ are their one-hot encoded labels.

We form $X$ by stacking the vectors $x^{(i)}$ horizontally, as columns, and form $Y$ the same way from the vectors $y^{(i)}$.

If `X` is a list of $n$ 2D numpy arrays, each of shape `(d,1)`, then `np.hstack(X)` produces a `(d,n)` numpy array. For the images above, stacking $m$ arrays of shape `(784,1)` gives a `(784,m)` array.


In [3]:
import numpy as np

In [4]:
x1 = np.array([[1],[2],[3]])
x2 = np.array([[4],[5],[6]])
print(x1)
print(x1.shape)

[[1]
 [2]
 [3]]
(3, 1)


In [5]:
X = [x1,x2] # a list of 2 numpy arrays
print(np.hstack(X))

[[1 4]
 [2 5]
 [3 6]]
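
One detail worth checking is why the arrays need to be 2D columns. The short sketch below (the variable names are illustrative only) shows that `np.hstack` applied to 1D arrays of shape `(3,)` simply concatenates them, while reshaping each one to a `(3,1)` column first gives the side-by-side layout used in this notebook.

```python
import numpy as np

# 1D arrays have shape (3,), not (3,1)
flat1 = np.array([1, 2, 3])
flat2 = np.array([4, 5, 6])

# hstack on 1D arrays concatenates them end to end
joined = np.hstack([flat1, flat2])
print(joined.shape)   # (6,)

# Reshape each array into a (3,1) column before stacking
col1 = flat1.reshape(-1, 1)
col2 = flat2.reshape(-1, 1)
stacked = np.hstack([col1, col2])
print(stacked.shape)  # (3, 2)
```
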


#### Write the function `vectorize_mini_batch` below, which accepts a minibatch (a list of `(image,label)` tuples) and calls `np.hstack` to return a tuple `X,Y`, where `X` contains all of the images and `Y` contains all of the labels, stacked horizontally.

#### For example, `X,Y = vectorize_mini_batch(training_data[0:20])` should return `X` of shape `(784,20)` and `Y` of shape `(10,20)`.

Hint: You can create two empty lists and use the list method `append()` to insert the images and labels, or you can use list comprehensions.

In [1]:
def vectorize_mini_batch(mini_batch):
    """Given a minibatch of (image, label) tuples,
    return the tuple X, Y where X contains all of the images and Y contains
    all of the labels, stacked horizontally."""
    # Collect the image columns and the label columns into separate lists
    mini_batch_x = [image for image, _ in mini_batch]
    mini_batch_y = [label for _, label in mini_batch]
    X = np.hstack(mini_batch_x)
    Y = np.hstack(mini_batch_y)
    return X, Y

In [7]:
X,Y = vectorize_mini_batch(training_data[0:20])
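
The shapes can also be checked without loading MNIST. The sketch below follows the `append()` route from the hint and runs it on a synthetic minibatch; `vectorize_with_append`, `fake_batch`, `X_fake` and `Y_fake` are hypothetical names used only for this illustration, and the random images are stand-ins for real data.

```python
import numpy as np

def vectorize_with_append(mini_batch):
    """Same idea as vectorize_mini_batch, written with append()."""
    images, labels = [], []
    for image, label in mini_batch:
        images.append(image)
        labels.append(label)
    return np.hstack(images), np.hstack(labels)

# Synthetic stand-in for training_data[0:20]:
# random (784,1) images with one-hot (10,1) labels.
rng = np.random.default_rng(0)
fake_batch = []
for k in range(20):
    image = rng.random((784, 1))
    label = np.zeros((10, 1))
    label[k % 10] = 1.0
    fake_batch.append((image, label))

X_fake, Y_fake = vectorize_with_append(fake_batch)
print(X_fake.shape)  # (784, 20)
print(Y_fake.shape)  # (10, 20)
```

Column `i` of `X_fake` is the `i`-th image and column `i` of `Y_fake` is its label, so the pairing between images and labels is preserved by the stacking.
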

In [12]:
def sigmoid(x):
    return 1/(1+np.exp(-x))

#### Write the vectorized version of the score function (the model `f`) below. Here, `X` is the matrix of images stacked horizontally.

#### Thus, if
$$X=\left[ \begin{array}{rrrr}
x^{(1)} & x^{(2)} & \ldots & x^{(m)} 
\end{array} \right]$$
#### is a (784,m) array, then
$$f(X)=\left[ \begin{array}{rrrr}
f(x^{(1)}) & f(x^{(2)}) & \ldots & f(x^{(m)}) 
\end{array} \right]$$
#### is a (10,m) array.

In [8]:
def f(X, W1, W2, B1, B2):
    """Vectorized version.
    Return the output of the network when the input ``X`` is a
    collection of images stacked horizontally."""
    Z1 =
    A1 = 
    Z2 = 
    A2 =
    return A2
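
As a point of comparison, here is one possible sketch of a vectorized forward pass. It assumes a two-layer sigmoid network whose weights have shapes `(h,784)` and `(10,h)` and whose biases are `(h,1)` and `(10,1)` columns; the hidden size `h = 30`, the name `forward_sketch` and the `_demo` parameters are assumptions for this illustration, not values taken from `parameters.npy`. The key point is that a `(h,1)` bias broadcasts across all `m` columns, so one matrix product handles the whole batch.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_sketch(X, W1, W2, B1, B2):
    """Forward pass for a batch X of shape (784, m)."""
    Z1 = W1 @ X + B1    # (h, m); the (h,1) bias broadcasts over columns
    A1 = sigmoid(Z1)    # (h, m)
    Z2 = W2 @ A1 + B2   # (10, m)
    A2 = sigmoid(Z2)    # (10, m)
    return A2

# Random demo parameters with an assumed hidden size of 30
rng = np.random.default_rng(0)
hidden = 30
W1_demo = rng.standard_normal((hidden, 784))
B1_demo = rng.standard_normal((hidden, 1))
W2_demo = rng.standard_normal((10, hidden))
B2_demo = rng.standard_normal((10, 1))

X_demo = rng.random((784, 5))  # 5 fake images stacked horizontally
A2_demo = forward_sketch(X_demo, W1_demo, W2_demo, B1_demo, B2_demo)
print(A2_demo.shape)  # (10, 5)

# Column 2 of the batch output matches the output for image 2 alone
single = forward_sketch(X_demo[:, [2]], W1_demo, W2_demo, B1_demo, B2_demo)
print(np.allclose(A2_demo[:, [2]], single))  # True
```
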

#### Write the vectorized version of the predict function. 

In [None]:
def predict(images, W1, W2, B1, B2):
    """Vectorized version. 
    The parameter images is a list of (image, label) tuples. 
    Call vectorize_mini_batch and the vectorized version of the model f
    to predict the labels of the images. 
    Hint: Use np.argmax with the axis argument.
    """
    
    
    return predictions
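
To illustrate the `np.argmax` hint on its own, the sketch below uses a small made-up score matrix with 3 classes and 3 images (the real output would be `(10,m)`). With `axis=0`, `np.argmax` returns, for each column, the row index of the largest entry, which is the predicted class for that image. The same call on a one-hot label matrix recovers the true labels.

```python
import numpy as np

# Toy scores: rows are classes, columns are images
scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.3],
                   [0.1, 0.2, 0.5]])

# Row index of the largest score in each column
predicted = np.argmax(scores, axis=0)
print(predicted)  # [1 0 2]

# One-hot labels stacked horizontally, decoded the same way
one_hot = np.array([[0, 1, 0],
                    [1, 0, 0],
                    [0, 0, 1]])
actual = np.argmax(one_hot, axis=0)
print(actual)     # [1 0 2]

# Fraction of columns where prediction and label agree
accuracy = np.mean(predicted == actual)
print(accuracy)   # 1.0
```
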

#### Use the predict function above to predict the labels of the first `20` images from the training set.