Deep Learning Lab 02 - Neural Network Basics
# Implementing a Multi-layer Perceptron in NumPy

## 0 - Packages
Import the packages used in this lab.

Set the NumPy random seed to 42 so that the results are reproducible:

In [0]:
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

np.random.seed(42) # set a seed so that the results are consistent

BTW: Line numbers can be enabled by pressing `CTRL`+`M`+`L`.

## 1 - Obtaining the Image Dataset

Execute the cell below to download the dataset used in this lab to `/tmp/binary_flowers.npz`:

In [0]:
#@title dataset downloader
import requests

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"
    session = requests.Session()
    response = session.get(URL, params = { 'id' : id }, stream = True)
    token = get_confirm_token(response)
    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)
    save_response_content(response, destination)

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value
    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768
    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

download_file_from_google_drive('1BoDcLIEqfb9qD5VJBKr4BhArluPyOBrh', '/tmp/binary_flowers.npz')

The dataset is provided as an archive of NumPy arrays, each stored as a binary file. Unpack them into `X_train, y_train, X_test, y_test`:

In [0]:
with np.load('/tmp/binary_flowers.npz') as data:
    X_train, y_train, X_test, y_test = [ data[key] for key in ['X_train', 'y_train', 'X_test', 'y_test'] ]

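To illustrate how such an archive behaves, here is a minimal sketch that writes and reads a tiny `.npz` file. The toy arrays, the temporary path, and the names `toy_X`, `toy_y`, and `toy_path` are made up for illustration and are not part of the lab dataset.

```python
import os
import tempfile

import numpy as np

# Hypothetical toy arrays standing in for the real dataset.
toy_X = np.zeros((4, 2, 2, 3), dtype=np.float32)
toy_y = np.array([0, 1, 0, 1])

# np.savez stores one binary .npy file per keyword argument in a single archive.
toy_path = os.path.join(tempfile.mkdtemp(), 'toy_archive.npz')
np.savez(toy_path, X_train=toy_X, y_train=toy_y)

# np.load returns a dict-like object; `.files` lists the stored keys,
# and each array is read from disk only when its key is accessed.
with np.load(toy_path) as toy_data:
    toy_keys = sorted(toy_data.files)
    loaded_shapes = {key: toy_data[key].shape for key in toy_keys}

print(toy_keys)
print(loaded_shapes)
```

Checking `.files` on the real archive in the same way is a quick way to confirm which keys are available before unpacking.
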
Plot some images to get an idea about the task:

In [0]:
fig, ax = plt.subplots(nrows=2, ncols=5, sharex=True, sharey=True,)
fig.set_size_inches(14, 8)
ax = ax.flatten()
for class_ in range(2):
  for sample_ in range(5):
    img = X_train[y_train == class_][sample_] + .5
    ax[5*class_ + sample_].axis('Off')
    ax[5*class_ + sample_].imshow(img)
plt.tight_layout()
plt.show()

How many training examples do you have? In addition, what is the `shape` of the variables `X_train` and `y_train`?

In [0]:
print('Number of training examples:', X_train.shape[0])
print(X_train.shape)
print(y_train.shape)

### Vectorization over Training Examples

In order to vectorize the code, reshape the images from shape `(n_samples, height, width, channels)` to an array of shape `(n_features, n_samples)`, where `n_features = height * width * channels`.

Reshape the training and test data. Evaluate your result by plotting the first training sample. To plot it as an image, you'll have to reshape it back to an array of shape `(height, width, channels)`.

In [0]:
print(X_train.shape)

# Flatten each image into one row, then transpose so that
# every column holds one sample: (n_features, n_samples)
X_train = X_train.reshape(X_train.shape[0], -1).T

X_test = X_test.reshape(X_test.shape[0], -1).T

print(X_train.shape, X_test.shape)
plt.imshow(X_train[:, 0].reshape(64,64,3) + .5)

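The reshaping step can be checked on a tiny array first. The following is a minimal sketch with a hypothetical toy batch (`toy_images`, 3 images of 2x2 pixels) rather than the lab data; it shows that each column of the flattened array holds one sample and that reshaping a column restores the original image.

```python
import numpy as np

# Hypothetical toy batch: 3 images of 2x2 pixels with 3 channels.
toy_images = np.arange(3 * 2 * 2 * 3, dtype=np.float64).reshape(3, 2, 2, 3)

# Flatten every image into one row, then transpose so samples become columns.
toy_flat = toy_images.reshape(toy_images.shape[0], -1).T
print(toy_flat.shape)  # (n_features, n_samples)

# Column i holds sample i; reshaping it restores the original image.
restored = toy_flat[:, 1].reshape(2, 2, 3)
print(np.array_equal(restored, toy_images[1]))
```

Note that the transpose matters: reshaping directly to `(n_features, n_samples)` without flattening per sample first would mix pixels from different images within a column.
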
## 2 - Implementing a 2-Layer Neural Network

Implement the neural network.

Complete the implementation of the class `NeuralNet` and its methods based on our discussion.


In [0]:
import sys

class NeuralNet():
  '''Feed-forward neural network with 1 hidden layer
  
  Parameters
  ------------
  n_hidden : int (default: 10)
        Number of units in the hidden layer.
  l2_lambda : float (default: .01)
        Lambda value for L2-regularization.
  epochs : int (default: 20)
        Number of runs over the complete training set.
  alpha : float (default: .001)
        Learning rate.
  shuffle : bool (default: True)
        If true, training data is shuffled every epoch.
  minibatch_size : int (default: 1)
        Number of training samples per minibatch.
  seed : int (default: None)
        Random seed for initializing weights and shuffling.

  Attributes
  ------------
  eval_ : dict
    Dictionary collecting the cost, training accuracy,
    and validation accuracy for every training epoch.
  '''

  def __init__(self, n_hidden=10,
               l2_lambda=.01, epochs=20, 
               alpha=.001, shuffle=True,
               minibatch_size=1, seed=None):
    
    self.random = np.random.RandomState(seed)
    self.n_hidden = n_hidden
    self.l2_lambda = l2_lambda
    self.epochs = epochs
    self.alpha = alpha
    self.shuffle = shuffle
    self.minibatch_size = minibatch_size
    self._number_of_parameters = None

  def _sigmoid(self, z):
    '''Compute sigmoid (logistic) function'''
    return 1/(1+np.exp(-z))

  def _forward(self, X):
    '''Compute forward propagation step'''

    # Step 1: net input of hidden layer
    # (n_hidden, n_features) dot (n_features, n_samples) => (n_hidden, n_samples)
    z_hidden = np.dot( self.w_hidden, X ) + self.b_hidden

    # Step 2: activation of hidden layer
    a_hidden = self._sigmoid( z_hidden )

    # Step 3: input of output layer
    # (n_output, n_hidden) dot (n_hidden, n_samples) => (n_output, n_samples)
    z_output = np.dot( self.w_output, a_hidden) + self.b_output

    # Step 4: activation of output layer
    a_output = self._sigmoid( z_output )

    return z_hidden, a_hidden, z_output, a_output

  def _compute_cost(self, y, output):
    '''Compute the cost function.

    Parameters
    ------------
    y : array, shape = (n_samples,)
        Array of binary labels.
    output : array, shape = (n_samples,)
        Activation of the output layer.

    Returns
    ---------
    cost : float
        (Regularized) cost of the output.
    '''

    L2_term = ( self.l2_lambda * 
               (np.sum(self.w_hidden ** 2.) + 
                np.sum(self.w_output ** 2.)) )
    
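    # Cross-entropy summed over all samples, so that a single float
    # is returned as stated in the docstring
    cross_entropy = np.sum( - y * np.log(output) - (1-y)*np.log(1-output) )
    cost = cross_entropy + L2_term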

    return cost

  def predict(self, X):
    '''Predict class labels
    
    Parameters
    ------------
    X : array, shape = (n_features, n_samples)
        Original input features.
        
    Returns
    ---------
    y_prediction : array, shape = (n_samples,)
        Predicted binary label (0. or 1.).
    '''
    
    z_hidden, a_hidden, z_output, a_output = self._forward(X)
    # a_output has shape (1, n_samples); flatten it to (n_samples,)
    y_prediction = np.round( a_output ).ravel()
    
    return y_prediction
  
  def initialize_weights(self, n_features):
    '''Initialize weights of the hidden and output layer

    Parameters
    ------------
    n_features : int
        Number of input features.
    '''
    # Weights of the hidden layer
    self.w_hidden = np.random.normal(scale = .1, size=(self.n_hidden, n_features))
    self.b_hidden = np.zeros( (self.n_hidden, 1) )
    
    # Weights of the output layer
    self.w_output = np.random.normal(scale = .1, size=(1, self.n_hidden))
    self.b_output = np.zeros( (1, 1) )
    
    print('Network initialized. Total number of parameters: {}'.format(self.number_of_parameters))

  @property
  def number_of_parameters(self):
    if hasattr(self, 'w_hidden'):
      self._number_of_parameters = np.sum([layer.size for layer in (
          self.w_hidden, self.b_hidden, self.w_output, self.b_output)])
    else:
      print('No network parameters found.')
    return self._number_of_parameters

  def plot_loss(self):
    if not hasattr(self, 'eval_'):
      print('No evaluation history found. Run `.fit` to train your model.')
      return
    plt.plot(range(self.epochs), self.eval_['cost'])
    plt.ylabel('Cost')
    plt.xlabel('Epochs')
    plt.show()

  def plot_accuracy(self):
    if not hasattr(self, 'eval_'):
      print('No evaluation history found. Run `.fit` to train your model.')
      return
    plt.plot(range(self.epochs), self.eval_['train_acc'], 
         label='training')
    plt.plot(range(self.epochs), self.eval_['valid_acc'], 
            label='validation', linestyle='--')
    plt.ylabel('Accuracy')
    plt.xlabel('Epochs')
    plt.legend()
    plt.show()

  def fit(self, X_train, y_train, X_valid, y_valid):
    '''Learn weights from training data.

    Parameters
    ------------
    X_train : array, shape = (n_features, n_samples)
        Original input features for training.
    y_train : array, shape = (n_samples,)
        Array of binary labels for training.
    X_valid : array, shape = (n_features, n_samples)
        Original input features for validation.
    y_valid : array, shape = (n_samples,)
        Array of binary labels for validation.

    Returns
    ---------
    self
    '''
  
    return self

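To make the array shapes in the forward pass concrete, here is a minimal standalone sketch that repeats the four steps of `_forward` outside the class and then evaluates a regularized cross-entropy cost. The toy sizes, the seed, and all `*_toy` names are assumptions for illustration; summing (rather than averaging) the per-sample cross-entropy is also an assumption of this sketch.

```python
import numpy as np

rng = np.random.RandomState(0)  # seed chosen only for this illustration

# Hypothetical toy sizes and data, not the lab dataset.
n_features, n_hidden, n_samples = 6, 4, 5
X_toy = rng.normal(size=(n_features, n_samples))
y_toy = np.array([0, 1, 1, 0, 1])

# Same initialization scheme as in the class: small random weights, zero biases.
w_hidden = rng.normal(scale=.1, size=(n_hidden, n_features))
b_hidden = np.zeros((n_hidden, 1))
w_output = rng.normal(scale=.1, size=(1, n_hidden))
b_output = np.zeros((1, 1))


def sigmoid(z):
    '''Logistic function, applied element-wise.'''
    return 1. / (1. + np.exp(-z))


# Biases of shape (n, 1) broadcast across the sample axis.
a_hidden = sigmoid(np.dot(w_hidden, X_toy) + b_hidden)     # (n_hidden, n_samples)
a_output = sigmoid(np.dot(w_output, a_hidden) + b_output)  # (1, n_samples)

# Cross-entropy summed over samples plus the L2 penalty on both weight matrices.
l2_lambda = .01
l2_term = l2_lambda * (np.sum(w_hidden ** 2) + np.sum(w_output ** 2))
cost_toy = np.sum(-y_toy * np.log(a_output)
                  - (1 - y_toy) * np.log(1 - a_output)) + l2_term

print(a_hidden.shape, a_output.shape, float(cost_toy))
```

With weights this small, every output activation is close to 0.5, so the cross-entropy part of the cost is close to `n_samples * log(2)`.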

## 3 - Testing the Forward Computation
Now it's time to initialize the network. As no backpropagation is implemented yet (you'll do that in the next lab!), you can only compute the forward propagation.

Create an instance of the class `NeuralNet` with 10 hidden nodes, a minibatch size of 100, and 100 epochs:

In [0]:
my_NN = NeuralNet(n_hidden=10, minibatch_size=100, epochs=100, shuffle=False)

If you initialize the weights randomly, the untrained network will return essentially arbitrary predictions.

For a binary classification task with balanced classes, random guessing should yield an accuracy of about 50%:

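The 50% figure depends on how the classes are distributed. As a rough sketch on synthetic labels (the label vectors and names below are hypothetical, not taken from the flower dataset), uniform random guessing lands near 50%, whereas a constant prediction scores exactly the share of the predicted class. An untrained network may behave more like the second case if its outputs all fall on the same side of 0.5.

```python
import numpy as np

rng = np.random.RandomState(0)  # seed chosen only for this illustration

# Hypothetical balanced labels: 5000 samples per class.
y_balanced = np.repeat([0, 1], 5000)
random_guess = rng.randint(0, 2, size=y_balanced.shape)
acc_random = np.mean(y_balanced == random_guess)

# Hypothetical skewed labels: 90% class 0, 10% class 1.
y_skewed = np.repeat([0, 1], [9000, 1000])
constant_guess = np.zeros_like(y_skewed)  # always predict class 0
acc_constant = np.mean(y_skewed == constant_guess)

print('Random guessing, balanced classes: {:.1f}%'.format(100 * acc_random))
print('Constant guess, skewed classes:    {:.1f}%'.format(100 * acc_constant))
```

Comparing the accuracy of the untrained network with the class shares in `y_test` helps to tell these two situations apart.
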
In [0]:
my_NN.initialize_weights(X_test.shape[0])
predictions = my_NN.predict(X_test)
print('Accuracy: {:.2f}%'.format( 100 * np.mean(y_test == predictions) ))