## CS 109B/STAT 121B/AC 209B/CSCI E-109B: Homework 6
### Neural Networks - CNNs and RNNs
**Harvard University** <br>
**Spring 2018** <br>
**Instructors:** Pavlos Protopapas and Mark Glickman

---

### INSTRUCTIONS

- To submit your assignment, follow the instructions given on Canvas.
- Restart the kernel and run the whole notebook again before you submit. 
- Do not include your name(s) in the notebook if you are submitting as a group. 
- If you submit individually and you have worked with someone, please include the name of your [one] partner below. 

---

**Your partner's name (if you submit separately):**

**Enrollment Status (109B, 121B, 209B, or E109B):**

## Problem 1: Convolutional Neural Network Basics  (10 pts)

In convolutional neural networks, a convolution is a multiplicative operation on a local region of values. Convolutional layers have been very useful in image classification because they allow the network to retain local spatial information for feature extraction.

### Part A: Understanding Convolutional Operations

For the following 2D matrix:

| | | |
|--|--|--|
|1|2|2|
|3|1|2|
|4|1|0|

Use the following 2x2 kernel to perform a 2D convolution on the matrix:

| | |
|--|--|
|2|1|
|1|2|

**1. Compute this operation by hand assuming a) valid, b) same, and c) full border modes. Please indicate what the resulting matrix shape is compared to the original shape.**
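
As a way to check the shape of a hand-computed result, the output size of a stride-1 convolution can be derived from the input and kernel sizes alone. The sketch below is only an illustration: the function name `conv2d_output_shape` is hypothetical, the 5x5 input and 3x3 kernel are made-up sizes, and the mode names are those accepted by `scipy.signal.convolve2d`.

```python
def conv2d_output_shape(input_shape, kernel_shape, mode):
    """Return the (rows, cols) of a stride-1 2D convolution output."""
    in_rows, in_cols = input_shape
    k_rows, k_cols = kernel_shape
    if mode == "valid":
        # The kernel must lie entirely inside the input.
        return (in_rows - k_rows + 1, in_cols - k_cols + 1)
    if mode == "same":
        # The input is zero-padded so the output matches the input size.
        return (in_rows, in_cols)
    if mode == "full":
        # Every position with at least one overlapping element is kept.
        return (in_rows + k_rows - 1, in_cols + k_cols - 1)
    raise ValueError("mode must be 'valid', 'same', or 'full'")


# Hypothetical example: a 5x5 input with a 3x3 kernel.
for mode in ("valid", "same", "full"):
    print(mode, conv2d_output_shape((5, 5), (3, 3), mode))
```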

### Part B: Understanding Pooling Operations

Pooling operations are used in convolutional neural networks to reduce the dimensionality of the feature maps and overall network complexity. Two main types of pooling are used in CNNs: AveragePooling and MaxPooling.

**1. Using the matrix below, write the output of the AveragePooling and MaxPooling operations with a pool size of 2x2 and stride 2x2. Then, write the outputs for the same operations, except with a stride size of 1.**


| | | | |
|--|--|--|--|
|1|2|2|4|
|3|1|2|1|
|4|1|0|2|
|5|2|2|1|
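
Hand-computed pooling results can be checked with a small sliding-window routine. The following is a minimal sketch in plain Python, assuming no padding (windows that would extend past the edge are dropped); `pool2d`, `mean`, and the 3x3 `example` matrix are hypothetical names and values, not part of the assignment.

```python
def pool2d(matrix, pool_size, stride, reduce_fn):
    """Apply reduce_fn to each pool_size x pool_size window of a 2D list."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    output = []
    # Slide the window's top-left corner by `stride`, keeping it inside the matrix.
    for top in range(0, n_rows - pool_size + 1, stride):
        out_row = []
        for left in range(0, n_cols - pool_size + 1, stride):
            window = [matrix[top + i][left + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            out_row.append(reduce_fn(window))
        output.append(out_row)
    return output


def mean(values):
    return sum(values) / len(values)


# Hypothetical 3x3 input, chosen to differ from the matrix in the exercise.
example = [[1, 0, 2],
           [3, 4, 1],
           [0, 2, 5]]

print(pool2d(example, pool_size=2, stride=1, reduce_fn=max))   # [[4, 4], [4, 5]]
print(pool2d(example, pool_size=2, stride=1, reduce_fn=mean))  # [[2.0, 1.75], [2.25, 3.0]]
```

Passing `max` or `mean` as `reduce_fn` switches between the two pooling types without duplicating the window logic.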

### Part C: Puppy Example 

Consider the following image of a dog, which you will find in `dog.jpg`:

![dog](dog.jpg)

Load the image as a 2D Numpy array. Normalize the image by the following operation so that values fall within [-0.5, 0.5].

**Perform the following steps for four images:**

**1. Randomly generate a 3x3 kernel.**

**2. Use this kernel and convolve over the image with same border mode (with [scipy.signal.convolve2d](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.convolve2d.html)).**

**3. In the resulting image, set all pixel values less than zero to zero (using np.clip()). In other words:**


```
if x < 0:
    x = 0
else:
    x = x
```

(This is the `ReLU` activation function.)

**4. Plot the image.**

Take a moment to examine the convolved images. You should see that certain features in the puppy are accentuated, while others are de-emphasized. Now consider the effect of performing additional convolution operations on these filtered images and how they relate to additional layers in a neural network.
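
Steps 1 and 3 can be sketched with NumPy alone. This is an illustration under stated assumptions: the kernel's sampling range of [-1, 1) is an arbitrary choice, and the small `convolved` array is a stand-in for the output of `scipy.signal.convolve2d`, not a real image.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

# Step 1: a random 3x3 kernel; drawing from [-1, 1) is an arbitrary choice.
kernel = rng.uniform(-1.0, 1.0, size=(3, 3))

# Stand-in for a convolved image (a real one would come from convolve2d).
convolved = np.array([[-0.4, 0.2],
                      [0.0, -1.3]])

# Step 3: ReLU. Values below zero become zero; a_max=None leaves the top unbounded.
activated = np.clip(convolved, 0, None)
print(activated)
```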

## Problem 2: Running a Convolutional Neural Network (20 pts) 

### Part A: Building the Model

In this first part, you will create a convolutional neural network using Keras to predict the type of object in an image. Load the [CIFAR-10](https://keras.io/datasets/#cifar10-small-image-classification) dataset, which contains 50,000 32x32 training images and 10,000 test images of the same size, with a total of 10 classes.

Use a combination of the [following layers](https://keras.io/layers/convolutional/): Conv2D, MaxPooling2D, Dense, Dropout and Flatten Layers (not necessarily in this order).
You may use an existing architecture like AlexNet or VGG16, or create one of your own design. However, you should write your own layers and not use a pre-written implementation.

Convolutional neural networks are very computationally intensive. We highly recommend that you train your model on JupyterHub using GPUs. On CPUs, this training can take up to several hours. On GPUs, it can be done within minutes.

**1. Report the total number of parameters.**

**2. How does the number of total parameters change (linearly, exponentially) as the number of filters per layer increases?**

**3. Generate a plot showing this relationship and explain why.**

For instance, start by assigning 32 filters to each Conv2D layer, then 64, 128, etc. and recording the total number of parameters for each model.
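
Before building full models, it can help to see where a convolutional layer's parameters come from: each filter has one weight per kernel position per input channel, plus one bias. The sketch below counts parameters for a hypothetical stack of two 3x3 convolutional layers on 3-channel input; the function names are made up for illustration, and Dense layers are deliberately left out, so the totals will not match a complete model's `model.summary()`.

```python
def conv2d_param_count(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution layer."""
    k_rows, k_cols = kernel_size
    return (k_rows * k_cols * in_channels + 1) * filters


def stacked_conv_params(n_filters, n_layers=2, kernel_size=(3, 3), in_channels=3):
    """Total parameters of n_layers conv layers that all use n_filters filters."""
    total = 0
    channels = in_channels
    for _ in range(n_layers):
        total += conv2d_param_count(kernel_size, channels, n_filters)
        channels = n_filters  # the next layer sees one channel per filter
    return total


for n_filters in (32, 64, 128):
    print(n_filters, stacked_conv_params(n_filters))
```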





### Part B: Training and Evaluating the Model
**Now train your model. You can choose to train your model for as long as you'd like, but you should aim for at least 10 epochs.** Your validation accuracy should exceed 70%. Training for 10 epochs on a CPU should take about 30-60 minutes.

### Part C: Visualizing the Feature Maps

We would also like to examine the feature maps that are produced by the intermediate layers of the network.

**Using your model, extract 9 feature maps from an intermediate convolutional layer of your choice and plot the images in a 3x3 grid. Also plot your original input image (any image of your choice).**

You may use the helper function `get_feature_maps()` to extract the feature maps.

In [10]:
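import numpy as np
import matplotlib.pyplot as plt
from keras.models import Model
%matplotlib inline

def get_feature_maps(model, layer_id, input_image):
    # Build a sub-model whose output is the activation of the chosen layer.
    model_ = Model(inputs=[model.input], outputs=[model.layers[layer_id].output])
    # Add a batch axis, predict, then reorder to (n_filters, height, width).
    return model_.predict(np.expand_dims(input_image, axis=0))[0,:,:,:].transpose((2,0,1))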

## Problem 3: Recurrent Neural Networks (20 pts)


### Learning to add numbers with a recurrent neural network

In this exercise, we will use a recurrent neural network to add three-digit numbers, encoded as character strings.

For example, given a string '223+12', we would like to return '235', without teaching the model explicit addition rules.

You are given the class __CharacterTable__, initialized below, to assist with encoding and decoding:

In [14]:
from HW6_functions import *
chars = '0123456789+ '
ctable = CharacterTable(chars)

__CharacterTable__ contains functions _encode_ and _decode_.

_encode_ takes in a string and the number of rows needed in the one-hot encoding.

_decode_ returns the string corresponding to a one-hot encoded matrix.

An example of usage is shown below:

In [15]:
encoded_123 = ctable.encode('123', 3)
print("Encoded Format: \n {}".format(encoded_123))
decoded_123 = ctable.decode(encoded_123)
print("Decoded Format: {}".format(decoded_123))

Encoded Format: 
 [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
Decoded Format: 123
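
The implementation of `CharacterTable` in `HW6_functions` is not shown here, so the following is only a guess at its behavior, written as a minimal sketch in plain Python. It assumes the columns follow the sorted order of the characters, which is consistent with the output above (`'1'` lands in column 3 because the space and `'+'` sort before the digits). The class name `SimpleCharacterTable` is hypothetical, and it returns nested lists rather than a NumPy array.

```python
class SimpleCharacterTable:
    """Hypothetical stand-in for CharacterTable, using nested lists."""

    def __init__(self, chars):
        self.chars = sorted(set(chars))
        self.char_indices = {c: i for i, c in enumerate(self.chars)}

    def encode(self, text, num_rows):
        """One-hot encode `text` into num_rows rows, one row per character."""
        rows = [[0.0] * len(self.chars) for _ in range(num_rows)]
        for i, c in enumerate(text):
            rows[i][self.char_indices[c]] = 1.0
        return rows

    def decode(self, rows):
        """Map each row back to the character at its largest entry."""
        return "".join(self.chars[row.index(max(row))] for row in rows)


table = SimpleCharacterTable('0123456789+ ')
encoded = table.encode('123', 3)
print([row.index(1.0) for row in encoded])  # [3, 4, 5]
print(table.decode(encoded))                # 123
```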


### Generating Training Data

Your first task is to create the data to train on. Luckily, we have virtually unlimited training data because addition is trivial for Python.

You will populate two arrays, _problems_ and _answers_, which contain your predictors and target variables.

Examples from _problems_:

In [16]:
'    1+7'

'  12+10'

'520+880'

Examples from _answers_:

In [17]:
'8   '

'22  '

'1400'

Notice that spaces are inserted to the left of strings in _problems_ and to the right of strings in _answers_ to keep the dimensions of the input and output the same. When adding three-digit numbers, the maximum possible length of a string in _problems_ is 7, while the maximum possible length of a string in _answers_ is 4.
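
The padding described above can be sketched with Python's `str.rjust` and `str.ljust`. The helper name `make_pair` is hypothetical, and sampling operands from 0 to 999 is an assumption about how the data might be generated.

```python
import random

DIGITS = 3
MAXLEN = DIGITS + 1 + DIGITS  # two operands plus the '+' sign


def make_pair(a, b):
    """Return a left-padded problem string and a right-padded answer string."""
    problem = '{}+{}'.format(a, b).rjust(MAXLEN)
    answer = str(a + b).ljust(DIGITS + 1)
    return problem, answer


print(make_pair(1, 7))      # ('    1+7', '8   ')
print(make_pair(520, 880))  # ('520+880', '1400')

# Drawing random operands; random.randint includes both endpoints.
a, b = random.randint(0, 999), random.randint(0, 999)
print(make_pair(a, b))
```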


In [18]:
TRAINING_SIZE = 50000
DIGITS = 3
MAXLEN = DIGITS + 1 + DIGITS

In [1]:
problems = []
answers = []

**1. Populate the two matrices _X_ and _y_, which contain the encoded version of problems and answers.**
The _i_-th row in both matrices should contain one encoded problem and answer, respectively.

**2. Next, shuffle your data and split it into training and validation sets.**
These matrices should be named x_train, y_train, x_val, and y_val.
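
One way to shuffle and split while keeping each problem aligned with its answer is to index both arrays with a single shared permutation. In the sketch below, the zero-filled arrays are stand-ins for the encoded data, and the 90/10 split is an arbitrary choice rather than a requirement of the assignment.

```python
import numpy as np

# Stand-in arrays with the same layout as encoded data: (samples, rows, chars).
X = np.zeros((100, 7, 12))
y = np.zeros((100, 4, 12))

# Shuffle X and y with one shared permutation so pairs stay aligned.
rng = np.random.default_rng(0)
order = rng.permutation(len(X))
X, y = X[order], y[order]

# Hold out the last 10% for validation; the fraction is an arbitrary choice.
split_at = len(X) - len(X) // 10
x_train, x_val = X[:split_at], X[split_at:]
y_train, y_val = y[:split_at], y[split_at:]

print(x_train.shape, x_val.shape)
```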

### Building the Model

**1. Using Keras, create a recurrent model that takes in _X_ and returns _y_.**
You are free to use an LSTM or a vanilla RNN to implement your model. Your model should take in NUM_LAYERS as a parameter.

**2. Create and train models with 1, 2, and 3 layers over 50 epochs. Plot test accuracy as a function of epoch for each model.**
Note: You do not have to print the progress bars for each model in your final report; you only have to include the accuracy plots.

**3. Which model has the highest test accuracy? By looking at the accuracy over epochs, what can you say about how depth affects training and performance for recurrent models?**

In [None]:
BATCH_SIZE = 
LAYERS = 

model = Sequential()

#Create model here

In [None]:
nb_epochs = 50
for iteration in range(1, nb_epochs + 1):  # run all nb_epochs epochs (1 through 50)
    print()
    print('-' * 50)
    print('Iteration', iteration)
    results = model.fit(x_train, y_train,
              batch_size=BATCH_SIZE,
              epochs=1,
              validation_data=(x_val, y_val))
    # Select 10 samples from the validation set at random so we can visualize
    # errors.
    print_results(x_val, y_val, model)
    
    #To get validation accuracy per epoch, store results.history['val_acc'] in an array.