# Convolutional Neural Networks

In this notebook we will implement a convolutional neural network. Rather than doing everything from scratch, we will use [TensorFlow 2](https://www.tensorflow.org/) and its high-level [Keras](https://keras.io) interface.

## Installing TensorFlow and Keras

TensorFlow and Keras are not included with the base Anaconda install, but they can be installed by running the following commands in the Anaconda Command Prompt or a terminal window:
```
conda install notebook jupyterlab nb_conda_kernels
conda create -n tf tensorflow ipykernel mkl
```
Once this has been done, you should be able to select the `Python [conda env:tf]` kernel from the Kernel->Change Kernel menu item at the top of this notebook. Then we import the TensorFlow package:

In [1]:
import tensorflow as tf

## Creating a simple network with TensorFlow

We will start by creating a very simple fully connected feedforward network using TensorFlow/Keras. The network will mimic the one we implemented previously, but TensorFlow/Keras will take care of most of the details for us.

### MNIST Dataset

First, let us load the MNIST digits dataset that we will be using to train our network. This is available directly within Keras:

In [2]:
(x_train, y_train),(x_test, y_test) = tf.keras.datasets.mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


The data comes as a set of integers in the range [0,255] representing the shade of gray of a given pixel. Let's first rescale them to be in the range [0,1]:

In [3]:
x_train, x_test = x_train / 255.0, x_test / 255.0
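
The effect of this rescaling can be checked on a small stand-in array without downloading anything. The sketch below assumes NumPy is available (it is installed alongside TensorFlow); `fake_images` is a hypothetical substitute for a batch of MNIST images, not part of the dataset.

```python
import numpy as np

# Hypothetical stand-in for two MNIST images: 28x28 arrays of 8-bit integers.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Dividing by the float 255.0 converts the integers to floating point
# and maps the range [0, 255] onto [0, 1].
scaled = fake_images / 255.0

print(fake_images.dtype, "->", scaled.dtype)
print("min:", scaled.min(), "max:", scaled.max())
```

The array shape is unchanged; only the value range and the data type differ.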

Now we can build a neural network model using Keras. It has a simple, high-level modular structure in which we only have to specify the layers in our model and the properties of each layer. The layers are as follows:
1. Input layer: a 28x28 matrix of numbers.
2. `Flatten` layer: converts our 28x28 pixel image into an array of size 784.
3. `Dense` layer: a fully-connected layer of the type we have been using up to now. We will use 30 neurons and the sigmoid activation function.
4. `Dense` layer: a fully-connected output layer with 10 neurons (one per digit) and the softmax activation function.

In [4]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(30, activation='sigmoid'),
  tf.keras.layers.Dense(10, activation='softmax')
])
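
As a rough check on the size of this model, the number of trainable parameters can be worked out by hand: each `Dense` layer has one weight per input-unit pair plus one bias per unit, and `Flatten` has none. The helper `dense_params` below is a hypothetical name introduced only for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flatten_size = 28 * 28                       # 784 inputs after Flatten
hidden = dense_params(flatten_size, 30)      # hidden layer with 30 neurons
output = dense_params(30, 10)                # output layer with 10 neurons
total = hidden + output

print(hidden, output, total)  # 23550 310 23860
```

Calling `model.summary()` on the model above should report the same totals.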

In [5]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
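
The `sparse_categorical_crossentropy` loss takes integer class labels (as in `y_train`) and penalises the network according to the probability it assigns to the correct class. A minimal NumPy sketch of the idea follows. The function name `mean_sparse_crossentropy` and the example probabilities are made up for illustration, and the Keras implementation differs in details such as clipping probabilities to avoid taking the logarithm of zero.

```python
import numpy as np

def mean_sparse_crossentropy(probs, labels):
    """Mean of -log(probability assigned to the true class)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Pick out, for each sample, the predicted probability of its true label.
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# Two samples, three classes; each row sums to 1 like a softmax output.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1]]
labels = [0, 1]

loss = mean_sparse_crossentropy(probs, labels)
print(loss)
```

The loss is small when the correct class receives a probability close to 1 and grows without bound as that probability approaches 0.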

In [6]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x2cb4d6efa30>

In [7]:
model.evaluate(x_test, y_test)



[0.1678958684206009, 0.9520999789237976]
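
The two numbers returned by `model.evaluate` are the loss followed by the accuracy metric requested in `compile`. The accuracy is the fraction of images for which the class with the highest predicted probability matches the label. The sketch below illustrates that calculation on made-up probabilities; `accuracy_from_probs` is a hypothetical helper, not a Keras function.

```python
import numpy as np

def accuracy_from_probs(probs, labels):
    """Fraction of samples whose most probable class equals the label."""
    predicted = np.argmax(probs, axis=1)
    return float(np.mean(predicted == np.asarray(labels)))

# Four samples, two classes (made-up values).
probs = np.array([[0.1, 0.9],
                  [0.8, 0.2],
                  [0.4, 0.6],
                  [0.7, 0.3]])
labels = [1, 0, 0, 0]

acc = accuracy_from_probs(probs, labels)
print(acc)  # 0.75
```

With the trained network, the same helper could be applied to `model.predict(x_test)` and `y_test`.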

#### Exercises
Experiment with this network:
1. Change the number of neurons in the hidden layer.
2. Add more hidden layers.
3. Change the activation function in the hidden layer to `relu`.
4. Change the activation in the output layer to something other than `softmax`.

How does the performance of your network change with these modifications?

#### Task
Implement the convolutional neural network described in "[Gradient-based learning applied to document recognition](http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf)", by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner.
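
When planning the layers, it can help to track how the image size changes through the convolution and subsampling stages. The sketch below uses the standard output-size formula for a convolution or pooling window. It assumes the 32x32 input, 5x5 convolutions without padding, and 2x2 subsampling with stride 2 used by the LeNet-5 architecture in the paper; check these values against the paper before relying on them. `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Side length of the output of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                  # assumed input side length
c1 = conv_output_size(size, 5)             # 5x5 convolution
s2 = conv_output_size(c1, 2, stride=2)     # 2x2 subsampling
c3 = conv_output_size(s2, 5)               # 5x5 convolution
s4 = conv_output_size(c3, 2, stride=2)     # 2x2 subsampling

print(c1, s2, c3, s4)  # 28 14 10 5
```

Note that the MNIST images loaded above are 28x28, so they would need to be padded to 32x32, or the layer sizes adjusted, to follow this sequence.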