# A Classic Neural Networks Tutorial

For this tutorial, we'll be using Keras. Keras is a high-level API built on top of TensorFlow, a popular ML library, and it will let us do some pretty cool stuff without knowing a lot of linear algebra or calculus. :)

In [1]:
# fashion dataset import
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

## The dataset - Fashion MNIST

MNIST (the Modified National Institute of Standards and Technology database) is a large database of handwritten digit images. We won't be using the digits, though. Instead, we'll be analyzing [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist), which has the same image size and number of classes, for the following reasons:

**1.** The MNIST digit dataset is too easy, and even classic ML algorithms can achieve 97% accuracy on this dataset.

**2.** MNIST is overused.

**3.** MNIST is not very representative of stuff you might actually do; i.e. bad ideas might work well on the digit dataset, but not on most other datasets.

![title](img/fashion-dataset-ov.png)

### The Goal

We want to classify each of these 28x28 pixel, grayscale images into one of 10 classes:

Label | Description
--- | --- 
0 | T-shirt/top
1 | Trouser
2 | Pullover
3 | Dress
4 | Coat
5 | Sandal
6 | Shirt
7 | Sneaker
8 | Bag
9 | Ankle Boot

The dataset comes presplit into 60,000 training images and 10,000 testing images. When we train, we'll also hold out 10% of the training images as validation data.

## Intro to Perceptrons

![title](img/perceptron.png)

Notice the structure is a series of connected, feed-forward neurons (directed forward toward the output layer). We feed data into the input layer, and the hidden layer neurons fire accordingly and communicate with the output layer.

Also notice the input layer is flat, whereas our data is 2D. How can we solve this?

### Flattening the data

In [9]:
# Load in data
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

In [10]:
x_train.shape

(60000, 28, 28)

In [11]:
# Flatten each image from 2D (28x28) to 1D (784):
# reshape(n, -1) asks numpy for two dimensions, keeping n rows
# and inferring the second dimension from the array's size.
# Dividing by 255.0 scales the pixel values from [0, 255] to [0, 1].
x_train = x_train.reshape(x_train.shape[0], -1) / 255.0
x_test = x_test.reshape(x_test.shape[0], -1) / 255.0
# One-hot encode the integer labels (0-9) as length-10 vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

In [12]:
# note 28 * 28 = 784!
x_train.shape

(60000, 784)
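
To see what the flattening step does without loading the full dataset, here is a minimal sketch on a toy array. The array `images` and its 2x2 "pictures" are made up for illustration; only `reshape` and the division by 255.0 mirror the cell above.

```python
import numpy as np

# A toy "dataset" of 3 images, each 2x2 pixels, with values in 0..255.
images = np.arange(12, dtype=np.uint8).reshape(3, 2, 2) * 20

# Keep the first axis (one row per image) and let numpy infer the rest.
flat = images.reshape(images.shape[0], -1)

# Scale pixel values from [0, 255] down to [0, 1].
scaled = flat / 255.0

print(images.shape)  # (3, 2, 2)
print(flat.shape)    # (3, 4)
```

Each row of `flat` is one image read left to right, top to bottom, which is the layout the flat input layer expects.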

In [14]:
# One-hot label of the first training image: class 9 (Ankle Boot)
y_train[0]

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 1.], dtype=float32)
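
The output above is a one-hot vector: every entry is 0 except the one at the index of the class. As a rough sketch of the idea (this is a numpy stand-in, not how `to_categorical` is implemented), the same encoding can be built by indexing into an identity matrix. The sample `labels` are made up.

```python
import numpy as np

labels = np.array([9, 0, 3])  # integer class labels, like the raw y_train
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

# argmax along each row recovers the original integer labels.
recovered = one_hot.argmax(axis=1)

print(one_hot[0])
print(recovered)
```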

### Creating the Model

We'll be using the ```Sequential``` model with ```Dense``` layers. If you're curious, a ```Dense``` layer is just a regular NN layer that does: ```output = activation(dot(input, kernel) + bias)```.

This means that it takes the dot product of your input vector and a weight matrix (the kernel), adds a bias vector, and passes the result through the activation function.
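
The formula above can be written out in a few lines of numpy. This is only a sketch of the arithmetic, not Keras's implementation: the helper names `relu` and `dense` and the random weights are assumptions made for illustration.

```python
import numpy as np

def relu(z):
    # Replace negative values with 0, leave positive values unchanged.
    return np.maximum(z, 0.0)

def dense(inputs, kernel, bias, activation):
    # output = activation(dot(input, kernel) + bias)
    return activation(inputs @ kernel + bias)

rng = np.random.default_rng(0)
inputs = rng.random((2, 784))               # a batch of 2 flattened images
kernel = rng.normal(size=(784, 10)) * 0.01  # one column of weights per neuron
bias = np.zeros(10)                         # one bias per neuron

output = dense(inputs, kernel, bias, relu)
print(output.shape)  # (2, 10)
```

The kernel has shape (inputs, neurons), so a layer with 784 inputs and 10 neurons maps each 784-value image to 10 numbers.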

In [16]:
# Literally the simplest neural net model that Keras has
# Sequential is a linear stack of layers
model = Sequential()

In [18]:
# add layers to model
# also note the activation functions we're using
model.add(Dense(10, input_dim=784, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
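
The output layer's softmax turns its 10 raw scores into probabilities that sum to 1, and `categorical_crossentropy` then penalizes a low probability on the true class. Below is a minimal plain-Python sketch of both ideas on 3 made-up scores. The function names are illustrative, and the real Keras versions work on batches and add numerical safeguards.

```python
import math

def softmax(scores):
    # Subtracting the max first avoids overflow and does not change the result.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probabilities):
    # Only the true class (target == 1) contributes to the loss.
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_target, probabilities) if t > 0)

probs = softmax([2.0, 1.0, 0.1])
loss = categorical_crossentropy([1.0, 0.0, 0.0], probs)

print(probs)
print(loss)
```

The loss is just the negative log of the probability assigned to the correct class, so it shrinks toward 0 as that probability approaches 1.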

In [20]:
# Train
model.fit(x_train, y_train, epochs=10, validation_split=0.1)

Train on 54000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1364b5a50>
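
The "54000 samples / 6000 samples" line comes from `validation_split=0.1`: Keras sets aside the last 10% of the arrays passed to `fit` (before any shuffling) and trains on the rest. The helper below, `holdout_split`, is a hypothetical name that only mimics that slicing on a plain list so the numbers can be checked.

```python
def holdout_split(samples, validation_split):
    # Hold out the LAST fraction of the samples for validation.
    num_val = round(len(samples) * validation_split)
    split_at = len(samples) - num_val
    return samples[:split_at], samples[split_at:]

train_part, val_part = holdout_split(list(range(60000)), 0.1)
print(len(train_part), len(val_part))  # 54000 6000
```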

In [21]:
# Widen network
model2 = Sequential()
model2.add(Dense(50, input_dim=784, activation='relu'))
model2.add(Dense(10, activation='softmax'))
model2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model2.fit(x_train, y_train, epochs=10, validation_split=0.1)

Train on 54000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x16b704390>

In [22]:
# Deeper
model3 = Sequential()
model3.add(Dense(50, input_dim=784, activation='relu'))
model3.add(Dense(50, activation='relu'))
model3.add(Dense(10, activation='softmax'))
model3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model3.fit(x_train, y_train, epochs=10, validation_split=0.1)

Train on 54000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1370a3390>
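
One way to compare the three models is to count their trainable parameters. A `Dense` layer with `n_in` inputs and `n_out` neurons has `n_in * n_out` weights plus `n_out` biases. The helpers below are illustrative names, not part of Keras; `model.summary()` reports the same kind of count.

```python
def dense_params(n_in, n_out):
    # One weight per (input, neuron) pair, plus one bias per neuron.
    return n_in * n_out + n_out

def mlp_params(layer_sizes):
    # Sum the parameters of each consecutive pair of layer sizes.
    return sum(dense_params(a, b)
               for a, b in zip(layer_sizes, layer_sizes[1:]))

narrow = mlp_params([784, 10, 10])    # model
wide = mlp_params([784, 50, 10])      # model2
deep = mlp_params([784, 50, 50, 10])  # model3

print(narrow, wide, deep)  # 7960 39760 42310
```

Widening the hidden layer from 10 to 50 neurons multiplies the parameter count by about five, while adding the second 50-neuron layer adds only 2,550 more.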

In [23]:
_, test_acc = model3.evaluate(x_test, y_test)
print(test_acc)



0.8773
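
With one-hot labels, the accuracy figure is the fraction of images whose highest-probability class matches the true class. Here is a small sketch of that calculation. The probabilities and true labels are made up, and `CLASS_NAMES` simply copies the label table from earlier.

```python
import numpy as np

CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle Boot"]

# Made-up softmax outputs for 3 images (each row sums to 1).
probs = np.full((3, 10), 0.02)
probs[0, 9] = 0.82
probs[1, 1] = 0.82
probs[2, 6] = 0.82
true_labels = np.array([9, 1, 0])  # the third image is really a T-shirt/top

# The predicted class is the index of the largest probability in each row.
predicted = probs.argmax(axis=1)
accuracy = float((predicted == true_labels).mean())
names = [CLASS_NAMES[i] for i in predicted]

print(names)     # ['Ankle Boot', 'Trouser', 'Shirt']
print(accuracy)
```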


## Convolutional Neural Nets

In [25]:
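from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten
import numpy as np

# Reload the data, since the earlier cells flattened x_train and x_test
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
# Conv2D expects a channel axis, so (n, 28, 28) becomes (n, 28, 28, 1);
# dividing by 255.0 scales the pixel values from [0, 255] to [0, 1]
x_train = x_train[:,:,:,np.newaxis] / 255.0
x_test = x_test[:,:,:,np.newaxis] / 255.0
# One-hot encode the integer labels (0-9) as length-10 vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)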
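
The `np.newaxis` indexing above adds a trailing axis of length 1 for the single grayscale channel. This quick sketch on a placeholder array of zeros (not the real data) shows the shape change, and that `np.expand_dims` is an equivalent spelling.

```python
import numpy as np

images = np.zeros((3, 28, 28), dtype=np.uint8)  # placeholder for 3 images

# Add a trailing channel axis: (3, 28, 28) -> (3, 28, 28, 1)
with_channel = images[:, :, :, np.newaxis]
same_thing = np.expand_dims(images, axis=-1)

print(with_channel.shape)  # (3, 28, 28, 1)
```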

In [27]:
model4 = Sequential()
model4.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='relu', input_shape=(28,28, 1))) 
model4.add(MaxPooling2D(pool_size=2))
model4.add(Flatten())
model4.add(Dense(10, activation='softmax'))
model4.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
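
`MaxPooling2D(pool_size=2)` keeps the largest value in each non-overlapping 2x2 window, which halves the height and width of every feature map. Below is a rough numpy sketch of that operation on a single made-up 4x4 feature map. The reshape trick is just one convenient way to express it, not what Keras does internally.

```python
import numpy as np

feature_map = np.arange(16).reshape(4, 4)

# Split the 4x4 map into a 2x2 grid of 2x2 blocks, then take each block's max.
blocks = feature_map.reshape(2, 2, 2, 2)
pooled = blocks.max(axis=(1, 3))

print(feature_map)
print(pooled)  # [[ 5  7]
               #  [13 15]]
```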

In [28]:
model4.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 28, 28, 64)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 12544)             0         
_________________________________________________________________
dense_11 (Dense)             (None, 10)                125450    
Total params: 125,770
Trainable params: 125,770
Non-trainable params: 0
_________________________________________________________________
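
The shapes and parameter counts in this summary can be reproduced by hand. The sketch below assumes the settings from the model cell (2x2 kernels, 1 input channel, 64 filters, `padding='same'`, and 2x2 pooling). The helper name `conv2d_params` is illustrative.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size x kernel_size x in_channels weights, plus a bias.
    return kernel_size * kernel_size * in_channels * filters + filters

conv = conv2d_params(kernel_size=2, in_channels=1, filters=64)

# padding='same' keeps the 28x28 size; 2x2 pooling halves it to 14x14.
pooled_side = 28 // 2
flat_len = pooled_side * pooled_side * 64

# The final Dense layer: one weight per (input, class) pair, plus 10 biases.
dense = flat_len * 10 + 10

print(conv, flat_len, dense, conv + dense)  # 320 12544 125450 125770
```

Almost all of the parameters sit in the final `Dense` layer. The convolution itself needs only 320.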


In [29]:
model4.fit(x_train, y_train, epochs=10, validation_split=0.1)

Train on 54000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x138564150>

In [30]:
_, test_acc = model4.evaluate(x_test, y_test)
print(test_acc)



0.8988
