# Fun with Neural Nets

---

# TensorFlow/Keras Lab: Digit Recognition

In this lab, you'll create a submission to [Kaggle's Digit Recognizer](https://www.kaggle.com/c/digit-recognizer) competition using a neural network you've built in Keras.

From Kaggle:

> MNIST ("Modified National Institute of Standards and Technology") is the de facto “hello world” dataset of computer vision. Since its release in 1999, this classic dataset of handwritten images has served as the basis for benchmarking classification algorithms. As new machine learning techniques emerge, MNIST remains a reliable resource for researchers and learners alike.

> In this competition, your goal is to correctly identify digits from a dataset of tens of thousands of handwritten images. We’ve curated a set of tutorial-style kernels which cover everything from regression to neural networks. We encourage you to experiment with different algorithms to learn first-hand what works well and how techniques compare.


Below is a procedure for building a neural network to recognize handwritten digits. The data comes from Kaggle, and you will submit your predictions to Kaggle to see how well your model performs.

1. Load the training data (`train.csv`) from Kaggle
2. Set up `X` and `y` (feature matrix and target vector).
3. Split X and y into train and test subsets.
4. Preprocess your data

   - When dealing with image data, you need to normalize your `X` by dividing each value by the maximum value of a pixel (255).
   - Since this is a multiclass classification problem, Keras needs `y` to be a one-hot encoded matrix.
   
5. Create your network.

   - Remember that for multiclass classification you need a softmax activation function on the output layer.
   - You may want to consider using regularization or dropout to improve performance.
   
6. Train your network.
7. If you are unhappy with your model's performance, try to improve it by adding hidden layers, adding hidden layer units, changing the activation functions on the hidden layers, etc.
8. Load in Kaggle's `test.csv`
9. Create your predictions (these should be numbers in the range 0-9).
10. Save your predictions and submit them to Kaggle.
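
As a minimal sketch of the two preprocessing operations in step 4, the snippet below scales pixel values and one-hot encodes labels with NumPy alone. The arrays `pixels` and `labels` are hypothetical stand-ins, not the Kaggle data, and the notebook itself uses pandas for the encoding.

```python
import numpy as np

# Hypothetical stand-in for a few pixel values in the range 0-255.
pixels = np.array([[0, 128, 255],
                   [64, 0, 32]], dtype="float32")
labels = np.array([3, 0])  # hypothetical digit labels

# Scale pixel intensities into [0, 1] by dividing by the maximum pixel value.
pixels_scaled = pixels / 255.0

# One-hot encode: row i has a 1 in column labels[i] and 0 everywhere else.
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(pixels_scaled.min(), pixels_scaled.max())
print(one_hot)
```

Each row of `one_hot` has ten columns, one per digit, which matches the ten units of the output layer.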

---

For this lab, you should complete the above sequence of steps for _at least_ two of the four "configurations":

1. Using a `tensorflow` network
2. Using a `keras` "sequential" network
3. Using a `keras` convolutional network
4. Using a `tensorflow` convolutional network (we did _not_ cover this in class!)

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.datasets import mnist
from keras.utils import to_categorical

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

Using TensorFlow backend.


In [2]:
# 1. Load the training data (train.csv) from Kaggle
training_df = pd.read_csv('digit-recognizer/train.csv')

In [3]:
training_df.head()

Unnamed: 0,label,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,4,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [4]:
# 2. Set up X and y (feature matrix and target vector)
X = training_df.drop(columns = 'label')
y = training_df['label']

In [5]:
# 3. Split X and y into train and test subsets.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 42)


In [6]:
X_train.shape

(31500, 784)

In [7]:
X_test.shape

(10500, 784)
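
The two shapes above follow from the defaults of `train_test_split`: when no `test_size` is passed, scikit-learn holds out 25% of the rows. A quick arithmetic check, using only the row counts printed above:

```python
import math

n_rows = 31500 + 10500   # total rows, from the two .shape outputs above
test_fraction = 0.25     # scikit-learn's default when test_size is not given

# The test set gets ceil(test_fraction * n_rows) rows; the rest go to training.
n_test = math.ceil(n_rows * test_fraction)
n_train = n_rows - n_test

print(n_train, n_test)
```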

## 4. Preprocess your data



In [8]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

In [9]:
X_train_sc = X_train/255
X_test_sc = X_test/255

In [10]:
X_train_sc = X_train_sc.values.reshape(X_train_sc.shape[0], 28, 28, 1)
X_test_sc = X_test_sc.values.reshape(X_test_sc.shape[0], 28, 28, 1)
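
The reshape above turns each flat row of 784 values into a 28x28 image with one channel. The sketch below, on a small made-up array rather than the Kaggle data, illustrates how NumPy's default row-major order maps flat pixel `k` to row `k // 28` and column `k % 28`.

```python
import numpy as np

# Two fake "images", each a flat row of 784 distinct values.
flat = np.arange(2 * 784, dtype="float32").reshape(2, 784)

# Same call pattern as the notebook: (n_images, height, width, channels).
images = flat.reshape(flat.shape[0], 28, 28, 1)

# Flat pixel k ends up at row k // 28, column k % 28 of the image.
k = 100
row, col = k // 28, k % 28
print(images.shape)
print(images[0, row, col, 0] == flat[0, k])
```

The trailing dimension of size 1 is the single grayscale channel that `Conv2D` expects in `input_shape=(28,28,1)`.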

In [11]:
y_train = pd.get_dummies(y_train)
y_test = pd.get_dummies(y_test)

In [12]:
#y_train = y_train.values

In [13]:
#y_test = y_test.values

## 5. Create your network

In [14]:
model = Sequential()

In [15]:
model.add(Flatten())

In [16]:
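# input_shape is ignored here because Dense is not the first layer;
# Flatten() above already turns each 28x28x1 image into 784 values.
model.add(Dense(128, input_shape=(28, 28), activation='relu'))

# NOTE: this output layer uses relu rather than the softmax that step 5
# calls for. That is the most likely cause of the nan loss reported below;
# activation='softmax' is the usual choice with categorical_crossentropy.
model.add(Dense(10, activation='relu'))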

model.compile(loss = 'categorical_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])





In [18]:
model.fit(X_train_sc,
          y_train.values, 
          batch_size=128,
          validation_data=(X_test_sc, y_test),
          epochs=10,
          verbose=1)

Train on 31500 samples, validate on 10500 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x13f1c3940>

In [19]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_1 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               100480    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
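
The parameter counts in the summary can be reproduced by hand: a dense layer has one weight per input-output pair plus one bias per output unit. A short check using the layer sizes from the summary:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 128)   # flattened 28x28 image -> 128 hidden units
output = dense_params(128, 10)    # 128 hidden units -> 10 digit classes
total = hidden + output

print(hidden, output, total)
```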


In [20]:
score = model.evaluate(X_test_sc,
                      y_test,
                      verbose = 1)

labels = model.metrics_names



In [21]:
score

[nan, 0.09761904761904762]

In [22]:
print(f'{labels[0]}: {score[0]}')
print(f'{labels[1]}: {score[1]}') # Why is my sequential model so bad?

loss: nan
acc: 0.09761904761904762
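
One plausible explanation for the `nan` loss above is the `relu` activation on the output layer. The NumPy sketch below is an illustration under stated assumptions, not Keras's actual implementation: it assumes the loss rescales non-softmax outputs by their sum before taking the log, and the `logits` and `target` values are made up. If every output unit is zero after `relu`, that rescaling divides zero by zero.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    exp = np.exp(z - z.max())
    return exp / exp.sum()

def relu(z):
    return np.maximum(z, 0.0)

logits = np.array([-1.2, -0.3, -2.0])  # hypothetical pre-activations, all negative
target = np.array([0.0, 1.0, 0.0])     # one-hot target

# Softmax output: a valid probability distribution, so the loss is finite.
probs = softmax(logits)
loss_softmax = -np.sum(target * np.log(probs))

# Relu output: all zeros here, so rescaling by the sum computes 0 / 0.
relu_out = relu(logits)
with np.errstate(divide="ignore", invalid="ignore"):
    rescaled = relu_out / relu_out.sum()
    loss_relu = -np.sum(target * np.log(rescaled))

print(loss_softmax, loss_relu)
```

An accuracy near 10% is also what a model that always predicts the same digit would score on ten roughly balanced classes.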


## CNN MODEL

In [69]:
cnn_model = Sequential()

In [70]:
cnn_model.add(Conv2D(filters = 8,            # number of filters
                     kernel_size = 3,        # height/width of filter
                     activation='relu',      # activation function 
                     input_shape=(28,28,1))) # shape of input (image)

In [71]:
cnn_model.add(MaxPooling2D(pool_size=(2,2)))

In [72]:
cnn_model.add(Conv2D(16,
                     kernel_size=3,
                     activation='relu'))

In [73]:
cnn_model.add(MaxPooling2D(pool_size=(2,2)))


In [74]:
cnn_model.add(Flatten())

In [75]:
cnn_model.add(Dense(128, activation='relu'))

In [76]:
cnn_model.add(Dense(10, activation='softmax'))

In [77]:
cnn_model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
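
To see what the `Flatten` layer hands to the dense layers, the spatial size can be traced through the network by hand. This sketch assumes the Keras defaults used above: `'valid'` padding with stride 1 for `Conv2D`, and a pooling stride equal to `pool_size` for `MaxPooling2D`.

```python
def conv_out(size, kernel):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output size of non-overlapping pooling (stride equal to pool size)."""
    return size // pool

size = 28
size = conv_out(size, 3)   # first Conv2D, 3x3 kernel
size = pool_out(size, 2)   # first MaxPooling2D, 2x2
size = conv_out(size, 3)   # second Conv2D, 3x3 kernel
size = pool_out(size, 2)   # second MaxPooling2D, 2x2

n_filters = 16             # filters in the second Conv2D layer
flat_units = size * size * n_filters

print(size, flat_units)
```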

In [78]:
cnn_model.fit(X_train_sc,
              y_train.values,
              batch_size=64,
              validation_data=(X_test_sc, y_test),
              epochs=15,
              verbose=1)

Train on 31500 samples, validate on 10500 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.callbacks.History at 0x176b83f60>

In [82]:
cnn_score = cnn_model.evaluate(X_test_sc,
                              y_test,
                              verbose=1)



In [83]:
cnn_labels = cnn_model.metrics_names

In [84]:
print(f'CNN {cnn_labels[0]}  : {cnn_score[0]}')
print(f'CNN {cnn_labels[1]}   : {cnn_score[1]}')

CNN loss  : 0.046079806208902264
CNN acc   : 0.9877142857142858


In [35]:
print(f'CNN {cnn_labels[0]}  : {cnn_score[0]}')
print(f'CNN {cnn_labels[1]}   : {cnn_score[1]}')
print()
print(f'FFNN {labels[0]} : {score[0]}')
print(f'FFNN {labels[1]}  : {score[1]}')

CNN loss  : 0.07258203695994979
CNN acc   : 0.9779047619047619

FFNN loss : nan
FFNN acc  : 0.09761904761904762


## 8. Load in Kaggle's test.csv

In [85]:
test_df = pd.read_csv('digit-recognizer/test.csv')

In [86]:
test_df.head()

Unnamed: 0,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0



## 9. Create your predictions (these should be numbers in the range 0-9).

In [87]:
test_df = test_df.astype('float32')

In [88]:
test_df /= 255

In [89]:
test_df = test_df.values.reshape(test_df.shape[0], 28, 28, 1 )

In [90]:
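# predict_classes was removed in newer Keras releases; taking the argmax of
# the softmax outputs gives the same class indices.
preds = np.argmax(cnn_model.predict(test_df), axis=1)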

In [91]:
preds_df = pd.DataFrame(preds, columns = ['Label'])

In [92]:
preds_df.index.name = 'ImageID'

In [93]:
preds_df.index += 1
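
The cells above reduce each row of class probabilities to a single digit and number the images starting from 1. A minimal NumPy sketch of those two steps, using made-up probabilities for three classes instead of the model's ten:

```python
import numpy as np

# Hypothetical softmax outputs: one row per image, one column per class.
probs = np.array([[0.05, 0.90, 0.05],
                  [0.60, 0.30, 0.10],
                  [0.20, 0.10, 0.70]])

# The predicted class is the column with the highest probability.
pred_labels = probs.argmax(axis=1)

# The submission index starts at 1, not 0.
image_ids = np.arange(1, len(pred_labels) + 1)

for image_id, label in zip(image_ids, pred_labels):
    print(image_id, label)
```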

In [94]:
preds_df.head(50)

| ImageID | Label |
|---|---|
| 1 | 2 |
| 2 | 0 |
| 3 | 9 |
| 4 | 9 |
| 5 | 3 |
| 6 | 7 |
| 7 | 0 |
| 8 | 3 |
| 9 | 0 |
| 10 | 3 |


In [95]:
preds.shape

(28000,)

In [96]:
preds_df.to_csv('Thirdsub.csv')