<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Lab: Fun with Neural Nets

---

Below is a procedure for building a neural network to recognize handwritten digits.  The data is from [Kaggle](https://www.kaggle.com/c/digit-recognizer/data), and you will submit your results to Kaggle to test how well you did!

1. Load the training data (`train.csv`) from [Kaggle](https://www.kaggle.com/c/digit-recognizer/data)
2. Set up `X` and `y` (feature matrix and target vector).
3. Split X and y into train and test subsets.
4. Preprocess your data

   - When dealing with image data, you need to normalize your `X` by dividing each value by the max value of a pixel (255).
   - Since this is a multiclass classification problem, Keras needs `y` to be a one-hot encoded matrix when you train with the `categorical_crossentropy` loss.
   
5. Create your network.

   - Remember that for multiclass classification you need a softmax activation function on the output layer.
   - You may want to consider using regularization or dropout to improve performance.
   
6. Train your network.
7. If you are unhappy with your model's performance, try to improve it by adding hidden layers, adding units to the hidden layers, changing the activation functions on the hidden layers, etc.
8. Load in [Kaggle's](https://www.kaggle.com/c/digit-recognizer/data) `test.csv`
9. Create your predictions (these should be numbers in the range 0-9).
10. Save your predictions and submit them to Kaggle.
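
The preprocessing in step 4 can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the lab's required approach: the tiny random `X` and the three labels in `y` are hypothetical stand-ins for the real `train.csv` contents.

```python
import numpy as np

# Hypothetical stand-in for a few rows of train.csv: 3 images, 784 pixels each.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(3, 784))
y = np.array([3, 0, 9])

# Scale pixel intensities from the range [0, 255] to [0, 1].
X_scaled = X / 255.0

# One-hot encode: row i of the identity matrix is the encoding of class i.
num_classes = 10
y_onehot = np.eye(num_classes, dtype="float32")[y]

print(X_scaled.shape, y_onehot.shape)
```

Each row of `y_onehot` contains a single 1 in the column of its digit, which is the layout a 10-unit softmax output layer is compared against.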

---

For this lab, you should complete the above sequence of steps for _at least_ two of the four "configurations":

1. Using a `tensorflow` network (we did _not_ cover this in class!)
2. Using a `keras` convolutional network
3. Using a `keras` network with regularization
4. Using a `tensorflow` convolutional network (we did _not_ cover this in class!)
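
For the convolutional configurations, the flat 784-pixel rows usually need to be reshaped into image tensors first. The sketch below assumes the default channels-last layout `(batch, height, width, channels)` and row-major pixel order; `X_flat` is a hypothetical stand-in for the scaled feature matrix.

```python
import numpy as np

# Hypothetical batch: 5 flattened images with 784 = 28 * 28 pixel columns each.
X_flat = np.arange(5 * 784, dtype="float32").reshape(5, 784)

# Reshape to (batch, height, width, channels); greyscale images have 1 channel.
X_images = X_flat.reshape(-1, 28, 28, 1)

print(X_images.shape)
```

If the features are held in a pandas DataFrame, as they are in this notebook, calling `.to_numpy()` before `.reshape(...)` should give the same result.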

In [1]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# Import from the public tensorflow.keras namespace; tensorflow.python.keras is a private module.
from tensorflow.keras import utils
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, GlobalMaxPooling2D
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

# Keras Dense Network (with Dropout)

In [2]:
train_data = pd.read_csv('data/train.csv')
X = train_data.drop("label", axis=1)
y = train_data['label']

In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [4]:
X_train = X_train / 255.0
X_test = X_test / 255.0

# One-hot encode y
from keras.utils import to_categorical

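# Fix the width at 10 columns so both splits share the same encoding
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)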

In [5]:
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(784,)))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
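
To make the output layer and loss above more concrete, here is a small NumPy sketch of softmax followed by categorical cross-entropy. It is an illustration only, not Keras's internal implementation: the three-class `logits`, the helper names, and the `eps` clipping value are assumptions chosen for the example.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean over samples of -sum(y_true * log(y_prob)), with clipping to avoid log(0).
    clipped = np.clip(y_prob, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(clipped), axis=1)))

# Toy example with 3 classes instead of 10 to keep the numbers readable.
logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1), loss)
```

Every row of `probs` sums to 1, which is why the softmax outputs can be read as class probabilities and compared with one-hot targets.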



In [6]:
history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(X_test, y_test))

Epoch 1/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.5807 - loss: 1.2488 - val_accuracy: 0.9201 - val_loss: 0.2666
Epoch 2/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8788 - loss: 0.4142 - val_accuracy: 0.9398 - val_loss: 0.2037
Epoch 3/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 944us/step - accuracy: 0.9074 - loss: 0.3247 - val_accuracy: 0.9490 - val_loss: 0.1774
Epoch 4/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 0s 937us/step - accuracy: 0.9209 - loss: 0.2793 - val_accuracy: 0.9564 - val_loss: 0.1538
Epoch 5/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 974us/step - accuracy: 0.9305 - loss: 0.2473 - val_accuracy: 0.9560 - val_loss: 0.1472
Epoch 6/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 0s 924us/step - accuracy: 0.9345 - loss: 0.2314 - val_accuracy: 0.9612 - val_loss: 0.1355
Epoch 7/10
525/5

In [7]:
test_data = pd.read_csv('data/test.csv')

In [8]:
test_data = test_data / 255.0
predictions = model.predict(test_data)

875/875 ━━━━━━━━━━━━━━━━━━━━ 0s 271us/step


In [9]:
import numpy as np

# Convert predicted probabilities to class labels
predictions = np.argmax(predictions, axis=1)

# Create a range of ImageIds starting from 1
image_ids = range(1, len(predictions) + 1)

# Create a DataFrame with ImageId and Label columns
predictions_df = pd.DataFrame({'ImageId': image_ids, 'Label': predictions})

# Save the DataFrame to a CSV file
predictions_df.to_csv('predictions.csv', index=False)
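
Before uploading, it can help to sanity-check the file. The sketch below uses only the standard library and assumes the submission format implied by the cell above: an `ImageId,Label` header, ids counting up from 1, and integer labels 0-9. The `check_submission` helper is hypothetical, not part of Kaggle's or pandas' API.

```python
import csv
import io

def check_submission(csv_text, expected_rows):
    """Return a list of problems found in a submission CSV (empty if none)."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ["ImageId", "Label"]:
        return [f"unexpected header: {reader.fieldnames}"]
    rows = list(reader)
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    valid_labels = {str(d) for d in range(10)}
    for i, row in enumerate(rows, start=1):
        if row["ImageId"] != str(i):
            problems.append(f"row {i}: ImageId is {row['ImageId']!r}")
        if row["Label"] not in valid_labels:
            problems.append(f"row {i}: Label is {row['Label']!r}")
    return problems

# Toy inputs: the second one contains a probability instead of a digit label.
good = "ImageId,Label\n1,2\n2,0\n3,9\n"
bad = "ImageId,Label\n1,2\n2,0.93\n3,9\n"

print(check_submission(good, expected_rows=3))
print(check_submission(bad, expected_rows=3))
```

For the real file, the text could be read with `open('predictions.csv').read()` and `expected_rows` set to the number of rows in `test.csv`.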

-------
## Keras with Regularisation

In [10]:
from keras import regularizers

In [11]:
model = Sequential()
model.add(Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
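
As a rough illustration of what `regularizers.l2(0.01)` contributes, the sketch below computes an L2 penalty as the factor times the sum of squared weights and adds it to a data loss. The weight matrices and the `data_loss` value are hypothetical; only the factor 0.01 and the layer shapes come from the model above.

```python
import numpy as np

def l2_penalty(weights, factor=0.01):
    # L2 penalty: factor * sum of squared weights.
    return factor * float(np.sum(np.square(weights)))

# Hypothetical weight matrices with the same shapes as the two regularized layers.
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.05, size=(784, 128))
W2 = rng.normal(0.0, 0.05, size=(128, 64))

data_loss = 0.25  # hypothetical cross-entropy value, for illustration only
total_loss = data_loss + l2_penalty(W1) + l2_penalty(W2)
print(total_loss)
```

Because the penalty is added to the reported loss, loss values from a regularized model are likely to be higher than those from an unregularized one even at similar accuracy, so the two runs are better compared on accuracy.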


In [12]:
history1 = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(X_test, y_test))

Epoch 1/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.5716 - loss: 2.5565 - val_accuracy: 0.9057 - val_loss: 0.7432
Epoch 2/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8434 - loss: 0.8989 - val_accuracy: 0.9076 - val_loss: 0.6262
Epoch 3/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8610 - loss: 0.7790 - val_accuracy: 0.9192 - val_loss: 0.5823
Epoch 4/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8742 - loss: 0.7330 - val_accuracy: 0.9225 - val_loss: 0.5477
Epoch 5/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8723 - loss: 0.7245 - val_accuracy: 0.9200 - val_loss: 0.5339
Epoch 6/10
525/525 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8770 - loss: 0.6940 - val_accuracy: 0.9217 - val_loss: 0.5252
Epoch 7/10
104/525

In [None]:
test_data1 = pd.read_csv('data/test.csv')
test_data1 = test_data1 / 255.0
predictions1 = model.predict(test_data1)

875/875 ━━━━━━━━━━━━━━━━━━━━ 0s 275us/step


In [None]:
predictions1

array([[4.0207444e-05, 1.8835470e-05, 9.9724251e-01, ..., 6.4949505e-04,
        9.3971277e-05, 1.8326566e-06],
       [9.9541724e-01, 2.3616117e-07, 4.9268408e-04, ..., 2.6988040e-04,
        6.3026186e-05, 6.9658920e-05],
       [2.0837171e-03, 3.0334014e-03, 2.9302191e-03, ..., 1.5779218e-02,
        6.1544828e-02, 7.0317346e-01],
       ...,
       [8.5869160e-06, 6.1601029e-05, 8.5710851e-04, ..., 4.7521575e-05,
        1.3381143e-03, 1.7420357e-04],
       [1.8585748e-04, 2.3497243e-05, 1.2839003e-04, ..., 4.8515368e-03,
        3.6049779e-03, 9.5459437e-01],
       [6.9241528e-04, 3.6762343e-04, 9.8437357e-01, ..., 2.7369447e-03,
        1.6999011e-03, 7.9831414e-05]], dtype=float32)

In [None]:
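# predictions1 is (n_images, 10) probabilities; DataFrame columns must be 1-D,
# so reduce each row to its most likely digit first (np was imported above).
labels1 = np.argmax(predictions1, axis=1)
ImageID = range(1, len(labels1) + 1)
predictions1_df = pd.DataFrame({'ImageId': ImageID, 'Label': labels1})
predictions1_df.to_csv('predictions1.csv', index=False)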

In [None]:
import numpy as np

# Convert predicted probabilities to class labels.
# Keep predictions1 intact and store the 1-D labels under a new name.
predicted_labels1 = np.argmax(predictions1, axis=1)

# Create a range of ImageIds starting from 1
image_ids = range(1, len(predicted_labels1) + 1)

# Create a DataFrame with ImageId and Label columns
predictions1_df = pd.DataFrame({'ImageId': image_ids, 'Label': predicted_labels1})

# Save the DataFrame to a CSV file
predictions1_df.to_csv('predictions1.csv', index=False)


In [None]:
df = pd.read_csv('predictions1.csv')

In [None]:
print(predictions1.shape)

(28000, 10)
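
The shape above shows one row of 10 class probabilities per image, while a submission needs one digit per image. The toy sketch below shows how `np.argmax` along `axis=1` collapses such a matrix into a 1-D label vector; `probs` is a hypothetical 4-image stand-in for `predictions1`.

```python
import numpy as np

# Toy stand-in for predictions1: 4 images, 10 class probabilities each.
probs = np.full((4, 10), 0.01)
probs[0, 2] = probs[1, 0] = probs[2, 9] = probs[3, 7] = 0.91

# Pick the index of the largest probability in every row.
labels = np.argmax(probs, axis=1)

print(probs.shape)   # (4, 10)
print(labels.shape)  # (4,)
print(labels)        # [2 0 9 7]
```

A 2-D array like `probs` cannot be used directly as a single DataFrame column, whereas the 1-D `labels` vector can.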
