<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Lab: Fun with Neural Nets

---

Below is a procedure for building a neural network to recognize handwritten digits.  The data is from [Kaggle](https://www.kaggle.com/c/digit-recognizer/data), and you will submit your results to Kaggle to test how well you did!

1. Load the training data (`train.csv`) from [Kaggle](https://www.kaggle.com/c/digit-recognizer/data)
2. Set up X and y (feature matrix and target vector).
3. Split X and y into train and test subsets.
4. Preprocess your data:

   - When dealing with image data, you need to normalize your `X` by dividing each value by the max value of a pixel (255).
   - Since this is a multiclass classification problem, Keras needs `y` to be a one-hot encoded matrix when you use the `categorical_crossentropy` loss.
   
5. Create your network:
   - Remember that for multi-class classification you need a softmax activation function on the output layer.
   - You may want to consider using regularization or dropout to improve performance.
   
6. Train your network.
7. If you are unhappy with your model performance, try to improve it by adding hidden layers, adding units to the hidden layers, changing the activation functions on the hidden layers, etc.
8. Load in [Kaggle's](https://www.kaggle.com/c/digit-recognizer/data) `test.csv`.
9. Create your predictions (these should be numbers in the range 0-9).
10. Save your predictions and submit them to Kaggle.
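
Steps 5 and 9 rely on the softmax output layer producing one probability per digit class, with the prediction taken as the most probable class. The sketch below illustrates that idea in plain NumPy; the `softmax` helper and the `logits` values are made up for illustration and are not part of the lab code.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical raw scores for two images over the ten digit classes
logits = np.array([
    [0.1, 0.2, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [2.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3],
])

probs = softmax(logits)          # each row is a probability distribution
digits = probs.argmax(axis=1)    # most probable class per image
```

Each row of `probs` sums to 1, which is what lets the output be read as class probabilities.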

---

For this lab, you should complete the above sequence of steps for **_at least_** two of the four **"configurations"**:

1. Using a `tensorflow` network
2. Using a `keras` convolutional network
3. Using a `keras` network with regularization
4. Using a `tensorflow` convolutional network (we did _not_ cover this in class!)
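
For the convolutional configurations, the flat 784-pixel rows have to be turned back into images first. This is a minimal sketch of that reshape, assuming 28x28 grayscale digits (784 = 28 * 28) and the channels-last layout that Keras `Conv2D` uses by default; `X_flat` is a random stand-in, not the lab data.

```python
import numpy as np

# Stand-in for the normalized feature matrix: 5 images, 784 pixels each
X_flat = np.random.default_rng(0).random((5, 784))

# Conv layers expect (samples, height, width, channels); one channel for grayscale
X_img = X_flat.reshape(-1, 28, 28, 1)
```

The reshape only changes how the pixels are indexed, so the same scaling by 255 applies before or after it.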

In [4]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout

In [5]:
# Read data
train_df = pd.read_csv('digit-recognizer/train.csv')
test_df = pd.read_csv('digit-recognizer/test.csv')

In [6]:
# List training columns that contain missing values (an empty Series means none)
train_df.isnull().sum()[train_df.isnull().sum()!=0]

Series([], dtype: int64)

In [7]:
# List test columns that contain missing values (an empty Series means none)
test_df.isnull().sum()[test_df.isnull().sum() != 0]

Series([], dtype: int64)

In [8]:
# define features and target
X = train_df.drop(columns=['label'])
y = train_df['label']

In [9]:
# One-hot encode y
onehot = OneHotEncoder(sparse_output=False)
y_onehot = onehot.fit_transform(y.values.reshape(-1,1))
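
To make the encoding concrete, here is a small NumPy sketch of what a one-hot matrix looks like and how `argmax` reverses it. The `labels` array is hypothetical, and `np.eye(10)[labels]` is just one way to build the matrix; the notebook itself uses `OneHotEncoder`.

```python
import numpy as np

labels = np.array([3, 0, 9, 3])       # hypothetical digit labels
one_hot = np.eye(10)[labels]          # row i is all zeros except a 1 at column labels[i]
recovered = one_hot.argmax(axis=1)    # argmax undoes the encoding
```

This is why `np.argmax(..., axis=1)` is the natural way to turn the network's 10-column output back into digits.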

In [10]:
# Normalize pixel values
X = X / 255.0

In [11]:
# Split into training and hold-out sets (default test_size=0.25), stratified by label
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, random_state=42, stratify=y_onehot)

In [12]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((31500, 784), (10500, 784), (31500, 10), (10500, 10))

### Using a TensorFlow Network

In [61]:
# Define the model: two hidden ReLU layers and a 10-way softmax output
model = Sequential([Input(shape=(X_train.shape[1],)),
                   Dense(256, activation='relu'), 
                   Dense(128, activation='relu'),
                   Dense(10, activation='softmax')
                   ])

# compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy']
             )

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)  # batch_size=32 is the Keras default

Epoch 1/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8484 - loss: 0.5209 - val_accuracy: 0.9511 - val_loss: 0.1611
Epoch 2/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9606 - loss: 0.1285 - val_accuracy: 0.9624 - val_loss: 0.1166
Epoch 3/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.9758 - loss: 0.0792 - val_accuracy: 0.9641 - val_loss: 0.1172
Epoch 4/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9835 - loss: 0.0541 - val_accuracy: 0.9686 - val_loss: 0.1085
Epoch 5/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9901 - loss: 0.0348 - val_accuracy: 0.9683 - val_loss: 0.1107
Epoch 6/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9914 - loss: 0.0285 - val_accuracy: 0.9692 - val_loss: 0.1131
Epoch 7/10
788/788

<keras.src.callbacks.history.History at 0x17c3dca40>
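
The `788/788` in the log is the number of batches per epoch. The arithmetic below reproduces it from the shapes printed earlier, assuming `validation_split=0.2` holds out 20% of the 31,500 training rows and that a final partial batch counts as a step; the exact rounding Keras applies internally is not shown in this notebook.

```python
import math

n_train_rows = 31500      # rows in X_train, from the shapes printed earlier
validation_split = 0.2
batch_size = 32

# Rows held out for validation; the remainder is used for gradient updates
n_val = int(n_train_rows * validation_split)
n_fit = n_train_rows - n_val

# The last, partially filled batch still counts as a step
steps_per_epoch = math.ceil(n_fit / batch_size)
print(n_fit, n_val, steps_per_epoch)
```

The same reasoning explains the `875/875` shown during prediction: 28,000 test rows divided into batches of 32.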

In [63]:
# Make predictions on Kaggle's test set (this reuses the name X_test from the earlier split)
X_test = test_df.values / 255.0
preds = model.predict(X_test)

875/875 ━━━━━━━━━━━━━━━━━━━━ 0s 429us/step


In [65]:
# Take the most probable class for each image with np.argmax()
prediction = np.argmax(preds, axis=1)
prediction

array([2, 0, 9, ..., 3, 9, 2])

In [67]:
X_test.shape, y_test.shape

((28000, 784), (10500, 10))

In [71]:
# Build the submission DataFrame (ImageId starts at 1) and write it to CSV
sub1 = pd.DataFrame({'ImageId': range(1, len(prediction) + 1), 'Label': prediction})
sub1.to_csv('Results/tensorflow-network.csv', index=False)
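
Before uploading, it can help to sanity-check the submission columns. The helper below is a hypothetical sketch (`check_submission` is not part of the notebook or of any library); it only assumes the `ImageId`/`Label` layout used above, with IDs running from 1 to n and labels in the range 0-9.

```python
import numpy as np

def check_submission(image_ids, labels, n_expected):
    """Return a list of problems found; an empty list means the basic checks pass."""
    problems = []
    if len(image_ids) != n_expected or len(labels) != n_expected:
        problems.append("wrong number of rows")
    if not np.array_equal(image_ids, np.arange(1, len(image_ids) + 1)):
        problems.append("ImageId is not 1..n")
    if not np.isin(labels, np.arange(10)).all():
        problems.append("Label outside 0-9")
    return problems

labels = np.array([2, 0, 9, 3])                  # hypothetical predictions
ids = np.arange(1, len(labels) + 1)
ok = check_submission(ids, labels, n_expected=4)
bad = check_submission(ids, np.array([2, 0, 10, 3]), n_expected=4)
```

For the real file, `n_expected` would be the 28,000 rows of `test.csv`, and the arrays would come from the `ImageId` and `Label` columns.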

### Using a Keras Network with Regularization

In [45]:
# Build the network with regularization
model_reg = Sequential([Input(shape=(X_train.shape[1],)),
    Dense(128, activation='relu'),
    Dropout(0.3),  # Dropout regularization
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(10, activation='softmax')
])

# Compile the model
model_reg.compile(optimizer='adam', 
                  loss='categorical_crossentropy', 
                  metrics=['accuracy'])

# Train the model
model_reg.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)


Epoch 1/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.7073 - loss: 0.9186 - val_accuracy: 0.9375 - val_loss: 0.2110
Epoch 2/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9180 - loss: 0.2741 - val_accuracy: 0.9475 - val_loss: 0.1753
Epoch 3/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.9349 - loss: 0.2112 - val_accuracy: 0.9546 - val_loss: 0.1520
Epoch 4/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.9469 - loss: 0.1787 - val_accuracy: 0.9589 - val_loss: 0.1359
Epoch 5/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.9537 - loss: 0.1504 - val_accuracy: 0.9643 - val_loss: 0.1218
Epoch 6/10
788/788 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.9590 - loss: 0.1340 - val_accuracy: 0.9625 - val_loss: 0.1302
Epoch 7/10
788/788

<keras.src.callbacks.history.History at 0x177e3b4d0>
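
The `Dropout(0.3)` layers above randomly zero about 30% of the incoming activations during training and scale the survivors by 1 / (1 - 0.3) so the expected activation is unchanged. The NumPy sketch below illustrates that "inverted dropout" idea; `dropout_train` and the all-ones `acts` array are illustrative assumptions, not Keras internals.

```python
import numpy as np

def dropout_train(activations, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of units, rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
acts = np.ones((1000, 128))               # hypothetical hidden-layer activations
dropped = dropout_train(acts, rate=0.3, rng=rng)

zero_fraction = np.mean(dropped == 0)     # close to 0.3
mean_activation = dropped.mean()          # close to 1.0, the original mean
```

At prediction time dropout is switched off, which is one reason the training accuracy in the log can sit below the validation accuracy for this model.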

In [73]:
# Make predictions on Kaggle's test set with the regularized model
X_test = test_df.values / 255.0
preds = model_reg.predict(X_test)

875/875 ━━━━━━━━━━━━━━━━━━━━ 0s 320us/step


In [77]:
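# Convert the regularized model's probabilities to labels, then build the submission
prediction_reg = np.argmax(preds, axis=1)
sub2 = pd.DataFrame({'ImageId': range(1, len(prediction_reg) + 1), 'Label': prediction_reg})
sub2.to_csv('Results/Network_with_Regularization.csv', index=False)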

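Both submissions scored 0.96471 on Kaggle. A score that matches to five decimal places usually means the two files contain the same labels, so confirm that the second file is built from `model_reg`'s own predictions before comparing the models.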