**Name:** Cristina Caro González


**Exercise:**

Develop and train a feedforward artificial neural network in Keras for the following classification problem. The target class is called "y" and is the first column in both CSV files.

If the dataset is too large for your machine to train on in a reasonable time, you may pick a subset, as we saw in the notebook.

You may upload either a .py or a Jupyter Notebook.

I am going to try several models, improving them step by step with techniques we have seen in class for improving deep learning models, such as regularization and dropout.
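The exercise allows training on a subset when the full dataset is too slow. A minimal sketch of drawing a reproducible random subset of row positions is given here. The function name `sample_row_positions`, the row count and the fraction are hypothetical and are not taken from the dataset.

```python
import random

def sample_row_positions(n_rows, fraction, seed=42):
    """Return sorted, unique row positions for a random subset of the rows."""
    rng = random.Random(seed)                 # fixed seed so the subset is reproducible
    n_keep = max(1, int(n_rows * fraction))   # always keep at least one row
    return sorted(rng.sample(range(n_rows), n_keep))

# Hypothetical example: keep 25% of a 1,000-row table.
positions = sample_row_positions(n_rows=1000, fraction=0.25)
print(len(positions))
```

With a DataFrame such as `train_df`, the selected positions could then be passed to `train_df.iloc[positions]` before splitting into features and target.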

In [1]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Load the train and test dataset
train_df = pd.read_csv("w4_train_data.csv") ## We read the training dataset of our exercise
test_df = pd.read_csv("w4_test_data.csv") ## We read the testing dataset of our exercise

# test_df is the test dataset used to evaluate the final performance of the model. It contains the input features and the target variable.
# It is used to measure the accuracy of the model after it has been trained on the training dataset.

# Split the train dataset into input features (X) and target variable (y)
X = train_df.iloc[:, 1:] # all columns except the first one, because the first one is our target variable.
y = train_df.iloc[:, 0] # first column, i.e, our target variable.

# Convert the target variable to categorical
y = to_categorical(y, num_classes=10) # We have 10 categories, because the target value goes from 0 to 9, i.e. 10 different values.

# Split the train data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
# In deep learning it is very important to use separate training and validation sets. After training the model here, we will evaluate it on the test dataset.
# I use a validation size of 20% and a seed of 42 so that the split can be replicated.

# X_val and y_val are the validation data, a subset of the training data used to tune the hyperparameters of the model. The validation data is used to
# monitor the performance of the model during training and to detect overfitting, which, as we saw in class, is a very common problem in deep learning.

# Define the model architecture

model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) # I select the categorical cross-entropy loss function, the Adam optimizer and accuracy as the metric.

# Train the model on the training data
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val)) # To start, we create a basic model with only 10 epochs

# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(test_df.iloc[:, 1:], to_categorical(test_df.iloc[:, 0], num_classes=10)) # We test the results of the first model
print('Test Loss:', test_loss)
print('Test Accuracy:', test_accuracy)

# The test accuracy printed below (about 0.957) is a reasonable start but leaves room for improvement, so we will try to improve the model with other techniques such as regularization, dropout and more epochs.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Loss: 0.1903935819864273
Test Accuracy: 0.9569000005722046


With this first model, the test accuracy is about 95.7%, which leaves room for improvement. As we learnt in class, we can use different techniques to improve our model.
Some of them are: increasing the number of epochs, using regularization techniques, increasing the size of the model by adding more layers or more neurons, trying different activation functions, and normalizing or standardizing the inputs, which is very important in deep learning.
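As an illustration of the normalization mentioned above, here is a minimal sketch that scales the inputs into the range [0, 1]. It assumes the 784 feature columns are pixel intensities between 0 and 255. This is an assumption, since the value range is not stated in the exercise, and `scale_to_unit_range` is a hypothetical helper name.

```python
import numpy as np

def scale_to_unit_range(features, max_value=255.0):
    """Scale raw feature values into [0, 1] by dividing by an assumed maximum."""
    return np.asarray(features, dtype="float32") / max_value

# Hypothetical 2-sample batch with 4 features each (a stand-in for the 784 columns).
raw = np.array([[0, 64, 128, 255], [255, 0, 32, 16]])
scaled = scale_to_unit_range(raw)
print(scaled.min(), scaled.max())
```

If this scaling is used, the same transformation has to be applied to the training, validation and test features so that the model sees inputs on the same scale everywhere.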

In this second model, we are going to use L2 regularization.

Additionally, we increase the number of epochs, from 10 to 40.

The rest of the model is basically the same as the previous one.

In this case, I do not repeat the comments from the previous model, as they would be the same.
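To make the effect of L2 regularization concrete: for each regularized layer, a penalty equal to the regularization factor times the sum of the squared weights is added to the loss. The sketch below computes that penalty in plain Python for a small, hypothetical weight matrix; the values in `kernel` are made up for illustration.

```python
def l2_penalty(weights, factor=0.001):
    """Compute factor * (sum of squared weights) for a 2-D list of weights."""
    return factor * sum(w * w for row in weights for w in row)

# Hypothetical 2x3 kernel matrix (not taken from the trained model).
kernel = [[0.5, -1.0, 2.0], [0.0, 3.0, -0.5]]
penalty = l2_penalty(kernel, factor=0.001)
print(penalty)
```

Because large weights are squared, they dominate the penalty, so the optimizer is pushed towards smaller weights. A larger factor strengthens this pressure and can lead to underfitting.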

In [2]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.regularizers import l2 # Used to apply L2 regularization

# Load the train and test datasets
train_df = pd.read_csv("w4_train_data.csv")
test_df = pd.read_csv("w4_test_data.csv")

# Split the train dataset into input features (X) and target variable (y)
X = train_df.iloc[:, 1:] # all columns except the first one
y = train_df.iloc[:, 0] # first column

# Convert the target variable to categorical
y = to_categorical(y, num_classes=10)

# Split the train data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the model architecture
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,), kernel_regularizer=l2(0.001)))
model.add(Dense(32, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(10, activation='softmax'))

# We have to be careful with the regularization parameter: if the L2 weight is too large, the model will underfit the training data and give worse results.
# In that case, we should reduce the L2 regularization weight. Here I have chosen 0.001.

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model on the training data
history = model.fit(X_train, y_train, epochs=40, batch_size=32, validation_data=(X_val, y_val))

# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(test_df.iloc[:, 1:], to_categorical(test_df.iloc[:, 0], num_classes=10))
print('Test Loss:', test_loss)
print('Test Accuracy:', test_accuracy)


Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40
Test Loss: 0.20809772610664368
Test Accuracy: 0.9602000117301941


The results of the second model improve slightly with the changes we made: test accuracy rises from 0.9569 to 0.9602. The test loss is slightly higher (0.2081 versus 0.1904), but the two losses are not directly comparable, since with L2 regularization the loss reported by Keras also includes the penalty term.

Let's try a third model with more changes. We keep L2 regularization with a parameter of 0.001, add a new layer, and increase the number of neurons. We also increase the number of epochs to 50 and the batch size from 32 to 64. The batch size determines the number of samples used to update the weights of the network in one iteration.
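The relationship between batch size and the number of weight updates can be sketched as follows. The training-set size used here is hypothetical, because the real number of rows depends on the CSV file and on the 80/20 split.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one epoch (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

# Hypothetical training-set size, chosen only for illustration.
n_train = 48_000
for batch_size in (32, 64):
    print(batch_size, updates_per_epoch(n_train, batch_size))
```

Doubling the batch size halves the number of updates per epoch, which is one reason to train for more epochs when the batch size is increased.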

In [3]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.regularizers import l2
from keras.optimizers import Adam

# Load the train and test datasets
train_df = pd.read_csv("w4_train_data.csv")
test_df = pd.read_csv("w4_test_data.csv")

# Split the train dataset into input features (X) and target variable (y)
X = train_df.iloc[:, 1:] # all columns except the first one
y = train_df.iloc[:, 0] # first column

# Convert the target variable to categorical
y = to_categorical(y, num_classes=10)

# Split the train data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the model architecture
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(784,), kernel_regularizer=l2(0.001))) # We keep the L2 regularization and use 128 neurons in the first layer.
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(32, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(10, activation='softmax'))

# Compile the model
optimizer = Adam(learning_rate=0.001) # 'lr' is a deprecated alias; 'learning_rate' is the supported argument name
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

# Train the model on the training data
history = model.fit(X_train, y_train, epochs=50, batch_size=64, validation_data=(X_val, y_val)) # We set the number of epochs to 50 and the batch size to 64

# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(test_df.iloc[:, 1:], to_categorical(test_df.iloc[:, 0], num_classes=10))




Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [4]:
print('Test Loss:', test_loss)
print('Test Accuracy:', test_accuracy)
# In this last model the results improve further: test accuracy rises from 0.9602 to 0.9712.

Test Loss: 0.1654212474822998
Test Accuracy: 0.9711999893188477
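To compare the three runs on a common scale, the reported test accuracies can be converted to error rates. The short sketch below uses only the accuracies printed above; the model labels are descriptive names chosen here, and each figure comes from a single training run, so part of the difference may be run-to-run variation.

```python
# Test accuracies reported by the three runs above.
accuracies = {
    "model 1 (baseline)": 0.9569000005722046,
    "model 2 (L2, 40 epochs)": 0.9602000117301941,
    "model 3 (larger, L2, 50 epochs)": 0.9711999893188477,
}

def error_rate(accuracy):
    """Fraction of misclassified test samples."""
    return 1.0 - accuracy

baseline_error = error_rate(accuracies["model 1 (baseline)"])
for name, acc in accuracies.items():
    err = error_rate(acc)
    relative_drop = (baseline_error - err) / baseline_error
    print(f"{name}: error {err:.2%}, {relative_drop:.0%} fewer errors than baseline")
```

Seen this way, the third model makes roughly a third fewer test errors than the first one, even though the accuracy gain looks small in absolute terms.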
