# Trying Out Perceptrons

Earlier in this project, we tested different encoding and imputation schemes using a much simpler regressor. Here we apply the best of those methods to a multilayer perceptron to see whether it gives even better results.

In [1]:
# Before importing anything else, add the folder containing the project source code to sys.path
import sys
import os

# Build the path with os.path.join so it works on any OS, and check the same path that gets appended
module_path = os.path.join(os.path.abspath('..'), 'project_code')
if module_path not in sys.path:
    sys.path.append(module_path)

In [2]:
from pathlib import Path
import matplotlib.pyplot as plt
from missingdata import DataImputer
from exploration_helper_functions import *
import seaborn as sns
from encoding import DataEncoder
import tensorflow as tf

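# Load and prepare the data (load_data is expected to come from exploration_helper_functions)
data_path = Path(r'data/netflix_data.csv')
df = load_data(data_path)

# Impute missing values; the return value is discarded, so this relies on
# DataImputer.fit_transform modifying df in place
imputer = DataImputer()
imputer.fit_transform(df)

# Encode the dataframe into a feature matrix x and a target y
encoder = DataEncoder()
x, y = encoder.fit_transform(dataframe=df)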

## Trying Out Different Models
### Model 1

In [3]:
# Six Dense layers: five ReLU hidden layers and a single linear output for regression
model1 = tf.keras.models.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1),
])
model1.compile(optimizer='adam', loss='mse')
# x is converted to a dense array before fitting; the last 20% of rows are held out for validation
model1.fit(x.toarray(), y, epochs=30, validation_split=0.2)

[Training log: Epoch 1/30 through Epoch 30/30; per-epoch loss values not captured]


<tensorflow.python.keras.callbacks.History at 0x29e43afa4f0>

This model gives a training MSE of 1.26e-4 and a validation MSE of 5.2e-3, both of which are much lower than the loss from our previous XGBoost regressor.
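
For context on the validation figure: `validation_split=0.2` asks Keras to hold out a fraction of the rows passed to `fit`, and the Keras documentation states that these are the last rows of `x` and `y`, taken before shuffling. The sketch below illustrates that idea in plain Python; the helper name, the sample count of 1000, and the exact rounding are assumptions made for illustration, not values from this notebook.

```python
import math


def validation_split_indices(n_samples, validation_split):
    """Return (train_range, val_range) for a tail hold-out split.

    Hypothetical helper: the last `validation_split` fraction of the rows
    is held out, and nothing is shuffled beforehand.
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")
    # Assumed rounding: floor of the training share of the rows
    split_at = int(math.floor(n_samples * (1.0 - validation_split)))
    return range(0, split_at), range(split_at, n_samples)


# 1000 is a placeholder row count, not the size of the Netflix dataset
train_idx, val_idx = validation_split_indices(n_samples=1000, validation_split=0.2)
print(f"training rows:   {train_idx.start}..{train_idx.stop - 1}")
print(f"validation rows: {val_idx.start}..{val_idx.stop - 1}")
```

If the rows of the dataframe are ordered in some way (for example by release date), the held-out tail may not be representative of the rest, so shuffling the rows before calling `fit` may be worth considering.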

### Model 2
Using Softmax activation

In [5]:
# Same layout as Model 1, with softmax in every hidden layer; named model2 so Model 1 is not overwritten
model2 = tf.keras.models.Sequential([
    tf.keras.layers.Dense(16, activation='softmax'),
    tf.keras.layers.Dense(16, activation='softmax'),
    tf.keras.layers.Dense(8, activation='softmax'),
    tf.keras.layers.Dense(8, activation='softmax'),
    tf.keras.layers.Dense(4, activation='softmax'),
    tf.keras.layers.Dense(1),
])
model2.compile(optimizer='adam', loss='mse')
model2.fit(x.toarray(), y, epochs=60, validation_split=0.2)

[Training log: Epoch 1/60 through Epoch 60/60; per-epoch loss values not captured]


<tensorflow.python.keras.callbacks.History at 0x29ecd6326a0>

This model gives a training MSE of 1.88e-4 and a validation MSE of 5.3e-3, which is comparable to the previous model, although this time it took more epochs to converge.
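
Softmax is normally used on an output layer, so it is worth seeing what it does as a hidden activation. The sketch below, in plain Python with arbitrary input values chosen for illustration, shows that softmax forces a layer's outputs to be positive and to sum to 1, whereas ReLU passes positive values through unchanged. This coupling between units is one possible reason for the slower convergence, though the notebook does not test that explanation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract the max to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]


def relu(values):
    """Element-wise ReLU: negative inputs become 0, the rest are unchanged."""
    return [max(0.0, v) for v in values]


# Arbitrary pre-activation values for one hidden layer (illustrative only)
pre_activations = [2.0, -1.0, 0.5, 0.0]

softmax_out = softmax(pre_activations)
relu_out = relu(pre_activations)

print("softmax:", [round(v, 3) for v in softmax_out], "sum =", round(sum(softmax_out), 6))
print("relu:   ", relu_out)
```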

### Model 3
A deeper model

In [8]:
# Ten Dense layers, starting wider (64 units) and narrowing down to the single linear output;
# named model3 so the earlier models are not overwritten
model3 = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1),
])
model3.compile(optimizer='adam', loss='mse')
model3.fit(x.toarray(), y, epochs=40, validation_split=0.2)

[Training log: Epoch 1/40 through Epoch 40/40; per-epoch loss values not captured]


<tensorflow.python.keras.callbacks.History at 0x29ecc0675e0>

This model gives a training MSE of 7e-5 and a validation MSE of 5e-3, which is also very comparable to the previous two models (at least in terms of validation loss). So adding layers did not give us any performance advantage.
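
To put the size difference between Model 1 and Model 3 in numbers, the trainable parameters of a stack of `Dense` layers can be counted by hand: each layer has `inputs * units` weights plus `units` biases. The sketch below does this for both layer lists. The helper name is hypothetical, and the input width of 100 is a placeholder, since the number of encoded features is not stated here.

```python
def dense_param_count(input_dim, widths):
    """Count weights and biases in a stack of fully connected layers.

    `widths` lists the number of units per layer, in order.
    """
    total = 0
    fan_in = input_dim
    for units in widths:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units                   # this layer's output feeds the next one
    return total


# Layer widths copied from the Model 1 and Model 3 definitions
MODEL_1_WIDTHS = [16, 16, 8, 8, 4, 1]
MODEL_3_WIDTHS = [64, 64, 32, 32, 16, 16, 8, 8, 4, 1]

# Hypothetical feature count; the real value depends on the encoder output
ASSUMED_INPUT_DIM = 100

params_1 = dense_param_count(ASSUMED_INPUT_DIM, MODEL_1_WIDTHS)
params_3 = dense_param_count(ASSUMED_INPUT_DIM, MODEL_3_WIDTHS)
print(f"Model 1: {params_1} parameters")
print(f"Model 3: {params_3} parameters ({params_3 / params_1:.1f}x Model 1)")
```

Under that assumed input width, the deeper model carries several times as many parameters while its validation loss stays about the same, which is consistent with the observation above.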

### Model 4
Using tanh activation (the first hidden layer keeps ReLU)

In [9]:
# Same layout as Model 1; the first hidden layer uses ReLU and the remaining hidden layers use tanh.
# Named model4 so the earlier models are not overwritten
model4 = tf.keras.models.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='tanh'),
    tf.keras.layers.Dense(8, activation='tanh'),
    tf.keras.layers.Dense(8, activation='tanh'),
    tf.keras.layers.Dense(4, activation='tanh'),
    tf.keras.layers.Dense(1),
])
model4.compile(optimizer='adam', loss='mse')
model4.fit(x.toarray(), y, epochs=30, validation_split=0.2)

[Training log: Epoch 1/30 through Epoch 30/30; per-epoch loss values not captured]


<tensorflow.python.keras.callbacks.History at 0x29ecb452b80>

This model has a training loss of 1.4e-4 and a validation loss of 5.1e-3, which is also very comparable to the previous ones.
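
The training and validation losses quoted for each model can be read from the `History` object that `fit` returns: its `history` attribute is a dictionary with one list per metric (`loss` and, when a validation split is used, `val_loss`). The helper below is a minimal sketch that works on such a dictionary. The function name is hypothetical, and the numbers are placeholders, not results from this notebook.

```python
def summarize_history(history):
    """Summarize a dict shaped like Keras's `History.history`.

    Expects the keys "loss" and "val_loss", each holding one value per epoch.
    """
    train = history["loss"]
    val = history["val_loss"]
    # Index of the epoch with the lowest validation loss
    best_index = min(range(len(val)), key=val.__getitem__)
    return {
        "final_train_loss": train[-1],
        "final_val_loss": val[-1],
        "best_val_loss": val[best_index],
        "best_val_epoch": best_index + 1,  # epochs are reported starting from 1
    }


# Placeholder values for illustration only
toy_history = {
    "loss": [0.9, 0.4, 0.2, 0.1],
    "val_loss": [1.0, 0.5, 0.45, 0.6],
}

summary = summarize_history(toy_history)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With a real run, the same helper could be applied to the return value of `fit`, for example by keeping it in a variable and passing its `.history` attribute. Comparing the final validation loss with the best one gives a quick hint about whether training ran past its best epoch.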

## Conclusion
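Using a perceptron with 6 layers gives us better performance than the XGBoost algorithms we used previously. All four variants reach a validation MSE of about 5e-3, so neither the choice of activation function nor the extra depth made a noticeable difference.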
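
The four sets of results reported above can be placed side by side with a few lines of Python. The sketch below uses only the MSE values quoted in this notebook; the short model labels and the validation-to-training ratio column are additions made here for comparison.

```python
# (training MSE, validation MSE) as quoted in the text for each model
results = {
    "Model 1 (ReLU)": (1.26e-4, 5.2e-3),
    "Model 2 (softmax)": (1.88e-4, 5.3e-3),
    "Model 3 (deeper ReLU)": (7e-5, 5e-3),
    "Model 4 (tanh)": (1.4e-4, 5.1e-3),
}

# Sort by validation MSE, lowest first
ranked = sorted(results.items(), key=lambda item: item[1][1])

print(f"{'model':<24}{'train MSE':>12}{'val MSE':>12}{'val/train':>12}")
for name, (train_mse, val_mse) in ranked:
    ratio = val_mse / train_mse  # how much larger the validation loss is
    print(f"{name:<24}{train_mse:>12.2e}{val_mse:>12.2e}{ratio:>12.1f}")

# Spread between the best and worst validation MSE
val_values = [val for _, val in results.values()]
val_spread = max(val_values) - min(val_values)
print(f"validation MSE spread: {val_spread:.1e}")
```

The validation losses differ only in the fourth decimal place, while each validation loss is well over ten times its training loss, so the gap between training and validation may deserve a closer look than the differences between architectures.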