# Final Project
### Course: Introduction to Deep Learning & Neural Networks with Keras

### Part D

D. Increase the number of hidden layers (5 marks)

Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

In [1]:
# basic data manipulation
import pandas as pd
import numpy as np

# scikit-learn packages
from sklearn.model_selection import train_test_split # separate train data and test data
from sklearn import metrics # to evaluate the model
from sklearn.preprocessing import StandardScaler

# keras packages
from keras.models import Sequential # basic model
from keras.layers import Dense # layers

#### Load dataset

In [2]:
df = pd.read_csv("concrete_data.csv")
df.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


#### Standardize columns

In [3]:
# Normalize column names: replace spaces with underscores and lowercase them
df.columns = df.columns.str.replace(" ", "_").str.lower()

df.head()

Unnamed: 0,cement,blast_furnace_slag,fly_ash,water,superplasticizer,coarse_aggregate,fine_aggregate,age,strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


### Separate features and target

We will predict the `strength` column and use all the other columns as features.

In [4]:
X = df.drop(columns=["strength"])
Y = df["strength"]
X.shape,Y.shape

((1030, 8), (1030,))

In [5]:
n_features = X.shape[1]
X = StandardScaler().fit_transform(X)
X

array([[ 2.47791487, -0.85688789, -0.84714393, ...,  0.86315424,
        -1.21767004, -0.27973311],
       [ 2.47791487, -0.85688789, -0.84714393, ...,  1.05616419,
        -1.21767004, -0.27973311],
       [ 0.49142531,  0.79552649, -0.84714393, ..., -0.52651741,
        -2.24091709,  3.55306569],
       ...,
       [-1.27008832,  0.75957923,  0.85063487, ..., -1.03606368,
         0.0801067 , -0.27973311],
       [-1.16860982,  1.30806485, -0.84714393, ...,  0.21464081,
         0.19116644, -0.27973311],
       [-0.19403325,  0.30849909,  0.3769452 , ..., -1.39506219,
        -0.15074782, -0.27973311]])
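
For reference, `StandardScaler` centers each column on its mean and divides by its population standard deviation (`ddof=0`). The sketch below reproduces that transformation in plain Python. The helper name `standardize` is hypothetical, and the sample data are only the first five `cement` values from the preview above, so the resulting z-scores will not match the array printed for the full 1030 rows.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # population std (ddof=0), as StandardScaler uses
    return [(v - mu) / sigma for v in values]


# First five cement values from df.head(); illustration only
cement = [540.0, 540.0, 332.5, 332.5, 198.6]
scaled = standardize(cement)
print([round(z, 3) for z in scaled])
```

After the transformation, the sample has mean 0 and population standard deviation 1, which is what keeps features with large raw ranges (such as `coarse_aggregate`) from dominating the early training steps.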

### Define the model function

In [6]:
def neural_network_D(n_cols):
    """
    Return a compiled neural network with the following specifications:
        - Three hidden layers of 10 nodes each (ReLU activation function)
        - One output node with no activation (regression)
        - Adam optimizer and MSE as the loss function
    """
    model = Sequential()
    model.add(Dense(10, activation="relu", input_shape=(n_cols,)))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(1))
    
    model.compile(optimizer="adam", loss='mean_squared_error')
    
    return model
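
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand: each `Dense` layer with its default bias has `n_in * n_out` weights plus `n_out` biases. The sketch below does that arithmetic; the helper name `dense_param_count` is hypothetical and is not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the width of each layer, inputs first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


# 8 input features, three hidden layers of 10 nodes, 1 output node
print(dense_param_count([8, 10, 10, 10, 1]))  # 321
```

The result should agree with the total reported by `neural_network_D(8).summary()`, assuming the layers keep their default biases.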

## Fit & Predict

In this part, we run a loop of 50 iterations. Each iteration creates a neural network, fits it, predicts on the test set, and evaluates the predictions. The resulting metric (MSE) is saved in a list.

We will use 30% of the data for testing and the other 70% for training.
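
The metric collected in each iteration is the mean squared error, the average of the squared differences between the true and predicted values. A minimal pure-Python sketch of that formula follows. The true values are the first three `strength` entries from the preview above, and the predictions are hypothetical numbers chosen only for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [79.99, 61.89, 40.27]  # first three strength values from the preview
y_pred = [75.0, 65.0, 40.0]     # hypothetical predictions, not model output
mse = mean_squared_error(y_true, y_pred)
print(round(mse, 4))
```

Because the errors are squared, the MSE is expressed in squared units of `strength`, and a few large misses weigh far more than many small ones.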

In [7]:
mean_squared_error_list = []
for _ in range(50):  # 50 repetitions
    # Hold out 30% of the rows for testing
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=.3, random_state=42)

    # Build a fresh, untrained network and fit it for 100 epochs
    D = neural_network_D(n_features)
    D.fit(X_train, y_train, epochs=100, verbose=0)

    # Evaluate on the held-out data and store the MSE
    prediction = D.predict(X_test)
    mean_squared_error_list.append(metrics.mean_squared_error(y_true=y_test, y_pred=prediction))
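
Note that `random_state=42` is fixed inside the loop, so every iteration trains and tests on the same partition. The spread in the MSE values therefore comes from weight initialization and training, not from the split. The sketch below illustrates the effect of a fixed seed with the standard library only. `split_indices` is a hypothetical stand-in and is not how `train_test_split` is implemented.

```python
import random


def split_indices(n_rows, test_fraction, seed):
    """Shuffle row indices with a seeded generator and cut off a test set."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


# Same seed on every call: identical partitions
_, test_a = split_indices(1030, 0.3, seed=42)
_, test_b = split_indices(1030, 0.3, seed=42)
print(test_a == test_b)  # True

# A different seed gives a different partition
_, test_c = split_indices(1030, 0.3, seed=0)
print(test_a == test_c)  # False
```

If the assignment intends a new random split on each repetition, one option would be to drop `random_state` or pass a value that changes per iteration. That is an assumption about the assignment's intent, and it would change the reported numbers.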

## Mean of the errors

In [8]:
print("Mean of the 50 MSE values:", np.mean(mean_squared_error_list))
print("Standard deviation of the 50 MSE values:", np.std(mean_squared_error_list))

Mean of the 50 MSE values: 87.37144562587042
Standard deviation of the 50 MSE values: 23.260415355429704
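
One detail worth keeping in mind when comparing against Part B: `np.std` defaults to `ddof=0`, the population standard deviation. With only 50 runs, the sample standard deviation (`ddof=1`) is slightly larger. The sketch below shows the difference with the standard library; the error values are hypothetical and are not results from this notebook.

```python
from statistics import fmean, pstdev, stdev

# Hypothetical MSE values for illustration only
errors = [80.0, 95.0, 70.0, 110.0, 85.0]

print("mean:", fmean(errors))
print("population std (like np.std default):", round(pstdev(errors), 3))
print("sample std (like np.std with ddof=1):", round(stdev(errors), 3))
```

Whichever definition is used, the same one should be applied to the Part B results so that the two spreads are comparable.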
