<div align=center><font size = 6>Build a Regression Model in Keras</font></dev>

<div><font size=3><strong>Table Of Contents</strong></font></div>

1. Some useful functions
2. Loading input corpus
3. Reviewing the loaded data
4. Normalizing input data
5. Splitting corpus into traning set and testing set
6. A - Experiment with a baseline model
7. B - Experiment with Normalized Data
8. C. Increate the number of epochs
9. D. Increase the number of hidden layers
10. Discussion

<strong>The dataset is about the compressive strength of different samples of concrete based on the volumes of the different ingredients that were used to make them. Ingredients include:</strong>

<strong>1. Cement</strong>

<strong>2. Blast Furnace Slag</strong>

<strong>3. Fly Ash</strong>

<strong>4. Water</strong>

<strong>5. Superplasticizer</strong>

<strong>6. Coarse Aggregate</strong>

<strong>7. Fine Aggregate</strong>

# Some useful functions

Let's start by importing the libraries, such as, os, pandas, numpy, scikit_learn, etc.

In [46]:
import os
import pandas as pd
import numpy as np

from keras.models import Sequential
from keras.layers import Dense

In [2]:
def get_mse(y_tst, y_tst_hat):
    """Residual sum of squares (MSE)
    """
    return np.mean(np.power(y_tst - y_tst_hat, 2))

def get_rmse(y_tst, y_tst_hat):
    """Root of Residual sum of squares (RMSE) or Standard deviation of residuals 
    """
    return np.sqrt(np.mean(np.power(y_tst - y_tst_hat, 2)))

def get_round(score, num_of_digits=2):
    """Get round with given number of decimal digits 
    """
    return round(score, num_of_digits)

def get_report_mse_and_rmse(y_test, y_hat, name_of_experiment):
    """Get report (dataframe) of two metrics: 
    The mean and the standard deviation of the mean squared errors
    """
    mse_baseline = get_mse(y_test, y_hat)
    mse_baseline_score = get_round(mse_baseline[COL_NAME_STRENGTH])

    rmse_baseline = get_rmse(y_test, y_hat)
    rmse_baseline_score = get_round(rmse_baseline[COL_NAME_STRENGTH])

    print("The mean and the standard deviation of the mean squared errors are: {} and {}, respectively".format(
          mse_baseline_score, rmse_baseline_score))

    values = [[name_of_experiment, mse_baseline_score, rmse_baseline_score]]

    df_result = pd.DataFrame(columns=header_of_df_mse_and_rmse, data=values)
    return df_result

# Loading input corpus

Let's assign the path of input corpus and the relative column names

In [3]:
file_input_path = "concrete_data.csv"

COL_NAME_CEMENT = "Cement"
COL_NAME_BLAST_FURNACE_SLAG = "Blast Furnace Slag"
COL_NAME_FLY_ASH = "Fly Ash"
COL_NAME_WATER = "Water"
COL_NAME_SUPERPLASTICIZER = "Superplasticizer"
COL_NAME_COARSE_AGGREGATE = "Coarse Aggregate"
COL_NAME_FINE_AGGREGATE = "Fine Aggregate"
COL_NAME_AGE = "Age"
COL_NAME_STRENGTH = "Strength"

COL_NAME_EXPERIMENT = "Experiment"
COL_NAME_MSE = "MSE"
COL_NAME_RMSE = "RMSE"

# This dataframe contains three columns: 
# name_of_experiments, mse, rmse
header_of_df_mse_and_rmse = [COL_NAME_EXPERIMENT, COL_NAME_MSE, COL_NAME_RMSE]
df_mse_and_rmse = pd.DataFrame(columns=header_of_df_mse_and_rmse, data=[])

Let's verify the path of input-file

In [4]:
if os.path.exists(file_input_path):
    print("We will load the data from file '{}' to dataframe.".format(file_input_path))
else:
    print("File not found : {}".format(file_input_path))

We will load the data from file 'concrete_data.csv' to dataframe.


Let's read input data into a dataframe

In [5]:
df = pd.read_csv(file_input_path, header=0)

# Reviewing the loaded data

In [6]:
df.columns

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

In [7]:
df.describe()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1030 entries, 0 to 1029
Data columns (total 9 columns):
Cement                1030 non-null float64
Blast Furnace Slag    1030 non-null float64
Fly Ash               1030 non-null float64
Water                 1030 non-null float64
Superplasticizer      1030 non-null float64
Coarse Aggregate      1030 non-null float64
Fine Aggregate        1030 non-null float64
Age                   1030 non-null int64
Strength              1030 non-null float64
dtypes: float64(8), int64(1)
memory usage: 72.5 KB


In [9]:
df.head(3)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27


So the first concrete sample has "540" cubic meter of cement, "0" cubic meter of blast furnace slag, "0" cubic meter of fly ash, "162" cubic meter of water, "2.5" cubic meter of superplaticizer, "1040" cubic meter of coarse aggregate, "676" cubic meter of fine aggregate. Such a concrete mix which is "28" days old, has a compressive strength of "79.99" MPa. 

In [10]:
print("(row, column) = {}".format(df.shape))

(row, column) = (1030, 9)


So, there are approximately 1000 samples to train our model on when splitting with 30% for the data of testing.

Let's check the data for any missing value

In [11]:
df.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

As you see, the above input corpus look pretty good to train the model. However, we could use the normalization technique to normalize it. 

# Normalizing input data

In [12]:
list_of_column_names = df.columns
list_of_column_names

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

## Splitting into predictors and target

Filtering the list of column names of dataframe predictors

In [13]:
list_of_col_names_predictors = [x for x in list_of_column_names 
                                if x != COL_NAME_STRENGTH]

In [14]:
list_of_col_names_predictors

['Cement',
 'Blast Furnace Slag',
 'Fly Ash',
 'Water',
 'Superplasticizer',
 'Coarse Aggregate',
 'Fine Aggregate',
 'Age']

In [15]:
df_predictors = df[list_of_col_names_predictors]

In [16]:
df_target = df[[COL_NAME_STRENGTH]]

Reviewing the data in two dataframes: predictors and target

In [17]:
df_predictors.head(3)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270


In [18]:
df_target.head(3)

Unnamed: 0,Strength
0,79.99
1,61.89
2,40.27


## Applying normalization method

Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

In [19]:
df_predictors_norm = (df_predictors - df_predictors.mean())/df_predictors.std()

In [20]:
df_predictors_norm.head(3)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134


# Splitting corpus into training set and testing set

In [21]:
# Geting a random 70% of the entire set
X_train = df_predictors.sample(frac=0.7, random_state=1)

# List of index in X_train
list_of_indices_in_x_train = X_train.index
"""
Int64Index([339, 244, 882, 567, 923, 358, 576,  27, 994, 563,
            ...
            999, 342, 256, 182, 617, 300, 690, 791, 240, 230],
           dtype='int64', length=721)
"""

# Getting the left out portion of the dataset
X_test = df_predictors.loc[~df_predictors.index.isin(list_of_indices_in_x_train)]

y_train = df_target.loc[df_target.index.isin(list_of_indices_in_x_train)]
y_test = df_target.loc[~df_target.index.isin(list_of_indices_in_x_train)]

# Normalized input corpus
X_train_norm = df_predictors_norm.loc[df_predictors_norm.index.isin(list_of_indices_in_x_train)]
X_test_norm = df_predictors_norm.loc[~df_predictors_norm.index.isin(list_of_indices_in_x_train)]

In [22]:
print("Training set: ", X_train.shape, y_train.shape)
print("Testing set: ", X_test.shape, y_test.shape)

print("-" * 72)

print("Training set - after normalizing: ", X_train_norm.shape, y_train.shape)
print("Testing set - after normalizing: ", X_test_norm.shape, y_test.shape)

Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
------------------------------------------------------------------------
Training set - after normalizing:  (721, 8) (721, 1)
Testing set - after normalizing:  (309, 8) (309, 1)


# A - Experiment with a baseline model

Use the Keras library to build a neural network with the following:

   + One hidden layer of **10 nodes**, and a **ReLU activation function**
   
   + Use the **adam optimizer** and the **mean squared error as the loss function**.

1. **Randomly split** the data into a training and test sets by holding **30% of the data for testing**. You can use the train_test_split helper function from Scikit-learn.

2. Train the model on the training data using **50 epochs**.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.

## Building and Training with the baseline model

In [23]:
num_of_features = X_train.shape[1]
print("Number of features for input layer : ", num_of_features)

Number of features for input layer :  8


In [24]:
def build_baseline_model(num_of_features=3):
    """ Building baseline model that contains:

    + One hidden layer of 10 nodes, and a ReLU activation function.
    + Use the adam optimizer and the mean squared error as the loss function.
    """
    
    # Create model
    model = Sequential()

    model.add(Dense(10, activation="relu", input_shape=(num_of_features,)))
    model.add(Dense(1))

    # Compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

def build_and_fit_baseline_model(X, y, num_of_features=3, epochs=50):
    # Build baseline model
    model = build_baseline_model(num_of_features=num_of_features)

    # Fit the built model with training set
    model.fit(X, y, epochs=epochs)
    
    return model

In [25]:
# Build baseline model & Fit the built model with training set
epochs = 50
model_A = build_and_fit_baseline_model(X_train, 
                                      y_train, 
                                      num_of_features=num_of_features,
                                      epochs=epochs)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


## Predicting the result

In [26]:
y_hat_A = model_A.predict(X_test)
y_hat_A[:3]

array([[20.566294],
       [43.373783],
       [29.387392]], dtype=float32)

## Report the mean and the standard deviation of the mean squared errors

In [27]:
y_hat = y_hat_A
name_of_experiment = "Baseline-Raw (50 epochs)"

df_result_baseline = get_report_mse_and_rmse(y_test, y_hat, name_of_experiment)

# Report the mean and the standard deviation of the mean squared errors
df_result_baseline

The mean and the standard deviation of the mean squared errors are: 277.65 and 16.66, respectively


Unnamed: 0,Experiment,MSE,RMSE
0,Baseline-Raw (50 epochs),277.65,16.66


In [28]:
# Concat baseline dataframe into result
df_mse_and_rmse = pd.concat([df_mse_and_rmse, df_result_baseline], axis=0)

# Review the result dataframe
df_mse_and_rmse

Unnamed: 0,Experiment,MSE,RMSE
0,Baseline-Raw (50 epochs),277.65,16.66


# B - Experiment with Normalized Data

Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

How does the mean of the mean squared errors compare to that from Step A?

## Normalize the data 
by substracting the mean and dividing by the standard deviation.

### Before normalization

In [29]:
X_train.head(3)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
339,297.2,0.0,117.5,174.8,9.5,1022.8,753.5,3
244,238.1,0.0,94.1,186.7,7.0,949.9,847.0,3
882,140.0,133.0,103.0,200.0,7.0,916.0,753.0,28


In [30]:
X_test.head(3)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
7,380.0,95.0,0.0,228.0,0.0,932.0,594.0,28
10,198.6,132.4,0.0,192.0,0.0,978.4,825.5,90
15,380.0,0.0,0.0,228.0,0.0,932.0,670.0,90


### After normalization

In [31]:
X_train_norm.head(3)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134


In [32]:
X_test_norm[:3]

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
7,0.945704,0.244603,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,-0.279597
10,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,0.701883
15,0.945704,-0.856472,-0.846733,2.174405,-1.038638,-0.526262,-1.291914,0.701883


## Building and Training with the baseline model after normalizing the data with 50 epochs

In [33]:
# Build baseline model & Fit the built model with training set
epochs = 50
model_B = build_and_fit_baseline_model(X_train_norm, 
                                       y_train, 
                                       num_of_features=num_of_features,
                                       epochs=epochs)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


## Predicting the result

In [34]:
y_hat_B = model_B.predict(X_test_norm)
y_hat_B[:3]

array([[15.706019 ],
       [ 7.2380624],
       [16.208492 ]], dtype=float32)

## Report the mean and the standard deviation of the mean squared errors

In [35]:
y_hat = y_hat_B
name_of_experiment = "Normalized-1 Hidden Layers(50 epochs)"

df_result_baseline = get_report_mse_and_rmse(y_test, y_hat, name_of_experiment)

# Report the mean and the standard deviation of the mean squared errors
df_result_baseline

The mean and the standard deviation of the mean squared errors are: 405.01 and 20.12, respectively


Unnamed: 0,Experiment,MSE,RMSE
0,Normalized-1 Hidden Layers(50 epochs),405.01,20.12


In [36]:
# Concat baseline dataframe into result
df_mse_and_rmse = pd.concat([df_mse_and_rmse, df_result_baseline], axis=0)

# Review the result dataframe
df_mse_and_rmse

Unnamed: 0,Experiment,MSE,RMSE
0,Baseline-Raw (50 epochs),277.65,16.66
0,Normalized-1 Hidden Layers(50 epochs),405.01,20.12


# C. Increate the number of epochs

Repeat Part B but use 100 epochs this time for training.

How does the mean of the mean squared errors compare to that from Step B?

## Building and Training with the baseline model after normalizing the data with 100 epochs

In [37]:
# Build baseline model & Fit the built model with training set
epochs = 100
model_C = build_and_fit_baseline_model(X_train_norm, 
                                       y_train, 
                                       num_of_features=num_of_features,
                                       epochs=epochs)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

Epoch 97/100
Epoch 98/100
Epoch 99/100
Epoch 100/100


## Predicting the result

In [38]:
y_hat_C = model_C.predict(X_test_norm)
y_hat_C[:3]

array([[45.088585],
       [20.125774],
       [34.869392]], dtype=float32)

## Report the mean and the standard deviation of the mean squared errors

In [39]:
y_hat = y_hat_C
name_of_experiment = "Normalized-1 Hidden Layers(100 epochs)"

df_result_baseline = get_report_mse_and_rmse(y_test, y_hat, name_of_experiment)

# Report the mean and the standard deviation of the mean squared errors
df_result_baseline

The mean and the standard deviation of the mean squared errors are: 177.38 and 13.32, respectively


Unnamed: 0,Experiment,MSE,RMSE
0,Normalized-1 Hidden Layers(100 epochs),177.38,13.32


In [40]:
# Concat baseline dataframe into result
df_mse_and_rmse = pd.concat([df_mse_and_rmse, df_result_baseline], axis=0)

# Review the result dataframe
df_mse_and_rmse

Unnamed: 0,Experiment,MSE,RMSE
0,Baseline-Raw (50 epochs),277.65,16.66
0,Normalized-1 Hidden Layers(50 epochs),405.01,20.12
0,Normalized-1 Hidden Layers(100 epochs),177.38,13.32


# D. Increase the number of hidden layers

Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

## Building and Training with the model

In [41]:
def build_model(num_of_features=3):
    """ Building model that contains:
    
    + Three hidden layers, each of 10 nodes and ReLU activation function.    
    + Use the adam optimizer and the mean squared error as the loss function.
    """
    
    # Create model
    model = Sequential()

    model.add(Dense(10, activation="relu", input_shape=(num_of_features,)))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(1))

    # Compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

def build_and_fit_model(X, y, num_of_features=3, epochs=50):
    # Build baseline model
    model = build_model(num_of_features=num_of_features)

    # Fit the built model with training set
    model.fit(X, y, epochs=epochs)
    
    return model

## Building and Training with the model after normalizing the data with 50 epochs

In [42]:
# Build baseline model & Fit the built model with training set
epochs = 50
model_D = build_and_fit_model(X_train_norm, 
                              y_train, 
                              num_of_features=num_of_features,
                              epochs=epochs)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


## Predicting the result

In [43]:
y_hat_D = model_D.predict(X_test_norm)
y_hat_D[:3]

array([[33.291393],
       [22.518543],
       [32.91312 ]], dtype=float32)

## Report the mean and the standard deviation of the mean squared errors

In [44]:
y_hat = y_hat_D
name_of_experiment = "Normalized-3 Hidden Layers(50 epochs)"
                                  
# Report the mean and the standard deviation of the mean squared errors
df_result_baseline = get_report_mse_and_rmse(y_test, y_hat, name_of_experiment)

The mean and the standard deviation of the mean squared errors are: 100.04 and 10.0, respectively


In [45]:
# Concat baseline dataframe into result
df_mse_and_rmse = pd.concat([df_mse_and_rmse, df_result_baseline], axis=0)

# Review the result dataframe
df_mse_and_rmse

Unnamed: 0,Experiment,MSE,RMSE
0,Baseline-Raw (50 epochs),277.65,16.66
0,Normalized-1 Hidden Layers(50 epochs),405.01,20.12
0,Normalized-1 Hidden Layers(100 epochs),177.38,13.32
0,Normalized-3 Hidden Layers(50 epochs),100.04,10.0


# Discussion

As you see, the mean squared error (MSE) tells us how close a regression line is to our testing set. 

Thus, according to the mean squared error, the smaller score, the closer we are finding the regression line of best fit.

Indeed, the **model (D)**, which is trained with three hidden layers, each of 10 nodes and ReLU activation function, is the best one. Because its mean squared error is **336.66**. Moreover, its error is lower than about *305* and **77** when comparing with the trained model applying one hidden layer with *50* epochs and **100** epochs. 

Also, when comparing to the MSE of baseline model, the MSE of **model (D)** is lower, about **178**.

However, it is interesting that the MSE of baseline model (A) is lower than the MSE of model (B) which is trained by the normalized data and the same configuration of model, about **127**.

In conclusion, in order to get the better result, we could apply several techniques to tune the model such as normalizing the input data, improving the number of epochs or the number of hidden layers.