<div align=center><font size = 6>Build a Regression Model in Keras</font></dev>

<div><font size=3><strong>Table Of Contents</strong></font></div>

1. Some useful functions
2. Loading input corpus
3. Reviewing the loaded data
4. Normalizing input data
5. Splitting corpus into traning set and testing set
6. A - Experiment with a baseline model
7. B - Experiment with Normalized Data
8. C. Increate the number of epochs
9. D. Increase the number of hidden layers
10. Discussion

<strong>The dataset the compressive strength of different samples of concrete based on the volumes of the different components that include:</strong>

<strong>1. Cement</strong>

<strong>2. Blast Furnace Slag</strong>

<strong>3. Fly Ash</strong>

<strong>4. Water</strong>

<strong>5. Superplasticizer</strong>

<strong>6. Coarse Aggregate</strong>

<strong>7. Fine Aggregate</strong>

# Import some useful functions

Let's start by importing the libraries : os, keras, pandas, numpy, scikit_learn, etc.

In [2]:
import time
import os
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

In [3]:
# Define some components for the analysis

COL_NAME_CEMENT = "Cement"
COL_NAME_BLAST_FURNACE_SLAG = "Blast Furnace Slag"
COL_NAME_FLY_ASH = "Fly Ash"
COL_NAME_WATER = "Water"
COL_NAME_SUPERPLASTICIZER = "Superplasticizer"
COL_NAME_COARSE_AGGREGATE = "Coarse Aggregate"
COL_NAME_FINE_AGGREGATE = "Fine Aggregate"
COL_NAME_AGE = "Age"
COL_NAME_STRENGTH = "Strength"

COL_NAME_EXPERIMENT = "Experiment"
COL_NAME_MSE = "Mean MSE"
COL_NAME_RMSE = "Std Deviation MSE"

# This dataframe contains three columns: name_of_experiments, mse, rmse

header_of_concrete_data_mse_and_rmse = [COL_NAME_EXPERIMENT, COL_NAME_MSE, COL_NAME_RMSE]
concrete_data_mse_and_rmse = pd.DataFrame(columns=header_of_concrete_data_mse_and_rmse, data=[])

#Get round with given number of decimal digits 
    
def get_round(score, num_of_digits=2):
    return round(score, num_of_digits)


#Estimate the mean of the data

def get_mean(concrete_data_mse_scores):

    if concrete_data_mse_scores:
        return get_round(np.mean(concrete_data_mse_scores))
    return None
    
    
#Estimate the standard deviation
def get_standard_deviation(concrete_data_mse_scores):
    
    if concrete_data_mse_scores:
        return get_round(np.std(concrete_data_mse_scores))
    return None


#Build baseline model that contains:One hidden layer of 10 nodes, and a ReLU activation function.
#Use the adam optimizer and the mean squared error as the loss function.


def build_model_with_one_hidden_layer(num_of_features=3):
       
    # Create model
    model = Sequential()

    model.add(Dense(10, activation="relu", input_shape=(num_of_features,)))
    model.add(Dense(1))

    # Compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model


 # Build model that contains:Three hidden layers, each of 10 nodes and ReLU activation function.
#Use the adam optimizer and the mean squared error as the loss function.


def build_model_with_three_hidden_layers(num_of_features=3):
   
    
    # Create the model
    model = Sequential()

    model.add(Dense(10, activation="relu", input_shape=(num_of_features,)))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(1))

    # Compile the model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model


#Generate report (dataframe) of two metrics: The mean and the standard deviation of the mean squared errors
     
def get_mean_squared_error(compiled_model, X, y, epochs=50, verbose=1):
      
    
    # 1. Randomly split the data into a training and test sets by holding 30% 
    # of the data for testing. You can use the train_test_split helper function 
    # from Scikit-learn.    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=24)   
    print("Training set: ", X_train.shape, y_train.shape)
    print("Testing set: ", X_test.shape, y_test.shape)
    
    
    # 2. Train the model on the training data using 50 epochs.
    # Note that: given model which is compiled
    # Fit the built model with training set
    model.fit(X_train, y_train, epochs=epochs, verbose=verbose)    

    # 3. Evaluate the model on the test data and compute the mean squared error 
    # between the predicted concrete strength and the actual concrete strength. 
    # You can use the mean_squared_error function from Scikit-learn.    
    y_hat = model.predict(X_test)    
    mse = mean_squared_error(y_test, y_hat)
    
    # Return the mean squared error
    return mse

 #Generate the mean and the standard deviation of the mean squared errors, Then Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.  

def get_mean_and_std_of_mse(df_X, 
                            df_y, 
                            compiled_model,                
                            max_iteration=50, 
                            epochs=50, 
                            verbose=0):
     
    concrete_data_mean_squared_errors = []
    for i in range(max_iteration):
        start_time = time.time()
        print("-" * 36)
        print("Processing current number of iteration : {}".format(i+1))        
        mse = get_mean_squared_error(compiled_model, df_X, df_y, epochs=epochs, verbose=verbose)
        concrete_data_mean_squared_errors.append(mse)
        print("Duration (seconds): {}".format(time.time()-start_time))

    print("Finished - {} times.\nAnd the list of mean squared errors : {}".format(max_iteration,
                                                                                  concrete_data_mean_squared_errors))
    mean_mse = get_mean(concrete_data_mean_squared_errors)
    std_mse = get_standard_deviation(concrete_data_mean_squared_errors)

    print("-" * 72)
    print("The mean and the standard deviation of the mean squared errors are: {} and {}, respectively".format(
           mean_mse, std_mse))
    
    return mean_mse, std_mse

 
#Generate report (dataframe) of two metrics: The mean and the standard deviation of the mean squared errors

def get_report(name_of_experiment, mean_mse, std_mse):
   
    values = [[name_of_experiment, mean_mse, std_mse]]

    return pd.DataFrame(columns=header_of_concrete_data_mse_and_rmse, data=values)
    

# Loading input corpus

Let's assign the path of input corpus. Because we re-use after dowloading the input corpus.

In [4]:
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


Let's read input data into a dataframe

# Reviewing the loaded data

In [5]:
concrete_data.columns

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

In [6]:
concrete_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1030 entries, 0 to 1029
Data columns (total 9 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Cement              1030 non-null   float64
 1   Blast Furnace Slag  1030 non-null   float64
 2   Fly Ash             1030 non-null   float64
 3   Water               1030 non-null   float64
 4   Superplasticizer    1030 non-null   float64
 5   Coarse Aggregate    1030 non-null   float64
 6   Fine Aggregate      1030 non-null   float64
 7   Age                 1030 non-null   int64  
 8   Strength            1030 non-null   float64
dtypes: float64(8), int64(1)
memory usage: 72.5 KB


In [7]:
concrete_data.describe()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


In [8]:
concrete_data.head(5)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


So the first concrete sample has "540" cubic meter of cement, "0" cubic meter of blast furnace slag, "0" cubic meter of fly ash, "162" cubic meter of water, "2.5" cubic meter of superplaticizer, "1040" cubic meter of coarse aggregate, "676" cubic meter of fine aggregate. Such a concrete mix which is "28" days old, has a compressive strength of "79.99" MPa. 

In [9]:
print("(row, column) = {}".format(concrete_data.shape))

(row, column) = (1030, 9)


So, there are approximately 1000 samples to train our model on when splitting with 30% for the data of testing.

Let's check the data for any missing value

In [10]:
concrete_data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

As you see, the above input corpus look pretty good to train the model. However, we could use the normalization technique to normalize it. 

# Normalizing input data

In [13]:
concrete_data_column_names = concrete_data.columns
concrete_data_column_names

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

## Splitting into predictors and target

Filtering the title of column names of dataframe predictors

In [14]:
concrete_data_col_names_predictors = [x for x in concrete_data_column_names 
                                if x != COL_NAME_STRENGTH]

In [15]:
concrete_data_col_names_predictors

['Cement',
 'Blast Furnace Slag',
 'Fly Ash',
 'Water',
 'Superplasticizer',
 'Coarse Aggregate',
 'Fine Aggregate',
 'Age']

In [16]:
concrete_data_predictors = concrete_data[concrete_data_col_names_predictors]

In [17]:
concrete_data_target = concrete_data[[COL_NAME_STRENGTH]]

Separating the data in two dataframes: predictors and target

In [18]:
concrete_data_predictors.head(5)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360


In [19]:
concrete_data_target.head(5)

Unnamed: 0,Strength
0,79.99
1,61.89
2,40.27
3,41.05
4,44.3


## Applying normalization method

Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

In [20]:
concrete_data_predictors_norm = (concrete_data_predictors - concrete_data_predictors.mean())/concrete_data_predictors.std()

In [21]:
concrete_data_predictors_norm.head(5)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069


# A - Experiment with a baseline model

Use the Keras library to build a neural network with the following:

   + One hidden layer of **10 nodes**, and a **ReLU activation function**
   
   + Use the **adam optimizer** and the **mean squared error as the loss function**.

1. **Randomly split** the data into a training and test sets by holding **30% of the data for testing**. You can use the train_test_split helper function from Scikit-learn.

2. Train the model on the training data using **50 epochs**.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.

## Building and Training with the baseline model

In [22]:
num_of_features = len(concrete_data.columns) - 1
print("Number of features for input layer : ", num_of_features)

Number of features for input layer :  8


In [26]:

# Evaluate the compiled model
model = build_model_with_one_hidden_layer(num_of_features=num_of_features)

mean_mse, std_mse = get_mean_and_std_of_mse(concrete_data_predictors, 
                                            concrete_data_target, 
                                            model, 
                                            max_iteration=50, 
                                            epochs=50, verbose=0)

2022-02-28 22:07:43.293691: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


------------------------------------
Processing current number of iteration : 1
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.3787529468536377
------------------------------------
Processing current number of iteration : 2
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.320302963256836
------------------------------------
Processing current number of iteration : 3
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.5278122425079346
------------------------------------
Processing current number of iteration : 4
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3300559520721436
------------------------------------
Processing current number of iteration : 5
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.2837200164794922
------------------------------------
Processing current number of iteration : 6
T

Duration (seconds): 1.482370138168335
------------------------------------
Processing current number of iteration : 46
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3669230937957764
------------------------------------
Processing current number of iteration : 47
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.4556410312652588
------------------------------------
Processing current number of iteration : 48
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3436169624328613
------------------------------------
Processing current number of iteration : 49
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.6577789783477783
------------------------------------
Processing current number of iteration : 50
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3397858142852783
Finished - 50 times.
And the list of 

## Report the mean and the standard deviation of the mean squared errors

In [27]:
name_of_experiment = "Baseline of the Raw Data with 50 epochs"

# Report the mean and the standard deviation of the mean squared errors
concrete_data_result_baseline = get_report(name_of_experiment, mean_mse, std_mse)
concrete_data_result_baseline

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Baseline of the Raw Data with 50 epochs,53.62,11.17


In [28]:
# Concat baseline dataframe into result
concrete_data_mse_and_rmse = pd.concat([concrete_data_mse_and_rmse, concrete_data_result_baseline], axis=0)

# Review the result dataframe
concrete_data_mse_and_rmse.reset_index(drop=True)

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Baseline of the Raw Data with 50 epochs,53.62,11.17


# B - Experiment with Normalized Data

Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

How does the mean of the mean squared errors compare to that from Step A?

## Normalize the data 
by substracting the mean and dividing by the standard deviation.

### Before normalization

In [29]:
concrete_data_predictors.head(5)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360


### After normalization

In [30]:
concrete_data_predictors_norm.head(5)

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069


## Building and Training with the baseline model after normalizing the data with 50 epochs

In [31]:

# Evaluate the compiled model
model = build_model_with_one_hidden_layer(num_of_features=num_of_features)

mean_mse, std_mse = get_mean_and_std_of_mse(concrete_data_predictors_norm, 
                                            concrete_data_target, 
                                            model, 
                                            max_iteration=50, 
                                            epochs=50, verbose=0)

------------------------------------
Processing current number of iteration : 1
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.7124340534210205
------------------------------------
Processing current number of iteration : 2
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3309669494628906
------------------------------------
Processing current number of iteration : 3
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.322847604751587
------------------------------------
Processing current number of iteration : 4
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3305869102478027
------------------------------------
Processing current number of iteration : 5
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.3468000888824463
------------------------------------
Processing current number of iteration : 6
T

Duration (seconds): 1.5662338733673096
------------------------------------
Processing current number of iteration : 46
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.6356680393218994
------------------------------------
Processing current number of iteration : 47
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.8672170639038086
------------------------------------
Processing current number of iteration : 48
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.6176340579986572
------------------------------------
Processing current number of iteration : 49
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 5.537535905838013
------------------------------------
Processing current number of iteration : 50
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.669261932373047
Finished - 50 times.
And the list of m

## Report the mean and the standard deviation of the mean squared errors

In [34]:
name_of_experiment = "Normalized with one Hidden Layer and 50 epochs"

# Report the mean and the standard deviation of the mean squared errors
concrete_data_result_baseline = get_report(name_of_experiment, mean_mse, std_mse)
concrete_data_result_baseline

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Normalized with one Hidden Layer and 50 epochs,58.95,30.7


In [35]:
# Concat baseline dataframe into result
concrete_data_mse_and_rmse = pd.concat([concrete_data_mse_and_rmse, concrete_data_result_baseline], axis=0)

# Review the result dataframe
concrete_data_mse_and_rmse.reset_index(drop=True)

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Baseline of the Raw Data with 50 epochs,53.62,11.17
1,Normalized with one Hidden Layer and 50 epochs,58.95,30.7
2,Normalized with one Hidden Layer and 50 epochs,58.95,30.7


# C. Increate the number of epochs

Repeat Part B but use 100 epochs this time for training.

How does the mean of the mean squared errors compare to that from Step B?

## Building and Training with the baseline model after normalizing the data with 100 epochs

In [36]:

# Evaluate the compiled model
model = build_model_with_one_hidden_layer(num_of_features=num_of_features)

mean_mse, std_mse = get_mean_and_std_of_mse(concrete_data_predictors_norm, 
                                            concrete_data_target, 
                                            model, 
                                            max_iteration=50, 
                                            epochs=100, verbose=0)

------------------------------------
Processing current number of iteration : 1
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 3.5591349601745605
------------------------------------
Processing current number of iteration : 2
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.934518814086914
------------------------------------
Processing current number of iteration : 3
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 3.8928751945495605
------------------------------------
Processing current number of iteration : 4
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.9371676445007324
------------------------------------
Processing current number of iteration : 5
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.978286027908325
------------------------------------
Processing current number of iteration : 6
Tr

Duration (seconds): 4.534631967544556
------------------------------------
Processing current number of iteration : 46
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 4.020293712615967
------------------------------------
Processing current number of iteration : 47
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 3.1876299381256104
------------------------------------
Processing current number of iteration : 48
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 3.721435785293579
------------------------------------
Processing current number of iteration : 49
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 3.7955708503723145
------------------------------------
Processing current number of iteration : 50
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 4.0064377784729
Finished - 50 times.
And the list of mean 

## Report the mean and the standard deviation of the mean squared errors

In [37]:
name_of_experiment = "Normalized with one Hidden Layer and 100 epochs"

# Report the mean and the standard deviation of the mean squared errors
concrete_data_result_baseline = get_report(name_of_experiment, mean_mse, std_mse)
concrete_data_result_baseline

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Normalized with one Hidden Layer and 100 epochs,44.47,16.4


In [38]:
# Concat baseline dataframe into result
concrete_data_mse_and_rmse = pd.concat([concrete_data_mse_and_rmse, concrete_data_result_baseline], axis=0)

# Review the result dataframe
concrete_data_mse_and_rmse.reset_index(drop=True)

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Baseline of the Raw Data with 50 epochs,53.62,11.17
1,Normalized with one Hidden Layer and 50 epochs,58.95,30.7
2,Normalized with one Hidden Layer and 50 epochs,58.95,30.7
3,Normalized with one Hidden Layer and 100 epochs,44.47,16.4


# D. Increase the number of hidden layers

Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

## Building and Training with the model after normalizing the data with 50 epochs

In [39]:

# Evaluate the compiled model
model = build_model_with_three_hidden_layers(num_of_features=num_of_features)

mean_mse, std_mse = get_mean_and_std_of_mse(concrete_data_predictors_norm, 
                                            concrete_data_target, 
                                            model, 
                                            max_iteration=50, 
                                            epochs=50, 
                                            verbose=0)

------------------------------------
Processing current number of iteration : 1
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.5826680660247803
------------------------------------
Processing current number of iteration : 2
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.0521719455718994
------------------------------------
Processing current number of iteration : 3
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.036702871322632
------------------------------------
Processing current number of iteration : 4
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.9125189781188965
------------------------------------
Processing current number of iteration : 5
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.732224702835083
------------------------------------
Processing current number of iteration : 6
Tr

Duration (seconds): 1.8609862327575684
------------------------------------
Processing current number of iteration : 46
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.890643835067749
------------------------------------
Processing current number of iteration : 47
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.012349843978882
------------------------------------
Processing current number of iteration : 48
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.011723041534424
------------------------------------
Processing current number of iteration : 49
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 1.9642550945281982
------------------------------------
Processing current number of iteration : 50
Training set:  (721, 8) (721, 1)
Testing set:  (309, 8) (309, 1)
Duration (seconds): 2.089717149734497
Finished - 50 times.
And the list of mea

## Report the mean and the standard deviation of the mean squared errors

In [40]:
name_of_experiment = "Normalized with 3 Hidden Layers and 50 epochs"

# Report the mean and the standard deviation of the mean squared errors
concrete_data_result_baseline = get_report(name_of_experiment, mean_mse, std_mse)
concrete_data_result_baseline

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Normalized with 3 Hidden Layers and 50 epochs,33.33,9.03


In [41]:
# Concat baseline dataframe into result
concrete_data_mse_and_rmse = pd.concat([concrete_data_mse_and_rmse, concrete_data_result_baseline], axis=0)

# Review the result dataframe
concrete_data_mse_and_rmse.reset_index(drop=True)

Unnamed: 0,Experiment,Mean MSE,Std Deviation MSE
0,Baseline of the Raw Data with 50 epochs,53.62,11.17
1,Normalized with one Hidden Layer and 50 epochs,58.95,30.7
2,Normalized with one Hidden Layer and 50 epochs,58.95,30.7
3,Normalized with one Hidden Layer and 100 epochs,44.47,16.4
4,Normalized with 3 Hidden Layers and 50 epochs,33.33,9.03


# Discussion

The mean squared error (MSE) indicates that close a regression model is the testing set. And the standard deviation of the residuals estimate the differences between the data set and thelinear regression model.

The the modelin D-Normalized-3 Hidden Layers(50 epochs), which is trained with three hidden layers, each of 10 nodes and ReLU activation function, is the best one. Since mean of the MSE is **34.17**. Moreover, its error is lower than about *16* and **17** as compared with the trained model applying one hidden layer with *50* epochs and **100** epochs. 

Besides, when comparing to mean of the MSEs of baseline model, the MSE of **model (D)** is lower, about **35**.

It interesting to note that the mean of the MSEs of baseline model **(B-Normalized-1 Hidden Layers(50 epochs))** is lower about **1** than the mean the MSEs of **model (C-Normalized-1 Hidden Layers(100 epochs))** which is trained by the normalized data and the same configuration of model, but model (C) did 100 epochs.

In conclusion, several techniques can be applied to fine-tune the model such as normalizing the input data, thus improving the number of epochs or the number of hidden layers.