# D. Increase the number of hidden layers (5 marks)

Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

# Regression model using Keras (Data normalized + 3 hidden layers )

Lets import numpy and pandas to help us load and analyze data

In [1]:
import numpy as np
import pandas as pd

In [2]:
#lets load the data, and take a look at the data using .head()
data = pd.read_csv("concrete_data.csv")
data.head()


Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [3]:
#lets print the shape of the data (i.e. number of rows and columns)
data.shape

(1030, 9)

Therefore, our dataset has 1030 rows and only 9 columns.
Lets take a look at the data for any missing values before we start building the model using the data.

In [4]:
data.describe()


Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


In [5]:
data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

The data looks good so far, therfore we can begin the next steps.

Since, for the first part, we are not to normalioze the data, I will jump straight to splitting the dataset.

### Lets divide our dataset into predictors (X) and target variable (y) (independent and dependent variable)

In [6]:
X = data[['Cement','Blast Furnace Slag','Fly Ash',
                  'Water','Superplasticizer','Coarse Aggregate','Fine Aggregate','Age']]

X.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360


In [7]:
y = data[['Strength']]
y.head()

Unnamed: 0,Strength
0,79.99
1,61.89
2,40.27
3,41.05
4,44.3


### Lets convert both, X and y into arrays

In [8]:
X = X.values
X

array([[ 540. ,    0. ,    0. , ..., 1040. ,  676. ,   28. ],
       [ 540. ,    0. ,    0. , ..., 1055. ,  676. ,   28. ],
       [ 332.5,  142.5,    0. , ...,  932. ,  594. ,  270. ],
       ...,
       [ 148.5,  139.4,  108.6, ...,  892.4,  780. ,   28. ],
       [ 159.1,  186.7,    0. , ...,  989.6,  788.9,   28. ],
       [ 260.9,  100.5,   78.3, ...,  864.5,  761.5,   28. ]])

In [9]:
y = y.values
y

array([[79.99],
       [61.89],
       [40.27],
       ...,
       [23.7 ],
       [32.77],
       [32.4 ]])

## Lets Normalize the X (independent variables/predictors)

In [10]:
X_normalized = (X - X.mean()) / X.std() 

In [11]:
print(X_normalized.shape)
print(X.shape)

(1030, 8)
(1030, 8)


### Now that we have both, the target and predictor variabels, lets move onto splitting our dataset.


In [12]:
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split( X_normalized, y, test_size=0.3, random_state=42)
print(f"Train Set = {X_train.shape},{y_train.shape}")
print(f"Test Set = {X_test.shape},{y_test.shape}")

Train Set = (721, 8),(721, 1)
Test Set = (309, 8),(309, 1)


30% of the dataset has been reserved for testing as per the instructions

### Lets import some important libraries for building our model

In [13]:
import tensorflow

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

In [14]:
#lets define n_cols to be the size of the number of variables in X
n_cols = X_test.shape[1]
print(n_cols)

8


#### Therefore, we will have 8 nodes in the input layer of the ANN.

In [15]:
#lets create our model
def regression_model():
    # create the model
    model = tensorflow.keras.Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,))) #1st hidden layer + input layer
    model.add(Dense(10, activation='relu')) #2nd hidden layer
    model.add(Dense(10, activation='relu')) #3rd hidden layer
    model.add(Dense(1))
    
    # compile thye model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model


The above function creates a model that has three hidden layer with 10 neurons each and uses ReLU activation function. It uses the adam optimizer and the mean squared error as the loss function, as per instructions.

In [29]:
#lets build the model
model = regression_model()

Lets train the model with 50 epochs

In [17]:
# fit the model
epochs = 50
model.fit(X_train, y_train, epochs=epochs, verbose=2)

Train on 721 samples
Epoch 1/50
721/721 - 1s - loss: 1601.1199
Epoch 2/50
721/721 - 0s - loss: 1581.1002
Epoch 3/50
721/721 - 0s - loss: 1567.4009
Epoch 4/50
721/721 - 0s - loss: 1542.3460
Epoch 5/50
721/721 - 0s - loss: 1474.8556
Epoch 6/50
721/721 - 0s - loss: 1354.0929
Epoch 7/50
721/721 - 0s - loss: 1165.8063
Epoch 8/50
721/721 - 0s - loss: 884.4956
Epoch 9/50
721/721 - 0s - loss: 569.0681
Epoch 10/50
721/721 - 0s - loss: 355.8449
Epoch 11/50
721/721 - 0s - loss: 302.2430
Epoch 12/50
721/721 - 0s - loss: 295.2118
Epoch 13/50
721/721 - 0s - loss: 292.0628
Epoch 14/50
721/721 - 0s - loss: 289.0602
Epoch 15/50
721/721 - 0s - loss: 286.1294
Epoch 16/50
721/721 - 0s - loss: 283.4480
Epoch 17/50
721/721 - 0s - loss: 281.3639
Epoch 18/50
721/721 - 0s - loss: 278.9001
Epoch 19/50
721/721 - 0s - loss: 276.2429
Epoch 20/50
721/721 - 0s - loss: 273.7385
Epoch 21/50
721/721 - 0s - loss: 271.6688
Epoch 22/50
721/721 - 0s - loss: 269.0225
Epoch 23/50
721/721 - 0s - loss: 266.5991
Epoch 24/50
721

<tensorflow.python.keras.callbacks.History at 0x7fa470324990>

In [18]:
#Lets evaluate the model now:

loss_ = model.evaluate(X_test, y_test, verbose  =2)
y_pred = model.predict(X_test)
loss_



309/1 - 0s - loss: 183.2762


210.6285982101095



Now we need to compute the mean squared error between the predicted concrete strength and the actual concrete strength.

Let's import the mean_squared_error function from Scikit-learn.



In [19]:
from sklearn.metrics import mean_squared_error

In [20]:
mean_square_error = mean_squared_error(y_test, y_pred)
mean = np.mean(mean_square_error)
standard_deviation = np.std(mean_square_error)
print(f"Mean of MSE = {mean}")
print(f"Standard Deviation of MSE = {standard_deviation}")

Mean of MSE = 210.6286005296759
Standard Deviation of MSE = 0.0


### Now, we will repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors and calculate the mean and Standard deviation of the list.

In [21]:
z =1 #for indexing 
mse_list_50 = [] #empty list for the 50 values 
model = regression_model()
epochs = 50
for x in range(50):
    
    X_train, X_test, y_train, y_test = train_test_split(X_normalized, y, test_size=0.3, random_state=x)
    model.fit(X_train, y_train, epochs=epochs, verbose=0)
    loss_1 = model.evaluate(X_test, y_test, verbose  =0)
    print(f" {z}: MSE = {loss_1}")
    y_pred1 = model.predict(X_test)
    mean_square_error = mean_squared_error(y_test, y_pred1)
    mse_list_50.append(mean_square_error)
    z += 1
    
#lets convert the list mse_list_50 into array before we calculate the mean and the standard deviation of the mean squared errors.
mse_array_50 = np.array(mse_list_50)
mse_array_50_mean = np.mean(mse_array_50)
mse_array_50_std = np.std(mse_array_50)
print(f"Mean of all 50 Mean squared error values = {mse_array_50_mean}")
print(f"Standard Deviation of all 50 Mean squared error values = {mse_array_50_std}")

 1: MSE = 201.7618093151105
 2: MSE = 143.16816849693126
 3: MSE = 110.17803043995089
 4: MSE = 117.55544735312847
 5: MSE = 120.54027584532703
 6: MSE = 106.43613475811905
 7: MSE = 122.14412743528298
 8: MSE = 90.82784982329433
 9: MSE = 107.12063060377794
 10: MSE = 85.8858844794116
 11: MSE = 71.76959077902983
 12: MSE = 56.57871137699263
 13: MSE = 61.79365176599003
 14: MSE = 54.90298731041572
 15: MSE = 46.20057662173768
 16: MSE = 44.226352543506806
 17: MSE = 45.048644815833825
 18: MSE = 48.18098234898836
 19: MSE = 40.06690701234688
 20: MSE = 45.88658233445053
 21: MSE = 40.931589120414266
 22: MSE = 41.84210272977267
 23: MSE = 42.07831492007357
 24: MSE = 42.00696900747355
 25: MSE = 42.735138396229175
 26: MSE = 42.02866993135619
 27: MSE = 40.75031596557222
 28: MSE = 39.52926720307483
 29: MSE = 50.68142556989849
 30: MSE = 44.197037496227274
 31: MSE = 43.31090982677867
 32: MSE = 35.77122025350923
 33: MSE = 44.26550974429232
 34: MSE = 43.58813077809356
 35: MSE = 3

In [22]:
#a look at the list of 50 mean squared errors
mse_list_50

[201.76180744562046,
 143.16817149624399,
 110.17802929103449,
 117.55544385286066,
 120.54027653263606,
 106.43613610244823,
 122.14412688079567,
 90.82784987778282,
 107.12062795304749,
 85.88588564652052,
 71.76958962586994,
 56.5787128627597,
 61.79365143895132,
 54.90298627293398,
 46.20057559189805,
 44.22635190825704,
 45.04864293301454,
 48.180980547914125,
 40.066908940244936,
 45.88658089984174,
 40.93158808841903,
 41.84210324458037,
 42.0783149571165,
 42.006969880273985,
 42.73513814465646,
 42.02867141593988,
 40.75031661403214,
 39.5292659807368,
 50.681427284948214,
 44.19703700615173,
 43.31091002331918,
 35.77121995277407,
 44.265510093691965,
 43.58813016343943,
 39.51839667817907,
 46.21612818924392,
 43.63307827742651,
 47.526372572205,
 42.36613087542638,
 38.17734807628841,
 45.404374550156994,
 39.5763423175541,
 39.433200614558324,
 49.275552948823716,
 45.262268573903384,
 44.23278478135047,
 44.10621150834365,
 44.57731268953261,
 44.54343816313494,
 42.84819

## How does the mean of the mean squared errors compare to that from Step B?

In [28]:
print("With 1 hidden layer only - Part B")
print(f"\nMean of all 50 Mean squared error values = 105.14580754892266")
print(f"Standard Deviation of all 50 Mean squared error values = 45.379225803390646")

With 1 hidden layer only - Part B

Mean of all 50 Mean squared error values = 105.14580754892266
Standard Deviation of all 50 Mean squared error values = 45.379225803390646


In [27]:
print("With 3 hidden layers - Part D")
print(f"\nMean of all 50 Mean squared error values = {mse_array_50_mean}")
print(f"Standard Deviation of all 50 Mean squared error values = {mse_array_50_std}")

With 3 hidden layers - Part D

Mean of all 50 Mean squared error values = 60.21374153383467
Standard Deviation of all 50 Mean squared error values = 33.73119677763262
