## Deep Learning & Neural Networks With Keras Assignment

In this course project, you will build a regression model using the Keras deep learning library. You will then experiment with increasing the number of training epochs and changing the number of hidden layers, and observe how these changes affect the model's performance.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>
      
1. <a href="#item41">Part A</a>   
2. <a href="#item42">Part B</a>  
3. <a href="#item43">Part C</a> 
4. <a href="#item43">Part D</a>
5. <a href="#item43">Part E</a>

</font>
</div>

## Data

For your convenience, the data can be found here again: https://cocl.us/concrete_data. To recap, the predictors in the concrete strength data include:

1. Cement
2. Blast Furnace Slag
3. Fly Ash
4. Water
5. Superplasticizer
6. Coarse Aggregate
7. Fine Aggregate
8. Age

## Download and clean data

In [1]:
import pandas as pd
import numpy as np

In [2]:
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
concrete_data.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


In [3]:
concrete_data.shape

(1030, 9)

In [4]:
concrete_data.describe()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| count | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 |
| mean | 281.167864 | 73.895825 | 54.18835 | 181.567282 | 6.20466 | 972.918932 | 773.580485 | 45.662136 | 35.817961 |
| std | 104.506364 | 86.279342 | 63.997004 | 21.354219 | 5.973841 | 77.753954 | 80.17598 | 63.169912 | 16.705742 |
| min | 102.0 | 0.0 | 0.0 | 121.8 | 0.0 | 801.0 | 594.0 | 1.0 | 2.33 |
| 25% | 192.375 | 0.0 | 0.0 | 164.9 | 0.0 | 932.0 | 730.95 | 7.0 | 23.71 |
| 50% | 272.9 | 22.0 | 0.0 | 185.0 | 6.4 | 968.0 | 779.5 | 28.0 | 34.445 |
| 75% | 350.0 | 142.95 | 118.3 | 192.0 | 10.2 | 1029.4 | 824.0 | 56.0 | 46.135 |
| max | 540.0 | 359.4 | 200.1 | 247.0 | 32.2 | 1145.0 | 992.6 | 365.0 | 82.6 |


In [5]:
concrete_data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

## Part A

#### **1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.**

In [6]:
from sklearn.model_selection import train_test_split

In [7]:
concrete_data_columns = concrete_data.columns

X = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
y = concrete_data['Strength'] # Strength column

In [8]:
X.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 |


In [9]:
y.head()

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64

In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [11]:
X_train.shape

(721, 8)

In [12]:
X_test.shape

(309, 8)

In [13]:
y_test.shape

(309,)

In [14]:
X_train.shape[1]

8
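The shapes above follow directly from the 30% hold-out. As a quick sanity check, the arithmetic can be reproduced without touching the data, assuming the test count is rounded up and the training set takes the remainder (which is how scikit-learn treats a fractional `test_size` when no `train_size` is given). The helper name `split_sizes` is made up for this illustration.

```python
import math

def split_sizes(n_samples, test_percent):
    """Return (n_train, n_test) for a hold-out split, rounding the test count up."""
    n_test = math.ceil(n_samples * test_percent / 100)
    n_train = n_samples - n_test
    return n_train, n_test

# 1030 rows with 30% held out for testing
n_train, n_test = split_sizes(1030, 30)
print(n_train, n_test)
```

The two numbers should match the first dimension of `X_train.shape` and `X_test.shape` reported above.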

#### **Use the Keras library to build a neural network with the following:**
#### **- One hidden layer of 10 nodes, and a ReLU activation function**
#### **- Use the adam optimizer and the mean squared error as the loss function.**

In [15]:
import keras
from keras.models import Sequential
from keras.layers import Dense

Using TensorFlow backend.


In [16]:
# define regression model
def regression_model(X_train):
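    # create model
    model = Sequential()
    # note: as written, the network has two hidden layers (50 and 10 nodes);
    # the task statement asks for a single hidden layer of 10 nodes
    model.add(Dense(50, activation='relu', input_shape=(X_train.shape[1],)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))  # single linear output for regression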
    
    # compile model
    # Use the adam optimizer and the mean squared error as the loss function
    model.compile(optimizer='adam', loss='mean_squared_error') 
    return model
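To get a feel for the size of the network defined in `regression_model`, the number of trainable parameters in a stack of `Dense` layers can be worked out by hand: each layer contributes `inputs * units` weights plus `units` biases. The sketch below is plain Python and does not call Keras; `count_dense_params` is a hypothetical helper written for this illustration.

```python
def count_dense_params(input_dim, layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total

# layer sizes as written in regression_model: 50 -> 10 -> 1, with 8 input features
params_as_coded = count_dense_params(8, [50, 10, 1])
# a single 10-node hidden layer, as the task statement describes
params_single_hidden = count_dense_params(8, [10, 1])
print(params_as_coded, params_single_hidden)
```

In a Keras session, `model.summary()` reports the same kind of count per layer, which is a convenient way to confirm that the architecture matches what was intended.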

#### **2. Train the model on the training data using 50 epochs.**

In [17]:
# build the model
model = regression_model(X_train)

Instructions for updating:
Colocations handled automatically by placer.


In [18]:
# fit the model
model.fit(X_train, y_train, epochs=50, verbose=2)

Instructions for updating:
Use tf.cast instead.
Epoch 1/50
 - 2s - loss: 1920.3554
Epoch 2/50
 - 0s - loss: 477.5381
Epoch 3/50
 - 0s - loss: 260.3853
Epoch 4/50
 - 0s - loss: 215.9239
Epoch 5/50
 - 0s - loss: 189.2263
Epoch 6/50
 - 0s - loss: 170.0355
Epoch 7/50
 - 1s - loss: 156.7711
Epoch 8/50
 - 0s - loss: 140.8176
Epoch 9/50
 - 1s - loss: 154.3209
Epoch 10/50
 - 1s - loss: 133.9116
Epoch 11/50
 - 0s - loss: 116.6440
Epoch 12/50
 - 0s - loss: 103.2975
Epoch 13/50
 - 1s - loss: 96.0070
Epoch 14/50
 - 0s - loss: 104.3990
Epoch 15/50
 - 0s - loss: 91.2876
Epoch 16/50
 - 0s - loss: 88.7108
Epoch 17/50
 - 1s - loss: 94.1678
Epoch 18/50
 - 0s - loss: 92.4890
Epoch 19/50
 - 1s - loss: 82.4237
Epoch 20/50
 - 1s - loss: 85.8954
Epoch 21/50
 - 1s - loss: 81.7956
Epoch 22/50
 - 0s - loss: 101.9483
Epoch 23/50
 - 1s - loss: 88.2548
Epoch 24/50
 - 0s - loss: 90.0353
Epoch 25/50
 - 0s - loss: 89.9123
Epoch 26/50
 - 0s - loss: 92.3209
Epoch 27/50
 - 0s - loss: 77.1689
Epoch 28/50
 - 1s - loss: 73

<keras.callbacks.History at 0x7f4e03f92be0>
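Each epoch in the log above is one full pass over the 721 training rows. Since `model.fit` is called without a `batch_size`, the sketch below assumes the Keras default of 32 samples per batch to estimate how many weight updates the optimizer performs. The function name `gradient_updates` is made up for this illustration.

```python
import math

def gradient_updates(n_samples, epochs, batch_size=32):
    """Return (batches per epoch, total optimizer steps) for mini-batch training."""
    batches_per_epoch = math.ceil(n_samples / batch_size)  # last batch may be partial
    return batches_per_epoch, batches_per_epoch * epochs

per_epoch_50, total_50 = gradient_updates(721, epochs=50)
per_epoch_100, total_100 = gradient_updates(721, epochs=100)
print(per_epoch_50, total_50, total_100)
```

Under that assumption, doubling the epochs doubles the number of updates, which is one way to read the effect of the longer training runs later in the notebook.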

#### **3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.**

In [19]:
from sklearn.metrics import mean_squared_error

In [20]:
y_pred = model.predict(X_test)

In [21]:
mean_squared_error(y_test, y_pred)

57.07614096566331
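For reference, the mean squared error is just the average of the squared differences between targets and predictions. The NumPy sketch below computes it by hand; the small arrays are made-up values, not rows from the concrete data. One detail worth noting: `model.predict` returns a column of shape `(n, 1)` while `y_test` has shape `(n,)`, so a hand-rolled version should flatten both first, otherwise the subtraction broadcasts to an `(n, n)` matrix.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error after flattening both inputs to 1-D."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

y_true_demo = np.array([3.0, 5.0, 7.0])        # shape (3,), like y_test
y_pred_demo = np.array([[2.0], [5.0], [9.0]])  # shape (3, 1), like model.predict output

print(mse(y_true_demo, y_pred_demo))
# without flattening, the difference silently becomes a 3x3 matrix
print((y_true_demo - y_pred_demo).shape)
```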

#### **4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.**

In [22]:
# split the data into training and test sets
# note: random_state is fixed at 42, so every call returns the same split
def split_train_test():
    X = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
    y = concrete_data['Strength'] # Strength column
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
    return X_train, X_test, y_train, y_test

In [23]:
mean_squared_err_list = []
for i in range(50):
    X_train, X_test, y_train, y_test = split_train_test()
    model = regression_model(X_train)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    y_pred = model.predict(X_test)
    mean_squared_err_list.append(mean_squared_error(y_test, y_pred))
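The loop above calls `split_train_test()`, which fixes `random_state=42`, so all 50 repetitions see the same split and the spread in the errors reflects only the random weight initialization. If the intent is a fresh random split on each repetition, one option is to tie the seed to the loop index. The sketch below illustrates the idea with a small NumPy-only helper; `random_split_indices` is a hypothetical name, not part of the notebook. With scikit-learn, the equivalent would be passing a different `random_state` to `train_test_split` on each iteration.

```python
import math
import numpy as np

def random_split_indices(n_samples, test_percent, seed):
    """Shuffle row indices with the given seed and return (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_test = math.ceil(n_samples * test_percent / 100)
    return shuffled[n_test:], shuffled[:n_test]

# one split per repetition, seeded by the loop index (3 repetitions shown)
splits = [random_split_indices(1030, 30, seed=i) for i in range(3)]
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx), test_idx[:5])
```

Seeding by the loop index keeps each repetition reproducible while still varying the held-out rows from run to run.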

#### **5. Report the mean and the standard deviation of the mean squared errors.**

In [24]:
mean_squared_err_list = np.asarray(mean_squared_err_list)

In [25]:
np.mean(mean_squared_err_list)

78.6645531154259

In [26]:
np.std(mean_squared_err_list)

19.47142507177436
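A note on the standard deviation: `np.std` divides by `N` by default (`ddof=0`), whereas the sample standard deviation divides by `N - 1` (`ddof=1`). With 50 values the difference is small, but it is worth stating which one is reported. The sketch below shows both on a short list of made-up error values.

```python
import numpy as np

# made-up MSE values, used only to show the two conventions
errors_demo = np.array([60.0, 70.0, 80.0, 90.0, 100.0])

population_std = np.std(errors_demo)        # ddof=0, the np.std default
sample_std = np.std(errors_demo, ddof=1)    # divides by N - 1 instead of N

print(np.mean(errors_demo), population_std, sample_std)
```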

## Part B

#### **B. Normalize the data (5 marks)**

In [30]:
X_norm = (X - X.mean()) / X.std()
X_norm.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 0.862735 | -1.217079 | -0.279597 |
| 1 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 1.055651 | -1.217079 | -0.279597 |
| 2 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 3.55134 |
| 3 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 5.055221 |
| 4 | -0.790075 | 0.678079 | -0.846733 | 0.488555 | -1.038638 | 0.070492 | 0.647569 | 4.976069 |


#### **Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.**
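The notebook standardizes the full predictor table before splitting, so the test rows contribute to the mean and standard deviation. A common alternative, sketched below as an option rather than as what the assignment requires, is to compute the statistics on the training rows only and apply them to both sets. The helper `standardize_with_train_stats` and the demo arrays are made up for this illustration; `ddof=1` is used to match the pandas default behind `X.std()`.

```python
import numpy as np

def standardize_with_train_stats(X_train, X_test):
    """Scale both sets using the per-column mean and std of the training set."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)  # assumes no column is constant (std > 0)
    return (X_train - mean) / std, (X_test - mean) / std

# tiny made-up example with two predictors
X_train_demo = np.array([[100.0, 1.0], [200.0, 3.0], [300.0, 5.0]])
X_test_demo = np.array([[200.0, 3.0], [400.0, 7.0]])

train_scaled, test_scaled = standardize_with_train_stats(X_train_demo, X_test_demo)
print(train_scaled)
print(test_scaled)
```

With this approach the test set is no longer guaranteed to have zero mean and unit variance, which is expected: it is scaled the way unseen data would be.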

In [31]:
# split the normalized data into training and test sets
def split_train_test_norm(data_norm, y):
    X_train, X_test, y_train, y_test = train_test_split(data_norm, y, test_size=0.30, random_state=42)
    return X_train, X_test, y_train, y_test

In [32]:
mean_squared_err_list_B = []
for i in range(50):
    X_train, X_test, y_train, y_test = split_train_test_norm(X_norm, y)
    model = regression_model(X_train)
    model.fit(X_train, y_train, epochs=50, verbose=2)
    y_pred = model.predict(X_test)
    mean_squared_err_list_B.append(mean_squared_error(y_test, y_pred))

Epoch 1/50
 - 6s - loss: 1602.4077
Epoch 2/50
 - 0s - loss: 1567.3730
Epoch 3/50
 - 0s - loss: 1533.1944
Epoch 4/50
 - 0s - loss: 1480.3542
Epoch 5/50
 - 1s - loss: 1396.2421
Epoch 6/50
 - 1s - loss: 1276.8071
Epoch 7/50
 - 0s - loss: 1116.5509
Epoch 8/50
 - 0s - loss: 922.1575
Epoch 9/50
 - 0s - loss: 715.7208
Epoch 10/50
 - 0s - loss: 515.4756
Epoch 11/50
 - 0s - loss: 364.1775
Epoch 12/50
 - 0s - loss: 269.8123
Epoch 13/50
 - 0s - loss: 221.4328
Epoch 14/50
 - 0s - loss: 201.5281
Epoch 15/50
 - 0s - loss: 190.3599
Epoch 16/50
 - 0s - loss: 182.9476
Epoch 17/50
 - 0s - loss: 177.3744
Epoch 18/50
 - 0s - loss: 172.5879
Epoch 19/50
 - 0s - loss: 168.1771
Epoch 20/50
 - 0s - loss: 164.5742
Epoch 21/50
 - 0s - loss: 161.6998
Epoch 22/50
 - 0s - loss: 158.3101
Epoch 23/50
 - 0s - loss: 155.6724
Epoch 24/50
 - 0s - loss: 152.8941
Epoch 25/50
 - 0s - loss: 150.4298
Epoch 26/50
 - 0s - loss: 148.0761
Epoch 27/50
 - 1s - loss: 145.7467
Epoch 28/50
 - 1s - loss: 143.6375
Epoch 29/50
 - 1s - lo

#### **How does the mean of the mean squared errors compare to that from Step A?**

In [33]:
mean_squared_err_list_B = np.asarray(mean_squared_err_list_B)

In [34]:
np.mean(mean_squared_err_list_B)

109.54834541785833

The mean of the mean squared errors from Part B (109.548) is higher than that from Part A (78.665).

## Part C

#### **C. Increase the number of epochs (5 marks)**
#### **Repeat Part B but use 100 epochs this time for training.**

In [36]:
mean_squared_err_list_C = []
for i in range(50):
    X_train, X_test, y_train, y_test = split_train_test_norm(X_norm, y)
    model = regression_model(X_train)
    model.fit(X_train, y_train, epochs=100, verbose=0) # using 100 epochs for training
    y_pred = model.predict(X_test)
    mean_squared_err_list_C.append(mean_squared_error(y_test, y_pred))

#### **How does the mean of the mean squared errors compare to that from Step B?**

In [37]:
mean_squared_err_list_C = np.asarray(mean_squared_err_list_C)
np.mean(mean_squared_err_list_C)

61.740643630195684

The mean of the mean squared errors from Part C (61.74) is much lower than that from Part B (109.548).

## Part D

#### **D. Increase the number of hidden layers (5 marks)**

#### **Repeat part B but use a neural network with the following instead:**
#### **- Three hidden layers, each of 10 nodes and ReLU activation function.**

In [38]:
# define regression model
# increasing hidden layers
def regression_model_C(X_train):
    # create model
    model = Sequential()
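    # 50-node layer carried over from the Part A model
    # (the task statement lists only the three 10-node layers)
    model.add(Dense(50, activation='relu', input_shape=(X_train.shape[1],)))
    # three hidden layers of 10 nodes each
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))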
    model.add(Dense(1))
    
    # compile model
    # Use the adam optimizer and the mean squared error as the loss function
    model.compile(optimizer='adam', loss='mean_squared_error') 
    return model

In [39]:
# repeat Part B processes
mean_squared_err_list_D = []
for i in range(50):
    X_train, X_test, y_train, y_test = split_train_test_norm(X_norm, y)
    model = regression_model_C(X_train)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    y_pred = model.predict(X_test)
    mean_squared_err_list_D.append(mean_squared_error(y_test, y_pred))

#### **How does the mean of the mean squared errors compare to that from Step B?**

In [40]:
mean_squared_err_list_D = np.asarray(mean_squared_err_list_D)
np.mean(mean_squared_err_list_D)

58.90093506531655

The mean of the mean squared errors from Part D (58.90) is much lower than that from Part B (109.548), and slightly lower than that from Part C (61.74).
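To put the four reported means side by side, the sketch below computes the relative change of Parts C and D against Part B. The numbers are copied from the outputs above; `percent_change` is a made-up helper name. Only Part A's standard deviation (about 19.47) is reported in the notebook, so whether the small gap between Parts C and D is meaningful cannot be judged from these figures alone.

```python
# mean MSE per part, copied from the notebook outputs
reported_means = {
    "A": 78.6645531154259,
    "B": 109.54834541785833,
    "C": 61.740643630195684,
    "D": 58.90093506531655,
}

def percent_change(new, baseline):
    """Relative change of `new` against `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline

changes_vs_b = {
    part: percent_change(reported_means[part], reported_means["B"])
    for part in ("C", "D")
}
for part, change in changes_vs_b.items():
    print(f"Part {part} vs Part B: {change:+.1f}%")
```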