IBM: Introduction to Deep Learning & Neural Networks Final Project

Build a Regression Model in Keras

A. Build a baseline model
Use the Keras library to build a neural network with the following:
- One hidden layer of 10 nodes, and a ReLU activation function

- Use the adam optimizer and the mean squared error as the loss function.

1. Randomly split the data into training and test sets by holding out 30% of the data for testing.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
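
The five steps above can be sketched without Keras to make the evaluation protocol explicit. This is only an illustration: the "model" is a stand-in that predicts the training mean, `repeated_holdout` is a hypothetical helper name, and the five strength values are the rows shown by `concrete_data.head()` further down. Rounding the test size up with `math.ceil` is a choice made for this sketch.

```python
import math
import random
import statistics


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def repeated_holdout(targets, n_runs=50, test_fraction=0.3, seed=0):
    """Repeat split / fit / evaluate n_runs times and collect the test MSEs."""
    rng = random.Random(seed)
    n_test = math.ceil(len(targets) * test_fraction)
    errors = []
    for _ in range(n_runs):
        # Step 1: random 70/30 split
        shuffled = list(targets)
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        # Step 2: "train" a fresh stand-in model (assumption: predict the training mean)
        prediction = statistics.fmean(train)
        # Step 3: evaluate on the held-out data
        errors.append(mean_squared_error(test, [prediction] * len(test)))
    return errors


# Strength values from the first five rows of the dataset
strengths = [79.99, 61.89, 40.27, 41.05, 44.3]

errors = repeated_holdout(strengths)          # Step 4: list of 50 MSEs
mean_mse = statistics.fmean(errors)           # Step 5: mean of the MSEs
std_mse = statistics.pstdev(errors)           # population std, like np.std's default
print('Mean MSE: {}\nStandard Deviation: {}'.format(mean_mse, std_mse))
```

Note that `np.std` computes the population standard deviation by default (`ddof=0`), which is why `statistics.pstdev` is used here rather than `statistics.stdev`.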

In [None]:
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error




In [None]:
# Load data
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [None]:
# Split data to predictors and target
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

In [None]:
# Get number of predictors as inputs for network

n_cols = predictors.shape[1]
n_cols

8

In [None]:
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,))) # first layer must match number of inputs
    model.add(Dense(1)) # One output value

    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
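
To make concrete what this architecture computes, here is a minimal pure-Python sketch of the forward pass: a dense layer followed by ReLU, then a linear output unit. The helper names (`relu`, `dense`, `predict`) and the tiny weights are hypothetical and chosen only for illustration; the real network has 8 inputs and 10 hidden units, and its weights are learned by Keras. Weights are stored here as one row per output unit, which is a layout chosen for readability in this sketch.

```python
def relu(values):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One dense layer: each row of `weights` holds the weights of one output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def predict(inputs, hidden_w, hidden_b, out_w, out_b):
    """Hidden layer with ReLU, followed by a single linear output."""
    hidden = relu(dense(inputs, hidden_w, hidden_b))
    return dense(hidden, out_w, out_b)[0]


# Toy example: 2 inputs, 2 hidden units, 1 output (illustrative numbers only)
x = [1.0, -2.0]
hidden_w = [[1.0, 0.0], [0.0, 1.0]]
hidden_b = [0.5, 0.5]
out_w = [[2.0, 3.0]]
out_b = [1.0]

y = predict(x, hidden_w, hidden_b, out_w, out_b)
print(y)
```

The second hidden unit receives a negative pre-activation, so ReLU zeroes it and only the first unit contributes to the output.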

In [None]:
# Repeat the split / train / evaluate cycle 50 times
errors = []
for i in range(50):
    # Step 1: Randomly split dataset to training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.3)

    # Step 2: Build a fresh model so weights from earlier runs (trained on rows
    # that may now be in the test set) do not carry over, then train for 50 epochs
    model = regression_model()
    model.fit(X_train, y_train, verbose=0, epochs=50)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test, verbose=0))
    print('Run {} out of 50'.format(i + 1), end='\r')
    


In [None]:
meanMSE_Part_A = np.mean(errors)
stdDev_Part_A = np.std(errors)
print('Mean MSE: {}\nStandard Deviation: {}'.format(meanMSE_Part_A, stdDev_Part_A))

Mean MSE: 121.07828475952148
Standard Deviation: 28.916516744971265

B. Normalize the Data


Repeat Part A but use a normalized version of the data.

In [None]:
# Normalize predictors

predictors_norm = (predictors - predictors.mean()) / predictors.std()
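
The z-score normalization above can be checked by hand on a single column. The sketch below uses the five Cement values shown by `concrete_data.head()`; `standardize` is a hypothetical helper name. It uses the sample standard deviation (dividing by n - 1), which is what pandas' `.std()` computes by default.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample std (n - 1), like pandas' default ddof=1
    return [(v - mean) / std for v in values]


# Cement column from the first five rows of the dataset
cement = [540.0, 540.0, 332.5, 332.5, 198.6]
cement_norm = standardize(cement)

print(cement_norm)
print(statistics.fmean(cement_norm), statistics.stdev(cement_norm))
```

After the transformation the column has mean 0 and standard deviation 1 (up to floating-point error), so features measured on very different scales, such as Age and Coarse Aggregate, contribute on a comparable scale.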

In [None]:
# Re-run the evaluation on normalized predictors
errors = []
for i in range(50):
    # Step 1: Randomly split dataset to training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)

    # Step 2: Build a fresh model for this run and train it for 50 epochs
    model = regression_model()
    model.fit(X_train, y_train, verbose=0, epochs=50)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test, verbose=0))
    print('Run {} out of 50'.format(i + 1), end='\r')


In [None]:
meanMSE_Part_B = np.mean(errors)
improvement = 100 * (meanMSE_Part_A - meanMSE_Part_B) / meanMSE_Part_A
stdDev_Part_B = np.std(errors)
print('Mean MSE: {}\nStandard Deviation: {}'.format(meanMSE_Part_B, stdDev_Part_B))
print('Normalizing values reduced mean MSE by {} percent from part A'.format(improvement))

Mean MSE: 52.0611413192749
Standard Deviation: 98.26574546555324
Normalizing values reduced mean MSE by 57.00208222913327 percent from part A

Normalizing the data reduced the mean MSE by about 57% relative to Part A, although the standard deviation across the 50 runs rose from about 28.9 to about 98.3.
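
The percentage is a relative reduction measured against the earlier part's mean MSE, so the baseline goes in the denominator. A minimal sketch of the arithmetic, using the mean MSE values reported for Parts A and B (`percent_reduction` is a hypothetical helper name):

```python
def percent_reduction(baseline, new):
    """Relative reduction from `baseline` to `new`, in percent of the baseline."""
    return 100 * (baseline - new) / baseline


mean_mse_part_a = 121.07828475952148  # reported in Part A
mean_mse_part_b = 52.0611413192749    # reported in Part B

reduction = percent_reduction(mean_mse_part_a, mean_mse_part_b)
print('{:.2f} percent'.format(reduction))
```

Dividing by the new value instead of the baseline would answer a different question (how much larger the old error is than the new one) and gives a larger number.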

C. Increase the number of epochs

Repeat Part B but use 100 epochs this time for training

In [None]:
errors = []
for i in range(50):
    # Step 1: Randomly split dataset to training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)

    # Step 2: Build a fresh model for this run and train it for 100 epochs
    model = regression_model()
    model.fit(X_train, y_train, verbose=0, epochs=100)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test, verbose=0))
    print('Run {} out of 50'.format(i + 1), end='\r')



In [None]:
meanMSE_Part_C = np.mean(errors)
improvement = 100 * (meanMSE_Part_B - meanMSE_Part_C) / meanMSE_Part_B
stdDev_Part_C = np.std(errors)
print('Mean MSE: {}\nStandard Deviation: {}'.format(meanMSE_Part_C, stdDev_Part_C))
print('Increasing epochs from 50 to 100 reduced mean MSE by {} percent from part B'.format(improvement))

Mean MSE: 23.65131546020508
Standard Deviation: 2.5576829442248354
Increasing epochs from 50 to 100 reduced mean MSE by 54.57011724894991 percent from part B


Increasing the epochs from 50 to 100 reduced the MSE by about 55% from Part B

D. Increase the number of hidden layers

Repeat part B but use a neural network with three hidden layers, each with 10 nodes and a ReLU activation function

In [None]:
def three_layer_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,))) # first layer must match number of inputs
    model.add(Dense(10, activation="relu"))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(1)) # One output value

    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
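
One way to see how much capacity the extra layers add is to count trainable parameters. A dense layer with `n_inputs` inputs and `n_units` units has `n_inputs * n_units` weights plus `n_units` biases. The sketch below applies that formula to both architectures, assuming the 8 predictors reported by `n_cols`; `dense_params` and `count_params` are hypothetical helper names, not Keras functions.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def count_params(n_inputs, layer_sizes):
    """Total parameters of a stack of dense layers with the given unit counts."""
    total = 0
    for units in layer_sizes:
        total += dense_params(n_inputs, units)
        n_inputs = units  # this layer's outputs feed the next layer
    return total


one_hidden = count_params(8, [10, 1])            # baseline model
three_hidden = count_params(8, [10, 10, 10, 1])  # three hidden layers

print(one_hidden, three_hidden)
```

With these sizes the baseline has 101 parameters and the three-hidden-layer model has 321, so the deeper network has roughly three times as many weights to fit on the same training data.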

In [None]:
# Run the evaluation of the new model
errors = []
for i in range(50):
    # Step 1: Randomly split dataset to training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)

    # Step 2: Build a fresh three-hidden-layer model for this run and train it for 50 epochs
    model = three_layer_model()
    model.fit(X_train, y_train, verbose=0, epochs=50)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test, verbose=0))
    print('Run {} out of 50'.format(i + 1), end='\r')


In [None]:
meanMSE_Part_D = np.mean(errors)
# Relative reduction is measured against the Part B baseline
improvement = 100 * (meanMSE_Part_B - meanMSE_Part_D) / meanMSE_Part_B
stdDev_Part_D = np.std(errors)
print('Mean MSE: {}\nStandard Deviation: {}'.format(meanMSE_Part_D, stdDev_Part_D))
print('Adding two more hidden layers decreased mean MSE by {:.2f} percent from part B'.format(improvement))

Mean MSE: 37.86726760864258
Standard Deviation: 25.10171780864401
Adding two more hidden layers decreased mean MSE by 27.26 percent from part B


Adding two more hidden layers reduced the mean MSE by about 27% from Part B (from about 52.06 to about 37.87).