IBM: Introduction to Deep Learning & Neural Networks Final Project

Build a Regression Model in Keras

A. Build a baseline model
Use the Keras library to build a neural network with the following:
- One hidden layer of 10 nodes, and a ReLU activation function

- Use the Adam optimizer and mean squared error as the loss function.

1. Randomly split the data into training and test sets, holding out 30% of the data for testing.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1-3 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
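
The five steps above can be sketched as a small helper. This is only an illustration under stated assumptions, not the notebook's implementation: `LeastSquaresModel` is a hypothetical stand-in so the sketch runs without Keras, the data is synthetic, and `repeated_holdout_mse` is a name made up here. With Keras, the factory would be a function such as `regression_model`, wrapped so that `fit` passes `epochs=50` (wrapper not shown). The sketch builds a fresh model in every repetition, which is one reasonable reading of step 4.

```python
import numpy as np


class LeastSquaresModel:
    """Hypothetical stand-in for a Keras model: linear least squares with a bias term."""

    def fit(self, X, y):
        X1 = np.column_stack([X, np.ones(len(X))])
        self.coef_, *_ = np.linalg.lstsq(X1, y, rcond=None)
        return self

    def predict(self, X):
        X1 = np.column_stack([X, np.ones(len(X))])
        return X1 @ self.coef_


def repeated_holdout_mse(build_model, X, y, n_repeats=50, test_fraction=0.3, seed=0):
    """Run steps 1-3 n_repeats times and return the list of test MSEs."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    n_test = int(round(test_fraction * len(X)))
    errors = []
    for _ in range(n_repeats):
        order = rng.permutation(len(X))            # step 1: random split
        test_idx, train_idx = order[:n_test], order[n_test:]
        model = build_model()                      # fresh, untrained model each repetition
        model.fit(X[train_idx], y[train_idx])      # step 2: train
        pred = model.predict(X[test_idx])          # step 3: evaluate on held-out rows
        errors.append(float(np.mean((y[test_idx] - pred) ** 2)))
    return errors


# Synthetic data, made up for illustration (not the concrete dataset)
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 8))
y_demo = X_demo @ np.arange(1.0, 9.0) + rng.normal(scale=0.5, size=200)

errors_demo = repeated_holdout_mse(LeastSquaresModel, X_demo, y_demo)  # step 4
# Step 5: report mean and standard deviation of the 50 errors
print('Mean MSE: {}\nStandard Deviation: {}'.format(np.mean(errors_demo), np.std(errors_demo)))
```

Note that `np.std` computes the population standard deviation by default (`ddof=0`); pass `ddof=1` if the sample standard deviation is wanted.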

In [7]:
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split

In [2]:
# Load data
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

   Cement  Blast Furnace Slag  Fly Ash  Water  Superplasticizer  Coarse Aggregate  Fine Aggregate  Age  Strength
0   540.0                 0.0      0.0  162.0               2.5            1040.0           676.0   28     79.99
1   540.0                 0.0      0.0  162.0               2.5            1055.0           676.0   28     61.89
2   332.5               142.5      0.0  228.0               0.0             932.0           594.0  270     40.27
3   332.5               142.5      0.0  228.0               0.0             932.0           594.0  365     41.05
4   198.6               132.4      0.0  192.0               0.0             978.4           825.5  360      44.3


In [3]:
# Split data into predictors and target
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

In [5]:
# Get number of predictors as inputs for network

n_cols = predictors.shape[1]
n_cols

8

In [6]:
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,))) # hidden layer; input_shape must match the number of predictors
    model.add(Dense(1)) # One output value

    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

In [9]:
# Build and compile the model
model = regression_model()

In [14]:
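# Note: `model` is created once, outside the loop, so each iteration continues
# training the same weights on a new split rather than starting from scratch.
errors = []
for i in range(50):
    # Step 1: Randomly split dataset into training and testing sets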
    X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.3)

    # Step 2: Train the model on the training data using 50 epochs
    model.fit(X_train, y_train, verbose=0, epochs=50)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test))
print('Mean MSE: {}\nStandard Deviation: {}'.format(np.mean(errors), np.std(errors)))

Mean MSE: 64.45578796386718
Standard Deviation: 20.524928457567043


B. Normalize the Data


Repeat Part A but use a normalized version of the data.

In [15]:
# Normalize predictors

predictors_norm = (predictors - predictors.mean()) / predictors.std()
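
The cell above standardizes with the mean and standard deviation of the full dataset, so rows that later land in the test set also contribute to those statistics. A common variant, sketched below as an assumption rather than as what this notebook does, computes the statistics on the training split only and applies them to both splits. The function name and the toy values are made up for illustration.

```python
import numpy as np


def standardize_with_train_stats(X_train, X_test):
    """Z-score both splits using the mean and std of the training split only."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)   # ddof=1 matches the pandas .std() default
    std[std == 0] = 1.0                 # guard against constant columns
    return (X_train - mean) / std, (X_test - mean) / std


# Toy values, made up for illustration
X_train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test_demo = np.array([[2.0, 40.0]])
train_norm, test_norm = standardize_with_train_stats(X_train_demo, X_test_demo)
print(train_norm)
print(test_norm)
```

In the evaluation loop, such a helper would be called after `train_test_split` on the raw predictors, once per repetition.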

In [16]:
# Re-run the model on normalized predictors

errors = []
for i in range(50):
    # Step 1: Randomly split dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)

    # Step 2: Train the model on the training data using 50 epochs
    model.fit(X_train, y_train, verbose=0, epochs=50)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test))
print('Mean MSE: {}\nStandard Deviation: {}'.format(np.mean(errors), np.std(errors)))

Mean MSE: 53.87340133666992
Standard Deviation: 31.211576387766904


Normalizing the data decreased the mean MSE by about 16% relative to Part A (from 64.46 to 53.87), although the standard deviation increased from 20.52 to 31.21.

C. Increase the number of epochs

Repeat Part B, but train for 100 epochs this time.

In [18]:
errors = []
for i in range(50):
    # Step 1: Randomly split dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)

    # Step 2: Train the model on the training data using 100 epochs
    model.fit(X_train, y_train, verbose=0, epochs=100)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test))
print('Mean MSE: {}\nStandard Deviation: {}'.format(np.mean(errors), np.std(errors)))

Mean MSE: 37.40694549560547
Standard Deviation: 2.475810510526528


Increasing the epochs from 50 to 100 reduced the mean MSE by about 31% relative to Part B (from 53.87 to 37.41), and the standard deviation fell from 31.21 to 2.48.

D. Increase the number of hidden layers

Repeat Part B, but use a neural network with three hidden layers, each with 10 nodes and a ReLU activation function.

In [19]:
def three_layer_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,))) # first hidden layer; input_shape must match the number of predictors
    model.add(Dense(10, activation='relu')) # second hidden layer
    model.add(Dense(10, activation='relu')) # third hidden layer
    model.add(Dense(1)) # One output value

    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
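
To get a feel for how much capacity the two extra hidden layers add, the number of trainable parameters in a stack of fully connected layers can be worked out by hand: each layer has one weight per input-output pair plus one bias per output node. The helper below is a plain-Python sketch of that arithmetic (its name is made up here) and assumes the 8 predictors reported by `n_cols` above.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's node count.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


baseline_params = dense_param_count([8, 10, 1])              # one hidden layer
three_layer_params = dense_param_count([8, 10, 10, 10, 1])   # three hidden layers
print(baseline_params, three_layer_params)
```

The same totals should appear in the output of `model.summary()` for the corresponding Keras models.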

In [20]:
# Run the evaluation of the new model
model = three_layer_model()
errors = []
for i in range(50):
    print('Run ', i)
    # Step 1: Randomly split dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)

    # Step 2: Train the model on the training data using 50 epochs
    model.fit(X_train, y_train, verbose=0, epochs=50)

    # Step 3: Evaluate the model on the test data and compute the mean squared error
    errors.append(model.evaluate(X_test, y_test))
print('Mean MSE: {}\nStandard Deviation: {}'.format(np.mean(errors), np.std(errors)))

Run  0
Run  1
Run  2
Run  3
Run  4
Run  5
Run  6
Run  7
Run  8
Run  9
Run  10
Run  11
Run  12
Run  13
Run  14
Run  15
Run  16
Run  17
Run  18
Run  19
Run  20
Run  21
Run  22
Run  23
Run  24
Run  25
Run  26
Run  27
Run  28
Run  29
Run  30
Run  31
Run  32
Run  33
Run  34
Run  35
Run  36
Run  37
Run  38
Run  39
Run  40
Run  41
Run  42
Run  43
Run  44
Run  45
Run  46
Run  47
Run  48
Run  49
Mean MSE: 32.43746513366699
Standard Deviation: 20.841952452986124


Adding two more hidden layers reduced the mean MSE by about 40% relative to Part B (from 53.87 to 32.44); the standard deviation was 20.84, compared with 31.21 in Part B.
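
The percentage comparisons between parts follow from the reported mean MSE values. A short sketch of the arithmetic, using the numbers printed in Parts A to D (the helper name is made up for illustration):

```python
def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# Mean MSE values as printed by the notebook cells above
mean_mse = {
    'A': 64.45578796386718,
    'B': 53.87340133666992,
    'C': 37.40694549560547,
    'D': 32.43746513366699,
}

b_vs_a = relative_reduction(mean_mse['A'], mean_mse['B'])
c_vs_b = relative_reduction(mean_mse['B'], mean_mse['C'])
d_vs_b = relative_reduction(mean_mse['B'], mean_mse['D'])
print('B vs A: {:.1f}%'.format(b_vs_a))
print('C vs B: {:.1f}%'.format(c_vs_b))
print('D vs B: {:.1f}%'.format(d_vs_b))
```

Given standard deviations of 20 or more in Parts A, B, and D, differences of this size between means should be read with some caution.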