<h2>Regression Models with Keras</h2>

<h3>Objective for this Notebook</h3>
<h5> 1. How to use the Keras library to build a regression model.</h5>
<h5> 2. Download and Clean dataset </h5>
<h5> 3. Build a Neural Network </h5>
<h5> 4. Train and Test the Network. </h5>     

## Imports and Dependencies

In [1]:
import pandas as pd
import numpy as np

# Import the Scikit-Learn train_test_split function
from sklearn.model_selection import train_test_split

# Import Keras and the packages needed to build a regression model
import keras
from keras.models import Sequential
from keras.layers import Dense

## Download and Clean Dataset



The dataset records the compressive strength of different concrete samples together with the amounts of the ingredients used to make them. The ingredients are:

1. Cement

2. Blast Furnace Slag

3. Fly Ash

4. Water

5. Superplasticizer

6. Coarse Aggregate

7. Fine Aggregate

In [2]:
# Download the data and read it into a Pandas DataFrame.

concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


So the first concrete sample contains 540 kg of cement, 0 kg of blast furnace slag, 0 kg of fly ash, 162 kg of water, 2.5 kg of superplasticizer, 1040 kg of coarse aggregate, and 676 kg of fine aggregate per cubic meter of mix. This mix, at 28 days old, has a compressive strength of 79.99 MPa.

In [3]:
# How many datapoints do we have?

concrete_data.shape

(1030, 9)

So there are 1,030 samples, each with 8 predictors and 1 target value. With so few samples, we have to be careful not to overfit the training data.

In [4]:
# Check the dataset for any missing values.

concrete_data.describe()
concrete_data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

The data looks very clean and is ready to be used to build our model.

#### Split data into predictors and target

The target variable in this problem is the concrete sample strength. Therefore, our predictors will be all the other columns.

In [5]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

In [6]:
# Quick check of the predictors and target dataframes

print(predictors.head())
print(target.head())

   Cement  Blast Furnace Slag  Fly Ash  Water  Superplasticizer  \
0   540.0                 0.0      0.0  162.0               2.5   
1   540.0                 0.0      0.0  162.0               2.5   
2   332.5               142.5      0.0  228.0               0.0   
3   332.5               142.5      0.0  228.0               0.0   
4   198.6               132.4      0.0  192.0               0.0   

   Coarse Aggregate  Fine Aggregate  Age  
0            1040.0           676.0   28  
1            1055.0           676.0   28  
2             932.0           594.0  270  
3             932.0           594.0  365  
4             978.4           825.5  360  
0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64


In [7]:
# Save the number of predictors to *n_cols* since we will need this number when building our network.

n_cols = predictors.shape[1] # number of predictors

### Feature Selection

In [8]:
# Define the feature sets
X = predictors
X[0:5]

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 |


In [9]:
# Define the labels
y = target
y[0:5]

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64

## A. Build a Baseline Neural Network

#### Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- Use the adam optimizer and the mean squared error as the loss function.

In [12]:
# A function that defines the regression model so that we can conveniently call it to create the model.

def build_baseline_model():
    # create model: one hidden layer of 10 ReLU units, then one linear output unit
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

The above function creates a model with one hidden layer of 10 units, followed by a single-unit output layer.

In [13]:
# Call build_baseline_model() to build the model.
# The function has its own name so that the model object does not overwrite it.
baseline_model = build_baseline_model()
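
The size of this network can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below is a dependency-free illustration of that count; `dense_params` and `mlp_params` are hypothetical helper names, not part of Keras, and the layer widths are taken from the model definition above (8 predictors, 10 hidden units, 1 output).

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def mlp_params(layer_sizes):
    """Total trainable parameters for consecutive dense layers.

    layer_sizes lists the input width followed by each layer's width,
    e.g. [8, 10, 1] for 8 predictors, 10 hidden units, 1 output.
    """
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


baseline_total = mlp_params([8, 10, 1])
print(baseline_total)  # 8*10 + 10 + 10*1 + 1 = 101
```

With roughly a hundred parameters and about 700 training rows, the network is small relative to the data, which is one reason a single hidden layer is a reasonable starting point here.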

In [14]:
# Split the data: 70% for training, 30% for testing

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=4)

# Print the shapes of the training set
print('Train set:', X_train.shape, y_train.shape)
# Print the shapes of the test set
print('Test set:', X_test.shape, y_test.shape)

Train set: (721, 8) (721,)
Test set: (309, 8) (309,)
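
The 721/309 sizes follow directly from `test_size=0.3` applied to 1,030 rows, and the fixed `random_state` makes the partition repeatable. The sketch below illustrates both points with the standard library only. It assumes the test count is rounded up, and it does not reproduce the exact rows that scikit-learn selects; `split_indices` is a hypothetical helper.

```python
import math
import random


def split_indices(n_samples, test_size=0.3, seed=4):
    """Shuffle row indices with a fixed seed and cut off a test portion."""
    n_test = math.ceil(test_size * n_samples)  # assumption: test count rounds up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]  # (train, test)


train_idx, test_idx = split_indices(1030)
print(len(train_idx), len(test_idx))  # 721 309

# The same seed always yields the same partition.
print(split_indices(1030) == (train_idx, test_idx))  # True
```

A practical consequence is that calling the split repeatedly with the same seed never produces a new test set; a different seed per call is needed for that.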


In [15]:
# Fit the model using 50 epochs

baseline_model.fit(X_train, y_train, epochs=50, verbose=2)

Epoch 1/50
23/23 - 1s - loss: 22376.4785 - 536ms/epoch - 23ms/step
Epoch 2/50
23/23 - 0s - loss: 10635.1328 - 33ms/epoch - 1ms/step
Epoch 3/50
23/23 - 0s - loss: 9189.0107 - 32ms/epoch - 1ms/step
Epoch 4/50
23/23 - 0s - loss: 8033.9702 - 31ms/epoch - 1ms/step
Epoch 5/50
23/23 - 0s - loss: 7028.8564 - 35ms/epoch - 2ms/step
Epoch 6/50
23/23 - 0s - loss: 6097.1826 - 33ms/epoch - 1ms/step
Epoch 7/50
23/23 - 0s - loss: 5242.8223 - 35ms/epoch - 2ms/step
Epoch 8/50
23/23 - 0s - loss: 4589.0649 - 34ms/epoch - 1ms/step
Epoch 9/50
23/23 - 0s - loss: 4083.3491 - 35ms/epoch - 2ms/step
Epoch 10/50
23/23 - 0s - loss: 3667.1086 - 34ms/epoch - 1ms/step
Epoch 11/50
23/23 - 0s - loss: 3319.2837 - 35ms/epoch - 2ms/step
Epoch 12/50
23/23 - 0s - loss: 2992.0264 - 33ms/epoch - 1ms/step
Epoch 13/50
23/23 - 0s - loss: 2685.9949 - 34ms/epoch - 1ms/step
Epoch 14/50
23/23 - 0s - loss: 2380.9316 - 34ms/epoch - 1ms/step
Epoch 15/50
23/23 - 0s - loss: 2103.6970 - 32ms/epoch - 1ms/step
Epoch 16/50
23/23 - 0s - loss:

<keras.callbacks.History at 0x7f2b88078be0>

In [17]:
# predict

baseline_pred = baseline_model.predict(X_test)

In [18]:
# evaluate

from sklearn.metrics import mean_squared_error

baseline_mse = mean_squared_error(y_test, baseline_pred)
print(baseline_mse)

292.82349352072555
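
For reference, the mean squared error is the average of the squared differences between the true and predicted values. The sketch below computes it by hand on made-up numbers; `mean_squared_error_manual` is a hypothetical name, and the values are illustrative only, not taken from the model above.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only (not produced by the network).
y_true = [80.0, 62.0, 40.0]
y_pred = [70.0, 60.0, 46.0]

mse = mean_squared_error_manual(y_true, y_pred)
print(mse)         # (100 + 4 + 36) / 3
print(mse ** 0.5)  # root mean squared error, in the units of the target
```

Taking the square root puts the error back in the units of the target: the MSE of about 292.8 reported above corresponds to a root mean squared error of roughly 17.1 MPa.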


Repeat steps 1-3 (split the data, train the model, compute the test MSE) 50 times to create a list of 50 mean squared errors.

In [19]:
# Step 1: split the data
# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=4)

# Step 2: train the model and predict on the test set
# baseline_model.fit(X_train, y_train, epochs=50, verbose=2)
# baseline_pred = baseline_model.predict(X_test)

# Step 3: compute the mean squared error
# baseline_mse = mean_squared_error(y_test, baseline_pred)
# print(baseline_mse)

In [20]:
# Iterating MSE calculation to build up the list.
# Note: random_state=4 produces the same split on every iteration, and
# baseline_model is not re-created, so each fit() call continues training
# from the weights left by the previous iteration.

bl_mse_list = []
list_length = 50

for i in range(list_length):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=4)
    baseline_model.fit(X_train, y_train, epochs=50, verbose=2)
    baseline_pred = baseline_model.predict(X_test)
    baseline_mse = mean_squared_error(y_test, baseline_pred)
    
    bl_mse_list.append(baseline_mse)
    
print(bl_mse_list)

Epoch 1/50
23/23 - 0s - loss: 276.7990 - 39ms/epoch - 2ms/step
Epoch 2/50
23/23 - 0s - loss: 270.1143 - 35ms/epoch - 2ms/step
Epoch 3/50
23/23 - 0s - loss: 264.6984 - 33ms/epoch - 1ms/step
Epoch 4/50
23/23 - 0s - loss: 259.5013 - 32ms/epoch - 1ms/step
Epoch 5/50
23/23 - 0s - loss: 253.2444 - 30ms/epoch - 1ms/step
Epoch 6/50
23/23 - 0s - loss: 247.5506 - 51ms/epoch - 2ms/step
Epoch 7/50
23/23 - 0s - loss: 242.5065 - 37ms/epoch - 2ms/step
Epoch 8/50
23/23 - 0s - loss: 236.5523 - 31ms/epoch - 1ms/step
Epoch 9/50
23/23 - 0s - loss: 231.2749 - 34ms/epoch - 1ms/step
Epoch 10/50
23/23 - 0s - loss: 227.3334 - 32ms/epoch - 1ms/step
Epoch 11/50
23/23 - 0s - loss: 221.5897 - 33ms/epoch - 1ms/step
Epoch 12/50
23/23 - 0s - loss: 215.9935 - 33ms/epoch - 1ms/step
Epoch 13/50
23/23 - 0s - loss: 212.0699 - 40ms/epoch - 2ms/step
Epoch 14/50
23/23 - 0s - loss: 206.5029 - 35ms/epoch - 2ms/step
Epoch 15/50
23/23 - 0s - loss: 200.6783 - 35ms/epoch - 2ms/step
Epoch 16/50
23/23 - 0s - loss: 196.2036 - 35ms/ep

Report the mean and the standard deviation of the mean squared errors.

In [21]:
# Mean of the Baseline Model MSE list
bl_mse_mean = np.mean(bl_mse_list)
print("The mean of the baseline model's MSE list is: ", bl_mse_mean)

# Standard deviation of the Baseline Model MSE list
bl_mse_std = np.std(bl_mse_list)
print("The standard deviation of the baseline model's MSE list is: ", bl_mse_std)

The mean of the baseline model's MSE list is:  100.79406088536294
The standard deviation of the baseline model's MSE list is:  26.192656342561907
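
The loop above calls `train_test_split` with `random_state=4` on every pass and keeps fitting the same model object, so the 50 MSE values come from one split and one continuously trained network. If independent repetitions are wanted, each pass needs a new split and a newly built model. A minimal, dependency-free sketch of that pattern follows. `repeated_mse` and `MeanPredictor` are hypothetical names, and `MeanPredictor` is only a stand-in for a function that builds and compiles a fresh Keras model.

```python
import random
from statistics import mean


def repeated_mse(build_model, X, y, n_repeats=50, test_fraction=0.3):
    """Train a fresh model on a fresh random split n_repeats times.

    Returns the list of test-set mean squared errors, one per repetition.
    """
    mse_list = []
    n = len(X)
    n_test = int(round(n * test_fraction))
    for seed in range(n_repeats):
        # A different seed per repetition gives a different train/test split.
        idx = list(range(n))
        random.Random(seed).shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]

        model = build_model()  # new, untrained model each time
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])

        errors = [(y[i] - p) ** 2 for i, p in zip(test_idx, preds)]
        mse_list.append(mean(errors))
    return mse_list


class MeanPredictor:
    """Hypothetical stand-in model: always predicts the training-target mean."""

    def fit(self, X, y):
        self.mean_ = mean(y)

    def predict(self, X):
        return [self.mean_ for _ in X]


# Toy data, only to show the call pattern.
X = [[float(i)] for i in range(20)]
y = [2.0 * i for i in range(20)]

mse_list = repeated_mse(MeanPredictor, X, y, n_repeats=5)
print(len(mse_list), mean(mse_list))
```

With this structure the standard deviation of the list measures run-to-run variation, because no repetition inherits weights or test rows from the previous one.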


## B. Normalize Data for New Neural Network

#### Repeat Part A but use a normalized version of the data.

In [22]:
# Normalize the data by subtracting the mean and dividing by the standard deviation.

predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 0.862735 | -1.217079 | -0.279597 |
| 1 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 1.055651 | -1.217079 | -0.279597 |
| 2 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 3.55134 |
| 3 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 5.055221 |
| 4 | -0.790075 | 0.678079 | -0.846733 | 0.488555 | -1.038638 | 0.070492 | 0.647569 | 4.976069 |
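
The normalization used here is the z-score: each value has its column mean subtracted and is then divided by the column standard deviation. The sketch below applies the same formula to a single toy column using only the standard library. `zscore` is a hypothetical helper, and the five cement values stand in for a full column, so the results differ from the table above, which uses statistics over all 1,030 rows.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a column: subtract the mean, divide by the sample std."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1), like pandas .std()
    return [(v - mu) / sigma for v in values]


# Cement values of the first five rows, used here only as a toy column.
cement = [540.0, 540.0, 332.5, 332.5, 198.6]
cement_z = zscore(cement)
print([round(z, 3) for z in cement_z])
```

After this transformation the column has mean 0 and sample standard deviation 1, so predictors measured on very different scales (for example Age versus Coarse Aggregate) contribute on a comparable footing.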


In [23]:
# Save the number of normalized predictors to n_cols_norm since we will need this number when building our network.

n_cols_norm = predictors_norm.shape[1] # number of predictors

In [24]:
# A function that defines the regression model so that we can conveniently call it to create the model.

def normalized_model():
    # create model
    norm_model = Sequential()
    norm_model.add(Dense(10, activation='relu', input_shape=(n_cols_norm,)))
    norm_model.add(Dense(1))
    
    # compile model
    norm_model.compile(optimizer='adam', loss='mean_squared_error')
    return norm_model

In [25]:
# Call the normalized_model() function to build the model
norm_model = normalized_model()

In [26]:
# Define the feature sets
X_norm = predictors_norm

# Define the labels (the target itself is not normalized)
y_norm = target
y_norm[0:5]

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64

In [27]:
# Iterating MSE calculation to build up the list

norm_mse_list = []
list_length = 50

for i in range(list_length):
    X_train, X_test, y_train, y_test = train_test_split(X_norm, y_norm, test_size=0.3, random_state=4)
    norm_model.fit(X_train, y_train, epochs=50, verbose=2)
    norm_pred = norm_model.predict(X_test)
    norm_mse = mean_squared_error(y_test, norm_pred)
    
    norm_mse_list.append(norm_mse)
    
print(norm_mse_list)

Epoch 1/50
23/23 - 0s - loss: 1606.9003 - 450ms/epoch - 20ms/step
Epoch 2/50
23/23 - 0s - loss: 1591.0958 - 35ms/epoch - 2ms/step
Epoch 3/50
23/23 - 0s - loss: 1575.8619 - 32ms/epoch - 1ms/step
Epoch 4/50
23/23 - 0s - loss: 1560.5397 - 32ms/epoch - 1ms/step
Epoch 5/50
23/23 - 0s - loss: 1545.5397 - 34ms/epoch - 1ms/step
Epoch 6/50
23/23 - 0s - loss: 1530.1271 - 32ms/epoch - 1ms/step
Epoch 7/50
23/23 - 0s - loss: 1514.5359 - 32ms/epoch - 1ms/step
Epoch 8/50
23/23 - 0s - loss: 1498.2355 - 32ms/epoch - 1ms/step
Epoch 9/50
23/23 - 0s - loss: 1481.5768 - 33ms/epoch - 1ms/step
Epoch 10/50
23/23 - 0s - loss: 1464.3424 - 32ms/epoch - 1ms/step
Epoch 11/50
23/23 - 0s - loss: 1445.7241 - 33ms/epoch - 1ms/step
Epoch 12/50
23/23 - 0s - loss: 1426.8955 - 33ms/epoch - 1ms/step
Epoch 13/50
23/23 - 0s - loss: 1406.5642 - 33ms/epoch - 1ms/step
Epoch 14/50
23/23 - 0s - loss: 1385.5956 - 35ms/epoch - 2ms/step
Epoch 15/50
23/23 - 0s - loss: 1363.3666 - 33ms/epoch - 1ms/step
Epoch 16/50
23/23 - 0s - loss: 1

In [28]:
# Mean of the Normalized Model MSE list
norm_mse_mean = np.mean(norm_mse_list)
print("The mean of the normalized model's MSE list is: ", norm_mse_mean)

# Standard deviation of the Normalized Model MSE list
norm_mse_std = np.std(norm_mse_list)
print("The standard deviation of the normalized model's MSE list is: ", norm_mse_std)

The mean of the normalized model's MSE list is:  55.08146399969431
The standard deviation of the normalized model's MSE list is:  59.280710240894486


#### How does the mean of the mean squared errors compare to that from Step A?

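The mean of the normalized model's MSE list is 55.08, which is about 55% of the baseline model's mean of 100.79.

The standard deviation of the normalized model's MSE list is 59.28, a little more than double the baseline model's standard deviation of 26.19. Because every iteration of the loop reuses the same split and keeps training the same model, this spread mainly reflects how the test MSE changes as training accumulates, not variation between independent runs.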
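
The comparison can be checked with a few lines of arithmetic on the values reported above. This is only a worked calculation; the variable names are chosen here for illustration.

```python
# Values reported above for Part A (baseline) and Part B (normalized).
baseline_mean, baseline_std = 100.79406088536294, 26.192656342561907
norm_mean, norm_std = 55.08146399969431, 59.280710240894486

mean_ratio = norm_mean / baseline_mean
std_ratio = norm_std / baseline_std

print(f"mean ratio: {mean_ratio:.2f}")  # 0.55
print(f"std ratio:  {std_ratio:.2f}")   # 2.26
```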

## C. Increase Number of Epochs for New Neural Network

#### Repeat Part B but use 100 epochs this time for training.

In [34]:
# A function that defines the regression model so that we can conveniently call it to create the model.
# Note: the hidden layer here has 50 nodes, whereas the Part B model uses 10.

def more_epochs_model():
    # create model
    more_model = Sequential()
    more_model.add(Dense(50, activation='relu', input_shape=(n_cols_norm,)))
    more_model.add(Dense(1))
    
    # compile model
    more_model.compile(optimizer='adam', loss='mean_squared_error')
    return more_model

In [35]:
# Call the more_epochs_model() function to build the model
more_model = more_epochs_model()

In [36]:
# Iterating MSE calculation to build up the list.
# Note: fit() is called with epochs=50 below, matching the log that follows,
# although this part of the exercise asks for 100 epochs.

more_mse_list = []

for i in range(list_length):
    X_train, X_test, y_train, y_test = train_test_split(X_norm, y_norm, test_size=0.3, random_state=4)
    more_model.fit(X_train, y_train, epochs=50, verbose=2)
    more_pred = more_model.predict(X_test)
    more_mse = mean_squared_error(y_test, more_pred)
    
    more_mse_list.append(more_mse)
    
print(more_mse_list)

Epoch 1/50
23/23 - 0s - loss: 1603.9688 - 435ms/epoch - 19ms/step
Epoch 2/50
23/23 - 0s - loss: 1558.5231 - 35ms/epoch - 2ms/step
Epoch 3/50
23/23 - 0s - loss: 1514.4658 - 35ms/epoch - 2ms/step
Epoch 4/50
23/23 - 0s - loss: 1469.9624 - 34ms/epoch - 1ms/step
Epoch 5/50
23/23 - 0s - loss: 1424.2463 - 35ms/epoch - 2ms/step
Epoch 6/50
23/23 - 0s - loss: 1376.3610 - 34ms/epoch - 1ms/step
Epoch 7/50
23/23 - 0s - loss: 1324.7687 - 35ms/epoch - 2ms/step
Epoch 8/50
23/23 - 0s - loss: 1271.0841 - 34ms/epoch - 1ms/step
Epoch 9/50
23/23 - 0s - loss: 1213.8309 - 35ms/epoch - 2ms/step
Epoch 10/50
23/23 - 0s - loss: 1154.2468 - 35ms/epoch - 2ms/step
Epoch 11/50
23/23 - 0s - loss: 1092.3799 - 33ms/epoch - 1ms/step
Epoch 12/50
23/23 - 0s - loss: 1028.9181 - 34ms/epoch - 1ms/step
Epoch 13/50
23/23 - 0s - loss: 964.6245 - 34ms/epoch - 1ms/step
Epoch 14/50
23/23 - 0s - loss: 899.5549 - 33ms/epoch - 1ms/step
Epoch 15/50
23/23 - 0s - loss: 835.2320 - 34ms/epoch - 1ms/step
Epoch 16/50
23/23 - 0s - loss: 771.

In [39]:
# Mean of the More-Epochs Model MSE list
more_mse_mean = np.mean(more_mse_list)
print("The mean of the MSE list of the model with an increased number of epochs is: ", more_mse_mean)

# Standard deviation of the More-Epochs Model MSE list
more_mse_std = np.std(more_mse_list)
print("The standard deviation of the MSE list of the model with an increased number of epochs is: ", more_mse_std)

The mean of the MSE list of the model with an increased number of epochs is:  34.036314186609616
The standard deviation of the MSE list of the model with an increased number of epochs is:  23.01714732960161


#### How does the mean of the mean squared errors compare to that from Step B?

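The mean of the MSE list of the more-epochs model is 34.04, which is about 62% of the normalized model's mean of 55.08.

The standard deviation of the MSE list of the more-epochs model is 23.02, which is a little less than half of the normalized model's standard deviation of 59.28.

Note that the run recorded above calls `fit` with `epochs=50` and uses a hidden layer of 50 nodes, so the only configuration change relative to Part B is the wider hidden layer, not the number of epochs.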

## D. Increase Number of Hidden Layers for New Neural Network

#### Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.
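
To make the architecture concrete, the sketch below spells out the forward pass such a network computes: three dense layers with ReLU, then a linear output. It uses only the standard library. `relu`, `dense`, `forward`, and `constant_layer` are hypothetical helpers, and the weights are arbitrary constants chosen for easy checking, not trained values.

```python
def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of unit j."""
    return [sum(w * x for w, x in zip(unit_w, inputs)) + b
            for unit_w, b in zip(weights, biases)]


def forward(inputs, hidden_layers, output_layer):
    """Apply each hidden layer with ReLU, then a linear output layer."""
    activations = inputs
    for weights, biases in hidden_layers:
        activations = relu(dense(activations, weights, biases))
    out_w, out_b = output_layer
    return dense(activations, out_w, out_b)[0]


def constant_layer(n_inputs, n_units, w, b=0.0):
    """Hypothetical helper: a layer whose weights all equal w."""
    return [[w] * n_inputs for _ in range(n_units)], [b] * n_units


# 8 predictors -> 10 -> 10 -> 10 -> 1, with arbitrary constant weights.
hidden = [constant_layer(8, 10, 0.1),
          constant_layer(10, 10, 0.1),
          constant_layer(10, 10, 0.1)]
output = constant_layer(10, 1, 0.1)

x = [1.0] * 8  # one already-normalized sample
print(forward(x, hidden, output))
```

Each extra hidden layer adds another round of weighted sums and ReLU clipping, which lets the network represent more complicated relationships at the cost of more parameters to fit.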

In [43]:
# A function that defines the regression model so that we can conveniently call it to create the model.

def build_deeper_model():
    # create model: three hidden layers of 10 ReLU units each, then one linear output unit
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols_norm,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

In [44]:
# Call build_deeper_model() to build the model.
# The function has its own name so that the model object does not overwrite it.
deeper_model = build_deeper_model()

In [45]:
# Iterating MSE calculation to build up the list

deeper_mse_list = []

for i in range(list_length):
    X_train, X_test, y_train, y_test = train_test_split(X_norm, y_norm, test_size=0.3, random_state=4)
    deeper_model.fit(X_train, y_train, epochs=50, verbose=2)
    deeper_pred = deeper_model.predict(X_test)
    deeper_mse = mean_squared_error(y_test, deeper_pred)
    
    deeper_mse_list.append(deeper_mse)
    
print(deeper_mse_list)

Epoch 1/50
23/23 - 1s - loss: 1576.7892 - 581ms/epoch - 25ms/step
Epoch 2/50
23/23 - 0s - loss: 1551.4052 - 41ms/epoch - 2ms/step
Epoch 3/50
23/23 - 0s - loss: 1520.4249 - 43ms/epoch - 2ms/step
Epoch 4/50
23/23 - 0s - loss: 1475.1348 - 39ms/epoch - 2ms/step
Epoch 5/50
23/23 - 0s - loss: 1408.8359 - 39ms/epoch - 2ms/step
Epoch 6/50
23/23 - 0s - loss: 1309.8101 - 40ms/epoch - 2ms/step
Epoch 7/50
23/23 - 0s - loss: 1174.4263 - 38ms/epoch - 2ms/step
Epoch 8/50
23/23 - 0s - loss: 995.0168 - 38ms/epoch - 2ms/step
Epoch 9/50
23/23 - 0s - loss: 766.8129 - 38ms/epoch - 2ms/step
Epoch 10/50
23/23 - 0s - loss: 522.9305 - 40ms/epoch - 2ms/step
Epoch 11/50
23/23 - 0s - loss: 325.4915 - 40ms/epoch - 2ms/step
Epoch 12/50
23/23 - 0s - loss: 228.2847 - 39ms/epoch - 2ms/step
Epoch 13/50
23/23 - 0s - loss: 198.9483 - 45ms/epoch - 2ms/step
Epoch 14/50
23/23 - 0s - loss: 191.3284 - 40ms/epoch - 2ms/step
Epoch 15/50
23/23 - 0s - loss: 186.6144 - 39ms/epoch - 2ms/step
Epoch 16/50
23/23 - 0s - loss: 182.6807 

In [46]:
# Mean of the Deeper Model MSE list
deeper_mse_mean = np.mean(deeper_mse_list)
print("The mean of the deeper model's MSE list is: ", deeper_mse_mean)

# Standard deviation of the Deeper Model MSE list
deeper_mse_std = np.std(deeper_mse_list)
print("The standard deviation of the deeper model's MSE list is: ", deeper_mse_std)

The mean of the deeper model's MSE list is:  42.68747416611099
The standard deviation of the deeper model's MSE list is:  19.65497153161827


#### How does the mean of the mean squared errors compare to that from Step B?

The mean of the MSE list of the deeper model is 42.69, which is about 77% of the normalized model's mean of 55.08.

The standard deviation of the MSE list of the deeper model is 19.65, which is about a third of the normalized model's standard deviation of 59.28.
