# Final Project : Introduction to Deep Learning & Neural Networks with Keras

##### Project Description

In this project, you will build a regression model using the Keras library to model the same concrete compressive strength data that we used in lab 3.

In [1]:
# Installing Tensorflow

!pip install tensorflow

Collecting tensorboard<1.14.0,>=1.13.0 (from tensorflow)
  Downloading https://files.pythonhosted.org/packages/0f/39/bdd75b08a6fba41f098b6cb091b9e8c7a80e1b4d679a581a0ccd17b10373/tensorboard-1.13.1-py3-none-any.whl (3.2MB)
Installing collected packages: tensorboard
Successfully installed tensorboard-1.13.1


In [2]:
# And installing keras

!pip install keras



In [3]:
# And making sure that I have what I need...

import keras

Using TensorFlow backend.


In [4]:
# Importing the packages I need to make this run....

import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

from keras.models import Sequential
from keras.layers import Dense

In [5]:
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|--|--------|--------------------|---------|-------|------------------|------------------|----------------|-----|----------|
|0|540.0|0.0|0.0|162.0|2.5|1040.0|676.0|28|79.99|
|1|540.0|0.0|0.0|162.0|2.5|1055.0|676.0|28|61.89|
|2|332.5|142.5|0.0|228.0|0.0|932.0|594.0|270|40.27|
|3|332.5|142.5|0.0|228.0|0.0|932.0|594.0|365|41.05|
|4|198.6|132.4|0.0|192.0|0.0|978.4|825.5|360|44.3|


# Part A : Building a baseline model

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- Use the adam optimizer and the mean squared error as the loss function.

Perform the following steps:

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the ```train_test_split``` helper function from Scikit-learn.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
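
Step 1 can be pictured as shuffling the row indices and cutting the shuffled order at 70%. The sketch below is a NumPy-only illustration of that idea, not Scikit-learn's implementation; the function name `simple_split` and its `seed` argument are hypothetical and exist only for this example.

```python
import numpy as np

def simple_split(X, y, test_size=0.3, seed=None):
    """Hypothetical helper: shuffle row indices and hold out a fraction for testing."""
    rng = np.random.default_rng(seed)
    n_rows = len(X)
    shuffled = rng.permutation(n_rows)          # random order of row indices
    n_test = int(round(test_size * n_rows))     # number of held-out rows
    test_idx, train_idx = shuffled[:n_test], shuffled[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data: 10 rows, 2 features; row i holds the values [2i, 2i + 1]
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

X_tr, X_te, y_tr, y_te = simple_split(X_demo, y_demo, test_size=0.3, seed=0)
print(X_tr.shape, X_te.shape)
```

Because a fresh shuffle is drawn on every call (when no seed is fixed), repeating the split 50 times gives 50 different test sets, which is what makes the spread of the MSE values informative.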



In [6]:
concrete_data.columns.tolist()

['Cement',
 'Blast Furnace Slag',
 'Fly Ash',
 'Water',
 'Superplasticizer',
 'Coarse Aggregate',
 'Fine Aggregate',
 'Age',
 'Strength']

In [7]:
# Split the data into feature and response

X = concrete_data[['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer', 
                   'Coarse Aggregate', 'Fine Aggregate', 'Age']]
y = concrete_data['Strength']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = .3)



In [8]:
# Determining the input shape for our model

n_cols = X_train.shape[1]

In [9]:
# Building our initial model

def regression_model_base():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
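
For intuition, a network with one hidden ReLU layer and a linear output computes `relu(X @ W1 + b1) @ W2 + b2`. The NumPy sketch below reproduces that computation with randomly drawn weights. The names `W1`, `b1`, `W2`, `b2` and the weight values are illustrative assumptions, not the weights Keras would learn.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden = 8, 10   # 8 input columns, 10 hidden nodes

# Illustrative random weights; a trained model would have learned values here
W1 = rng.normal(size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = np.zeros(1)

def forward(X):
    """Forward pass: hidden ReLU layer followed by a single linear output."""
    hidden = np.maximum(0.0, X @ W1 + b1)   # ReLU activation
    return hidden @ W2 + b2                 # no activation on the output

X_batch = rng.normal(size=(5, n_features))
y_hat = forward(X_batch)
print(y_hat.shape)
```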

In [10]:
model = regression_model_base()

Instructions for updating:
Colocations handled automatically by placer.


In [11]:
model.fit(X_train, y_train, validation_split=0.3, epochs=50, verbose=False)

Instructions for updating:
Use tf.cast instead.


<keras.callbacks.History at 0x7f679e933c50>

In [12]:
predictions = model.predict(X_test)

MSE = mean_squared_error(y_test, predictions)

print('The MSE for my initial model is {}'.format(MSE))

The MSE for my initial model is 162.87771191772387


## Running this 50 times
The simplest way to do this is with a for loop that runs 50 times. On each pass the loop will:
1. resplit the data,
2. create and fit a new model,
3. make the predictions, and
4. compute the MSE, saving the computed value to a list.
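
The same repeat-and-collect pattern can be factored into a small helper. The sketch below is one possible shape for it, assuming a hypothetical `run_once` callable that performs a single split, fit, predict and score cycle and returns one number. Here `fake_run` is only a stand-in that returns a noisy constant so the example runs without Keras.

```python
import numpy as np

def repeat_experiment(run_once, n_repeats=50):
    """Call run_once() n_repeats times and summarize the returned scores."""
    scores = [run_once() for _ in range(n_repeats)]
    return np.mean(scores), np.std(scores), scores

# Stand-in for "split, fit, predict, compute MSE": a noisy constant score
rng = np.random.default_rng(42)

def fake_run():
    return 1.0 + rng.normal(scale=0.1)

mean_score, std_score, all_scores = repeat_experiment(fake_run, n_repeats=50)
print(len(all_scores), mean_score, std_score)
```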

In [13]:
# creating a list to store the computed MSE values
MSE_list = []

for i in range(50):
    if i%10 == 0:
        print(i)
    
    # Do a train_test_split of the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = .3)

    # Create a model
    model = regression_model_base()
    # And fit it on my current training data, setting verbose = False to avoid 50*50 = 2500 lines of feedback
    model.fit(X_train, y_train, validation_split=0.3, epochs=50, verbose=False)
    
    # Making predictions on my held out data
    predictions = model.predict(X_test)

    # And computing the MSE
    MSE = mean_squared_error(y_test, predictions)
    
    # And appending it to my list of MSE values so far
    MSE_list.append(MSE)

0
10
20
30
40


In [14]:
base_mean = np.mean(MSE_list)
base_std = np.std(MSE_list)

In [15]:
print('The mean of the MSE values for models of this type is {}'.format(np.around(base_mean, 4)))
print('The standard deviation of the MSE values for models of this type is {}'.format(np.around(base_std, 4)))

The mean of the MSE values for models of this type is 477.0868
The standard deviation of the MSE values for models of this type is 439.1986
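
One detail worth knowing when reporting the spread: `np.std` divides by `n` by default (`ddof=0`). If the 50 runs are treated as a sample from a larger population of possible runs, `ddof=1` divides by `n - 1` instead. The sketch below uses made-up MSE values purely to show the difference.

```python
import numpy as np

mse_values = np.array([0.25, 0.31, 0.28, 0.35, 0.27])  # made-up numbers for illustration

population_std = np.std(mse_values)          # default ddof=0, divides by n
sample_std = np.std(mse_values, ddof=1)      # divides by n - 1

print(population_std, sample_std)
```

With 50 values the two differ by a factor of `sqrt(50 / 49)`, about 1%, so the choice does not change any conclusion in this notebook.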


# Part B : Using a normalized version of the data

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- Use the adam optimizer and the mean squared error as the loss function.

Perform the following steps:

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the ```train_test_split``` helper function from Scikit-learn.

2. Normalize your data.

3. Train the model on the training data using 50 epochs.

4. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

5. Repeat steps 1 - 4, 50 times, i.e., create a list of 50 mean squared errors.

6. Report the mean and the standard deviation of the mean squared errors.


#### Observation:

Here we can use essentially the same set up as above, with the additional step that we need to normalize our data *using the mean and standard deviation of our __training data__* before fitting our model or making predictions.

In [16]:
def train_test_normalized(X, y, percent_test = 0.3):
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = percent_test)
    
    # Computing mean and standard deviations for the training data
    
    X_train_mean = np.mean(X_train, axis = 0)
    X_train_std = np.std(X_train, axis = 0)
    
    y_train_mean = np.mean(y_train)
    y_train_std = np.std(y_train)
    
    # Normalizing the training data
    
    X_train_norm = (X_train - X_train_mean)/X_train_std
    y_train_norm = (y_train - y_train_mean)/y_train_std
    
    # Normalizing the test data (using the mean and standard deviation from the training data!!!)
    
    X_test_norm = (X_test - X_train_mean)/X_train_std
    y_test_norm = (y_test - y_train_mean)/y_train_std
    
    return X_train_norm, X_test_norm, y_train_norm, y_test_norm

Now, we should be able to use the model set up that we created in part A, but I'll go ahead and pull it down here to be safe.

In [17]:
# Building our second model

def regression_model_norm():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

In [18]:
# Setting up the model

model = regression_model_norm()

In [19]:
# Obtaining a normalized train-test split of our data

X_train, X_test, y_train, y_test = train_test_normalized(X, y)

In [20]:
# Fitting the model

model.fit(X_train, y_train, validation_split=0.3, epochs=50, verbose=False)

<keras.callbacks.History at 0x7f63ac55eb70>

In [21]:
# Checking our predictions

predictions = model.predict(X_test)

MSE = mean_squared_error(y_test, predictions)

print('The MSE for my second model is {}'.format(MSE))

The MSE for my second model is 0.3657333248752294


### Running this 50 times

Again, the simplest way to do this is with a 50 cycle for loop, saving the calculated MSE to a list at each pass. This time our loop will:
1. Call ```train_test_normalized``` to create a normalized split of our data,
2. Call ```regression_model_norm``` to create a neural network model,
3. Fit the model on our normalized training data,
4. Make predictions on our test data, and
5. Compute the MSE, saving the computed value to a list at each step

Following this, we can compute the mean and standard deviation of the computed MSE values.

In [22]:
normed_MSE_list = []

for i in range(50):
    if i%10 == 0:
        print(i)
    
    # Do a normalized train_test_split of the data
    X_train, X_test, y_train, y_test = train_test_normalized(X, y)

    # Create a model
    model = regression_model_norm()
    # And fit it on my current training data, setting verbose = False to avoid 50*50 = 2500 lines of feedback
    model.fit(X_train, y_train, validation_split=0.3, epochs=50, verbose=False)
    
    # Making predictions on my held out data
    predictions = model.predict(X_test)

    # And computing the MSE
    MSE = mean_squared_error(y_test, predictions)
    
    # And appending it to my list of MSE values so far
    normed_MSE_list.append(MSE)

0
10
20
30
40


In [23]:
# Using separate names so the Part A results in base_mean and base_std are not overwritten
normed_mean = np.mean(normed_MSE_list)
normed_std = np.std(normed_MSE_list)

In [24]:
print('The mean of the MSE values for models of type B is {}'.format(np.around(normed_mean, 4)))
print('The standard deviation of the MSE values for models of type B is {}'.format(np.around(normed_std, 4)))

The mean of the MSE values for models of type B is 0.2932
The standard deviation of the MSE values for models of type B is 0.0478


## How does the mean of the mean squared errors compare to that from Step A?

We have the following data:

| | Part A | Part B |
|--|--------|-------|
|mean of MSE|477.0868|0.2932|
|stdev of MSE| 439.1986| 0.0478|

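From this we see that both the mean and the standard deviation of the mean squared errors are far smaller for Part B than for Part A. The two columns are not directly comparable, however: in Part B the ```Strength``` target was standardized along with the features, so the Part B errors are measured in units of the training-target variance rather than in squared units of strength. Multiplying a Part B MSE by the square of the training standard deviation of ```Strength``` converts it back to the original scale. Two observations survive the change of units. First, a model that always predicts the training mean would score close to 1 on the normalized scale, so a mean MSE of 0.2932 is a real improvement over that trivial baseline. Second, the spread relative to the mean drops from roughly 0.92 in Part A to roughly 0.16 in Part B, which suggests that normalizing the inputs also made training far more consistent from run to run.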
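
To compare errors from a standardized target with errors on the original scale, the normalized MSE can be multiplied by the squared standard deviation used for normalization. The sketch below checks that identity on synthetic numbers; the data, and the names `y_true`, `y_pred`, `y_mean` and `y_std`, are made up for illustration and are not the concrete dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
y_true = rng.normal(loc=35.0, scale=16.0, size=200)   # synthetic target values
y_pred = y_true + rng.normal(scale=5.0, size=200)     # synthetic predictions

# Stand-ins for the training mean and standard deviation of the target
y_mean, y_std = y_true.mean(), y_true.std()

# MSE on the original scale
mse_original = np.mean((y_true - y_pred) ** 2)

# MSE after standardizing both targets and predictions with the same statistics
mse_normalized = np.mean(((y_true - y_mean) / y_std - (y_pred - y_mean) / y_std) ** 2)

# Rescaling the normalized MSE recovers the original-scale MSE
print(np.isclose(mse_normalized * y_std ** 2, mse_original))
```

In this notebook, ```train_test_normalized``` does not return the training standard deviation of the target, so it would need to return that value as well before the Part B errors could be converted back.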

# Part C : Increasing the number of epochs to 100

**Note that this is identical to part B but with a longer fitting stage**

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- Use the adam optimizer and the mean squared error as the loss function.

Perform the following steps:

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the ```train_test_split``` helper function from Scikit-learn.

2. Normalize your data.

3. Train the model on the training data using 100 epochs.

4. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

5. Repeat steps 1 - 4, 50 times, i.e., create a list of 50 mean squared errors.

6. Report the mean and the standard deviation of the mean squared errors.


#### Observation:

Here we can reuse the setup from Part B unchanged, again normalizing our data *using the mean and standard deviation of our __training data__* before fitting our model or making predictions. The only difference is that each model is fitted for 100 epochs instead of 50.

In [25]:
# Building our third model

def regression_model_epochs():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

In [26]:
# Setting up the model

model = regression_model_epochs()

In [27]:
# Obtaining a normalized train-test split of our data

X_train, X_test, y_train, y_test = train_test_normalized(X, y)

In [28]:
# Fitting the model (for 100 epochs this time)

model.fit(X_train, y_train, validation_split=0.3, epochs=100, verbose=False)

<keras.callbacks.History at 0x7f6382811d30>

In [29]:
# Checking our predictions

predictions = model.predict(X_test)

MSE = mean_squared_error(y_test, predictions)

print('The MSE for my third model is {}'.format(MSE))

The MSE for my third model is 0.23535468624363026


### Running this 50 times

Again, the simplest way to do this is with a 50 cycle for loop, saving the calculated MSE to a list at each pass. This time our loop will:
1. Call ```train_test_normalized``` to create a normalized split of our data,
2. Call ```regression_model_epochs``` to create a neural network model,
3. Fit the model on our normalized training data for 100(!) epochs,
4. Make predictions on our test data, and
5. Compute the MSE, saving the computed value to a list at each step

Following this, we can compute the mean and standard deviation of the computed MSE values.

In [30]:
epochs_MSE_list = []

for i in range(50):
    if i%10 == 0:
        print(i)
    
    # Do a normalized train_test_split of the data
    X_train, X_test, y_train, y_test = train_test_normalized(X, y)

    # Create a model
    model = regression_model_epochs()
    # And fit it on my current training data, setting verbose = False to avoid 100*50 = 5000 lines of feedback
    model.fit(X_train, y_train, validation_split=0.3, epochs=100, verbose=False)
    
    # Making predictions on my held out data
    predictions = model.predict(X_test)

    # And computing the MSE
    MSE = mean_squared_error(y_test, predictions)
    
    # And appending it to my list of MSE values so far
    epochs_MSE_list.append(MSE)

0
10
20
30
40


In [31]:
# Using separate names so the results from earlier parts are not overwritten
epochs_mean = np.mean(epochs_MSE_list)
epochs_std = np.std(epochs_MSE_list)

In [32]:
print('The mean of the MSE values for models of type C is {}'.format(np.around(epochs_mean, 4)))
print('The standard deviation of the MSE values for models of type C is {}'.format(np.around(epochs_std, 4)))

The mean of the MSE values for models of type C is 0.2205
The standard deviation of the MSE values for models of type C is 0.0474


## How does the mean of the mean squared errors compare to that from Step B?

We have the following data:

| | Part B | Part C |
|--|--------|-------|
|mean of MSE|0.2932|0.2205|
|stdev of MSE| 0.0478|0.0474|

We see here that the mean of the mean squared errors is smaller in Part C (0.2205, with a longer training period) than in Part B (0.2932), while the standard deviation is essentially unchanged (0.0474 versus 0.0478).

# Part D : Increasing the number of hidden layers to 3

Use the Keras library to build a neural network with the following:

- Three hidden layers of 10 nodes, and a ReLU activation function

- Use the adam optimizer and the mean squared error as the loss function.

Perform the following steps:

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the ```train_test_split``` helper function from Scikit-learn.

2. Normalize your data.

3. Train the model on the training data using 50 epochs.

4. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

5. Repeat steps 1 - 4, 50 times, i.e., create a list of 50 mean squared errors.

6. Report the mean and the standard deviation of the mean squared errors.


#### Observation:

Here we can reuse the normalized split from Part B, again normalizing our data *using the mean and standard deviation of our __training data__* before fitting our model or making predictions. The only difference is that the model now has three hidden layers instead of one.

In [33]:
# Building our final model

def regression_model_deeper():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(10, activation = 'relu'))
    model.add(Dense(10, activation = 'relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
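
To see how much larger this network is than the single-hidden-layer one, the trainable parameters of a stack of fully connected layers can be counted by hand: each layer contributes `inputs * outputs` weights plus `outputs` biases. The helper name `dense_param_count` below is hypothetical, and the input width of 8 assumes the eight feature columns selected earlier.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers.

    layer_sizes lists the width of every layer, input first, output last.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

one_hidden = dense_param_count([8, 10, 1])             # one hidden layer of 10 nodes
three_hidden = dense_param_count([8, 10, 10, 10, 1])   # three hidden layers of 10 nodes
print(one_hidden, three_hidden)  # 101 321
```

The deeper model has roughly three times as many parameters, which is one reason to watch the test error for signs of overfitting.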

In [34]:
# Setting up the model

model = regression_model_deeper()

In [35]:
# Obtaining a normalized train-test split of our data

X_train, X_test, y_train, y_test = train_test_normalized(X, y)

In [36]:
# Fitting the model (back to 50 epochs this time)

model.fit(X_train, y_train, validation_split=0.3, epochs=50, verbose=False)

<keras.callbacks.History at 0x7f6378608da0>

In [37]:
# Checking our predictions

predictions = model.predict(X_test)

MSE = mean_squared_error(y_test, predictions)

print('The MSE for my fourth (and final) model is {}'.format(MSE))

The MSE for my fourth (and final) model is 0.22409158733847825


### Running this 50 times

Again, the simplest way to do this is with a 50 cycle for loop, saving the calculated MSE to a list at each pass. This time our loop will:
1. Call ```train_test_normalized``` to create a normalized split of our data,
2. Call ```regression_model_deeper``` to create a neural network model,
3. Fit the model on our normalized training data for 50 epochs,
4. Make predictions on our test data, and
5. Compute the MSE, saving the computed value to a list at each step

Following this, we can compute the mean and standard deviation of the computed MSE values.

In [38]:
deeper_MSE_list = []

for i in range(50):
    
    print(i)
    
    # Do a normalized train_test_split of the data
    X_train, X_test, y_train, y_test = train_test_normalized(X, y)

    # Create a model
    model = regression_model_deeper()
    # And fit it on my current training data, setting verbose = False to avoid 50*50 = 2500 lines of feedback
    model.fit(X_train, y_train, validation_split=0.3, epochs=50, verbose=False)
    
    # Making predictions on my held out data
    predictions = model.predict(X_test)

    # And computing the MSE
    MSE = mean_squared_error(y_test, predictions)
    
    # And appending it to my list of MSE values so far
    deeper_MSE_list.append(MSE)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49


In [39]:
# Using separate names so the results from earlier parts are not overwritten
deeper_mean = np.mean(deeper_MSE_list)
deeper_std = np.std(deeper_MSE_list)

In [40]:
print('The mean of the MSE values for models of type D is {}'.format(np.around(deeper_mean, 4)))
print('The standard deviation of the MSE values for models of type D is {}'.format(np.around(deeper_std, 4)))

The mean of the MSE values for models of type D is 0.2319
The standard deviation of the MSE values for models of type D is 0.0354


## How does the mean of the mean squared errors compare to that from Step B?

We have the following data:

| | Part B | Part C | Part D |
|--|--------|-------|--------|
|mean of MSE|0.2932|0.2205|0.2319|
|stdev of MSE| 0.0478|0.0474|0.0354|

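From this, we see that both the mean and the standard deviation of the mean squared errors in step D are smaller than those in step B. Compared with step C, the standard deviation is smaller (0.0354 versus 0.0474) but the mean is slightly larger (0.2319 versus 0.2205). In other words, at 50 epochs the deeper network clearly improves on the single-hidden-layer network and gives more consistent results, and since the test error drops relative to step B there is no sign here that the extra layers cause overfitting. These numbers do not show, however, that adding layers helps more than training for a greater number of epochs: on mean MSE the two changes perform about the same.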
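
Whether a gap between two mean MSE values is larger than the run-to-run noise can be checked roughly with Welch's t statistic, computed from the summary values in the table. This is a sketch under assumptions: the function name `welch_t` is hypothetical, each part is taken to have 50 independent runs, and the standard deviations in the table were computed with `np.std` (which divides by `n`), so the result is approximate.

```python
import numpy as np

def welch_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Welch's t statistic from summary statistics (no equal-variance assumption)."""
    standard_error = np.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

# Summary values copied from the table above; 50 repetitions per part
t_b_vs_c = welch_t(0.2932, 0.0478, 50, 0.2205, 0.0474, 50)
t_d_vs_c = welch_t(0.2319, 0.0354, 50, 0.2205, 0.0474, 50)
print(round(t_b_vs_c, 2), round(t_d_vs_c, 2))
```

A statistic many standard errors away from zero points to a difference that is hard to attribute to split-to-split noise, while a value of one or two standard errors is weak evidence either way.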