## Project Objective


In this course project, you will build a regression model using the deep learning Keras library, and then you will experiment with increasing the number of training epochs and changing number of hidden layers and you will see how changing these parameters impacts the performance of the model.

## Step-by-step instructions
<h5> 1. Assignment Topic:</h5>
In this project, you will build a regression model using the Keras library to model the same data about concrete compressive strength that we used in labs 3.
    
<h5> 2. Concrete Data: </h5>
For your convenience, the data can be found here again: https://cocl.us/concrete_data. To recap, the predictors in the data of concrete strength include:

<li>Cement</li>
<li>Blast Furnace Slag</li>
<li>Fly Ash</li>
<li>Water</li>
<li>Superplasticizer</li>
<li>Coarse Aggregate</li>
<li>Fine Aggregate    </li>
    
<h5> 3. Build a Neural Network </h5>
<h5> 4. Train and Test the Network. </h5>     


## Important information
How to submit:

You will need to submit your code for each part in a Jupyter Notebook. 
Since each part builds on the previous one, you can submit the same notebook four times for grading. 
Please make sure that you:
<ul><li>use Markdown to clearly label your code for each part,</li>
<li>properly comment your code so that your peer who is grading your work is able to understand your code easily,</li>
<li>include your comments and discussion of the difference in the mean of the mean squared errors among the different parts.</li>

## Import all necessary packages

Let's start by importing the necessary libraries. Make sure these have been installed before-hand. Keras normally runs on top of a low-level library such as TensorFlow. This means that to be able to use the Keras library, you will have to install TensorFlow first and when you import the Keras library, it will be explicitly displayed what backend was used to install the Keras library.


In [1]:
import pandas as pd #we will require this for putting data into nice dataframes
import numpy as np 
import keras
import sklearn
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

Using TensorFlow backend.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
  LARGE_SPARSE_SUPPORTED = LooseVersion(scipy_version) >= '0.14.0'


**Note to the Reviewer**:

With this code I satisfy the following criteria
1. The learner used the Keras library to build the regression model.

## Download and read data


The dataset is about the compressive strength of different samples of concrete based on the volumes of the different ingredients that were used to make them. Ingredients include:

<strong>1. Cement</strong>

<strong>2. Blast Furnace Slag</strong>

<strong>3. Fly Ash</strong>

<strong>4. Water</strong>

<strong>5. Superplasticizer</strong>

<strong>6. Coarse Aggregate</strong>

<strong>7. Fine Aggregate</strong>


Let's download the data and read it into a <em>pandas</em> dataframe.


In [2]:
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


So the first concrete sample has 540 cubic meter of cement, 0 cubic meter of blast furnace slag, 0 cubic meter of fly ash, 162 cubic meter of water, 2.5 cubic meter of superplaticizer, 1040 cubic meter of coarse aggregate, 676 cubic meter of fine aggregate. Such a concrete mix which is 28 days old, has a compressive strength of 79.99 MPa. 


#### Let's check how many data points we have.


In [3]:
concrete_data.shape

(1030, 9)

So, there are approximately 1000 samples to train our model on. Because of the few samples, we have to be careful not to overfit the training data.


## Data wrangling

Let us start with the data wrangling...

Let's check the dataset for any missing values.


In [4]:
concrete_data.describe()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


In [5]:
concrete_data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

The data looks very clean and is ready to be used to build our model.


## Data splitting and data wrangling continued

Let us split the data into predictors and targets (and into test and validation groups later on)

#### Split data into predictors and target


The target variable in this problem is the concrete sample strength. Therefore, our predictors will be all the other columns.


In [6]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

<a id="item2"></a>


Let's do a quick sanity check of the predictors and the target dataframes.


In [7]:
predictors.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360


In [8]:
target.head()

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64

## Data wrangling continued (data normalization)

Finally, the last step is to normalize the data by substracting the mean and dividing by the standard deviation. You need normalized data for many statistical methods to be applicable.


**Note to the Reviewer**:

With this code I satisfy the following criteria:

4. **The data was normalized** and 30% of the data was held out for testing.

In [9]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069


## Splitting into test and validation sets

In [10]:
X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3, random_state=42) #random_state is set for control purposes, in real application it should not be set
n_cols = X_train.shape[1] # number of predictors
print(n_cols)

8


Let's save the number of predictors to _n_cols_ since we will need this number when building our network.

## Build a Neural Network


Let's define a function that defines our regression model for us so that we can conveniently call it to create our model.


**Note to the Reviewer**:

With this code I satisfy the following criteria
1. The learner used the Keras library to build the regression model.
2. A model consisting of three hidden layers was built, with 10 nodes in each hidden layer and are activated using the ReLU activation function.
3. The correct optimizer and loss function are correctly used as per the instructions.


In [11]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

The above function create a model that has three hidden layers, with 10 nodes.


<a id="item4"></a>


<a id='item34'></a>


## Train and Test the Network


Let's call the function now to create our model.


In [12]:
# build the model
model = regression_model()







Next, we will train the model. (We will NOT train and test the model at the same time using the _fit_ method where will leave out 30% of the data for validation) 
<br/>We will train the model for 100 epochs.

In [13]:
# fit the model
#model.fit(predictors_norm, target, validation_split=0.3, epochs=100, verbose=2) #This is the in-built splitting function. But I have already split the dataset into training and testing set.
model.fit(X_train, y_train, epochs=100, verbose=2)



Epoch 1/100


2022-03-21 12:01:56.406877: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2022-03-21 12:01:56.417066: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2593910000 Hz
2022-03-21 12:01:56.417724: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56503c1fc7d0 executing computations on platform Host. Devices:
2022-03-21 12:01:56.417765: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>


 - 0s - loss: 1567.4282
Epoch 2/100
 - 0s - loss: 1542.3757
Epoch 3/100
 - 0s - loss: 1506.6626
Epoch 4/100
 - 0s - loss: 1449.1003
Epoch 5/100
 - 0s - loss: 1354.3506
Epoch 6/100
 - 0s - loss: 1199.2958
Epoch 7/100
 - 0s - loss: 947.5507
Epoch 8/100
 - 0s - loss: 632.8955
Epoch 9/100
 - 0s - loss: 378.6423
Epoch 10/100
 - 0s - loss: 290.4252
Epoch 11/100
 - 0s - loss: 258.1388
Epoch 12/100
 - 0s - loss: 238.3941
Epoch 13/100
 - 0s - loss: 221.0249
Epoch 14/100
 - 0s - loss: 208.3653
Epoch 15/100
 - 0s - loss: 197.8419
Epoch 16/100
 - 0s - loss: 188.4095
Epoch 17/100
 - 0s - loss: 180.3679
Epoch 18/100
 - 0s - loss: 172.8843
Epoch 19/100
 - 0s - loss: 166.6711
Epoch 20/100
 - 0s - loss: 160.9010
Epoch 21/100
 - 0s - loss: 155.5900
Epoch 22/100
 - 0s - loss: 150.7618
Epoch 23/100
 - 0s - loss: 146.5687
Epoch 24/100
 - 0s - loss: 141.3136
Epoch 25/100
 - 0s - loss: 136.9516
Epoch 26/100
 - 0s - loss: 133.3013
Epoch 27/100
 - 0s - loss: 128.7599
Epoch 28/100
 - 0s - loss: 126.0469
Epoch 2

<keras.callbacks.History at 0x7f9887329b10>

## Test/Evaluate the Network


In [14]:
# evaluate the model using the Keras functions can be done like this, will also show you the error:
#scores = model.evaluate(X_test, y_test, verbose=0)
#print(scores)

#However, we will do it using scikitlearns's mean_squared_error function, like this:

# create predictions using our model
y_pred = model.predict(X_test)
#print(y_pred)

#compare our models predictions with the actual y values
print ("Mean Squared Error equals")
mean_squared_error(y_test, y_pred)

Mean Squared Error equals


64.52335831613927

## Multiple Runs

**Note to the Reviewer**:

With this code I satisfy the following criteria:

4. The data was normalized **and 30% of the data was held out for testing.**
5. The model was trained using 50 epochs.
6. The whole process of splitting the data into training and test sets, training the model, and evaluating it on the test data was repeated 50 times.

In [15]:
#Now we will run this process multiple times, to ensrue that our one-time run is not just a statistical fluke

#we will just store the results in a list
results_table_mean_squared_error = []


for i in range(50):
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)
    n_cols = X_train.shape[1] # number of predictors
    #print(n_cols)
    model.fit(X_train, y_train, epochs=50, verbose=2)
    y_pred = model.predict(X_test)
    error_of_this_run = mean_squared_error(y_test, y_pred)
    results_table_mean_squared_error.append(error_of_this_run)
    

Epoch 1/50
 - 0s - loss: 58.6789
Epoch 2/50
 - 0s - loss: 58.0098
Epoch 3/50
 - 0s - loss: 57.6559
Epoch 4/50
 - 0s - loss: 56.4561
Epoch 5/50
 - 0s - loss: 56.2323
Epoch 6/50
 - 0s - loss: 56.0975
Epoch 7/50
 - 0s - loss: 55.8749
Epoch 8/50
 - 0s - loss: 55.4414
Epoch 9/50
 - 0s - loss: 55.2201
Epoch 10/50
 - 0s - loss: 55.6070
Epoch 11/50
 - 0s - loss: 55.5707
Epoch 12/50
 - 0s - loss: 54.5958
Epoch 13/50
 - 0s - loss: 53.9320
Epoch 14/50
 - 0s - loss: 53.6169
Epoch 15/50
 - 0s - loss: 53.3585
Epoch 16/50
 - 0s - loss: 53.4352
Epoch 17/50
 - 0s - loss: 53.5658
Epoch 18/50
 - 0s - loss: 53.1117
Epoch 19/50
 - 0s - loss: 52.5499
Epoch 20/50
 - 0s - loss: 52.4642
Epoch 21/50
 - 0s - loss: 52.3760
Epoch 22/50
 - 0s - loss: 51.7737
Epoch 23/50
 - 0s - loss: 51.6264
Epoch 24/50
 - 0s - loss: 51.8014
Epoch 25/50
 - 0s - loss: 51.0308
Epoch 26/50
 - 0s - loss: 50.6904
Epoch 27/50
 - 0s - loss: 50.9199
Epoch 28/50
 - 0s - loss: 50.1742
Epoch 29/50
 - 0s - loss: 50.0221
Epoch 30/50
 - 0s - los

## Evaluation after multiple runs

**Note to the Reviewer**:

With this code I satisfy the following criteria:

7. A discussion of the average mean squared error and how it compares with part B is included.

In [16]:
#What does our error table look like?
#print(results_table_mean_squared_error)

#Mean error
print("The average of the mean squared error of this new model, with normalized data is:")
print(np.mean(results_table_mean_squared_error))

#SD error
print("The standard deviation of the mean squared error of this new model, with normalized data is:")
print(np.std(results_table_mean_squared_error))

print("Compared to Part B, which was only trained with one hidden layer, we see slights improvements in the mean squared error, i.e. a lower values SD of the mean squared error; but similar value for the average of the mean squared error.") 

The average of the mean squared error of this new model, with normalized data is:
26.619568300079923
The standard deviation of the mean squared error of this new model, with normalized data is:
6.1038503816597025
Compared to Part B, which was only trained with one hidden layer, we see slights improvements in the mean squared error, i.e. a lower values SD of the mean squared error; but similar value for the average of the mean squared error.
