<h1 align=center><font size = 5>Regression Models with Keras</font></h1>



<h2>Introduction</h2>

In this lab, we will use the Keras library to build models for regression problems.
The dataset describes the compressive strength of different samples of concrete based on the volumes of the ingredients that were used to make them. We will use the [concrete dataset](https://cocl.us/concrete_data). Ingredients include:

- Cement

- Blast Furnace Slag

- Fly Ash

- Water

- Superplasticizer

- Coarse Aggregate

- Fine Aggregate

Let's first load the dataset using the pandas library.

In [1]:
import pandas as pd

In [2]:
data = pd.read_csv('concrete_data.csv')
data.head()

|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1030 entries, 0 to 1029
Data columns (total 9 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Cement              1030 non-null   float64
 1   Blast Furnace Slag  1030 non-null   float64
 2   Fly Ash             1030 non-null   float64
 3   Water               1030 non-null   float64
 4   Superplasticizer    1030 non-null   float64
 5   Coarse Aggregate    1030 non-null   float64
 6   Fine Aggregate      1030 non-null   float64
 7   Age                 1030 non-null   int64  
 8   Strength            1030 non-null   float64
dtypes: float64(8), int64(1)
memory usage: 72.5 KB


The data looks clean and has no missing values, so it is ready to be used to build our model.

Define the features and the label

In [4]:
X = data.drop('Strength', axis=1)
y = data[['Strength']]

The number of features

In [5]:
n_features = X.shape[1]
print('The number of features is '+str(n_features))

The number of features is 8


<h2>Import Keras and Packages</h2>

Let's start by importing the Keras library and the other packages that we need to build a neural network.


In [6]:
import keras
from keras.models import Sequential
from keras.layers import Dense

import numpy as np

from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1.  <a href="https://#item31">Part A. Build a baseline model</a>
2.  <a href="https://#item32">Part B. Normalize the data</a>
3.  <a href="https://#item33">Part C. Increase the number of epochs</a>
4.  <a href="https://#item34">Part D. Increase the number of hidden layers</a>

</font>
</div>

## Part A. Build a baseline model 

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- Use the Adam optimizer and the mean squared error as the loss function.

In [7]:
def one_hidden_layer(n_features=8):
  model = Sequential()
  model.add(Dense(10, activation='relu', input_shape=(n_features,)))
  model.add(Dense(1))

  model.compile(optimizer='adam', loss='mean_squared_error')

  return model

model = one_hidden_layer(n_features=8)

1. Split the data into training and test sets by holding out 30% of the data for testing

In [8]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123)

2. Train the model on the training data using 50 epochs and predict the result

In [9]:
model.fit(X_train, y_train, epochs=50, verbose=0)

y_pred = model.predict(X_test)



3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. 
Here we use the `mean_squared_error` function from the scikit-learn library.

In [10]:
mse = mean_squared_error(y_test, y_pred)

print(f'The mean squared error is {mse}')

The mean squared error is 456.54489462428
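
For reference, the quantity that `mean_squared_error` returns is the average of the squared differences between the actual and the predicted values. The sketch below computes it by hand in plain Python; the helper name `mse_by_hand` and the toy numbers are illustrative assumptions, not part of the lab.

```python
def mse_by_hand(y_true, y_pred):
    """Average of squared differences between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Toy values (not taken from the concrete dataset)
actual = [40.0, 50.0, 60.0]
predicted = [42.0, 47.0, 60.0]
print(mse_by_hand(actual, predicted))  # (4 + 9 + 0) / 3
```

Because the errors are squared, the result is expressed in squared units of the target, so a few large misses can dominate the average.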


4. Repeat steps 1 - 3, 50 times (i.e., create a list of 50 mean squared errors).
We create a function that combines the three steps.

In [11]:
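def combine_steps(data, target, model=None, iterations=50, epochs=50, verbose=0):
  # Build the default model at call time. A default of one_hidden_layer(...) in the
  # signature would be created once, at definition time, and shared across calls.
  if model is None:
    model = one_hidden_layer(n_features=data.shape[1])

  MSE = []

  for i in range(iterations):
    # Note: the fixed random_state gives the same split on every iteration, and the
    # same model instance keeps training, so each MSE reflects `epochs` more epochs.
    X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.3, random_state=123)
    model.fit(X_train, y_train, epochs=epochs, verbose=verbose)
    y_pred = model.predict(X_test)
    MSE.append(mean_squared_error(y_test, y_pred))
    # Running statistics over the MSE values collected so far
    mean = np.mean(MSE)
    std = np.std(MSE)
    
    print(f'     The mean of MSE: {mean}')
    print(f'     The std  of MSE: {std}')
    print(f'             The MSE: {MSE[i]}')
    print(f'\n---------------------- Iteration {i+1} ----------------------\n')

  return MSE, mean, std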
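
The instruction to repeat steps 1 to 3 can also be read as drawing a new split and training a new model on every repetition, whereas `combine_steps` receives a single model instance and uses a fixed `random_state`. The following is a minimal, dependency-free sketch of that alternative reading, not the function used in this notebook. `repeat_evaluation`, `MeanRegressor`, and the toy data are hypothetical names introduced for illustration; the factory argument may return any object that exposes `fit` and `predict`.

```python
import random
import statistics


class MeanRegressor:
    """Hypothetical stand-in model: always predicts the training-target mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def repeat_evaluation(build_model, X, y, n_repeats=50, test_fraction=0.3, seed=0):
    """Return one test MSE per repetition, each from a new split and a new model."""
    rng = random.Random(seed)
    n_test = round(len(X) * test_fraction)
    mse_values = []
    for _ in range(n_repeats):
        indices = list(range(len(X)))
        rng.shuffle(indices)  # a different split on every repetition
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = build_model()  # fresh, untrained model
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = model.predict([X[i] for i in test_idx])
        errors = [(y[i] - p) ** 2 for i, p in zip(test_idx, y_pred)]
        mse_values.append(sum(errors) / len(errors))
    return mse_values


# Toy data: 20 samples with a single feature
X_toy = [[float(i)] for i in range(20)]
y_toy = [2.0 * i for i in range(20)]
scores = repeat_evaluation(MeanRegressor, X_toy, y_toy, n_repeats=5)
print(statistics.mean(scores), statistics.pstdev(scores))
```

Passing a factory rather than a model instance is the design choice that matters here: it guarantees that no weights carry over from one repetition to the next, so the spread of the scores reflects the split and the initialization rather than accumulated training.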

In [12]:
MSE_A, mean_A, std_A = combine_steps(X, y, model=one_hidden_layer(n_features=8), iterations=50, epochs=50, verbose=0)

     The mean of MSE: 172.75251294925647
     The std  of MSE: 0.0
             The MSE: 172.75251294925647

---------------------- Iteration 1 ----------------------

     The mean of MSE: 142.7935647833551
     The std  of MSE: 29.958948165901383
             The MSE: 112.8346166174537

---------------------- Iteration 2 ----------------------

     The mean of MSE: 130.83817548127558
     The std  of MSE: 29.735865833479025
             The MSE: 106.92739687711654

---------------------- Iteration 3 ----------------------

     The mean of MSE: 126.5726946850177
     The std  of MSE: 26.790842967695763
             The MSE: 113.77625229624401

---------------------- Iteration 4 ----------------------

     The mean of MSE: 122.32489498612904
     The std  of MSE: 25.423898651923952
             The MSE: 105.33369619057447

---------------------- Iteration 5 ----------------------

     The mean of MSE: 120.09377090459623
     The std  of MSE: 23.738894081697698
             The MSE:

## Part B. Normalize the data
Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

Let's normalize the data by subtracting the mean and dividing by the standard deviation.

In [13]:

X_normalized = (X - X.mean()) / X.std()
X_normalized.describe()


|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| count | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 |
| mean | -4.139084e-16 | -1.793603e-16 | 0.0 | -1.379695e-16 | -1.931572e-16 | 7.243397e-16 | -4.759946e-16 | 4.139084e-17 |
| std | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| min | -1.714421 | -0.8564718 | -0.846733 | -2.798851 | -1.038638 | -2.211064 | -2.239829 | -0.707016 |
| 25% | -0.8496407 | -0.8564718 | -0.846733 | -0.7805147 | -1.038638 | -0.5262618 | -0.5317114 | -0.612034 |
| 50% | -0.0791135 | -0.6014861 | -0.846733 | 0.1607513 | 0.0326992 | -0.06326279 | 0.07383152 | -0.2795973 |
| 75% | 0.6586406 | 0.8003558 | 1.001791 | 0.4885554 | 0.6688058 | 0.7264077 | 0.6288606 | 0.1636517 |
| max | 2.476712 | 3.309068 | 2.279976 | 3.064159 | 4.351528 | 2.213149 | 2.731735 | 5.055221 |
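
The expression `(X - X.mean()) / X.std()` standardizes each column separately. As a sketch of the same arithmetic on a single column in plain Python, the helper `zscore` below (a hypothetical name) uses `statistics.stdev`, which is the sample standard deviation, the same quantity that pandas' `std()` computes by default.

```python
import statistics


def zscore(values):
    """Standardize a list of numbers using the sample standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # n - 1 denominator, like pandas' default
    return [(v - mean) / std for v in values]


# First five Cement values shown in the data preview
cement = [540.0, 540.0, 332.5, 332.5, 198.6]
cement_z = zscore(cement)

# After standardizing, the mean is numerically zero and the sample std is one
print(abs(statistics.mean(cement_z)) < 1e-9)
print(abs(statistics.stdev(cement_z) - 1.0) < 1e-9)
```

This is why the `mean` row of the summary shows values on the order of 1e-16 rather than exact zeros: they are floating-point rounding residue.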


In [14]:
# Combine the steps
MSE_B, mean_B, std_B = combine_steps(X_normalized, y, model=one_hidden_layer(n_features=8), iterations=50, epochs=50, verbose=0)

     The mean of MSE: 303.6399840120383
     The std  of MSE: 0.0
             The MSE: 303.6399840120383

---------------------- Iteration 1 ----------------------

     The mean of MSE: 232.39680889375865
     The std  of MSE: 71.24317511827968
             The MSE: 161.15363377547897

---------------------- Iteration 2 ----------------------

     The mean of MSE: 198.6046660497535
     The std  of MSE: 75.28309573382687
             The MSE: 131.02038036174318

---------------------- Iteration 3 ----------------------

     The mean of MSE: 175.1817630686272
     The std  of MSE: 76.78903261300516
             The MSE: 104.91305412524834

---------------------- Iteration 4 ----------------------

     The mean of MSE: 157.2343833456384
     The std  of MSE: 77.49631074701256
             The MSE: 85.44486445368325

---------------------- Iteration 5 ----------------------

     The mean of MSE: 143.3205397565595
     The std  of MSE: 77.28329077667031
             The MSE: 73.75132

The mean MSE on the normalized data is lower than the mean MSE on the unnormalized data (52.46 for Part B versus 66.41 for Part A in the final summary table).

## Part C. Increase the number of epochs

Repeat Part B but use 100 epochs this time for training.

In [15]:
# Combine the steps
MSE_C, mean_C, std_C = combine_steps(X_normalized, y, model=one_hidden_layer(n_features=8), iterations=50, epochs=100, verbose=0)

     The mean of MSE: 165.28111827355883
     The std  of MSE: 0.0
             The MSE: 165.28111827355883

---------------------- Iteration 1 ----------------------

     The mean of MSE: 126.62737843034549
     The std  of MSE: 38.653739843213344
             The MSE: 87.97363858713214

---------------------- Iteration 2 ----------------------

     The mean of MSE: 103.71712089459737
     The std  of MSE: 45.23089876717559
             The MSE: 57.89660582310112

---------------------- Iteration 3 ----------------------

     The mean of MSE: 89.2384681782885
     The std  of MSE: 46.5109643622157
             The MSE: 45.8025100293619

---------------------- Iteration 4 ----------------------

     The mean of MSE: 79.71460409406816
     The std  of MSE: 45.75403581217123
             The MSE: 41.6191477571868

---------------------- Iteration 5 ----------------------

     The mean of MSE: 73.0057623117764
     The std  of MSE: 44.379829703935606
             The MSE: 39.46155340

The mean MSE with 100 epochs is lower than the mean MSE with 50 epochs (39.88 for Part C versus 52.46 for Part B in the final summary table).

## Part D. Increase the number of hidden layers

Repeat Part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

In [16]:
def three_hidden_layer(n_features=8):
  model = Sequential()
  model.add(Dense(10, activation='relu', input_shape=(n_features,)))
  model.add(Dense(10, activation='relu'))
  model.add(Dense(10, activation='relu'))
  model.add(Dense(1))

  model.compile(optimizer='adam', loss='mean_squared_error')

  return model
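
To get a sense of how much larger the three-hidden-layer network is, the sketch below counts the trainable parameters of a stack of fully connected layers: each layer contributes one weight per input-output pair plus one bias per output unit. The helper `dense_param_count` is a hypothetical name introduced for this illustration.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by the width of each Dense layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


print(dense_param_count([8, 10, 1]))          # one hidden layer
print(dense_param_count([8, 10, 10, 10, 1]))  # three hidden layers
```

This prints 101 and 321, so the deeper model has a little over three times as many parameters to fit on the same training data.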

In [17]:
# Combine the steps
MSE_D, mean_D, std_D = combine_steps(X_normalized, y, model=three_hidden_layer(n_features=8), iterations=50, epochs=50, verbose=0)

     The mean of MSE: 129.14714614390184
     The std  of MSE: 0.0
             The MSE: 129.14714614390184

---------------------- Iteration 1 ----------------------

     The mean of MSE: 98.7501104411508
     The std  of MSE: 30.39703570275104
             The MSE: 68.35307473839976

---------------------- Iteration 2 ----------------------

     The mean of MSE: 81.0307321752871
     The std  of MSE: 35.26952298988094
             The MSE: 45.5919756435597

---------------------- Iteration 3 ----------------------

     The mean of MSE: 71.73040655269862
     The std  of MSE: 34.531762321214245
             The MSE: 43.8294296849332

---------------------- Iteration 4 ----------------------

     The mean of MSE: 66.1065619411916
     The std  of MSE: 32.87042135514195
             The MSE: 43.611183495163495

---------------------- Iteration 5 ----------------------

     The mean of MSE: 62.31523863915053
     The std  of MSE: 31.18104919054961
             The MSE: 43.3586221289

The mean MSE with three hidden layers is lower than the mean MSE with one hidden layer (43.18 for Part D versus 52.46 for Part B in the final summary table).

<h2>Report the mean and the standard deviation of the mean squared errors</h2>


In [19]:
table = [ 
          ['A', mean_A, std_A],
          ['B', mean_B, std_B],
          ['C', mean_C, std_C],
          ['D', mean_D, std_D]
         ]

pd.DataFrame(table, columns = ['Part', 'mean MSE', 'std MSE']).set_index('Part')


| Part | mean MSE | std MSE |
|---|---|---|
| A | 66.407658 | 29.983358 |
| B | 52.455017 | 43.092759 |
| C | 39.878994 | 19.662196 |
| D | 43.179171 | 12.977059 |
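
The standard deviations reported here come from `np.std`, whose default divides by n (the population formula) rather than by n - 1. The sketch below checks this against the first two MSE values printed in Part A, using only the standard library; the variable names are illustrative.

```python
import statistics

# First two MSE values printed by the Part A run
mse_values = [172.75251294925647, 112.8346166174537]

running_mean = statistics.mean(mse_values)
population_std = statistics.pstdev(mse_values)  # divides by n, like np.std's default
sample_std = statistics.stdev(mse_values)       # divides by n - 1

print(running_mean)
print(population_std)
print(sample_std)
```

The population value agrees with the 29.9589... printed after iteration 2 of Part A, while the sample value is larger by a factor of the square root of 2 for two observations. With 50 values the two formulas differ by only about 1%.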
