<h1 align=center><font size = 6>Regression Models with Keras</font></h1>

## Introduction

Keras is a high-level API for building deep learning models. It has gained favor for its ease of use and syntactic simplicity, which make for fast development.

In this lab, the Keras library is used to build a regression model for a dataset on concrete compressive strength. The effects of the number of training epochs and the structure of the hidden layers are examined.

<strong>The dataset is about the compressive strength of different samples of concrete based on the volumes of the different ingredients that were used to make them. Ingredients include:</strong>

<strong>1. Cement</strong>

<strong>2. Blast Furnace Slag</strong>

<strong>3. Fly Ash</strong>

<strong>4. Water</strong>

<strong>5. Superplasticizer</strong>

<strong>6. Coarse Aggregate</strong>

<strong>7. Fine Aggregate</strong>

Let's start by importing the necessary libraries.

In [3]:
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

Using TensorFlow backend.


## Download and view the Dataset

The next step is to download the dataset and do some preliminary analysis.

In [4]:
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
print("The shape of the dataset is:", concrete_data.shape)
print()
print("Checking for null values in dataset .....")
print()
print(concrete_data.isnull().sum())
print()
concrete_data.head()

The shape of the dataset is: (1030, 9)

Checking for null values in dataset .....

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64



|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


The **feature columns** and the **Strength** column are extracted to form the **predictors** and the **target** for training.

In [10]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column
predictors.columns

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age'],
      dtype='object')

To obtain an out-of-sample dataset for testing the trained model, **30%** of the data is set aside using **train_test_split**.

In [11]:
X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.30, random_state=0)
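
As a rough illustration of what a seeded 70/30 split does, the sketch below shuffles row indices with a fixed seed and holds out 30% of them. It uses only the standard library and is not scikit-learn's implementation; the helper name `split_indices` is hypothetical. The row count of 1030 comes from the dataset shape printed earlier.

```python
import math
import random

def split_indices(n_rows, test_fraction, seed):
    """Hypothetical helper: return (train_idx, test_idx) from a seeded shuffle."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)         # same seed -> same shuffle
    n_test = math.ceil(n_rows * test_fraction)   # held-out rows, rounded up
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(1030, 0.30, seed=0)
print(len(train_idx), len(test_idx))   # 721 309
```

Changing the seed changes which rows land in each part but not the sizes, which is why the later experiments vary `random_state` to get different splits of the same proportions.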

## Build the base Neural Network

A function is defined for the regression model so that models with different configurations can be created with a single call.

In [12]:
# define regression model
def regression_model(n_layer, n_neurons, input_size, output_size):
    """Build and compile a dense network with n_layer hidden ReLU layers (n_layer >= 1)."""
    # create model
    model = Sequential()
    # first hidden layer also declares the input shape
    model.add(Dense(n_neurons, activation='relu', input_shape=(input_size,)))
    # remaining hidden layers
    for _ in range(n_layer - 1):
        model.add(Dense(n_neurons, activation='relu'))
    # linear output layer for regression
    model.add(Dense(output_size))
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
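
To make the layer layout explicit, here is a small sketch that lists the stack of `Dense` layers `regression_model` assembles for `n_layer >= 1`. The helper `layer_plan` is hypothetical and does not touch Keras; it only records the unit count and activation of each layer.

```python
def layer_plan(n_layer, n_neurons, output_size):
    """Hypothetical helper: (units, activation) for each Dense layer, n_layer >= 1."""
    hidden = [(n_neurons, "relu")] * n_layer   # n_layer hidden ReLU layers
    output = [(output_size, None)]             # linear output layer
    return hidden + output

print(layer_plan(1, 10, 1))   # [(10, 'relu'), (1, None)]
print(layer_plan(3, 10, 1))   # [(10, 'relu'), (10, 'relu'), (10, 'relu'), (1, None)]
```

The output layer has no activation, which is the usual choice when the target is an unbounded real value such as strength.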

In [23]:
# Initialize some essential parameters for the model function
input_size = predictors.shape[1] # number of predictors
ouput_size = 1 # number of targets
hidden_layers = 1
nodes = 10   # number of neurons in the hidden layer
Epochs = 50

# build and fit the model
model = regression_model(hidden_layers, nodes, input_size, ouput_size)
model.fit(X_train, y_train, epochs=Epochs, verbose= False)


<keras.callbacks.History at 0x7f7f201963d0>

In [24]:
y_pred = model.predict(X_test) # Perform some predictions using the test features
print("The MSE of the base model on the test data is:", f"{mean_squared_error(y_test, y_pred):.4f}")

The MSE of the base model on the test data is: 720.8025
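
For reference, the MSE is the average of the squared differences between targets and predictions. The sketch below computes it by hand on made-up numbers; `mean_squared_error_manual` is a hypothetical name, and the values are for illustration only. Taking the square root gives the RMSE, which is in the same units as `Strength`; for the base model above, the square root of 720.8025 is about 26.85.

```python
import math

def mean_squared_error_manual(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up numbers, for illustration only
y_true = [40.0, 50.0, 60.0]
y_pred = [38.0, 53.0, 60.0]

mse = mean_squared_error_manual(y_true, y_pred)   # (4 + 9 + 0) / 3
rmse = math.sqrt(mse)                             # back in the units of the target
print(f"MSE: {mse:.4f}, RMSE: {rmse:.4f}")
```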


The model is now trained 50 times, each time on a different random split of the data, and the test MSE of each run is recorded.

In [32]:
MSE = []
for i in range(50):
    
    # Initialize some essential parameters for the model function
    input_size = predictors.shape[1] # number of predictors
    ouput_size = 1 # number of targets
    hidden_layers = 1
    nodes = 10   # number of neurons in the hidden layer
    Epochs = 50
    
    # Randomly split the data
    X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.30, random_state=i)

    # build and fit the model
    model = regression_model(hidden_layers, nodes, input_size, ouput_size)
    model.fit(X_train, y_train, epochs=Epochs, verbose= False)
    
    y_pred = model.predict(X_test) # Perform some predictions using the test features
    
    # Append the MSE to the MSE list
    MSE.append(float(f"{mean_squared_error(y_test, y_pred):.6f}"))

# Print the mean and standard deviation of the list
print(f'Mean: {np.mean(MSE): .6f}')
print(f'Standard Deviation: {np.std(MSE): .6f}')

Mean:  285.031109
Standard Deviation:  302.520071
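
One detail worth keeping in mind when reading these summaries: `np.std` divides by the number of values by default (`ddof=0`), so it reports the population standard deviation. The sketch below shows the difference from the sample standard deviation using the standard library; the MSE values are hypothetical and are not results from this notebook.

```python
import statistics

# Hypothetical MSE values from five runs, for illustration only
mse_runs = [120.0, 150.0, 130.0, 400.0, 140.0]

mean_mse = statistics.mean(mse_runs)
sd_population = statistics.pstdev(mse_runs)   # divides by n, like np.std's default
sd_sample = statistics.stdev(mse_runs)        # divides by n - 1, like np.std(..., ddof=1)

print(f"Mean: {mean_mse:.6f}")
print(f"Population SD: {sd_population:.6f}")
print(f"Sample SD: {sd_sample:.6f}")
```

With 50 runs the two estimates differ by only about one percent, so the choice does not change the conclusions drawn here.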


## Demonstrating the effect of normalizing our dataset

The data is normalized by subtracting each predictor's mean from its values and dividing by its standard deviation. The same 50-run experiment as above is then repeated on the normalized dataset.
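
The cell below computes the mean and standard deviation over the full dataset before splitting. A common alternative, shown here only as a sketch and not what this notebook does, is to compute them on the training rows and reuse them for the test rows, so that no test information enters the preprocessing. The helper names and the feature values are hypothetical; the sample standard deviation is used because that is the default of pandas `.std()`.

```python
import statistics

def fit_standardizer(column):
    """Return (mean, sample SD) of one feature column."""
    return statistics.mean(column), statistics.stdev(column)

def standardize(column, mean, sd):
    """Apply (x - mean) / sd to every value."""
    return [(x - mean) / sd for x in column]

# Made-up feature values, for illustration only
train_col = [100.0, 200.0, 300.0, 400.0]
test_col = [250.0, 500.0]

mean, sd = fit_standardizer(train_col)        # statistics from training rows only
train_norm = standardize(train_col, mean, sd)
test_norm = standardize(test_col, mean, sd)   # reuse the training statistics
print(test_norm)
```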

In [34]:
# Normalizing our data
predictors_norm = (predictors - predictors.mean()) / predictors.std()

MSE = []
for i in range(50):
    
    # Initialize some essential parameters for the model function
    input_size = predictors.shape[1] # number of predictors
    ouput_size = 1 # number of targets
    hidden_layers = 1
    nodes = 10   # number of neurons in the hidden layer
    Epochs = 50
    
    # Randomly split the data
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.30, random_state=i)

    # build and fit the model
    model = regression_model(hidden_layers, nodes, input_size, ouput_size)
    model.fit(X_train, y_train, epochs=Epochs, verbose= False)
    
    y_pred = model.predict(X_test) # Perform some predictions using the test features
    
    # Append the MSE to the MSE list
    MSE.append(float(f"{mean_squared_error(y_test, y_pred):.6f}"))

# Print the mean and standard deviation of the list
print(f'Mean: {np.mean(MSE): .6f}')
print(f'Standard Deviation: {np.std(MSE): .6f}')

Mean:  351.696268
Standard Deviation:  92.790012


I expected the mean of the **mean squared error (MSE)** to drop once the features were normalized, but it did not: it rose from about 285.03 to 351.70. The noticeable difference is in the **standard deviation (SD)**, which fell from about 302.52 to 92.79. The high SD of the base model means that its MSE varies over a wide range from run to run, which suggests that getting a good model from the unnormalized data involves a lot of trial and error. With the normalized data, the MSE values are much closer to each other.

## Demonstrating the effect of epoch length

The number of training epochs is now doubled, from **50 to 100**.
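
To see what doubling the epochs means in terms of weight updates, the sketch below counts mini-batch steps. It assumes the Keras default batch size of 32, since `model.fit` is called here without a `batch_size` argument, and 721 training rows (1030 rows with 30% held out). The function name `gradient_updates` is hypothetical.

```python
import math

def gradient_updates(n_train, epochs, batch_size=32):
    """Number of weight updates: batches per epoch times epochs."""
    steps_per_epoch = math.ceil(n_train / batch_size)   # last batch may be partial
    return steps_per_epoch * epochs

n_train = 721   # 1030 rows minus the 309 held out for testing
print(gradient_updates(n_train, epochs=50))    # 23 * 50 = 1150
print(gradient_updates(n_train, epochs=100))   # 23 * 100 = 2300
```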

In [36]:
MSE = []
for i in range(50):
    
    # Initialize some essential parameters for the model function
    input_size = predictors.shape[1] # number of predictors
    ouput_size = 1 # number of targets
    hidden_layers = 1
    nodes = 10   # number of neurons in the hidden layer
    Epochs = 100   # Increased from 50 to 100
    
    # Randomly split the data
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.30, random_state=i)

    # build and fit the model
    model = regression_model(hidden_layers, nodes, input_size, ouput_size)
    model.fit(X_train, y_train, epochs=Epochs, verbose= False)
    
    y_pred = model.predict(X_test) # Perform some predictions using the test features
    
    # Append the MSE to the MSE list
    MSE.append(float(f"{mean_squared_error(y_test, y_pred):.6f}"))

# Print the mean and standard deviation of the list
print(f'Mean: {np.mean(MSE): .6f}')
print(f'Standard Deviation: {np.std(MSE): .6f}')

Mean:  172.560792
Standard Deviation:  33.529968


Training for more epochs reduced both the mean and the SD of the MSE by a reasonable amount: the mean dropped from about 351.70 to 172.56 and the SD from about 92.79 to 33.53.

## Demonstrating the effect of increasing the number of hidden layers in the neural network

The number of epochs is kept at 50, while the number of hidden layers is increased to **3**, each with **10** nodes and a ReLU activation function.
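
A quick way to quantify how much larger the deeper network is: each fully connected layer has one weight per input-output pair plus one bias per output. The sketch below applies that rule to the two configurations used in this lab (8 predictors, 10 nodes per hidden layer, 1 output); `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(input_size, n_layer, n_neurons, output_size):
    """Weights plus biases for n_layer hidden layers and one output layer."""
    sizes = [input_size] + [n_neurons] * n_layer + [output_size]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

print(dense_param_count(8, 1, 10, 1))   # 101
print(dense_param_count(8, 3, 10, 1))   # 321
```

The three-layer model has roughly three times as many trainable parameters as the base model.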

In [37]:
MSE = []
for i in range(50):
    
    # Initialize some essential parameters for the model function
    input_size = predictors.shape[1] # number of predictors
    ouput_size = 1 # number of targets
    hidden_layers = 3
    nodes = 10   # number of neurons in the hidden layer
    Epochs = 50
    
    # Randomly split the data
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.30, random_state=i)

    # build and fit the model
    model = regression_model(hidden_layers, nodes, input_size, ouput_size)
    model.fit(X_train, y_train, epochs=Epochs, verbose= False)
    
    y_pred = model.predict(X_test) # Perform some predictions using the test features
    
    # Append the MSE to the MSE list
    MSE.append(float(f"{mean_squared_error(y_test, y_pred):.6f}"))

# Print the mean and standard deviation of the list
print(f'Mean: {np.mean(MSE): .6f}')
print(f'Standard Deviation: {np.std(MSE): .6f}')

Mean:  128.667700
Standard Deviation:  17.923139


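Increasing the number of hidden layers gives the network more capacity to learn the relationships in the dataset. This is the likely reason for the significant drop in the mean and SD of the MSE compared with the single-hidden-layer model trained for 50 epochs on the normalized data: the mean fell from about 351.70 to 128.67 and the SD from about 92.79 to 17.92.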


## Log

|  Date (YYYY-MM-DD) |  Version | Changed By  |  Change Description |
|---|---|---|---|
| 2023-10-18  | 1.0  | Innocent  |  Created this notebook |

