A. Build a baseline model (5 marks) 

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- The Adam optimizer and mean squared error as the loss function.

1. Randomly split the data into training and test sets, holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
   
6. Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.
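
Steps 3 to 5 reduce to a small amount of arithmetic. The sketch below assumes NumPy arrays of actual and predicted strengths (the numbers are made up for illustration, not taken from the concrete data). It computes one mean squared error by hand and then summarizes a list of such errors. Scikit-learn's mean_squared_error should return the same value as the hand-written helper.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# illustrative values only, not taken from the concrete data set
y_true = [40.0, 50.0, 60.0]
y_pred = [42.0, 47.0, 60.0]
one_error = mse(y_true, y_pred)  # (4 + 9 + 0) / 3

# step 4 would collect one such value per random split
mse_values = [one_error, 5.0, 6.0]  # hypothetical list of errors
print(np.mean(mse_values), np.std(mse_values))
```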





In [31]:
!pip install numpy==1.21.4
!pip install pandas==1.3.4
!pip install keras==2.1.6



In [32]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split

In [33]:
import pandas as pd
import numpy as np

In [34]:
concrete_data = pd.read_csv("concrete_data.csv")
concrete_data.head()

,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [35]:
# check the number of rows and columns, and count missing values per column
print(concrete_data.shape)
print(concrete_data.isnull().sum())

(1030, 9)
Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64


Split Data into predictors and target

In [36]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

Sanity check for the predictors and the target

In [37]:

print(predictors.head())

   Cement  Blast Furnace Slag  Fly Ash  Water  Superplasticizer  \
0   540.0                 0.0      0.0  162.0               2.5   
1   540.0                 0.0      0.0  162.0               2.5   
2   332.5               142.5      0.0  228.0               0.0   
3   332.5               142.5      0.0  228.0               0.0   
4   198.6               132.4      0.0  192.0               0.0   

   Coarse Aggregate  Fine Aggregate  Age  
0            1040.0           676.0   28  
1            1055.0           676.0   28  
2             932.0           594.0  270  
3             932.0           594.0  365  
4             978.4           825.5  360  


In [38]:
print(target.head())

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64


In [39]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()

,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069
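
The cell above computes the mean and standard deviation over all 1030 rows before any split is made. A common variant, shown here only as a sketch, estimates those statistics from the training rows and reuses them for the test rows, so that no information from the test set reaches the model. The helper name standardize_with_train_stats and the toy arrays are assumptions made for this example. ddof=1 is passed to match the default of the pandas std() method.

```python
import numpy as np

def standardize_with_train_stats(x_train, x_test):
    """Z-score both splits using the training split's mean and std."""
    x_train = np.asarray(x_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0, ddof=1)  # sample std, as pandas uses by default
    return (x_train - mean) / std, (x_test - mean) / std

# toy data: 4 training rows and 1 test row, 2 predictors
x_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
x_test = np.array([[2.5, 25.0]])
train_norm, test_norm = standardize_with_train_stats(x_train, x_test)
print(train_norm.mean(axis=0))  # close to 0 for each column
print(test_norm)
```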


In [40]:
n_cols = predictors_norm.shape[1] # number of predictors
n_cols

8

Build a neural network
 1) One hidden layer of 10 nodes with ReLU activation function
 2) Adam optimizer and MSE as loss function


In [41]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

In [42]:
# Train the given model for number_of_epochs epochs on the current training split
# (the notebook-level x_train, y_train) and record the loss on the test split
def regression_model_fit(model, number_of_epochs):
    # pass the 30% held-out test split as validation data, so the returned
    # history's 'val_loss' holds the test MSE after each epoch
    # verbose=0 suppresses the per-epoch progress output
    return model.fit(x_train, y_train, epochs=number_of_epochs, verbose=0, validation_data=(x_test, y_test))


Build a model with 1 hidden layer

In [43]:
# build the model and print its summary (summary() prints the table itself and returns None)
model_1_hidden_layer = regression_model()
print(model_1_hidden_layer.summary())



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_5 (Dense)              (None, 10)                90        
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 11        
Total params: 101
Trainable params: 101
Non-trainable params: 0
_________________________________________________________________
None
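
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit. A quick sketch (dense_params is a helper name made up here, not a Keras function):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(8, 10)  # 8 predictors into 10 hidden nodes
output = dense_params(10, 1)  # 10 hidden nodes into 1 output
total = hidden + output
print(hidden, output, total)  # 90 11 101
```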


Train the model for 50 epochs on each of 50 random train/test splits

In [44]:
mse_list = []
# Note: the same model instance is reused on every pass, so its weights carry over
# from one split to the next instead of being re-initialized.
for n in range(50):
    # randomly split the data into a training set (70%) and a test set (30%)
    x_train, x_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3)
    result = regression_model_fit(model_1_hidden_layer, 50)
    # the test MSE is the validation loss after the final epoch
    mse = result.history['val_loss'][-1]
    # add this split's MSE to the list
    mse_list.append(mse)
    print('iteration #{}: mean_squared_error {}'.format(n+1, mse))
    

iteration #1: mean_squared_error 448.98490239115597
iteration #2: mean_squared_error 161.4504210339395
iteration #3: mean_squared_error 137.72016353668903
iteration #4: mean_squared_error 103.75139314225576
iteration #5: mean_squared_error 92.3070868334724
iteration #6: mean_squared_error 76.53305406786477
iteration #7: mean_squared_error 61.44404005772859
iteration #8: mean_squared_error 57.56187234032887
iteration #9: mean_squared_error 43.94030492591241
iteration #10: mean_squared_error 43.821330178516966
iteration #11: mean_squared_error 45.50723567826848
iteration #12: mean_squared_error 31.966149592476754
iteration #13: mean_squared_error 40.80877272596637
iteration #14: mean_squared_error 37.26084552999453
iteration #15: mean_squared_error 40.11544381299065
iteration #16: mean_squared_error 36.154784279733796
iteration #17: mean_squared_error 33.706071390689
iteration #18: mean_squared_error 31.293632896201125
iteration #19: mean_squared_error 41.68263617456924
iteration #20: me

Mean and standard deviation of the MSE

In [45]:
print('The mean of the mean squared errors: {}'.format(np.mean(mse_list)))
print('The standard deviation of the mean squared errors: {}'.format(np.std(mse_list)))

The mean of the mean squared errors: 51.914009425678685
The standard deviation of the mean squared errors: 62.520642416551404


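Normalizing the data reduced the mean MSE. However, the standard deviation is larger than for the baseline model, and it exceeds the mean itself (62.5 versus 51.9), so the individual MSE values vary widely. Much of that spread comes from the early iterations (448.98 on iteration 1, falling to about 31 by iteration 18): the same model instance is trained further on every iteration, so the later values are not independent repetitions.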
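
One possible refinement, sketched under assumptions: the loop above passes the same model_1_hidden_layer to every iteration, so building a fresh model inside the loop would make the 50 errors independent of each other. The sketch below shows that pattern in a framework-agnostic way. repeated_mse and LeastSquares are hypothetical names, and a plain NumPy least-squares fit stands in for the Keras network so the example runs without Keras. In the notebook the factory would be regression_model and the fit call would also take epochs=50. For the same reason the split uses a NumPy permutation rather than train_test_split.

```python
import numpy as np

class LeastSquares:
    """Stand-in model with fit/predict; NOT the Keras network."""
    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # add intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([X, np.ones(len(X))]) @ self.coef_

def repeated_mse(build_model, X, y, n_repeats=50, test_size=0.3, seed=0):
    """Split, build a NEW model, fit, and score; repeated n_repeats times."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_size * len(X)))
    errors = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(X))
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        model = build_model()  # fresh, untrained model for every split
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        errors.append(float(np.mean((y[test_idx] - pred) ** 2)))
    return errors

# synthetic data, for illustration only
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
errors = repeated_mse(LeastSquares, X, y, n_repeats=5)
print(np.mean(errors), np.std(errors))
```

Passing a factory (a callable that returns a new model) rather than a model instance is what guarantees that no weights leak from one repetition into the next.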