## Regression Model in Keras by Niklas

#### Import Pandas and Numpy

In [86]:
import pandas as pd
import numpy as np

#### Import the dataset and show the first rows

In [87]:
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
concrete_data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


#### Install Keras and Tensorflow

In [88]:
%%capture
!pip install keras==2.2.5 
!pip install tensorflow==1.15.0

#### Import Keras

Import Keras and related models, layers, and utils

In [89]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

#### Separate dataset

Split the data into the predictor variables and the target (the Strength of the concrete), then show the first five rows of the predictors.

In [90]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

In [91]:
predictors.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360


In [92]:
n_cols = predictors.shape[1] # number of predictors (normalization does not change the column count)

#### Normalize the data

Standardize each predictor to zero mean and unit standard deviation so that variables measured on larger scales do not dominate the training.

In [93]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069
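
As a small self-contained illustration of the normalization above, the following sketch applies the same z-score formula to a toy array. The values are made up and are not the concrete data; `ddof=1` is passed explicitly because the pandas `std()` used above defaults to the sample standard deviation.

```python
import numpy as np

# Toy feature matrix: 4 samples, 2 features (made-up values, not the concrete data)
toy = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0],
                [4.0, 40.0]])

# Column-wise z-score; ddof=1 mirrors the pandas default for DataFrame.std()
toy_norm = (toy - toy.mean(axis=0)) / toy.std(axis=0, ddof=1)

col_means = toy_norm.mean(axis=0)        # approximately 0 for every column
col_stds = toy_norm.std(axis=0, ddof=1)  # approximately 1 for every column
print(col_means, col_stds)
```

After this transformation both toy columns are on the same scale, even though the second one was originally ten times larger.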


#### Import train_test_split from sklearn

Import train_test_split to divide the dataset into a training set and a test set, each with its own predictor and target variables. 30% of the rows are randomly assigned to the test set; random_state=42 makes the split reproducible.

In [94]:
from sklearn.model_selection import train_test_split

In [95]:
X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.30, random_state=42)
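
To make the idea of a random 70/30 split concrete, here is a hand-rolled sketch that uses only NumPy. It is an illustration of the concept, not scikit-learn's implementation; `simple_split` is a hypothetical helper, and the sample count of 10 is chosen only for the example.

```python
import numpy as np

def simple_split(n_samples, test_size=0.30, seed=42):
    """Return shuffled (train, test) index arrays (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)      # random order of row indices
    n_test = int(round(n_samples * test_size))  # number of rows held out for testing
    return shuffled[n_test:], shuffled[:n_test]

train_idx, test_idx = simple_split(10)
print(len(train_idx), len(test_idx))
```

Every row index ends up in exactly one of the two sets, and fixing the seed yields the same assignment on every run.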

### Regression Model

Definition of the regression model with the desired characteristics:
- Three hidden layers with 10 nodes each and the ReLU activation function
- One output node without an activation function
- Optimizer = Adam, loss = mean squared error (MSE)

In [96]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
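
As a rough check on the size of the network defined in regression_model(), the number of trainable parameters can be worked out by hand. The sketch below assumes `n_cols = 8`, matching the eight predictor columns shown earlier; `dense_params` is a hypothetical helper, not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

n_cols = 8                     # assumption: the eight predictor columns shown above
layer_sizes = [10, 10, 10, 1]  # three hidden layers and the single output node

total = 0
n_in = n_cols
for n_units in layer_sizes:
    total += dense_params(n_in, n_units)
    n_in = n_units             # the output of this layer feeds the next one

print(total)  # 90 + 110 + 110 + 11 = 321
```

In Keras, calling `model.summary()` on the compiled model should report the per-layer counts for comparison.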

#### Building the model

Build the model and create an empty list of scores that will collect the test-set MSE of each iteration.

In [97]:
# build the model
model = regression_model()
scores = []

#### Running the iterations

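Run 50 iterations with 50 epochs each and append the resulting test-set MSE to the scores list. The same model instance is reused across iterations, so each iteration continues training from the weights of the previous one; this is why the MSE falls steadily over the first iterations.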

In [98]:
%%capture
for i in range(50):
    model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, verbose=2)
    scores.append(model.evaluate(X_test, y_test, verbose=0))
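
If independent runs are preferred over continued training of a single model, one option is to rebuild the model inside the loop. The sketch below shows only the pattern: `repeat_experiment`, `toy_build` and `toy_score` are hypothetical names, and the toy stand-ins replace Keras so that the example runs on its own.

```python
import random

def repeat_experiment(build_model, train_and_score, n_runs):
    """Train a freshly built model n_runs times and collect one score per run."""
    run_scores = []
    for _ in range(n_runs):
        model = build_model()                      # new initial weights for every run
        run_scores.append(train_and_score(model))  # fit, then evaluate on the test set
    return run_scores

# Stand-ins so the sketch runs without Keras; in the notebook these would
# wrap regression_model(), model.fit(...) and model.evaluate(...).
def toy_build():
    return {"weight": random.random()}

def toy_score(model):
    return (model["weight"] - 0.5) ** 2

random.seed(0)
toy_scores = repeat_experiment(toy_build, toy_score, n_runs=5)
print(len(toy_scores))
```

With this pattern the mean and standard deviation of the scores describe the variation between independently initialized models rather than the progress of one training run.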

#### MSE for each iteration

Displaying the MSE for each iteration

In [99]:
for i in range(50):
    print('MSE for iteration #',[i],': ',"%.2f" % scores[i])

MSE for iteration # [0] :  128.19
MSE for iteration # [1] :  93.54
MSE for iteration # [2] :  70.70
MSE for iteration # [3] :  58.56
MSE for iteration # [4] :  52.39
MSE for iteration # [5] :  48.62
MSE for iteration # [6] :  40.82
MSE for iteration # [7] :  39.68
MSE for iteration # [8] :  38.49
MSE for iteration # [9] :  38.24
MSE for iteration # [10] :  37.25
MSE for iteration # [11] :  37.45
MSE for iteration # [12] :  37.32
MSE for iteration # [13] :  36.79
MSE for iteration # [14] :  36.78
MSE for iteration # [15] :  38.10
MSE for iteration # [16] :  36.73
MSE for iteration # [17] :  36.79
MSE for iteration # [18] :  36.03
MSE for iteration # [19] :  37.45
MSE for iteration # [20] :  37.97
MSE for iteration # [21] :  36.89
MSE for iteration # [22] :  37.30
MSE for iteration # [23] :  36.29
MSE for iteration # [24] :  35.81
MSE for iteration # [25] :  36.60
MSE for iteration # [26] :  37.27
MSE for iteration # [27] :  36.49
MSE for iteration # [28] :  37.12
MSE for iteration # [29

In [100]:
print ( 'The Mean of the MSE is: ',"%.2f" % np.mean(scores))

The Mean of the MSE is:  41.30


In [101]:
print ( 'The Standard Deviation of the MSE is: ',"%.2f" % np.std(scores))

The Standard Deviation of the MSE is:  15.90
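
One detail worth knowing when reading this number: `np.std` computes the population standard deviation (`ddof=0`) by default, whereas the pandas `std()` used for the normalization defaults to the sample standard deviation (`ddof=1`). The sketch below shows the difference on made-up values.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])  # made-up numbers for illustration

population_std = np.std(values)        # ddof=0, NumPy's default
sample_std = np.std(values, ddof=1)    # ddof=1, the pandas default for .std()

print(round(population_std, 4), round(sample_std, 4))
```

With 50 scores the two variants differ only slightly, but it helps to state which one is being reported.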


### Comparison of the Mean and Standard Deviation of the MSE for 1 and 3 Hidden Layers (50 Epochs)

#### Mean

Mean with 1 hidden layer: 53.77

Mean with 3 hidden layers: 41.30

#### Standard Deviation

Standard deviation with 1 hidden layer: 52.57

Standard deviation with 3 hidden layers: 15.90