## Regression Model in Keras by Niklas

#### Import Pandas and Numpy

In [1]:
import pandas as pd
import numpy as np

#### Import the dataset and show the first rows

In [2]:
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
concrete_data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


#### Install Keras and Tensorflow

In [3]:
!pip install keras==2.2.5 
!pip install tensorflow==1.15.0

Collecting keras==2.2.5
  Downloading Keras-2.2.5-py2.py3-none-any.whl (336 kB)
Installing collected packages: keras
Successfully installed keras-2.2.5
Collecting tensorflow==1.15.0
  Downloading tensorflow-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl (412.3 MB)
Collecting tensorflow-estimator==1.15.1
  Downloading tensorflow_estimator-1.15.1-py2.py3-none-any.whl (503 kB)
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading tensorboard-1.15.0-py3-none-any.whl (3.8 MB)
Installing collected packages: tensorflow-estimator, tensorboard, tensorflow
  Attempting uninstall: tensorflow-estimator
    Found existing installation: tensorflow-estimator 2.1.0
    Uninstalling tensorflow-estimator-2.1.0:

#### Import Keras

Import Keras and related models, layers, and utils

In [4]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

Using TensorFlow backend.


#### Separate dataset

Split the data into the predictor variables and the target (the concrete strength), then show the first five rows of the predictors.

In [6]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

In [7]:
predictors.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360
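The same selection can also be written with `DataFrame.drop`, which some readers find easier to scan than a boolean mask on the column index. The following is a minimal sketch on a made-up two-row frame; the name `toy` and its values are illustrative only and are not taken from the dataset.

```python
import pandas as pd

# Toy frame standing in for concrete_data (values are made up)
toy = pd.DataFrame({
    'Cement': [540.0, 332.5],
    'Water': [162.0, 228.0],
    'Strength': [79.99, 40.27],
})

# Boolean mask on the column index, as in the cell above
cols = toy.columns
predictors_mask = toy[cols[cols != 'Strength']]

# Equivalent selection with drop(): every column except Strength
predictors_drop = toy.drop(columns='Strength')
target_toy = toy['Strength']

print(predictors_drop.columns.tolist())
```

Both variants return a new DataFrame and leave the original frame unchanged.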


#### Normalize the data

Standardize each predictor by subtracting its mean and dividing by its standard deviation, so that variables measured on different scales have a balanced influence on the model.

In [29]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069
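As a quick sanity check of the formula above, every standardized column should end up with a mean of about 0 and a standard deviation of 1. The sketch below verifies this on a small made-up frame (the name `toy` and its values are assumptions for illustration). Note that pandas `std()` uses the sample standard deviation (`ddof=1`) by default.

```python
import numpy as np
import pandas as pd

# Made-up values, only to illustrate the arithmetic
toy = pd.DataFrame({
    'Cement': [540.0, 332.5, 198.6, 266.0],
    'Age': [28.0, 270.0, 360.0, 90.0],
})

# Same z-score formula as in the cell above
toy_norm = (toy - toy.mean()) / toy.std()

col_means = toy_norm.mean()   # expected to be close to 0
col_stds = toy_norm.std()     # expected to be close to 1
print(col_means.round(6).tolist(), col_stds.round(6).tolist())
```

In this notebook the mean and standard deviation are computed on the full dataset before the train/test split. A stricter variant would compute them on the training rows only and apply them to the test rows, so that no test-set information enters the preprocessing.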


In [9]:
n_cols = predictors_norm.shape[1] # number of predictors

#### Import train_test_split from sklearn

Import train_test_split to divide the dataset into a training set and a test set, each with its own predictors and target. 30% of the rows are randomly assigned to the test set; random_state=42 makes the split reproducible.

In [10]:
from sklearn.model_selection import train_test_split

In [11]:
X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.30, random_state=42)
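To make the 30% figure concrete: the training log further down reports 721 training and 309 validation samples, i.e. 1030 rows in total. The sketch below reproduces only that arithmetic and the idea of a random, non-overlapping split using NumPy. It is an illustration, not scikit-learn's implementation, and the seed picks different rows than `train_test_split` with `random_state=42`.

```python
import math
import numpy as np

n_samples = 1030      # 721 + 309, per the training log
test_size = 0.30

# Number of rows held out for testing; the rest is used for training
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Shuffle the row indices once, then cut them into two disjoint parts
rng = np.random.default_rng(42)   # seed chosen for illustration only
shuffled = rng.permutation(n_samples)
test_idx, train_idx = shuffled[:n_test], shuffled[n_test:]

print(n_train, n_test)
```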

### Regression Model

The regression model has the following characteristics:
- One hidden layer with 10 units and the ReLU activation function
- Adam optimizer, with mean squared error (MSE) as the loss

In [12]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
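To see what this architecture computes, the forward pass can be sketched in plain NumPy: a hidden layer of 10 ReLU units followed by a single linear output unit, scored with the MSE. The sketch assumes 8 predictor columns (as in the predictors table above) and uses random weights and data. It illustrates the arithmetic only and is not how Keras is implemented internally.

```python
import numpy as np

n_cols = 8       # assumed number of predictors
n_hidden = 10    # units in the hidden layer

rng = np.random.default_rng(0)
W1 = rng.normal(size=(n_cols, n_hidden))   # hidden-layer weights
b1 = np.zeros(n_hidden)                    # hidden-layer biases
W2 = rng.normal(size=(n_hidden, 1))        # output-layer weights
b2 = np.zeros(1)                           # output-layer bias

def forward(X):
    """Dense(10, relu) followed by Dense(1) with no activation."""
    hidden = np.maximum(0.0, X @ W1 + b1)  # ReLU
    return hidden @ W2 + b2                # linear output

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred.ravel()) ** 2))

X_demo = rng.normal(size=(5, n_cols))      # five made-up samples
y_demo = rng.normal(size=5)
y_hat = forward(X_demo)

n_params = W1.size + b1.size + W2.size + b2.size
print(y_hat.shape, n_params, mse(y_demo, y_hat))
```

Under the 8-input assumption the network has 8·10 + 10 + 10·1 + 1 = 101 trainable parameters, which is the figure `model.summary()` should report for the model defined above.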

#### Building the model

Build the model and create an empty scores list that will collect the test-set MSE of each iteration.

In [13]:
# build the model
model = regression_model()
scores = []

#### Running the iterations

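Run 50 iterations of 50 epochs each and append the test-set MSE of each iteration to the scores list. The model is built once, outside the loop, so every iteration continues training the same weights rather than starting from a fresh initialization.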

In [22]:
for i in range(50):
    model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, verbose=2)
    scores.append(model.evaluate(X_test, y_test, verbose=0))

Train on 721 samples, validate on 309 samples
Epoch 1/50
 - 0s - loss: 33.3591 - val_loss: 41.2534
Epoch 2/50
 - 0s - loss: 33.3875 - val_loss: 41.7188
Epoch 3/50
 - 0s - loss: 33.5204 - val_loss: 41.6458
Epoch 4/50
 - 0s - loss: 33.3876 - val_loss: 41.1034
Epoch 5/50
 - 0s - loss: 33.4988 - val_loss: 41.2547
Epoch 6/50
 - 0s - loss: 33.4877 - val_loss: 41.3652
Epoch 7/50
 - 0s - loss: 33.4449 - val_loss: 41.3060
Epoch 8/50
 - 0s - loss: 33.3606 - val_loss: 41.1836
Epoch 9/50
 - 0s - loss: 33.3988 - val_loss: 41.4504
Epoch 10/50
 - 0s - loss: 33.3871 - val_loss: 41.7829
Epoch 11/50
 - 0s - loss: 33.4966 - val_loss: 41.5259
Epoch 12/50
 - 0s - loss: 33.4081 - val_loss: 41.4354
Epoch 13/50
 - 0s - loss: 33.4226 - val_loss: 41.0454
Epoch 14/50
 - 0s - loss: 33.4060 - val_loss: 41.2279
Epoch 15/50
 - 0s - loss: 33.4498 - val_loss: 41.4806
Epoch 16/50
 - 0s - loss: 33.4978 - val_loss: 41.6822
Epoch 17/50
 - 0s - loss: 33.4102 - val_loss: 41.5073
Epoch 18/50
 - 0s - loss: 33.3629 - val_loss:
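The loop above trains one model continuously. If the goal were instead to measure how much the result varies between independent training runs, one option is to rebuild the model at the start of every iteration. The helper below is a sketch of that variant; `repeated_evaluation` and its `build_model` parameter are hypothetical names introduced here, and the helper only assumes an object with Keras-style `fit` and `evaluate` methods.

```python
def repeated_evaluation(build_model, X_train, y_train, X_test, y_test,
                        n_runs=50, epochs=50):
    """Train a freshly built model n_runs times and collect the test losses."""
    scores = []
    for _ in range(n_runs):
        model = build_model()   # new, re-initialized model for every run
        model.fit(X_train, y_train, epochs=epochs, verbose=0)
        scores.append(model.evaluate(X_test, y_test, verbose=0))
    return scores
```

With the names used in this notebook, a call could look like `scores = repeated_evaluation(regression_model, X_train, y_train, X_test, y_test)`. Each score would then come from a separate model, at the cost of retraining from scratch 50 times.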

#### MSE for each iteration

Displaying the MSE for each iteration

In [31]:
for i in range(50):
    print('MSE for iteration #',[i],': ',"%.2f" % scores[i])

MSE for iteration # [0] :  306.67
MSE for iteration # [1] :  147.25
MSE for iteration # [2] :  117.24
MSE for iteration # [3] :  96.03
MSE for iteration # [4] :  80.94
MSE for iteration # [5] :  70.20
MSE for iteration # [6] :  63.96
MSE for iteration # [7] :  59.60
MSE for iteration # [8] :  56.78
MSE for iteration # [9] :  54.32
MSE for iteration # [10] :  52.33
MSE for iteration # [11] :  50.35
MSE for iteration # [12] :  48.94
MSE for iteration # [13] :  47.86
MSE for iteration # [14] :  46.77
MSE for iteration # [15] :  45.81
MSE for iteration # [16] :  44.73
MSE for iteration # [17] :  44.16
MSE for iteration # [18] :  43.75
MSE for iteration # [19] :  43.28
MSE for iteration # [20] :  42.92
MSE for iteration # [21] :  43.08
MSE for iteration # [22] :  42.74
MSE for iteration # [23] :  42.73
MSE for iteration # [24] :  42.29
MSE for iteration # [25] :  42.05
MSE for iteration # [26] :  42.17
MSE for iteration # [27] :  42.24
MSE for iteration # [28] :  41.98
MSE for iteration # [

In [32]:
print('The Mean of the MSE is: ', "%.2f" % np.mean(scores))

The Mean of the MSE is:  48.35


In [33]:
print('The Standard Deviation of the MSE is: ', "%.2f" % np.std(scores))

The Standard Deviation of the MSE is:  29.76
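A note on the statistic itself: `np.std` computes the population standard deviation (`ddof=0`) by default, whereas `ddof=1` gives the sample estimate. The sketch below shows the difference using only the first ten MSE values printed above, so its results are not comparable to the figures reported for the full scores list.

```python
import numpy as np

# First ten MSE values copied from the printed output above
partial_scores = [306.67, 147.25, 117.24, 96.03, 80.94,
                  70.20, 63.96, 59.60, 56.78, 54.32]

mean_mse = np.mean(partial_scores)
std_population = np.std(partial_scores)        # ddof=0, the NumPy default
std_sample = np.std(partial_scores, ddof=1)    # sample estimate

print("%.2f %.2f %.2f" % (mean_mse, std_population, std_sample))
```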
