#### A. Build a baseline model 

- Use the Keras library to build a neural network with the following:

    - One hidden layer of 10 nodes with a ReLU activation function
    - The Adam optimizer, with mean squared error as the loss function

- 1. Randomly split the data into training and test sets, holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.

- 2. Train the model on the training data using 50 epochs.

- 3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

- 4. Repeat steps 1-3 50 times to create a list of 50 mean squared errors.

- 5. Report the mean and the standard deviation of the mean squared errors.

- 6. Submit your Jupyter Notebook with your code and comments.
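
For reference, the quantities asked for in steps 3 to 5 can be written out as follows. The notation ($n_{\text{test}}$, $\hat{y}_i^{(k)}$, $\bar{m}$, $s$) is introduced here for illustration and does not come from the assignment text; the standard deviation is shown in its population form, which is what `np.std` computes by default.

```latex
\mathrm{MSE}_k = \frac{1}{n_{\text{test}}} \sum_{i=1}^{n_{\text{test}}} \left( y_i - \hat{y}_i^{(k)} \right)^2,
\qquad k = 1, \dots, 50

\bar{m} = \frac{1}{50} \sum_{k=1}^{50} \mathrm{MSE}_k,
\qquad
s = \sqrt{ \frac{1}{50} \sum_{k=1}^{50} \left( \mathrm{MSE}_k - \bar{m} \right)^2 }
```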

In [2]:
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tqdm import tqdm

In [4]:
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
concrete_data.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


In [5]:
# find some quick useful statistics
concrete_data.describe()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| count | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 |
| mean | 281.167864 | 73.895825 | 54.18835 | 181.567282 | 6.20466 | 972.918932 | 773.580485 | 45.662136 | 35.817961 |
| std | 104.506364 | 86.279342 | 63.997004 | 21.354219 | 5.973841 | 77.753954 | 80.17598 | 63.169912 | 16.705742 |
| min | 102.0 | 0.0 | 0.0 | 121.8 | 0.0 | 801.0 | 594.0 | 1.0 | 2.33 |
| 25% | 192.375 | 0.0 | 0.0 | 164.9 | 0.0 | 932.0 | 730.95 | 7.0 | 23.71 |
| 50% | 272.9 | 22.0 | 0.0 | 185.0 | 6.4 | 968.0 | 779.5 | 28.0 | 34.445 |
| 75% | 350.0 | 142.95 | 118.3 | 192.0 | 10.2 | 1029.4 | 824.0 | 56.0 | 46.135 |
| max | 540.0 | 359.4 | 200.1 | 247.0 | 32.2 | 1145.0 | 992.6 | 365.0 | 82.6 |


In [6]:
concrete_data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

In [7]:
# Split the data into predictors and target for the regression model
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column
X = predictors
y = target
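
An equivalent and arguably more direct way to make this split is `DataFrame.drop`. The sketch below uses a small made-up frame (`toy`, a hypothetical name, with a few values copied from the rows shown earlier) rather than the full concrete data, so it only illustrates the idiom.

```python
import pandas as pd

# Hypothetical miniature frame standing in for concrete_data
toy = pd.DataFrame({
    "Cement": [540.0, 332.5, 198.6],
    "Age": [28, 270, 360],
    "Strength": [79.99, 40.27, 44.30],
})

# Everything except the target column becomes the predictors
toy_predictors = toy.drop(columns=["Strength"])
toy_target = toy["Strength"]

print(list(toy_predictors.columns))
print(toy_target.shape)
```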

In [8]:
n_cols = X.shape[1]
n_cols

8

In [9]:
X.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 |


In [10]:
y.head()

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64

- Split the data into training and test sets with scikit-learn

In [11]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
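
With 1,030 rows and `test_size=0.30`, the split should leave 721 rows for training and 309 for testing. The sketch below checks that arithmetic on placeholder arrays of the same shape (zeros, not the concrete data); `X_demo` and `y_demo` are names made up for this illustration. Fixing `random_state` makes the shuffle reproducible from run to run.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shape as the concrete data (1030 rows, 8 predictors)
X_demo = np.zeros((1030, 8))
y_demo = np.zeros(1030)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=42
)
print(X_tr.shape, X_te.shape)
```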

- Build the neural network and train it for 50 epochs

In [12]:
# define regression model
def regression_model():
    # Construct the Network 
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
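
To make the architecture concrete, here is a rough NumPy sketch of the forward pass this kind of network computes: a dense layer with 10 ReLU units followed by a single linear output unit. The weights are random placeholders, not the trained Keras weights, and `forward` is a hypothetical helper name; `n_features = 8` is chosen to match `n_cols` above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden = 8, 10

# Randomly initialised placeholder weights (Keras would learn these during fit)
W1 = rng.normal(size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = np.zeros(1)

def forward(X):
    """Hidden layer with ReLU, then a single linear output unit."""
    hidden = np.maximum(0.0, X @ W1 + b1)
    return hidden @ W2 + b2

# Total number of trainable values: (8*10 + 10) + (10*1 + 1)
n_params = W1.size + b1.size + W2.size + b2.size
out = forward(rng.normal(size=(5, n_features)))
print(out.shape, n_params)
```

The linear output unit places no lower bound on the predictions, so negative strength estimates are possible with this architecture.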

In [13]:
model = regression_model()

In [14]:
model.fit(X_train, y_train, epochs=50, verbose=1)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x225d7d21640>

- Evaluate the model on test data

In [15]:
model.evaluate(X_test, y_test)



551.9075927734375

In [16]:
y_pred = model.predict(X_test)
y_pred



array([[ 28.491777  ],
       [ 56.95855   ],
       [ 64.976776  ],
       [ 56.831604  ],
       [ 61.308746  ],
       [ 61.31768   ],
       [ 34.53466   ],
       [ 50.071404  ],
       [ 39.79013   ],
       [ 58.277664  ],
       [ 34.722908  ],
       [ -9.411627  ],
       [ 93.72618   ],
       [ 38.986565  ],
       [ 25.933615  ],
       [ 20.084055  ],
       [ 64.6434    ],
       [ 13.04233   ],
       [ 38.478523  ],
       [ 15.659731  ],
       [ 17.085657  ],
       [ 35.753975  ],
       [ 59.82164   ],
       [ -0.4053175 ],
       [ 24.717928  ],
       [ 48.46129   ],
       [ -4.50498   ],
       [ 22.358934  ],
       [ 25.174566  ],
       [ 37.804817  ],
       [ 13.391096  ],
       [ 56.310307  ],
       [  6.2608204 ],
       [ 22.672525  ],
       [ 15.30559   ],
       [ 22.433886  ],
       [ 38.85054   ],
       [ 47.824333  ],
       [ 20.816072  ],
       [ 34.68489   ],
       [ 12.898531  ],
       [ 26.635683  ],
       [ 22.866976  ],
       [ 81

In [17]:
MSE = mean_squared_error(y_test, y_pred)
MSE

551.9076582591566
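
The tiny gap between the `model.evaluate` result (551.90759...) and the scikit-learn result (551.90765...) is probably a float32 versus float64 rounding effect rather than a real disagreement. For readers who want to see the metric spelled out, a minimal hand-rolled version is sketched below; `mse` is a hypothetical helper and the numbers are made up. It flattens both inputs first, because `model.predict` returns a column of shape `(n, 1)` while the targets have shape `(n,)`, and subtracting those directly would broadcast to an `(n, n)` matrix.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error; both inputs are flattened to 1-D first."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up numbers for illustration
y_true_demo = np.array([3.0, 5.0, 7.0])
y_pred_demo = np.array([[2.0], [5.0], [9.0]])  # column vector, like model.predict output

print(mse(y_true_demo, y_pred_demo))

# Without flattening, NumPy broadcasting produces a square matrix of differences
print((y_true_demo - y_pred_demo).shape)
```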

In [18]:
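# MSE is a single scalar here, so its mean is the value itself and its
# standard deviation is 0; the spread only becomes meaningful over repeated runs
mean = np.mean(MSE)
sd = np.std(MSE)
print(mean, sd)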

551.9076582591566 0.0


- Create a list of 50 mean squared errors. Report the mean and standard deviation of the mean squared errors.

In [19]:
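list_MSE = []
for i in tqdm(range(50)):
    # Use a different random split on every iteration
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=i)
    # NOTE: `model` is the network already trained above and is not re-created here,
    # so each call to fit() continues training the same weights across splits
    model.fit(X_train, y_train, epochs=50, verbose=0)
    value_MSE = model.evaluate(X_test, y_test, verbose=0)  # Keras loss (MSE) on the test split
    y_pred = model.predict(X_test)
    MSE = mean_squared_error(y_test, y_pred)  # same metric, computed with scikit-learn
    list_MSE.append(MSE)
    print(f"MSE{i+1}={value_MSE}")

mean = np.mean(list_MSE)
sd = np.std(list_MSE)

print(f"Mean: {mean}")
print(f"Standard Deviation: {sd}")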

100%|██████████| 50/50 [01:41<00:00,  2.03s/it]

MSE1=148.09921264648438
MSE2=133.16159057617188
MSE3=103.08289337158203
MSE4=98.10330200195312
MSE5=99.22122955322266
MSE6=83.41876983642578
MSE7=90.60494232177734
MSE8=70.87651824951172
MSE9=71.6685791015625
MSE10=65.60179901123047
MSE11=65.93724060058594
MSE12=55.377742767333984
MSE13=68.17366790771484
MSE14=54.721378326416016
MSE15=50.081790924072266
MSE16=46.12915802001953
MSE17=51.82428741455078
MSE18=49.01089096069336
MSE19=46.42610549926758
MSE20=54.770423889160156
MSE21=58.26739501953125
MSE22=47.6011848449707
MSE23=43.53953552246094
MSE24=45.79310989379883
MSE25=49.750831604003906
MSE26=52.15494918823242
MSE27=52.92048645019531
MSE28=44.04694366455078
MSE29=55.796607971191406
MSE30=47.88011169433594
MSE31=50.01919174194336
MSE32=41.92542266845703
MSE33=46.55187225341797
MSE34=49.713539123535156
MSE35=45.1710205078125
MSE36=51.2888298034668
MSE37=59.766483306884766
MSE38=51.25933837890625
MSE39=45.59597396850586
MSE40=43.05650329589844
MSE41=54.55913162231445
MSE42=47.73707962036133
MSE43=45.03664016723633
MSE44=52.2896728515625
MSE45=51.84748077392578
MSE46=49.66443634033203
MSE47=46.456565856933594
MSE48=49.09062576293945
MSE49=50.576622009277344
MSE50=50.984458923339844
Mean: 59.73267156194898
Standard Deviation: 22.14642378926111
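
Note that the printed errors fall steadily from about 148 to about 50 over the first dozen or so iterations. One likely reason is that the loop keeps calling `fit` on the same `model` object, so the weights carry over from one split to the next and later test sets contain rows the network has already trained on. A sketch of a variant that builds a fresh model for every repetition follows. `repeated_holdout_mse` and `LeastSquaresModel` are hypothetical names introduced here; the least-squares class is only a lightweight stand-in so the sketch runs without Keras, and the data is synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def repeated_holdout_mse(build_model, X, y, n_repeats=50, test_size=0.30, **fit_kwargs):
    """Return one test MSE per repetition, using a fresh model each time."""
    errors = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed
        )
        model = build_model()  # new, untrained model for every split
        model.fit(X_tr, y_tr, **fit_kwargs)
        y_hat = np.asarray(model.predict(X_te)).ravel()
        errors.append(mean_squared_error(y_te, y_hat))
    return errors

class LeastSquaresModel:
    """Stand-in with the same fit/predict interface (not the Keras network)."""
    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predict(self, X):
        return np.column_stack([X, np.ones(len(X))]) @ self.coef_

# Synthetic data for illustration only
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(200, 8))
y_syn = X_syn @ np.arange(1.0, 9.0) + rng.normal(scale=0.5, size=200)

errors = repeated_holdout_mse(LeastSquaresModel, X_syn, y_syn, n_repeats=5)
print(len(errors), np.mean(errors), np.std(errors))
```

With the notebook's own objects, the call would presumably look like `repeated_holdout_mse(regression_model, X, y, epochs=50, verbose=0)`. Also keep in mind that `np.std` defaults to the population standard deviation (`ddof=0`); pass `ddof=1` if the sample estimate is wanted.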



