# Build a Regression Model in Keras

**A. Build a baseline model**

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes, and a ReLU activation function

- Use the Adam optimizer and the mean squared error as the loss function.

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
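
Step 4 only yields 50 different random splits if the split itself changes between repetitions. A minimal sketch of that idea with NumPy alone, assuming a hypothetical helper `random_split_indices` (not part of Scikit-learn) that is seeded differently on each repetition, and assuming a dataset of 1030 rows:

```python
import numpy as np

def random_split_indices(n_rows, test_fraction, seed):
    """Hypothetical helper: shuffle row indices and hold out a test fraction."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(test_fraction * n_rows))
    # First n_test shuffled rows form the test set, the rest the training set
    return shuffled[n_test:], shuffled[:n_test]

# One (train, test) index pair per repetition, each drawn with its own seed
splits = [random_split_indices(1030, 0.3, seed) for seed in range(50)]
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

With `train_test_split`, the same effect comes from omitting `random_state` or passing a different value on each repetition.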

In [1]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

In [2]:
import keras
from keras.models import Sequential
from keras.layers import Dense
# from keras.utils import to_categorical
from tensorflow.keras.utils import to_categorical

In [3]:
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

In [4]:
from sklearn.metrics import mean_squared_error

# Import DATA

In [5]:
data = pd.read_csv('../input/concrete-data/concrete_data.csv')

In [6]:
data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


# X and y arrays

In [7]:
X = data.drop(['Strength'], axis=1)
y = data[['Strength']]

In [8]:
X

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360
...,...,...,...,...,...,...,...,...
1025,276.4,116.0,90.3,179.6,8.9,870.1,768.3,28
1026,322.2,0.0,115.6,196.0,10.4,817.9,813.4,28
1027,148.5,139.4,108.6,192.7,6.1,892.4,780.0,28
1028,159.1,186.7,0.0,175.6,11.3,989.6,788.9,28


In [9]:
y

Unnamed: 0,Strength
0,79.99
1,61.89
2,40.27
3,41.05
4,44.30
...,...
1025,44.28
1026,31.18
1027,23.70
1028,32.77


In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# A. Model

In [11]:
X_train.shape

(721, 8)

In [12]:
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, input_dim=8, activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

In [13]:
model = regression_model()

2022-04-24 14:32:12.700594: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.


In [14]:
model.fit(X_train, y_train, epochs=50)

2022-04-24 14:32:13.046411: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)


Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7fd405cd1c50>

In [15]:
y_pred = model.predict(X_test)

In [16]:
result = np.sqrt(mean_squared_error(y_test,y_pred))
result

10.853062862614655
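
The cell above takes the square root of the Scikit-learn result, so the reported number is a root mean squared error (RMSE), in the same units as the target, rather than the MSE itself. A small NumPy sketch of the relationship, using made-up values rather than the notebook's data:

```python
import numpy as np

# Made-up targets and predictions, for illustration only
y_true = np.array([3.0, 5.0, 7.0])
y_hat = np.array([2.0, 5.0, 9.0])

mse = np.mean((y_true - y_hat) ** 2)   # mean of the squared residuals
rmse = np.sqrt(mse)                    # back in the units of y

print(mse, rmse)
```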

In [17]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 10)                90        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 11        
Total params: 101
Trainable params: 101
Non-trainable params: 0
_________________________________________________________________
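
The parameter counts in this summary follow from the layer sizes: a dense layer with $n_{in}$ inputs and $n_{out}$ units has $n_{in} \times n_{out}$ weights plus $n_{out}$ biases. For the 8 input features, 10 hidden nodes, and single output used here:

```latex
\begin{aligned}
\text{hidden layer:} &\quad 8 \times 10 + 10 = 90 \\
\text{output layer:} &\quad 10 \times 1 + 1 = 11 \\
\text{total:} &\quad 90 + 11 = 101
\end{aligned}
```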


# A. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

In [18]:
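# Note: despite its name, MSE_List collects sqrt(mse) values, i.e. RMSE
MSE_List = []
for i in range(50):
    # 1-Split Data: random_state=42 is fixed, so every iteration reuses the same
    # split; the spread in results comes from weight initialization and batch shuffling
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)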
    
    model = regression_model()
    #2-Train:
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    #Prediction:
    y_pred = model.predict(X_test)
    
    #3-Evaluate_Model:
    result = np.sqrt(mean_squared_error(y_test,y_pred))
    print("{}: sqrt(mse) = {}".format(i+1,result))
    MSE_List.append(result)
    print("***_________________________________***\n\n\n")

1: sqrt(mse) = 10.664243658087049
***_________________________________***



2: sqrt(mse) = 11.605549727571963
***_________________________________***



3: sqrt(mse) = 11.565861060333079
***_________________________________***



4: sqrt(mse) = 8.419272054239261
***_________________________________***



5: sqrt(mse) = 9.664213370439887
***_________________________________***



6: sqrt(mse) = 8.533265188832463
***_________________________________***



7: sqrt(mse) = 12.152419860565182
***_________________________________***



8: sqrt(mse) = 9.700695024463416
***_________________________________***



9: sqrt(mse) = 11.14576200987155
***_________________________________***



10: sqrt(mse) = 11.398950055887488
***_________________________________***



11: sqrt(mse) = 11.118121044225038
***_________________________________***



12: sqrt(mse) = 8.515814010345107
***_________________________________***



13: sqrt(mse) = 8.891497778994072
***_________________________________***



14

In [19]:
MSE_List

[10.664243658087049,
 11.605549727571963,
 11.565861060333079,
 8.419272054239261,
 9.664213370439887,
 8.533265188832463,
 12.152419860565182,
 9.700695024463416,
 11.14576200987155,
 11.398950055887488,
 11.118121044225038,
 8.515814010345107,
 8.891497778994072,
 11.017550132112262,
 11.311376718596149,
 9.182897963962555,
 12.244606623248039,
 9.305711122670669,
 8.580787720067482,
 9.680962254226587,
 10.607101734375828,
 11.484870884493324,
 10.251451405941587,
 10.368117133674227,
 11.152523968092448,
 11.847017501787104,
 9.993613794264004,
 10.466825454793419,
 10.42490255134816,
 8.35267305999843,
 10.634486972166847,
 10.708598718815697,
 10.186694187360896,
 9.895720408259843,
 11.57569432495493,
 10.099801321115134,
 9.386244960490798,
 10.475356030998045,
 10.6805514106967,
 11.551891834454159,
 10.540729454959537,
 11.71774285589446,
 8.454966801081566,
 10.729362960506185,
 8.735691872804862,
 10.631966508796024,
 8.427162398833168,
 10.026525168707286,
 8.9692493254480

# A. Results
**Report the mean and standard deviation of the sqrt(MSE) values (RMSE)**

In [20]:
# Calculate the mean and the standard deviation of the metric on the 50 samplings
mean_sqmse_A = np.mean(MSE_List)
std_sqmse_A  = np.std(MSE_List)

# Generate a data frame to store the results of the different parts of this project
df_results = pd.DataFrame.from_dict({"Part": ["A"],"mean_sq_mse": [mean_sqmse_A], "std_sq_mse": [std_sqmse_A]})
df_results

Unnamed: 0,Part,mean_sq_mse,std_sq_mse
0,A,10.267254,1.092846
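
`np.std` defaults to the population standard deviation (`ddof=0`). If the 50 runs are treated as a sample, `ddof=1` gives the slightly larger sample estimate. A short sketch using only the first five values printed above, so the numbers differ from the full 50-run table:

```python
import numpy as np

# First five sqrt(mse) values from the Part A run (subset for illustration)
scores = np.array([10.664243658087049, 11.605549727571963,
                   11.565861060333079, 8.419272054239261,
                   9.664213370439887])

mean_score = np.mean(scores)
pop_std = np.std(scores)             # ddof=0, as used in this notebook
sample_std = np.std(scores, ddof=1)  # divides by n - 1 instead of n

print(mean_score, pop_std, sample_std)
```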


# **Part B**

**B. Normalize the data** 

Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.
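
The standardization described here can be sketched directly with NumPy. The three rows below reuse the Cement, Water, and Age values from the `data.head()` output; the variable names are illustrative. For contrast, the last lines show unit-norm row scaling, which is what `sklearn.preprocessing.normalize` does by default and is a different operation:

```python
import numpy as np

# Cement, Water, Age for three rows of the concrete data shown earlier
X_small = np.array([[540.0, 162.0, 28.0],
                    [332.5, 228.0, 270.0],
                    [198.6, 192.0, 360.0]])

# Standardize: subtract each column's mean, divide by its standard deviation
col_mean = X_small.mean(axis=0)
col_std = X_small.std(axis=0)
X_standardized = (X_small - col_mean) / col_std

# Unit-norm scaling: divide each row by its Euclidean length
row_norm = np.linalg.norm(X_small, axis=1, keepdims=True)
X_unit_rows = X_small / row_norm

print(X_standardized.mean(axis=0), X_standardized.std(axis=0))
```

In practice the column means and standard deviations are usually computed on the training split only and then applied to the test split.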

# B. Normalize the data

In [21]:
from sklearn import preprocessing

In [22]:
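# normalize() scales to unit L2 norm (each row of X by default); it is not z-score standardization
X = preprocessing.normalize(X)
# axis=0 divides the Strength column by its L2 norm, so errors below are not in Part A's units
y = preprocessing.normalize(y, axis=0)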

In [23]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# B. Model

In [24]:
X_train.shape

(721, 8)

In [25]:
model = regression_model()

In [26]:
model.fit(X_train, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7fd3fcdc6890>

In [27]:
y_pred = model.predict(X_test)

In [28]:
result = np.sqrt(mean_squared_error(y_test,y_pred))
result

0.00863247260380448

In [29]:
model.summary()

Model: "sequential_51"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_102 (Dense)            (None, 10)                90        
_________________________________________________________________
dense_103 (Dense)            (None, 1)                 11        
Total params: 101
Trainable params: 101
Non-trainable params: 0
_________________________________________________________________


# B. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

In [30]:
MSE_List = []
for i in range(50):
    #1-Split Data (random_state=42 reuses the same split on every iteration):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    
    model = regression_model()
    #2-Train:
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    #Prediction:
    y_pred = model.predict(X_test)
    
    #3-Evaluate_Model:
    result = np.sqrt(mean_squared_error(y_test,y_pred))
    print("{}: sqrt(mse) = {}".format(i+1,result))
    MSE_List.append(result)
    print("***_________________________________***\n\n\n")

1: sqrt(mse) = 0.007909379517048195
***_________________________________***



2: sqrt(mse) = 0.008533618392796337
***_________________________________***



3: sqrt(mse) = 0.008513936453270988
***_________________________________***



4: sqrt(mse) = 0.009359720060464285
***_________________________________***



5: sqrt(mse) = 0.008619462996661604
***_________________________________***



6: sqrt(mse) = 0.00688478053385421
***_________________________________***



7: sqrt(mse) = 0.006026296978382181
***_________________________________***



8: sqrt(mse) = 0.009641193938254489
***_________________________________***



9: sqrt(mse) = 0.007082819985594648
***_________________________________***



10: sqrt(mse) = 0.008191494277055303
***_________________________________***



11: sqrt(mse) = 0.00702235609485498
***_________________________________***



12: sqrt(mse) = 0.006734853758314344
***_________________________________***



13: sqrt(mse) = 0.007015871280280298
***___________

In [31]:
MSE_List

[0.007909379517048195,
 0.008533618392796337,
 0.008513936453270988,
 0.009359720060464285,
 0.008619462996661604,
 0.00688478053385421,
 0.006026296978382181,
 0.009641193938254489,
 0.007082819985594648,
 0.008191494277055303,
 0.00702235609485498,
 0.006734853758314344,
 0.007015871280280298,
 0.008425528527530715,
 0.009138117539065183,
 0.008191048059009404,
 0.009222442974039891,
 0.008364126820533491,
 0.008170136802384879,
 0.008301839144430067,
 0.007825165135129028,
 0.008072763635128336,
 0.006490360033445576,
 0.010157307948082037,
 0.007019877545077951,
 0.008573299223209267,
 0.007026322828194117,
 0.00882040618465442,
 0.008757386429685628,
 0.007825890503823856,
 0.008735629373828355,
 0.008689390705545768,
 0.007758908795825488,
 0.006625296102282337,
 0.009545445166794819,
 0.008475494959586544,
 0.008080227269466516,
 0.006844086212712497,
 0.007749655531659444,
 0.009077001602464572,
 0.010200846853522145,
 0.007489005166258004,
 0.008633864463437816,
 0.00749927437

# B. Results
**Report the mean and standard deviation of the sqrt(MSE) values (RMSE)**

In [32]:
# Calculate the mean and the standard deviation of the metric on the 50 samplings
mean_sqmse_B = np.mean(MSE_List)
std_sqmse_B  = np.std(MSE_List)

# Generate a data frame to store the results of the different parts of this project
df_results = pd.DataFrame.from_dict({"Part": ["B"],"mean_sq_mse": [mean_sqmse_B], "std_sq_mse": [std_sqmse_B]})
df_results

Unnamed: 0,Part,mean_sq_mse,std_sq_mse
0,B,0.008122,0.000954


# B.Q: How does the mean of the mean squared errors compare to that from Step A?

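### Both mean_sq_mse and std_sq_mse are much smaller than in Part A, but largely because `y` itself was rescaled by `preprocessing.normalize(y, axis=0)`: the Part B errors are no longer in the original strength units, so they cannot be compared directly with Part A.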
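
A sketch with synthetic numbers (not the notebook's data) of how dividing the target by its L2 norm, as `preprocessing.normalize(y, axis=0)` does, shrinks the RMSE by exactly that factor, and how multiplying by the norm maps it back to original units:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.uniform(10.0, 80.0, size=200)        # synthetic targets
y_hat = y_true + rng.normal(0.0, 5.0, size=200)   # synthetic predictions

def rmse(a, b):
    """Root mean squared error between two arrays."""
    return np.sqrt(np.mean((a - b) ** 2))

y_norm = np.linalg.norm(y_true)   # L2 norm used to rescale the targets
rmse_original = rmse(y_true, y_hat)
rmse_scaled = rmse(y_true / y_norm, y_hat / y_norm)

# Multiplying by the norm recovers the error in original units
print(rmse_original, rmse_scaled * y_norm)
```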

# Part C

**C. Increase the number of epochs**

Repeat Part B but use 100 epochs this time for training.

# C.Model

In [33]:
model = regression_model()

In [34]:
model.fit(X_train, y_train, epochs=100)

Epoch 1/100
...
Epoch 77/100
Epoch 78

<keras.callbacks.History at 0x7fd3fc2f5510>

In [35]:
y_pred = model.predict(X_test)

In [36]:
result = np.sqrt(mean_squared_error(y_test,y_pred))
result

0.0061136355458137935

In [37]:
model.summary()

Model: "sequential_102"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_204 (Dense)            (None, 10)                90        
_________________________________________________________________
dense_205 (Dense)            (None, 1)                 11        
Total params: 101
Trainable params: 101
Non-trainable params: 0
_________________________________________________________________


# C. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

In [38]:
MSE_List = []
for i in range(50):
    #1-Split Data (random_state=42 reuses the same split on every iteration):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    
    model = regression_model()
    #2-Train:
    model.fit(X_train, y_train, epochs=100, verbose=0)
    
    #Prediction:
    y_pred = model.predict(X_test)
    
    #3-Evaluate_Model:
    result = np.sqrt(mean_squared_error(y_test,y_pred))
    print("{}: sqrt(mse) = {}".format(i+1,result))
    MSE_List.append(result)
    print("***_________________________________***\n\n\n")

1: sqrt(mse) = 0.008226750747217062
***_________________________________***



2: sqrt(mse) = 0.005933430673047532
***_________________________________***



3: sqrt(mse) = 0.0076200664032270455
***_________________________________***



4: sqrt(mse) = 0.006291693919763972
***_________________________________***



5: sqrt(mse) = 0.00692570433550958
***_________________________________***



6: sqrt(mse) = 0.0062468890485515415
***_________________________________***



7: sqrt(mse) = 0.007204559090801533
***_________________________________***



8: sqrt(mse) = 0.007376079804048551
***_________________________________***



9: sqrt(mse) = 0.005662847641679397
***_________________________________***



10: sqrt(mse) = 0.00819448454226141
***_________________________________***



11: sqrt(mse) = 0.00836682497592288
***_________________________________***



12: sqrt(mse) = 0.0066073476612303035
***_________________________________***



13: sqrt(mse) = 0.008227765087812517
***_________

In [39]:
MSE_List

[0.008226750747217062,
 0.005933430673047532,
 0.0076200664032270455,
 0.006291693919763972,
 0.00692570433550958,
 0.0062468890485515415,
 0.007204559090801533,
 0.007376079804048551,
 0.005662847641679397,
 0.00819448454226141,
 0.00836682497592288,
 0.0066073476612303035,
 0.008227765087812517,
 0.006403228644704123,
 0.008184239012582592,
 0.0064828994187938775,
 0.00811918066239141,
 0.007560446891853981,
 0.008188171841293359,
 0.006127127882741404,
 0.008404975533214948,
 0.006672755462128375,
 0.006707032277845725,
 0.0077545736537258285,
 0.007572511940778444,
 0.006875440130331893,
 0.007405614618799181,
 0.008116903313563854,
 0.008255430825468807,
 0.007239083526594434,
 0.0061360150048272895,
 0.006186995722274313,
 0.006562902495109832,
 0.005903943445420029,
 0.00699273641278961,
 0.00820740612382756,
 0.007393673716268863,
 0.00666314973136513,
 0.00861878305153046,
 0.007072404979156957,
 0.006353312639153373,
 0.00812341950553503,
 0.00693272797477989,
 0.007276765355

# C. Results
**Report the mean and standard deviation of the sqrt(MSE) values (RMSE)**

In [40]:
# Calculate the mean and the standard deviation of the metric on the 50 samplings
mean_sqmse_C = np.mean(MSE_List)
std_sqmse_C  = np.std(MSE_List)

# Generate a data frame to store the results of the different parts of this project
df_results = pd.DataFrame.from_dict({"Part": ["C"],"mean_sq_mse": [mean_sqmse_C], "std_sq_mse": [std_sqmse_C]})
df_results

Unnamed: 0,Part,mean_sq_mse,std_sq_mse
0,C,0.007222,0.000844


# C.Q: How does the mean of the mean squared errors compare to that from Step B?
### The mean_sq_mse decreased modestly (0.008122 to 0.007222), and std_sq_mse also decreased slightly (0.000954 to 0.000844).

# Part D
**D. Increase the number of hidden layers**

Repeat Part B but use a neural network with the following instead:

- Three hidden layers, each with 10 nodes and a ReLU activation function.

# D. Model

In [41]:
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, input_dim=8, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

In [42]:
model = regression_model()

In [43]:
model.fit(X_train, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7fd3e06e6290>

In [44]:
y_pred = model.predict(X_test)

In [45]:
result = np.sqrt(mean_squared_error(y_test,y_pred))
result

0.0077174860612684824

In [46]:
model.summary()

Model: "sequential_153"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_306 (Dense)            (None, 10)                90        
_________________________________________________________________
dense_307 (Dense)            (None, 10)                110       
_________________________________________________________________
dense_308 (Dense)            (None, 10)                110       
_________________________________________________________________
dense_309 (Dense)            (None, 1)                 11        
Total params: 321
Trainable params: 321
Non-trainable params: 0
_________________________________________________________________
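
The totals in this summary can be checked by hand. A small sketch, assuming a hypothetical helper `dense_param_count` that counts weights plus biases for a stack of fully connected layers:

```python
def dense_param_count(layer_sizes):
    """Hypothetical helper: parameters per dense layer (weights + biases)."""
    counts = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        counts.append(n_in * n_out + n_out)
    return counts

# 8 inputs -> three hidden layers of 10 nodes -> 1 output
per_layer = dense_param_count([8, 10, 10, 10, 1])
print(per_layer, sum(per_layer))   # [90, 110, 110, 11] 321
```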


# D. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

In [47]:
MSE_List = []
for i in range(50):
    #1-Split Data (random_state=42 reuses the same split on every iteration):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    
    model = regression_model()
    #2-Train:
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    #Prediction:
    y_pred = model.predict(X_test)
    
    #3-Evaluate_Model:
    result = np.sqrt(mean_squared_error(y_test,y_pred))
    print("{}: sqrt(mse) = {}".format(i+1,result))
    MSE_List.append(result)
    print("***_________________________________***\n\n\n")

1: sqrt(mse) = 0.006230406989858119
***_________________________________***



2: sqrt(mse) = 0.006953670956050892
***_________________________________***



3: sqrt(mse) = 0.007687770800226865
***_________________________________***



4: sqrt(mse) = 0.006832071994583188
***_________________________________***



5: sqrt(mse) = 0.008261719655426472
***_________________________________***



6: sqrt(mse) = 0.006136855025163935
***_________________________________***



7: sqrt(mse) = 0.006801128115453124
***_________________________________***



8: sqrt(mse) = 0.006729118753173897
***_________________________________***



9: sqrt(mse) = 0.007686537358671594
***_________________________________***



10: sqrt(mse) = 0.007293690791444673
***_________________________________***



11: sqrt(mse) = 0.006817313735726845
***_________________________________***



12: sqrt(mse) = 0.007045622062330836
***_________________________________***



13: sqrt(mse) = 0.007735181311010573
***_________

In [48]:
MSE_List

[0.006230406989858119,
 0.006953670956050892,
 0.007687770800226865,
 0.006832071994583188,
 0.008261719655426472,
 0.006136855025163935,
 0.006801128115453124,
 0.006729118753173897,
 0.007686537358671594,
 0.007293690791444673,
 0.006817313735726845,
 0.007045622062330836,
 0.007735181311010573,
 0.0070157425208156035,
 0.006505712542647821,
 0.00656378787307275,
 0.007419241245878189,
 0.00702026768661621,
 0.00639379108247713,
 0.0063965466440104035,
 0.008152835061097007,
 0.007510613859611554,
 0.008313758040298313,
 0.006276500071263641,
 0.006755201659940734,
 0.007844029340675913,
 0.006780580963661937,
 0.006776437992813839,
 0.007138227303496787,
 0.006929345155786663,
 0.007968260346121098,
 0.006209845880472454,
 0.006740580229523551,
 0.00640616597365763,
 0.00683950271909451,
 0.007180853064954085,
 0.0089535318159834,
 0.008204231006794713,
 0.007367611035291861,
 0.007746633444577771,
 0.00972952919388921,
 0.006046885708886953,
 0.007321902337899094,
 0.00718939960486

# D. Result

In [49]:
# Calculate the mean and the standard deviation of the metric on the 50 samplings
mean_sqmse_D = np.mean(MSE_List)
std_sqmse_D  = np.std(MSE_List)

# Generate a data frame to store the results of the different parts of this project
df_results = pd.DataFrame.from_dict({"Part": ["D"],"mean_sq_mse": [mean_sqmse_D], "std_sq_mse": [std_sqmse_D]})
df_results

Unnamed: 0,Part,mean_sq_mse,std_sq_mse
0,D,0.007131,0.000759


# **D.Q: How does the mean of the mean squared errors compare to that from Step B?**
### Both mean_sq_mse (0.008122 to 0.007131) and std_sq_mse (0.000954 to 0.000759) are lower than in Part B.

# **All Results**

In [50]:
Results = pd.DataFrame.from_dict({"Part": ["A","B", "C", "D"],
                                  "mean_sq_mse": [mean_sqmse_A, mean_sqmse_B, mean_sqmse_C, mean_sqmse_D], 
                                  "std_sq_mse": [std_sqmse_A, std_sqmse_B, std_sqmse_C, std_sqmse_D]})
Results

Unnamed: 0,Part,mean_sq_mse,std_sq_mse
0,A,10.267254,1.092846
1,B,0.008122,0.000954
2,C,0.007222,0.000844
3,D,0.007131,0.000759
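
To put the differences between Parts B, C, and D in perspective, the relative change in the mean sqrt(MSE) can be computed from the table. Part A is left out because its errors are measured on the original, unscaled target. A minimal sketch using the rounded values shown above; the helper name `relative_reduction` is illustrative:

```python
# Rounded mean sqrt(mse) values copied from the results table
mean_rmse = {"B": 0.008122, "C": 0.007222, "D": 0.007131}

def relative_reduction(baseline, value):
    """Fractional decrease of value relative to baseline."""
    return (baseline - value) / baseline

for part in ("C", "D"):
    change = relative_reduction(mean_rmse["B"], mean_rmse[part])
    print(f"Part {part} vs Part B: {change:.1%} lower mean RMSE")
```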
