### PART A ###  

Use the Keras library to build a neural network with the following:
- One hidden layer of 10 nodes with a ReLU activation function.
- The Adam optimizer, with mean squared error as the loss function.

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

4. Repeat steps 1-3 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
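
Step 1 holds out 30% of the rows at random. As a rough illustration of what that means, here is a NumPy-only sketch on synthetic data. It is not scikit-learn's `train_test_split` implementation; the helper name `holdout_split` and the stand-in arrays are assumptions made for this example.

```python
import numpy as np

def holdout_split(X, y, test_fraction=0.3, seed=0):
    """Shuffle the row indices and hold out a fraction of them for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Synthetic stand-in data (hypothetical): 100 rows, 7 predictor columns
X = np.arange(700, dtype=float).reshape(100, 7)
y = np.arange(100, dtype=float)

X_train, X_test, y_train, y_test = holdout_split(X, y, test_fraction=0.3, seed=0)
print(X_train.shape, X_test.shape)
```

Changing `seed` produces a different split, which is the role `random_state` plays when the split is repeated.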

In [22]:
# fetching the data
import numpy as np
import pandas as pd
data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
data.head()

|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


In [23]:
# predictor data: drops the target 'Strength' and also 'Age', leaving 7 predictor columns
predictor = data.drop(['Age','Strength'], axis=1)
# target data
target = data['Strength']
print('Predictor dataset: \n', predictor[:5])
print('target dataset : \n',target[:5])


Predictor dataset: 
    Cement  Blast Furnace Slag  Fly Ash  Water  Superplasticizer  \
0   540.0                 0.0      0.0  162.0               2.5   
1   540.0                 0.0      0.0  162.0               2.5   
2   332.5               142.5      0.0  228.0               0.0   
3   332.5               142.5      0.0  228.0               0.0   
4   198.6               132.4      0.0  192.0               0.0   

   Coarse Aggregate  Fine Aggregate  
0            1040.0           676.0  
1            1055.0           676.0  
2             932.0           594.0  
3             932.0           594.0  
4             978.4           825.5  
target dataset : 
 0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64


## Let's define the regression model used for training and evaluation


In [24]:
# Making the neural network
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
cols = 7 # we have 7 predictor variables

# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_squared_error'])
    return model
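
To make the architecture concrete: the model above maps 7 inputs to 10 ReLU units and then to a single linear output. A minimal NumPy sketch of that forward pass and its parameter count follows. The random weights and the helper name `forward` are placeholders for illustration, not the values Keras would initialise or learn.

```python
import numpy as np

n_inputs, n_hidden, n_outputs = 7, 10, 1

# Placeholder parameters (hypothetical values, for shape illustration only)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)

def forward(x):
    """Hidden layer with ReLU, followed by a linear output layer."""
    hidden = np.maximum(0.0, x @ W1 + b1)
    return hidden @ W2 + b2

# (7*10 + 10) weights and biases in the hidden layer, (10*1 + 1) in the output layer
n_params = W1.size + b1.size + W2.size + b2.size
print(n_params)  # 91

batch = rng.normal(size=(5, n_inputs))
print(forward(batch).shape)  # (5, 1)
```

The output has shape `(n_samples, 1)` rather than `(n_samples,)`, which is worth keeping in mind when comparing predictions with a one-dimensional target array.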

#### This loop runs 50 times to complete Part A

In [26]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

MSEs = []
for i in range(50):
    # split the data afresh on every iteration; random_state=i gives a different, reproducible split each time
    x_train, x_test, y_train, y_test = train_test_split(predictor.values, target.values, test_size=0.3, random_state=i)

    # train a freshly initialised network for 50 epochs
    partA_Model = regression_model()
    partA_Model.fit(x_train, y_train, epochs=50, verbose=0)

    # evaluate the network on the test data
    predict = partA_Model.predict(x_test)
    # NOTE: np.sqrt turns the MSE into a root mean squared error (RMSE),
    # so the values printed and stored as "MSE" are actually RMSEs
    MSE = np.sqrt(mean_squared_error(y_test, predict))
    print("Trial ", i+1, " MSE : ", MSE)
    MSEs.append(MSE)



Trial  1  MSE :  17.581385887045826
Trial  2  MSE :  13.084057846332314
Trial  3  MSE :  14.343670799261544
Trial  4  MSE :  15.933968188705334
Trial  5  MSE :  35.48259676814836
Trial  6  MSE :  12.45726582776063
Trial  7  MSE :  27.677509255010715
Trial  8  MSE :  14.0392274471008
Trial  9  MSE :  22.387871766140314
Trial  10  MSE :  13.662487043902496
Trial  11  MSE :  12.759913877408477
Trial  12  MSE :  13.713757302375019
Trial  13  MSE :  12.541009186331689
Trial  14  MSE :  13.512013882797932
Trial  15  MSE :  17.442255344007805
Trial  16  MSE :  27.190354500264736
Trial  17  MSE :  12.515122446535504
Trial  18  MSE :  22.009590932397924
Trial  19  MSE :  12.200793941138794
Trial  20  MSE :  43.04766625684156
Trial  21  MSE :  13.02470709388921
Trial  22  MSE :  18.325302743259062
Trial  23  MSE :  13.193015637342826
Trial  24  MSE :  32.291013777478746
Trial  25  MSE :  13.54908930297515
Trial  26  MSE :  12.573097182290388
Trial  27  MSE :  17.894126958302362
Trial  28  MSE : 
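
Because the loop wraps `mean_squared_error` in `np.sqrt`, the figures printed as "MSE" are root mean squared errors (RMSE), expressed in the same units as the strength target. A small sketch with made-up numbers shows how the two relate; the arrays are illustrative only and are not taken from the dataset.

```python
import numpy as np

# Made-up targets and predictions (hypothetical values)
y_true = np.array([40.0, 50.0, 60.0, 70.0])
y_pred = np.array([42.0, 47.0, 60.0, 74.0])

mse = np.mean((y_true - y_pred) ** 2)  # mean of the squared errors
rmse = np.sqrt(mse)                    # square root brings it back to the target's units
print(mse, rmse)

# A model with one output unit returns predictions of shape (n, 1).
# Subtracting a (4, 1) array from a (4,) array would broadcast to (4, 4),
# so flatten the predictions before a manual calculation.
pred_2d = y_pred.reshape(-1, 1)
mse_flat = np.mean((y_true - pred_2d.ravel()) ** 2)
print(mse_flat)
```

Squaring any of the reported per-trial values recovers the corresponding MSE.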

In [27]:
# calculating the mean and standard deviation of the recorded errors (np.sqrt in the loop makes these RMSE values)
partA_MSE_mean = np.mean(MSEs)
partA_MSE_std = np.std(MSEs)
print(" Part A : Mean of MSEs : ", partA_MSE_mean, ", Std deviation of MSEs : ", partA_MSE_std)

 Part A : Mean of MSEs :  18.41812414919169 , Std deviation of MSEs :  9.341542598296293
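
One detail about the spread reported here: `np.std` computes the population standard deviation by default (`ddof=0`), while the sample standard deviation (`ddof=1`) is slightly larger for a finite number of repetitions. A short sketch using the first five trial values printed earlier (five values are chosen only to keep the example small):

```python
import numpy as np

# First five per-trial values from the output of the loop
scores = np.array([17.581385887045826, 13.084057846332314, 14.343670799261544,
                   15.933968188705334, 35.48259676814836])

mean = scores.mean()
pop_std = scores.std()           # ddof=0, the np.std default
sample_std = scores.std(ddof=1)  # divides by n - 1 instead of n
print(mean, pop_std, sample_std)
```

Which of the two to report is a judgment call; stating the choice alongside the number avoids ambiguity.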
