### ***Loading required data***

In [1]:
# fetching the data
import numpy as np
import pandas as pd
data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [2]:
# Predictors (every column except 'Age' and 'Strength') and target ('Strength')
predictors = data.drop(['Age','Strength'], axis=1)
target = data['Strength']
predictors.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5


### ***Setting up functions to run through part A to part D***
* Every part calls the function `evaluate()`.
* Its arguments set the parameters that vary across the four scenarios.

### ***Implementing the evaluate function***
***Inputs:***
* `hidden_layers` (int): number of hidden layers (each with 10 nodes and ReLU activation)
* `normalized_data` (bool): whether to train on normalized data
* `epochs` (int): number of training epochs
* `Question` (str): label of the part being run (e.g. 'Part A', 'Part B')

In [3]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Keras via TensorFlow; if that is not available, comment out these two lines
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# standalone Keras alternative
# from keras.models import Sequential
# from keras.layers import Dense

cols = 7 # we have 7 predictor variables

def evaluate(hidden_layers, normalized_data, epochs, Question):

    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(cols,))) # first hidden layer
    # remaining hidden layers as required (the range is empty when hidden_layers is 1)
    for _ in range(hidden_layers - 1):
        model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_squared_error'])

    # printing status
    print(hidden_layers, " hidden layers each with 10 nodes & 'relu' activation ||  epochs = ", epochs, " || is data normalized? : ", normalized_data)
    print("Optimizer = adam || loss function = mean_squared_error")

    MSEs = []
    for i in range(0,50):
        
        # deciding if the data is normalized or not
        if(normalized_data):
            normalized_predictors = (predictors - predictors.mean()) / predictors.std()
            x_train, x_test, y_train, y_test = train_test_split(normalized_predictors, target, test_size=0.3, random_state=i)
        else:
            x_train, x_test, y_train, y_test = train_test_split(predictors, target, test_size=0.3, random_state=i)

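        # training the neural network with specified epochs
        # note: the model is created once, above this loop, so every trial keeps training the same weights
        model.fit(x_train, y_train, epochs=epochs, verbose=0)

        # evaluating the network on the test data
        predict = model.predict(x_test)
        # note: np.sqrt makes this value the root of the MSE (RMSE), although it is labelled MSE
        MSE = np.sqrt(mean_squared_error(y_test, predict))
        if((i+1)%5==0 or i==0 or i==49):
            print('Trial No: ', i+1, ', value of MSE : ', MSE)
        MSEs.append(MSE)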
        
    # calculating mean and std deviation of the per-trial errors
    mean_MSEs = np.mean(MSEs)
    std_MSEs = np.std(MSEs)
    print(Question," : Mean of MSEs : ", mean_MSEs, ", Std deviation of MSEs : ", std_MSEs)
    return [mean_MSEs, std_MSEs]
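
The value stored in `MSE` inside `evaluate()` is passed through `np.sqrt`, so it is the root mean squared error (RMSE) rather than the mean squared error itself. A minimal sketch of the difference, using NumPy only; the two arrays are made-up illustrative numbers, not values from the concrete dataset:

```python
import numpy as np

# illustrative values only, not taken from the concrete dataset
y_true = np.array([3.0, 5.0])
y_pred = np.array([1.0, 1.0])

mse = np.mean((y_true - y_pred) ** 2)   # mean of the squared errors: (4 + 16) / 2
rmse = np.sqrt(mse)                     # square root of the MSE, in the units of the target

print("MSE :", mse)
print("RMSE:", rmse)
```

Because RMSE is in the same units as the target, it is often easier to read, but it should be labelled as such when reported.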


### PART A ###  

- One hidden layer of 10 nodes, and a ReLU activation function
- Use the adam optimizer and the mean squared error as the loss function.

1. Randomly split the data into training and test sets, holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.
2. Train the model on the training data using 50 epochs.
3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.
4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.
5. Report the mean and the standard deviation of the mean squared errors.
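
Steps 1 to 5 can be sketched as a small loop in which the model is fitted from scratch on every split, so that the 50 errors come from 50 separately trained models. Note that `evaluate()` above builds its network once and keeps training it across trials. The sketch below is an assumption-laden illustration, not the notebook's code: `repeated_holdout` and `least_squares_fit_predict` are hypothetical names, the data is synthetic, and an ordinary least-squares model stands in for the Keras network so the example runs with NumPy alone.

```python
import numpy as np

def repeated_holdout(X, y, fit_predict, n_trials=50, test_fraction=0.3):
    """Return one test MSE per trial; fit_predict is called afresh on every trial."""
    n_test = int(round(test_fraction * len(X)))
    mses = []
    for seed in range(n_trials):
        # step 1: random split, seeded per trial
        order = np.random.default_rng(seed).permutation(len(X))
        test_idx, train_idx = order[:n_test], order[n_test:]
        # step 2: train a new model on the training split and predict the test split
        y_hat = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        # step 3: mean squared error on the test split
        mses.append(np.mean((y[test_idx] - y_hat) ** 2))
    # step 4: one error per trial
    return np.array(mses)

def least_squares_fit_predict(X_train, y_train, X_test):
    """Hypothetical stand-in model: ordinary least squares with an intercept."""
    def add_bias(A):
        return np.hstack([A, np.ones((len(A), 1))])
    coef, *_ = np.linalg.lstsq(add_bias(X_train), y_train, rcond=None)
    return add_bias(X_test) @ coef

# synthetic data, only to make the sketch runnable
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))
y = X @ rng.normal(size=7) + rng.normal(scale=0.5, size=200)

mses = repeated_holdout(X, y, least_squares_fit_predict)
# step 5: report mean and standard deviation
print("mean of MSEs:", mses.mean(), "std of MSEs:", mses.std())
```

To use a neural network instead, `fit_predict` would build, compile, fit, and predict with a new model on each call.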

In [4]:
PartA_MSEs_details = evaluate(hidden_layers = 1, normalized_data = False, epochs = 50, Question = 'Part A')

1  hidden layers each with 10 nodes & 'relu' activation ||  epochs =  50  || is data normalized? :  False
Optimizer = adam || loss function = mean_squared_error
Trial No:  1 , value of MSE :  34.628927613384924
Trial No:  5 , value of MSE :  13.968830317663851
Trial No:  10 , value of MSE :  12.866930818251198
Trial No:  15 , value of MSE :  12.39582582069461
Trial No:  20 , value of MSE :  12.887446474062443
Trial No:  25 , value of MSE :  11.94899032384469
Trial No:  30 , value of MSE :  12.348055800048865
Trial No:  35 , value of MSE :  12.738469010710407
Trial No:  40 , value of MSE :  12.04989928859077
Trial No:  45 , value of MSE :  12.202842508708086
Trial No:  50 , value of MSE :  12.366386180123483
Part A  : Mean of MSEs :  13.328870325412963 , Std deviation of MSEs :  3.5753885936607377


### PART B : Normalize the data & repeat A

- Repeat Part A but use a normalized version of the data.
- How does the mean of the mean squared errors compare to that from Step A?
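
In `evaluate()`, normalization is done with `(predictors - predictors.mean()) / predictors.std()` on the full dataset before splitting. A common alternative, sketched here as an assumption rather than as the notebook's method, computes the mean and standard deviation on the training split only and applies them to both splits, so the test rows do not influence the scaling. The function name and the synthetic arrays are hypothetical.

```python
import numpy as np

def standardize_with_train_stats(x_train, x_test):
    """Z-score both splits using the mean and std of the training split only."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0, ddof=1)   # ddof=1 matches the pandas .std() default
    return (x_train - mean) / std, (x_test - mean) / std

# synthetic features, only to make the sketch runnable
rng = np.random.default_rng(0)
x_train = rng.normal(loc=300.0, scale=100.0, size=(70, 7))
x_test = rng.normal(loc=300.0, scale=100.0, size=(30, 7))

x_train_n, x_test_n = standardize_with_train_stats(x_train, x_test)
print(np.abs(x_train_n.mean(axis=0)).max())   # very close to 0
print(x_train_n.std(axis=0, ddof=1))          # 1 for every column, up to rounding
```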

In [5]:
PartB_MSEs_details = evaluate(hidden_layers = 1, normalized_data = True, epochs = 50, Question = 'Part B')

1  hidden layers each with 10 nodes & 'relu' activation ||  epochs =  50  || is data normalized? :  True
Optimizer = adam || loss function = mean_squared_error
Trial No:  1 , value of MSE :  19.819629631498128
Trial No:  5 , value of MSE :  13.032844783051335
Trial No:  10 , value of MSE :  12.064860560244904
Trial No:  15 , value of MSE :  11.403932054423871
Trial No:  20 , value of MSE :  11.669343935060057
Trial No:  25 , value of MSE :  10.703301418842505
Trial No:  30 , value of MSE :  11.146171026107487
Trial No:  35 , value of MSE :  11.55543904845848
Trial No:  40 , value of MSE :  10.9081330669073
Trial No:  45 , value of MSE :  11.500454319662447
Trial No:  50 , value of MSE :  11.550712060526466
Part B  : Mean of MSEs :  11.699416461742384 , Std deviation of MSEs :  1.3654152067168561


### PART C: Increase the number of epochs & repeat B

- Repeat Part B but use 100 epochs this time for training.
- How does the mean of the mean squared errors compare to that from Step B?

In [6]:
PartC_MSEs_details = evaluate(hidden_layers = 1, normalized_data = True, epochs = 100, Question = 'Part C')

1  hidden layers each with 10 nodes & 'relu' activation ||  epochs =  100  || is data normalized? :  True
Optimizer = adam || loss function = mean_squared_error
Trial No:  1 , value of MSE :  13.1760334665067
Trial No:  5 , value of MSE :  12.381774093666813
Trial No:  10 , value of MSE :  12.001094752845514
Trial No:  15 , value of MSE :  11.248335106835839
Trial No:  20 , value of MSE :  11.588382218424265
Trial No:  25 , value of MSE :  10.841485787762752
Trial No:  30 , value of MSE :  11.178453137047981
Trial No:  35 , value of MSE :  11.315536615658432
Trial No:  40 , value of MSE :  10.72965255764738
Trial No:  45 , value of MSE :  11.303486609340096
Trial No:  50 , value of MSE :  11.530064447536045
Part C  : Mean of MSEs :  11.447556095371015 , Std deviation of MSEs :  0.5617237509828857


### PART D: Increase the number of hidden layers & repeat B

- Repeat part B but use a neural network with the following instead:
- **3 hidden layers**, each of 10 nodes and ReLU activation function.
- How does the mean of the mean squared errors compare to that from Step B?
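
To get a feel for how much larger the Part D network is, the number of trainable parameters can be counted by hand: each fully connected layer has one weight per input-output pair plus one bias per output. A small sketch, assuming the 7 input columns used in this notebook; `dense_param_count` is a hypothetical helper, not a Keras function:

```python
def dense_param_count(n_inputs, hidden_layers, nodes=10, n_outputs=1):
    """Count the weights and biases of a fully connected network."""
    sizes = [n_inputs] + [nodes] * hidden_layers + [n_outputs]
    # each dense layer has (inputs * outputs) weights plus one bias per output
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))

one_layer = dense_param_count(7, hidden_layers=1)      # Parts A to C
three_layers = dense_param_count(7, hidden_layers=3)   # Part D
print(one_layer, three_layers)   # 91 311
```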

In [7]:
PartD_MSEs_details = evaluate(hidden_layers = 3, normalized_data = True, epochs = 50, Question = 'Part D')

3  hidden layers each with 10 nodes & 'relu' activation ||  epochs =  50  || is data normalized? :  True
Optimizer = adam || loss function = mean_squared_error
Trial No:  1 , value of MSE :  12.570255562096312
Trial No:  5 , value of MSE :  12.56080077703712
Trial No:  10 , value of MSE :  11.861553945024273
Trial No:  15 , value of MSE :  11.291174955230245
Trial No:  20 , value of MSE :  11.380890754938612
Trial No:  25 , value of MSE :  10.493324030786237
Trial No:  30 , value of MSE :  10.861750211251517
Trial No:  35 , value of MSE :  11.142698087779742
Trial No:  40 , value of MSE :  10.633514876727878
Trial No:  45 , value of MSE :  11.396905498438093
Trial No:  50 , value of MSE :  11.157080148410849
Part D  : Mean of MSEs :  11.263319024897207 , Std deviation of MSEs :  0.6058315865135082


### ***Final Results & Analysis***

In [11]:
# let's compare all four results
results = [PartA_MSEs_details, PartB_MSEs_details, PartC_MSEs_details, PartD_MSEs_details]
Final_results = pd.DataFrame({
    'Question': ['Part A', 'Part B', 'Part C', 'Part D'],
    'mean of MSE': [result[0] for result in results],
    'std dev of MSE': [result[1] for result in results],
})

print(Final_results)

  Question  mean of MSE  std dev of MSE
0   Part A    13.328870        3.575389
1   Part B    11.699416        1.365415
2   Part C    11.447556        0.561724
3   Part D    11.263319        0.605832


### ***Answer***
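- As the table above shows, the mean error decreases steadily from Part A to Part D, so each change (normalizing the data, training for more epochs, adding hidden layers) lowers the error on the test data. The standard deviation also drops sharply from Part A to Part C, then rises slightly in Part D.
- Overall, Part A has the highest error and Part D the lowest, with a slightly larger standard deviation than Part C.
- Note that `evaluate()` takes `np.sqrt` of the mean squared error, so the values labelled MSE are root mean squared errors.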

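
The comparisons asked for in Parts B, C, and D can be made concrete by expressing each change as a relative drop in the mean error. A short sketch using the reported means from the results table; `relative_drop` is a hypothetical helper name:

```python
# reported mean errors, copied from the results table
reported = {
    "Part A": 13.328870,
    "Part B": 11.699416,
    "Part C": 11.447556,
    "Part D": 11.263319,
}

def relative_drop(before, after):
    """Fractional decrease going from `before` to `after`."""
    return (before - after) / before

# each part is compared with the part it was asked to repeat
for earlier, later in [("Part A", "Part B"), ("Part B", "Part C"), ("Part B", "Part D")]:
    drop = relative_drop(reported[earlier], reported[later])
    print(f"{earlier} -> {later}: {drop:.1%} lower")
```

Read this way, normalization (Part A to Part B) accounts for the largest change, while the extra epochs and extra layers each give a smaller reduction relative to Part B.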