### ◉Research question
**Improving a neural network by modifying hyperparameters and layers**
* I will test how the results change as I vary the model configuration using Keras.
* The dataset contains the quantities of the materials used to make concrete, such as cement and water.
* The strength of the concrete is the target value. I will predict it (y-hat) with a regression model and evaluate the predictions with the mean squared error (MSE).
* The dataset also includes the age of the concrete mix, since age is an important factor for strength.
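As a quick illustration of the metric used throughout, the MSE is the average of the squared differences between the actual and predicted values. The sketch below computes it by hand with NumPy. The function name `mse_by_hand` and the sample numbers are made up for illustration and are not taken from the dataset.

```python
import numpy as np

def mse_by_hand(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    # Flatten both inputs so a (n, 1) prediction column lines up with a (n,) target
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical strengths in MPa (not from the dataset)
actual = [30.0, 45.0, 50.0]
predicted = [32.0, 41.0, 50.0]
example_mse = mse_by_hand(actual, predicted)  # (4 + 16 + 0) / 3
print(example_mse)
```

Flattening matters because `model.predict` returns a column of shape `(n, 1)`. Subtracting that from a 1-D array without flattening would broadcast to an `(n, n)` matrix.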

### ◉Region and domain category of this dataset
**USA, construction materials**


In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf

In [None]:
# Load the concrete data
concrete_data = pd.read_csv('../input/us-concrete-data/concrete_data.csv')
print(concrete_data.shape)
print(concrete_data.head()) # Note: material quantities are per cubic meter of mix, age is in days, and strength is in MPa.
concrete_data.describe()

In [None]:
concrete_data.isnull().sum() # No missing values, so the data looks clean.

In [None]:
df = concrete_data
cols = df.columns
X = df[cols[cols != 'Strength']]
y = df['Strength']

In [None]:
# Set the random seeds to get reproducible results
np.random.seed(1)
tf.random.set_seed(1)
import keras
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
test_size = 0.3
def random_data_split(X, y, seed):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=seed)
    return X_train, X_test, y_train, y_test

## Step 1. Without Normalization

In [None]:
# baseline model (One hidden layer 10 nodes, 50 epochs)
mse_list = []
predicted_list = {}
def create_baseline_model():
    baseline_model = Sequential()
    baseline_model.add(Dense(10, activation='relu', input_shape=(X.shape[1],)))
    baseline_model.add(Dense(1))
    baseline_model.compile(optimizer='adam', loss='mean_squared_error')
    return baseline_model

# collect 50 mse values.
for i in range(50):
    if (i + 1) % 10 == 0:
        print('Now {} times calculating.'.format(i + 1))
    model = create_baseline_model()
    X_train, X_test, y_train, y_test = random_data_split(X, y, i)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    y_hats = model.predict(X_test)
    mse = mean_squared_error(y_test, y_hats)
    mse_list.append(mse)
    predicted_list[i] = {'y_test': y_test, 'y_hats': y_hats}

# Calculate mean and standard deviation of mse values
mse_mean = np.mean(mse_list)
mse_std = np.std(mse_list)
print('Mean value of MSE:{:.2f}, Standard Deviation value of MSE:{:.2f}.'.format(mse_mean, mse_std), 'First three MSE values:', mse_list[0:3])
print('Actual Value samples', predicted_list[0]['y_test'].values[0:3], 'Predicted Value samples', np.around(predicted_list[0]['y_hats'].flatten()[0:3], decimals=2))
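The split, fit, predict and score loop above could be factored into one helper so that each experiment is a single call. The sketch below is one possible way to do that, not the notebook's own code. `repeated_holdout_mse` and `MeanRegressor` are hypothetical names, the split uses a NumPy permutation (so the rows chosen differ from those picked by `train_test_split`), and the demo data is synthetic.

```python
import numpy as np

def repeated_holdout_mse(build_model, X, y, n_runs=50, test_size=0.3, **fit_kwargs):
    """Fit a fresh model on n_runs random splits and return the test MSE of each run."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    n_test = int(round(len(X) * test_size))
    scores = []
    for seed in range(n_runs):
        # Use the run number as the seed, in the spirit of random_state=i
        order = np.random.default_rng(seed).permutation(len(X))
        test_idx, train_idx = order[:n_test], order[n_test:]
        model = build_model()  # new, untrained model for every run
        model.fit(X[train_idx], y[train_idx], **fit_kwargs)
        y_hat = np.asarray(model.predict(X[test_idx])).ravel()
        scores.append(float(np.mean((y[test_idx] - y_hat) ** 2)))
    return scores

class MeanRegressor:
    """Hypothetical stand-in model that always predicts the training mean."""
    def fit(self, X, y, **kwargs):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)

# Synthetic data, only to show the call pattern
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 8))
y_demo = rng.normal(loc=35.0, scale=10.0, size=100)
demo_scores = repeated_holdout_mse(MeanRegressor, X_demo, y_demo, n_runs=5)
print(len(demo_scores), np.mean(demo_scores), np.std(demo_scores))
```

With the notebook's objects, a call along the lines of `repeated_holdout_mse(create_baseline_model, X, y, epochs=50, verbose=0)` should work, since any object with `fit` and `predict` methods can be passed in through `build_model`.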

## Step 2. With Normalization

In [None]:
# So far I have not applied any normalization. Now I'll normalize the continuous values (by subtracting the mean from each predictor and dividing by its standard deviation).
X_norm = (X - X.mean()) / X.std()
X_norm.head(3)
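The cell above computes the mean and standard deviation over all rows before splitting. A common variant, shown here as an alternative and not as what this notebook does, takes those statistics from the training rows only, so the test rows do not influence the scaling. The function name `standardize_with_train_stats` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def standardize_with_train_stats(X_train, X_test):
    """Z-score both sets using the mean and std computed on the training rows only."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)  # ddof=1 matches the pandas default used by X.std()
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    return (X_train - mean) / std, (X_test - mean) / std

# Synthetic features on very different scales, only for illustration
rng = np.random.default_rng(0)
X_tr = rng.normal(loc=[300.0, 7.0], scale=[100.0, 2.0], size=(70, 2))
X_te = rng.normal(loc=[300.0, 7.0], scale=[100.0, 2.0], size=(30, 2))
X_tr_norm, X_te_norm = standardize_with_train_stats(X_tr, X_te)
print(X_tr_norm.mean(axis=0).round(6), X_tr_norm.std(axis=0, ddof=1).round(6))
```

The training columns come out with mean 0 and standard deviation 1. The test columns will be close to that but not exact, because they are scaled with the training statistics.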

In [None]:
# Now I'm getting ready to examine how normalization can improve the baseline model (one hidden layer of 10 nodes, 50 epochs)
mse_list_norm = []
predicted_list_norm = {}
# collect 50 mse values.
for i in range(50):
    if (i + 1) % 10 == 0:
        print('Now {} times calculating.'.format(i + 1))
    model = create_baseline_model()
    X_train, X_test, y_train, y_test = random_data_split(X_norm, y, i)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    y_hats = model.predict(X_test)
    mse = mean_squared_error(y_test, y_hats)
    mse_list_norm.append(mse)
    predicted_list_norm[i] = {'y_test': y_test, 'y_hats': y_hats}

# Calculate mean and standard deviation of mse values
mse_mean_norm = np.mean(mse_list_norm)
mse_std_norm = np.std(mse_list_norm)
print('Mean value of MSE:{:.2f} and Standard Deviation value of MSE:{:.2f}.'.format(mse_mean_norm, mse_std_norm), 'First three MSE values:', mse_list_norm[0:3])
print('Actual Value samples', predicted_list_norm[0]['y_test'].values[0:3], 'Predicted Value samples', np.around(predicted_list_norm[0]['y_hats'].flatten()[0:3], decimals=2))

##### ■In my case, after applying normalization to X, the mean of the 50 MSE values for the same baseline model is 368.<br>So the mean improves only slightly, from 369 to 368.<br>But the standard deviation drops dramatically, from 404 to 109.<br>So I checked the histograms of the 50 MSE values. See the following charts.

In [None]:
import matplotlib.pyplot as plt
# Side-by-side histograms on shared axes so the two spreads are directly comparable
fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, sharey=True, figsize=(10, 5.5))
ax1.hist(mse_list, alpha=0.5, bins=20, color='r', label='Baseline model')
ax2.hist(mse_list_norm, alpha=0.5, bins=5, color='b', label='After normalization')
ax1.legend()
ax2.legend()
ax1.set_xlabel('MSE values of the baseline model')
ax2.set_xlabel('MSE values after normalization')
ax1.set_ylabel('Number of runs')  # the y-axis of a histogram is a count, not a standard deviation
plt.show()

### In these charts, I plotted a histogram of the MSE values for each model. They show that after normalization the MSE values no longer exceed 1000.<br>Normalization appears to narrow the range of MSE values.<br>In other words, normalizing the continuous features stabilizes the MSE.<br>Here I computed the MSE 50 times, but usually we compute it only once, so normalization matters even more when there is just a single MSE value to rely on.
**To recap, normalization is essential for getting a reliable MSE value.**
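To make the point about spread concrete, the sketch below summarizes a list of MSE values by its mean, standard deviation, worst case and the share of runs above a threshold. The function name `summarize_mse` and the two short lists are hypothetical and are not the notebook's results. They are chosen so that both lists have the same mean but very different spreads.

```python
import numpy as np

def summarize_mse(values, threshold):
    """Mean, spread, worst case, and share of runs whose MSE exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    return {
        'mean': float(values.mean()),
        'std': float(values.std()),
        'max': float(values.max()),
        'share_above': float(np.mean(values > threshold)),
    }

# Hypothetical MSE lists with the same mean (450) but different spreads
wide = [150.0, 200.0, 1200.0, 250.0]
narrow = [400.0, 450.0, 420.0, 530.0]
wide_summary = summarize_mse(wide, threshold=1000)
narrow_summary = summarize_mse(narrow, threshold=1000)
print(wide_summary)
print(narrow_summary)
```

In the notebook, the same function could be called as `summarize_mse(mse_list, 1000)` and `summarize_mse(mse_list_norm, 1000)`.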

## Step 3. Increase Epochs

In [None]:
# Next I will increase the number of epochs to 100 and look at how the model improves.
# Now I'm getting ready to examine how increasing epochs can improve the normalized model (one hidden layer of 10 nodes, 100 epochs)
epochs = 100
mse_list_double_epoch = []
predicted_list_double_epoch = {}
# collect 50 mse values.
for i in range(50):
    if (i + 1) % 10 == 0:
        print('Now {} times calculating.'.format(i + 1))
    model = create_baseline_model()
    X_train, X_test, y_train, y_test = random_data_split(X_norm, y, i)
    model.fit(X_train, y_train, epochs=epochs, verbose=0)
    y_hats = model.predict(X_test)
    mse = mean_squared_error(y_test, y_hats)
    mse_list_double_epoch.append(mse)
    predicted_list_double_epoch[i] = {'y_test': y_test, 'y_hats': y_hats}

# Calculate mean and standard deviation of mse values
mse_mean_double_epoch = np.mean(mse_list_double_epoch)
mse_std_double_epoch = np.std(mse_list_double_epoch)
print('Mean value of MSE:{:.2f} and Standard Deviation value of MSE:{:.2f}.'.format(mse_mean_double_epoch, mse_std_double_epoch), 'First three MSE values:', mse_list_double_epoch[0:3])
print('Actual Value samples', predicted_list_double_epoch[0]['y_test'].values[0:3], 'Predicted Value samples', np.around(predicted_list_double_epoch[0]['y_hats'].flatten()[0:3], decimals=2))

##### ■By increasing epochs to 100, the mean of the 50 MSE values improves from 368 to 168. That is no surprise, though.

## Step 4. Increase NN Layers and Set Epochs Back to 50 (as in Step 2)

In [None]:
# Next I will increase the hidden layers to three but set epochs back to 50, the same as in Step 2.
# Now I'm getting ready to examine how adding hidden layers can improve the normalized model (three hidden layers of 10 nodes each, 50 epochs)
epochs = 50
mse_list_three_layers = []
predicted_list_three_layers = {}

def create_three_layer_model():
    # Same setup as the baseline, but with three hidden layers of 10 nodes each
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(X.shape[1],)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# collect 50 mse values.
for i in range(50):
    if (i + 1) % 10 == 0:
        print('Now {} times calculating.'.format(i + 1))
    model = create_three_layer_model()
    X_train, X_test, y_train, y_test = random_data_split(X_norm, y, i)
    model.fit(X_train, y_train, epochs=epochs, verbose=0)
    y_hats = model.predict(X_test)
    mse = mean_squared_error(y_test, y_hats)
    mse_list_three_layers.append(mse)
    predicted_list_three_layers[i] = {'y_test': y_test, 'y_hats': y_hats}

# Calculate mean and standard deviation of mse values
mse_mean_three_layers = np.mean(mse_list_three_layers)
mse_std_three_layers = np.std(mse_list_three_layers)
print('Mean value of MSE:{:.2f} and Standard Deviation value of MSE:{:.2f}.'.format(mse_mean_three_layers, mse_std_three_layers), 'First three MSE values:', mse_list_three_layers[0:3])
print('Actual Value samples', predicted_list_three_layers[0]['y_test'].values[0:3], 'Predicted Value samples', np.around(predicted_list_three_layers[0]['y_hats'].flatten()[0:3], decimals=2))

##### ■By increasing the hidden layers from one to three, the mean of the 50 MSE values improves from 368 to 124. Even though the number of epochs is the same as in Step 2, the MSE improves substantially.
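One rough way to compare the two architectures is to count their trainable parameters: each `Dense` layer has one weight per input-unit pair plus one bias per unit. The sketch below does that arithmetic in plain Python. The helper name `dense_param_count` is hypothetical, and the feature count of 8 is an assumption (in the notebook it would be `X.shape[1]`). Parameter count is only a crude indicator of capacity and does not by itself explain the change in MSE.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Trainable parameters of a stack of Dense layers (weights plus one bias per unit)."""
    total = 0
    fan_in = n_inputs
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units  # this layer's outputs feed the next layer
    return total

n_features = 8  # assumption: eight predictor columns; use X.shape[1] in the notebook
one_hidden = dense_param_count(n_features, [10, 1])            # 90 + 11
three_hidden = dense_param_count(n_features, [10, 10, 10, 1])  # 90 + 110 + 110 + 11
print(one_hidden, three_hidden)
```

Under that assumption, the three-hidden-layer model has roughly three times as many parameters as the baseline. The counts can be cross-checked against the output of `model.summary()` in Keras.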