# C. Increase the number of epochs

Repeat Part B but use 100 epochs this time for training.

# Load data

We will be playing around with the same dataset that we used in the videos.

<strong>The dataset is about the compressive strength of different samples of concrete based on the volumes of the different ingredients that were used to make them. Ingredients include:</strong>

<strong>1. Cement</strong>

<strong>2. Blast Furnace Slag</strong>

<strong>3. Fly Ash</strong>

<strong>4. Water</strong>

<strong>5. Superplasticizer</strong>

<strong>6. Coarse Aggregate</strong>

<strong>7. Fine Aggregate</strong>


In [1]:
import pandas as pd
import numpy as np

In [2]:
# Avoid warnings to improve readability
import warnings
import logging

def warn(*args, **kwargs):
    pass
warnings.warn = warn
 
logging.getLogger('tensorflow').disabled = True
logging.getLogger('keras').disabled = True

In [3]:
import keras

from keras.models import Sequential
from keras.layers import Dense

Using TensorFlow backend.


In [4]:
from sklearn.model_selection import train_test_split

from sklearn.metrics import mean_squared_error

In [5]:
data = pd.read_csv('https://cocl.us/concrete_data')
data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [6]:
data.shape

(1030, 9)

In [7]:
data.describe()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


# Build a model 

Use the Keras library to build a neural network with the following:

- One hidden layer of 10 nodes with a ReLU activation function

- The Adam optimizer, with mean squared error as the loss function

In [8]:
cols = data.columns

predictors = data[cols[cols != 'Strength']] 
predictors_norm = (predictors - predictors.mean()) / predictors.std()
target = data['Strength']
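The normalization above subtracts each column's mean and divides by its standard deviation. As a small sketch of that arithmetic, the snippet below applies it to the five Cement values shown by `data.head()`, using only the standard library. It assumes the sample standard deviation (dividing by n - 1), which is what pandas' `std()` uses by default; the variable names are illustrative.

```python
from statistics import mean, stdev

# Cement values from the first five rows shown by data.head()
cement = [540.0, 540.0, 332.5, 332.5, 198.6]

mu = mean(cement)
sigma = stdev(cement)  # sample standard deviation (n - 1), matching pandas' default

# z-score: how many standard deviations each value lies from the mean
cement_norm = [(v - mu) / sigma for v in cement]

print([round(v, 3) for v in cement_norm])
print(round(mean(cement_norm), 6), round(stdev(cement_norm), 6))
```

After the transformation the values have a mean of approximately 0 and a sample standard deviation of approximately 1, which puts columns measured on very different scales on a comparable footing.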

In [9]:
EPOCHS = 100

In [10]:
def regression_model(n_cols):
    """Create the model."""
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))

    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

In [11]:
model = regression_model(predictors.shape[1])
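To get a feel for how small this network is, the sketch below counts its trainable parameters by hand. It assumes 8 predictor columns (the 9 columns reported by `data.shape` minus `Strength`); `dense_params` is a hypothetical helper, not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

n_cols = 8                        # 9 columns in data.shape minus the 'Strength' target
hidden = dense_params(n_cols, 10) # hidden layer: 8 inputs -> 10 units
output = dense_params(10, 1)      # output layer: 10 inputs -> 1 unit
total = hidden + output

print(hidden, output, total)  # 90 11 101
```

Calling `model.summary()` on the Keras model should report the same per-layer counts.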

## Split data

Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.

In [12]:
pred_train, pred_test, target_train, target_test = train_test_split(predictors_norm, target, test_size=0.3, random_state=42)
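As a rough illustration of what a 30% random hold-out does to the 1030 rows, the sketch below shuffles row indices and slices off the test portion. `holdout_split` is a hypothetical helper written for this example; it is not how Scikit-learn implements `train_test_split`, only a minimal stand-in for the idea.

```python
import math
import random

def holdout_split(n_rows, test_fraction, seed):
    """Shuffle row indices and hold out a fraction for testing (illustrative helper)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)   # seeded, so the split is reproducible
    n_test = math.ceil(test_fraction * n_rows)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = holdout_split(1030, 0.3, seed=42)
print(len(train_idx), len(test_idx))  # 721 309
```

Fixing the seed makes a split repeatable; changing it yields a different partition of the same rows.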

## Train the model

Train the model on the training data using 100 epochs.
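The number of epochs translates directly into the number of weight updates. The sketch below estimates that count for `EPOCHS = 100`, assuming a 70/30 split of the 1030 rows and the Keras default batch size of 32, which applies when `fit()` is called without a `batch_size` argument.

```python
import math

n_train = 1030 - math.ceil(0.3 * 1030)  # rows left after holding out 30%
batch_size = 32                          # Keras default when fit() gets no batch_size
epochs = 100                             # value of EPOCHS in this notebook

steps_per_epoch = math.ceil(n_train / batch_size)  # last batch may be smaller
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 23 2300
```

Under the same assumptions, a 50-epoch run would perform half as many updates.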

In [13]:
model.fit(pred_train, target_train, validation_data=(pred_test, target_test), epochs=EPOCHS, verbose=0)

2024-05-11 20:11:26.283605: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2024-05-11 20:11:26.291370: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2394310000 Hz
2024-05-11 20:11:26.291988: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5634ca163680 executing computations on platform Host. Devices:
2024-05-11 20:11:26.292035: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>


<keras.callbacks.History at 0x7f8612c7af50>

## Evaluate 

Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

In [14]:
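# Note: this call scores the training split, which is what the value below reflects.
# The test-set error is mean_squared_error(target_test, model.predict(pred_test)).
mean_squared_error(target_train, model.predict(pred_train))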

174.56792648764676
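For reference, mean squared error is the average of the squared differences between actual and predicted values. The sketch below computes it by hand. The actual values are the first three Strength entries from `data.head()`; the predictions are hypothetical numbers chosen only for illustration, and `mse` is a helper defined here, not the Scikit-learn function.

```python
def mse(actual, predicted):
    """Mean of the squared differences between paired values."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual = [79.99, 61.89, 40.27]   # first three Strength values from data.head()
predicted = [70.0, 60.0, 45.0]   # hypothetical predictions, for illustration only

print(round(mse(actual, predicted), 4))
```

Because the errors are squared, the result is in squared strength units, and large misses weigh far more than small ones.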

## Repeat steps 1 - 3

Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

In [15]:
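def experiment(x):
    """Run one split/train/evaluate repetition and return the test-set MSE."""
    model = regression_model(predictors_norm.shape[1])
    # Use the normalized predictors, as in the single run above, and vary the
    # seed with x so that each repetition draws a different random split.
    pred_train, pred_test, target_train, target_test = train_test_split(
        predictors_norm, target, test_size=0.3, random_state=x)
    model.fit(pred_train, target_train, validation_data=(pred_test, target_test), 
              epochs=EPOCHS, verbose=0)
    print("·" if (x % 10) != 9 else "|", end="")    # a simple progress bar
    return mean_squared_error(target_test, model.predict(pred_test))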

In [None]:
mean_squared_errors = np.array([experiment(x) for x in range(50)])
mean_squared_errors

·········|·········|·········|·········|·

## Report and submit 
Report the mean and the standard deviation of the mean squared errors.

Submit your Jupyter Notebook with your code and comments.

In [None]:
mean_squared_errors.mean(), mean_squared_errors.std()
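One detail worth keeping in mind when reporting the spread: `std()` on a NumPy array divides by n by default (population standard deviation), whereas pandas divides by n - 1. The sketch below shows the difference with the standard library, using a short list of hypothetical MSE values rather than results from this notebook.

```python
from statistics import mean, pstdev, stdev

mse_runs = [40.0, 44.0, 42.0, 46.0, 38.0]  # hypothetical values, not real results

avg = mean(mse_runs)
pop_std = pstdev(mse_runs)    # divides by n, like ndarray.std() with default ddof=0
sample_std = stdev(mse_runs)  # divides by n - 1, like pandas Series.std()

print(avg, round(pop_std, 4), round(sample_std, 4))
```

With 50 repetitions the two figures are close, but it helps to state which one is being reported.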