### Neural Network on Concrete Dataset

The concrete dataset contains measurements of various properties of concrete mixtures and the strength of the concrete produced from those mixtures. The goal of the neural network is to predict the strength of concrete from these properties. The dataset contains 1030 samples and 8 input features: the amounts of cement, slag, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate, plus the age of the concrete.






In [1]:
# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization

In [2]:
# Loading Data Files
concrete = pd.read_csv(r"C:\MachineLearning\Deep_Learning_ConcreteData\concrete.csv")
print(concrete.head())

   cement   slag    ash  water  superplastic  coarseagg  fineagg  age  \
0   141.3  212.0    0.0  203.5           0.0      971.8    748.5   28   
1   168.9   42.2  124.3  158.3          10.8     1080.8    796.2   14   
2   250.0    0.0   95.7  187.4           5.5      956.9    861.2   28   
3   266.0  114.0    0.0  228.0           0.0      932.0    670.0   28   
4   154.8  183.4    0.0  193.3           9.1     1047.4    696.7   28   

   strength  
0     29.89  
1     23.51  
2     29.22  
3     45.85  
4     18.29  


The values are normalized with min-max scaling, which rescales each column proportionally to the range [0, 1]. The normalization function is applied to every column of the dataframe using the apply method.


In [3]:
# Custom min-max normalization function
def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

In [4]:
#apply normalization to entire data frame
concrete_norm = concrete.apply(normalize)

In [5]:
#confirm that the range is now between zero and one
print(concrete_norm['strength'].describe())

count    1030.000000
mean        0.417191
std         0.208119
min         0.000000
25%         0.266351
50%         0.400087
75%         0.545721
max         1.000000
Name: strength, dtype: float64


In [6]:
# Compare with the original minimum and maximum
print(concrete['strength'].describe())

count    1030.000000
mean       35.817961
std        16.705742
min         2.330000
25%        23.710000
50%        34.445000
75%        46.135000
max        82.600000
Name: strength, dtype: float64
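
Because the network is trained on the normalized strength, its predictions come out on the [0, 1] scale and have to be mapped back to be read as strengths. The following is a minimal sketch of that inverse transform, assuming the minimum and maximum reported by the describe output above. The helpers normalize_value and denormalize are hypothetical and are not part of the original notebook.

```python
def normalize_value(x, lo, hi):
    """Min-max scale a single value into [0, 1]."""
    return (x - lo) / (hi - lo)


def denormalize(x_norm, lo, hi):
    """Invert min-max scaling (hypothetical helper, not in the original notebook)."""
    return x_norm * (hi - lo) + lo


# Minimum and maximum of the original strength column, taken from describe()
STRENGTH_MIN, STRENGTH_MAX = 2.33, 82.60

# Min-max scaling is linear, so the normalized mean should map back
# to approximately the original mean of the strength column.
mean_back = denormalize(0.417191, STRENGTH_MIN, STRENGTH_MAX)
print(round(mean_back, 2))
```

The same function works element-wise on a NumPy array or a pandas Series, since it only uses arithmetic operators.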


In [7]:
# Create training and test data by row position (first 773 rows for training, remaining 257 for testing)
concrete_train = concrete_norm.iloc[:773, :]
concrete_test = concrete_norm.iloc[773:1030, :]

The code below defines a simple neural network model with one hidden layer and an output layer for regression tasks. 
The model uses the ReLU activation function in the hidden layer and linear activation in the output layer. It is compiled with the mean squared error loss function and the Adam optimizer.

In [8]:
from keras.models import Sequential
from keras.layers import Dense

def neuralnet():
    # Define a simple neural network with one hidden layer
    model = Sequential()
    model.add(Dense(units=64, activation='relu', input_dim=8))
    model.add(Dense(units=1, activation='linear'))

    # Compile the model with a mean squared error loss function and an optimizer
    model.compile(loss='mean_squared_error', optimizer='adam')

    return model
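
To make the architecture concrete, here is a rough NumPy sketch of the forward pass such a model computes: a ReLU hidden layer, a linear output, and the mean squared error. The weights and data are random stand-ins chosen for illustration, not values from the trained Keras model.

```python
import numpy as np


def relu(z):
    """ReLU activation: negative values become zero."""
    return np.maximum(z, 0.0)


def forward(X, W1, b1, W2, b2):
    """Forward pass: inputs -> hidden ReLU layer -> single linear output."""
    hidden = relu(X @ W1 + b1)   # shape (n_samples, n_hidden)
    return hidden @ W2 + b2      # shape (n_samples, 1), linear activation


def mse(y_true, y_pred):
    """Mean squared error, the loss used when compiling the model."""
    return float(np.mean((y_true - y_pred) ** 2))


rng = np.random.default_rng(0)
n_samples, n_inputs, n_hidden = 5, 8, 64

X = rng.random((n_samples, n_inputs))   # stand-in for normalized features
y = rng.random((n_samples, 1))          # stand-in for normalized strength
W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

y_hat = forward(X, W1, b1, W2, b2)
loss = mse(y, y_hat)
print(y_hat.shape, loss)
```

Training adjusts W1, b1, W2 and b2 to reduce this loss; in the notebook that job is delegated to the Adam optimizer.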


The code below redefines neuralnet as a function that builds and trains a neural network model from a given formula and dataset. The model has one hidden layer with a customizable number of neurons, uses ReLU activation in the hidden layer and linear activation in the output layer, and is trained with mean squared error loss and the Adam optimizer.

In [9]:
from patsy import dmatrices

def neuralnet(formula, data, hidden):
    # Parse the formula string into response and predictor matrices
    # (patsy adds an intercept column to the predictors by default)
    response, predictors = dmatrices(formula, data)

    # Create a sequential model with one hidden layer
    model = Sequential()
    model.add(Dense(units=hidden, activation='relu', input_dim=predictors.shape[1]))
    model.add(Dense(units=1, activation='linear'))

    # Compile the model with a mean squared error loss function and an optimizer
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Train the model on the predictor and response variables
    model.fit(predictors, response, epochs=100, batch_size=10)

    return model


I implemented a simple Artificial Neural Network (ANN) model in Keras, wrapped in a neuralnet function whose interface mirrors the neuralnet package in R. The function takes a formula and a dataset as inputs and trains a neural network model on the provided data.

The hidden parameter in the neuralnet function specifies the number of hidden neurons in the neural network. In this case, we are using only a single hidden neuron, which means that the model is relatively simple and may not be able to capture complex relationships between the predictors and the target variable.
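
One way to see how small the single-neuron model is: count its trainable parameters. The sketch below assumes a fully connected network with one hidden layer and one output neuron; count_parameters is a hypothetical helper used only for this illustration.

```python
def count_parameters(n_inputs, n_hidden):
    """Weights and biases in a network with one hidden layer and one output."""
    hidden_layer = (n_inputs + 1) * n_hidden   # input weights plus one bias per hidden neuron
    output_layer = n_hidden + 1                # hidden-to-output weights plus one output bias
    return hidden_layer + output_layer


print(count_parameters(8, 1))    # 8 features, single hidden neuron
print(count_parameters(9, 1))    # 8 features plus the intercept column patsy adds
print(count_parameters(8, 64))   # the 64-neuron model defined earlier
```

This gives 11, 12 and 641 parameters respectively. Since patsy adds an intercept column to the predictor matrix by default, the formula-based model sees 9 input columns rather than 8.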

By training this simple ANN model, we can compare its performance with the other machine learning models we have used on the same dataset, such as the Random Forest and XGBoost models. We can then determine which model provides the best predictive accuracy for the concrete strength dataset.





In [10]:
# Simple ANN with only a single hidden neuron
from patsy import dmatrices
import numpy as np
np.random.seed(12345) # seeds NumPy only; this alone may not make Keras training fully repeatable
concrete_model = neuralnet(formula = "strength ~ cement + slag + ash + water + superplastic + coarseagg + fineagg + age",
data = concrete_train, hidden = 1)


Epoch 1/100
Epoch 2/100
...
Epoch 77/100
Epoch 78

Next, we look at whether performance improves when we increase the number of hidden layers and the number of neurons in each layer, and add techniques such as dropout and batch normalization.

Here's an example of a more complex architecture with two hidden layers, 64 and 32 neurons respectively, and dropout and batch normalization layers:

In [11]:
# Split into predictors and target
predictors = concrete_norm.iloc[:,:-1]
target = concrete_norm.iloc[:, -1]


# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.3, random_state=42)

# Standardize predictors (zero mean, unit variance); fit the scaler on the training set only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Define model architecture
model = Sequential()
model.add(Dense(64, input_shape=(8,), activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Dense(32, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))

# Compile model
model.compile(loss='mean_squared_error', optimizer='adam')

# Train model
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=1)

# Evaluate model on test set
loss = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', loss)

# Make predictions
y_pred = model.predict(X_test)

# Print some predictions and actual values
for i in range(5):
    print('Prediction:', y_pred[i], 'Actual:', y_test.iloc[i])

Epoch 1/50
Epoch 2/50
...
Epoch 49/50
Epoch 50/50
Test loss: 0.011509106494486332
Prediction: [0.48274022] Actual: 0.4604459947676592
Prediction: [0.40026683] Actual: 0.4522237448610939
Prediction: [0.45731598] Actual: 0.5137660396162951
Prediction: [0.61192757] Actual: 0.4107387566961505
Prediction: [0.36257684] Actual: 0.4623146879282422


This architecture has two hidden layers with 64 and 32 neurons respectively. Each hidden layer is followed by a batch normalization layer and a dropout layer (rate 0.2) to reduce overfitting. The output layer has a single neuron with a linear activation function to predict the concrete strength.
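
For intuition about what a dropout layer does during training, here is a minimal NumPy sketch of inverted dropout with the same rate of 0.2. It is an illustration of the technique, not the Keras implementation; the dropout function and its arguments are assumptions made for this example.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        # At inference time the activations pass through unchanged
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob   # True for units that are kept
    return activations * mask / keep_prob              # rescale so the expected value is preserved


rng = np.random.default_rng(0)
a = np.ones((1000, 64))                 # stand-in for hidden-layer activations
dropped = dropout(a, rate=0.2, rng=rng)

print(dropped.mean())          # stays close to 1 because of the rescaling
print((dropped == 0).mean())   # fraction of zeroed units, close to the rate
```

Because a different random subset of units is silenced on every batch, the network cannot rely too heavily on any single neuron, which is where the regularizing effect comes from.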

Batch normalization was originally proposed to address internal covariate shift, the change in the distribution of a layer's inputs as earlier layers are updated during training. It normalizes the inputs to a layer over each mini-batch, which helps stabilize the learning process and can have a mild regularizing effect.
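
The training-time computation of batch normalization can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: gamma and beta stand for the learned scale and shift, eps is a small constant for numerical stability, and the moving averages a real layer keeps for inference are left out.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-3):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)                      # per-feature mean over the batch
    var = x.var(axis=0)                        # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)    # roughly zero mean, unit variance
    return gamma * x_hat + beta                # learned scale and shift


rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))   # one batch of 32 samples, 4 features
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))

print(out.mean(axis=0))   # close to 0 for every feature
print(out.std(axis=0))    # close to 1 for every feature
```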

Finally, I used a linear activation function for the output layer since we are predicting a continuous variable (concrete strength).

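With this more complex architecture, batch normalization, and dropout, the network reaches a test loss (MSE) of about 0.0115 on the normalized strength scale. Confirming that this is an improvement over the single-neuron model would require evaluating that model on the same test set.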

The model aims to minimize the difference between its predictions and the actual values.

Interpreting the result:

The decreasing loss during training indicates that the model is gradually improving its performance on the training data.
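The predictions shown are examples of how the model performs on unseen data. Comparing them with the actual values gives a rough sense of the model's accuracy: four of the five predictions are within about 0.1 of the actual value on the normalized scale, while the fourth is off by about 0.2. The test MSE of about 0.0115 corresponds to an RMSE of roughly 0.107, or about 8.6 in the original strength units given the original range of 2.33 to 82.6. Whether that is accurate enough depends on the specific context and problem being solved.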
To further evaluate the model's performance, we can use additional evaluation metrics, such as mean absolute error (MAE) or root mean squared error (RMSE), and assess its performance on a separate validation or test dataset.
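
As a small illustration of these metrics, the sketch below computes MAE and RMSE on the five prediction/actual pairs printed above and converts the RMSE back to the original strength scale. Five samples are far too few for a reliable estimate, so the numbers only demonstrate the calculation. The mae and rmse helpers are written here for the example.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# The five prediction/actual pairs printed above (normalized scale)
y_pred = np.array([0.48274022, 0.40026683, 0.45731598, 0.61192757, 0.36257684])
y_true = np.array([0.4604459947676592, 0.4522237448610939, 0.5137660396162951,
                   0.4107387566961505, 0.4623146879282422])

# Range of the original strength column, used to convert errors to original units
strength_range = 82.60 - 2.33

print("MAE  (normalized):", mae(y_true, y_pred))
print("RMSE (normalized):", rmse(y_true, y_pred))
print("RMSE (original units):", rmse(y_true, y_pred) * strength_range)
```

On the full test set, the same functions could be applied to y_test and the model's predictions. The array returned by model.predict has one column, so it would need to be flattened first, for example with ravel().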