**Predicting compressive strength of concrete mixture using neural networks**



This cell mounts Google Drive and changes into the project directory (required)

In [1]:
from google.colab import drive
drive.mount('/content/drive')
import os
os.chdir('/content/drive/MyDrive/concrete_ai')
cwd = os.getcwd()
print('current directory : ', cwd)

Mounted at /content/drive
current directory :  /content/drive/MyDrive/concrete_ai


Importing the Python packages (required)

In [2]:
# importing necessary packages (data, utils and printing_callback are project-local modules)
import os
import data
import numpy as np
import pandas as pd
from utils import utils
import tensorflow as tf
import printing_callback as callback

**Setting up the network structure**

This cell sets the global variables that control training and file input/output.
The variables are:


1.   NUM_EPOCHS : the number of training epochs
2.   SAVE_INPUT : whether to save the prepared input
3.   SAVE_OUTPUT : whether to save the test input together with the predictions
4.   SAVE_WEIGHTS : whether to save the model weights after training finishes
5.   LOAD_WEIGHTS : whether to load the weights of a previously saved model
6.   INPUT_TITLE : the name (without extension) of the Excel file that contains the input data
7.   OUTPUT_TITLE : the base name of the files the output is saved to
8.   PRINT_PRDECTIONS : whether to print the first 20 test predictions (the data is shuffled) next to their real values



In [None]:
# setting up the network structure
NUM_EPOCHS = 30000
SAVE_INPUT = False
SAVE_OUTPUT = True
SAVE_WEIGHTS = False
LOAD_WEIGHTS = False
INPUT_TITLE = 'amj_data3'
OUTPUT_TITLE = 'amj_data'
PRINT_PRDECTIONS = True
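
The two title variables are combined with fixed folder names in the cells that follow. A minimal sketch of that naming scheme, assuming the folder names `data_files`, `results` and `saved_wights` used later in this notebook; the helper names `input_path`, `result_path` and `weights_path` are hypothetical and not part of the project, and the weights path is shown relative to the working directory for brevity:

```python
# Hypothetical helpers that mirror the string concatenation used in later cells.
INPUT_TITLE = 'amj_data3'
OUTPUT_TITLE = 'amj_data'


def input_path(title):
    """Excel file that holds the raw mixture data."""
    return 'data_files/' + title + '.xlsx'


def result_path(title, model_index):
    """Excel file that receives the test inputs, targets and predictions."""
    return 'results/' + title + str(model_index) + '.xlsx'


def weights_path(title, model_index):
    """HDF5 file that stores the weights of one model."""
    return 'saved_wights/' + title + str(model_index) + '.h5'


print(input_path(INPUT_TITLE))
print(result_path(OUTPUT_TITLE, 1))
print(weights_path(OUTPUT_TITLE, 1))
```

Because the model index is appended to `OUTPUT_TITLE`, each model writes to its own result and weights file.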

**Model structure**

The work is divided into three models:

1.   Model1 : predicts the slump and hardened density of concrete
2.   Model2 : predicts the concrete compressive strength after 7 days
3.   Model3 : predicts the concrete compressive strength after 28 days

The models are independent of one another.

Each model has one cell for preparing its data and a separate cell for training.

Run the data-preparation cell of a model before running its training cell.

**Model 1 (slump and hardened density)**

In [None]:
train1, test1, validation1 = data.getFinalData('data_files/' + INPUT_TITLE + '.xlsx')
train1, test1, validation1 = data.prepareMultipleData(train1, test1, validation1, [16,17])
print('\n train1:', train1.shape)
xr1 = train1[:,0:14]
yr1 = train1[:,14:16]
xt1 = test1[:,0:14]
yt1 = test1[:,14:16]
xv1 = validation1[:,0:14]
yv1 = validation1[:,14:16]
utils.exceptionIfNan(train1)
utils.exceptionIfNan(test1)
utils.exceptionIfNan(validation1)
print('data 1 ready with train:', train1.shape, 'and test:', test1.shape, 'and validation:', 
validation1.shape)
if(SAVE_INPUT):
    data.saveData(train1, 'train1.xlsx')

loading the data ...
data loaded with size:  (1063, 18) 


there is Nan in data


shuffling the data ...
data shuffled


dividing the data ...
training data ready with size: (745, 18)
test data ready with size: (160, 18)
validation data ready with size: (158, 18)

 train1: (742, 16)
data 1 ready with train: (742, 16) and test: (160, 16) and validation: (158, 16)
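
The sizes printed above (745, 160 and 158 rows out of 1063) are consistent with a 70/15/15 split in which the training and test sizes are rounded up and the validation set takes the remainder. The actual logic lives in `data.getFinalData`, which is not shown here, so the following is only a sketch under that assumption; `split_sizes` is a hypothetical name:

```python
def split_sizes(n_rows, train_pct=70, test_pct=15):
    """Return (train, test, validation) row counts for a percentage split.

    Assumption: train and test sizes are rounded up and the validation set
    gets whatever is left. Integer arithmetic avoids floating-point rounding.
    """
    n_train = -(-n_rows * train_pct // 100)  # ceiling division
    n_test = -(-n_rows * test_pct // 100)    # ceiling division
    n_validation = n_rows - n_train - n_test
    return n_train, n_test, n_validation


print(split_sizes(1063))
```

The same rule also reproduces the split sizes printed later in this notebook for the 1028-row and 1300-row files.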


**Model 2 (compressive strength after 7 days)**

In [None]:
train2, test2, validation2 = data.getFinalData('data_files/' + INPUT_TITLE + '.xlsx')
train2, test2, validation2 = data.prepareMultipleData(train2, test2, validation2, [14, 15, 17])
xr2 = train2[:,0:9]
yr2 = train2[:,14:15]
xt2 = test2[:,0:9]
yt2 = test2[:,14:15]
xv2 = validation2[:,0:9]
yv2 = validation2[:,14:15]
utils.exceptionIfNan(train2)
utils.exceptionIfNan(test2)
utils.exceptionIfNan(validation2)
print('data 2 ready with train:', train2.shape, 'and test:', test2.shape, 'and validation:', 
validation2.shape)
if(SAVE_INPUT):
    data.saveData(train2, 'train2.xlsx')

loading the data ...
data loaded with size:  (1063, 18) 


there is Nan in data


shuffling the data ...
data shuffled


dividing the data ...
training data ready with size: (745, 18)
test data ready with size: (160, 18)
validation data ready with size: (158, 18)
data 2 ready with train: (742, 15) and test: (159, 15) and validation: (154, 15)


**Model 3 (compressive strength after 28 days)**

In [None]:
train3, test3, validation3 = data.getFinalData('data_files/' + INPUT_TITLE + '.xlsx')
train3, test3, validation3 = data.prepareMultipleData(train3, test3, validation3, [14, 15, 16])
xr3 = train3[:,0:14]
yr3 = train3[:,14:15]
xt3 = test3[:,0:14]
yt3 = test3[:,14:15]
xv3 = validation3[:,0:14]
yv3 = validation3[:,14:15]
utils.exceptionIfNan(train3)
utils.exceptionIfNan(test3)
utils.exceptionIfNan(validation3)
print('data 3 ready with train:', train3.shape, 'and test:', test3.shape, 'and validation:', 
validation3.shape)
if(SAVE_INPUT):
    data.saveData(train3, 'train3.xlsx')

loading the data ...
data loaded with size:  (1063, 18) 


there is Nan in data


shuffling the data ...
data shuffled


dividing the data ...
training data ready with size: (745, 18)
test data ready with size: (160, 18)
validation data ready with size: (158, 18)
data 3 ready with train: (361, 15) and test: (77, 15) and validation: (73, 15)
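
The training set shrinks from (745, 18) to (361, 15) here, presumably because many rows have no 28-day measurement. The sketch below illustrates one plausible reading of `data.prepareMultipleData`: drop the listed columns, then drop every row that still contains a NaN. This behaviour is inferred from the printed shapes, not from the function's source, and the toy table is made up for illustration:

```python
import numpy as np


def drop_columns_and_nan_rows(table, drop_cols):
    """Remove the given columns, then remove rows that still contain NaN."""
    kept = np.delete(table, drop_cols, axis=1)
    return kept[~np.isnan(kept).any(axis=1)]


# Toy table: 4 rows, 6 columns (4 features followed by 2 candidate targets).
table = np.arange(24, dtype=float).reshape(4, 6)
table[1, 4] = np.nan  # missing value in the target we keep -> row is dropped
table[2, 5] = np.nan  # missing value in the target we drop -> row survives

prepared = drop_columns_and_nan_rows(table, [5])
x = prepared[:, 0:4]  # feature columns
y = prepared[:, 4:5]  # remaining target, kept 2-D as in the cells above
print(prepared.shape, x.shape, y.shape)
```

Under this reading, a row is lost only when a value is missing in a column that the model actually keeps, which would explain why the three models end up with different row counts.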


**Training and testing the models**

**Model 1 (slump and hardened density)**

In [None]:
model1 = utils.newSeqentialModel(14, 2)
 
if(LOAD_WEIGHTS):
    print('loading model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    try:
      model1.load_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "1.h5"))
      print('model weights loaded')
    except OSError:
      print('no previous weights found')
    except ValueError:
      print('previous weights are different from current') 
 
print('\n\nTrainning model 1')
model1.compile(loss='mean_squared_error', optimizer='adam',
metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
model1.fit(xr1, yr1, epochs=NUM_EPOCHS, batch_size=32, validation_data=(xv1, yv1),
verbose=0, callbacks=[callback.LossAndErrorPrintingCallback()])
 
_, accuracy_test_1 = model1.evaluate(xt1, yt1)
print('\n\nmodel 1 trained')
print('\nAccuracy on test data: %.2f' % (accuracy_test_1))
 
test_predictions = np.around(model1.predict(xt1), 1)
 
if (PRINT_PRDECTIONS):
    for i in range(20):
        print(' test: predicted:', test_predictions[i], 'real data:', yt1[i])
 
if (SAVE_OUTPUT):
    result = np.concatenate([xt1, yt1, test_predictions], axis=1)
    data.saveData(result, 'results/' + OUTPUT_TITLE + '1.xlsx')
 
if (SAVE_WEIGHTS):
    print('\nsaving model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    model1.save_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "1.h5"))
    print('model weights saved')

last output :

training: mse: 1322.47 %: 12.97

validation: mse : 1383.56 %: 12.97

test data: %: 12.97

converged: false

data type: amj_data

**Model 2 (compressive strength after 7 days)**

In [None]:
model2 = utils.newSeqentialModel(9, 1)
 
if(LOAD_WEIGHTS):
    print('loading model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    try:
      model2.load_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "2.h5"))
      print('model weights loaded')
    except OSError:
      print('no previous weights found')
    except ValueError:
      print('previous weights are different from current') 
 
print('\n\nTrainning model 2')
model2.compile(loss='mean_squared_error', optimizer='adam',
metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
model2.fit(xr2, yr2, epochs=NUM_EPOCHS, batch_size=32, validation_data=(xv2, yv2), 
verbose=0, callbacks=[callback.LossAndErrorPrintingCallback()])
 
_, accuracy_test_2 = model2.evaluate(xt2, yt2)
print('\n\nmodel 2 trained')
print('\nAccuracy on test data: %.2f' % (accuracy_test_2))
 
test_predictions = np.around(model2.predict(xt2), 1)
 
if (PRINT_PRDECTIONS):
    for i in range(20):
        print(' test: predicted:', test_predictions[i], 'real data:', yt2[i])
 
if (SAVE_OUTPUT):
    result = np.concatenate([xt2, yt2, test_predictions], axis=1)
    data.saveData(result, 'results/' + OUTPUT_TITLE + '2.xlsx')
 
if (SAVE_WEIGHTS):
    print('\nsaving model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    model2.save_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "2.h5"))
    print('model weights saved')

last output :

training: mse: 29.52 %: 17.00

validation: mse : 25.40 %: 17.00

test data: %: 17.00

converged: false

data type: amj_data

**Model 3 (compressive strength after 28 days)**

In [None]:
model3 = utils.newSeqentialModel(14, 1)

if(LOAD_WEIGHTS):
    # consistent space indentation (mixed tabs and spaces raise an indentation error)
    print('loading model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    try:
        model3.load_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "3.h5"))
        print('model weights loaded')
    except OSError:
        print('no previous weights found')
    except ValueError:
        print('previous weights are different from current')

print('\n\nTrainning model 3')
model3.compile(loss='mean_squared_error', optimizer='adam', 
metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
model3.fit(xr3, yr3, epochs=NUM_EPOCHS, batch_size=64, validation_data=(xv3, yv3), 
verbose=0, callbacks=[callback.LossAndErrorPrintingCallback()])

_, accuracy_test_3 = model3.evaluate(xt3, yt3)
print('\n\nmodel 3 trained')
print('\nAccuracy on test data: %.2f' % (accuracy_test_3))

test_predictions = np.around(model3.predict(xt3), 1)

if (PRINT_PRDECTIONS):
    for i in range(20):
        print(' test: predicted:', test_predictions[i], 'real data:', yt3[i])

if (SAVE_OUTPUT):
    result = np.concatenate([xt3, yt3, test_predictions], axis=1)
    data.saveData(result, 'results/' + OUTPUT_TITLE + '3.xlsx')

if (SAVE_WEIGHTS):
    print('\nsaving model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    model3.save_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "3.h5"))
    print('model weights saved')

last output : 

training: mse: 17.32      %: 8.05

validation: mse : 56.69     %: 15.29

test data: %: 17.19

converged: true

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression,  Lasso, Ridge
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score 
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

In [None]:
def mean_absolute_percentage_error(y_true, y_pred,
                                   sample_weight=None,
                                   multioutput='uniform_average'):
    epsilon = np.finfo(np.float64).eps
    mape = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
    output_errors = np.average(mape,
                               weights=sample_weight, axis=0)
    if isinstance(multioutput, str):
        if multioutput == 'raw_values':
            return output_errors
        elif multioutput == 'uniform_average':
            multioutput = None
    return np.average(output_errors, weights=multioutput)
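
As a quick check of the definition, here is a compact single-output version of the same formula applied to made-up numbers. It returns a fraction, whereas `tf.keras.metrics.MeanAbsolutePercentageError` reports a percentage, so the two differ by a factor of 100. The name `mape_fraction` and the sample values are illustrative only:

```python
import numpy as np


def mape_fraction(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.10 means 10 %)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    eps = np.finfo(np.float64).eps  # avoids division by zero for zero targets
    return float(np.mean(np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), eps)))


y_true = [20.0, 40.0]
y_pred = [22.0, 36.0]  # each prediction is off by 10 % of its target
fraction = mape_fraction(y_true, y_pred)
print(fraction, 100 * fraction)
```

Multiplying by 100 puts values from this function on the same scale as the percentages printed during neural-network training.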

In [None]:
import warnings
warnings.filterwarnings('ignore',category = DeprecationWarning)
warnings.filterwarnings('ignore',category = UserWarning)
warnings.filterwarnings('ignore',category = RuntimeWarning)
warnings.filterwarnings('ignore',category = FutureWarning)

In [None]:
pd.set_option('display.max_rows',100000)
pd.set_option('display.max_columns',1000)

In [None]:
# data used here: xr2, yr2, xt2, yt2 (the model 2 split; feature scaling below is currently disabled)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
#x_train = sc.fit_transform(xr2)
#x_test = sc.transform(xt2)
x_train = xr2
x_test = xt2
y_train = yr2
y_test = yt2
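
The commented-out lines in the cell above would standardize the features with `StandardScaler`. A NumPy sketch of the same idea, fitting the mean and standard deviation on the training set only and reusing them for the test set; the random arrays are made-up stand-ins for `xr2` and `xt2`:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train_raw = rng.normal(loc=300.0, scale=50.0, size=(100, 3))  # stand-in for xr2
x_test_raw = rng.normal(loc=300.0, scale=50.0, size=(20, 3))    # stand-in for xt2

# Statistics come from the training data only, so no test information leaks in.
mean = x_train_raw.mean(axis=0)
std = x_train_raw.std(axis=0)  # population standard deviation (ddof=0)
std[std == 0] = 1.0            # guard against constant columns

x_train_scaled = (x_train_raw - mean) / std
x_test_scaled = (x_test_raw - mean) / std
print(x_train_scaled.mean(axis=0).round(6), x_train_scaled.std(axis=0).round(6))
```

The scaled training columns have zero mean and unit variance; the test columns are only approximately standardized because they reuse the training statistics.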

In [None]:
lr = LinearRegression()
# Linear Regression
 
lasso = Lasso()
# Lasso Regression
 
ridge = Ridge()
# Ridge Regression

In [None]:
lr.fit(x_train, y_train)
# fitting the linear regression model
y_pred_lr = lr.predict(x_test)
# predicting the test with linear regression model
 
print("Model\t\t\t RMSE \t\t MSE \t\t MAE \t\t R2 \t\t MPE")
print("""LinearRegression \t {:.2f} \t\t {:.2f} \t\t {:.2f} \t\t {:.2f} \t\t {:2f}""".format(
            np.sqrt(mean_squared_error(y_test, y_pred_lr)),mean_squared_error(y_test, y_pred_lr),
            mean_absolute_error(y_test, y_pred_lr), r2_score(y_test, y_pred_lr), 
            mean_absolute_percentage_error(y_test, y_pred_lr)))
 
for i in range(20):
    print(' test: predicted:', y_pred_lr[i], 'real data:', y_test[i])

Model			 RMSE 		 MSE 		 MAE 		 R2 		 MPE
LinearRegression 	 5.26 		 27.67 		 4.52 		 0.06 		 0.163566
 test: predicted: [30.20533658] real data: [38.1]
 test: predicted: [30.23658658] real data: [32.8]
 test: predicted: [29.95533658] real data: [27.3]
 test: predicted: [29.81471158] real data: [22.4]
 test: predicted: [29.61158658] real data: [31.3]
 test: predicted: [29.75221158] real data: [32.]
 test: predicted: [29.97096158] real data: [31.2]
 test: predicted: [29.58033658] real data: [20.5]
 test: predicted: [29.68971158] real data: [27.4]
 test: predicted: [30.23658658] real data: [31.8]
 test: predicted: [29.18971158] real data: [34.1]
 test: predicted: [30.14283658] real data: [24.6]
 test: predicted: [28.93971158] real data: [18.3]
 test: predicted: [30.50221158] real data: [26.3]
 test: predicted: [29.73658658] real data: [28.1]
 test: predicted: [29.78346158] real data: [32.9]
 test: predicted: [29.56471158] real data: [22.1]
 test: predicted: [29.86158658] real data: [30.9]
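
An R2 of 0.06 means the linear model explains very little of the variance, which matches the printed predictions staying within a narrow band around 30 while the real values range from about 18 to 38. The sketch below computes R2 by hand to show that a constant mean prediction scores exactly 0; the target values are the first five real values printed above, and `r2_manual` is a hypothetical name:

```python
import numpy as np


def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of the mean baseline
    return 1.0 - ss_res / ss_tot


y_true = np.array([38.1, 32.8, 27.3, 22.4, 31.3])
baseline = np.full_like(y_true, y_true.mean())  # always predict the mean
print(r2_manual(y_true, baseline))
print(r2_manual(y_true, y_true))
```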

In [None]:
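from sklearn.neural_network import MLPClassifier

# NOTE: MLPClassifier is a classifier; a continuous target such as strength would need MLPRegressor.
# The estimator is created here but not fitted in the cells shown.
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1)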


In [None]:
train4, test4, validation4 = data.getFinalData('data_files/' + INPUT_TITLE + '.xlsx')
train4, test4, validation4 = data.prepareMultipleData(train4, test4, validation4, [])
xr4 = train4[:,0:8]
yr4 = train4[:,8:9]
xt4 = test4[:,0:8]
yt4 = test4[:,8:9]
xv4 = validation4[:,0:8]
yv4 = validation4[:,8:9]
utils.exceptionIfNan(train4)
utils.exceptionIfNan(test4)
utils.exceptionIfNan(validation4)
print('data world ready with train:', train4.shape, 'and test:', test4.shape, 'and validation:', 
validation4.shape)
if(SAVE_INPUT):
  data.saveData(train4, 'train4.xlsx')
 
model4 = utils.newSeqentialModel(8, 1)
 
if(LOAD_WEIGHTS):
  print('loading model weights ...')
  output_dir = os.path.join(os.getcwd(), "saved_wights")
  # load inside try/except only, so a missing or mismatched file does not abort the cell
  try:
    model4.load_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "4.h5"))
    print('model weights loaded')
  except OSError:
    print('no previous weights found')
  except ValueError:
    print('previous weights are different from current')
 
print('\n\nTrainning model world')
model4.compile(loss='mean_squared_error', optimizer='adam', 
metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
model4.fit(xr4, yr4, epochs=NUM_EPOCHS, batch_size=64, validation_data=(xv4, yv4), 
verbose=0, callbacks=[callback.LossAndErrorPrintingCallback()])
 
_, accuracy_test_4 = model4.evaluate(xt4, yt4)
print('\n\nmodel world trained')
print('\nAccuracy on test data: %.2f' % (accuracy_test_4))
 
test_predictions = np.around(model4.predict(xt4), 1)
 
if (PRINT_PRDECTIONS):
    for i in range(20):
        print(' test: predicted:', test_predictions[i], 'real data:', yt4[i])
 
if (SAVE_OUTPUT):
    result = np.concatenate([xt4, yt4, test_predictions], axis=1)
    data.saveData(result, 'results/' + OUTPUT_TITLE + '4.xlsx')
 
if (SAVE_WEIGHTS):
    print('\nsaving model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    model4.save_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "4.h5"))
    print('model weights saved')

loading the data ...
data loaded with size:  (1028, 9) 


data is good


shuffling the data ...
data shuffled


dividing the data ...
training data ready with size: (720, 9)
test data ready with size: (155, 9)
validation data ready with size: (153, 9)
data world ready with train: (720, 9) and test: (155, 9) and validation: (153, 9)


Trainning model world
epoch 0 : loss : 1546.64 , MPE :  100.72. validation loss : 1646.75 , MPE :  100.43
epoch 100 : loss :  366.75 , MPE :   69.76. validation loss :  399.76 , MPE :   69.67
epoch 200 : loss :  275.30 , MPE :   63.13. validation loss :  284.57 , MPE :   63.13
epoch 300 : loss :  275.30 , MPE :   62.08. validation loss :  284.71 , MPE :   62.07
epoch 400 : loss :  275.30 , MPE :   61.55. validation loss :  284.33 , MPE :   61.55
epoch 500 : loss :  275.37 , MPE :   61.23. validation loss :  284.48 , MPE :   61.23
epoch 600 : loss :   70.45 , MPE :   57.78. validation loss :   89.82 , MPE :   57.75
epoch 700 : loss :   36.59 , MPE :   52.20
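
The progress lines above come from `callback.LossAndErrorPrintingCallback`, whose source is not shown. A pure-Python sketch of how one such line could be formatted from the Keras `logs` dictionary, assuming the callback reports every 100 epochs and that the metric is logged under its default name `mean_absolute_percentage_error`; `format_epoch_line` and `should_report` are hypothetical helpers:

```python
def should_report(epoch, every=100):
    """Assumed reporting rule: print only every `every` epochs."""
    return epoch % every == 0


def format_epoch_line(epoch, logs, metric='mean_absolute_percentage_error'):
    """Build one progress line in the style of the output shown above."""
    return ('epoch {} : loss : {:7.2f} , MPE : {:7.2f}. '
            'validation loss : {:7.2f} , MPE : {:7.2f}').format(
                epoch, logs['loss'], logs[metric],
                logs['val_loss'], logs['val_' + metric])


# Values copied from the epoch-100 line of the output above.
logs = {
    'loss': 366.75,
    'mean_absolute_percentage_error': 69.76,
    'val_loss': 399.76,
    'val_mean_absolute_percentage_error': 69.67,
}
line = format_epoch_line(100, logs)
if should_report(100):
    print(line)
```

In a real callback this formatting would typically run inside `on_epoch_end(self, epoch, logs=None)` of a `tf.keras.callbacks.Callback` subclass.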

**Testing model 7 days with grade 30**

In [None]:
train2, test2, validation2 = data.getFinalData('data_files/omer_data.xlsx')
train2, test2, validation2 = data.prepareMultipleData(train2, test2, validation2, [13, 15, 16])
CURRENT_GRADE = 30
 
#train2 = train2[train2[:,5] == CURRENT_GRADE]
#train2 = np.delete(train2, [5, 9, 10], 1)
#test2 = test2[test2[:,5] == CURRENT_GRADE]
#test2 = np.delete(test2,[5, 9, 10], 1)
#validation2 = validation2[validation2[:,5] == CURRENT_GRADE]
#validation2 = np.delete(validation2, [5, 9, 10], 1)
 
#train2 = train2[train2[:,14] >= 60]
#train2 = np.delete(train2, [0, 1], 1)
#test2 = test2[test2[:,14] >= 69]
#test2 = np.delete(test2, [0, 1], 1)
#validation2 = validation2[validation2[:,14] >= 60]
#validation2 = np.delete(validation2, [0, 1], 1)
 
tmp1 = train2.shape[1] - 1
tmp2 = train2.shape[1]
 
xr2 = train2[:,0:tmp1]
yr2 = train2[:,tmp1:tmp2]
xt2 = test2[:,0:tmp1]
yt2 = test2[:,tmp1:tmp2]
xv2 = validation2[:,0:tmp1]
yv2 = validation2[:,tmp1:tmp2]
 
utils.exceptionIfNan(train2)
utils.exceptionIfNan(test2)
utils.exceptionIfNan(validation2)
print('data 2 ready with train:', train2.shape, 'and test:', test2.shape, 'and validation:', 
validation2.shape)
if(SAVE_INPUT):
    data.saveData(train2, 'train2.xlsx')
 
model2 = utils.newSeqentialModel(xr2.shape[1], yr2.shape[1])
 
if(LOAD_WEIGHTS):
    print('loading model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    try:
      model2.load_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "2.h5"))
      print('model weights loaded')
    except OSError:
      print('no previous weights found')
    except ValueError:
      print('previous weights are different from current') 
 
print('\n\nTrainning model 2 grade', CURRENT_GRADE)
model2.compile(loss='mean_squared_error', optimizer='adam', 
metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
model2.fit(xr2, yr2, epochs=60000, batch_size=32, validation_data=(xv2, yv2), 
verbose=0, callbacks=[callback.LossAndErrorPrintingCallback()])
 
_, accuracy_test_2 = model2.evaluate(xt2, yt2)
print('\n\nmodel 2 trained')
print('\nAccuracy on test data: %.2f' % (accuracy_test_2))
 
test_predictions = np.around(model2.predict(xt2), 1)
 
if (PRINT_PRDECTIONS):
    for i in range(20):
        print(' test: predicted:', test_predictions[i], 'real data:', yt2[i])
 
if (SAVE_OUTPUT):
    result = np.concatenate([xt2, yt2, test_predictions], axis=1)
    data.saveData(result, 'results/omer_data_density.xlsx')
 
if (SAVE_WEIGHTS):
    print('\nsaving model weights ...')
    output_dir = os.path.join(os.getcwd(), "saved_wights")
    model2.save_weights(filepath=os.path.join(output_dir, OUTPUT_TITLE + "2.h5"))
    print('model weights saved')

loading the data ...
data loaded with size:  (1300, 17) 


there is Nan in data


shuffling the data ...
data shuffled


dividing the data ...
training data ready with size: (910, 17)
test data ready with size: (195, 17)
validation data ready with size: (195, 17)
data 2 ready with train: (910, 14) and test: (195, 14) and validation: (195, 14)


Trainning model 2 grade 30
epoch 0 : loss : 5848202.50 , MPE :  100.00. validation loss : 5857171.00 , MPE :  100.00
epoch 100 : loss : 4999305.50 , MPE :   97.28. validation loss : 5000571.50 , MPE :   97.26
epoch 200 : loss : 3357534.25 , MPE :   91.00. validation loss : 3355280.00 , MPE :   90.96
epoch 300 : loss : 1653213.62 , MPE :   82.30. validation loss : 1649983.88 , MPE :   82.25
epoch 400 : loss : 435467.25 , MPE :   71.81. validation loss : 433555.44 , MPE :   71.75
epoch 500 : loss : 16802.52 , MPE :   60.52. validation loss : 16653.91 , MPE :   60.47
epoch 600 : loss : 1028.21 , MPE :   50.73. validation loss :  861.63 , MPE :   50
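
The very large initial loss hints at the scale of the target. An MPE of 100.00 at epoch 0 is consistent with predictions close to zero, in which case the MSE is roughly the mean of the squared targets and its square root approximates the typical target magnitude. A small arithmetic check using the epoch-0 loss printed above (reading the result as a density-like quantity is an assumption, suggested only by the output file name `omer_data_density.xlsx`):

```python
import math

# Epoch-0 training loss (MSE) copied from the output above.
initial_mse = 5848202.50

# If predictions start near zero, MSE ~ mean(y**2), so sqrt(MSE) ~ typical |y|.
typical_target = math.sqrt(initial_mse)
print(round(typical_target, 1))
```

A target of this magnitude is far larger than the strength values seen in the earlier cells, which is why the loss values in this run are not comparable with those of the earlier models.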