In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from collections import Counter
import itertools

from IPython.display import clear_output
from IPython.display import HTML

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

import xgboost
from xgboost import XGBRegressor

In [3]:
import sys
sys.path.append('../judeml')

from judeml.regression.automate import judeml_execute as regressor

  from tqdm.autonotebook import tqdm


# Formulating the Optimum Concrete Mix Using Machine Learning
By **Oonre Advincula-Go**

## Executive Summary

Concrete is all around us and is an integral part of our world. 

Using data from the UCI ML Repository that contains information on the ingredients in concrete mixes as well as their tested compressive strengths, we want to predict the compressive strength of a new concrete mix based only from its ingredients.

Exploratory data analysis was done - pair plots were made and it was determined that among the ingredients of a concrete mix, cement has the greatest effect on concrete compressive strength. Different regression algorithms were benchmarked using Jude's Automated System and the best algorithm was determine to be Gradient Boosting. Since GBM was already determined to provide the best accuracy, XGBoost was used and provided even better results than GBM. The highest accuracy obtained using XGBoost was 93.66% (averaged over 10 trials with different train-test splits). 

In order to formulate the optimum concrete mix, the prices for each ingredient were be canvassed, and a brute-force iteration method was tried, where all possible concrete mixes were to be entered into the model, and the predicted strength of each taken. The optimum concrete mix is the formulation that yields a desired strength for a minimum price. However, this method failed due to the sheer number of possible permutations - iterating through them all would take several lifetimes. Thus, the data was revisited and the optimum concrete mix was taken from the samples.

To obtain a compressive strength of 40 MPa (this is the threshold of high-strength concrete), the optimum concrete mix was found to be:

* cement: 139.6kg
* slag: 209.4kg
* flyash: 0.0kg
* water: 192.0kg
* superplasticizer: 0.0kg
* coarseaggregate: 1047.0kg
* fineaggregate: 806.9kg

which obtained a strength of 44.7 MPa, for a price of P2304.50 per cubic meter (yields 19.40 KPa per peso).

## 1. Introduction

Take a look outside. Do you see buildings all around you? If you are reading this, you are likely in Makati City, so chances are, you do. Whether short, or tall, chances are a vast majority of these structures are built chiefly from concrete. The very floor you step on, though it may be laid with tiles or some other material, is underlain by concrete. The walls that surround you, the columns that hold them up, even the driveways and roads outside are all made of concrete. Needless to say, concrete is integral to human life.

Concrete is a mixture of sand, gravel, crushed rock, or other aggregates held together in a
rocklike mass with a paste of cement and water. Sometimes one or more admixtures are added
to change certain characteristics of the concrete such as its workability, durability, and time of
hardening. Concrete has a high compressive strength and a very
low tensile strength. (McCormac & Brown, 2014)

### 1.1 Components of a Concrete Mixture

<img src='concrete_mix.jpg' width='500px'>

Since we now know that concrete is composed of a variety of materials, which altogether make up the concrete mix. Among these ingredients, only cement, water, and aggregates are essential, but proportioning these ingredients differently can yield a wide range of results. Concrete takes about 28 days of hardening to achieve most of its full strength. Though it still continues to harden after that, strength gained after 28 days is negligible and marginal.

#### Ingredients
* Cement - usually *Portland cement*, a limestone-based powder that serves as a binder. As a trivia, Portland cement is not named for the city of Portland, Oregon in the United States, but for a kind of quarry stone known as *Portland stone* from the Isle of Portland, Dorset, UK. Portland cement is thought to resemble this stone.
* Water - when mixed with Portland cement, the two undergo a chemical reaction known as *hydration*, in which the major compounds in cement form chemical bonds with water molecules.
* Aggregates - particulate material that provide the strength and hardness to the concrete.
    * Coarse Aggregate - gravel
    * Fine Aggregate - sand
* Admixtures - chemicals that are added to the concrete mix to impart some desired properties or modifications. Some admixtures are used to speed up concrete curing or hardening, some are used to slow down the same process, others are used to reduce the required amount of water for the mix, and yet others are used to make the concrete mix highly fluid and easily workable (easy to mix). This last type of admixture I have mentioned is known as a *superplasticizer*. Admixtures are optional.

### 1.2 Compressive Strength

Concrete is used because of its high *compressive strength* - the ability to bear crushing load without giving way. For concrete, compressive strength is usually measured in MegaPascals (MPa), which is weight in Newtons that concrete can bear per square millimeter of its area.

To measure the strength of a concrete mix, some amount of the freshly poured concrete is placed in a cylindrical mold, which is then hardened, typically for 28 days. The concrete cylinder is then taken out of its mold and crushed. The pressure required to crush the cylinder is logged as the concrete's compressive strength. One can immediately see the disadvantage in measuring concrete strength by this method - it will take 28 days to know the strength of the concrete, by which time it will be too late to easily rectify any mistakes if the concrete is found to be substandard.

### 1.3 Objectives of the Study

As stated above, the current method of measuring concrete strength is disadvantageous mainly because of the 28-day delay. As an example, I want to pour concrete for the 12th floor of my building. I do so, and also take some of the mix and mold it into cylinders. 28 days later, I find out that the concrete crushed at 30 MPa, when the design called for 40 MPa. I have several courses of action at this point, none of which are easy to implement. These choices include retrofitting at best, and in a worst-case scenario, demolishing the newly-built level and redoing the work.

1. Using data on the ingredients of a concrete mix, we want to be able to reliably predict the compressive strength of a concrete mix, without waiting 28 days just to crush it.
2. By creating an accurate regression model, we want to optimize the cheapest possible concrete mix that can achieve a desired compressive strength (an example is 40 MPa, which is the minimum for high-strength concrete).

### 1.4 Limitation of the Study

Concrete strength is not solely dependent on its ingredients - there are multiple factors and externalities that can affect the final strength, such as poor mixing practices and the ambient temperature, among others. There will always be error unexplainable by the data.

### 1.5 Summary of the Methodology

1. Data Reading and Preprocessing
    * The data will be opened and checked for completeness, if it will need cleaning and preprocessing.
    * The features will be analyzed and discussed.
2. Exploratory Data Analysis
    * Pair plots will be done to determine relationships between variables, or between variables and the target.
    * Correlations to determine if features are independent and whether some of them may be dropped.
3. Machine Learning
    * Different regression algorithms will be benchmarked using Jude's Automated System to determine the best algorithm to use.
    * The hyperparameters of the best algorithm will be tuned using GridSearchCV.
    * The objective of this section is to predict the compressive strength of a concrete mix to as high an accuracy as possible.
4. Formulating the Optimum Concrete Mix
    * Prices for each ingredient will be canvassed
    * A brute-force iteration method will be tried, where all possible concrete mixes will be entered into the model, and the predicted strength of each taken. The optimum concrete mix is the formulation that yields a desired strength for a minimum price. However, this method may fail due to the sheer number of possible permutations.
    

## 2. Data

### 2.1 Reading the Data

In [None]:
path = 'concrete_data.csv'

In [None]:
df_concrete = pd.read_csv(path)
df_concrete.head()

In [None]:
df = df_concrete.copy()

In [None]:
df.shape

We have a 1030 x 9 DataFrame, with the target column being `csMPa`.

In [None]:
df.isna().sum()

Fortunately for us, the data is already clean, and there are no null values or placeholders. Thus, the data can be used directly (no need for preprocessing).

### 2.2 Data Features

Data was taken from Kaggle but is originally from the UCI Machine Learning Repository.

#### Features

The features of the data describe the number of kilograms of each ingredient per cubic meter of a concrete mixture.


| Name | Data Type | Measurement | Description |
| ------------- | ------------- | ----- | ----- |
| Cement (component 1) | quantitative | kg in a m3 mixture | Input Variable |
| Blast Furnace Slag (component 2) | quantitative | kg in a m3 mixture | Input Variable |
| Fly Ash (component 3) | quantitative | kg in a m3 mixture | Input Variable |
| Water (component 4) | quantitative | kg in a m3 mixture | Input Variable |
| Superplasticizer (component 5) | quantitative | kg in a m3 mixture | Input Variable |
| Coarse Aggregate (component 6) | quantitative | kg in a m3 mixture | Input Variable |
| Fine Aggregate (component 7) | quantitative | kg in a m3 mixture | Input Variable |
| Age | quantitative | Day (1~365) | Input Variable |
| Concrete compressive strength | quantitative | MPa | Output Variable |


Note that we now have two additional components not discussed earlier: Blast Furnace Slag and Fly Ash. Blast Furnace Slag is a by-product of iron/steel smelting, and, being cementitious, is thought to increase the strength of concrete if added.

Fly Ash is a particulate by-product of coal combustion. It is thought to have pozzolanic (cementitious) properties as well, adding to the strength of concrete. However, some dismiss fly ash as a filler or extender.

In [None]:
df_concrete.describe()

We see that the data distributions vary per column - some ingredients are added liberally to mixes such as coarse aggregate, and some are present in minimal amounts such as the superplasticizer. This is expected, since aggregate (sand and gravel) should make up the bulk of the concrete because it provides the concrete's bulk and hardness.

## 3. EDA

As with most analytics projects, an exploratory data analysis or EDA will be performed. EDA is very important to any data project as it allows a preliminary look not just into the data structure, but also into the data itself. We might learn about the distribution of each variable, or relationships between variables. These sometimes give us insights that can be helpful when performing Machine Learning.

### 3.1 The Target

The variable of interest in this dataset is the compressive strength - column `csMPa` in the DataFrame.

In [None]:
plt.figure(dpi=120)
df['csMPa'].plot.hist(bins=15, color='green')
plt.ylabel('Distribution of Compressive Strength')
plt.title('Strength in MPa')

The spread is quite significant - we have some weak concrete at as low as 10 MPa or less, and some very high strength mixes at more than 80 MPa,  which is the very upper limit of obtainable strength with concrete.

### 3.2 Pair Plots

In order to obtain a visualization of the relationships between variables as well as the distributions, it is appropriate to construct pair-wise plots, or "pair plots".

In [None]:
prp = sns.pairplot(df)

Just by looking at the pair plots, it is difficult to spot relationships, as most of the features seem to be uncorrelated. However, we do see a weak positive trend between `cement` and `csMPa`.

In [None]:
plt.figure(figsize=(10, 9))
sns.heatmap((df.corr()), annot=True, annot_kws={"size": 16}, center=0, square=True, fmt='.2f', cmap='RdGy_r')
plt.show()

Strong relationships:
* `water` and `superplasticizer` - superplasticizer makes concrete more fluid and reduces the required water content. Thus, this is expected.
* `csMPa` and `cement` - moderately strong positive correlation. More cement means stronger concrete. This is also expected.

### 3.3 EDA Conclusion

Features need not be dropped since they are not too highly correlated so as to justify manual feature reduction.

## 4. Preparation of Data for Machine Learning

### 4.1 Feature Engineering

In [None]:
df.head()

The data is already clean, so we can proceed directly to feature engineering. As stated earlier, since Franco has already done the water/cement ratio, we will perform some other feature engineering in the interest of originality.

As stated in the introduction, aging cement beyond 28 days has a negligible effect on its strength. Thus, to reduce noise, we will replace the age of all samples above 28 days with 28.

In [None]:
df_28 = df.copy()
df_28['age'] = df_28['age'].apply(lambda x: 28 if x > 28 else x)

### 4.2 Preparation of the Target Column and the Feature DataFrame

In [None]:
target = df['csMPa']
df_features = df.drop(columns='csMPa')

In [None]:
target_28 = df_28['csMPa']
df_features_28 = df_28.drop(columns='csMPa')

## 5. Machine Learning Using JUDAS

**JUDAS** is Jude's automated system, created by Jude Teves, an ace student of the MSDS 2019 batch. We credit him for his work and make us of the automated learner in order to ascertain the best model for us to use.

### 5.1 JUDAS for Original Data

In [None]:
rgs = JudasRegressor()
trials=10

params = [
    {'model': 'knn', 'trials': trials, 'k': range(1,30), 'scaler': MinMaxScaler()},
    {'model': 'linear', 'trials': trials},
    {'model': 'lasso', 'trials': trials},
    {'model': 'ridge', 'trials': trials},
#     {'model': 'svm', 'trials': trials}, # takes too long!!!
#     {'model': 'svm-rbf', 'trials': trials}, # low accuracy
#     {'model': 'svm-poly', 'trials': trials}, # takes too long!!!
    {'model': 'ensemble-decisiontree', 'trials': trials, 'maxdepth': range(1,20)},
    {'model': 'ensemble-randomforest', 'trials': trials, 'n_est': range(1,20)},
    {'model': 'ensemble-gbm', 'trials': trials, 'maxdepth': range(1,10)},
]
rgs.automate(df_features, target, params)
rgs.score()

In [None]:
rgs.plot_accuracy()

### 5.2 JUDAS for 28-day Limited Concrete

We now use JUDAS on the concrete who age had been limited to 28 days.

In [None]:
rgs = JudasRegressor()
trials=10

params = [
    {'model': 'knn', 'trials': trials, 'k': range(1,30), 'scaler': MinMaxScaler()},
    {'model': 'linear', 'trials': trials},
    {'model': 'lasso', 'trials': trials},
    {'model': 'ridge', 'trials': trials},
#     {'model': 'svm', 'trials': trials}, # takes too long!!!
#     {'model': 'svm-rbf', 'trials': trials}, # low accuracy
#     {'model': 'svm-poly', 'trials': trials}, # takes too long!!!
    {'model': 'ensemble-decisiontree', 'trials': trials, 'maxdepth': range(1,20)},
    {'model': 'ensemble-randomforest', 'trials': trials, 'n_est': range(1,20)},
    {'model': 'ensemble-gbm', 'trials': trials, 'maxdepth': range(1,10)},
]
rgs.automate(df_features_28, target_28, params)
rgs.score()

Consistent with our expectations we got dramatically improved results for the kNN and Linear Models, but contrary to our expectations, worse results for the ensemble models. This is a testament to the power of ensemble models, as they were able to capture the marginal effect of concrete hardening after 28 days.

Moving forward, we will use the original, unmodified data.

## 6. Machine Learning Using XGBoost

### 6.1 XGBoost on Original Data

We got a high $R^2$ accuracy of 92.52% with the Gradient Boosting Regressor. But can we go higher than that? To attempt to do so, we will make use of XGBoost, an improved implementation of the GBM.

In [None]:
param_grid = {
    'learning_rate': [0.01, 0.1, 0.15],
    'max_depth': [3, 4, 5, 7, 10],
    'n_estimators': [100, 250, 500, 750]
}

estxg = XGBRegressor(random_state=0, silent=True)

gs_cv_xg = GridSearchCV(estxg, param_grid, cv=3, verbose=1).fit(df_features, target)
print(gs_cv_xg.best_params_)

In [None]:
n_trials = 10
xgb_scores = []
for i in range(n_trials):
    X_train, X_test, y_train, y_test = train_test_split(df_features, target, test_size=0.25, random_state=i)
    xgbr = XGBRegressor(learning_rate=0.15, max_depth=3, n_estimators=750, random_state=0, silent=True)
    xgbr.fit(X_train, y_train)
    xgb_scores.append(xgbr.score(X_test, y_test))

In [None]:
mean_xgb_score = np.mean(xgb_scores)
print('Average score of XGBoost over {} trials: {:.2f}%'.format(n_trials, mean_xgb_score*100))

Averaged over 10 trials, we were able to obtain an accuracy of **93.66%**, better than with GBM.

### 6.2 Applying Learnings: Testing the Water/Cement Ratio

At this point, I will compute the water-cement ratio in order to obtain a higher accuracy. However, I will cap my best accuracy at 93.66% to avoid grabbing credits. Acknowledgments to Ray Franco Rivera for the idea.

In [None]:
df_features_eng = df_features.copy()
df_features_eng['wc_ratio'] = df_features_eng['water'] / df_features_eng['cement']
df_features_eng = df_features_eng.drop(columns=['water', 'cement'])

In [None]:
n_trials = 10
xgb_scores = []
for i in range(n_trials):
    X_train, X_test, y_train, y_test = train_test_split(df_features_eng, target, test_size=0.25, random_state=i)
    xgbr_e = XGBRegressor(learning_rate=0.15, max_depth=3, n_estimators=750, random_state=0, silent=True)
    xgbr_e.fit(X_train, y_train)
    xgb_scores.append(xgbr_e.score(X_test, y_test))
mean_xgb_score = np.mean(xgb_scores)
print('Average score of XGBoost over {} trials: {:.2f}%'.format(n_trials, mean_xgb_score*100))

Indeed, doing this feature engineering slightly improves the accuracy. This is because simply adding more cement to the mix will not necessarily increase the strength, since every mass of concrete added needs to have a corresponding water mass added as well to hydrate it.

### 6.3 Final Results of the Machine Learning

From our Machine Learning runs with JUDAS and XGBoost, we were able to predict the strength of a concrete mix to an $R^2$ accuracy of **93.66%**.

The following model was used:
* XGBoosting Regressor (average over 10 trials) with the following hyperparameters:
    * `learning_rate` = 0.15
    * `max_depth` = 3
    * `n_estimators` = 750
    
When following in Franco's lead and performing feature engineering (obtaining the water/cement ratio), we were able to bump the accuracy up slightly to **94.20%**.

#### Feature Importances (Original Data)

In [None]:
fig, ax = plt.subplots(1, 1, dpi=125)
xgboost.plot_importance(xgbr, ax=ax)
plt.show()

#### Feature Importances (With Water-Cement Ratio)

In [None]:
fig, ax = plt.subplots(1, 1, dpi=125)
xgboost.plot_importance(xgbr_e, ax=ax)
plt.show()

Age is the most important feature, since concrete hardens over time. The water-concrete ratio is also important, as water and concrete collectively serve as the binder for the concrete, keeping it together. Notably, fly ash is the least important feature. This is a significant finding, as some cements market themselves as containing fly ash. Since fly ash contributes little to the compressive strength of cement, it might be functioning as a filler or extender for concrete. Thus, people should be prudent when purchasing cement that is advertised to have fly ash content.

## 7. Formulating the Optimum Concrete Mix

In section 6, we were able to create a highly accurate model that can predict the strength of a concrete mix with an accuracy of 94.20%.

For this section, our objective is to use the model that we have created to formulate the most economical concrete mix that can still achieve a certain compressive strength. By doing so, we can be able to help construction companies save on concrete costs and build more efficiently.

### 7.1 Ingredient Prices

Prices were canvassed to the best of the author's ability on August 2019.


| Name | Estimated Price per kg PHP |
| ------------- | ------------- |
| Cement (component 1) | 5.50 |
| Blast Furnace Slag (component 2) | 0.80 |
| Fly Ash (component 3) | 1.80 |
| Water (component 4) | 0.00 |
| Superplasticizer (component 5) | 150.00 |
| Coarse Aggregate (component 6) | 0.73 |
| Fine Aggregate (component 7) | 0.76 |


### 7.2 Mixing Round 1: Brute Force

One option for us to optimize an economical concrete mix is to iterate over the entire range of possible values for each of the ingredients using `itertools`. This is a brute force method; we iterate over all possible concrete mixes and get the price and compressive strength for each. The optimum concrete mix for a strength of, say, 40 MPa, will be the mix with the minimum price that has a strength of at least 40 MPa.

In [None]:
cp = 5.5
sp = .8
fp = 1.8
wp = 0.
pp = 150.
gp = 0.73
sp = 0.76

In [None]:
cement = np.arange(102, 540+1)
slag = np.arange(0, 359+1)
flyash = np.arange(0, 200+1)
water = np.arange(122, 247+1)
superplasticizer = np.arange(0, 32+1)
coarseaggregate = np.arange(801, 1145+1)
fineaggregate = np.arange(594, 992+1)
age = (28,)

In [None]:
total_iters = len(cement)*len(slag)*len(flyash)*len(water)*len(superplasticizer)*len(coarseaggregate)*len(fineaggregate)*len(age)
total_iters

In [None]:
def price(x):
    return x[0]*cp+x[1]*sp+x[2]*fp+x[3]*wp+x[4]*pp+x[5]*gp+x[6]*sp

def strength(x):
    comp_str = xgbr_e.predict(pd.DataFrame({'slag': [x[1]], 
                             'flyash': [x[2]], 
                             'superplasticizer': [x[4]], 'coarseaggregate': [x[5]],
                            'fineaggregate': [x[6]], 'age': [28], 'wc_ratio': [x[3]/x[0]]}))[0]
    return comp_str

In [None]:
# i = 1
# lowest_price = 50000
# for p in itertools.product(cement, slag, flyash, water, superplasticizer, coarseaggregate, fineaggregate, age):
#     x = p
#     cstr_ = strength(x)
#     price_ = price(x)
#     if not i % 1000:
#         print('Trial {} of {}'.format(i, total_iters))
#         print('--------------')
#         print('Progress: {:.2f}%'.format(i/total_iters*100))
#         clear_output(wait=True)
#     if cstr_ < 40.:
#         i += 1
#         continue
#     else:
#         if price_ < lowest_price:
#             best_mix = x
#             global_price = price_
#             global_strength = cstr_
#     i += 1
    

Although the code is ready, we see that there are 18181912114119600 total permutations, and iterating through all of them will take several lifetimes, if not an eternity. Even increasing the step of arange barely helps. Thus, we will not proceed with this analysis.

### 7.3 Mixing Round 2: Using Available Samples

We were unable to produce the optimum concrete mix through brute-force iteration. However, we already have a sample of about 1000 different concrete mixes in our data. Thus, we will get the most optimal among these samples.

Defining a function that calculates the price of each concrete mix:

In [None]:
def get_prices(row):
    price = 0
    price += row['cement']*cp + row['slag']*sp + row['flyash']*fp
    price += row['water']*wp + row['superplasticizer']*pp
    price += row['coarseaggregate']*gp + row['fineaggregate']*sp
    return price

In [None]:
prices = df.apply(lambda row: get_prices(row), axis=1)
df['price_php'] = prices
df.head()

Below, we define a function that calculates the economy of a mix i.e. unit compressive strength provided per peso. Since the values will be very small, we will multiply this value by 1000, in effect taking the KiloPascals (KPa) provided per peso.

In [None]:
def get_econ(row):
    econ = row['csMPa']*1000 / row['price_php']
    return econ
econ = df.apply(lambda row: get_econ(row), axis=1)
df['csKPa_per_php'] = econ
df.head()

In [None]:
plt.figure(figsize=(8, 5), dpi=115)
plt.plot(df['csMPa'], df['csKPa_per_php'], '.')
plt.title('Economy of a Concrete Mix (compressive strength provided per PHP) versus Price in PHP', fontsize=12)
plt.ylabel('Compressive Strength (KPa) provided per PHP cost')
plt.xlabel('Compressive Strength (MPa)')
plt.show()

We notice two things - as the compressive strength increases, the economy also improves, meaning that higher-strength concrete mixes are cheaper per unit strength provided. However, the data is also heteroskedastic - the spread of the data increases as compressive strength increases. This means that many inefficient mixes exist, and it is our task to identify the most economical mix per strength range.

In [None]:
ini_lower_bound = 30
ini_upper_bound = 35
for i in range(11):
    lower_bound = ini_lower_bound + i*5
    upper_bound = ini_upper_bound + i*5
    opti = df[(df['csMPa'] >= lower_bound) & (df['csMPa'] < upper_bound)].sort_values('csKPa_per_php', ascending=False)
    display(HTML('<h4>7.3.{} Optimum Concrete Mix for Strength Range {} to {} MPa'.format(i+1, lower_bound, upper_bound)))
    display(opti.head(1))
    display(HTML('The ingredients for the optimum concrete Mix in the strength range {} to {} MPa are:'.format(lower_bound, upper_bound)))
    column_str = '<ul>'
    for col in opti.columns:
        if col != 'csMPa' and col != 'price_php' and col != 'csKPa_per_php' and col != 'age':
            column_str += '<li><b>{}</b>: {}kg</li>'.format(col, opti[col].iloc[0])
    column_str += '</ul>'
    display(HTML(column_str))

## 8. Conclusion and Recommendation

In attempting to predict the strength of a concrete mix solely from its ingredients, we were able to obtain an $R^2$ accuracy of **93.66%**, averaged over 10 trials, using XGBoost. This means that concrete compressive strength is quite predictable as early as when it is mixed. Most of the error may be attributable to inconsistencies in the mixing process - concrete strength is not solely dependent on its ingredients, since it can be affected by poor mixing, the ambient temperature, and other factors. Nevertheless, a 94% accuracy is quite high and we can confidently say that it is possible to reliably predict the strength of a concrete mix. This can be very useful, especially because the prevailing method for computing concrete strength (not testing but computing) is very tedious and prone to inaccuracy.

However, I would not advocate this method as a substitute for proper materials testing such as actual measuring of concrete strength by crushing. This is because many factors affect the strength of concrete, particularly the mixing procedure. It is possible to formulate a concrete mix that has an expected strength of 70 MPa, and only obtain 50 MPa in actuality.

## 9. Acknowledgments

* Jude Teves, for his work on the automated system JUDAS.
* Radney Racela for his work on improving JUDAS especially with the aesthetic redesign using `tqdm`.
* Ray Franco Rivera, for his idea on feature engineering. I did not his method for my final accuracy of 93.66%, but still did so to investigate its effect on the accuracy and sure enough, it bumped the accuracy up to 94.20%.
* Prof. Christopher Monterola for the lecture notebooks.

## 10. References

* Design of Reinforced Concrete, Tenth Edition. 2016. Jack C. McCormac, Russell H. Brown. John Wiley & Sons.<br />
* https://www.cement.org/cement-concrete-applications/concrete-materials/chemical-admixtures

### 10.1 Images

https://www.giatecscientific.com/education/concrete-mix-design-just-got-easier/