# Building a Regression Model with Keras: Predicting Concrete Compressive Strength from Multivariate Data

Author: [Arnau Gómez](https://www.arnaugomez.com)

Peer-graded Assignment: Build a Regression Model in Keras

Exercise A: Build a baseline model (5 marks) 

## Initial setup

Install dependencies

In [None]:
# Python version: 3.11.9
%pip install --upgrade pip
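# scikit-learn is needed later for train_test_split and mean_squared_error (version left unpinned)
%pip install --upgrade pandas==2.2.3 keras==3.7.0 tensorflow==2.18.0 numpy==2.0.2 scikit-learn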

Import dependencies

In [7]:
import pandas as pd
import numpy as np
import keras

import warnings
warnings.simplefilter('ignore', FutureWarning)

Import data

In [None]:
filepath='https://cocl.us/concrete_data'
concrete_data = pd.read_csv(filepath)


concrete_data.head()

Split data into predictors and target

In [None]:
predictors = concrete_data.drop(columns=['Strength'])
predictors.head()

In [None]:
target = concrete_data['Strength']
target.head()

Normalize the data

In [None]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()
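
As a quick sanity check on what this z-score step does, the sketch below applies the same formula to a small made-up array with NumPy. The values are hypothetical and not taken from the concrete dataset. `ddof=1` is passed explicitly because pandas' `std()` uses the sample standard deviation by default, while NumPy's `std()` does not.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features (not from the concrete dataset)
toy = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0],
                [4.0, 40.0]])

# Same formula as above; ddof=1 mirrors the pandas default for std()
toy_norm = (toy - toy.mean(axis=0)) / toy.std(axis=0, ddof=1)

print(toy_norm.mean(axis=0))         # each column mean is numerically zero
print(toy_norm.std(axis=0, ddof=1))  # each column has unit sample standard deviation
```

After normalization every feature is on the same scale, so no single column dominates the gradient updates just because of its units.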

Compute number of input features

In [None]:
n_predictors = predictors_norm.shape[1]

## Build the Keras model

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Input

# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Input(shape=(n_predictors,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# build the model
model = regression_model()

# display model summary
model.summary()
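
The parameter counts reported by `model.summary()` can be checked by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. The sketch below does that arithmetic with a hypothetical helper, `dense_param_count`, and assumes 8 input features. Substitute the actual value of `n_predictors` if it differs.

```python
def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Assumption: 8 input features; replace with the notebook's n_predictors
n_features = 8

hidden = dense_param_count(n_features, 10)  # hidden layer with 10 units
output = dense_param_count(10, 1)           # single-unit output layer
total = hidden + output
print(hidden, output, total)
```

Under that assumption the hidden layer has 90 parameters, the output layer 11, and the model 101 in total.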


## Train and test the model

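Train the model once. With `validation_split=0.3`, Keras holds out the last 30% of the rows (taken before any shuffling) and reports the loss on them after each epoch: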

In [None]:
model.fit(predictors_norm, target, validation_split=0.3, epochs=50, verbose=2)

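Repeat the process 50 times. Each iteration draws a fresh random 70/30 train/test split, trains a new model for 50 epochs, and records the mean squared error on the test set: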

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Initialize list of mean squared errors
mse_list = []

# Repeat the process 50 times
for _ in range(50):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3, random_state=None)
    
    # Build the model
    model = regression_model()
    
    # Train the model
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    # Predict on the test data (verbose=0 suppresses the per-call progress bar)
    y_pred = model.predict(X_test, verbose=0)
    
    # Compute mean squared error
    mse = mean_squared_error(y_test, y_pred)
    mse_list.append(mse)
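
For reference, the metric computed in the loop is the average squared difference between targets and predictions. The sketch below reproduces it with plain NumPy on made-up numbers. It also illustrates a shape pitfall when computing the error by hand: Keras predictions have shape `(n, 1)` while the targets have shape `(n,)`, so subtracting them directly broadcasts to an `(n, n)` matrix. Flattening the predictions first avoids that.

```python
import numpy as np

# Hypothetical values: targets with shape (3,), predictions with shape (3, 1)
y_true = np.array([1.0, 2.0, 3.0])
y_hat = np.array([[1.5], [2.0], [2.0]])

# Pitfall: (3,) minus (3, 1) broadcasts to a (3, 3) matrix of all pairwise differences
pairwise = y_true - y_hat
print(pairwise.shape)

# Flatten the predictions so the subtraction is element-wise
errors = y_true - y_hat.ravel()
mse_manual = np.mean(errors ** 2)
print(mse_manual)
```

Here the squared errors are 0.25, 0 and 1, so the manual MSE is 1.25 / 3.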


## Report the mean and standard deviation of the mean squared errors

In [None]:
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)
print(f'Mean MSE: {mean_mse}')
print(f'Standard Deviation of MSE: {std_mse}')
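
One detail worth keeping in mind when reporting the spread: `np.std` computes the population standard deviation by default (`ddof=0`). If the 50 runs are treated as a sample, `ddof=1` gives the sample standard deviation instead. The sketch below compares the two on made-up numbers that stand in for `mse_list`.

```python
import numpy as np

# Hypothetical stand-in for mse_list (not real results)
mse_toy = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean_toy = np.mean(mse_toy)
std_population = np.std(mse_toy)        # ddof=0: divides by n
std_sample = np.std(mse_toy, ddof=1)    # ddof=1: divides by n - 1

print(f'Mean: {mean_toy}')
print(f'Population std: {std_population}')
print(f'Sample std: {std_sample}')
```

With 50 runs the two estimates differ only slightly, but stating which one is reported makes the result easier to compare with other runs.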