# Regression Models with Keras

## Objectives for this Notebook
* Use the Keras library to build a regression model
* Download and clean the data set
* Build a neural network
* Train and test the network

## Install required libraries

In [None]:
!pip install numpy==2.0.2
!pip install pandas==2.2.2
!pip install tensorflow_cpu==2.18.0

In [None]:
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

In [None]:
import pandas as pd
import numpy as np
import keras

import warnings
warnings.simplefilter('ignore', FutureWarning)

## Download and Clean dataset

In [None]:
filepath='https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv'
concrete_data = pd.read_csv(filepath)

concrete_data.head()

In [None]:
# check num of data points
concrete_data.shape

In [None]:
# summary statistics for each column
concrete_data.describe()

In [None]:
# check the dataset for missing values
concrete_data.isnull().sum()

## Split data into predictors and target

In [None]:
concrete_data_columns = concrete_data.columns

In [None]:
predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column

In [None]:
# sanity check
print(predictors.head())
print(target.head())
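The boolean-mask selection above can also be written with `DataFrame.drop`. The sketch below checks that the two forms select the same predictors on a small made-up frame; the toy values and the `toy`, `by_mask`, and `by_drop` names are assumptions for illustration, not part of the concrete dataset.

```python
import pandas as pd

# Toy stand-in for concrete_data; the values are made up for illustration.
toy = pd.DataFrame({
    'Cement': [540.0, 332.5, 198.6],
    'Water': [162.0, 228.0, 192.0],
    'Strength': [79.99, 40.27, 44.30],
})

# Boolean-mask selection, as in the cell above
cols = toy.columns
by_mask = toy[cols[cols != 'Strength']]

# Equivalent selection with DataFrame.drop
by_drop = toy.drop(columns='Strength')

# Both approaches keep every column except the target
same_predictors = by_mask.equals(by_drop)
print(same_predictors)
print(list(by_drop.columns))
```

Either form works; `drop` states the intent (everything except the target) a little more directly.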


## Normalize data
The last preprocessing step is to normalize the data by subtracting each column's mean and dividing by its standard deviation.

In [None]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()
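To make the formula concrete, here is a minimal sketch of the same normalization applied to one column using only the standard library. The `column` values are hypothetical. `statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is also the default used by pandas' `std()`.

```python
import statistics

# Hypothetical values for a single predictor column
column = [2.0, 4.0, 4.0, 6.0, 9.0]

mean = statistics.mean(column)
std = statistics.stdev(column)  # sample standard deviation (n - 1)

# Subtract the mean and divide by the standard deviation
normalized = [(x - mean) / std for x in column]

# After normalization the column has mean ~0 and standard deviation ~1
print(round(statistics.mean(normalized), 10))
print(round(statistics.stdev(normalized), 10))
```

Putting every predictor on a comparable scale this way keeps columns with large raw values from dominating the early weight updates.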

In [None]:
# save the number of predictors to n_cols since we will need this number when building our network.
n_cols = predictors_norm.shape[1] # number of predictors

## Import Keras Packages

In [None]:
# to build regression model
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Input

## Build a Neural Network

In [None]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Input(shape=(n_cols,)))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
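As a rough illustration of what each `Dense` layer computes, the sketch below runs a forward pass by hand: every unit takes a weighted sum of its inputs plus a bias, and the hidden layer then applies ReLU. The tiny layer sizes, the weights, and the `dense` and `relu` helpers are made up for this example; they are not Keras internals.

```python
def relu(values):
    """Replace negative values with zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases, activation=None):
    """Compute one dense layer; weights[i][j] connects input i to unit j."""
    outputs = []
    for j in range(len(biases)):
        total = biases[j]
        for i, x in enumerate(inputs):
            total += x * weights[i][j]
        outputs.append(total)
    return activation(outputs) if activation else outputs

# Hypothetical sample with 2 predictors, a 3-unit hidden layer, and 1 output
x = [1.0, -2.0]
hidden_w = [[0.5, -1.0, 2.0],
            [1.0, 0.5, 0.0]]
hidden_b = [0.0, 3.0, -1.0]
out_w = [[1.0], [2.0], [-0.5]]
out_b = [0.25]

hidden = dense(x, hidden_w, hidden_b, activation=relu)
prediction = dense(hidden, out_w, out_b)  # no activation: plain regression output

print(hidden)
print(prediction)
```

The final layer has a single unit and no activation, mirroring `Dense(1)` above, so the network can output any real value as its predicted strength.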

## Train and Test network

In [None]:
# build model
model = regression_model()

In [None]:
# Next, we will train the model with the fit method and validate it at the same time.
#   Keras holds out the last 30% of the rows (validation_split=0.3) and reports the
#   validation loss after each of the 100 epochs.
# fit the model
model.fit(predictors_norm, target, validation_split=0.3, epochs=100, verbose=2)
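According to the Keras documentation, `validation_split` takes the validation data from the last samples of the inputs, before any shuffling. The sketch below mimics that idea on row positions only; `split_like_validation_split` is a hypothetical helper written for this example, not a Keras function.

```python
def split_like_validation_split(n_samples, validation_split):
    """Return (train_indices, val_indices), taking validation rows from the end."""
    split_at = int(n_samples * (1 - validation_split))
    indices = list(range(n_samples))
    return indices[:split_at], indices[split_at:]

# With 10 rows and validation_split=0.3, the last 3 rows are held out
train_idx, val_idx = split_like_validation_split(10, 0.3)
print(train_idx)
print(val_idx)
```

Because the held-out rows come from the end of the data, it may be worth shuffling the rows first if the file is sorted in some meaningful order.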

* Adding more hidden layers to the model increases its capacity to learn and represent complex relationships within the data. As a result, the model can fit the training data more closely, which may improve its predictions.
* Reducing the proportion of data set aside for validation leaves a larger portion for training, so the model has more examples to learn from. This additional training data can help the model capture the underlying trends and may lead to better overall performance, although the validation loss is then estimated from fewer examples.
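Both points can be put into rough numbers. The sketch below counts the trainable parameters of a stack of `Dense` layers (weights plus biases) and the number of training rows for two validation percentages. `n_cols = 8` and `n_rows = 1000` are assumed values for illustration, and `count_dense_params` and `training_rows` are hypothetical helpers.

```python
def count_dense_params(n_inputs, layer_widths):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for width in layer_widths:
        total += fan_in * width + width  # weight matrix plus one bias per unit
        fan_in = width
    return total

def training_rows(n_rows, validation_percent):
    """Rows left for training after holding out a percentage for validation."""
    return n_rows - (n_rows * validation_percent) // 100

n_cols = 8     # assumed number of predictors
n_rows = 1000  # assumed number of data points

two_hidden = count_dense_params(n_cols, [50, 50, 1])
three_hidden = count_dense_params(n_cols, [50, 50, 50, 1])
print(two_hidden, three_hidden)

print(training_rows(n_rows, 30), training_rows(n_rows, 20))
```

Under these assumptions, a third 50-unit hidden layer adds 2,550 parameters, and moving from a 30% to a 20% validation split frees 100 more rows for training.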