<a href="https://colab.research.google.com/github/habiclaude88/Deep_learning/blob/main/Regression_NN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Regression with the Keras Deep Learning Library in Python**

In this tutorial, you will discover how to develop and evaluate neural network models using Keras for a regression problem.

After completing this step-by-step tutorial, you will know:

How to load a CSV dataset and make it available to Keras

How to create a neural network model with Keras for a regression problem

How to use scikit-learn with Keras to evaluate models using cross-validation

How to perform data preparation in order to improve skill with Keras models

How to tune the network topology of models with Keras

**Problem Description**

The problem that we will look at in this tutorial is the Boston house price dataset.

You can download this dataset and save it to your current working directory with the file name housing.csv.

The dataset describes 13 numerical properties of houses in Boston suburbs and is concerned with modeling the price of houses in those suburbs in thousands of dollars. As such, this is a regression predictive modeling problem. Input attributes include crime rate, the proportion of nonretail business acres, chemical concentrations, and more.

This is a well-studied problem in machine learning. It is convenient to work with because all the input and output attributes are numerical, and there are 506 instances to work with.

Reasonable performance for models evaluated using Mean Squared Error (MSE) is around 20 in thousands of dollars squared (or $4,500 if you take the square root). This is a nice target to aim for with our neural network model.
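As a quick check of the arithmetic above, the sketch below converts the target MSE of 20 into an error in dollars. The variable names are illustrative and not part of the tutorial code.

```python
import math

# Target from the text: an MSE of about 20, in (thousands of dollars) squared.
target_mse = 20.0

# Taking the square root brings the error back to thousands of dollars.
target_rmse = math.sqrt(target_mse)

# Convert from thousands of dollars to dollars.
target_rmse_dollars = target_rmse * 1000

print(f"RMSE: {target_rmse:.2f} thousand dollars (about ${target_rmse_dollars:,.0f})")
```

Rounded to the nearest hundred dollars, this matches the $4,500 figure quoted above.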

In [None]:
!pip install tensorflow

In [None]:
!pip install scikeras

**1. Develop a Baseline Neural Network Model**

In this section, you will create a baseline neural network model for the regression problem.

Let’s start by including all the functions and objects you will need for this tutorial.

In [12]:
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

You can now load your dataset from a file in the local directory.

The dataset is, in fact, not in CSV format in the UCI Machine Learning Repository. The attributes are instead separated by whitespace. You can load this easily using the pandas library. Then split the input (X) and output (Y) attributes, making them easier to model with Keras and scikit-learn.

In [14]:
# load dataset
dataframe = pd.read_csv("housing.csv", sep=r"\s+", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
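If you want to confirm what the two slices select without the data file at hand, the following sketch applies the same indexing to a synthetic array. The random array is only a stand-in with the assumed 506 x 14 layout, not the real data.

```python
import numpy as np

# Synthetic stand-in for dataframe.values: 506 rows, 14 columns (assumed layout).
rng = np.random.default_rng(0)
dataset = rng.random((506, 14))

X = dataset[:, 0:13]   # columns 0..12: the 13 input attributes
Y = dataset[:, 13]     # column 13: the output attribute (house price)

print(X.shape, Y.shape)  # (506, 13) (506,)
```

The slice `0:13` excludes index 13, so the target column never ends up among the inputs.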

You can create Keras models and evaluate them with scikit-learn using handy wrapper objects provided by the SciKeras library. This is desirable, because scikit-learn excels at evaluating models and will allow you to use powerful data preparation and model evaluation schemes with very few lines of code.

The SciKeras wrappers require a function as an argument. This function, which you must define, is responsible for creating the neural network model to be evaluated.

Below, you will define the function to create the baseline model to be evaluated. It is a simple model with a single, fully connected hidden layer with the same number of neurons as input attributes (13). The network uses good practices such as the rectifier activation function for the hidden layer. No activation function is used for the output layer because it is a regression problem, and you are interested in predicting numerical values directly without transformation.

The efficient ADAM optimization algorithm is used, and a mean squared error loss function is optimized. This will be the same metric you will use to evaluate the performance of the model. It is a desirable metric because taking the square root gives an error value you can directly understand in the context of the problem (thousands of dollars).
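To make the loss concrete, here is a minimal sketch of how mean squared error is computed. The prices are made-up values in thousands of dollars, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Made-up prices in thousands of dollars, for illustration only.
y_true = np.array([24.0, 21.6, 34.7])
y_pred = np.array([25.0, 20.6, 30.7])

errors = y_true - y_pred        # roughly [-1, 1, 4]
mse = np.mean(errors ** 2)      # (1 + 1 + 16) / 3

print(f"MSE: {mse:.2f}")
```

Because the errors are squared before averaging, the result is in thousands of dollars squared, and a single large miss (the 4 here) dominates the total.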

In [15]:
# define base model
def baseline_model():
	# create model
	model = Sequential()
	model.add(Dense(13, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model
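The sketch below imitates the forward pass of this 13-13-1 architecture in plain NumPy, to show what "rectifier in the hidden layer, no activation on the output" means in practice. The weights are hypothetical random values, not anything Keras has learned, and the function name `forward` is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical small random weights, standing in for what Keras would learn.
W1 = rng.normal(scale=0.05, size=(13, 13))  # input -> hidden
b1 = np.zeros(13)
W2 = rng.normal(scale=0.05, size=(13, 1))   # hidden -> output
b2 = np.zeros(1)

def forward(x):
    """Forward pass: ReLU hidden layer, then a linear (identity) output."""
    hidden = np.maximum(0.0, x @ W1 + b1)   # rectifier activation
    return hidden @ W2 + b2                 # no activation: any real value

batch = rng.random((5, 13))                 # 5 made-up rows of 13 inputs
predictions = forward(batch)
print(predictions.shape)  # (5, 1)
```

Leaving the output layer linear lets the network predict any real number, which is what a price regression needs.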

The Keras wrapper object used in scikit-learn as a regression estimator is called KerasRegressor. You create an instance and pass it both the name of the function to create the neural network model and some parameters to pass along to the fit() function of the model later, such as the number of epochs and batch size. Both of these are set to sensible defaults.

The final step is to evaluate this baseline model. You will use 10-fold cross validation to evaluate the model.

In [20]:
estimator = KerasRegressor(model=baseline_model, epochs=100, batch_size=5, verbose=0)
kfold = KFold(n_splits=10)
results = cross_val_score(estimator, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Results: -37.00 (26.92) MSE
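To get a feel for what 10-fold cross validation does with 506 rows, the following pure-Python sketch splits the row indices into consecutive folds. It is intended to mirror the unshuffled behaviour of `KFold(n_splits=10)` as I understand it. The helper `kfold_indices` is a hypothetical name, not a scikit-learn function.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for consecutive, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample.
        size = base + 1 if fold < extra else base
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(kfold_indices(506, 10))
test_sizes = [len(test) for _, test in folds]
print(test_sizes)  # [51, 51, 51, 51, 51, 51, 50, 50, 50, 50]
```

Each model is trained on roughly 455 rows and scored on the remaining 50 or 51, so every row is used for testing exactly once.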


In [19]:
# Regression Example With Boston Dataset: Baseline
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
# load dataset
dataframe = read_csv("housing.csv", sep=r"\s+", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
# define base model
def baseline_model():
	# create model
	model = Sequential()
	model.add(Dense(13, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model
# evaluate model
estimator = KerasRegressor(model=baseline_model, epochs=100, batch_size=5, verbose=0)
kfold = KFold(n_splits=10)
results = cross_val_score(estimator, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Baseline: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Baseline: -32.26 (27.19) MSE


Running this code gives you an estimate of the model’s performance on the problem for unseen data.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

Note: The mean squared error is negative because scikit-learn negates the score so that a higher value is always better (the metric is maximized instead of minimized). You can ignore the sign of the result.

The result reports the mean squared error, including the average and standard deviation across all ten folds of the cross-validation evaluation.
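If you prefer to read the scores as ordinary errors, you can flip the sign yourself. The sketch below does this for a few hypothetical per-fold scores. The numbers are invented for illustration and are not results from this notebook.

```python
import numpy as np

# Hypothetical per-fold scores, in the form returned when
# scoring='neg_mean_squared_error' is used.
neg_mse_scores = np.array([-16.0, -25.0, -36.0])

mse_per_fold = -neg_mse_scores          # flip the sign to recover ordinary MSE
rmse_per_fold = np.sqrt(mse_per_fold)   # back to thousands of dollars

print(rmse_per_fold)  # [4. 5. 6.]
```

Applied to the real `results` array, the same two lines would give a per-fold error in thousands of dollars.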

**2. Modeling the Standardized Dataset**

Regression Tutorial with the Keras Deep Learning Library in Python
by Jason Brownlee on June 9, 2016 in Deep Learning
Last Updated on August 5, 2022

Keras is a deep learning library that wraps the efficient numerical libraries Theano and TensorFlow.

Jun/2016: First published
Update Mar/2017: Updated for Keras 2.0.2, TensorFlow 1.0.1 and Theano 0.9.0
Update Mar/2018: Added alternate link to download the dataset as the original appears to have been taken down
Update Apr/2018: Changed nb_epoch argument to epochs
Update Sep/2019: Updated for Keras 2.2.5 API
Update Jul/2022: Update for TensorFlow 2.x syntax with SciKeras

An important concern with the Boston house price dataset is that the input attributes all vary in their scales because they measure different quantities.

It is almost always good practice to prepare your data before modeling it using a neural network model.

Continuing from the above baseline model, you can re-evaluate the same model using a standardized version of the input dataset.

You can use scikit-learn’s Pipeline framework to perform the standardization during the model evaluation process within each fold of the cross validation. This ensures that there is no data leakage from each test set cross validation fold into the training data.

The code below creates a scikit-learn pipeline that first standardizes the dataset and then creates and evaluates the baseline neural network model.

In [21]:
# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(model=baseline_model, epochs=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Standardized: -30.84 (28.24) MSE
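To see why putting the scaler inside the pipeline avoids leakage, the sketch below reproduces by hand what happens within a single fold: the mean and standard deviation come from the training rows only and are then reused on the held-out rows. The data and the fold boundaries are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: two features on very different scales.
X_demo = np.column_stack([
    rng.normal(0.5, 0.1, 100),
    rng.normal(300.0, 50.0, 100),
])

train_idx = np.arange(0, 90)    # hypothetical training fold
test_idx = np.arange(90, 100)   # hypothetical held-out fold

# Fit the scaling statistics on the training fold only.
mean = X_demo[train_idx].mean(axis=0)
std = X_demo[train_idx].std(axis=0)

X_train = (X_demo[train_idx] - mean) / std
X_test = (X_demo[test_idx] - mean) / std   # reuse the training statistics

print(X_train.mean(axis=0).round(6), X_train.std(axis=0).round(6))
```

The held-out rows are scaled but never contribute to `mean` or `std`, which is the property the pipeline guarantees in every fold.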


In [22]:
# Regression Example With Boston Dataset: Standardized
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# load dataset
dataframe = read_csv("housing.csv", sep=r"\s+", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
# define base model
def baseline_model():
	# create model
	model = Sequential()
	model.add(Dense(13, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model
# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(model=baseline_model, epochs=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Standardized: -27.82 (24.87) MSE


Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

Running the example provides an improved performance over the baseline model without standardized data, dropping the error.

A further extension of this section would be to similarly apply a rescaling to the output variable, such as normalizing it to the range of 0-1 and using a Sigmoid or similar activation function on the output layer to narrow output predictions to the same range.
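A minimal sketch of that output rescaling is shown below, assuming a simple min-max transform. The prices are hypothetical, and in a cross-validation setting the minimum and maximum would need to come from the training fold only.

```python
import numpy as np

# Hypothetical house prices in thousands of dollars.
y = np.array([5.0, 21.2, 24.0, 50.0])

y_min, y_max = y.min(), y.max()
y_scaled = (y - y_min) / (y_max - y_min)   # targets now lie in [0, 1]

# A sigmoid output layer would predict values in (0, 1);
# the inverse transform maps such values back to prices.
y_restored = y_scaled * (y_max - y_min) + y_min

print(y_scaled.min(), y_scaled.max())  # 0.0 1.0
```

Predictions from a model trained on `y_scaled` must be passed through the inverse transform before the error can be reported in thousands of dollars.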

**3. Tune the Neural Network Topology**

Many concerns can be optimized for a neural network model.

Perhaps the point of biggest leverage is the structure of the network itself, including the number of layers and the number of neurons in each layer.

In this section, you will evaluate two additional network topologies in an effort to further improve the performance of the model. You will look at both a deeper and a wider network topology.

**Evaluate a Deeper Network Topology**

One way to improve the performance of a neural network is to add more layers. This might allow the model to extract and recombine higher-order features embedded in the data.

Here, you will evaluate the effect of adding one more hidden layer to the model. This is as easy as defining a new function to create this deeper model, copied from your baseline model above. You can then insert a new line after the first hidden layer, in this case with about half the number of neurons.

In [23]:
# define the model
def larger_model():
	# create model
	model = Sequential()
	model.add(Dense(13, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(6, kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model

You can evaluate this network topology in the same way as above, while also using the standardization of the dataset shown above to improve performance.



In [24]:
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(model=larger_model, epochs=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Larger: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Larger: -24.06 (26.84) MSE


Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

Running this model shows a further improvement in performance, with the MSE dropping from about 28 to about 24 thousands of dollars squared.

In [25]:
# Regression Example With Boston Dataset: Standardized and Larger
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# load dataset
dataframe = read_csv("housing.csv", sep=r"\s+", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
# define the model
def larger_model():
	# create model
	model = Sequential()
	model.add(Dense(13, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(6, kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model
# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(model=larger_model, epochs=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Larger: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Larger: -22.73 (24.95) MSE


**4. Evaluate a Wider Network Topology**

Another approach to increasing the representational capability of the model is to create a wider network.

In this section, you will evaluate the effect of keeping a shallow network architecture and nearly doubling the number of neurons in the one hidden layer.

Again, all you need to do is define a new function that creates your neural network model. Here, you will increase the number of neurons in the hidden layer compared to the baseline model from 13 to 20.

In [26]:
# define wider model
def wider_model():
	# create model
	model = Sequential()
	model.add(Dense(20, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model
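One way to compare the three topologies is by their number of trainable parameters. The sketch below counts the weights and biases of a stack of fully connected layers. The helper `dense_params` is a hypothetical name introduced here, not a Keras function.

```python
def dense_params(n_inputs, layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_units in layer_sizes:
        total += n_inputs * n_units + n_units  # weight matrix plus one bias per unit
        n_inputs = n_units
    return total

baseline = dense_params(13, [13, 1])     # 13 -> 13 -> 1
deeper = dense_params(13, [13, 6, 1])    # 13 -> 13 -> 6 -> 1
wider = dense_params(13, [20, 1])        # 13 -> 20 -> 1

print(baseline, deeper, wider)  # 196 273 301
```

All three models are small, so any difference in their scores comes from how the capacity is arranged rather than from a large change in model size.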

You can evaluate the wider network topology using the same scheme as above:

In [27]:
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(model=wider_model, epochs=100, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Wider: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Wider: -22.92 (23.93) MSE


Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

Running the model gives an error of about 23 thousands of dollars squared in the run above, a clear improvement over the baseline. This is not a bad result for this problem.

In [28]:
# Regression Example With Boston Dataset: Standardized and Wider
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# load dataset
dataframe = read_csv("housing.csv", sep=r"\s+", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
# define wider model
def wider_model():
	# create model
	model = Sequential()
	model.add(Dense(20, input_shape=(13,), kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model
# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(model=wider_model, epochs=100, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, X, Y, cv=kfold, scoring='neg_mean_squared_error')
print("Wider: %.2f (%.2f) MSE" % (results.mean(), results.std()))

Wider: -22.58 (24.47) MSE


It might have been hard to guess that a wider network would perform as well as, or slightly better than, a deeper network on this problem. The results demonstrate the importance of empirical testing in developing neural network models.



**Through this notebook, you learned how to develop and evaluate neural network models, including:**

How to load data and develop a baseline model
How to lift performance using data preparation techniques like standardization
How to design and evaluate networks with different topologies on a problem