In [1]:
"""
The MIT License (MIT)
Copyright (c) 2021 NVIDIA
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""



This code example demonstrates how to use a neural network to solve a regression problem, using the Boston housing dataset. More context for this code example can be found in the section "Programming Example: Predicting House Prices with a DNN" in Chapter 6 in the book Learning Deep Learning by Magnus Ekman (ISBN: 9780137470358).


Unlike MNIST, the Boston Housing dataset is not included with PyTorch, so we retrieve it using scikit-learn instead by calling the load_boston() function. (This function was deprecated in scikit-learn 1.0 and removed in version 1.2, so the code below requires an older release.) We then retrieve the inputs and targets as NumPy arrays by calling the get() method on the returned object, and explicitly split them into a training set and a test set using the scikit-learn function train_test_split().
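
To give a feel for the resulting split sizes, the sketch below applies train_test_split() with the same arguments to synthetic arrays. The arrays only mimic the shape of the dataset (506 examples with 13 input variables); the random values are placeholders, not the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the Boston Housing data
# (506 examples, 13 input variables). The values are placeholders.
rng = np.random.default_rng(0)
data = rng.normal(size=(506, 13))
target = rng.normal(size=506)

# Same arguments as in the notebook: hold out 20% of the examples.
raw_x_train, raw_x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.2, random_state=0)

print(raw_x_train.shape, raw_x_test.shape)
print(y_train.shape, y_test.shape)
```

Fixing random_state makes the split reproducible, so repeated runs train and evaluate on the same examples.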

We convert the NumPy arrays to np.float32 and reshape the targets into column vectors so that the datatype and dimensions later match what PyTorch expects.

We standardize both the training and test data using the mean and standard deviation computed from the training data only. The parameter axis=0 ensures that we compute the mean and standard deviation for each input variable separately, so each result is a vector with one entry per variable instead of a single value. That is, the standardized value of the nitric oxides concentration is not affected by the values of the per capita crime rate or any of the other variables.
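
The effect of axis=0 is easiest to see on a toy example. The sketch below uses made-up values with two input variables on very different scales; it is only meant to illustrate the per-variable statistics and the reuse of training statistics for the test data.

```python
import numpy as np

# Toy data: 4 training examples and 2 test examples with 2 input
# variables on very different scales. Values are made up.
raw_x_train = np.array([[1.0, 100.0],
                        [2.0, 200.0],
                        [3.0, 300.0],
                        [4.0, 400.0]], dtype=np.float32)
raw_x_test = np.array([[2.5, 250.0],
                       [5.0, 500.0]], dtype=np.float32)

# axis=0 reduces over the examples, giving one statistic per column.
x_mean = np.mean(raw_x_train, axis=0)
x_stddev = np.std(raw_x_train, axis=0)

# Both sets are standardized with the training statistics.
x_train = (raw_x_train - x_mean) / x_stddev
x_test = (raw_x_test - x_mean) / x_stddev

print(x_mean.shape)
print(x_train.mean(axis=0), x_train.std(axis=0))
print(x_test)
```

After standardization each training column has mean 0 and standard deviation 1, regardless of its original scale. The test data is not guaranteed to have these properties, because its statistics are never used.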

Finally, we create Dataset objects. To do that, we first need to convert the NumPy arrays to PyTorch tensors, which is done by calling torch.from_numpy().
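
As a small sketch of this last step, the code below wraps toy arrays (made-up values, not the housing data) in a TensorDataset and iterates over it with a DataLoader. The batch size of 4 is arbitrary. Note that torch.from_numpy() preserves the NumPy dtype, which is why the arrays are converted to np.float32 beforehand: the layers in torch.nn use float32 parameters by default.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Toy arrays standing in for x_train and y_train (values are made up).
x = np.arange(12, dtype=np.float32).reshape(6, 2)  # 6 examples, 2 inputs
y = np.arange(6, dtype=np.float32).reshape(-1, 1)  # 6 targets, shape (6, 1)

# torch.from_numpy() keeps the NumPy dtype, so float32 stays float32.
x_tensor = torch.from_numpy(x)
y_tensor = torch.from_numpy(y)
dataset = TensorDataset(x_tensor, y_tensor)

# A DataLoader yields (inputs, targets) batches from the dataset.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]

print(x_tensor.dtype, len(dataset))
print(batch_shapes)
```

The targets are kept as a column of shape (N, 1) so that they line up with the output of a network whose final layer has a single neuron.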


In [2]:
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader
import numpy as np
from utilities import train_model

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
EPOCHS = 500
BATCH_SIZE = 16

# Read and standardize the data.
boston_housing = load_boston()
data = boston_housing.get('data')
target = boston_housing.get('target')

raw_x_train, raw_x_test, y_train, y_test = train_test_split(
    data, target, test_size=0.2, random_state=0)

# Convert to same precision as model.
raw_x_train = raw_x_train.astype(np.float32)
raw_x_test = raw_x_test.astype(np.float32)
y_train = y_train.astype(np.float32)
y_test = y_test.astype(np.float32)
y_train = np.reshape(y_train, (-1, 1))
y_test = np.reshape(y_test, (-1, 1))

x_mean = np.mean(raw_x_train, axis=0)
x_stddev = np.std(raw_x_train, axis=0)
x_train = (raw_x_train - x_mean) / x_stddev
x_test = (raw_x_test - x_mean) / x_stddev

# Create Dataset objects.
trainset = TensorDataset(torch.from_numpy(x_train),
                         torch.from_numpy(y_train))
testset = TensorDataset(torch.from_numpy(x_test),
                        torch.from_numpy(y_test))



    The Boston housing prices dataset has an ethical problem. You can refer to
    the documentation of this function for further details.

    The scikit-learn maintainers therefore strongly discourage the use of this
    dataset unless the purpose of the code is to study and educate about
    ethical issues in data science and machine learning.

    In this special case, you can fetch the dataset from the original
    source::

        import pandas as pd
        import numpy as np

        data_url = "http://lib.stat.cmu.edu/datasets/boston"
        raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
        data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
        target = raw_df.values[1::2, 2]

    Alternative datasets include the California housing dataset (i.e.
    :func:`~sklearn.datasets.fetch_california_housing`) and the Ames housing
    dataset. You can load the datasets as follows::

        from sklearn.datasets import fetch_california_ho

We then create the model. The code follows the same pattern as c5e1_mnist_learning. We define our network to have two hidden layers, so we are now officially doing DL! The two hidden layers in our network implementation have 64 ReLU neurons each, where the first layer is declared to have 13 inputs to match the dataset. The output layer consists of a single neuron with a linear activation function. We use MSE as the loss function and use the Adam optimizer.
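
To get a sense of the size of this network, the following sketch counts its trainable parameters from the layer sizes described above. It is plain arithmetic and does not depend on PyTorch.

```python
# Layer sizes taken from the description above: 13 inputs, two hidden
# layers with 64 neurons each, and a single output neuron.
layer_sizes = [13, 64, 64, 1]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    # A fully connected layer has n_in * n_out weights plus n_out biases.
    n_params = n_in * n_out + n_out
    print(f"{n_in:>2} -> {n_out:>2}: {n_params} parameters")
    total += n_params

print("Total:", total)
```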

Instead of implementing the training loop below, we have broken it out into a separate function train_model(). Its implementation can be found in the file utilities.py. It is very similar to the training loop in c5e1_mnist_learning but has some additional logic to be able to handle both classification and regression problems. In particular, it takes a parameter "metric". If we work on a classification problem it should be set to "acc" and the function will compute accuracy. If we work on a regression problem it should be set to "mae" and the function will compute mean absolute error instead. 
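
The difference between the two metric settings can be sketched as follows. The helper compute_metric() is hypothetical and written in NumPy for brevity; it is not the code in utilities.py, and the example values are made up.

```python
import numpy as np

def compute_metric(outputs, targets, metric):
    """Hypothetical helper; not the actual code in utilities.py."""
    if metric == "acc":
        # Classification: outputs hold one score per class and targets
        # hold class indices. Count how often the top score is correct.
        predicted = np.argmax(outputs, axis=1)
        return float(np.mean(predicted == targets))
    if metric == "mae":
        # Regression: average absolute difference to the target.
        return float(np.mean(np.abs(outputs - targets)))
    raise ValueError(f"Unknown metric: {metric}")

# Regression example with made-up values.
reg_out = np.array([[2.0], [4.0], [9.0]])
reg_true = np.array([[3.0], [4.0], [7.0]])
mae = compute_metric(reg_out, reg_true, "mae")

# Classification example: 3 examples, 2 classes, made-up scores.
cls_out = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
cls_true = np.array([0, 1, 1])
acc = compute_metric(cls_out, cls_true, "acc")

print(mae, acc)
```

Unlike MSE, the mean absolute error is expressed in the same unit as the target, which makes it easier to interpret for a regression problem.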


In [3]:
# Create model.
model = nn.Sequential(
    nn.Linear(13, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1)
)

# Initialize weights.
for module in model.modules():
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        nn.init.constant_(module.bias, 0.0)

# Loss function and optimizer
optimizer = torch.optim.Adam(model.parameters())
loss_function = nn.MSELoss()

# Train model.
train_model(model, device, EPOCHS, BATCH_SIZE, trainset, testset,
            optimizer, loss_function, 'mae')


Epoch 1/500 loss: 546.9197 - mae: 20.9555 - val_loss: 461.9453 - val_mae: 18.0829
Epoch 2/500 loss: 411.9062 - mae: 17.7855 - val_loss: 291.5147 - val_mae: 13.8396
Epoch 3/500 loss: 204.3760 - mae: 11.6711 - val_loss: 110.8371 - val_mae: 7.5676
Epoch 4/500 loss: 71.3781 - mae: 6.2936 - val_loss: 67.6985 - val_mae: 5.5637
Epoch 5/500 loss: 42.8666 - mae: 4.8622 - val_loss: 47.7606 - val_mae: 4.6687
Epoch 6/500 loss: 28.1102 - mae: 3.8497 - val_loss: 41.4424 - val_mae: 4.3219
Epoch 7/500 loss: 22.6704 - mae: 3.3795 - val_loss: 36.8758 - val_mae: 4.0785
Epoch 8/500 loss: 19.6967 - mae: 3.0958 - val_loss: 34.8340 - val_mae: 3.9260
Epoch 9/500 loss: 17.7896 - mae: 2.9589 - val_loss: 32.4342 - val_mae: 3.7649
Epoch 10/500 loss: 16.2874 - mae: 2.8384 - val_loss: 31.0403 - val_mae: 3.6430
Epoch 11/500 loss: 16.2171 - mae: 2.7227 - val_loss: 30.4943 - val_mae: 3.5763
Epoch 12/500 loss: 14.4438 - mae: 2.6891 - val_loss: 28.1008 - val_mae: 3.4454
Epoch 13/500 loss: 14.2651 - mae: 2.5924 - val_los

Epoch 115/500 loss: 3.6154 - mae: 1.3856 - val_loss: 17.3756 - val_mae: 2.7734
Epoch 116/500 loss: 3.5476 - mae: 1.3875 - val_loss: 18.0872 - val_mae: 2.7365
Epoch 117/500 loss: 3.5188 - mae: 1.3853 - val_loss: 17.2831 - val_mae: 2.7551
Epoch 118/500 loss: 3.4627 - mae: 1.3577 - val_loss: 18.0332 - val_mae: 2.7253
Epoch 119/500 loss: 3.5598 - mae: 1.3722 - val_loss: 18.0033 - val_mae: 2.7241
Epoch 120/500 loss: 3.6051 - mae: 1.3640 - val_loss: 17.1118 - val_mae: 2.7068
Epoch 121/500 loss: 3.5954 - mae: 1.3887 - val_loss: 17.2637 - val_mae: 2.7270
Epoch 122/500 loss: 3.2701 - mae: 1.3315 - val_loss: 17.7829 - val_mae: 2.7374
Epoch 123/500 loss: 3.2123 - mae: 1.3194 - val_loss: 17.4302 - val_mae: 2.7408
Epoch 124/500 loss: 3.3444 - mae: 1.3332 - val_loss: 17.3600 - val_mae: 2.7546
Epoch 125/500 loss: 3.2211 - mae: 1.3204 - val_loss: 17.3057 - val_mae: 2.7260
Epoch 126/500 loss: 3.1949 - mae: 1.3061 - val_loss: 18.2013 - val_mae: 2.7292
Epoch 127/500 loss: 3.3310 - mae: 1.3104 - val_loss:

Epoch 222/500 loss: 1.6380 - mae: 0.9319 - val_loss: 16.8099 - val_mae: 2.6429
Epoch 223/500 loss: 1.8234 - mae: 0.9008 - val_loss: 17.0846 - val_mae: 2.6312
Epoch 224/500 loss: 1.9565 - mae: 1.0108 - val_loss: 16.8621 - val_mae: 2.6993
Epoch 225/500 loss: 1.8635 - mae: 1.0120 - val_loss: 16.6436 - val_mae: 2.6708
Epoch 226/500 loss: 1.6148 - mae: 0.8914 - val_loss: 17.0847 - val_mae: 2.6564
Epoch 227/500 loss: 1.6671 - mae: 0.8711 - val_loss: 17.0899 - val_mae: 2.6523
Epoch 228/500 loss: 1.8199 - mae: 0.9777 - val_loss: 17.3131 - val_mae: 2.8235
Epoch 229/500 loss: 1.7454 - mae: 0.9847 - val_loss: 17.1705 - val_mae: 2.6591
Epoch 230/500 loss: 1.5652 - mae: 0.8976 - val_loss: 17.0048 - val_mae: 2.6726
Epoch 231/500 loss: 1.5177 - mae: 0.8772 - val_loss: 16.7530 - val_mae: 2.6780
Epoch 232/500 loss: 1.5968 - mae: 0.9148 - val_loss: 17.4589 - val_mae: 2.7022
Epoch 233/500 loss: 1.6109 - mae: 0.9135 - val_loss: 17.4183 - val_mae: 2.7037
Epoch 234/500 loss: 1.4892 - mae: 0.8650 - val_loss:

Epoch 328/500 loss: 1.1025 - mae: 0.7480 - val_loss: 16.8399 - val_mae: 2.6157
Epoch 329/500 loss: 0.9369 - mae: 0.6569 - val_loss: 16.4885 - val_mae: 2.5932
Epoch 330/500 loss: 0.9981 - mae: 0.7067 - val_loss: 16.9801 - val_mae: 2.5890
Epoch 331/500 loss: 1.0281 - mae: 0.7112 - val_loss: 16.7484 - val_mae: 2.6312
Epoch 332/500 loss: 1.0253 - mae: 0.7275 - val_loss: 16.3172 - val_mae: 2.5376
Epoch 333/500 loss: 0.9528 - mae: 0.6850 - val_loss: 16.3623 - val_mae: 2.5312
Epoch 334/500 loss: 0.9368 - mae: 0.6565 - val_loss: 16.7330 - val_mae: 2.5781
Epoch 335/500 loss: 0.8862 - mae: 0.6668 - val_loss: 16.5890 - val_mae: 2.5631
Epoch 336/500 loss: 0.8962 - mae: 0.6589 - val_loss: 16.5327 - val_mae: 2.5762
Epoch 337/500 loss: 0.8597 - mae: 0.6344 - val_loss: 17.1152 - val_mae: 2.6148
Epoch 338/500 loss: 0.8488 - mae: 0.6277 - val_loss: 16.6641 - val_mae: 2.5707
Epoch 339/500 loss: 0.9339 - mae: 0.6606 - val_loss: 16.9340 - val_mae: 2.5915
Epoch 340/500 loss: 0.8840 - mae: 0.6517 - val_loss:

Epoch 438/500 loss: 0.6076 - mae: 0.5445 - val_loss: 16.8745 - val_mae: 2.5755
Epoch 439/500 loss: 0.5875 - mae: 0.5188 - val_loss: 16.9878 - val_mae: 2.5652
Epoch 440/500 loss: 0.6411 - mae: 0.5440 - val_loss: 16.7971 - val_mae: 2.5521
Epoch 441/500 loss: 0.6926 - mae: 0.5772 - val_loss: 16.9898 - val_mae: 2.5898
Epoch 442/500 loss: 0.5815 - mae: 0.5367 - val_loss: 16.9364 - val_mae: 2.5919
Epoch 443/500 loss: 0.6261 - mae: 0.5493 - val_loss: 16.4666 - val_mae: 2.5403
Epoch 444/500 loss: 0.5794 - mae: 0.5275 - val_loss: 16.7398 - val_mae: 2.5623
Epoch 445/500 loss: 0.6776 - mae: 0.5553 - val_loss: 17.1562 - val_mae: 2.6015
Epoch 446/500 loss: 0.6715 - mae: 0.5821 - val_loss: 16.8018 - val_mae: 2.5479
Epoch 447/500 loss: 0.5744 - mae: 0.5273 - val_loss: 16.5439 - val_mae: 2.5650
Epoch 448/500 loss: 0.5957 - mae: 0.5330 - val_loss: 16.4461 - val_mae: 2.5150
Epoch 449/500 loss: 0.7115 - mae: 0.6196 - val_loss: 17.5795 - val_mae: 2.6957
Epoch 450/500 loss: 0.6148 - mae: 0.5735 - val_loss:

[0.6350741650049503, 2.655510263783591]

After the training is done, we use our model to predict the price for all test examples and then print out the first four predictions alongside the correct values to get an idea of how accurate the model is.


In [4]:
# Print first 4 predictions.
inputs = torch.from_numpy(x_test)
inputs = inputs.to(device)
with torch.no_grad():  # Gradients are not needed for inference.
    outputs = model(inputs)
for i in range(0, 4):
    print('Prediction: %4.2f' % outputs[i].item(),
          ', true value: %4.2f' % y_test[i].item())


Prediction: 22.79 , true value: 22.60
Prediction: 30.48 , true value: 50.00
Prediction: 27.84 , true value: 23.00
Prediction: 8.47 , true value: 8.30
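
To relate these four lines to the metrics used during training, the sketch below computes the absolute error for each of them, along with the MAE and MSE over just these four examples. The numbers are copied from the printout above.

```python
# Values copied from the four printed lines above.
predictions = [22.79, 30.48, 27.84, 8.47]
true_values = [22.60, 50.00, 23.00, 8.30]

abs_errors = [abs(p - t) for p, t in zip(predictions, true_values)]
mae = sum(abs_errors) / len(abs_errors)
mse = sum(e ** 2 for e in abs_errors) / len(abs_errors)

for p, t, e in zip(predictions, true_values, abs_errors):
    print(f"prediction {p:6.2f}, true {t:6.2f}, abs error {e:5.2f}")
print(f"MAE over these four examples: {mae:.2f}")
print(f"MSE over these four examples: {mse:.2f}")
```

The second example, with a true value of 50.00, dominates both numbers, and it affects the MSE far more than the MAE because the error is squared. Four examples are too few to judge the model; the second value in the list printed after training (about 2.66) appears to be the MAE over the whole test set.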
