<a href="https://colab.research.google.com/github/anandababugudipudi/Boston-House-Prices-PyTorch-Linear-Regression/blob/main/Boston_House_Prices_PyTorch_Linear_Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**Boston House Prices Prediction using Linear Regression using PyTorch**

###**Linear Regression** 
Linear regression is a predictive modeling technique that models the relationship between one or more independent variables and a dependent variable. The independent variables can be categorical or continuous, while the dependent variable is continuous. In linear regression the prediction is a linear function of the inputs (a weighted sum plus a bias). Related regression techniques use quadratic, polynomial, or other non-linear mappings, such as the sigmoid function in logistic regression.

> Regression techniques are heavily used in real estate price prediction, financial forecasting, and predicting traffic arrival times (ETA).
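
To make the definition concrete, here is a minimal sketch of a linear prediction and its mean squared error. The toy values of `X`, `w`, `b`, and `y` are made up for illustration and are not taken from the Boston dataset.

```python
import torch

# Hypothetical toy data: 4 samples with 2 features each
X = torch.tensor([[1.0, 2.0],
                  [2.0, 0.0],
                  [3.0, 1.0],
                  [0.0, 4.0]])
w = torch.tensor([0.5, -1.0])   # one weight per feature
b = torch.tensor(2.0)           # bias (intercept)

# Linear regression prediction: weighted sum of the features plus the bias
y_hat = X @ w + b

# Mean squared error against some (made-up) observed targets
y = torch.tensor([1.0, 3.0, 2.0, -2.0])
mse = ((y_hat - y) ** 2).mean()
print(y_hat, mse)
```

Training a linear regression model amounts to choosing `w` and `b` so that this error is as small as possible.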

###**Importing the necessary packages**

In [None]:
import numpy as np
import pandas as pd
import torch 
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader
import torch.nn.functional as F
import matplotlib.pyplot as plt

###**Importing the Dataset**
The Boston House Prices dataset consists of 506 samples with 13 features and a target price (`MEDV`) ranging from 5.0 to 50.0. Each record describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970. The attribute names, taken from the UCI Machine Learning Repository, are listed in `col_names` below.


In [None]:
col_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
# Read the whitespace-delimited file into a pandas DataFrame
data = pd.read_csv('./boston-house-prices.csv', header = None, delimiter = r"\s+", names = col_names)
data.head()

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,MEDV
0,0.00632,18.0,2.31,0,0.538,6.575,65.2,4.09,1,296.0,15.3,396.9,4.98,24.0
1,0.02731,0.0,7.07,0,0.469,6.421,78.9,4.9671,2,242.0,17.8,396.9,9.14,21.6
2,0.02729,0.0,7.07,0,0.469,7.185,61.1,4.9671,2,242.0,17.8,392.83,4.03,34.7
3,0.03237,0.0,2.18,0,0.458,6.998,45.8,6.0622,3,222.0,18.7,394.63,2.94,33.4
4,0.06905,0.0,2.18,0,0.458,7.147,54.2,6.0622,3,222.0,18.7,396.9,5.33,36.2


In [None]:
# Separating the features and labels in the dataset
features = data.drop('MEDV', axis = 1)
labels = data['MEDV']
print(features)
print(labels)

        CRIM    ZN  INDUS  CHAS    NOX  ...  RAD    TAX  PTRATIO       B  LSTAT
0    0.00632  18.0   2.31     0  0.538  ...    1  296.0     15.3  396.90   4.98
1    0.02731   0.0   7.07     0  0.469  ...    2  242.0     17.8  396.90   9.14
2    0.02729   0.0   7.07     0  0.469  ...    2  242.0     17.8  392.83   4.03
3    0.03237   0.0   2.18     0  0.458  ...    3  222.0     18.7  394.63   2.94
4    0.06905   0.0   2.18     0  0.458  ...    3  222.0     18.7  396.90   5.33
..       ...   ...    ...   ...    ...  ...  ...    ...      ...     ...    ...
501  0.06263   0.0  11.93     0  0.573  ...    1  273.0     21.0  391.99   9.67
502  0.04527   0.0  11.93     0  0.573  ...    1  273.0     21.0  396.90   9.08
503  0.06076   0.0  11.93     0  0.573  ...    1  273.0     21.0  396.90   5.64
504  0.10959   0.0  11.93     0  0.573  ...    1  273.0     21.0  393.45   6.48
505  0.04741   0.0  11.93     0  0.573  ...    1  273.0     21.0  396.90   7.88

[506 rows x 13 columns]
0      24.0
1  

In [None]:
# Converting the pandas dataframes into PyTorch Tensors
# From here onwards our features are inputs and labels are targets
inputs = torch.tensor(features.values).float()
targets = torch.tensor(labels.values).float()
print(inputs)
print(targets)

tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00,  ..., 1.5300e+01, 3.9690e+02,
         4.9800e+00],
        [2.7310e-02, 0.0000e+00, 7.0700e+00,  ..., 1.7800e+01, 3.9690e+02,
         9.1400e+00],
        [2.7290e-02, 0.0000e+00, 7.0700e+00,  ..., 1.7800e+01, 3.9283e+02,
         4.0300e+00],
        ...,
        [6.0760e-02, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9690e+02,
         5.6400e+00],
        [1.0959e-01, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9345e+02,
         6.4800e+00],
        [4.7410e-02, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9690e+02,
         7.8800e+00]])
tensor([24.0000, 21.6000, 34.7000, 33.4000, 36.2000, 28.7000, 22.9000, 27.1000,
        16.5000, 18.9000, 15.0000, 18.9000, 21.7000, 20.4000, 18.2000, 19.9000,
        23.1000, 17.5000, 20.2000, 18.2000, 13.6000, 19.6000, 15.2000, 14.5000,
        15.6000, 13.9000, 16.6000, 14.8000, 18.4000, 21.0000, 12.7000, 14.5000,
        13.2000, 13.1000, 13.5000, 18.9000, 20.0000, 21.0000, 24.7000, 30.8000,
    

In [None]:
# Shape of inputs and targets
print(f"Shape of inputs is {inputs.shape}")
print(f"Shape of labels is {targets.shape}")

Shape of inputs is torch.Size([506, 13])
Shape of labels is torch.Size([506])


###**Dataset and DataLoader**

We'll create a `TensorDataset`, which allows access to rows from `inputs` and `targets` as tuples, and provides standard APIs for working with many different types of datasets in PyTorch.

In [None]:
# Define Dataset
train_ds = TensorDataset(inputs, targets)
train_ds[0:3]

(tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00, 0.0000e+00, 5.3800e-01, 6.5750e+00,
          6.5200e+01, 4.0900e+00, 1.0000e+00, 2.9600e+02, 1.5300e+01, 3.9690e+02,
          4.9800e+00],
         [2.7310e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01, 6.4210e+00,
          7.8900e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02, 1.7800e+01, 3.9690e+02,
          9.1400e+00],
         [2.7290e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01, 7.1850e+00,
          6.1100e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02, 1.7800e+01, 3.9283e+02,
          4.0300e+00]]), tensor([24.0000, 21.6000, 34.7000]))

The `TensorDataset` allows us to access a small section of the training data using the array indexing notation (`[0:3]` in the above code). It returns a tuple with two elements. The first element contains the input variables for the selected rows, and the second contains the targets.
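
The indexing behaviour can be checked on a small stand-in dataset. This is a sketch with random tensors that only mimic the layout of `inputs` and `targets`; the names `demo_inputs`, `demo_targets`, and `demo_ds` are hypothetical.

```python
import torch
from torch.utils.data import TensorDataset

# Random stand-in tensors with the same column layout as the real data
demo_inputs = torch.randn(10, 13)
demo_targets = torch.randn(10)
demo_ds = TensorDataset(demo_inputs, demo_targets)

n_rows = len(demo_ds)     # number of rows in the dataset
x0, y0 = demo_ds[0]       # a single row: 13 features and a scalar target
xs, ys = demo_ds[0:3]     # a slice: the first three rows and their targets
print(n_rows, x0.shape, y0.shape, xs.shape, ys.shape)
```

Indexing with an integer returns one `(features, target)` pair, while a slice returns the corresponding block of rows for each tensor.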

We'll also create a `DataLoader`, which can split the data into batches of a predefined size while training. It also provides other utilities like shuffling and random sampling of the data.

In [None]:
# Define the Dataloader
batch_size = 32
train_dl = DataLoader(train_ds, batch_size, shuffle = True)

We can use the loader in a `for` loop. In each iteration, the data loader returns one batch of data with the given batch size. If `shuffle` is set to `True`, it shuffles the training data before creating batches. Shuffling randomizes the input to the optimization algorithm, which often leads to a faster reduction in the loss.

In [None]:
# Here we are printing one batch of data
for inp, tgt in train_dl:
  print(inp)
  print(tgt)
  break

tensor([[6.2739e-01, 0.0000e+00, 8.1400e+00, 0.0000e+00, 5.3800e-01, 5.8340e+00,
         5.6500e+01, 4.4986e+00, 4.0000e+00, 3.0700e+02, 2.1000e+01, 3.9562e+02,
         8.4700e+00],
        [1.4932e-01, 2.5000e+01, 5.1300e+00, 0.0000e+00, 4.5300e-01, 5.7410e+00,
         6.6200e+01, 7.2254e+00, 8.0000e+00, 2.8400e+02, 1.9700e+01, 3.9511e+02,
         1.3150e+01],
        [1.3428e+00, 0.0000e+00, 1.9580e+01, 0.0000e+00, 6.0500e-01, 6.0660e+00,
         1.0000e+02, 1.7573e+00, 5.0000e+00, 4.0300e+02, 1.4700e+01, 3.5389e+02,
         6.4300e+00],
        [9.2323e+00, 0.0000e+00, 1.8100e+01, 0.0000e+00, 6.3100e-01, 6.2160e+00,
         1.0000e+02, 1.1691e+00, 2.4000e+01, 6.6600e+02, 2.0200e+01, 3.6615e+02,
         9.5300e+00],
        [2.0746e-01, 0.0000e+00, 2.7740e+01, 0.0000e+00, 6.0900e-01, 5.0930e+00,
         9.8000e+01, 1.8226e+00, 4.0000e+00, 7.1100e+02, 2.0100e+01, 3.1843e+02,
         2.9680e+01],
        [1.4455e-01, 1.2500e+01, 7.8700e+00, 0.0000e+00, 5.2400e-01, 6.1720e+00,
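
As a quick check on how the batches are formed, the sketch below builds a loader over random stand-in data with the same number of rows (506) and the same batch size (32). The names `demo_ds` and `demo_dl` are hypothetical.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Random stand-in data with the same number of rows as the Boston dataset
demo_ds = TensorDataset(torch.randn(506, 13), torch.randn(506))
demo_dl = DataLoader(demo_ds, batch_size=32, shuffle=True)

# 506 rows / 32 per batch -> 15 full batches plus one final batch of 26 rows
num_batches = len(demo_dl)
batch_sizes = [len(tgt) for _, tgt in demo_dl]
print(num_batches, batch_sizes[-1])
```

The final batch is smaller because 506 is not a multiple of 32. Passing `drop_last=True` to `DataLoader` would discard it instead.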

###**Creating a Linear Model**

Instead of initializing the weights & biases manually, we can define the model using the `nn.Linear` class from PyTorch, which does it automatically.

In [None]:
# Define the model
model = torch.nn.Linear(13, 1)
print(model.weight)
print(model.bias)

Parameter containing:
tensor([[ 0.2189,  0.2017, -0.1480,  0.2069,  0.0198,  0.0188,  0.0324, -0.1464,
          0.2042, -0.1922,  0.0617,  0.1403,  0.0756]], requires_grad=True)
Parameter containing:
tensor([0.0802], requires_grad=True)


In [None]:
# Parameters
list(model.parameters()) # Returns a list containing all the weights and bias matrices present in the model

[Parameter containing:
 tensor([[ 0.2189,  0.2017, -0.1480,  0.2069,  0.0198,  0.0188,  0.0324, -0.1464,
           0.2042, -0.1922,  0.0617,  0.1403,  0.0756]], requires_grad=True),
 Parameter containing:
 tensor([0.0802], requires_grad=True)]
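
To see what `nn.Linear` computes with these parameters, the following sketch reproduces the layer's output by hand on a random batch. The names `layer`, `x`, and `manual` are hypothetical.

```python
import torch

layer = torch.nn.Linear(13, 1)   # weight has shape [1, 13], bias has shape [1]
x = torch.randn(5, 13)           # random stand-in batch of 5 rows

# The layer computes x @ W^T + b, giving one prediction per row
manual = x @ layer.weight.T + layer.bias
same = torch.allclose(manual, layer(x), atol=1e-5)
print(layer.weight.shape, layer.bias.shape, manual.shape, same)
```

Note that the output has shape `[5, 1]`, not `[5]`: a linear layer with one output feature still returns a column.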

In [None]:
# Generate predictions
preds = model(inputs)
preds

tensor([[ 5.3482e+00],
        [ 1.2379e+01],
        [ 1.0860e+01],
        [ 1.5199e+01],
        [ 1.5981e+01],
        [ 1.5705e+01],
        [ 1.5621e+00],
        [ 3.1651e+00],
        [ 2.6501e+00],
        [ 1.1588e+00],
        [ 2.5540e+00],
        [ 2.2442e+00],
        [ 2.1634e-01],
        [-1.2190e-01],
        [-1.5643e+00],
        [-4.2921e-01],
        [-2.5874e+00],
        [-3.1739e-01],
        [-1.5657e+01],
        [-3.2917e-01],
        [-5.7346e-01],
        [ 7.2315e-01],
        [ 1.8786e+00],
        [ 1.8267e+00],
        [ 1.2412e+00],
        [-1.1765e+01],
        [-1.5035e+00],
        [-1.1157e+01],
        [ 9.7414e-02],
        [-1.1909e+00],
        [-2.9721e+00],
        [-1.1158e+00],
        [-2.0781e+01],
        [-3.3910e+00],
        [-1.8566e+01],
        [ 6.0614e+00],
        [ 3.2614e+00],
        [ 5.0466e+00],
        [ 4.3326e+00],
        [ 2.3976e+01],
        [ 2.3613e+01],
        [ 9.8640e+00],
        [ 9.7625e+00],
        [ 1

###**Loss Function**

Instead of defining a loss function manually, we can use the built-in loss function `mse_loss`.

In [None]:
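# Define the loss function and compute the loss
loss_fn = F.mse_loss
# preds has shape [506, 1] and targets has shape [506]; squeeze preds so the
# shapes match and mse_loss does not broadcast to a [506, 506] difference
loss = loss_fn(preds.squeeze(1), targets)
print(loss)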

tensor(3464.1326, grad_fn=<MseLossBackward>)
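
A shape detail is worth checking when calling `mse_loss`. The model output has shape `[N, 1]`, while `targets` has shape `[N]`. If the two are combined as they are, broadcasting pairs every prediction with every target. The sketch below uses made-up three-element tensors (`preds_demo`, `targets_demo`) to show the effect.

```python
import torch
import torch.nn.functional as F

preds_demo = torch.tensor([[1.0], [2.0], [3.0]])   # shape [3, 1], like the model output
targets_demo = torch.tensor([1.0, 2.0, 3.0])       # shape [3], like `targets`

# Broadcasting turns the difference into a [3, 3] matrix of all pairs
diff = preds_demo - targets_demo
broadcast_mse = (diff ** 2).mean()

# With matching shapes, each prediction is compared only with its own target
matched_mse = F.mse_loss(preds_demo.squeeze(1), targets_demo)
print(diff.shape, broadcast_mse, matched_mse)
```

Here the predictions equal the targets exactly, so the matched loss is zero, yet the broadcast version reports a non-zero error. Squeezing the predictions (or reshaping the targets to `[N, 1]`) avoids this.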



###**Optimizer**

Instead of manually manipulating the model's weights & biases using gradients, we can use the optimizer `optim.SGD`. SGD is short for "Stochastic Gradient Descent". The term _stochastic_ indicates that samples are selected in random batches instead of as a single group.

In [None]:
# Define Optimizer
opt = torch.optim.SGD(model.parameters(), lr = 1e-5)
print(opt)

SGD (
Parameter Group 0
    dampening: 0
    lr: 1e-05
    momentum: 0
    nesterov: False
    weight_decay: 0
)


In [None]:
# 1. Generate predictions
preds = model(inputs)

# 2. Calculate loss (squeeze preds from [506, 1] to [506] to match targets)
loss = loss_fn(preds.squeeze(1), targets)

# 3. Compute gradients
loss.backward()

# 4. Update parameters using gradients
opt.step()

# 5. Reset the gradients to zero
opt.zero_grad()

print(loss)

tensor(nan, grad_fn=<MseLossBackward>)


Note that `model.parameters()` is passed as an argument to `optim.SGD` so that the optimizer knows which matrices should be modified during the update step. Also, we can specify a learning rate that controls the amount by which the parameters are modified.
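
For plain SGD without momentum, the update performed by `opt.step()` is each parameter minus the learning rate times its gradient. The sketch below compares one optimizer step with the same update written by hand. The two small models, the random data, and the learning rate of `0.1` are illustrative assumptions, not values from this notebook.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
lr = 0.1  # illustrative learning rate

# Two models with identical starting parameters
model_a = torch.nn.Linear(3, 1)
model_b = torch.nn.Linear(3, 1)
model_b.load_state_dict(model_a.state_dict())

x = torch.randn(8, 3)
y = torch.randn(8, 1)

# Model A: let the optimizer apply the update
opt_a = torch.optim.SGD(model_a.parameters(), lr=lr)
F.mse_loss(model_a(x), y).backward()
opt_a.step()
opt_a.zero_grad()

# Model B: apply the same update manually
F.mse_loss(model_b(x), y).backward()
with torch.no_grad():
    for p in model_b.parameters():
        p -= lr * p.grad   # gradient descent step
        p.grad.zero_()     # reset the gradient for the next iteration

same_update = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(same_update)
```

Both models end up with the same parameters, which is why the optimizer can replace the manual bookkeeping.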

###**Train the model**

We are now ready to train the model. We'll follow the same process to implement gradient descent:

1. Generate predictions

2. Calculate the loss

3. Compute gradients w.r.t the weights and biases

4. Adjust the weights by subtracting a small quantity proportional to the gradient

5. Reset the gradients to zero

The only change is that we'll work with batches of data instead of processing the entire training data in every iteration. Let's define a utility function `fit` that trains the model for a given number of epochs.

In [None]:
# Utility function to train the model
def fit(num_epochs, model, loss_fn, opt, train_dl):
  # Repeat for the given number of epochs
  for epoch in range(num_epochs):
    # Train with batches of data
    for inp, tgt in train_dl:
      # 1. Generate predictions
      preds = model(inp)

      # 2. Calculate loss (squeeze preds from [batch, 1] to [batch] to match tgt)
      loss = loss_fn(preds.squeeze(1), tgt)

      # 3. Compute gradients
      loss.backward()

      # 4. Update parameters using gradients
      opt.step()

      # 5. Reset the gradients to zero
      opt.zero_grad()

    # Print the progress (loss of the last batch in the epoch)
    if (epoch + 1) % 10 == 0:
      print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {round(loss.item(), 4)}")

Some things to note above:

* We use the data loader defined earlier to get batches of data for every iteration.

* Instead of updating parameters (weights and biases) manually, we use `opt.step` to perform the update and `opt.zero_grad` to reset the gradients to zero.

* We've also added a log statement that prints the loss from the last batch of data for every 10th epoch to track training progress. `loss.item()` returns the actual value stored in the loss tensor as a Python number.
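
The loss of a single batch can be noisy, since it depends on which rows happened to land in that batch. One alternative is to report the average loss over the whole loader. The helper `mean_epoch_loss` below is a hypothetical sketch, shown on random stand-in data rather than the Boston dataset.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

def mean_epoch_loss(model, loss_fn, data_loader):
    """Hypothetical helper: average per-row loss over every batch in a loader."""
    total, count = 0.0, 0
    with torch.no_grad():  # evaluation only, so no gradients are needed
        for inp, tgt in data_loader:
            batch_loss = loss_fn(model(inp).squeeze(1), tgt)
            total += batch_loss.item() * len(tgt)  # weight by batch size
            count += len(tgt)
    return total / count

torch.manual_seed(0)
demo_x, demo_y = torch.randn(50, 13), torch.randn(50)
demo_model = torch.nn.Linear(13, 1)
demo_dl = DataLoader(TensorDataset(demo_x, demo_y), batch_size=16)

avg = mean_epoch_loss(demo_model, F.mse_loss, demo_dl)
full = F.mse_loss(demo_model(demo_x).squeeze(1), demo_y).item()
print(avg, full)
```

Weighting each batch by its size makes the average match the loss computed over all rows at once, even when the last batch is smaller.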

Let's train the model for 100 epochs.

In [None]:
# Call the training function on our dataset
fit(100, model, loss_fn, opt, train_dl)



Epoch [10/100], Loss: nan
Epoch [20/100], Loss: nan
Epoch [30/100], Loss: nan
Epoch [40/100], Loss: nan
Epoch [50/100], Loss: nan
Epoch [60/100], Loss: nan
Epoch [70/100], Loss: nan
Epoch [80/100], Loss: nan
Epoch [90/100], Loss: nan
Epoch [100/100], Loss: nan


In [None]:
# Testing for one input
model(torch.tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00, 0.0000e+00, 5.3800e-01, 6.5750e+00,
        6.5200e+01, 4.0900e+00, 1.0000e+00, 2.9600e+02, 1.5300e+01, 3.9690e+02,
        4.9800e+00]]))

tensor([[nan]], grad_fn=<AddmmBackward>)
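
The `nan` values above are consistent with gradient descent diverging. Several features (for example `TAX` and `B`) take values in the hundreds, and with unscaled inputs a learning rate of `1e-5` can be too large for the updates to stay stable. One common remedy is to standardize each feature to zero mean and unit variance before training. The sketch below illustrates the effect on synthetic data, so it is not a rerun of this notebook. The helper `train`, the `scales`, the coefficients, and the learning rate of `0.1` are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def train(X, y, lr, steps=200):
    """Hypothetical helper: full-batch gradient descent, returns the final MSE."""
    model = torch.nn.Linear(X.shape[1], 1)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(model(X).squeeze(1), y)
        loss.backward()
        opt.step()
        opt.zero_grad()
    return F.mse_loss(model(X).squeeze(1), y).item()

torch.manual_seed(0)

# Synthetic stand-in data: three features on very different scales
scales = torch.tensor([1.0, 400.0, 700.0])
X_raw = torch.rand(506, 3) * scales
y = X_raw @ torch.tensor([2.0, 0.01, -0.02]) + 20.0 + torch.randn(506)

# Standardize each column to zero mean and unit variance
X_std = (X_raw - X_raw.mean(dim=0)) / X_raw.std(dim=0)

raw_loss = train(X_raw, y, lr=1e-5)      # unscaled features: the updates blow up
std_loss = train(X_std, y, lr=1e-5)      # scaled features: stable, but slow at this rate
std_loss_fast = train(X_std, y, lr=0.1)  # scaled features tolerate a larger rate
print(raw_loss, std_loss, std_loss_fast)
```

If the same idea is applied to the real data, the mean and standard deviation computed from the training features must also be applied to any input passed to the model at prediction time.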