# [DNN] Regression with a Dense Network (DNN)
***

A simple regression with a dense neural network (DNN).

## Objectives:
***
- Predict housing prices from a set of house features.
- Understand the principle and the architecture of a regression with a dense neural network.

The **Boston Housing Dataset** contains house prices for various locations in Boston. Alongside the price, the dataset also provides the following information:
 - CRIM: This is the per capita crime rate by town
 - ZN: This is the proportion of residential land zoned for lots larger than 25,000 sq.ft
 - INDUS: This is the proportion of non-retail business acres per town
 - CHAS: This is the Charles River dummy variable (this is equal to 1 if tract bounds river; 0 otherwise)
 - NOX: This is the nitric oxides concentration (parts per 10 million)
 - RM: This is the average number of rooms per dwelling
 - AGE: This is the proportion of owner-occupied units built prior to 1940
 - DIS: This is the weighted distances to five Boston employment centers
 - RAD: This is the index of accessibility to radial highways
 - TAX: This is the full-value property-tax rate per 10,000 dollars
 - PTRATIO: This is the pupil-teacher ratio by town
 - B: This is calculated as 1000(Bk - 0.63)^2, where Bk is the proportion of people of African American descent by town
 - LSTAT: This is the percentage of the population with lower status
 - MEDV: This is the median value of owner-occupied homes in thousands of dollars

In [57]:
# Package import
import torch
import torch.nn as nn              # needed below for nn.MSELoss and nn.L1Loss
import torch.nn.functional as F
from torch.autograd import Variable
import pandas as pd
import numpy as np

In [58]:
class convergence_history_MSELoss:
  def __init__(self):
    """
    Class to save the training convergence properties
    """
    self.loss = nn.MSELoss()
    self.MAE_loss = nn.L1Loss()
    self.history={}                #Save convergence measures in the end of each epoch
    self.history['loss']=[]        #value of the cost function on training data
    self.history['mae']=[]         #mean absolute error on training data
    self.history['val_loss']=[]    #value of the cost function on validation data
    self.history['val_mae']=[]     #mean absolute error on validation data
  
  def update(self,current_model,xtrain,ytrain,xtest,ytest):
    
    #convergence information on the training data 
    nb_training_obs=xtrain.shape[0]
    if nb_training_obs>xtest.shape[0]:
        nb_training_obs=xtest.shape[0]
        
    epoch_shuffler=np.arange(xtrain.shape[0]) 
    np.random.shuffle(epoch_shuffler)
    mini_batch_observations = epoch_shuffler[:nb_training_obs]
    var_X_batch = Variable(xtrain[mini_batch_observations,:]).float()
    var_y_batch = Variable(ytrain[mini_batch_observations]).float()
    y_pred_batch = current_model(var_X_batch)
    curr_loss = self.loss(y_pred_batch.view(-1), var_y_batch.view(-1))

    self.history['loss'].append(curr_loss.item())
    self.history['mae'].append(self.MAE_loss(y_pred_batch.view(-1), var_y_batch.view(-1)).item())
    
    #convergence information on the test data 
    var_X_batch = Variable(xtest[:,:]).float()
    var_y_batch = Variable(ytest[:]).float()
    y_pred_batch = current_model(var_X_batch)
    curr_loss = self.loss(y_pred_batch.view(-1), var_y_batch.view(-1))

    self.history['val_loss'].append(curr_loss.item())
    self.history['val_mae'].append(self.MAE_loss(y_pred_batch.view(-1), var_y_batch.view(-1)).item())


## What we're going to do:
 ***
 - Retrieve data
 - Prepare the data
 - Build a model
 - Train the model
 - Evaluate the result

### Step 1: Retrieve data

In [59]:
data = pd.read_csv('../DNN/Boston.csv', header=0)
display(data.head(5).style.format("{0:.2f}").set_caption("Few lines of the dataset :"))
print('Missing Data : ',data.isna().sum().sum(), '  Shape is : ', data.shape)

Unnamed: 0.1,Unnamed: 0,crim,zn,indus,chas,nox,rm,age,dis,rad,tax,ptratio,black,lstat,medv
0,1.0,0.01,18.0,2.31,0.0,0.54,6.58,65.2,4.09,1.0,296.0,15.3,396.9,4.98,24.0
1,2.0,0.03,0.0,7.07,0.0,0.47,6.42,78.9,4.97,2.0,242.0,17.8,396.9,9.14,21.6
2,3.0,0.03,0.0,7.07,0.0,0.47,7.18,61.1,4.97,2.0,242.0,17.8,392.83,4.03,34.7
3,4.0,0.03,0.0,2.18,0.0,0.46,7.0,45.8,6.06,3.0,222.0,18.7,394.63,2.94,33.4
4,5.0,0.07,0.0,2.18,0.0,0.46,7.15,54.2,6.06,3.0,222.0,18.7,396.9,5.33,36.2


Missing Data :  0   Shape is :  (506, 15)


### Step 2: Prepare the data

#### Split the data into train and test dataset 
***
We use 70% of the data for training and the remaining 30% for validation. <br>
The training rows are drawn at random and the remaining rows form the test set. <br>
`x` is the input data and `y` the expected output. <br>
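
A minimal sketch of this kind of split on a small synthetic table. The column names, the values and `random_state=0` are illustrative assumptions, not part of the notebook. `DataFrame.sample` draws the training rows at random, and dropping their index leaves the remaining rows for testing; fixing `random_state` makes the split repeatable from one run to the next.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the housing table (hypothetical values)
toy = pd.DataFrame({
    "feature": np.arange(10, dtype="float32"),
    "medv": np.arange(10, dtype="float32") * 2.0,
})

# 70% of the rows, drawn at random, go to the training set
toy_train = toy.sample(frac=0.7, random_state=0)
# The rows that were not drawn form the test set
toy_test = toy.drop(toy_train.index)

print(len(toy_train), len(toy_test))
```

The two index sets are disjoint by construction, so no observation appears in both subsets.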

In [66]:
data_train = data.sample(frac=0.7, axis=0)
data_test  = data.drop(data_train.index)
print(data_test)
x_train = data_train.drop(['Unnamed: 0', 'medv'], axis=1)
y_train = data_train['medv']
x_test  = data_test.drop(['Unnamed: 0', 'medv'], axis=1)
y_test  = data_test['medv']


print('Original data shape was : ', data.shape)
print('x_train : ', x_train.shape, 'y_train : ', y_train.shape)
print('x_test : ', x_test.shape, 'y_test : ', y_test.shape)


     Unnamed: 0     crim    zn  indus  chas    nox     rm   age     dis  rad  \
3             4  0.03237   0.0   2.18     0  0.458  6.998  45.8  6.0622    3   
4             5  0.06905   0.0   2.18     0  0.458  7.147  54.2  6.0622    3   
5             6  0.02985   0.0   2.18     0  0.458  6.430  58.7  6.0622    3   
6             7  0.08829  12.5   7.87     0  0.524  6.012  66.6  5.5605    5   
7             8  0.14455  12.5   7.87     0  0.524  6.172  96.1  5.9505    5   
..          ...      ...   ...    ...   ...    ...    ...   ...     ...  ...   
493         494  0.17331   0.0   9.69     0  0.585  5.707  54.0  2.3817    6   
496         497  0.28960   0.0   9.69     0  0.585  5.390  72.9  2.7986    6   
497         498  0.26838   0.0   9.69     0  0.585  5.794  70.6  2.8927    6   
500         501  0.22438   0.0   9.69     0  0.585  6.027  79.7  2.4982    6   
501         502  0.06263   0.0  11.93     0  0.573  6.593  69.1  2.4786    1   

     tax  ptratio   black  lstat  medv 

#### Data normalization
***
- All input data must be normalized, train and test.  
- To do this we will subtract the mean and divide by the standard deviation.  
- But test data should not be used in any way, even for normalization.  
- The mean and the standard deviation will therefore only be calculated with the train data.
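
The rule above can be sketched on synthetic arrays as follows. The array sizes and the random generator seed are assumptions made for illustration only. The statistics are computed on the training rows and then reused unchanged on the test rows.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical train/test feature matrices (rows = observations)
toy_train = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
toy_test = rng.normal(loc=50.0, scale=10.0, size=(40, 3))

# Statistics come from the training set only
mean = toy_train.mean(axis=0)
std = toy_train.std(axis=0, ddof=1)   # ddof=1 matches the pandas default for .std()

toy_train_n = (toy_train - mean) / std
toy_test_n = (toy_test - mean) / std  # reuse the training statistics

print(toy_train_n.mean(axis=0).round(6))
print(toy_train_n.std(axis=0, ddof=1).round(6))
```

After this step the training columns have mean 0 and standard deviation 1, while the test columns are only approximately standardized, which is expected.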

In [44]:
display(x_train.describe().style.format("{0:.2f}").set_caption("Before normalization :"))

mean = x_train.mean()
std  = x_train.std()

x_train = (x_train - mean) / std
x_test  = (x_test - mean) / std

display(x_train.describe().style.format("{0:.2f}").set_caption("After normalization :"))
display(x_train.head(5).style.format("{0:.2f}").set_caption("Few lines of the dataset :"))


x_train, y_train = np.array(x_train, dtype='float32'), np.array(y_train, dtype='float32')
x_test,  y_test  = np.array(x_test, dtype='float32'), np.array(y_test, dtype='float32')

Unnamed: 0,crim,zn,indus,chas,nox,rm,age,dis,rad,tax,ptratio,black,lstat
count,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0
mean,3.86,12.34,10.91,0.08,0.55,6.31,66.92,3.9,9.67,408.69,18.51,353.19,12.38
std,9.51,23.44,6.7,0.27,0.11,0.67,28.48,2.17,8.75,169.02,2.12,98.1,6.94
min,0.01,0.0,0.74,0.0,0.39,3.56,2.9,1.13,1.0,188.0,12.6,0.32,1.92
25%,0.08,0.0,5.19,0.0,0.45,5.9,42.38,2.11,4.0,279.0,17.4,376.06,6.86
50%,0.23,0.0,9.12,0.0,0.53,6.23,73.7,3.28,5.0,330.0,19.1,391.44,11.14
75%,3.7,20.0,18.1,0.0,0.62,6.63,93.2,5.4,24.0,666.0,20.2,396.21,16.49
max,88.98,95.0,27.74,1.0,0.87,8.7,100.0,10.71,24.0,711.0,22.0,396.9,34.41


Unnamed: 0,crim,zn,indus,chas,nox,rm,age,dis,rad,tax,ptratio,black,lstat
count,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0,354.0
mean,0.0,0.0,0.0,-0.0,-0.0,0.0,-0.0,-0.0,0.0,0.0,0.0,-0.0,-0.0
std,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
min,-0.41,-0.53,-1.52,-0.29,-1.46,-4.11,-2.25,-1.28,-0.99,-1.31,-2.78,-3.6,-1.51
25%,-0.4,-0.53,-0.85,-0.29,-0.91,-0.62,-0.86,-0.83,-0.65,-0.77,-0.52,0.23,-0.79
50%,-0.38,-0.53,-0.27,-0.29,-0.19,-0.12,0.24,-0.29,-0.53,-0.47,0.28,0.39,-0.18
75%,-0.02,0.33,1.07,-0.29,0.66,0.49,0.92,0.69,1.64,1.52,0.8,0.44,0.59
max,8.95,3.53,2.51,3.41,2.86,3.59,1.16,3.14,1.64,1.79,1.64,0.45,3.18


Unnamed: 0,crim,zn,indus,chas,nox,rm,age,dis,rad,tax,ptratio,black,lstat
67,-0.4,0.01,-0.72,-0.29,-1.25,-0.64,-1.6,1.2,-0.65,-0.38,0.18,0.44,-0.62
0,-0.41,0.24,-1.28,-0.29,-0.1,0.4,-0.06,0.09,-0.99,-0.67,-1.51,0.45,-1.07
157,-0.28,-0.53,1.29,-0.29,0.49,0.95,1.07,-0.94,-0.53,-0.03,-1.79,0.1,-1.12
284,-0.4,3.31,-1.18,-0.29,-1.33,1.17,-1.62,1.57,-0.99,-0.73,-1.51,0.42,-0.65
136,-0.37,-0.53,1.64,-0.29,0.66,-0.55,0.93,-0.89,-0.65,0.17,1.27,0.26,0.65


### Step 3: Build a model 
***
Information about:
 - Optimizer: https://pytorch.org/docs/stable/optim.html
 - Basic neural-network blocks:  https://pytorch.org/docs/stable/nn.html
 - Loss:  https://pytorch.org/docs/stable/nn.html#loss-functions
 


In [67]:
import torch.nn as nn

class model_v1(nn.Module):
    """
    Basic fully connected neural-network for tabular data
    """
    
    def __init__(self,num_vars):
        super(model_v1, self).__init__()
        self.num_vars=num_vars
        self.hidden1 = nn.Linear(self.num_vars,  64)
        self.hidden2 = nn.Linear(64,64)
        self.hidden3 = nn.Linear(64, 1)

    def forward(self, x):
        x = x.view(-1,self.num_vars)   #flatten the observation before using fully-connected layers
        x = self.hidden1(x)
        x = F.relu(x)
        x = self.hidden2(x)
        x = F.relu(x)
        x = self.hidden3(x)
        return x 
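
As a quick sanity check on an architecture like this, the sketch below builds a stack with the same layer sizes using `nn.Sequential` (an illustrative stand-in, not the class above) and confirms that a batch of 13-feature rows maps to one prediction per row. The batch size of 8 and the random inputs are arbitrary assumptions.

```python
import torch
import torch.nn as nn

num_vars = 13  # input features left after dropping the index and target columns

# Same layer sizes as the class above: num_vars -> 64 -> 64 -> 1
sketch_model = nn.Sequential(
    nn.Linear(num_vars, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
)

batch = torch.randn(8, num_vars)   # 8 hypothetical observations
out = sketch_model(batch)
print(out.shape)                   # one output value per observation

# Trainable parameters: (13*64 + 64) + (64*64 + 64) + (64*1 + 1)
n_params = sum(p.numel() for p in sketch_model.parameters())
print(n_params)
```

The output keeps a trailing dimension of size 1, which matters when it is compared with a 1-D vector of labels.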


### Step 4: Train the model
***
#### Stochastic gradient descent strategy to fit the model

In [68]:
import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader



In [69]:
 
def fit(model,X_train,Y_train,X_test,Y_test, EPOCHS = 5, BATCH_SIZE = 32):
    
    loss_fn = F.mse_loss
#    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    opt = torch.optim.SGD(model.parameters(), lr=1e-4)
    model.train()
    
    history=convergence_history_MSELoss()
    
    history.update(model,torch.from_numpy(X_train),
                       torch.from_numpy(Y_train),
                       torch.from_numpy(X_test),
                       torch.from_numpy(Y_test))
    
    train_ds = TensorDataset(torch.from_numpy(X_train), torch.from_numpy(Y_train))
    train_dl = DataLoader(train_ds, BATCH_SIZE, shuffle=True)
    
    #stochastic gradient descent
    for epoch in range(EPOCHS):
        # Train with batches of data
        for xb,yb in train_dl:
            
            # 1. Generate predictions
            pred = model(xb)
            
            # 2. Calculate loss (flatten pred from (N, 1) to (N,) to match yb)
            loss = loss_fn(pred.view(-1), yb)
            
            # 3. Compute gradients
            loss.backward()
            
            # 4. Update parameters using gradients
            opt.step()
            
            # 5. Reset the gradients to zero
            opt.zero_grad()
        
        # Print the progress
        if (epoch+1) % 10 == 0:
            print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, EPOCHS, loss.item()))
        history.update(model,torch.from_numpy(X_train),
                       torch.from_numpy(Y_train),
                       torch.from_numpy(X_test),
                       torch.from_numpy(Y_test))
    
    return history
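
One detail worth checking in a loop like this is the shape of the tensors passed to the loss. A model ending in `nn.Linear(64, 1)` returns shape `(N, 1)`, while targets loaded from a 1-D array have shape `(N,)`. With mismatched shapes, `F.mse_loss` broadcasts the two to `(N, N)` (PyTorch emits a warning) and averages over all pairs. The toy values below are made up to illustrate the effect.

```python
import warnings
import torch
import torch.nn.functional as F

pred = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1), like a model output
target = torch.tensor([1.0, 2.0, 3.0])       # shape (3,), like a 1-D label vector

# Mismatched shapes: broadcast to (3, 3), so every prediction meets every target
with warnings.catch_warnings():
    warnings.simplefilter("ignore")          # silence the size-mismatch warning for the demo
    wrong = F.mse_loss(pred, target)

# Flattening the prediction compares element i with label i only
right = F.mse_loss(pred.view(-1), target)

print(wrong.item(), right.item())
```

Here the predictions are exactly equal to the targets, so the correctly shaped loss is zero while the broadcast version is not.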

#### Instantiate the model

In [76]:
model = model_v1(x_train.shape[1])   # 13 input features, matching the columns of x_train
print(model)

model_v1(
  (hidden1): Linear(in_features=13, out_features=64, bias=True)
  (hidden2): Linear(in_features=64, out_features=64, bias=True)
  (hidden3): Linear(in_features=64, out_features=1, bias=True)
)


#### Train the model

In [None]:
import torch 

#torch_x_train=torch.from_numpy(x_train)
#torch_y_train=torch.from_numpy(y_train)
#torch_x_test=torch.from_numpy(x_test)
#torch_y_test=torch.from_numpy(y_test)

batch_size  = 10
epochs      = 100

#res = {}
#for batch_size in 10,30, 50, 100:

res = fit(model, x_train, y_train, x_test, y_test, epochs, batch_size)

### Step 5: Evaluate the model
***
#### Model evaluation 
MAE = Mean Absolute Error (between the labels and the predictions). <br>
An MAE of 3 corresponds to an average prediction error of $3k, since prices are expressed in thousands of dollars.
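
A small worked example of both metrics, using made-up prices in thousands of dollars. Every prediction is off by 3 in one direction or the other, so the MAE comes out at about 3 (an average error of $3k) and the MSE at about 9.

```python
import numpy as np

# Hypothetical labels and predictions, in thousands of dollars
y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_hat = np.array([27.0, 18.6, 37.7, 30.4])

mae = np.mean(np.abs(y_hat - y_true))   # mean absolute error
mse = np.mean((y_hat - y_true) ** 2)    # mean squared error (the training loss)

print(mae, mse)
```

The MSE is expressed in squared units, which is why the MAE is usually the easier of the two to interpret.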

In [54]:
torch_x_test=torch.from_numpy(x_test)
torch_y_test=torch.from_numpy(y_test)

var_x_test = Variable(torch_x_test).float()
var_y_test = Variable(torch_y_test).float()
y_pred = model(var_x_test)

nn_loss = nn.MSELoss()
nn_MAE_loss = nn.L1Loss()

print('x_test / loss      : {:5.4f}'.format(nn_loss(y_pred.view(-1), var_y_test.view(-1)).item()))
print('x_test / mae       : {:5.4f}'.format(nn_MAE_loss(y_pred.view(-1), var_y_test.view(-1)).item()))



x_test / loss      : 585.6776
x_test / mae       : 22.3192


In [None]:
import matplotlib.pyplot as plt

plt.plot(res.history['loss'], label='loss')
plt.plot(res.history['val_loss'], label='val_loss')
plt.xlabel('epoch')
plt.ylabel('MSE')
plt.legend()
plt.show()