
# Boston Housing Data Price Prediction
Housing Values in Suburbs of Boston
The medv variable is the target variable.

Data description

The Boston data frame has 506 rows and 14 columns.

This data frame contains the following columns:

- `crim`: per capita crime rate by town.
- `zn`: proportion of residential land zoned for lots over 25,000 sq.ft.
- `indus`: proportion of non-retail business acres per town.
- `chas`: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise).
- `nox`: nitrogen oxides concentration (parts per 10 million).
- `rm`: average number of rooms per dwelling.
- `age`: proportion of owner-occupied units built prior to 1940.
- `dis`: weighted mean of distances to five Boston employment centres.
- `rad`: index of accessibility to radial highways.
- `tax`: full-value property-tax rate per \$10,000.
- `ptratio`: pupil-teacher ratio by town.
- `black`: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town.
- `lstat`: lower status of the population (percent).
- `medv`: median value of owner-occupied homes in \$1000s.

Source
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102.

Belsley D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.

In [None]:
import pandas as pd
import numpy as np
import warnings

#suppress warnings
warnings.filterwarnings('ignore')


df=pd.read_csv('BostonHousing.csv')
df.head(5)

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

#drop the label column from the features and keep it as a separate Series
X=df.drop(columns=['medv'])
y=df['medv']

#create a StandardScaler instance
scaler=StandardScaler()

#keep the feature column names
columns=X.columns

#standardize each feature to zero mean and unit variance
X=scaler.fit_transform(X)

#rebuild a DataFrame from the standardized features
X=pd.DataFrame(X,columns=columns)

#split the standardized features 'X' and the labels 'y' 70/30; a fixed random_state makes the shuffle reproducible across runs
X_train, X_test,y_train,y_test=train_test_split(X,y,test_size=0.3, random_state=20)

print('X_train shape',X_train.shape)
print('y_train shape',y_train.shape)
print('X_test shape',X_test.shape)
print('y_test shape',y_test.shape)

X_train shape (354, 13)
y_train shape (354,)
X_test shape (152, 13)
y_test shape (152,)
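
The cell above fits `StandardScaler` on all 506 rows before splitting, so the test rows contribute to the means and variances used for scaling. A common alternative, sketched here on synthetic data rather than `BostonHousing.csv`, is to split first and fit the scaler on the training rows only. The array shapes mirror the notebook; the random data and the `*_demo` names are illustrative assumptions, and this variant would give slightly different numbers from the outputs shown in this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 506 x 13 feature matrix (assumption, not the real data)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(506, 13))
y_demo = rng.normal(loc=22.0, scale=9.0, size=506)

# Split first, so the scaler never sees the test rows
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=20
)

scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)  # learn mean/std from the training rows only
X_te_scaled = scaler_demo.transform(X_te)      # reuse the training statistics on the test rows

print(X_tr_scaled.shape, X_te_scaled.shape)
print(X_tr_scaled.mean(axis=0).round(6))  # close to 0 for every feature
```

The test features are then scaled with statistics the model could also have at prediction time, which is usually what an evaluation is meant to simulate.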


In [None]:
import torch
from torch import nn # nn : PyTorch's neural network package (layers, activations, losses)

#Dataset subclass that wraps the feature and label arrays
class data(torch.utils.data.Dataset):

    #constructor: store the features and labels as tensors
    def __init__(self,X,y):

        #convert the feature array 'X' into a tensor (n-dimensional array)
        self.X=torch.from_numpy(X)

        #convert the label array 'y' into a tensor
        self.y=torch.from_numpy(y)

    #return the number of samples in the dataset
    def __len__(self):
        return len(self.X)

    #return the (features, label) pair at index i
    def __getitem__(self,i):
        return self.X[i],self.y[i]

# nn.Module : base class for all PyTorch models; this network passes the input through one hidden layer and returns a single output
class NeuralNetwork(nn.Module):
    def __init__(self):

        #initialize nn.Module so the layers defined below are registered as part of the model
        super().__init__()

        #'layers' holds the whole network as a sequential stack
        self.layers=nn.Sequential(
            nn.Linear(13,32),    #linear layer: 13 input features (one per column) -> 32 hidden units
            nn.Sigmoid(),       #sigmoid activation applied element-wise to the 32 hidden units
            nn.Linear(32,1))    #linear layer: 32 hidden units -> 1 output (the predicted medv)

    #forward pass: feed the inputs through the layers
    def forward(self,inputs):
        return self.layers(inputs)

train_dataset=data(X_train.values,y_train.values)

#DataLoader : iterates over the training dataset in batches of 8
train_loader= torch.utils.data.DataLoader(train_dataset,batch_size= 8)

#create the model instance
NN= NeuralNetwork()

#NN.parameters() : the model's learnable weights and biases
#optim.SGD : stochastic gradient descent optimizer with a learning rate of 1e-4,
#used to update the network weights during training
optimizer = torch.optim.SGD(NN.parameters(),lr=1e-4)

#MSELoss : mean of the squared differences (y - y^)^2 between targets and predictions
cost= nn.MSELoss()

#20 epochs, each a full pass of forward and backward propagation over the training data
epochs=20

for epoch in range(epochs):
    print('epoch:',epoch)
    total_loss=0  #sum of the batch losses for this epoch

    #iterate over the mini-batches produced by train_loader
    for i , batch in enumerate(train_loader,0):

        #split the batch into inputs and targets/labels
        inputs,targets=batch

        #the network's weights are float32, so cast the float64 data to match
        inputs,targets=inputs.float(),targets.float()
        targets=targets.reshape((-1,1))  #reshape (batch,) -> (batch,1) to match the shape of preds

        #reset the gradients accumulated in the previous step
        optimizer.zero_grad()

        #forward pass: compute predictions for this batch
        preds=NN(inputs)

        #compute the loss between the predictions and the true targets
        losses=cost(preds,targets)

        #backward pass: compute the gradient of the loss w.r.t. each parameter
        losses.backward()

        #update the parameters using the gradients
        optimizer.step()

        #accumulate the batch loss
        total_loss += losses.item()
    print('total_loss: ',total_loss)

#predict on the test set; cast the float64 test features to float32 first
preds=NN(torch.from_numpy(X_test.values).float())
result=pd.DataFrame()
#detach : return a tensor that shares the same data but is cut off from the gradient graph, so it can be converted to NumPy
result['Preds']=preds.detach().numpy().reshape(-1,)
result['Actual']=y_test.values
result

epoch: 0
total_loss:  24719.76629638672
epoch: 1
total_loss:  21531.146057128906
epoch: 2
total_loss:  18779.871368408203
epoch: 3
total_loss:  16389.66421508789
epoch: 4
total_loss:  14303.755004882812
epoch: 5
total_loss:  12479.296249389648
epoch: 6
total_loss:  10883.38198852539
epoch: 7
total_loss:  9490.058681488037
epoch: 8
total_loss:  8278.040916442871
epoch: 9
total_loss:  7229.010320663452
epoch: 10
total_loss:  6326.43116569519
epoch: 11
total_loss:  5554.832290649414
epoch: 12
total_loss:  4899.4685616493225
epoch: 13
total_loss:  4346.2535927295685
epoch: 14
total_loss:  3881.8504351377487
epoch: 15
total_loss:  3493.8282729387283
epoch: 16
total_loss:  3170.809861704707
epoch: 17
total_loss:  2902.576039262116
epoch: 18
total_loss:  2680.1098918020725
epoch: 19
total_loss:  2495.5861745476723
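
The `total_loss` values printed above are sums of per-batch mean squared errors, so their magnitude depends on the number of batches. As a rough reading aid, the sketch below converts the final value into an average per-batch MSE and its square root, assuming the 354 training rows and `batch_size=8` used in this notebook. Because the last batch holds fewer rows and the weights change during the epoch, this average is only an approximation of the dataset-level MSE.

```python
import math

n_train = 354          # training rows, from the shapes printed earlier
batch_size = 8         # DataLoader batch size used in the notebook
final_total_loss = 2495.5861745476723  # total_loss reported for epoch 19

# Number of mini-batches per epoch (the last one is smaller)
n_batches = math.ceil(n_train / batch_size)
last_batch_rows = n_train - (n_batches - 1) * batch_size

# Average of the per-batch MSE values, and its square root in units of $1000s
mean_batch_mse = final_total_loss / n_batches
approx_rmse = math.sqrt(mean_batch_mse)

print(n_batches, last_batch_rows)
print(round(mean_batch_mse, 2), round(approx_rmse, 2))
```

Under these assumptions the average batch MSE comes out at roughly 55.5, which corresponds to a typical error of about 7.4 (in \$1000s) at the end of training. The loss was still falling at epoch 19, so more epochs or a larger learning rate may reduce it further.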


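The training loop reshapes `targets` with `targets.reshape((-1,1))` before computing the loss. The sketch below, on made-up tensors, illustrates why: the network outputs shape `(batch, 1)` while the labels arrive as shape `(batch,)`, and without the reshape `nn.MSELoss` broadcasts the two into a `(batch, batch)` grid of differences, which gives a different loss value. The tensor values and the `*_demo` names are illustrative assumptions.

```python
import warnings
import torch
from torch import nn

warnings.filterwarnings('ignore')  # PyTorch warns about the shape mismatch shown below

cost_demo = nn.MSELoss()

preds_demo = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), like the network output
targets_flat = torch.tensor([1.0, 2.0, 4.0])      # shape (3,), like a batch of labels

# Without reshape: (3, 1) vs (3,) broadcasts to (3, 3), comparing every pred with every target
loss_broadcast = cost_demo(preds_demo, targets_flat)

# With reshape: (3, 1) vs (3, 1), an element-wise comparison
loss_matched = cost_demo(preds_demo, targets_flat.reshape((-1, 1)))

print((preds_demo - targets_flat).shape)  # shape of the broadcast difference
print(loss_broadcast.item(), loss_matched.item())
```

The two losses differ because the broadcast version averages over nine pairwise differences instead of three matched ones, so the gradients would no longer reflect each sample's own error.
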
| | Preds | Actual |
|---|---|---|
| 0 | 19.491924 | 21.2 |
| 1 | 21.240629 | 20.6 |
| 2 | 18.599703 | 21.5 |
| 3 | 19.287558 | 21.7 |
| 4 | 13.820187 | 13.4 |
| ... | ... | ... |
| 147 | 19.152626 | 20.1 |
| 148 | 11.933540 | 8.8 |
| 149 | 19.660833 | 21.7 |
| 150 | 20.616467 | 20.3 |
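
A table of predictions against actual values is easier to judge with a summary number. The sketch below computes the mean absolute error and root mean squared error on the first five rows shown above; in the notebook the same arithmetic would be applied to the full `result['Preds']` and `result['Actual']` columns. Five rows are not representative of the 152 test rows, so the numbers only demonstrate the calculation.

```python
import math

# First five rows of the result table (predicted vs actual medv, in $1000s)
preds_sample = [19.491924, 21.240629, 18.599703, 19.287558, 13.820187]
actual_sample = [21.2, 20.6, 21.5, 21.7, 13.4]

# Signed error for each row
errors = [p - a for p, a in zip(preds_sample, actual_sample)]

# Mean absolute error and root mean squared error over the sample
mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(round(mae, 3), round(rmse, 3))
```

RMSE penalizes large misses more heavily than MAE, so reporting both gives a quick sense of whether a few rows dominate the error.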
