### **Homework 2 Problem 1**
* **Objective:** Build a fully connected neural network for the Housing dataset
* **Train:** 80% training / 20% validation split, 200 epochs
* **Network:** 1 hidden layer with 8 nodes
* **Report:** Training time, training loss, evaluation accuracy

In [1]:
%matplotlib inline
import numpy as np
import torch
import torch.optim as optim
import pandas as pd
import torch.nn as nn
import matplotlib.pyplot as plt

**Importing Data**

In [2]:
housing = pd.read_csv("../Housing.csv")
housing.head()

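|   | price | area | bedrooms | bathrooms | stories | mainroad | guestroom | basement | hotwaterheating | airconditioning | parking | prefarea | furnishingstatus |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 13300000 | 7420 | 4 | 2 | 3 | yes | no | no | no | yes | 2 | yes | furnished |
| 1 | 12250000 | 8960 | 4 | 4 | 4 | yes | no | no | no | yes | 3 | no | furnished |
| 2 | 12250000 | 9960 | 3 | 2 | 2 | yes | no | yes | no | no | 2 | yes | semi-furnished |
| 3 | 12215000 | 7500 | 4 | 2 | 2 | yes | no | yes | no | yes | 3 | yes | furnished |
| 4 | 11410000 | 7420 | 4 | 1 | 2 | yes | yes | yes | no | yes | 2 | no | furnished |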


**Selecting Features**

* Keep the numeric columns: price, area, bedrooms, bathrooms, stories, parking

In [3]:
# Columns by position: price, area, bedrooms, bathrooms, stories, parking
feature_set = housing.iloc[:, [0,1,2,3,4,10]]
feature_set.head()

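|   | price | area | bedrooms | bathrooms | stories | parking |
|---|---|---|---|---|---|---|
| 0 | 13300000 | 7420 | 4 | 2 | 3 | 2 |
| 1 | 12250000 | 8960 | 4 | 4 | 4 | 3 |
| 2 | 12250000 | 9960 | 3 | 2 | 2 | 2 |
| 3 | 12215000 | 7500 | 4 | 2 | 2 | 3 |
| 4 | 11410000 | 7420 | 4 | 1 | 2 | 2 |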


**Normalizing Feature Dataset**

In [4]:
housing = (feature_set-feature_set.mean())/feature_set.std()
housing.head()

|   | price | area | bedrooms | bathrooms | stories | parking |
|---|---|---|---|---|---|---|
| 0 | 4.562174 | 1.045766 | 1.402131 | 1.420507 | 1.376952 | 1.516299 |
| 1 | 4.000809 | 1.755397 | 1.402131 | 5.400847 | 2.5297 | 2.67695 |
| 2 | 4.000809 | 2.216196 | 0.047235 | 1.420507 | 0.224204 | 1.516299 |
| 3 | 3.982096 | 1.08263 | 1.402131 | 1.420507 | 0.224204 | 2.67695 |
| 4 | 3.551716 | 1.045766 | 1.402131 | -0.569663 | 0.224204 | 1.516299 |


**Splitting Data**
* Training: 80%
* Validation: 20%

In [15]:
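# Shuffle once, then slice: first 80% of rows for training, the rest for validation
shuffled = housing.sample(frac=1, random_state=42)
n_train = int(.8*len(housing))
train, validate = shuffled.iloc[:n_train], shuffled.iloc[n_train:]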

X1_training = train.iloc[:,[1,2,3,4,5]]
Y_training = train.iloc[:,0]

X1_val = validate.iloc[:,[1,2,3,4,5]]
Y_val = validate.iloc[:,0]

X1_training = torch.tensor(X1_training.values).float()
Y_training = torch.tensor(Y_training.values).float()
Y_training = Y_training[:,None]

X1_val = torch.tensor(X1_val.values).float()
Y_val = torch.tensor(Y_val.values).float()
Y_val = Y_val[:,None]
print("Size of X1_train: ", X1_training.shape)
print("Size of Y_train: ", Y_training.shape)
print("Size of X_val: ", X1_val.shape)
print("Size of Y_val: ", Y_val.shape)

Size of X1_train:  torch.Size([436, 5])
Size of Y_train:  torch.Size([436, 1])
Size of X_val:  torch.Size([109, 5])
Size of Y_val:  torch.Size([109, 1])
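The standardization above uses the mean and standard deviation of all 545 rows, so the 109 validation rows influence how the training data is scaled. One common variation, sketched here on made-up data rather than `Housing.csv`, is to split first and compute the statistics from the training rows only. The toy values and the names `toy`, `train_raw`, and `val_raw` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the selected housing columns (values are made up).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "price": rng.uniform(2e6, 13e6, size=50),
    "area": rng.uniform(2000, 10000, size=50),
    "bedrooms": rng.integers(1, 6, size=50).astype(float),
})

# Shuffle, then split 80/20 before computing any statistics.
shuffled = toy.sample(frac=1, random_state=42)
n_train = int(0.8 * len(shuffled))
train_raw, val_raw = shuffled.iloc[:n_train], shuffled.iloc[n_train:]

# Mean and standard deviation come from the training rows only.
mu, sigma = train_raw.mean(), train_raw.std()
train_norm = (train_raw - mu) / sigma
val_norm = (val_raw - mu) / sigma

# Undo the scaling to read a normalized price back in original units.
price_back = val_norm["price"] * sigma["price"] + mu["price"]
```

Keeping `mu` and `sigma` around also makes it possible to convert predictions back to prices, which the normalized losses alone do not show.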


In [9]:
def training_loop(n_epochs, optimizer, model, loss_fn, t_u_train,
                  t_u_val, t_c_train, t_c_val):
    for epoch in range(1, n_epochs+1):
        # Forward pass on the training set
        t_p_train = model(t_u_train)
        loss_train = loss_fn(t_p_train, t_c_train)

        # Validation loss is only monitored, so skip gradient tracking
        with torch.no_grad():
            t_p_val = model(t_u_val)
            loss_val = loss_fn(t_p_val, t_c_val)

        # Backward pass and parameter update
        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        if epoch == 1 or epoch % 10 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")
            
    

In [10]:
# Hand-written mean squared error; the training runs below pass nn.MSELoss() instead
def loss_fn(t_p, t_c):
    squared_diffs = (t_p - t_c)**2
    return squared_diffs.mean()
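As a quick sanity check, the hand-written loss should agree with `nn.MSELoss()` at its default mean reduction. The sketch below compares the two on random tensors (the tensors are made up for illustration) and also shows why the targets were reshaped with `[:,None]` earlier: a flat target broadcasts against a column of predictions.

```python
import torch
import torch.nn as nn

def loss_fn(t_p, t_c):
    squared_diffs = (t_p - t_c) ** 2
    return squared_diffs.mean()

torch.manual_seed(0)
pred = torch.randn(6, 1)    # model output has shape (N, 1)
target = torch.randn(6, 1)  # target reshaped to (N, 1) to match

manual = loss_fn(pred, target)
builtin = nn.MSELoss()(pred, target)
same = torch.allclose(manual, builtin)

# Pitfall: a (6,) target broadcasts against a (6, 1) prediction to (6, 6),
# so the mean would be taken over 36 mismatched pairs instead of 6.
flat_target = target.squeeze(1)
broadcast_shape = (pred - flat_target).shape
```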

**Single Hidden Layer Model**

In [11]:
seq_model1 = nn.Sequential(
    nn.Linear(5,8),
    nn.Tanh(),
    nn.Linear(8,1)
)
seq_model1

Sequential(
  (0): Linear(in_features=5, out_features=8, bias=True)
  (1): Tanh()
  (2): Linear(in_features=8, out_features=1, bias=True)
)

**Multilayer Model**

In [17]:
seq_model2 = nn.Sequential(
    nn.Linear(5,8),
    nn.Tanh(),
    nn.Linear(8,4),
    nn.Tanh(),
    nn.Linear(4,2),
    nn.Tanh(),
    nn.Linear(2,1)
)
seq_model2

Sequential(
  (0): Linear(in_features=5, out_features=8, bias=True)
  (1): Tanh()
  (2): Linear(in_features=8, out_features=4, bias=True)
  (3): Tanh()
  (4): Linear(in_features=4, out_features=2, bias=True)
  (5): Tanh()
  (6): Linear(in_features=2, out_features=1, bias=True)
)
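To compare the capacity of the two architectures, it can help to count their trainable parameters. The sketch below rebuilds both networks with the same layer sizes; `count_parameters` is a helper name introduced here, not something defined in the notebook.

```python
import torch.nn as nn

def count_parameters(model):
    """Total number of trainable weights and biases."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

one_hidden = nn.Sequential(
    nn.Linear(5, 8), nn.Tanh(),
    nn.Linear(8, 1),
)
three_hidden = nn.Sequential(
    nn.Linear(5, 8), nn.Tanh(),
    nn.Linear(8, 4), nn.Tanh(),
    nn.Linear(4, 2), nn.Tanh(),
    nn.Linear(2, 1),
)

# (5*8 + 8) + (8*1 + 1) = 57
n_one = count_parameters(one_hidden)
# (5*8 + 8) + (8*4 + 4) + (4*2 + 2) + (2*1 + 1) = 97
n_three = count_parameters(three_hidden)
```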

In [13]:
optimizer = optim.SGD(seq_model1.parameters(), lr=1e-3)

training_loop(200,optimizer, seq_model1,nn.MSELoss(), X1_training, X1_val, Y_training, Y_val)
# print('output', seq_model1(X1_val))
# print('answer', Y_val)


Epoch 1, Training loss 0.9913, Validation loss 1.0486
Epoch 10, Training loss 0.9760, Validation loss 1.0324
Epoch 20, Training loss 0.9595, Validation loss 1.0150
Epoch 30, Training loss 0.9436, Validation loss 0.9982
Epoch 40, Training loss 0.9282, Validation loss 0.9820
Epoch 50, Training loss 0.9132, Validation loss 0.9664
Epoch 60, Training loss 0.8988, Validation loss 0.9513
Epoch 70, Training loss 0.8848, Validation loss 0.9368
Epoch 80, Training loss 0.8712, Validation loss 0.9228
Epoch 90, Training loss 0.8581, Validation loss 0.9092
Epoch 100, Training loss 0.8454, Validation loss 0.8961
Epoch 110, Training loss 0.8330, Validation loss 0.8835
Epoch 120, Training loss 0.8211, Validation loss 0.8713
Epoch 130, Training loss 0.8095, Validation loss 0.8595
Epoch 140, Training loss 0.7982, Validation loss 0.8481
Epoch 150, Training loss 0.7873, Validation loss 0.8372
Epoch 160, Training loss 0.7768, Validation loss 0.8265
Epoch 170, Training loss 0.7665, Validation loss 0.8163
...
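The objective lists training time among the items to report, but the run above prints only losses. One possible way to capture it is to wrap the loop with `time.perf_counter()` and keep the per-epoch losses for later plotting. This is a sketch on random stand-in tensors; `timed_training_loop` and `history` are names assumed here, and the numbers it produces say nothing about the housing data.

```python
import time
import torch
import torch.nn as nn
import torch.optim as optim

def timed_training_loop(n_epochs, optimizer, model, loss_fn,
                        x_train, x_val, y_train, y_val):
    history = {"train": [], "val": []}
    start = time.perf_counter()
    for epoch in range(1, n_epochs + 1):
        loss_train = loss_fn(model(x_train), y_train)
        with torch.no_grad():  # validation is monitoring only
            loss_val = loss_fn(model(x_val), y_val)

        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        history["train"].append(loss_train.item())
        history["val"].append(loss_val.item())
    elapsed = time.perf_counter() - start
    return history, elapsed

# Random stand-in data with the same feature count as the notebook (5 inputs).
torch.manual_seed(0)
x_train, y_train = torch.randn(40, 5), torch.randn(40, 1)
x_val, y_val = torch.randn(10, 5), torch.randn(10, 1)

model = nn.Sequential(nn.Linear(5, 8), nn.Tanh(), nn.Linear(8, 1))
optimizer = optim.SGD(model.parameters(), lr=1e-3)
history, elapsed = timed_training_loop(
    20, optimizer, model, nn.MSELoss(), x_train, x_val, y_train, y_val)
print(f"Training time: {elapsed:.4f} s")
```

The two lists in `history` can be passed straight to `plt.plot` to draw training and validation curves, which would put the otherwise unused `matplotlib` import to work.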

In [14]:
optimizer = optim.SGD(seq_model2.parameters(), lr=1e-3)

training_loop(200,optimizer, seq_model2,nn.MSELoss(), X1_training, X1_val, Y_training, Y_val)

Epoch 1, Training loss 0.9420, Validation loss 0.9729
Epoch 10, Training loss 0.9405, Validation loss 0.9717
Epoch 20, Training loss 0.9388, Validation loss 0.9702
Epoch 30, Training loss 0.9371, Validation loss 0.9688
Epoch 40, Training loss 0.9353, Validation loss 0.9673
Epoch 50, Training loss 0.9336, Validation loss 0.9659
Epoch 60, Training loss 0.9318, Validation loss 0.9644
Epoch 70, Training loss 0.9301, Validation loss 0.9629
Epoch 80, Training loss 0.9283, Validation loss 0.9614
Epoch 90, Training loss 0.9264, Validation loss 0.9599
Epoch 100, Training loss 0.9246, Validation loss 0.9583
Epoch 110, Training loss 0.9228, Validation loss 0.9568
Epoch 120, Training loss 0.9209, Validation loss 0.9552
Epoch 130, Training loss 0.9190, Validation loss 0.9536
Epoch 140, Training loss 0.9171, Validation loss 0.9520
Epoch 150, Training loss 0.9152, Validation loss 0.9504
Epoch 160, Training loss 0.9132, Validation loss 0.9488
Epoch 170, Training loss 0.9113, Validation loss 0.9471
...
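Because price is standardized, a model that always predicts the mean scores an MSE of about 1, so the losses above are easier to interpret next to that baseline. For a regression task, "evaluation accuracy" could be reported as the coefficient of determination (R²) together with an RMSE converted back to price units. The sketch below is one way to compute both; `regression_report` and its `price_std` argument are assumptions introduced here, and the tensors are toy values.

```python
import torch

def regression_report(pred, target, price_std):
    """pred and target are normalized prices of shape (N, 1);
    price_std is the standard deviation used to normalize price."""
    mse = torch.mean((pred - target) ** 2)
    ss_res = torch.sum((target - pred) ** 2)
    ss_tot = torch.sum((target - target.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Multiplying by the normalization std converts RMSE back to price units.
    rmse_original = torch.sqrt(mse) * price_std
    return {"mse": mse.item(), "r2": r2.item(),
            "rmse_original": rmse_original.item()}

target = torch.tensor([[0.0], [1.0], [2.0], [3.0]])

# A perfect prediction gives R^2 = 1; predicting the mean gives R^2 = 0.
perfect = regression_report(target.clone(), target, price_std=2.0)
mean_only = regression_report(torch.full_like(target, 1.5), target, price_std=2.0)
```

In the notebook this would presumably be called with `seq_model1(X1_val)` computed under `torch.no_grad()`, `Y_val` as the target, and `feature_set["price"].std()` as `price_std`.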