# Implementing a first neural network using PyTorch 

In this exercise, we will learn how to implement a neural network using PyTorch. 

## Single Layer Neural Network 

Let's implement a single-layer neural network, namely

$$
\begin{align}
x &= [x_1, \ldots, x_m] \\
g(x) &= \sum_{j=1}^m w_j x_j + b
\end{align}
$$
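
As a small illustration of the formula above, the sketch below computes $g(x)$ once as an explicit weighted sum and once with `torch.nn.Linear`, which implements the same affine map. The feature count `m = 4` and the random values are assumptions chosen only for the demonstration.

```python
import torch

torch.manual_seed(0)

m = 4                      # number of input features (illustrative value)
x = torch.randn(m)         # one sample x = [x_1, ..., x_m]
w = torch.randn(m)         # weights w_1, ..., w_m
b = torch.tensor(0.5)      # bias

# g(x) written out as the weighted sum from the formula
g_manual = (w * x).sum() + b

# The same computation expressed as a linear layer with a single output
layer = torch.nn.Linear(m, 1)
with torch.no_grad():
    layer.weight.copy_(w.unsqueeze(0))  # weight has shape (1, m)
    layer.bias.copy_(b.unsqueeze(0))    # bias has shape (1,)
g_layer = layer(x)                      # output has shape (1,)

print(g_manual.item(), g_layer.item())
```

Both printed values should agree up to floating-point rounding, which is why a single `nn.Linear` layer is enough to express this model.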

### Preprocessing

We will use the [King County house sales dataset](https://www.kaggle.com/datasets/harlfoxem/housesalesprediction) from Kaggle.


In [4]:
# from sklearn.datasets import fetch_california_housing
import seaborn as sns
import numpy as np
import pandas as pd

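data_table = pd.read_csv("../data/kc_house_data.csv")
# The date string starts with YYYYMMDD; split it into numeric year, month, and day.
data_table["sale_yr"] = pd.to_numeric(data_table.date.str.slice(0, 4))
data_table["sale_month"] = pd.to_numeric(data_table.date.str.slice(4, 6))
data_table["sale_day"] = pd.to_numeric(data_table.date.str.slice(6, 8))
# Keep only the columns used below, with the target `price` last.
data_table = pd.DataFrame(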
    data_table,
    columns=[
        "sale_yr",
        "sale_month",
        "sale_day",
        "view",
        "waterfront",
        "lat",
        "long",
        "bedrooms",
        "bathrooms",
        "sqft_living",
        "sqft_lot",
        "floors",
        "condition",
        "grade",
        "sqft_above",
        "sqft_basement",
        "yr_built",
        "yr_renovated",
        "zipcode",
        "sqft_living15",
        "sqft_lot15",
        "price",
    ],
)
data_table

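| | sale_yr | sale_month | sale_day | view | waterfront | lat | long | bedrooms | bathrooms | sqft_living | ... | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | sqft_living15 | sqft_lot15 | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2014 | 10 | 13 | 0 | 0 | 47.5112 | -122.257 | 3 | 1.00 | 1180 | ... | 3 | 7 | 1180 | 0 | 1955 | 0 | 98178 | 1340 | 5650 | 221900.0 |
| 1 | 2014 | 12 | 9 | 0 | 0 | 47.7210 | -122.319 | 3 | 2.25 | 2570 | ... | 3 | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 1690 | 7639 | 538000.0 |
| 2 | 2015 | 2 | 25 | 0 | 0 | 47.7379 | -122.233 | 2 | 1.00 | 770 | ... | 3 | 6 | 770 | 0 | 1933 | 0 | 98028 | 2720 | 8062 | 180000.0 |
| 3 | 2014 | 12 | 9 | 0 | 0 | 47.5208 | -122.393 | 4 | 3.00 | 1960 | ... | 5 | 7 | 1050 | 910 | 1965 | 0 | 98136 | 1360 | 5000 | 604000.0 |
| 4 | 2015 | 2 | 18 | 0 | 0 | 47.6168 | -122.045 | 3 | 2.00 | 1680 | ... | 3 | 8 | 1680 | 0 | 1987 | 0 | 98074 | 1800 | 7503 | 510000.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 21608 | 2014 | 5 | 21 | 0 | 0 | 47.6993 | -122.346 | 3 | 2.50 | 1530 | ... | 3 | 8 | 1530 | 0 | 2009 | 0 | 98103 | 1530 | 1509 | 360000.0 |
| 21609 | 2015 | 2 | 23 | 0 | 0 | 47.5107 | -122.362 | 4 | 2.50 | 2310 | ... | 3 | 8 | 2310 | 0 | 2014 | 0 | 98146 | 1830 | 7200 | 400000.0 |
| 21610 | 2014 | 6 | 23 | 0 | 0 | 47.5944 | -122.299 | 2 | 0.75 | 1020 | ... | 3 | 7 | 1020 | 0 | 2009 | 0 | 98144 | 1020 | 2007 | 402101.0 |
| 21611 | 2015 | 1 | 16 | 0 | 0 | 47.5345 | -122.069 | 3 | 2.50 | 1600 | ... | 3 | 8 | 1600 | 0 | 2004 | 0 | 98027 | 1410 | 1287 | 400000.0 |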


In [5]:
# Target: sale price, rescaled to thousands
y = data_table["price"].values / 1000
# Features: every remaining column
X = data_table.drop("price", axis=1).values

Split the data into training and test sets. Use 80% of the data for training and 20% for testing.


In [15]:
# TODO: split the data into training and testing sets
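
One possible way to approach the split is scikit-learn's `train_test_split`. The sketch below runs on a small synthetic stand-in for `X` and `y` so that it is self-contained; the array sizes and the `random_state` value are assumptions, not part of the exercise.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the notebook's feature matrix X and target y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.normal(size=100)

# test_size=0.2 keeps 80% of the rows for training and 20% for testing;
# random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```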

Standardize the features using their mean and standard deviation.


In [16]:
# TODO: Standardize the data
from sklearn.preprocessing import StandardScaler
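
A minimal sketch of the standardization step, assuming the scaler is fitted on the training data only and then reused for the test data so that no test-set statistics leak into training. The arrays here are synthetic stand-ins for the split data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the training and test features
rng = np.random.default_rng(0)
X_train = rng.normal(loc=10.0, scale=3.0, size=(80, 5))
X_test = rng.normal(loc=10.0, scale=3.0, size=(20, 5))

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # learn mean/std on the training set
X_test_std = scaler.transform(X_test)        # apply the same mean/std to the test set

print(X_train_std.mean(axis=0).round(3), X_train_std.std(axis=0).round(3))
```

After fitting, each training column has mean 0 and standard deviation 1; the test columns will only be approximately standardized, which is expected.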

Convert the data to PyTorch tensors.


In [17]:
# TODO: Convert to PyTorch tensors
import torch
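
The conversion could look like the sketch below. It assumes NumPy arrays as input (synthetic here), casts them to `float32` because that is the default parameter dtype of PyTorch layers, and reshapes the target to a column so that it matches a model output of shape `(N, 1)`.

```python
import numpy as np
import torch

# Synthetic stand-ins for the standardized features and the target
rng = np.random.default_rng(0)
X_train_std = rng.normal(size=(80, 5))   # NumPy defaults to float64
y_train = rng.normal(size=80)

# Cast to float32 and turn the target into a column vector of shape (N, 1)
X_train_t = torch.tensor(X_train_std, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.float32).unsqueeze(1)

print(X_train_t.dtype, tuple(X_train_t.shape), tuple(y_train_t.shape))
```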

In [18]:
# TODO: Prepare the dataset for batching
from torch.utils.data import DataLoader, TensorDataset
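
For batching, one option is to wrap the tensors in a `TensorDataset` and iterate over it with a `DataLoader`. The batch size of 32 and the synthetic tensors are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-ins for the training tensors
X_train_t = torch.randn(80, 5)
y_train_t = torch.randn(80, 1)

# TensorDataset pairs each feature row with its target;
# DataLoader yields shuffled mini-batches from it
train_ds = TensorDataset(X_train_t, y_train_t)
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)

xb, yb = next(iter(train_loader))
print(len(train_loader), tuple(xb.shape), tuple(yb.shape))
```

With 80 samples and a batch size of 32, the loader produces three batches per epoch, the last one holding the remaining 16 samples.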

## Model & Training 


Define the model:


In [19]:
# TODO: Define the model
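
A minimal sketch of a single-layer model as an `nn.Module`. The class name `SingleLayerNet` is hypothetical, and `n_features=21` assumes the 21 feature columns that remain after dropping `price` from the table above.

```python
import torch
from torch import nn


class SingleLayerNet(nn.Module):
    """Hypothetical single-layer regression model: one affine map to a scalar."""

    def __init__(self, n_features: int):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


model = SingleLayerNet(n_features=21)
out = model(torch.randn(8, 21))   # a batch of 8 samples
n_params = sum(p.numel() for p in model.parameters())
print(tuple(out.shape), n_params)
```

The model has 21 weights and one bias, which matches the weighted-sum form of $g(x)$.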

Define the loss:


In [20]:
# TODO: Define the loss function
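
The exercise does not prescribe a loss. Since the target is a continuous price, mean squared error is a common choice and is assumed in this sketch, which checks `nn.MSELoss` on two hand-picked predictions.

```python
import torch
from torch import nn

loss_fn = nn.MSELoss()

pred = torch.tensor([[2.0], [4.0]])
target = torch.tensor([[1.0], [6.0]])

# Errors are 1 and -2, so the mean squared error is (1 + 4) / 2 = 2.5
loss = loss_fn(pred, target)
print(loss.item())
```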

Define the optimizer:


In [21]:
# TODO: Optimizer
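
As one possible optimizer, the sketch below uses plain stochastic gradient descent (`torch.optim.SGD`) and performs a single update step to show how the optimizer, the loss, and the model interact. The learning rate and the synthetic batch are assumptions; `torch.optim.Adam` would be set up the same way.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(21, 1)   # stand-in for the single-layer model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One synthetic mini-batch
x, y = torch.randn(16, 21), torch.randn(16, 1)

before = model.weight.detach().clone()
optimizer.zero_grad()                       # clear gradients from earlier steps
loss = nn.functional.mse_loss(model(x), y)  # forward pass and loss
loss.backward()                             # compute gradients
optimizer.step()                            # update the parameters in place
changed = not torch.equal(before, model.weight.detach())
print(changed)
```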

In [6]:
# Training loop with DataLoader
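
A training loop over a `DataLoader` might be sketched as follows. The data is synthetic (a noisy linear function of five features), and the epoch count, batch size, and learning rate are assumptions picked so that the example converges quickly.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data: a linear target plus a little noise
X = torch.randn(256, 5)
true_w = torch.tensor([[1.0], [-2.0], [0.5], [3.0], [0.0]])
y = X @ true_w + 0.1 * torch.randn(256, 1)

loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
model = nn.Linear(5, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

epoch_losses = []
for epoch in range(20):
    model.train()
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()             # reset accumulated gradients
        loss = loss_fn(model(xb), yb)     # forward pass on the mini-batch
        loss.backward()                   # backpropagate
        optimizer.step()                  # update weights
        running += loss.item() * xb.size(0)
    # Average loss per sample over the epoch
    epoch_losses.append(running / len(loader.dataset))

print(epoch_losses[0], epoch_losses[-1])
```

Tracking the per-epoch average loss makes it easy to check that training is actually reducing the error.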

Evaluate the model


In [7]:
# Evaluate the model
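
Evaluation could be sketched as below, assuming MSE and RMSE as the metrics. The model and test tensors are untrained, synthetic stand-ins; the relevant parts are `model.eval()` and `torch.no_grad()`, which switch off training-only behaviour and gradient tracking.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(5, 1)          # stand-in for a trained model
X_test_t = torch.randn(40, 5)    # synthetic test features
y_test_t = torch.randn(40, 1)    # synthetic test targets

model.eval()                     # evaluation mode (matters for layers such as dropout)
with torch.no_grad():            # no gradients are needed for evaluation
    preds = model(X_test_t)
    test_mse = nn.functional.mse_loss(preds, y_test_t).item()
test_rmse = test_mse ** 0.5

print(f"MSE: {test_mse:.4f}  RMSE: {test_rmse:.4f}")
```

Because the notebook divides `price` by 1000, an RMSE computed on that target is expressed in thousands of the original price unit.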

## Deeper Neural Networks

Define a deeper model with a non-linear activation function.


In [8]:
# TODO: Define a deeper model w/ non-linear activation function
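
One way to sketch a deeper model is `nn.Sequential` with ReLU activations between the linear layers. The hidden sizes (64 and 32) and the 21 input features are assumptions; dropout is left out here and treated in its own section.

```python
import torch
from torch import nn

# Two hidden layers with ReLU; without the activations the stack
# would collapse into a single linear map
deep_model = nn.Sequential(
    nn.Linear(21, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
)

out = deep_model(torch.randn(8, 21))   # a batch of 8 samples
print(tuple(out.shape))
```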

Train and evaluate the model.


In [9]:
# TODO: Run the training loop and evaluate the model
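
Since the same training and evaluation steps apply to every model in this exercise, they can be wrapped in a function. `train_and_evaluate` below is a hypothetical helper, not part of PyTorch; it assumes Adam, an MSE loss, and synthetic non-linear data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_and_evaluate(model, train_loader, X_test, y_test, epochs=10, lr=1e-3):
    """Hypothetical helper: train with Adam on MSE, then return the test MSE."""
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        model.train()
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()
    model.eval()
    with torch.no_grad():
        return loss_fn(model(X_test), y_test).item()


torch.manual_seed(0)
X = torch.randn(200, 5)
y = torch.sin(X.sum(dim=1, keepdim=True))   # synthetic non-linear target

# First 160 rows for training, last 40 for testing
loader = DataLoader(TensorDataset(X[:160], y[:160]), batch_size=32, shuffle=True)
net = nn.Sequential(nn.Linear(5, 32), nn.ReLU(), nn.Linear(32, 1))

test_mse = train_and_evaluate(net, loader, X[160:], y[160:], epochs=5)
print(test_mse)
```

Passing the model in as an argument lets the single-layer, deeper, and dropout variants be compared with identical training settings.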

## Dropout


In [11]:
# TODO: Define a deeper model w/ non-linear activation function and add dropout to prevent overfitting
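
A possible dropout variant inserts `nn.Dropout` after each hidden activation. The dropout probability `p=0.2` and the layer sizes are assumptions. The sketch also shows why `model.train()` and `model.eval()` matter: dropout is random in training mode and disabled in evaluation mode.

```python
import torch
from torch import nn

torch.manual_seed(0)

dropout_model = nn.Sequential(
    nn.Linear(21, 64),
    nn.ReLU(),
    nn.Dropout(p=0.2),   # randomly zeroes 20% of activations during training
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Dropout(p=0.2),
    nn.Linear(32, 1),
)

x = torch.randn(8, 21)

# Evaluation mode: dropout is a no-op, so repeated passes give identical outputs
dropout_model.eval()
with torch.no_grad():
    eval_a = dropout_model(x)
    eval_b = dropout_model(x)

# Training mode: each pass samples a fresh dropout mask, so outputs differ
dropout_model.train()
with torch.no_grad():
    train_a = dropout_model(x)
    train_b = dropout_model(x)

print(torch.equal(eval_a, eval_b), torch.equal(train_a, train_b))
```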

In [10]:
# TODO: Run the training loop and evaluate the model