<a href="https://colab.research.google.com/github/hmcarrasco/codecademy_pytorch_basic/blob/main/PytorchBasicLN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Import Libraries

Import PyTorch, pandas, NumPy, and scikit-learn, or import them as needed in the cells below.

In [18]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split


# Import Data

Import the `streeteasy.csv` dataset and preview the first few rows.

In [8]:
apartments_df = pd.read_csv("streeteasy.csv")
apartments_df.head()

Unnamed: 0,rental_id,rent,bedrooms,bathrooms,size_sqft,min_to_subway,floor,building_age_yrs,no_fee,has_roofdeck,has_washer_dryer,has_doorman,has_elevator,has_dishwasher,has_patio,has_gym,neighborhood,borough
0,1545,2550,0.0,1,480,9,2.0,17,1,1,0,0,1,1,0,1,Upper East Side,Manhattan
1,2472,11500,2.0,2,2000,4,1.0,96,0,0,0,0,0,0,0,0,Greenwich Village,Manhattan
2,2919,4500,1.0,1,916,2,51.0,29,0,1,0,1,1,1,0,0,Midtown,Manhattan
3,2790,4795,1.0,1,975,3,8.0,31,0,0,0,1,1,1,0,1,Greenwich Village,Manhattan
4,3946,17500,2.0,2,4800,3,4.0,136,0,0,0,1,1,1,0,1,Soho,Manhattan


# Select Target

Select the numeric column that the neural network will be trying to predict. Feel free to use rent again, or try to predict another column!

Convert this column to a PyTorch tensor.

In [10]:
y = torch.tensor(apartments_df['building_age_yrs'].values, dtype=torch.float).view(-1,1)
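
The `.view(-1, 1)` call reshapes the target from a flat vector of length N into an N×1 column, so it lines up with the network's single output node. If the shapes differ (`(N,)` against `(N, 1)`), PyTorch's loss functions broadcast the two tensors against each other instead of comparing row by row. A minimal sketch on a made-up three-row DataFrame (`toy_df` is a stand-in for illustration, not the real dataset):

```python
import pandas as pd
import torch

# Hypothetical toy data standing in for streeteasy.csv
toy_df = pd.DataFrame({"building_age_yrs": [17, 96, 29]})

flat = torch.tensor(toy_df["building_age_yrs"].values, dtype=torch.float)
column = flat.view(-1, 1)  # -1 lets PyTorch infer the number of rows

print(flat.shape)    # torch.Size([3])
print(column.shape)  # torch.Size([3, 1])
```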

# Select Features

Select the numeric columns that the neural network will use as input features to predict the target.

In [11]:
numerical_features = ['bedrooms', 'bathrooms', 'size_sqft', 'min_to_subway', 'floor', 'rent',
                      'no_fee', 'has_roofdeck', 'has_washer_dryer', 'has_doorman', 'has_elevator', 'has_dishwasher',
                      'has_patio', 'has_gym']

X = torch.tensor(apartments_df[numerical_features].values, dtype=torch.float)
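
The feature list above is typed out by hand. One possible alternative, sketched here on a tiny made-up DataFrame, is to let pandas pick the numeric columns and then drop the target and any ID-like columns. The names `toy_df`, `feature_cols`, and `n_features` are assumptions for this example only; `n_features` is the value the first `nn.Linear()` layer would need as its input size.

```python
import pandas as pd
import torch

# Hypothetical toy data with a few of the dataset's column names
toy_df = pd.DataFrame({
    "rental_id": [1545, 2472],
    "rent": [2550, 11500],
    "size_sqft": [480, 2000],
    "building_age_yrs": [17, 96],
    "borough": ["Manhattan", "Manhattan"],
})

target = "building_age_yrs"

# Keep numeric columns only, then drop the target and the ID column
feature_cols = (
    toy_df.select_dtypes(include="number")
    .columns.drop([target, "rental_id"])
    .tolist()
)

X_toy = torch.tensor(toy_df[feature_cols].values, dtype=torch.float)
n_features = X_toy.shape[1]  # input size for the first nn.Linear layer

print(feature_cols, n_features)
```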

# Train-Test-Split

Split the features and target into training and testing datasets. A good initial proportion is 80/20.

In [12]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    train_size=0.8,
    test_size=0.2,
    random_state=2,
)
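
To see what an 80/20 split produces, the sketch below runs `train_test_split` on random stand-in data and prints the resulting shapes. Unlike the cell above, which passes tensors directly, this version splits NumPy arrays first and converts to tensors afterwards. That ordering is only one option, and the array names are made up for the example.

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split

# Random stand-in data: 10 rows, 3 features, 1 target column
rng = np.random.default_rng(0)
X_np = rng.random((10, 3)).astype(np.float32)
y_np = rng.random((10, 1)).astype(np.float32)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_np, y_np, test_size=0.2, random_state=2
)

# Convert each piece to a tensor after splitting
X_tr, X_te = torch.from_numpy(X_tr), torch.from_numpy(X_te)
y_tr, y_te = torch.from_numpy(y_tr), torch.from_numpy(y_te)

print(X_tr.shape, X_te.shape)  # 8 training rows, 2 testing rows
print(y_tr.shape, y_te.shape)
```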

# Create a Neural Network

Create a neural network using either `Sequential` or OOP. Remember, the first `nn.Linear()` needs to match the number of input features, and the final output needs to have one node for regression.

In [15]:
torch.manual_seed(42)

model = nn.Sequential(
    nn.Linear(14, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64,32),
    nn.ReLU(),
    nn.Linear(32,1)
)
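
For comparison, here is a sketch of the same architecture written in the OOP style, as an `nn.Module` subclass. The class name `RegressionNet` and the `n_features` argument are made up for this example; the layer sizes mirror the `Sequential` model above.

```python
import torch
import torch.nn as nn


class RegressionNet(nn.Module):  # hypothetical class name
    def __init__(self, n_features=14):
        super().__init__()
        self.fc1 = nn.Linear(n_features, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 32)
        self.out = nn.Linear(32, 1)  # one output node for regression
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.relu(self.fc3(x))
        return self.out(x)  # no activation on the regression output


torch.manual_seed(42)
net = RegressionNet()
out = net(torch.randn(5, 14))  # a batch of 5 rows with 14 features each
print(out.shape)  # torch.Size([5, 1])
```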

# Select a Loss Function

Select a loss function. Feel free to use MSE again, or check out PyTorch's other [loss functions](https://pytorch.org/docs/stable/nn.html#loss-functions). A good alternative to MSE is `nn.L1Loss()`, which computes the mean absolute error (MAE).

In [16]:
loss = nn.L1Loss()
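
To make the difference between the two losses concrete, the sketch below evaluates `nn.L1Loss()` and `nn.MSELoss()` on three hand-picked predictions. The numbers are invented for illustration. Because MSE squares each error, the single error of 3 dominates it far more than it dominates the MAE.

```python
import torch
import torch.nn as nn

# Made-up predictions and targets; the errors are 1, 0, and -3
preds = torch.tensor([[2.0], [4.0], [6.0]])
targets = torch.tensor([[1.0], [4.0], [9.0]])

mae = nn.L1Loss()(preds, targets)   # mean(|1|, |0|, |-3|) = 4/3
mse = nn.MSELoss()(preds, targets)  # mean(1, 0, 9) = 10/3

print(mae.item(), mse.item())
```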

# Select an Optimizer

Select an optimizer. Feel free to use Adam again, or check out PyTorch's other [optimizers](https://pytorch.org/docs/stable/optim.html#algorithms). A good alternative to Adam is `optim.SGD`, which implements stochastic gradient descent.

In [19]:
optimizer = optim.Adam(model.parameters(), lr=0.001)
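
SGD lives in `torch.optim`, so it is created the same way as Adam. The sketch below runs a single SGD update on a small stand-in layer to show that `step()` changes the weights. The layer, the data, and the learning rate of `0.01` are arbitrary choices for the example, not tuned values.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
layer = nn.Linear(3, 1)  # small stand-in model
sgd = optim.SGD(layer.parameters(), lr=0.01)

before = layer.weight.detach().clone()

# One update: forward pass, loss, backward pass, optimizer step
loss_value = nn.L1Loss()(layer(torch.ones(4, 3)), torch.zeros(4, 1))
loss_value.backward()
sgd.step()
sgd.zero_grad()

after = layer.weight.detach().clone()
print((after - before).abs().max().item())  # size of the largest weight change
```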

# Training Loop

Use your selected loss and optimizer functions to train the neural network.

In [21]:
num_epochs = 1000

for epoch in range(num_epochs):
    predictions = model(X_train)      # forward pass
    MAE = loss(predictions, y_train)  # loss is nn.L1Loss(), so this is the mean absolute error
    MAE.backward()                    # compute gradients
    optimizer.step()                  # update the weights
    optimizer.zero_grad()             # reset gradients for the next epoch

    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], MAE Loss: {MAE.item()}')

Epoch [100/1000], MAE Loss: 83.79371643066406
Epoch [200/1000], MAE Loss: 85.50395965576172
Epoch [300/1000], MAE Loss: 64.82847595214844
Epoch [400/1000], MAE Loss: 20.366796493530273
Epoch [500/1000], MAE Loss: 5.333431720733643
Epoch [600/1000], MAE Loss: 18.271549224853516
Epoch [700/1000], MAE Loss: 47.70771026611328
Epoch [800/1000], MAE Loss: 34.745052337646484
Epoch [900/1000], MAE Loss: 10.709973335266113
Epoch [1000/1000], MAE Loss: 0.3243952691555023


# Evaluate

As you experiment, evaluate each version of your model on the testing dataset, to validate its performance on unseen data.


In [23]:
model.eval()
with torch.no_grad():
    predictions = model(X_test)
    test_MAE = loss(predictions, y_test)  # loss is nn.L1Loss(), so this is the mean absolute error

print('Test MAE is ' + str(test_MAE.item()))

Test MAE is 17.02067756652832
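
Because the loss selected earlier is `nn.L1Loss()`, the value it returns is a mean absolute error, and taking its square root does not give an RMSE. A sketch of one way to report both metrics follows. The helper name `regression_metrics` is hypothetical, and the toy check uses `nn.Identity()` as a stand-in model so the expected values can be worked out by hand.

```python
import torch
import torch.nn as nn


def regression_metrics(model, X, y):  # hypothetical helper name
    """Return (MAE, RMSE) for a model on the given features and targets."""
    model.eval()
    with torch.no_grad():
        preds = model(X)
        mae = nn.L1Loss()(preds, y).item()
        rmse = nn.MSELoss()(preds, y).sqrt().item()  # RMSE = sqrt(MSE)
    return mae, rmse


# Toy check: the identity "model" returns its input, so the errors are 1 and 3
X_toy = torch.tensor([[1.0], [2.0]])
y_toy = torch.tensor([[2.0], [5.0]])
mae, rmse = regression_metrics(nn.Identity(), X_toy, y_toy)

print(mae)   # (1 + 3) / 2 = 2.0
print(rmse)  # sqrt((1 + 9) / 2) = sqrt(5), about 2.236
```

On the notebook's data, the call would be `regression_metrics(model, X_test, y_test)`.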


# Save the Final Network

Save your final network for later use.

In [24]:
torch.save(model, 'model.pth')
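
`torch.save(model, ...)` pickles the whole model object. A commonly used alternative is to save only the learned parameters with `state_dict()` and load them into a freshly built model of the same architecture. The sketch below assumes the same layer layout as the network above. The `build_model` helper and the `model_state.pth` file name are made up for the example, and the file is written to a temporary directory.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():  # hypothetical helper; same layout as the network above
    return nn.Sequential(
        nn.Linear(14, 128), nn.ReLU(),
        nn.Linear(128, 64), nn.ReLU(),
        nn.Linear(64, 32), nn.ReLU(),
        nn.Linear(32, 1),
    )


torch.manual_seed(42)
trained = build_model()  # stands in for the trained network

# Save only the parameters
path = os.path.join(tempfile.mkdtemp(), "model_state.pth")
torch.save(trained.state_dict(), path)

# Rebuild the architecture, then load the saved parameters into it
restored = build_model()
restored.load_state_dict(torch.load(path))
restored.eval()

x = torch.randn(2, 14)
with torch.no_grad():
    same = torch.equal(trained(x), restored(x))
print(same)  # the restored model reproduces the original's predictions
```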