<a href="https://colab.research.google.com/github/VMBoehm/N3ASProject_Annie/blob/main/First_Regression_with_Pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler

In [None]:
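# Set fixed random number seed
torch.manual_seed(42)

# Load Boston dataset
# NOTE: load_boston is deprecated (see the warning below) and was removed
# in scikit-learn 1.2, so this cell needs an older scikit-learn version.
# TASK: what are the X and y values in this dataset?
X, y = load_boston(return_X_y=True)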


    The Boston housing prices dataset has an ethical problem. You can refer to
    the documentation of this function for further details.

    The scikit-learn maintainers therefore strongly discourage the use of this
    dataset unless the purpose of the code is to study and educate about
    ethical issues in data science and machine learning.

    In this special case, you can fetch the dataset from the original
    source::

        import pandas as pd
        import numpy as np


        data_url = "http://lib.stat.cmu.edu/datasets/boston"
        raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
        data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
        target = raw_df.values[1::2, 2]

    Alternative datasets include the California housing dataset (i.e.
    :func:`~sklearn.datasets.fetch_california_housing`) and the Ames housing
    dataset. You can load the datasets as follows::

        from sklearn.datasets import fetch_california_h
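
The dataset class in the next cell passes the features through `StandardScaler` by default. With its default settings, that transform subtracts each column's mean and divides by the column's population standard deviation. A minimal pure-Python sketch of that computation, using made-up numbers (the `standardize_column` helper is hypothetical, not a scikit-learn function):

```python
import statistics

def standardize_column(values):
    """Return (v - mean) / std for each value, using the population std (ddof=0)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Made-up feature column, for illustration only
column = [2.0, 4.0, 6.0, 8.0]
scaled = standardize_column(column)
print(scaled)
```

After scaling, the column has mean 0 and standard deviation 1, which puts features measured in very different units on a comparable scale.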

In [None]:
# TASK: What is a class in Python? What is it used for? How does inheritance work in Python?
# TASK: Find out what StandardScaler does.
class BostonDataset(torch.utils.data.Dataset):
  '''
  Prepare the Boston dataset for regression
  '''

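  def __init__(self, X, y, scale_data=True):
    # Convert NumPy inputs to tensors; inputs that are already tensors
    # are stored as passed in, so self.X and self.y are always set
    if not torch.is_tensor(X):
      # Apply scaling if necessary
      if scale_data:
        X = StandardScaler().fit_transform(X)
      X = torch.from_numpy(X)
    if not torch.is_tensor(y):
      y = torch.from_numpy(y)
    self.X = X
    self.y = y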

  def __len__(self):
      return len(self.X)

  def __getitem__(self, i):
      return self.X[i], self.y[i]
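
`BostonDataset` inherits from `torch.utils.data.Dataset` and defines `__len__` and `__getitem__`, which is what makes `len(dataset)` and `dataset[i]` work. The same mechanism can be seen without PyTorch in the sketch below; `BaseDataset` and `ListDataset` are hypothetical names used only for illustration.

```python
class BaseDataset:
    """Hypothetical stand-in for a base class such as torch.utils.data.Dataset."""

    def __getitem__(self, i):
        # Subclasses are expected to override this method
        raise NotImplementedError

    def describe(self):
        # Inherited unchanged by every subclass
        return f"{type(self).__name__} with {len(self)} samples"


class ListDataset(BaseDataset):
    """Subclass that stores features and targets in Python lists."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, i):
        return self.X[i], self.y[i]


toy = ListDataset([[1.0, 2.0], [3.0, 4.0]], [10.0, 20.0])
print(len(toy))        # calls ListDataset.__len__
print(toy[1])          # calls ListDataset.__getitem__
print(toy.describe())  # method inherited from BaseDataset
```

The subclass only supplies what is specific to it (how to count and fetch samples); everything else comes from the parent class.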

In [None]:
class MLP(nn.Module):
  '''
    Multilayer Perceptron for regression.
  '''
  def __init__(self):
    super().__init__()
    # TASK: how many parameters does this network have?
    self.layers = nn.Sequential(
      nn.Linear(13, 64),
      nn.ReLU(),
      nn.Linear(64, 32),
      nn.ReLU(),
      nn.Linear(32, 1)
    )


  def forward(self, x):
    '''
      Forward pass
    '''
    return self.layers(x)
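
For the parameter-count task, note that each `nn.Linear(n_in, n_out)` layer holds an `n_in * n_out` weight matrix plus `n_out` biases, while `nn.ReLU` has no parameters. A small sketch of the arithmetic for the layer sizes used above (the `linear_params` helper is hypothetical):

```python
def linear_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one fully connected layer."""
    return n_in * n_out + n_out

# (inputs, outputs) of the three nn.Linear layers in the MLP above
layer_sizes = [(13, 64), (64, 32), (32, 1)]
per_layer = [linear_params(n_in, n_out) for n_in, n_out in layer_sizes]
total = sum(per_layer)

print(per_layer)  # [896, 2080, 33]
print(total)      # 3009
```

With PyTorch available, `sum(p.numel() for p in MLP().parameters())` should give the same total.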

In [None]:
# Prepare Boston dataset
dataset = BostonDataset(X, y)
trainloader = torch.utils.data.DataLoader(dataset, batch_size=10, shuffle=True, num_workers=1)
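
With `batch_size=10` and the default `drop_last=False`, the loader yields `ceil(N / 10)` mini-batches per epoch, and the last one may be smaller than the rest. Assuming the commonly cited size of 506 samples for the Boston dataset (an assumption here; `len(dataset)` gives the actual value), the sketch below works out the batch count:

```python
import math

n_samples = 506   # assumed dataset size; check with len(dataset)
batch_size = 10

n_batches = math.ceil(n_samples / batch_size)
last_batch = n_samples - (n_batches - 1) * batch_size

print(n_batches)   # 51
print(last_batch)  # 6
```

This is consistent with the training log further down, where the last reported mini-batch in each complete epoch is number 51.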

In [None]:
# Initialize the MLP
mlp = MLP()

# Define the loss function and optimizer
# TASK: What is L1 loss? What other loss functions could we use?
loss_function = nn.L1Loss()
optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-4)
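
`nn.L1Loss` with its default mean reduction is the mean absolute error; `nn.MSELoss` is the squared-error counterpart and penalizes large residuals more strongly. A pure-Python sketch with made-up predictions and targets (the helper names `l1_loss` and `mse_loss` are hypothetical):

```python
def l1_loss(preds, targets):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def mse_loss(preds, targets):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

# Made-up values: residuals are -1 and 3
preds = [2.0, 4.0]
targets = [1.0, 7.0]
print(l1_loss(preds, targets))   # 2.0
print(mse_loss(preds, targets))  # 5.0
```

The residual of 3 contributes three times as much as the residual of 1 to the L1 loss, but nine times as much to the squared loss, which is why the squared loss is more sensitive to outliers.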

In [None]:
# Run the training loop
for epoch in range(0, 5): # 5 epochs at maximum
  
  # Print epoch
  print(f'Starting epoch {epoch+1}')
  
  # Set current loss value
  current_loss = 0.0
  
  # Iterate over the DataLoader for training data
  for i, data in enumerate(trainloader, 0):
    
    # Get and prepare inputs
    inputs, targets = data
    inputs, targets = inputs.float(), targets.float()
    targets = targets.reshape((targets.shape[0], 1))
    
    # Zero the gradients
    optimizer.zero_grad()
    
    # Perform forward pass
    outputs = mlp(inputs)
    
    # Compute loss
    loss = loss_function(outputs, targets)
    
    # Perform backward pass
    loss.backward()
    
    # Perform optimization
    optimizer.step()
    
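    # Print statistics
    # NOTE: current_loss sums the loss over the mini-batches since the last
    # print (1 at i == 0, otherwise 10), and 500 is a fixed divisor, so the
    # printed value is a scaled sum rather than a per-batch mean loss.
    current_loss += loss.item()
    if i % 10 == 0:
        print('Loss after mini-batch %5d: %.3f' %
              (i + 1, current_loss / 500))
        current_loss = 0.0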

# Process is complete.
print('Training process has finished.')

Starting epoch 1
Loss after mini-batch     1: 0.043
Loss after mini-batch    11: 0.426
Loss after mini-batch    21: 0.442
Loss after mini-batch    31: 0.461
Loss after mini-batch    41: 0.452
Loss after mini-batch    51: 0.472
Starting epoch 2
Loss after mini-batch     1: 0.047
Loss after mini-batch    11: 0.432
Loss after mini-batch    21: 0.450
Loss after mini-batch    31: 0.434
Loss after mini-batch    41: 0.485
Loss after mini-batch    51: 0.434
Starting epoch 3
Loss after mini-batch     1: 0.049
Loss after mini-batch    11: 0.458
Loss after mini-batch    21: 0.435
Loss after mini-batch    31: 0.423
Loss after mini-batch    41: 0.417
Loss after mini-batch    51: 0.480
Starting epoch 4
Loss after mini-batch     1: 0.051
Loss after mini-batch    11: 0.436
Loss after mini-batch    21: 0.417
Loss after mini-batch    31: 0.452
Loss after mini-batch    41: 0.447
Loss after mini-batch    51: 0.438
Starting epoch 5
Loss after mini-batch     1: 0.043
Loss after mini-batch    11: 0.442
Loss 
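
Because the loop divides the accumulated loss by a fixed 500 while printing every 10 mini-batches, the logged numbers are not per-batch means. One possible alternative, sketched here with made-up loss values and a hypothetical `RunningMean` helper, divides by the number of batches actually accumulated:

```python
class RunningMean:
    """Accumulates values and reports their mean since the last reset."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def pop_mean(self):
        # Return the mean of the accumulated values, then reset
        mean = self.total / self.count
        self.total, self.count = 0.0, 0
        return mean


# Made-up per-batch losses standing in for loss.item()
fake_losses = [22.0, 21.0, 23.0, 20.0]
meter = RunningMean()
reported = []
for i, loss_value in enumerate(fake_losses):
    meter.update(loss_value)
    if (i + 1) % 2 == 0:  # report every 2 batches in this toy example
        reported.append(meter.pop_mean())

print(reported)  # [21.5, 21.5]
```

In the training loop, `meter.update(loss.item())` would take the place of the manual accumulation, and the reporting interval would be 10 instead of 2.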

In [None]:
# TASK: How well does this model do? What happens when you change the network architecture? Try different modifications, e.g. more or fewer layers, a wider network, or other activation functions.
# What happens when you change the loss function?
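
One way to judge how well the model does is to compare its L1 loss with that of a trivial predictor. A constant prediction equal to the median of the targets minimizes the mean absolute error among all constant predictions, so it makes a reasonable reference point. A sketch with made-up target values (`mae` is a hypothetical helper):

```python
import statistics

def mae(preds, targets):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)

# Made-up house prices, for illustration only
targets = [10.0, 20.0, 22.0, 24.0, 50.0]

# Constant predictors: always the median, or always the mean
median_pred = [statistics.median(targets)] * len(targets)
mean_pred = [statistics.fmean(targets)] * len(targets)

print(mae(median_pred, targets))
print(mae(mean_pred, targets))
```

A trained network is only doing useful work if its L1 loss, ideally measured on data held out from training, is clearly below this baseline.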