## BME i9400
## Fall 2024
### Neural Networks in PyTorch


## Classes and objects in Python
- An ```object``` is a structure that contains both data (variables) and functions (methods)
- A ```class``` is a blueprint for creating objects
- Classes are used to organize programs and make them more modular

## Example: Medical record class
- We will create a class to represent a patient's medical record
- The class will have the following attributes:
    - Name
    - Age
    - Height
    - Weight
    - BMI
- The class will have the following methods:
    - ```calculate_bmi```: calculates the BMI of the patient
    - ```print_record```: prints the patient's record
    - ```update_weight```: updates the patient's weight
    - ```update_height```: updates the patient's height
    - ```update_age```: updates the patient's age
    - ```update_name```: updates the patient's name
    - ```update_record```: updates all fields of the patient's record at once

In [1]:
class MedicalRecord:
    def __init__(self, name, age, height, weight):
        self.name = name
        self.age = age
        self.height = height
        self.weight = weight
        self.bmi = self.calculate_bmi()

    def calculate_bmi(self):
        # BMI = weight (kg) / height (m) squared
        return self.weight / (self.height**2)

    def print_record(self):
        print(f'Name: {self.name}')
        print(f'Age: {self.age}')
        print(f'Height: {self.height}')
        print(f'Weight: {self.weight}')
        print(f'BMI: {self.bmi}')

    def update_weight(self, weight):
        self.weight = weight
        self.bmi = self.calculate_bmi()

    def update_height(self, height):
        self.height = height
        self.bmi = self.calculate_bmi()

    def update_age(self, age):
        self.age = age

    def update_name(self, name):
        self.name = name

    def update_record(self, name, age, height, weight):
        self.name = name
        self.age = age
        self.height = height
        self.weight = weight
        self.bmi = self.calculate_bmi()

## Creating an object of the MedicalRecord class

In [8]:
patient = MedicalRecord('Jacek', 29, 1.77, 73)
# patient is an instance of the MedicalRecord class (an object)

## Obtaining *attributes* of a class

In [6]:
print(patient.name)
print(patient.age)

Jacek
29


## Calling *methods* of a class

In [7]:
patient.print_record()

Name: Jacek
Age: 29
Height: 1.77
Weight: 73
BMI: 23.301094832264035
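
Methods can also change the data stored inside an object. The sketch below uses a compact stand-in for the class above (only the pieces needed here, with height assumed to be in meters and weight in kilograms) to show that ```update_weight``` keeps the stored BMI consistent with the new weight.

```python
# Compact stand-in for the MedicalRecord class (illustration only)
class MedicalRecord:
    def __init__(self, name, age, height, weight):
        self.name = name
        self.age = age
        self.height = height  # assumed to be in meters
        self.weight = weight  # assumed to be in kilograms
        self.bmi = self.calculate_bmi()

    def calculate_bmi(self):
        return self.weight / (self.height ** 2)

    def update_weight(self, weight):
        self.weight = weight
        self.bmi = self.calculate_bmi()  # recompute so the BMI never goes stale


patient = MedicalRecord('Jacek', 29, 1.77, 73)
bmi_before = patient.bmi

patient.update_weight(70)  # the method changes the object's internal state
bmi_after = patient.bmi

print(f'BMI before: {bmi_before:.2f}')
print(f'BMI after:  {bmi_after:.2f}')
```

Recomputing the BMI inside the update method means the caller never has to remember to do it.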


## From theory to practice
- We have seen the theory of multilayer perceptrons (MLPs) in the previous lecture
- We now turn to the practical aspects of implementing MLPs in PyTorch
- We will begin by implementing logistic regression in PyTorch

## Library imports

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

## Generating a synthetic dataset
- We will make use of scikit-learn's ```make_classification``` function to generate a simple binary classification dataset
- We will split the dataset into training and validation sets
- We will then convert the NumPy arrays to PyTorch tensors

In [None]:
# generate features and labels
X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, random_state=42)

# train/validation split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# convert features and labels into PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
X_val = torch.tensor(X_val, dtype=torch.float32)
y_val = torch.tensor(y_val, dtype=torch.long)
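
A quick sanity check on shapes and data types is often worthwhile before building a model. The sketch below repeats the data preparation steps in a self-contained form and prints what each tensor looks like. With 100 samples and ```test_size=0.2```, the split should leave 80 training and 20 validation examples.

```python
import torch
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same settings as the data preparation cell
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
X_val = torch.tensor(X_val, dtype=torch.float32)
y_val = torch.tensor(y_val, dtype=torch.long)

# Features are float32 matrices, labels are integer (long) vectors
print(X_train.shape, X_train.dtype)
print(y_train.shape, y_train.dtype)
print(X_val.shape, y_val.shape)
```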

## Logistic regression in PyTorch
- A neural network in PyTorch is an instance of a class that inherits from ```nn.Module```
- Each neural network consists of a series of *layers*
- In our simple case, we will use a single layer for logistic regression
- Each ```nn.Module``` subclass must implement a ```forward``` method that defines the forward pass of the network
- Remember that "forward pass" means computing the output of the network given the input

In [None]:
# Define the logistic regression model
class LogisticRegressionModel(nn.Module):
    def __init__(self, input_size, output_size): # input_size: number of features, output_size: number of classes
        super(LogisticRegressionModel, self).__init__()
        # Single layer for logistic regression
        self.linear = nn.Linear(input_size, output_size) # "linear" is the *name* of our layer
        self.softmax = nn.Softmax(dim=1)  # Softmax activation function to convert outputs to probabilities

    def forward(self, x):
        return self.softmax(self.linear(x))
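
To make the forward pass concrete, the following sketch computes the same quantity twice: once with the PyTorch layers and once written out explicitly as a matrix product followed by a softmax. The seed and the batch of 3 random inputs are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

linear = nn.Linear(2, 2)      # 2 input features, 2 output classes
softmax = nn.Softmax(dim=1)   # normalize across the class dimension

x = torch.randn(3, 2)         # hypothetical batch of 3 examples

# Forward pass using the PyTorch layers
probs_layers = softmax(linear(x))

# The same computation written out explicitly
logits = x @ linear.weight.T + linear.bias
exp_logits = torch.exp(logits)
probs_manual = exp_logits / exp_logits.sum(dim=1, keepdim=True)

print(torch.allclose(probs_layers, probs_manual, atol=1e-6))
print(probs_layers.sum(dim=1))  # each row sums to 1
```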

## Creating an instance of the model
- So far we have defined a class, which is a blueprint for the model
- To actually create a model that we can work with, we need to *instantiate* the class
- Instantiating a class means creating an *object* of that class

In [None]:
input_size = X_train.shape[1]  # Number of features
output_size = 2  # Binary classification
model = LogisticRegressionModel(input_size, output_size)

## Inspecting the model
- The model object provides a convenient way to access the layers and parameters of the model
- Let's take a look at the weights and biases of the model

In [None]:
model.linear.weight

In [None]:
model.linear.bias

### Q1: Why are there two rows in the weight matrix?
### Q2: Why are there two columns in the weight matrix?
### Q3: Where did the weights and biases come from?

## Evaluating the (untrained) model on random inputs
- Before training the model, let's see what the model predicts on random inputs
- This will help us to understand how the forward pass works in PyTorch

In [None]:
# Random input
x = torch.randn(1, input_size)

# Forward pass
output = model(x)
output

## Evaluating model on a batch of inputs
- A *batch* refers to a set of examples that is used to compute the gradient of the loss function during training
- A typical batch size is 32, 64, or 128 examples
- An *epoch* refers to a complete pass through the dataset, such that each example has appeared in a batch exactly once
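
The relationship between batches and epochs can be sketched with PyTorch's ```DataLoader```. The dataset here is random and purely hypothetical (80 examples, 2 features), and the batch size of 32 is just one of the typical values mentioned above.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical dataset: 80 examples with 2 features each
X_demo = torch.randn(80, 2)
y_demo = torch.randint(0, 2, (80,))

dataset = TensorDataset(X_demo, y_demo)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# One epoch = one complete pass over the loader
batch_sizes = []
for X_batch, y_batch in loader:
    batch_sizes.append(X_batch.shape[0])

print(batch_sizes)       # [32, 32, 16]: the last batch holds the remainder
print(sum(batch_sizes))  # 80: every example appears exactly once per epoch
```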

In [None]:
# Random batch of inputs
X_batch = torch.randn(5, input_size)

# Forward pass
output = model(X_batch)
output
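
One way to see what happens with a batch: the model processes each row independently, so the batched output should match the outputs obtained one example at a time. The sketch below checks this with a stand-in model (a linear layer followed by softmax, built with ```nn.Sequential``` for brevity).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Stand-in for the logistic regression model: linear layer followed by softmax
model_demo = nn.Sequential(nn.Linear(2, 2), nn.Softmax(dim=1))

X_batch = torch.randn(5, 2)
batch_output = model_demo(X_batch)

# Process the same examples one at a time and stack the results
single_outputs = torch.cat([model_demo(X_batch[i:i + 1]) for i in range(5)])

print(batch_output.shape)       # one row per example, one column per class
print(batch_output.sum(dim=1))  # each row is a probability distribution
print(torch.allclose(batch_output, single_outputs, atol=1e-6))
```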

## Defining the loss function and optimizer
- PyTorch provides a wide range of loss functions and optimizers
- For logistic regression, we will use the cross-entropy loss and stochastic gradient descent (SGD) optimizer
- An important parameter for the optimizer is the *learning rate*, which controls the size of the updates to the weights

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01) # learning rate = 0.01
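
It can help to see what ```nn.CrossEntropyLoss``` computes. It expects raw scores (logits) as input, applies a log-softmax internally, and averages the negative log-probability of the true class. The sketch below reproduces this by hand on a few hypothetical scores, and also shows that feeding it probabilities that have already passed through a softmax gives a different value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical raw scores (logits) for 3 examples and 2 classes
logits = torch.tensor([[2.0, -1.0],
                       [0.5, 1.5],
                       [-0.3, 0.3]])
targets = torch.tensor([0, 1, 0])

criterion = nn.CrossEntropyLoss()
loss_builtin = criterion(logits, targets)

# Manual version: log-softmax, pick the true class, average the negatives
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(len(targets)), targets].mean()

# Applying the criterion to probabilities (softmax applied twice overall)
probs = F.softmax(logits, dim=1)
loss_on_probs = criterion(probs, targets)

print(loss_builtin.item(), loss_manual.item())  # these two agree
print(loss_on_probs.item())                     # this one differs
```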

## Training the model
- We are now ready to run our stochastic gradient descent (SGD) algorithm to train the model
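- For simplicity, we will use all of the training data in a single batch (batch size = 80 = number of training examples, since 20 of the 100 examples are held out for validation), so each update is a full-batch gradient descent step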
- We will train over 100 epochs
- At each epoch, we will compute the loss, compute the gradients, and update the weights

In [None]:
# Train the logistic regression model
epochs = 100
for epoch in range(epochs):
    
    ## Forward pass
    
    # Compute the predicted outputs
    outputs = model(X_train)
    
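    # Compute the loss using the current predictions.
    # nn.CrossEntropyLoss expects raw scores (logits) and applies log-softmax itself;
    # the model already outputs softmax probabilities, so we pass their logarithm
    loss = criterion(torch.log(outputs + 1e-12), y_train)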

    ## Backward pass

    # Zero gradients, backward pass, and update weights
    optimizer.zero_grad() # zero them from the previous iteration
    loss.backward() # compute the gradients!
    optimizer.step() # update the weights!

    # Print loss every 20 epochs
    if (epoch+1) % 20 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')
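
What ```optimizer.step()``` does for plain SGD (no momentum or weight decay) can be checked by hand: each parameter moves by the learning rate times its gradient, in the opposite direction. The sketch below uses a small linear layer and random data, both hypothetical.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

layer = nn.Linear(2, 2)
lr = 0.01
optimizer = optim.SGD(layer.parameters(), lr=lr)

# Hypothetical mini-dataset
X_demo = torch.randn(8, 2)
y_demo = torch.randint(0, 2, (8,))

loss = nn.CrossEntropyLoss()(layer(X_demo), y_demo)

optimizer.zero_grad()  # clear gradients left over from earlier iterations
loss.backward()        # populate .grad for every parameter

# Expected result of the update, computed by hand before calling step()
expected_weight = layer.weight.detach() - lr * layer.weight.grad
expected_bias = layer.bias.detach() - lr * layer.bias.grad

optimizer.step()       # in-place update of the parameters

print(torch.allclose(layer.weight, expected_weight))
print(torch.allclose(layer.bias, expected_bias))
```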

## Evaluating the trained model
- We can now evaluate the trained model on the validation set

In [None]:
# Compute the predicted outputs
outputs = model(X_val)

# compute the estimated probabilities of the positive class
py_hat_val = outputs[:,1]

# convert the probabilities to numpy format
py_hat_val = py_hat_val.detach().numpy()

# measure the area under the ROC curve
from sklearn.metrics import roc_auc_score
roc_auc_score(y_val, py_hat_val)
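
The area under the ROC curve measures how well the predicted probabilities *rank* the examples, which is different from accuracy at a fixed threshold. The sketch below uses a small set of made-up model outputs (not results from the model above) to show both metrics side by side.

```python
import torch
from sklearn.metrics import roc_auc_score, accuracy_score

# Hypothetical model outputs: one row per example, one column per class
outputs_demo = torch.tensor([[0.90, 0.10],
                             [0.40, 0.60],
                             [0.20, 0.80],
                             [0.70, 0.30],
                             [0.45, 0.55]])
y_demo = torch.tensor([0, 1, 1, 0, 0])

p_positive = outputs_demo[:, 1].numpy()      # probability of class 1
y_pred = outputs_demo.argmax(dim=1).numpy()  # most probable class

auc = roc_auc_score(y_demo.numpy(), p_positive)
acc = accuracy_score(y_demo.numpy(), y_pred)

print(f'AUC: {auc:.2f}')       # 1.00: every positive is ranked above every negative
print(f'Accuracy: {acc:.2f}')  # 0.80: the last example is misclassified
```

In this toy case the ranking is perfect even though one example is misclassified, which is why the two numbers disagree.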


### How did we do relative to random guessing?

## Implementing an MLP with a single hidden layer
- We will now implement a multilayer perceptron (MLP) with a single hidden layer
- For this, we will need to define a new class that inherits from ```nn.Module```

In [None]:
class SingleLayerMLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SingleLayerMLP, self).__init__()
        # First layer (input to hidden)
        self.hidden = nn.Linear(input_size, hidden_size)
        # Activation function (ReLU introduces non-linearity)
        self.relu = nn.ReLU()
        # Output layer (hidden to output)
        self.output = nn.Linear(hidden_size, output_size)
        # softmax activation for output layer
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        # Pass input through the first layer, apply ReLU, then pass to output layer and apply softmax
        x = self.hidden(x)
        x = self.relu(x)
        x = self.output(x)
        x = self.softmax(x)
        return x

## Creating an instance of the single-layer MLP model
- We need to select the number of units in our hidden layer
- Remember that this is called a *hyperparameter* because it is chosen before training rather than learned from the data
- We will begin with an arbitrary choice of 5 hidden units

In [None]:
hidden_size = 5  # Choose an arbitrary number of neurons for the hidden layer
model = SingleLayerMLP(input_size, hidden_size, output_size)
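
Since the number of hidden units is a hyperparameter, one common approach is to try several values and compare them on the validation set. The following is a minimal sketch of that idea, not a tuned procedure. The helper ```train_and_score``` and the candidate sizes (2, 5, 20) are hypothetical. For brevity, the network is built with ```nn.Sequential``` and outputs raw scores, with the softmax applied only at evaluation time.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=42)
X_tr = torch.tensor(X_tr, dtype=torch.float32)
y_tr = torch.tensor(y_tr, dtype=torch.long)
X_va = torch.tensor(X_va, dtype=torch.float32)


def train_and_score(hidden_size, epochs=100, lr=0.01):
    """Hypothetical helper: train one MLP and return its validation AUC."""
    net = nn.Sequential(nn.Linear(2, hidden_size), nn.ReLU(),
                        nn.Linear(hidden_size, 2))  # outputs raw scores
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(net.parameters(), lr=lr)
    for _ in range(epochs):
        loss = criterion(net(X_tr), y_tr)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():  # no gradients needed for evaluation
        p_pos = torch.softmax(net(X_va), dim=1)[:, 1]
    return roc_auc_score(y_va, p_pos.numpy())


results = {h: train_and_score(h) for h in (2, 5, 20)}
for h, auc in results.items():
    print(f'hidden_size={h:2d}  validation AUC={auc:.3f}')
```

With only 20 validation examples, differences between settings are noisy and should be read with caution.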

## Inspecting the model architecture is good practice

In [None]:
print(model)
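
Beyond printing the architecture, it is also possible to count the learnable parameters. The sketch below rebuilds the same layer structure with ```nn.Sequential``` (a stand-in for the class above, assuming 2 inputs, 5 hidden units, and 2 outputs) and lists each parameter tensor with its shape.

```python
import torch.nn as nn

input_size, hidden_size, output_size = 2, 5, 2

# Same layer structure as the single-hidden-layer MLP, written with nn.Sequential
mlp = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, output_size),
    nn.Softmax(dim=1),
)

# ReLU and Softmax have no learnable parameters, so only the Linear layers appear
for name, param in mlp.named_parameters():
    print(name, tuple(param.shape))

n_params = sum(p.numel() for p in mlp.parameters())
print(n_params)  # (2*5 + 5) + (5*2 + 2) = 27
```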

## Define the optimizer and loss function (same as before)
- Now we will train our single-layer MLP model using the same cross-entropy loss and stochastic gradient descent optimizer

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

In [None]:
# Train the single-layer MLP model
epochs = 100
for epoch in range(epochs):
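    # Forward pass
    outputs = model(X_train)
    # nn.CrossEntropyLoss expects logits; the model outputs probabilities, so pass their log
    loss = criterion(torch.log(outputs + 1e-12), y_train)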

    # Zero gradients, backward pass, and update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print loss every 20 epochs
    if (epoch+1) % 20 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

## Evaluating the trained model

In [None]:
# Compute the predicted outputs
outputs = model(X_val)

# compute the estimated probabilities of the positive class
py_hat_val = outputs[:,1]

# convert the probabilities to numpy format
py_hat_val = py_hat_val.detach().numpy()

roc_auc_score(y_val, py_hat_val)

### How did we do relative to logistic regression?

## Creating an MLP with an arbitrary number of hidden layers
- PyTorch makes it easy to create MLPs with an arbitrary number of hidden layers
- We will create another class inheriting from ```nn.Module``` that allows us to specify the number of hidden layers
- In the code below, the list ```hidden_sizes``` contains the number of neurons in each hidden layer
- The number of elements in this list defines the number of hidden layers

In [None]:
class MultiLayerMLP(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super(MultiLayerMLP, self).__init__()
        # Define the layers sequentially
        layers = []
        in_size = input_size
        for h in hidden_sizes:  # one entry per hidden layer
            # Add a hidden layer with ReLU activation
            layers.append(nn.Linear(in_size, h))
            layers.append(nn.ReLU())
            in_size = h
        # Add the final output layer
        layers.append(nn.Linear(in_size, output_size))
        # Combine layers into a sequential model
        self.model = nn.Sequential(*layers)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        return self.softmax(self.model(x))
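
To see how the loop chains the layers together, the sketch below runs the same construction outside of a class for ```hidden_sizes = [10, 5]``` and inspects the result. The input and output sizes of 2 are assumed to match the dataset used in this lecture.

```python
import torch
import torch.nn as nn

input_size, hidden_sizes, output_size = 2, [10, 5], 2

layers = []
in_size = input_size
for h in hidden_sizes:
    layers.append(nn.Linear(in_size, h))  # output size of one layer...
    layers.append(nn.ReLU())
    in_size = h                           # ...becomes input size of the next
layers.append(nn.Linear(in_size, output_size))

# (in_features, out_features) of every linear layer, in order
layer_shapes = [(l.in_features, l.out_features)
                for l in layers if isinstance(l, nn.Linear)]
print(layer_shapes)  # [(2, 10), (10, 5), (5, 2)]

# The * unpacks the list so each layer is passed as a separate argument
net = nn.Sequential(*layers)
out = net(torch.randn(4, input_size))
print(out.shape)
```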

## Creating an instance of the multi-layer MLP model

In [None]:
hidden_sizes = [10, 5]  # Two hidden layers with 10 and 5 neurons, respectively
model = MultiLayerMLP(input_size, hidden_sizes, output_size)
print(model)

## Training the model (same as before)

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

In [None]:
epochs = 100
for epoch in range(epochs):
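    # Forward pass
    outputs = model(X_train)
    # nn.CrossEntropyLoss expects logits; the model outputs probabilities, so pass their log
    loss = criterion(torch.log(outputs + 1e-12), y_train)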

    # Zero gradients, backward pass, and update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print loss every 20 epochs
    if (epoch+1) % 20 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

## Evaluating the trained model

In [None]:
# Compute the predicted outputs
outputs = model(X_val)

# compute the estimated probabilities of the positive class
py_hat_val = outputs[:,1]

# convert the probabilities to numpy format
py_hat_val = py_hat_val.detach().numpy()

roc_auc_score(y_val, py_hat_val)