
# <span style="color:#0b486b">  FIT3181: Deep Learning (2024)</span>
***
*Lecturer (Malaysia):*  **Dr Arghya Pal** | arghya.pal@monash.edu <br/>
*Lecturer (Malaysia):*  **Dr Lim Chern Hong** | lim.chernhong@monash.edu <br/>

*CE/Lecturer (Clayton):*  **Dr Trung Le** | trunglm@monash.edu <br/>
*Lecturer (Clayton):* **Prof Dinh Phung** | dinh.phung@monash.edu <br/>
  <br/>
<br/>
School of IT and Faculty of Information Technology, Monash University, Malaysia and Australia
***

# Tutorial 1b: Logistic Regression with PyTorch


This tutorial introduces logistic regression, which can be regarded as a feed-forward neural network with a single layer.

## Import Necessary Libraries

In [None]:
import torch
import torch.nn as nn
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler  # for feature scaling
from sklearn.model_selection import train_test_split  # for train/test split

## Prepare Data

We first load the `breast cancer` dataset from `sklearn` datasets and then split it into 80% for training and 20% for testing.

In [None]:
# Prepare data
bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target

n_samples, n_features = X.shape
print(f'number of samples: {n_samples}, number of features: {n_features}')

# split data to 80% for training and 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)


number of samples: 569, number of features: 30
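To make the split sizes concrete, here is a minimal sketch in plain Python of how an 80/20 split of 569 samples works out. It is not scikit-learn's implementation: the helper `split_indices` is hypothetical, and its shuffle will not reproduce the exact rows chosen by `train_test_split`. It only assumes that the test count is rounded up, which agrees with the 114 test samples reported in the evaluation output.

```python
import math
import random

def split_indices(n_samples, test_size, seed):
    """Shuffle the sample indices and cut them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # seeded shuffle for repeatability
    n_test = math.ceil(test_size * n_samples)  # assumption: test count is rounded up
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(569, 0.2, seed=1234)
print(len(train_idx), len(test_idx))
```

Every index lands in exactly one of the two parts, so the model is never tested on a sample it was trained on.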


**<span style="color:red">Exercise 1</span>:** Write the code to print out the first 10 feature vectors in `X_train` and the corresponding labels in `y_train`. Write the code to show the unique labels in `y_train`.

In [None]:
#Your answer here




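We use `StandardScaler()` from `sklearn` to standardize the training/testing sets: the scaler is fitted on the training set only and then applied to both sets. We then convert the numpy arrays to PyTorch tensors and reshape the label tensors to column vectors of shape [N, 1].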

In [None]:
# scale data
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# convert to tensors
X_train = torch.from_numpy(X_train.astype(np.float32))
X_test = torch.from_numpy(X_test.astype(np.float32))
y_train = torch.from_numpy(y_train.astype(np.float32))
y_test = torch.from_numpy(y_test.astype(np.float32))

# reshape y tensors
y_train = y_train.view(y_train.shape[0], 1)
y_test = y_test.view(y_test.shape[0], 1)
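To see what the scaling step does, the sketch below reproduces standardization in plain Python on a single hypothetical feature column. The helper names and the toy values are assumptions for illustration. The key point is that the mean and standard deviation come from the training data only and are then reused for the test data, mirroring `fit_transform` followed by `transform`.

```python
import math

def fit_standardizer(column):
    """Return (mean, std) of a list of numbers, using the population std."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    return mean, math.sqrt(var)

def standardize(column, mean, std):
    """Shift and scale a column with previously fitted statistics."""
    return [(v - mean) / std for v in column]

train_col = [2.0, 4.0, 6.0, 8.0]  # hypothetical training values of one feature
test_col = [5.0, 10.0]            # hypothetical test values of the same feature

mean, std = fit_standardizer(train_col)            # statistics from training data only
train_scaled = standardize(train_col, mean, std)
test_scaled = standardize(test_col, mean, std)     # reuse the training statistics
```

After scaling, the training column has zero mean and unit variance, while the test column generally does not, because it was never used to compute the statistics.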

## Training/Testing Procedure

We now present the fundamental PyTorch workflow: training a model on the training set and then testing the trained model on the testing set. This workflow is the same for most PyTorch models.

### Prepare Model

First, we need to declare and define a model, which is a computational graph specifying how to compute the model output from the input $x$. Specifically, given a single data point $x$ (i.e., shape [1,30]), a batch $x$ (i.e., [64,30]), or even the entire training set $x$ (i.e., [455,30]), we compute
- logits = xW + b
- pred_probs = softmax(logits)

In [None]:
# Create model
# f = xW + b; the forward pass returns logits (softmax is applied inside the loss)
class LogisticRegression(nn.Module):

    def __init__(self, n_input_features):
        super().__init__()
        self.linear = nn.Linear(n_input_features, 2)  # 2 output classes

    def forward(self, x):
        logits = self.linear(x)
        pred_probs = torch.nn.Softmax(dim=-1)(logits)  # computed only for Exercise 2; not returned
        return logits  # return the logits

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = LogisticRegression(n_features).to(device)  # load the model to the current device

# move the data to the same device so the model and its inputs match
X_train, y_train = X_train.to(device), y_train.to(device)
X_test, y_test = X_test.to(device), y_test.to(device)
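The two steps of the forward computation, logits = xW + b followed by softmax, can be traced by hand on a tiny case. The sketch below uses plain Python with 3 input features instead of 30. The weights, bias and input are made-up values, not parameters of the trained model.

```python
import math

def linear(x, W, b):
    """Compute xW + b for one input vector x; W has shape [n_features, n_classes]."""
    return [sum(x[i] * W[i][k] for i in range(len(x))) + b[k]
            for k in range(len(b))]

def softmax(logits):
    """Turn logits into probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

x = [1.0, -2.0, 0.5]                          # one data point with 3 features
W = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]    # hypothetical weights, shape [3, 2]
b = [0.0, 0.1]                                # hypothetical bias, one entry per class

logits = linear(x, W, b)      # one unnormalized score per class
pred_probs = softmax(logits)  # non-negative and sums to 1
print(logits, pred_probs)
```

For a batch of N inputs the same computation is applied row by row, so both `logits` and `pred_probs` have shape [N, 2].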

**<span style="color:red">Exercise 2</span>:** Explain the forward function. What are the meanings and dimensions of `logits` and `pred_probs`?

### Prepare Loss and Optimizer

We declare `loss_fn` as the cross-entropy loss. To train our logistic regression model, we use the SGD optimizer with a learning rate of $0.01$.

In [None]:
# Loss and optimizer
learning_rate = 0.01
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
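`nn.CrossEntropyLoss` takes raw logits and integer class labels and applies the log-softmax internally, which is why the model's `forward` returns logits rather than probabilities. The following plain-Python sketch shows the quantity being computed, averaged over the batch as in the default setting. The function name and the toy logits are assumptions for illustration.

```python
import math

def cross_entropy(logits_batch, labels):
    """Mean negative log-probability of the true class, computed from logits."""
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        m = max(logits)
        # log of the softmax denominator, computed in a numerically stable way
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum_exp - logits[label]  # equals -log softmax(logits)[label]
    return total / len(labels)

logits_batch = [[2.0, 0.0], [0.0, 0.0]]  # hypothetical logits for two samples
labels = [0, 1]                          # true class index of each sample

loss = cross_entropy(logits_batch, labels)
print(f'loss = {loss:.4f}')
```

A confident correct prediction (the first sample) contributes a small loss, while an undecided prediction over two classes (the second sample) contributes $\log 2$.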

### Train Model By Feeding the Training Set All at Once

We train the model for $200$ epochs (i.e., going through the entire training set $200$ times). In each epoch, we feed the entire training set to the model to compute the cross-entropy loss over the training set and then use the optimizer to update the model parameters (i.e., W and b).

In [None]:
# training loop
num_epochs = 200

for epoch in range(num_epochs):
    # forward pass and loss
    y_predicted = model(X_train)

    loss = loss_fn(y_predicted, y_train.squeeze().long())

    # backward pass to compute the gradient
    loss.backward()

    # updates the model parameter based on the gradient
    optimizer.step()

    # zero gradients
    optimizer.zero_grad()

    if (epoch+1) % 10 == 0:
        print(f'epoch: {epoch+1}, loss = {loss.item():.4f}')

epoch: 10, loss = 0.5223
epoch: 20, loss = 0.3920
epoch: 30, loss = 0.3230
epoch: 40, loss = 0.2797
epoch: 50, loss = 0.2497
epoch: 60, loss = 0.2275
epoch: 70, loss = 0.2103
epoch: 80, loss = 0.1965
epoch: 90, loss = 0.1852
epoch: 100, loss = 0.1758
epoch: 110, loss = 0.1677
epoch: 120, loss = 0.1607
epoch: 130, loss = 0.1547
epoch: 140, loss = 0.1493
epoch: 150, loss = 0.1445
epoch: 160, loss = 0.1402
epoch: 170, loss = 0.1364
epoch: 180, loss = 0.1329
epoch: 190, loss = 0.1297
epoch: 200, loss = 0.1267
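The roles of `loss.backward()`, `optimizer.step()` and `optimizer.zero_grad()` can be mimicked on a one-parameter problem. The sketch below is plain Python, not PyTorch: the `Param` class and the three helper functions are hypothetical stand-ins, and the loss is a simple quadratic rather than the cross-entropy. It illustrates that gradients are accumulated into the stored gradient, so they need to be reset after each update.

```python
class Param:
    """Hypothetical stand-in for a parameter with an accumulated gradient."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0

def backward(p, target):
    # loss = (value - target)^2, so d(loss)/d(value) = 2 * (value - target);
    # the gradient is added to p.grad rather than overwriting it
    p.grad += 2.0 * (p.value - target)

def step(p, lr):
    # plain gradient-descent update, as in SGD without momentum
    p.value -= lr * p.grad

def zero_grad(p):
    # reset the accumulated gradient before the next iteration
    p.grad = 0.0

w = Param(0.0)
losses = []
for _ in range(50):
    losses.append((w.value - 3.0) ** 2)  # forward pass and loss
    backward(w, 3.0)                     # compute the gradient
    step(w, lr=0.1)                      # update the parameter
    zero_grad(w)                         # clear the gradient

print(f'w = {w.value:.4f}, first loss = {losses[0]:.4f}, last loss = {losses[-1]:.6f}')
```

Without the `zero_grad` call, each update would use the sum of all previous gradients instead of the current one.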


### Evaluate Trained Model on Testing Set

We compute the accuracy on the testing set (i.e., the testing accuracy).

In [None]:
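with torch.no_grad():
    # the model returns logits; their argmax equals the argmax of the softmax probabilities
    pred_probs = model(X_test)
    y_predicted = torch.argmax(pred_probs, dim=1)  # predicted class per sample, shape [N]
    # flatten y_test from [N, 1] to [N]; otherwise == broadcasts to an [N, N] matrix
    corrects = (y_predicted == y_test.view(-1).long()).sum().item()
    totals = y_test.size(0)
    print(f'totals = {totals}')
    acc = corrects / totals
    print(f'accuracy = {acc:.4f}')

totals = 114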
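Accuracy is the fraction of samples whose predicted class matches the true label, so it always lies between 0 and 1. The plain-Python sketch below computes it from per-sample logits. Because softmax preserves the ordering of its inputs, taking the argmax of the logits gives the same class as taking the argmax of the probabilities. The logits and labels are made-up values.

```python
def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=lambda k: row[k])

def accuracy(logits_batch, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    predictions = [argmax(row) for row in logits_batch]
    corrects = sum(int(p == t) for p, t in zip(predictions, labels))
    return corrects / len(labels)

logits_batch = [[1.5, -0.3], [0.2, 0.9], [-1.0, 2.0], [0.7, 0.1]]  # hypothetical logits
labels = [0, 1, 0, 0]                                              # hypothetical labels

acc = accuracy(logits_batch, labels)
print(f'accuracy = {acc:.4f}')
```

Here the predictions and the labels are both flat lists of length N, so each prediction is compared with exactly one label.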


**<span style="color:red">Exercise 3</span>:** Explain the code above to compute the testing accuracy. What are `pred_probs` and `y_predicted`?

**<span style="color:red">Exercise 4</span>:** Package the above code in a function so that you can try different learning rates. Then, train logistic regression models with different learning rates (e.g., 0.05, 0.04, 0.005, 0.001) and observe the loss trend and the testing accuracy.

In [None]:
#Your answer here





----

**The end**