# Introduction

In general, a **PyTorch** training pipeline has four steps.

1. Prepare the data (convert to tensors, reshape, etc.)
2. Construct the model (usually as a class)
3. Define the loss and optimiser
4. Write a training loop

This time, we will practice these steps on a **logistic regression** problem.

<br>

**Libraries**

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn

# 1. Prepare the data

We load the well-known **breast cancer dataset** from sklearn. The aim is to predict whether a tumour is malignant (label 0) or benign (label 1) from 30 numeric measurements, so this is a **binary classification** problem.

In [2]:
# Load data
data = datasets.load_breast_cancer()
X, y = data.data, data.target
print('X shape:',X.shape)
print('y shape:',y.shape)

X shape: (569, 30)
y shape: (569,)
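Before splitting, it can help to check what the two labels mean and how balanced they are. The snippet below is a small sketch that reloads the same dataset; the variable name `counts` is ours and not part of the original notebook.

```python
import numpy as np
from sklearn import datasets

data = datasets.load_breast_cancer()

# target_names[i] is the class name for label i
print('Classes:', list(data.target_names))

# Count how many samples carry each label
counts = np.bincount(data.target)
for name, count in zip(data.target_names, counts):
    print(f'{name}: {count}')
```

The classes are not perfectly balanced, which is worth keeping in mind when reading the accuracy later on.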


In [3]:
# Split data (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

In [4]:
# Scale data (mean=0, std=1)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
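Note that the scaler is fitted on the training split only and then reused on the test split, so no test-set statistics leak into training. As a rough check (a sketch with our own variable names, assuming the same split as above), the training features end up with mean 0 and std 1, while the test features are only approximately standardised:

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

sc = StandardScaler()
X_train_s = sc.fit_transform(X_train)  # mean and std estimated from the training split
X_test_s = sc.transform(X_test)        # same mean and std applied to the test split

# Largest deviation from mean=0 / std=1 across the 30 features
train_mean_dev = np.abs(X_train_s.mean(axis=0)).max()
train_std_dev = np.abs(X_train_s.std(axis=0) - 1).max()
test_mean_dev = np.abs(X_test_s.mean(axis=0)).max()

print(f'train: max |mean| = {train_mean_dev:.1e}, max |std - 1| = {train_std_dev:.1e}')
print(f'test:  max |mean| = {test_mean_dev:.1e}')
```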

In [5]:
# Convert to tensors
X_train = torch.from_numpy(X_train.astype(np.float32))
y_train = torch.from_numpy(y_train.astype(np.float32))
X_test = torch.from_numpy(X_test.astype(np.float32))
y_test = torch.from_numpy(y_test.astype(np.float32))
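The cast to '**float32**' is there because NumPy arrays default to float64, while the weights of PyTorch layers default to float32. The toy example below (our own arrays, not the dataset) illustrates this, and also that '**torch.from_numpy**' shares memory with the array it is given:

```python
import numpy as np
import torch

a = np.array([1.0, 2.0, 3.0])                  # NumPy default dtype is float64
t64 = torch.from_numpy(a)                      # shares memory with `a`
t32 = torch.from_numpy(a.astype(np.float32))   # astype makes a copy first

a[0] = 100.0                                   # modify the original array

print(t64.dtype, t64[0].item())                # change is visible through t64
print(t32.dtype, t32[0].item())                # the float32 copy is unaffected
print(torch.nn.Linear(3, 1).weight.dtype)      # dtype of layer weights
```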

In [6]:
# Reshape tensors
y_train = y_train.view(-1,1)
y_test = y_test.view(-1,1)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

y_train shape: torch.Size([455, 1])
y_test shape: torch.Size([114, 1])
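The reshape matters because the model outputs one column per sample, so the targets need the same (n, 1) shape. The sketch below uses made-up predictions to show what '**view(-1,1)**' does, and how a shape mismatch would silently broadcast in ordinary arithmetic:

```python
import torch
import torch.nn as nn

y = torch.tensor([0., 1., 1., 0.])                   # shape (4,)
y_col = y.view(-1, 1)                                # shape (4, 1); -1 infers the row count

# Hypothetical model outputs, one probability per row
preds = torch.tensor([[0.1], [0.8], [0.7], [0.3]])   # shape (4, 1)

print(y.shape, y_col.shape)

# Mismatched shapes broadcast to a 4x4 matrix instead of 4 element-wise differences
print((preds - y).shape)

# With matching shapes the loss is computed element-wise, as intended
loss_value = nn.BCELoss()(preds, y_col)
print(f'{loss_value.item():.4f}')
```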


# 2. Construct the model

When working with neural networks, we need to **inherit** from '**nn.Module**' and define a '**forward**' method.

Inheriting gives the model access to **methods** like '**model.parameters()**', which the optimiser uses during training.

In [7]:
# Model
class LogisticRegression(nn.Module):
    def __init__(self, n_feats):
        super().__init__()
        
        # Define layers
        self.lin = nn.Linear(n_feats,1)
        self.sig = nn.Sigmoid()
        
    def forward(self, x):
        return self.sig(self.lin(x))
    
model = LogisticRegression(n_feats = X.shape[1])
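To see what the inheritance gives us, we can list the parameters that '**nn.Module**' registered automatically and check that the forward pass is just a sigmoid applied to a linear map. This is a self-contained sketch that redefines the same class; the random input `x` is only for illustration.

```python
import torch
import torch.nn as nn

class LogisticRegression(nn.Module):
    def __init__(self, n_feats):
        super().__init__()
        self.lin = nn.Linear(n_feats, 1)
        self.sig = nn.Sigmoid()

    def forward(self, x):
        return self.sig(self.lin(x))

model = LogisticRegression(n_feats=30)

# Layers assigned as attributes are registered, so their parameters are discoverable
names = []
for name, p in model.named_parameters():
    names.append(name)
    print(name, tuple(p.shape), p.requires_grad)

# The forward pass equals sigmoid(x W^T + b)
x = torch.randn(5, 30)
manual = torch.sigmoid(x @ model.lin.weight.T + model.lin.bias)
print(torch.allclose(model(x), manual, atol=1e-6))
```

With 30 features the model has 31 trainable numbers in total: 30 weights and 1 bias.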

# 3. Define the loss and optimiser

The loss function comes from the '**torch.nn**' module and the optimiser from the '**torch.optim**' module.

In [8]:
# Binary cross entropy loss
loss = nn.BCELoss()

# Adam optimiser
optimiser = torch.optim.Adam(params = model.parameters(), lr=0.01)
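For reference, binary cross entropy averages $-[y \log p + (1-y)\log(1-p)]$ over the samples, where $p$ is the predicted probability and $y$ the label. The sketch below checks this by hand against '**nn.BCELoss**' on a few made-up values:

```python
import torch
import torch.nn as nn

# Hypothetical predicted probabilities and true labels
p = torch.tensor([[0.9], [0.2], [0.6]])
y = torch.tensor([[1.0], [0.0], [1.0]])

# Manual binary cross entropy, averaged over samples
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()

# Built-in version (default reduction is the mean)
builtin = nn.BCELoss()(p, y)

print(f'manual: {manual.item():.4f}, builtin: {builtin.item():.4f}')
```

PyTorch also provides '**nn.BCEWithLogitsLoss**', which takes the raw linear output and applies the sigmoid internally; we keep the explicit sigmoid here to match the model above.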

# 4. Write a training loop

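The most **important** thing to remember with PyTorch is that you have to **zero the gradients** after every parameter update (here, once per epoch, since we train on the full batch). Calling '**backward()**' adds the new gradients to the ones already stored instead of overwriting them, so without zeroing each update would use the sum of all previous gradients.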

In [9]:
n_iters = 200

# Loop
for epoch in range(n_iters):
    # Forward pass
    y_preds = model(X_train)
    L = loss(y_preds, y_train)
    
    # Backprop
    L.backward()
    
    # Update parameters
    optimiser.step()
    
    # Zero gradients
    optimiser.zero_grad()
    
    # Print loss
    if epoch % 20 == 0:
        print(f'Epoch {epoch}, loss {L.item():.3f}')

Epoch 0, loss 0.946
Epoch 20, loss 0.252
Epoch 40, loss 0.156
Epoch 60, loss 0.127
Epoch 80, loss 0.112
Epoch 100, loss 0.103
Epoch 120, loss 0.096
Epoch 140, loss 0.090
Epoch 160, loss 0.086
Epoch 180, loss 0.082
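To make the gradient accumulation concrete, here is a tiny standalone example (toy tensors, not the model above) where '**backward()**' is called twice without zeroing in between:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

# d/dw of sum(w * x) is x
(w * x).sum().backward()
first = w.grad.clone()
print(first)    # tensor([3., 4.])

# Second backward pass without zeroing: gradients are added to the stored ones
(w * x).sum().backward()
second = w.grad.clone()
print(second)   # tensor([6., 8.])

# After zeroing, the next backward pass gives the correct gradient again
w.grad.zero_()
(w * x).sum().backward()
third = w.grad.clone()
print(third)    # tensor([3., 4.])
```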


# Evaluate

Finally, we calculate the accuracy on the test set.

In [10]:
# Turn gradient tracking off
with torch.no_grad():
    # Round probabilities to 0/1, then take the fraction that match the labels
    acc = model(X_test).round().eq(y_test).sum() / len(y_test)
    print(f'Accuracy on test set: {100*acc.item():.2f} %')

Accuracy on test set: 95.61 %
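Accuracy alone does not say which kind of mistakes the model makes. One option is a confusion matrix, sketched below on made-up probabilities and labels that stand in for '**model(X_test)**' and '**y_test**'. The explicit 0.5 threshold is our choice; it behaves like '**round()**' except at exactly 0.5, which '**torch.round**' sends to 0.

```python
import torch
from sklearn.metrics import confusion_matrix

# Hypothetical probabilities and labels, standing in for model(X_test) and y_test
probs = torch.tensor([[0.92], [0.15], [0.48], [0.71], [0.05], [0.66]])
labels = torch.tensor([[1.0], [0.0], [1.0], [1.0], [0.0], [0.0]])

# Explicit decision threshold
preds = (probs >= 0.5).float()

# Fraction of predictions that match the labels
acc = preds.eq(labels).float().mean().item()
print(f'Accuracy: {100 * acc:.2f} %')

# Rows are true labels, columns are predicted labels
cm = confusion_matrix(labels.numpy().ravel().astype(int),
                      preds.numpy().ravel().astype(int))
print(cm)
```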


**Check out my other PyTorch tutorials**

1. [PT1 - Linear Regression with PyTorch](https://www.kaggle.com/code/samuelcortinhas/pt1-linear-regression-with-pytorch/notebook)
2. [PT2 - Logistic Regression with PyTorch](https://www.kaggle.com/code/samuelcortinhas/pt2-logistic-regression-with-pytorch)
3. [PT3 - Neural Networks with PyTorch](https://www.kaggle.com/code/samuelcortinhas/pt3-neural-networks-with-pytorch)
4. [PT4 - CNNs with PyTorch](https://www.kaggle.com/samuelcortinhas/pt4-cnns-with-pytorch)
5. [PT5 - Save & load models with PyTorch](https://www.kaggle.com/samuelcortinhas/pt5-save-load-models-with-pytorch)