# Demo FNN Training in PyTorch
This notebook gives a quick demo of training a feedforward neural network (FNN) on the scikit-learn breast cancer dataset.
The main steps are:
- 1. Data preparation: load the raw data and create a torch `DataLoader` object that serves batches during training.
- 2. Defining the FNN model: define the dimensions and activation functions of the hidden layers.
- 3. Training the FNN model: define the objective function, optimizer, and training loop. In each training step, update the FNN coefficients using the gradient of the loss.

## Data Preparation

### Load raw data

In [1]:
from sklearn.datasets import load_breast_cancer
import numpy as np
from sklearn.preprocessing import StandardScaler
np.random.seed(seed=5)
data = load_breast_cancer()
x = data.data[:400, 0:3]
y = data.target[:400]
sc = StandardScaler()
x = sc.fit_transform(x)
print(x[:4],y[:4])

[[ 1.02657917 -2.08243377  1.19736196]
 [ 1.74850201 -0.28733171  1.60760452]
 [ 1.50226476  0.55799375  1.4898121 ]
 [-0.81180573  0.34666238 -0.6393874 ]] [0 0 0 0]


### Create DataLoader

In [2]:
import torch
from torch.utils.data import Dataset, DataLoader
# define dataset class
class dataset(Dataset):
    def __init__(self,x,y):
        self.x = torch.tensor(x, dtype=torch.float32)
        self.y = torch.tensor(y, dtype=torch.float32)
        self.length = self.x.shape[0]

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

    def __len__(self):
        return self.length
trainset = dataset(x,y)
print(trainset[0:4])
# create dataloader
trainloader = DataLoader(trainset, batch_size=4, shuffle=False)
print(next(iter(trainloader)))
trainloader = DataLoader(trainset, batch_size=50, shuffle=True)

(tensor([[ 1.0266, -2.0824,  1.1974],
        [ 1.7485, -0.2873,  1.6076],
        [ 1.5023,  0.5580,  1.4898],
        [-0.8118,  0.3467, -0.6394]]), tensor([0., 0., 0., 0.]))
[tensor([[ 1.0266, -2.0824,  1.1974],
        [ 1.7485, -0.2873,  1.6076],
        [ 1.5023,  0.5580,  1.4898],
        [-0.8118,  0.3467, -0.6394]]), tensor([0., 0., 0., 0.])]


## Define FNN Model
Denote the parameters of the three layers by $\boldsymbol\theta = [\boldsymbol\theta_1, \boldsymbol\theta_2, \boldsymbol\theta_3]$. For each input $\boldsymbol x$, the model computes
$$
z_1 = \sigma[f_1(\boldsymbol x;\boldsymbol\theta_1)], z_2 = \sigma[f_2(\boldsymbol z_1;\boldsymbol\theta_2)], p = \sigma[f_3(\boldsymbol z_2;\boldsymbol\theta_3)],
$$
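where $\sigma$ is the sigmoid function and each $f_k$ is a linear layer (`nn.Linear`). In code, the whole forward pass is implemented as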
```python
p = model(x)
```
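
To make the three equations above concrete, here is a minimal NumPy sketch of the same forward pass. The weights are random placeholders and the names `W1`, `b1`, and so on are illustrative assumptions, not the parameters of the trained model; each $f_k$ is written as an affine map, which is what a linear layer computes.

```python
import numpy as np

def sigmoid(a):
    # elementwise logistic function
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)

# 4 example records with 3 features each (placeholder data)
x_demo = rng.normal(size=(4, 3))

# hypothetical parameters theta_1, theta_2, theta_3 (weights stored as in_features x out_features)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)
W3, b3 = rng.normal(size=(3, 1)), np.zeros(1)

z1 = sigmoid(x_demo @ W1 + b1)      # z_1 = sigma[f_1(x; theta_1)]
z2 = sigmoid(z1 @ W2 + b2)          # z_2 = sigma[f_2(z_1; theta_2)]
p_demo = sigmoid(z2 @ W3 + b3)      # p   = sigma[f_3(z_2; theta_3)]

print(p_demo.shape)                 # one probability per record
```

Because the last activation is a sigmoid, every entry of `p_demo` lies strictly between 0 and 1 and can be read as a predicted probability of class 1.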

In [3]:
## Define FNN Model
import torch
from torch import nn
torch.manual_seed(0)
class Net(nn.Module):
    def __init__(self, input_shape, n1=3, n2=3):
        super(Net, self).__init__()
        self.f1 = nn.Linear(input_shape,n1)
        self.f2 = nn.Linear(n1, n2)
        self.f3 = nn.Linear(n2,1)

    def forward(self, x):
        z1 = torch.sigmoid(self.f1(x))
        z2 = torch.sigmoid(self.f2(z1))
        x = torch.sigmoid(self.f3(z2))
        return x

model = Net(input_shape=x.shape[1])

## Train FNN Model

### Loss Function

Use binary cross-entropy for a binary label $y\in\{0,1\}$ and a predicted probability $p\in(0,1)$:
$$
H(y,p)= -y\ln p - (1-y) \ln(1-p).
$$
The loss for each record is
$$
J_i(\boldsymbol\theta;\boldsymbol x_i,y_i) = H[y_i,p(\boldsymbol x_i;\!\boldsymbol\theta)].
$$
The total loss over all training records is
$$
J(\boldsymbol\theta) = \sum_i{J_i(\boldsymbol\theta;\boldsymbol x_i,y_i)}.
$$
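
As a small worked example of these formulas, the sketch below evaluates the per-record loss and the total loss for three made-up records (the values of `y_demo` and `p_demo` are illustrative, not taken from the dataset). Note that `nn.BCELoss` uses `reduction="mean"` by default, so the training cell further down minimises the batch mean of $J_i$ rather than the sum; the two differ only by a constant factor per batch.

```python
import numpy as np

def bce(y, p):
    # H(y, p) = -y ln p - (1 - y) ln(1 - p), applied elementwise
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

y_demo = np.array([1.0, 0.0, 1.0])   # made-up labels
p_demo = np.array([0.9, 0.2, 0.6])   # made-up predicted probabilities

per_record = bce(y_demo, p_demo)     # J_i for each record
total_loss_demo = per_record.sum()   # J = sum_i J_i
mean_loss_demo = per_record.mean()   # what a mean-reduced loss would report

print(per_record, total_loss_demo, mean_loss_demo)
```

Confident correct predictions contribute little to the loss (the first record), while uncertain ones contribute more (the third record).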
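Each training step (one mini-batch in the loop below) updates the model by gradient descent with learning rate $\lambda$, with $J$ evaluated on the current mini-batch: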
$$
\boldsymbol\theta \leftarrow \boldsymbol\theta - \lambda \frac{\partial}{\partial \boldsymbol\theta}J(\boldsymbol\theta).
$$
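
The update rule can be illustrated with a single hand-computed gradient step. The sketch below is an assumption-laden simplification: it replaces the three-layer FNN with a one-layer logistic model so that the gradient of the summed cross-entropy has the closed form $X^\top(p - y)$, and it uses four made-up records. In the notebook itself, PyTorch's autograd computes the gradient.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# made-up data: 4 records, 2 features
X_demo = np.array([[1.0, -2.0], [1.7, -0.3], [-0.8, 0.3], [-1.2, 1.1]])
y_demo = np.array([0.0, 0.0, 1.0, 1.0])

def total_loss(theta):
    # J(theta) = sum_i H[y_i, p(x_i; theta)] for the logistic model p = sigmoid(x . theta)
    p = sigmoid(X_demo @ theta)
    return -(y_demo * np.log(p) + (1.0 - y_demo) * np.log(1.0 - p)).sum()

def grad(theta):
    # dJ/dtheta = X^T (p - y) for summed binary cross-entropy
    p = sigmoid(X_demo @ theta)
    return X_demo.T @ (p - y_demo)

lam = 0.1                          # learning rate lambda
theta = np.zeros(2)                # initial parameters

loss_before = total_loss(theta)
theta = theta - lam * grad(theta)  # theta <- theta - lambda * dJ/dtheta
loss_after = total_loss(theta)

print(loss_before, loss_after)     # the loss should go down after the step
```

In the training loop, `loss.backward()` plays the role of `grad` and `optimizer.step()` applies the subtraction.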

### Model Training

In [4]:
# Hyperparameter
learning_rate = 0.1
epochs = 400

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
loss_fn = nn.BCELoss()

# Training loop
for i in range(epochs):
    total_loss = 0
    correct =0
    total = 0
    for j, (x_train, y_train) in enumerate(trainloader): # for each batch

        # calculate loss
        p = model(x_train)
        loss = loss_fn(p, y_train.reshape(-1,1))

        # backpropagation for gradients
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Loss accumulation for all batches
        total_loss += loss.item()

        # Accumulate correct predictions over all batches
        predicted = (p.detach().numpy() > 0.5).astype(int).reshape(-1)
        correct += (predicted == y_train.numpy()).sum()
        total += y_train.size(0)
    
    # report the last batch's loss and the epoch accuracy every 50 epochs
    if (i+1) % 50 == 0:
        avg_loss = total_loss / len(trainloader)  # mean batch loss over the epoch (computed but not printed)
        accuracy = correct / total
        print(f"epoch {i+1}: loss={round(loss.item(),3)}, accuracy={accuracy}")
torch.save(model.state_dict(), "model_weights.pth")

epoch 50: loss=0.695, accuracy=0.5675
epoch 100: loss=0.694, accuracy=0.5675
epoch 150: loss=0.342, accuracy=0.89
epoch 200: loss=0.333, accuracy=0.8925
epoch 250: loss=0.329, accuracy=0.895
epoch 300: loss=0.279, accuracy=0.895
epoch 350: loss=0.209, accuracy=0.8975
epoch 400: loss=0.257, accuracy=0.8975


### Other metrics

In [5]:
import sys
sys.path.append('..')  # make the local utils package (one directory up) importable

import torch
import pandas as pd
from utils.ROCanalysis import calculate_roc_metrics

model_loaded = Net(input_shape=x.shape[1])
model_loaded.load_state_dict(torch.load("model_weights.pth", weights_only=True))
with torch.no_grad():
    all_data = []
    all_targets = []
    for data, target in trainloader:
        all_data.append(data)
        all_targets.append(target)
    all_data = torch.cat(all_data, dim=0)
    all_targets = torch.cat(all_targets, dim=0)
    all_predictions = model_loaded(all_data)
    predictions = all_predictions.detach().numpy()
    targets = all_targets.detach().numpy()

roc_df = pd.DataFrame({
    'p': predictions.flatten(),
    'y': targets,
    'w': np.ones(len(targets))
})
roc_metrics = calculate_roc_metrics(roc_df, 'p', 'y')
print(roc_metrics)

{'auc': 96.03, 'gini': 92.06, 'ks': 80.785}
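
The helper `calculate_roc_metrics` comes from a local `utils` module that is not shown here, so its implementation is unknown. As a rough guide to what the three numbers usually mean, the sketch below computes AUC, Gini, and KS with plain NumPy on four made-up scores, scaled to percent to match the printed output. The function name `roc_summary` and the exact definitions are assumptions; the relation Gini = 2·AUC − 1 is at least consistent with the values above (92.06 = 2 × 96.03 − 100).

```python
import numpy as np

def roc_summary(p, y):
    """Hypothetical stand-in for calculate_roc_metrics (assumed definitions)."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=int)
    pos, neg = p[y == 1], p[y == 0]

    # AUC: probability that a random positive scores above a random negative (ties count half)
    diff = pos[:, None] - neg[None, :]
    auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

    # KS: largest gap between true-positive and false-positive rates over all thresholds
    thresholds = np.unique(p)
    tpr = np.array([(pos >= t).mean() for t in thresholds])
    fpr = np.array([(neg >= t).mean() for t in thresholds])
    ks = np.max(np.abs(tpr - fpr))

    return {"auc": 100 * auc, "gini": 100 * (2 * auc - 1), "ks": 100 * ks}

demo_metrics = roc_summary(p=[0.1, 0.4, 0.35, 0.8], y=[0, 0, 1, 1])
print(demo_metrics)
```

The pairwise AUC computation is quadratic in the number of records, which is fine for a few hundred rows but would need a rank-based formula for large datasets.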
