## Spoilers
- build a model for binary classification
- understand the concept of logits and how they relate to probabilities
- use binary cross-entropy as the loss function to train a model
- use the loss function to handle imbalanced datasets
- understand the concepts of decision boundary and separability
- learn how the choice of classification threshold impacts evaluation metrics
- build ROC and precision-recall curves to evaluate model performance

In [1]:
from pathlib import Path
import sys
path_src = str(Path().resolve().parents[0])
sys.path.append(path_src)
from stepbystep.v0 import StepByStep


import numpy as np
import torch
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, roc_curve, precision_recall_curve, auc

## A Simple Classification Problem

- In a classification problem, we try to predict which class a data point belongs to. Here there are only two classes, labeled 0 and 1, so this is binary classification.

## Data Generation

In [2]:
X, y = make_moons(n_samples=100, noise=0.3, random_state=0)

In [3]:
x_train, x_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=11)

In [4]:
sc = StandardScaler()
sc.fit(x_train)

x_train = sc.transform(x_train)
x_val   = sc.transform(x_val)

-> Remember: fit the StandardScaler on the training set only, then use its transform() method to apply the same pre-processing step to all datasets.
- REMEMBER: never call fit() on the validation or test set, ONLY on the training set; otherwise statistics from held-out data leak into the pre-processing.
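
A minimal sketch of the rule above, using a small made-up array (the `toy_*` names and values are illustrative and not part of the notebook's data). The scaler learns its mean and standard deviation from the training rows only, and the validation rows are transformed with those same statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data, for illustration only
toy_train = np.array([[0.0], [2.0], [4.0]])
toy_val = np.array([[6.0]])

toy_scaler = StandardScaler()
toy_scaler.fit(toy_train)  # statistics come from the training set only

toy_train_scaled = toy_scaler.transform(toy_train)
toy_val_scaled = toy_scaler.transform(toy_val)

# The same validation result computed by hand from the TRAINING mean and std
manual_val_scaled = (toy_val - toy_train.mean(axis=0)) / toy_train.std(axis=0)
```

The scaled training data has zero mean by construction, while the scaled validation data generally does not, which is expected.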

## Data Preparation

In [5]:
torch.manual_seed(13)

# Builds tensors from numpy arrays
x_train_tensor = torch.from_numpy(x_train).float()
y_train_tensor = torch.from_numpy(y_train.reshape(-1,1)).float()

x_val_tensor = torch.from_numpy(x_val).float()
y_val_tensor = torch.from_numpy(y_val.reshape(-1,1)).float()

# Builds datasets containing ALL data points
train_dataset = TensorDataset(x_train_tensor, y_train_tensor)
val_dataset = TensorDataset(x_val_tensor, y_val_tensor)

# Builds data loaders that yield mini-batches
train_loader = DataLoader(dataset=train_dataset, batch_size=16, shuffle=True)
val_loader = DataLoader(dataset=val_dataset, batch_size=16)

## Model

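- We use logistic regression: a single linear layer followed by a sigmoid, which turns the layer's raw output (the logit) into a probability.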

In [6]:
torch.manual_seed(13)
model1 = torch.nn.Sequential(
    torch.nn.Linear(2,1),
    torch.nn.Sigmoid())
model1.state_dict()

OrderedDict([('0.weight', tensor([[-0.5773, -0.0292]])),
             ('0.bias', tensor([0.4392]))])
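
To make the link between the linear layer and the sigmoid explicit, the sketch below rebuilds the same architecture and reproduces its output by hand as sigmoid(x·Wᵀ + b). The names `sketch_model` and `sample_points` are illustrative, and the input values are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(13)
sketch_model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())

# Two arbitrary points with two features each (illustrative values)
sample_points = torch.tensor([[1.0, -1.0], [0.5, 2.0]])

with torch.no_grad():
    linear = sketch_model[0]
    # Logits: the raw output of the linear layer, z = x @ W.T + b
    logits = sample_points @ linear.weight.T + linear.bias
    # Probabilities: the sigmoid squashes each logit into (0, 1)
    manual_probabilities = torch.sigmoid(logits)
    # Output of the full model, for comparison
    model_probabilities = sketch_model(sample_points)
```

Both tensors should match, since the `Sequential` model applies exactly these two steps in order.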

## Loss

#### BCELoss
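
As a rough sketch of what `nn.BCELoss` computes: for a predicted probability p and a label y in {0, 1}, the per-point loss is -[y·log(p) + (1 - y)·log(1 - p)], and `reduction='mean'` averages it over all points. The probabilities and labels below are made up for illustration.

```python
import torch
import torch.nn as nn

# Illustrative predicted probabilities and true labels (not from the notebook)
dummy_probabilities = torch.tensor([[0.9], [0.2], [0.6]])
dummy_labels = torch.tensor([[1.0], [0.0], [0.0]])

# Loss computed by PyTorch: predictions first, labels second
bce_loss = nn.BCELoss(reduction='mean')
loss_torch = bce_loss(dummy_probabilities, dummy_labels)

# The same loss computed by hand from the formula
loss_manual = -(
    dummy_labels * torch.log(dummy_probabilities)
    + (1 - dummy_labels) * torch.log(1 - dummy_probabilities)
).mean()
```

`nn.BCELoss` expects probabilities as its first argument, so it pairs with a model whose last layer is a sigmoid.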

## Full Code

In [7]:
# 1. Data Preparation
torch.manual_seed(13)

# Builds tensors from numpy arrays
x_train_tensor = torch.from_numpy(x_train).float()
y_train_tensor = torch.from_numpy(y_train.reshape(-1,1)).float()

x_val_tensor = torch.from_numpy(x_val).float()
y_val_tensor = torch.from_numpy(y_val.reshape(-1,1)).float()

# Builds datasets containing ALL data points
train_dataset = TensorDataset(x_train_tensor, y_train_tensor)
val_dataset = TensorDataset(x_val_tensor, y_val_tensor)

# Builds data loaders that yield mini-batches
train_loader = DataLoader(dataset=train_dataset, batch_size=16, shuffle=True)
val_loader = DataLoader(dataset=val_dataset, batch_size=16)

In [8]:
# 2. Model Configuration
# learning rate
learning_rate = 1e-3

# Build the model
torch.manual_seed(42)
model_class = torch.nn.Sequential(
    torch.nn.Linear(2, 1),
    torch.nn.Sigmoid()
)

# Optimizer
optimizer_class = torch.optim.SGD(model_class.parameters(), lr=learning_rate)

# Loss function
loss_function_class = torch.nn.BCELoss(reduction='mean')

In [9]:
# 3. Training Model
n_epochs = 100
sbs = StepByStep(
    model=model_class,
    loss_fn=loss_function_class,
    optimizer=optimizer_class,
)

sbs.set_loaders(train_loader, val_loader)
sbs.train(n_epochs=n_epochs)

In [11]:
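# The model ends with a Sigmoid layer, so predict() returns probabilities, not logits
probabilities_val = sbs.predict(x_val_tensor)
# A point is classified as positive when its probability reaches the 0.5 threshold
confusion_matrix(y_val, probabilities_val >= 0.5)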

array([[ 5,  4],
       [10,  1]])
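
The confusion matrix above can be turned into evaluation metrics. In scikit-learn's layout, rows are true classes and columns are predicted classes, so for binary labels the entries are [[TN, FP], [FN, TP]]. A short sketch using the matrix printed above (the variable names are illustrative):

```python
import numpy as np

# Confusion matrix copied from the output above: [[TN, FP], [FN, TP]]
cm = np.array([[5, 4],
               [10, 1]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # 6 / 20 = 0.3
precision = tp / (tp + fp)        # 1 / 5 = 0.2
recall = tp / (tp + fn)           # 1 / 11, also called the true positive rate
fpr = fp / (fp + tn)              # 4 / 9, the false positive rate
```

All four numbers depend on the 0.5 threshold used to build the matrix; a different threshold would give a different matrix and different metrics.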