<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Objective" data-toc-modified-id="Objective-1">Objective</a></span></li><li><span><a href="#Our-Data" data-toc-modified-id="Our-Data-2">Our Data</a></span></li><li><span><a href="#One-layer-NN" data-toc-modified-id="One-layer-NN-3">One-layer NN</a></span><ul class="toc-item"><li><span><a href="#nn.functional" data-toc-modified-id="nn.functional-3.1">nn.functional</a></span></li><li><span><a href="#Logistic-Regression-Network" data-toc-modified-id="Logistic-Regression-Network-3.2">Logistic Regression Network</a></span></li></ul></li><li><span><a href="#Train" data-toc-modified-id="Train-4">Train</a></span></li></ul></div>

In [1]:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

In [2]:
# Set seed for reproducibility
seed = 9
np.random.seed(seed)

---------

# Objective

The goal of this notebook is to write a simple two-layer feedforward neural net in pure Python, using only NumPy for array operations. We'd also like a way to visualize how the weights move through weight space as we perform gradient descent.

----------

# Our Data

We'll use the breast cancer dataset from `sklearn`, which is a binary classification task with imbalanced classes (212 malignant tumors and 357 benign tumors). Each observed tumor has 30 attributes.

In [3]:
# Load data
X, y = load_breast_cancer(return_X_y=True)
X.shape, y.shape

((569, 30), (569,))

In [4]:
# Train-test-split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=seed)
X_train.shape, X_val.shape

((455, 30), (114, 30))

In [5]:
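# Normalize input with one global mean and std (scalars taken over all features),
# computed on the training split only so no validation statistics leak in
mu, sigma = np.mean(X_train), np.std(X_train)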
X_train = (X_train - mu) / sigma
X_val = (X_val - mu) / sigma
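
The cell above shifts and scales the whole matrix by a single scalar mean and standard deviation. A common alternative, shown here only as a sketch and not what this notebook does, is to standardize each feature column separately. The helper name `standardize_per_feature`, the `eps` guard, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def standardize_per_feature(X_train, X_val, eps=1e-8):
    """Scale each column using statistics from the training split only."""
    mu = X_train.mean(axis=0)            # one mean per feature, shape (n_features,)
    sigma = X_train.std(axis=0) + eps    # eps guards against constant columns
    return (X_train - mu) / sigma, (X_val - mu) / sigma

# Hypothetical toy data whose two columns live on very different scales
rng = np.random.default_rng(0)
toy_train = rng.normal(loc=[0.0, 500.0], scale=[1.0, 100.0], size=(100, 2))
toy_val = rng.normal(loc=[0.0, 500.0], scale=[1.0, 100.0], size=(20, 2))

toy_train_s, toy_val_s = standardize_per_feature(toy_train, toy_val)
print(toy_train_s.mean(axis=0), toy_train_s.std(axis=0))
```

After this transform every training column has roughly zero mean and unit standard deviation, whereas a global scalar keeps the relative scales between columns unchanged.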

-----------

# One-layer NN

To start, we'll build a single-layer classifier: one linear transformation whose output is passed through a sigmoid.
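
Written out, the model is the following, where the symbols $X$ (a batch of $n$ inputs with $d$ features), $w$ (weights), and $b$ (bias) are notation assumed here rather than taken from the notebook:

$$
\hat{y} = \sigma(Xw + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad X \in \mathbb{R}^{n \times d},\; w \in \mathbb{R}^{d \times 1},\; b \in \mathbb{R}
$$

Each entry of $\hat{y}$ lies strictly between 0 and 1 and is read as the predicted probability of class 1.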

## nn.functional

Our minimal NumPy version of `torch.nn.functional`: an activation, a metric, a loss, and the gradient of that loss.

In [6]:
def sigmoid(X):
    """Pass a minibatch through a sigmoid layer."""
    return 1 / (1 + np.exp(-X))

def accuracy(y, y_hat):
    """Compute accuracy given soft binary predictions."""
    y_pred = y_hat > 0.5
    return (y_pred == y).mean()

def binary_cross_entropy(y, y_hat):
    """Return binary cross entropy given targets and predictions."""
    return np.where(y==1, -np.log(y_hat), -np.log(1 - y_hat)).mean()

def binary_cross_entropy_grad(X, y, y_hat):
    """Return the gradient of the binary cross entropy loss w.r.t. the weights and bias."""
    grad_w = 1 / len(y) * (y_hat - y) @ X
    grad_b = np.mean(y_hat - y)
    return grad_w, grad_b
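
For a sigmoid output trained with binary cross entropy, the derivative of the mean loss with respect to each pre-activation is `(y_hat - y) / n`, which is where the expressions in `binary_cross_entropy_grad` come from. One way to gain confidence in such a formula is a finite-difference check. The sketch below redefines the functions under shorter names so that it runs on its own; the toy data, the step size `eps`, and the helper `loss_at` are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(y, y_hat):
    return np.where(y == 1, -np.log(y_hat), -np.log(1 - y_hat)).mean()

def bce_grad(X, y, y_hat):
    # Same formulas as binary_cross_entropy_grad above
    return (y_hat - y) @ X / len(y), np.mean(y_hat - y)

# Small random problem (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.integers(0, 2, size=8)
w = rng.normal(size=3)
b = 0.1

def loss_at(w, b):
    """Loss as a function of the parameters only."""
    return bce(y, sigmoid(X @ w + b))

# Analytic gradient
grad_w, grad_b = bce_grad(X, y, sigmoid(X @ w + b))

# Central finite differences, one coordinate at a time
eps = 1e-6
num_grad_w = np.zeros_like(w)
for i in range(len(w)):
    step = np.zeros_like(w)
    step[i] = eps
    num_grad_w[i] = (loss_at(w + step, b) - loss_at(w - step, b)) / (2 * eps)
num_grad_b = (loss_at(w, b + eps) - loss_at(w, b - eps)) / (2 * eps)

max_err = max(np.abs(grad_w - num_grad_w).max(), abs(grad_b - num_grad_b))
print(f"max abs difference: {max_err:.2e}")
```

If the analytic and numerical gradients agree to several decimal places, the closed-form gradient is very likely implemented correctly.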

## Logistic Regression Network

Let's build a simple class to model our single-layer neural network, which is just ordinary logistic regression.

In [7]:
class OneLayerBinaryClassifier:
    """Container for a single layer binary classifier."""
    
    def __init__(self, n_inp):
        """Initialise weights and bias."""
        self.linear = np.random.uniform(-0.1, 0.1, (n_inp, 1))
        self.bias = np.zeros(1)
        self.out = sigmoid
            
    def forward(self, X):
        """Pass a batch of inputs through the network and return predicted probabilities."""
        return self.out(X @ self.linear + self.bias).squeeze(1)
    
    def step(self, X, y, y_hat, lr):
        """Perform one step of gradient descent."""
        grad_w, grad_b = binary_cross_entropy_grad(X, y, y_hat)
        self.linear -= lr * grad_w.reshape(-1, 1)
        self.bias -= lr * grad_b

----------

# Train

Now we're ready to put our model to the test.

In [8]:
class BinaryTrainer:
    """Container for training a binary classifier."""
    
    def __init__(self, model, train_dl, val_dl, lr=1e-3):
        self.model = model
        self.train_dl = train_dl
        self.val_dl = val_dl
        self.lr = lr
        
    def train(self, n_epochs):
        """Perform full-batch gradient descent for a number of epochs (one update per epoch)."""
        for epoch in range(n_epochs):
            X, y = self.train_dl
            y_hat = self.model.forward(X)
            loss = binary_cross_entropy(y, y_hat)
            self.model.step(X, y, y_hat, self.lr)
            
            val_loss, val_acc = self.evaluate(self.val_dl)
            print(f"{epoch= :2d} | {loss= :.3f} | {val_loss= :.3f} | {val_acc= :.3f}")
            
    def evaluate(self, dl):
        """Return loss and accuracy on validation or test set."""
        X, y = dl
        y_hat = self.model.forward(X)
        return binary_cross_entropy(y, y_hat), accuracy(y, y_hat)

In [9]:
# Bundle inputs and targets
train_dl = (X_train, y_train)
val_dl = (X_val, y_val)

In [10]:
# Initialise model & trainer
model = OneLayerBinaryClassifier(X_train.shape[1])
trainer = BinaryTrainer(model, train_dl, val_dl, lr=0.1)

In [11]:
trainer.train(10)

epoch=  0 | loss= 0.688 | val_loss= 0.679 | val_acc= 0.351
epoch=  1 | loss= 0.677 | val_loss= 0.667 | val_acc= 0.351
epoch=  2 | loss= 0.666 | val_loss= 0.656 | val_acc= 0.404
epoch=  3 | loss= 0.655 | val_loss= 0.645 | val_acc= 0.482
epoch=  4 | loss= 0.645 | val_loss= 0.634 | val_acc= 0.588
epoch=  5 | loss= 0.636 | val_loss= 0.624 | val_acc= 0.719
epoch=  6 | loss= 0.626 | val_loss= 0.614 | val_acc= 0.798
epoch=  7 | loss= 0.617 | val_loss= 0.605 | val_acc= 0.860
epoch=  8 | loss= 0.609 | val_loss= 0.596 | val_acc= 0.877
epoch=  9 | loss= 0.601 | val_loss= 0.587 | val_acc= 0.877


In [12]:
trainer.train(10)

epoch=  0 | loss= 0.593 | val_loss= 0.578 | val_acc= 0.886
epoch=  1 | loss= 0.585 | val_loss= 0.570 | val_acc= 0.886
epoch=  2 | loss= 0.577 | val_loss= 0.562 | val_acc= 0.877
epoch=  3 | loss= 0.570 | val_loss= 0.555 | val_acc= 0.886
epoch=  4 | loss= 0.563 | val_loss= 0.548 | val_acc= 0.886
epoch=  5 | loss= 0.557 | val_loss= 0.541 | val_acc= 0.904
epoch=  6 | loss= 0.550 | val_loss= 0.534 | val_acc= 0.886
epoch=  7 | loss= 0.544 | val_loss= 0.527 | val_acc= 0.895
epoch=  8 | loss= 0.538 | val_loss= 0.521 | val_acc= 0.895
epoch=  9 | loss= 0.532 | val_loss= 0.515 | val_acc= 0.904
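
Since one goal stated in the Objective is to see how the weights change during gradient descent, a possible first step is to store a copy of the weights after every update. The sketch below is a standalone assumption, not part of `BinaryTrainer`: the function `train_with_history`, its parameters, and the two-feature toy data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_with_history(X, y, lr=0.1, n_steps=20, seed=0):
    """Full-batch gradient descent that records the weights after each step."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.1, 0.1, X.shape[1])
    b = 0.0
    history = [w.copy()]                 # copy, otherwise every entry aliases w
    losses = []
    for _ in range(n_steps):
        y_hat = sigmoid(X @ w + b)
        losses.append(np.where(y == 1, -np.log(y_hat), -np.log(1 - y_hat)).mean())
        w -= lr * (y_hat - y) @ X / len(y)
        b -= lr * np.mean(y_hat - y)
        history.append(w.copy())
    return np.array(history), np.array(losses)

# Hypothetical two-feature problem, so each weight snapshot is a point in the plane
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(200, 2))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(int)

history, losses = train_with_history(X_toy, y_toy)
print(history.shape, losses[0], losses[-1])
```

Each row of `history` is one weight vector, so with two features the rows can be plotted directly as a trajectory. For the 30-feature model in this notebook, some projection down to two dimensions would be needed first.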
