# Cat-astrophe

You've recently been hired by the city of Dogtopia to solve a growing problem with cats. You see, the residents of Dogtopia all hate cats, and many of them are allergic to them. In recent years, cats have found their way through the city gate, past the cat-traps, and started multiplying within the city limits. A team of rookie machine learning scientists developed a Convolutional Neural Network that takes in images from security cameras and detects cats. However, since they lack experience, they can't figure out why their neural network performs poorly. You, being an industry expert with many years of experience, have been hired by the city to look over the scientists' work and make the adjustments necessary to fix their neural network.

In [None]:
!pip install torch torchmetrics numpy pandas h5py matplotlib scikit-learn bokeh

In [None]:
! nvidia-smi

## Visualizing our Dataset

The fine folks of Dogtopia have gathered what they describe as a "Large, Diverse Dataset" of cat and non-cat images. They have already done the preprocessing for you.

In [None]:
import numpy as np
import pandas as pd
import h5py
import matplotlib.pyplot as plt

pd.set_option("display.max_columns", None)

In [None]:
# Open both files read-only so the datasets cannot be modified by accident
trainset = h5py.File("datasets/train_catvnoncat.h5", "r")
testset = h5py.File("datasets/test_catvnoncat.h5", "r")
#print(trainset.keys())

features = np.array(trainset["train_set_x"])
labels = np.array(trainset["train_set_y"]).reshape(-1, 1)

print(f"features shape: {features.shape}")
print(f"labels shape: {labels.shape}")


plt.imshow(features[1])
plt.show()
print(labels[1])

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_cv, Y_train, Y_cv = train_test_split(features, labels, test_size=0.3)

print(X_train.shape)
print(Y_train.shape)
print(X_cv.shape)
print(Y_cv.shape)

In [None]:
def class_weights(Y_train):
    """
    Creates a class weight array for a binary classification
    Args:
        Y_train: np.array containing binary labels
    Returns:
        weights: np.array with weights for each example
    """
    
    pos = np.sum(Y_train)
    neg = Y_train.shape[0] - pos
    print(f"Number of positive examples: {pos}")
    print(f"Number of negative examples: {neg}\n")
    w_pos = Y_train.shape[0]/(2*pos)
    w_neg = Y_train.shape[0]/(2*neg)
    print(f"weight of pos: {w_pos}")
    print(f"weight of neg: {w_neg}")
    posmask = Y_train * w_pos
    negmask = ~(Y_train.astype(bool)) * w_neg
    weights = posmask + negmask
    return weights

    
weights = class_weights(Y_train)
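The weighting above follows the common "balanced" heuristic, where each class gets the weight N / (2 * N_class), so that both classes contribute the same total weight to the loss. Below is a minimal sketch of that arithmetic on a made-up label vector (the labels are invented for illustration and are not taken from the dataset):

```python
import numpy as np

# Toy labels, invented for illustration: 2 cats (1) and 6 non-cats (0).
y = np.array([1, 0, 0, 1, 0, 0, 0, 0]).reshape(-1, 1)

n = y.shape[0]
n_pos = int(y.sum())
n_neg = n - n_pos

# Balanced heuristic: each class gets weight N / (2 * N_class).
w_pos = n / (2 * n_pos)
w_neg = n / (2 * n_neg)

# Per-example weights with the same shape as the labels.
example_weights = np.where(y == 1, w_pos, w_neg)

print(example_weights.ravel())
# Each class should now carry about half of the total weight.
print(example_weights[y == 1].sum(), example_weights[y == 0].sum())
```

The rarer class ends up with the larger per-example weight, which is what keeps the loss from being dominated by the majority class.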

## Their Model

The scientists of Dogtopia present you with the convolutional neural network they designed. It really looks like something developed by experienced computing scientists, so what could possibly go wrong?

In [None]:
import torch
from torch.nn import Module, Linear, BatchNorm1d, BatchNorm2d, Sigmoid, BCELoss, ReLU, Conv2d, MaxPool2d, Dropout
from torch.optim import SGD, Adam
from torchmetrics import Accuracy
from bokeh.layouts import Row

from bokeh.plotting import figure, show, output_notebook

class Classifier(Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.bnin = BatchNorm2d(3)
        self.conv1 = Conv2d(3, 24, kernel_size=7, padding=3)
        self.bn1 = BatchNorm2d(24)
        self.act1 = ReLU()
        self.mp1 = MaxPool2d(2)
        self.d1 = Dropout(0.5)
        
        self.conv2 = Conv2d(24, 48, kernel_size=5, padding=2)
        self.bn2 = BatchNorm2d(48)
        self.act2 = ReLU()
        self.mp2 = MaxPool2d(4)     
        self.d2 = Dropout(0.5)        
        
        self.fc3 = Linear(48*8*8, 200)
        self.bn3 = BatchNorm1d(200)
        self.act3 = ReLU()
        self.d3 = Dropout(0.5)

        self.fc4 = Linear(200, 100)
        self.bn4 = BatchNorm1d(100)
        self.act4 = ReLU()
        self.d4 = Dropout(0.5)       

        self.fc5 = Linear(100, 100)
        self.bn5 = BatchNorm1d(100)
        self.act5 = ReLU()
        self.d5 = Dropout(0.5) 

        self.out = Linear(100, 1)
        self.act6 = Sigmoid()

        self.train_costs = []
        self.val_costs = []
        self.train_acc = []
        self.val_acc = []
        self.epochs = None
    
    def forward(self, x):
        x = x.permute(0,3,1,2)
        
        x = self.bnin(x)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.act1(x)
        x = self.mp1(x)
        x = self.d1(x)
        
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.act2(x)
        x = self.mp2(x)
        x = self.d2(x)

        x = torch.flatten(x, start_dim=1)
        x = self.fc3(x)
        x = self.bn3(x)
        x = self.act3(x)
        x = self.d3(x)

        x = self.fc4(x)
        x = self.bn4(x)
        x = self.act4(x)
        x = self.d4(x)

        x = self.fc5(x)
        x = self.bn5(x)
        x = self.act5(x)
        x = self.d5(x)

        x = self.out(x)
        x = self.act6(x)

        return x


    
    def fit(self, X_train, Y_train, loss_fn, opt, X_cv=None, Y_cv=None, epochs=1):
        self.epochs = epochs
        # Recent torchmetrics releases require the task to be named explicitly.
        acc = Accuracy(task="binary").to(X_train.device)
        val_loss_fn = BCELoss()  # unweighted: the class weights are per training example

        for i in range(epochs):
            self.train()
            opt.zero_grad()

            # Full-batch forward pass (dropout active, batch statistics used)
            Y_pred = self(X_train)
            cost = loss_fn(Y_pred, Y_train)

            train_acc = acc(torch.round(Y_pred).int(), Y_train.int()).item()
            msg = f"Iter: {i}    loss: {cost.item(): .4f}    accuracy: {train_acc: .4f}    "
            self.train_costs.append(cost.item())
            self.train_acc.append(train_acc)

            if torch.is_tensor(X_cv) and torch.is_tensor(Y_cv):
                # Evaluate without dropout and without tracking gradients
                self.eval()
                with torch.no_grad():
                    Y_pred_val = self(X_cv)
                    cost_val = val_loss_fn(Y_pred_val, Y_cv)
                    cv_acc = acc(torch.round(Y_pred_val).int(), Y_cv.int()).item()
                msg += f"cv_loss: {cost_val.item(): .4f}    cv_accuracy: {cv_acc: .4f}"
                self.val_costs.append(cost_val.item())
                self.val_acc.append(cv_acc)
                self.train()
            print(msg)

            # Backpropagate the training loss and update the weights
            cost.backward()
            opt.step()

    def disp_metrics(self):

        p = figure(width=500, height=300, x_axis_label="iterations", y_axis_label="cost", title="costs vs iterations", y_range=(0, 1.5))
        p.line(np.arange(1, self.epochs+1, 1),self.train_costs, color="blue")
        p.line(np.arange(1, self.epochs+1, 1), self.val_costs, color="red")

        p2 = figure(width=500, height=300, x_axis_label="iterations", y_axis_label="accuracy", title="accuracy vs iterations", y_range=(0.4, 1.05))
        p2.line(np.arange(1, self.epochs+1, 1), self.train_acc, color="blue")
        p2.line(np.arange(1, self.epochs+1, 1), self.val_acc, color="red")

        output_notebook()
        show(Row(p, p2))

In [None]:
input_dim, output_dim = (X_train.shape[1], 1)
LEARNING_RATE = 0.001
EPOCHS = 300

model_v1 = Classifier(input_dim, output_dim).cuda()
optimizer_v1 = Adam(model_v1.parameters(), lr=LEARNING_RATE)
# Cast the float64 NumPy weights to float32 to match the model's outputs
criterion_v1 = BCELoss(weight=torch.from_numpy(weights).float().cuda()).cuda()

In [None]:
X_train_t = torch.from_numpy(X_train).float().cuda()
Y_train_t = torch.from_numpy(Y_train).float().cuda()
X_cv_t = torch.from_numpy(X_cv).float().cuda()
Y_cv_t = torch.from_numpy(Y_cv).float().cuda()

model_v1.fit(X_train_t, Y_train_t, criterion_v1, optimizer_v1, X_cv=X_cv_t, Y_cv=Y_cv_t, epochs=EPOCHS)
model_v1.disp_metrics()

## Your Take

While the machine learning model is state-of-the-art, you realize that the scientists really do not know how to analyze training and cross-validation cost! Their model does very well on the training set, but it **generalizes very poorly to the cross-validation set**, which is the classic signature of overfitting (high variance). See if you can best their attempt by applying some **L2 Regularization**.
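In PyTorch, the `weight_decay` argument of `Adam` adds `weight_decay * w` to each parameter's gradient, which is the gradient of an L2 penalty `(weight_decay / 2) * ||w||^2`. The sketch below shows that effect for a plain gradient step in NumPy. The helper `sgd_step` is hypothetical and only for illustration. Adam also rescales each step adaptively, so the real update is not exactly this shrinkage.

```python
import numpy as np

def sgd_step(w, grad, lr, weight_decay=0.0):
    """One plain gradient step with an L2 penalty folded into the gradient.

    Adding (weight_decay / 2) * ||w||^2 to the loss adds weight_decay * w
    to the gradient, which pulls every weight toward zero.
    """
    return w - lr * (grad + weight_decay * w)

w = np.array([2.0, -3.0, 0.5])
data_grad = np.zeros_like(w)  # pretend the data loss is flat at this point

w_plain = sgd_step(w, data_grad, lr=0.1)
w_decay = sgd_step(w, data_grad, lr=0.1, weight_decay=0.5)

print(w_plain)  # unchanged, since there is no gradient and no penalty
print(w_decay)  # every weight shrunk by the factor (1 - lr * weight_decay)
```

A larger `weight_decay` shrinks the weights harder, trading some training accuracy for a smaller gap to the cross-validation loss. The value has to be a non-negative number, so the `L2 = None` placeholder must be replaced before the optimizer is built.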

In [None]:
input_dim, output_dim = (X_train.shape[1], 1)
LEARNING_RATE = 0.001
EPOCHS = 300

# ====  CHANGE THIS PARAMETER ==== #
L2 = None
# ================================ #

model_v2 = Classifier(input_dim, output_dim).cuda()
optimizer_v2 = Adam(model_v2.parameters(), lr=LEARNING_RATE, weight_decay=L2)
# Cast the float64 NumPy weights to float32 to match the model's outputs
criterion_v2 = BCELoss(weight=torch.from_numpy(weights).float().cuda()).cuda()

In [None]:
X_train_t = torch.from_numpy(X_train).float().cuda()
Y_train_t = torch.from_numpy(Y_train).float().cuda()
X_cv_t = torch.from_numpy(X_cv).float().cuda()
Y_cv_t = torch.from_numpy(Y_cv).float().cuda()

model_v2.fit(X_train_t, Y_train_t, criterion_v2, optimizer_v2, X_cv=X_cv_t, Y_cv=Y_cv_t, epochs=EPOCHS)
model_v2.disp_metrics()

## Testing Your Model

Now it's time to report back to the city of Dogtopia on your results! You use the testing data they gave you to demo your classifier.

In [None]:
pass
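One possible way to fill in this cell is to turn the model's sigmoid outputs on the test images into a few summary numbers. The sketch below is not the notebook's official solution. `binary_report` is a hypothetical helper, and the probabilities and labels are made up so that the snippet runs on its own.

```python
import numpy as np

def binary_report(probs, labels, threshold=0.5):
    """Accuracy, precision and recall for sigmoid outputs vs. 0/1 labels."""
    preds = (np.asarray(probs).ravel() > threshold).astype(int)
    labels = np.asarray(labels).ravel().astype(int)
    tp = int(np.sum((preds == 1) & (labels == 1)))  # cats found
    fp = int(np.sum((preds == 1) & (labels == 0)))  # false alarms
    fn = int(np.sum((preds == 0) & (labels == 1)))  # cats missed
    tn = int(np.sum((preds == 0) & (labels == 0)))  # non-cats correctly ignored
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Made-up probabilities and labels, for illustration only.
demo_probs = np.array([0.9, 0.2, 0.7, 0.4, 0.1])
demo_labels = np.array([1, 0, 0, 1, 0])
report = binary_report(demo_probs, demo_labels)
print(report)
```

With the notebook's own objects, the probabilities would come from calling `model_v2` in `eval()` mode under `torch.no_grad()` on the test images. The key names inside `testset` are not shown above, so check `testset.keys()` before indexing into it.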

## Conclusion

The city of Dogtopia is thoroughly impressed and offers you a permanent job there. You politely decline the offer (since you like cats), but you agree to spend a few days explaining to the scientists the importance of **deciphering bias and variance problems from training/validation loss and accuracy**, as well as **fixing variance problems using regularization.**
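As a parting gift for the scientists, the rule of thumb behind that diagnosis can be written down in a few lines. This is a rough sketch only: `diagnose` is a hypothetical helper, and the thresholds `target_loss` and `gap_tol` are arbitrary illustrative choices that would need tuning for any real problem.

```python
def diagnose(train_loss, val_loss, target_loss=0.1, gap_tol=0.1):
    """Rough bias/variance read-out from final training and validation loss."""
    # High bias: the model cannot even fit the training set well.
    high_bias = train_loss > target_loss
    # High variance: the validation loss is much worse than the training loss.
    high_variance = (val_loss - train_loss) > gap_tol
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "looks fine"

# Made-up loss values, for illustration only.
print(diagnose(train_loss=0.02, val_loss=0.90))
print(diagnose(train_loss=0.60, val_loss=0.65))
```

A high-variance verdict points toward regularization or more data, while a high-bias verdict points toward a larger model or longer training.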