# The building blocks of DL 2: linear layer with activation function

In the previous post I explained the most important technique in neural networks: a way to incrementally find the minimum of a function. It is called the Gradient Descent algorithm!
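
As a quick refresher, the idea can be sketched in a few lines. This is a minimal illustration, not the code from the previous post: the function `f(x) = (x - 3)^2`, the starting point and the learning rate are all assumptions chosen for the example.

```python
# Minimal gradient descent sketch on f(x) = (x - 3)^2 (a hypothetical example function).
def grad_f(x):
    # Derivative of (x - 3)^2 with respect to x.
    return 2 * (x - 3)

x = 0.0   # assumed starting point
lr = 0.1  # assumed learning rate

for step in range(100):
    x -= lr * grad_f(x)  # take a small step against the gradient

print(x)  # approaches the minimum at x = 3
```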

In this post I will build on this concept and show you how to create a basic linear model to predict what is on a medical image!

### Download data

As in the first post, which showed how to build medical image recognition with pure statistics, we need to download the data first. For a basic description of what the data looks like, see that first post.

In [1]:
! git clone https://github.com/apolanco3225/Medical-MNIST-Classification.git
! mv Medical-MNIST-Classification/resized/ ./medical_mnist
! rm -rf Medical-MNIST-Classification

Cloning into 'Medical-MNIST-Classification'...
remote: Enumerating objects: 58532, done.
remote: Total 58532 (delta 0), reused 0 (delta 0), pack-reused 58532
Receiving objects: 100% (58532/58532), 77.86 MiB | 5.34 MiB/s, done.
Resolving deltas: 100% (506/506), done.
Checking connectivity... done.
Checking out files: 100% (58959/58959), done.


In [2]:
from pathlib import Path

PATH = Path("medical_mnist/")

We have much more powerful tools now, so we will deal with all 6 classes. But first we need to prepare the data:
 - all data needs to be numerical
 - it needs to be in arrays
 - it needs to be labeled

In [3]:
classes = !ls {PATH}
classes

['AbdomenCT', 'BreastMRI', 'CXR', 'ChestCT', 'Hand', 'HeadCT']

In [4]:
images = {}
for cls in classes:
    images[cls] = !ls {PATH/cls}

In [5]:
from PIL import Image

In [6]:
# ToTensor converts images to tensors
from torchvision.transforms import ToTensor

In [7]:
import torch

image_tensors = {}

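def load_image(path):
    # Flatten each 64x64 image into a vector of 4096 values.
    # Note: for 8-bit images ToTensor already scales pixel values to [0, 1],
    # so the extra / 255 shrinks the inputs further; it is left in to match the outputs shown below.
    return ToTensor()(Image.open(path)).view(-1, 64 * 64).squeeze().float() / 255

for cls in classes:
    image_tensors[cls] = torch.stack([load_image(path) for path in (PATH/cls).iterdir()])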

In [8]:
for cls in classes:
    class_shape = image_tensors[cls].shape
    print(f"{cls} has {class_shape[0]} images of a size {class_shape[1:]}")

AbdomenCT has 10000 images of a size torch.Size([4096])
BreastMRI has 8954 images of a size torch.Size([4096])
CXR has 10000 images of a size torch.Size([4096])
ChestCT has 10000 images of a size torch.Size([4096])
Hand has 10000 images of a size torch.Size([4096])
HeadCT has 10000 images of a size torch.Size([4096])


In [9]:
x_train = torch.cat([image_tensors[cls] for cls in classes], dim=0)
y_train = torch.cat([torch.tensor([index] * image_tensors[cls].shape[0]) for index, cls in enumerate(classes)])
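
The way the inputs and labels are assembled above can be seen more easily on a tiny made-up dataset. In this sketch `toy_tensors` and `toy_classes` are hypothetical names, and the class index produced by `enumerate` becomes the label for every image of that class.

```python
import torch

# Hypothetical tiny dataset: class "A" has 3 "images", class "B" has 2, each with 4 pixels.
toy_tensors = {"A": torch.zeros(3, 4), "B": torch.ones(2, 4)}
toy_classes = list(toy_tensors)

# Stack all images into one tensor, class after class.
x = torch.cat([toy_tensors[c] for c in toy_classes], dim=0)

# Repeat each class index once per image of that class.
y = torch.cat([
    torch.full((toy_tensors[c].shape[0],), i, dtype=torch.long)
    for i, c in enumerate(toy_classes)
])

print(x.shape)  # torch.Size([5, 4])
print(y)        # tensor([0, 0, 0, 1, 1])
```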

Shuffle the data:

In [10]:
permutations = torch.randperm(x_train.shape[0])

In [11]:
x_train = x_train[permutations]
y_train = y_train[permutations]

Create a validation set from 20% of the training data:

In [12]:
valid_pct = 0.2
valid_index = int(x_train.shape[0] * valid_pct)
valid_index

11790

In [13]:
x_valid = x_train[:valid_index]
y_valid = y_train[:valid_index]
x_train = x_train[valid_index:]
y_train = y_train[valid_index:]

In [14]:
x_train.shape, y_train.shape

(torch.Size([47164, 4096]), torch.Size([47164]))
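
The shuffle-and-split steps above can be checked on toy data. The sketch below uses made-up tensors (`x_toy` and `y_toy` are hypothetical names) to show that indexing both tensors with the same permutation keeps every sample paired with its label, and that the two slices partition the data.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Toy data: 10 samples with 3 features each; the label equals the value stored in the row.
x_toy = torch.arange(10).float().unsqueeze(1).repeat(1, 3)
y_toy = torch.arange(10)

# One permutation applied to both tensors keeps samples and labels aligned.
perm = torch.randperm(x_toy.shape[0])
x_toy, y_toy = x_toy[perm], y_toy[perm]

# The first 20% becomes the validation set, the rest stays as training data.
valid_index = int(x_toy.shape[0] * 0.2)
x_val, y_val = x_toy[:valid_index], y_toy[:valid_index]
x_tr, y_tr = x_toy[valid_index:], y_toy[valid_index:]

print(x_val.shape, x_tr.shape)
```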

In [15]:
# log_softmax normalizes the model outputs into log-probabilities: exponentiated, the values in each row add up to 1.0
def log_softmax(x):
    return x - x.exp().sum(-1).log().unsqueeze(-1)
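
To see what `log_softmax` does, the sketch below applies it to a made-up batch of scores and compares the result with PyTorch's built-in `torch.log_softmax`. The input values are arbitrary assumptions.

```python
import torch

def log_softmax(x):
    # Same definition as in the cell above: subtract log(sum(exp(x))) from each row.
    return x - x.exp().sum(-1).log().unsqueeze(-1)

scores = torch.tensor([[1.0, 2.0, 3.0],
                       [0.5, 0.5, 0.5]])  # arbitrary example scores

log_probs = log_softmax(scores)
probs = log_probs.exp()  # back from log-probabilities to probabilities

print(probs.sum(-1))  # each row sums to 1 (up to floating-point error)
print(torch.allclose(log_probs, torch.log_softmax(scores, dim=-1)))
```

For very large scores `x.exp()` can overflow, so in practice the built-in version is the safer choice; the hand-written one is kept here because it makes the formula visible.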

In [16]:
def model(x):
    return log_softmax(x @ weights + bias)

In [17]:
def loss_func(preds, targets):
    return -preds[range(targets.shape[0]), targets].mean()

def accuracy(preds, targets):
    return (torch.argmax(preds, dim=-1) == targets).float().mean()
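
`loss_func` above is the negative log-likelihood: for every row it picks the log-probability that the model assigned to the correct class, then averages and negates. The sketch below checks this on made-up scores and labels against what I understand to be the equivalent PyTorch built-ins, `F.nll_loss` and `F.cross_entropy`.

```python
import torch
import torch.nn.functional as F

def loss_func(preds, targets):
    # Pick the log-probability of the correct class in each row, average, negate.
    return -preds[range(targets.shape[0]), targets].mean()

def accuracy(preds, targets):
    # Fraction of rows where the highest-scoring class equals the target.
    return (torch.argmax(preds, dim=-1) == targets).float().mean()

logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.3, 3.0],
                       [1.0, 0.9, 0.8]])  # arbitrary raw scores
targets = torch.tensor([0, 2, 1])          # made-up labels

log_probs = torch.log_softmax(logits, dim=-1)

print(loss_func(log_probs, targets))
print(F.nll_loss(log_probs, targets))    # built-in equivalent on log-probabilities
print(F.cross_entropy(logits, targets))  # log_softmax and nll_loss in one call
print(accuracy(log_probs, targets))      # 2 of the 3 predictions are correct
```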

In [18]:
from IPython.core.debugger import set_trace

n  = x_train.shape[0]
bs = 64
lr = 0.5  # learning rate
epochs = 30  # how many epochs to train for

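# Note: 10 output columns are allocated although the dataset has only 6 classes;
# the 4 extra columns never appear as targets.
weights = torch.zeros((64 * 64, 10), requires_grad=True)
bias = torch.zeros(10, requires_grad=True)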

xb, yb = None, None

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        #         set_trace()
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i]
        yb = y_train[start_i:end_i]
        preds = model(xb)
        loss = loss_func(preds, yb)

        loss.backward()
        with torch.no_grad():
            weights -= weights.grad * lr
            bias -= bias.grad * lr
            weights.grad.zero_()
            bias.grad.zero_()

    print(f"Epoch {epoch} accuracy: {accuracy(model(x_valid),y_valid)}, loss: {loss_func(model(x_valid), y_valid)}")


Epoch 0 accuracy: 0.47073790431022644, loss: 1.65959894657135
Epoch 1 accuracy: 0.6564037203788757, loss: 1.5399587154388428
Epoch 2 accuracy: 0.9083121418952942, loss: 1.4410401582717896
Epoch 3 accuracy: 0.9347752332687378, loss: 1.3571207523345947
Epoch 4 accuracy: 0.9279050230979919, loss: 1.2848182916641235
Epoch 5 accuracy: 0.9279050230979919, loss: 1.221755027770996
Epoch 6 accuracy: 0.9306191802024841, loss: 1.1661864519119263
Epoch 7 accuracy: 0.9348600506782532, loss: 1.116800308227539
Epoch 8 accuracy: 0.9382527470588684, loss: 1.0725876092910767
Epoch 9 accuracy: 0.9405428171157837, loss: 1.0327564477920532
Epoch 10 accuracy: 0.944614052772522, loss: 0.9966747760772705
Epoch 11 accuracy: 0.9467345476150513, loss: 0.9638299345970154
Epoch 12 accuracy: 0.9485157132148743, loss: 0.9338014125823975
Epoch 13 accuracy: 0.9498727917671204, loss: 0.9062393307685852
Epoch 14 accuracy: 0.9508057832717896, loss: 0.8808506727218628
Epoch 15 accuracy: 0.9516539573669434, loss: 0.8573866
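
For comparison, the same training loop can be written with PyTorch's built-in building blocks. This is a sketch on synthetic data rather than the medical images, and it is not the code that produced the log above: `nn.Linear`, `F.cross_entropy` and `torch.optim.SGD` stand in for the hand-written `weights` and `bias`, for `log_softmax` plus `loss_func`, and for the manual update step. The data sizes, the 6-output layer and the hidden map `true_w` used to generate labels are assumptions made for the example.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in data: 256 samples, 16 features, 6 classes (assumed sizes).
x_demo = torch.randn(256, 16)
true_w = torch.randn(16, 6)              # hypothetical map used only to create labels
y_demo = (x_demo @ true_w).argmax(dim=-1)

model = nn.Linear(16, 6)                 # holds the weight matrix and the bias
opt = torch.optim.SGD(model.parameters(), lr=0.5)
bs = 64

losses = []
for epoch in range(20):
    for start in range(0, x_demo.shape[0], bs):
        xb = x_demo[start:start + bs]
        yb = y_demo[start:start + bs]
        loss = F.cross_entropy(model(xb), yb)  # log_softmax + negative log-likelihood
        loss.backward()
        opt.step()       # same update as `weights -= weights.grad * lr`
        opt.zero_grad()  # same as calling `.grad.zero_()` on each parameter
    with torch.no_grad():
        # Loss over the whole synthetic set after each epoch.
        losses.append(F.cross_entropy(model(x_demo), y_demo).item())

print(losses[0], losses[-1])
```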

### A simple Neural Network

This will be the simplest neural network that meets the criteria of the [universal approximation theorem](https://en.wikipedia.org/wiki/Universal_approximation_theorem), so in theory it can approximate any continuous function, provided we introduce enough parameters.
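
A minimal sketch of such a network, assuming a single hidden layer with a ReLU activation. The hidden size of 128 and the 6 outputs are assumptions for illustration, not values taken from this post.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes: 64 * 64 input pixels, 128 hidden units, 6 classes.
simple_net = nn.Sequential(
    nn.Linear(64 * 64, 128),  # linear layer
    nn.ReLU(),                # non-linear activation function
    nn.Linear(128, 6),        # output layer: one score per class
)

fake_batch = torch.rand(8, 64 * 64)  # random stand-in for 8 flattened images
scores = simple_net(fake_batch)
print(scores.shape)  # torch.Size([8, 6])
```

Without the activation between them, the two linear layers would collapse into a single linear map, which is why the non-linearity matters.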