# The building blocks of DL 2: linear layer with activation function

In the previous post I explained the most important technique in neural networks: a way to incrementally find the minimum of a function. It is called the gradient descent algorithm!
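As a quick reminder of the idea, here is a minimal sketch of gradient descent on a one-variable function. The helper `gradient_descent`, the example function f(x) = (x - 3)^2, and the values of `lr` and `steps` are illustrative assumptions, not code from the previous post.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# f(x) = (x - 3)^2 has its minimum at x = 3; its derivative is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # very close to 3
```

Each step moves `x` a little in the direction that lowers the function, which is the same thing the training loop later in this post does for every weight at once.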

In this post I will build on this concept and show you how to create a basic linear model to predict what is on a medical image!

### Download data

As in the first post, which showed how to build medical image recognition with pure statistics, we need to download the data first. For a basic description of what the data looks like, see that first post.

In [1]:
! git clone https://github.com/apolanco3225/Medical-MNIST-Classification.git
! mv Medical-MNIST-Classification/resized/ ./medical_mnist
! rm -rf Medical-MNIST-Classification

Cloning into 'Medical-MNIST-Classification'...
remote: Enumerating objects: 58532, done.
remote: Total 58532 (delta 0), reused 0 (delta 0), pack-reused 58532
Receiving objects: 100% (58532/58532), 77.86 MiB | 28.99 MiB/s, done.
Resolving deltas: 100% (506/506), done.
Checking connectivity... done.
Checking out files: 100% (58959/58959), done.


In [9]:
from pathlib import Path

PATH = Path("medical_mnist/")

We have much more powerful tools now, so this time we will deal with all 6 classes. First, though, we need to prepare the data:
 - all data needs to be numerical
 - it needs to be in arrays
 - it needs to be labeled

In [14]:
classes = !ls {PATH}
classes

['AbdomenCT', 'BreastMRI', 'CXR', 'ChestCT', 'Hand', 'HeadCT']

In [16]:
images = {}
for cls in classes:
    images[cls] = !ls {PATH/cls}

In [15]:
from PIL import Image

In [43]:
# ToTensor converts images to tensors
from torchvision.transforms import ToTensor

In [73]:
import torch

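image_tensors = {}
to_tensor = ToTensor()

for cls in classes:
    # Flatten each 64x64 image into a vector of 4096 values and stack them per class.
    # Note: ToTensor already scales 8-bit pixels to [0, 1], so the extra / 255 shrinks the inputs further.
    image_tensors[cls] = torch.stack([
        to_tensor(Image.open(path)).view(-1, 64 * 64).squeeze().float() / 255
        for path in (PATH/cls).iterdir()
    ])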

In [74]:
for cls in classes:
    class_shape = image_tensors[cls].shape
    print(f"{cls} has {class_shape[0]} images of a size {class_shape[1:]}")

AbdomenCT has 10000 images of a size torch.Size([4096])
BreastMRI has 8954 images of a size torch.Size([4096])
CXR has 10000 images of a size torch.Size([4096])
ChestCT has 10000 images of a size torch.Size([4096])
Hand has 10000 images of a size torch.Size([4096])
HeadCT has 10000 images of a size torch.Size([4096])


In [80]:
x_train = torch.cat([image_tensors[cls] for cls in classes], dim=0)
y_train = torch.cat([torch.tensor([index] * image_tensors[cls].shape[0]) for index, cls in enumerate(classes)])

In [81]:
x_train.shape, y_train.shape

(torch.Size([58954, 4096]), torch.Size([58954]))
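Because the tensors are concatenated class by class, slicing them in order produces mini-batches that mostly contain a single class. One common remedy, which is not part of the original notebook, is to shuffle the images and labels with the same random permutation. The sketch below uses small stand-in tensors; the names `x`, `y`, `x_shuffled` and `y_shuffled` are hypothetical.

```python
import torch

torch.manual_seed(0)

# Stand-in data: 6 samples with 4 features, sorted by label like x_train.
x = torch.arange(24, dtype=torch.float32).view(6, 4)
y = torch.tensor([0, 0, 1, 1, 2, 2])

# One random ordering of the row indices, applied to both tensors
# so that every sample keeps its own label.
perm = torch.randperm(x.shape[0])
x_shuffled, y_shuffled = x[perm], y[perm]

print(y_shuffled)
```

Applied to the real data, the same two lines would use `x_train` and `y_train` in place of `x` and `y`.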

In [76]:
# log_softmax turns raw class scores into log-probabilities: exponentiating a row gives probabilities that add up to 1.0
def log_softmax(x):
    return x - x.exp().sum(-1).log().unsqueeze(-1)
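
To see what `log_softmax` does, the sketch below applies it to random scores and checks two things: the exponentiated rows sum to 1, and the result agrees with PyTorch's built-in `torch.nn.functional.log_softmax`. The tensor `scores` is made-up data for illustration.

```python
import torch
import torch.nn.functional as F

def log_softmax(x):
    return x - x.exp().sum(-1).log().unsqueeze(-1)

torch.manual_seed(0)
scores = torch.randn(4, 6)  # 4 hypothetical samples, 6 classes

log_probs = log_softmax(scores)
probs = log_probs.exp()  # back from log-probabilities to probabilities

print(probs.sum(-1))  # each row sums to (approximately) 1
print(torch.allclose(log_probs, F.log_softmax(scores, dim=-1), atol=1e-6))
```

The hand-written version is fine for small scores like these, but `x.exp()` can overflow for large inputs, which is one reason to prefer the built-in function in real code.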

In [77]:
def model(x):
    return log_softmax(x @ weights + bias)

In [88]:
def loss_func(preds, targets):
    return -preds[range(targets.shape[0]), targets].mean()
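
The loss above is the negative log-likelihood: for each sample it picks the log-probability the model assigned to the correct class, then averages and flips the sign. A small worked example with made-up predictions (the values in `preds` and `targets` are assumptions for illustration) shows the arithmetic and compares it with PyTorch's `F.nll_loss`.

```python
import torch
import torch.nn.functional as F

def loss_func(preds, targets):
    return -preds[range(targets.shape[0]), targets].mean()

# Hypothetical log-probabilities for 2 samples over 3 classes.
preds = torch.tensor([[0.7, 0.2, 0.1],
                      [0.1, 0.1, 0.8]]).log()
targets = torch.tensor([0, 2])

loss = loss_func(preds, targets)
# By hand: -(log 0.7 + log 0.8) / 2, which is about 0.29
print(loss)
print(torch.isclose(loss, F.nll_loss(preds, targets)))
```

The more probability the model puts on the correct class, the closer the loss gets to 0.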


In [89]:
from IPython.core.debugger import set_trace

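n = x_train.shape[0]  # number of training images
bs = 32  # batch size
lr = 0.5  # learning rate
epochs = 2  # how many epochs to train for

# one output per class (the dataset has 6 classes)
weights = torch.zeros((64 * 64, len(classes)), requires_grad=True)
bias = torch.zeros(len(classes), requires_grad=True)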

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        #         set_trace()
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i]
        yb = y_train[start_i:end_i]
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        with torch.no_grad():
            weights -= weights.grad * lr
            bias -= bias.grad * lr
            weights.grad.zero_()
            bias.grad.zero_()
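
The loop above reports nothing, so a natural next step is to measure how often the model picks the right class. Below is a minimal sketch of an accuracy helper; the function name `accuracy` and the toy tensors are assumptions, not part of the original notebook.

```python
import torch

def accuracy(log_probs, targets):
    """Fraction of rows whose highest-scoring class equals the target."""
    return (log_probs.argmax(dim=1) == targets).float().mean()

# Hypothetical predictions for 4 samples over 3 classes.
log_probs = torch.tensor([[0.8, 0.1, 0.1],
                          [0.2, 0.7, 0.1],
                          [0.3, 0.3, 0.4],
                          [0.6, 0.3, 0.1]]).log()
targets = torch.tensor([0, 1, 2, 1])

print(accuracy(log_probs, targets))  # 3 of 4 predictions are correct
```

In the notebook this could be called as `accuracy(model(x_train), y_train)`. Note that this would measure accuracy on the training data only, since no validation split has been made.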

### A simple Neural Network

This will be the simplest neural network that meets the conditions of the [universal approximation theorem](https://en.wikipedia.org/wiki/Universal_approximation_theorem), so in theory it can approximate any continuous function on a bounded input range to arbitrary accuracy, provided we give it enough parameters.
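
As a rough sketch of what such a network could look like, the code below stacks two linear layers with a ReLU activation in between. This is an assumption about the general shape, not necessarily the network built in this post: the hidden size `n_hidden`, the random initialization scale, and the name `simple_net` are all hypothetical.

```python
import torch

torch.manual_seed(0)

# n_hidden is a hypothetical choice; inputs and classes match this dataset.
n_inputs, n_hidden, n_classes = 64 * 64, 32, 6

w1 = torch.randn(n_inputs, n_hidden) * 0.01
b1 = torch.zeros(n_hidden)
w2 = torch.randn(n_hidden, n_classes) * 0.01
b2 = torch.zeros(n_classes)

def simple_net(x):
    hidden = (x @ w1 + b1).clamp(min=0)  # linear layer followed by ReLU
    return torch.log_softmax(hidden @ w2 + b2, dim=-1)

xb = torch.rand(8, n_inputs)  # a fake batch of 8 flattened images
print(simple_net(xb).shape)  # torch.Size([8, 6])
```

Without the non-linearity in the middle, the two matrix multiplications would collapse into a single linear model, so the activation function is what gives the network its extra expressive power.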