## Image classification with deep learning methods.

-- Description --

When you train the network, it is recommended to use the GPU resources of your computer.
This will help you to learn the "know how" of setting up a working Python environment on your computer.
In the case of unavailable Nvidia hardware or problems with your Python environment you can use Google Colab.
Please go to the menu, Runtime - Change runtime type, and select **GPU** as the hardware accelerator.
Although you used your computer successfuly it is highly recommended to give a try to Google Colab environment.


In [40]:
# Import libraries
# These libraries should be sufficient for this Practice.
# However, if any other library is needed, please install it by yourself.

import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.transforms as transforms
import torch.utils.data as data
import numpy as np
import time
import os
import random
import matplotlib.pyplot as plt
from matplotlib import colors
from PIL import Image
from tqdm import tqdm
from sklearn.metrics import roc_auc_score

# -----------------

import medmnist
from medmnist import INFO, Evaluator
from medmnist import OrganAMNIST
from torch.utils.data import DataLoader, random_split
import torchvision.transforms as transforms


In [13]:
# Install and import the MedMNIST package and datasets.

%pip install medmnist
import medmnist
from medmnist import info

Defaulting to user installation because normal site-packages is not writeable

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.0.1[0m[39;49m -> [0m[32;49m24.0[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [41]:
## Preparing the Dataset
NUM_EPOCHS = 3
batch_size_train = 64
batch_size_test = 1000
learning_rate = 0.01
momentum = 0.5
log_interval = 10

random_seed = 1234
torch.backends.cudnn.enabled = False
torch.manual_seed(random_seed)

<torch._C.Generator at 0x1068af070>

## Download the imaging dataset

You can browse the imaging datasets on their webpage https://medmnist.com/, and download them as such:


In [15]:
# from medmnist import OrganAMNIST
# from torch.utils.data import DataLoader, random_split
# import torchvision.transforms as transforms

# # Definir transformaciones que deseas aplicar a tus datos
# transform = transforms.Compose([
#     transforms.Resize((28, 28)),  # Cambiar el tamaño a 64x64
#     transforms.ToTensor(),         # Convertir la imagen a un tensor
#     transforms.Normalize((0.5,), (0.5,))  # Normalizar las imágenes
# ])

# # Crear instancia del dataset
# dataset = OrganAMNIST(split="test", download=True, transform=transform, size=128)

# # Calcular las longitudes de los conjuntos de entrenamiento y prueba
# total_size = len(dataset)
# train_size = int(0.8 * total_size)
# test_size = total_size - train_size

# # Dividir el dataset en conjuntos de entrenamiento y prueba
# train_dataset, test_dataset = random_split(dataset, [train_size, test_size])

# # Crear DataLoader para manejar los datos de entrenamiento
# train_loader = DataLoader(train_dataset, batch_size=batch_size_train, shuffle=True)

# # Crear DataLoader para manejar los datos de prueba
# test_loader = DataLoader(test_dataset, batch_size=batch_size_test, shuffle=False)

# # Ahora puedes iterar sobre los DataLoaders para obtener los lotes de datos
# for batch in train_loader:
#     # batch contiene un conjunto de imágenes y etiquetas de entrenamiento
#     train_images, train_labels = batch
#     # Aquí puedes realizar operaciones con los datos del lote de entrenamiento

# for batch in test_loader:
#     # batch contiene un conjunto de imágenes y etiquetas de prueba
#     test_images, test_labels = batch
#     # Aquí puedes realizar operaciones con los datos del lote de prueba


In [42]:
data_flag = 'organamnist'
# data_flag = 'breastmnist'
download = True

NUM_EPOCHS = 3
BATCH_SIZE = 128
lr = 0.001

info = INFO[data_flag]
task = info['task']
n_channels = info['n_channels']
n_classes = len(info['label'])

DataClass = getattr(medmnist, info['python_class'])

In [43]:
# preprocessing
data_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[.5], std=[.5])
])

# load the data
train_dataset = DataClass(split='train', transform=data_transform, download=download)
test_dataset = DataClass(split='test', transform=data_transform, download=download)

pil_dataset = DataClass(split='train', download=download)

# encapsulate data into dataloader form
train_loader = data.DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True)
train_loader_at_eval = data.DataLoader(dataset=train_dataset, batch_size=2*BATCH_SIZE, shuffle=False)
test_loader = data.DataLoader(dataset=test_dataset, batch_size=2*BATCH_SIZE, shuffle=False)

Using downloaded and verified file: /Users/cmoro/.medmnist/organamnist.npz
Using downloaded and verified file: /Users/cmoro/.medmnist/organamnist.npz
Using downloaded and verified file: /Users/cmoro/.medmnist/organamnist.npz


## Visualize the imaging dataset

You can find relevant information about the datasets in the info.INFO dictionary.

For visualizing the images, you can use the montage method, though we recomend
you practice accesing the individual images and labels.



#Generate a dataloader

A convinient option for accessing data in torch is with the use of the Dataloader class. These work directly when given a MNIST dataset as input.
You can also apply any necesary preprocesing steps directly as you load the data with the Transforms package and the transform MNIST argument.

Choose apropiate values for the training hiperparameters (you can experiment with sampling strategies if you want) and implement the adecuate preprocesing steps. Finally, choose an Mnist dataset and create the dataloader for the training, validation and test splits.

In [44]:
examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)

print(example_data.shape)

torch.Size([256, 1, 28, 28])


#Create a deep learning model

In [19]:
# # Define a simple CNN model

# class Net(nn.Module):
#     def __init__(self, in_channels, num_classes, im_size):
#         super(Net, self).__init__()
#         # Define the desired deep learning model
#         self.conv1 = nn.Conv2d(in_channels, 16, kernel_size=3, stride=1, padding=1)
#         self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
#         self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
#         self.fc1 = nn.Linear(64 * (im_size // 8) * (im_size // 8), 512)
#         self.fc2 = nn.Linear(512, num_classes)

#     def forward(self, x):
#         # Convolutional layers with ReLU activation and max pooling
#         x = F.relu(self.conv1(x))
#         x = F.max_pool2d(x, 2)
#         x = F.relu(self.conv2(x))
#         x = F.max_pool2d(x, 2)
#         x = F.relu(self.conv3(x))
#         x = F.max_pool2d(x, 2)

#         # Flatten the output for fully connected layers
#         x = x.view(x.size(0), -1)
        
#         # Fully connected layers with ReLU activation
#         x = F.relu(self.fc1(x))
#         x = self.fc2(x)
#         return F.log_softmax(x, dim=1)

# # Define the input channels, number of classes, and image size
# in_channels = 1  # MNIST images are grayscale, hence 1 channel
# num_classes = 10  # Number of classes in MNIST dataset
# im_size = 64  # Resized image size

# # Create an instance of the model
# model = Net(in_channels=in_channels, num_classes=num_classes, im_size=im_size)



In [47]:
# define a simple CNN model

class Net(nn.Module):
    def __init__(self, in_channels, num_classes):
        super(Net, self).__init__()

        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3),
            nn.BatchNorm2d(16),
            nn.ReLU())

        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 16, kernel_size=3),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))

        self.layer3 = nn.Sequential(
            nn.Conv2d(16, 64, kernel_size=3),
            nn.BatchNorm2d(64),
            nn.ReLU())
        
        self.layer4 = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3),
            nn.BatchNorm2d(64),
            nn.ReLU())

        self.layer5 = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))

        self.fc = nn.Sequential(
            nn.Linear(64 * 4 * 4, 128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes))

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.layer5(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = Net(in_channels=n_channels, num_classes=n_classes)
    
# define loss function and optimizer
if task == "multi-label, binary-class":
    criterion = nn.BCEWithLogitsLoss()
else:
    criterion = nn.CrossEntropyLoss()
    
optimizer = optim.SGD(model.parameters(), lr=lr, momentum=0.9)

In [49]:
# train

for epoch in range(NUM_EPOCHS):
    train_correct = 0
    train_total = 0
    test_correct = 0
    test_total = 0
    
    model.train()
    for inputs, targets in tqdm(train_loader):
        # forward + backward + optimize
        optimizer.zero_grad()
        outputs = model(inputs)
        
        if task == 'multi-label, binary-class':
            targets = targets.to(torch.float32)
            loss = criterion(outputs, targets)
        else:
            targets = targets.squeeze().long()
            loss = criterion(outputs, targets)
        
        loss.backward()
        optimizer.step()

100%|█████████▉| 270/271 [00:29<00:00,  9.29it/s]


ValueError: Expected input batch_size (1) to match target batch_size (0).

In [36]:
# evaluation

def test(split):
    model.eval()
    y_true = torch.tensor([])
    y_score = torch.tensor([])
    
    data_loader = train_loader_at_eval if split == 'train' else test_loader

    with torch.no_grad():
        for inputs, targets in data_loader:
            outputs = model(inputs)

            if task == 'multi-label, binary-class':
                targets = targets.to(torch.float32)
                outputs = outputs.softmax(dim=-1)
            else:
                targets = targets.squeeze().long()
                outputs = outputs.softmax(dim=-1)
                targets = targets.float().resize_(len(targets), 1)

            y_true = torch.cat((y_true, targets), 0)
            y_score = torch.cat((y_score, outputs), 0)

        y_true = y_true.numpy()
        y_score = y_score.detach().numpy()
        
        evaluator = Evaluator(data_flag, split)
        metrics = evaluator.evaluate(y_score)
    
        print('%s  auc: %.3f  acc:%.3f' % (split, *metrics))

        
print('==> Evaluating ...')
test('train')
test('test')

#Train Model

Implement the main traning loop to train the deep learning model.
This should include the forward and backward passes. You can find information about how to do this with torch in https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#id14

In [39]:
# # Train the model

# for epoch in range(NUM_EPOCHS):

#     model.train()
#     for inputs, targets in tqdm(train_loader):
#         # forward + backward + optimize

#         #Your code


#Evaluation

Finally, implement the evaluation of the object clasification task. You can implement any metric you want, though the most common are accuracy and AUC (one class against all for the multiclass task). You can use torch.no_grad() for speeding up predictions when no gradients are needed.

How do you compare with the MedMNIST benchmarks?

In [None]:
# Evaluation

# Your code
