# Multi-label classification

![multilabel_intro.png](https://live.staticflickr.com/65535/54927204610_28ff033051.jpg)

*Image generated by the ChatGPT image generation tool.*

## Introduction

Modern photographs often contain multiple objects and complex scenes. The photograph above is an example: we see a man in a jacket, shirt, and tie, a woman in a dress with a veil, a group of people in summer clothes, and the whole scene takes place on a beach, against a backdrop of water and a sunset.

If we were to stick to the classic "one image—one label" approach, we would have to choose just one category: does this photo represent a wedding, a man, a woman, a jacket, a dress, a beach...? In practice, this means an artificial limitation of information.

However, we don't have to limit ourselves to a single label. This is precisely why multi-label classification is used, as it allows assigning multiple categories describing various objects to a single image.
Instead of choosing one label, we can therefore say:
"This image contains instances of the classes: person, jacket, shirt, tie, dress, beach, etc."

## Task

Your task is to define and train a neural network to perform multi-label classification. In this task, the composition of the photo will be simplified compared to the example above. Only clothes will be visible in the photos, and your task is to create a model that can determine whether a given type of clothing is present in the picture.

## Data

In this task, you have at your disposal:
* a training set (6318 samples),
* a validation set (702 samples).

The test set, on which your solution will be finally evaluated, has 780 samples and is not public. It was created in the same way as the validation set, so it has analogous characteristics.

Each sample is an image with dimensions of $168 \times 168$ pixels. Associated with each image is a 10-element vector with values of $0$ and $1$, which determines the presence of a given class in the image. Information about which piece of clothing corresponds to each index in the label vector is provided in the `LABEL_NAMES` dictionary defined in one of the code cells.
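
To make the label format concrete, here is a minimal sketch that turns a label vector into class names. The mapping is copied from the `LABEL_NAMES` dictionary defined later in the notebook; the helper `decode_labels` and the example vector are hypothetical and are not taken from the dataset.

```python
# Copy of the LABEL_NAMES mapping defined later in the notebook.
LABEL_NAMES = {
    0: "T-shirt/top", 1: "Trouser", 2: "Pullover", 3: "Dress", 4: "Coat",
    5: "Sandal", 6: "Shirt", 7: "Sneaker", 8: "Bag", 9: "Ankle boot",
}


def decode_labels(label_vector):
    """Return the names of the classes marked with 1 in a 10-element label vector."""
    return [LABEL_NAMES[i] for i, flag in enumerate(label_vector) if flag == 1]


# Made-up label vector for illustration only.
example_vector = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1]
print(decode_labels(example_vector))  # ['Trouser', 'Coat', 'Ankle boot']
```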

## Evaluation Criterion
The final evaluation of the task will be based on the average value of the $F1$ measure calculated using the *macro* scheme.
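
As an illustration of the *macro* scheme, the sketch below computes $F1$ separately for each class and then takes the unweighted mean. It is a plain-Python approximation written for clarity, not the grader's code (the notebook itself uses `sklearn.metrics.f1_score` with `average="macro"`); the function names and the toy data are assumptions, and a class with no true and no predicted positives is scored as 0 here.

```python
def f1_per_class(y_true, y_pred, class_index):
    """F1 for one class, treating that column as a binary problem."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        t, p = true_row[class_index], pred_row[class_index]
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
    denominator = 2 * tp + fp + fn
    # Assumption: an empty class (no positives at all) contributes 0.
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 values."""
    n_classes = len(y_true[0])
    return sum(f1_per_class(y_true, y_pred, c) for c in range(n_classes)) / n_classes


# Toy example with 3 classes and 4 samples (made-up values).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

# Per-class F1 is 1.0, 0.5 and 2/3, so the macro average is about 0.722.
print(round(macro_f1(y_true, y_pred), 3))  # 0.722
```

Because every class carries the same weight, a rarely occurring class that the model always misses lowers the result as much as a frequent one.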

For this task, you can score between 0 and 100 points. Your final point score for solving the task will be calculated according to the function below (the higher the value, the better), with additional rounding to integer values:
$$
\mathrm{score} =
\begin{cases}
    0 & \text{if } {F1}\leq 0.57 \\
    100 \times \frac{{F1}- 0.57}{0.87 - 0.57} & \text{if } 0.57 < {F1} < 0.87 \\
    100 & \text{if } {F1} \geq 0.87
\end{cases}
$$
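
As a worked example of the formula above, take a hypothetical result of $F1 = 0.75$ (a made-up value, not a reported one):
$$
\mathrm{score} = 100 \times \frac{0.75 - 0.57}{0.87 - 0.57} = 100 \times \frac{0.18}{0.30} = 60
$$
Within the linear range, every additional $0.003$ of $F1$ is therefore worth one point.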

## Limitations

- Your solution will be tested on the Competition Platform in an environment with a GPU. The Platform has no internet access; however, pretrained ResNet models (*ResNet18*, *ResNet34*, *ResNet50*) from the torchvision package are available there as locally stored files. To load them, use the same command in your code as you would when internet access is available.
- The evaluation of your final solution on the test data on the Competition Platform cannot take longer than 2.5 minutes with a GPU.

## Submission files

Submit this notebook supplemented with your solution (model definition, a function for training the model, and a function that returns the model's predictions).

## Evaluation

Remember that during the check, the `FINAL_EVALUATION_MODE` flag will be set to `True`.

For this task, you can score between 0 and 100 points. The number of points you earn will be calculated on a (secret) test set on the Competition Platform based on the aforementioned formula, rounded to the nearest integer. If your solution does not meet the above criteria or does not execute correctly, you will receive 0 points for the task.

## Starting Code
In this section, we initialize the environment by importing the necessary libraries and functions. The prepared code will help you operate on data efficiently and build the right solution.

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

FINAL_EVALUATION_MODE: bool = False  # We will set this flag to True during evaluation.

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

import os

import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as fun
import torchvision

from torch.utils.data import DataLoader, TensorDataset
from torchvision import transforms

from tqdm import tqdm

from sklearn.metrics import f1_score

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

def seed_everything(seed: int):
    """Sets the seed for reproducibility of results in Python, NumPy, and PyTorch."""

    os.environ["PYTHONHASHSEED"] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

# number of classes to be classified
N_CLASSES: int = 10

# mapping a class index to its name
LABEL_NAMES: dict[int, str] = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot"
}

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

if not FINAL_EVALUATION_MODE:
    FILES: list[str] = [
        "runway-mnist/train-x.npz",
        "runway-mnist/train-y.npz",
        "runway-mnist/val-x.npz",
        "runway-mnist/val-y.npz"
    ]

    # download the data again if something is missing
    if not all(os.path.exists(file) for file in FILES):
        import gzip
        import tarfile
        import shutil

        if not os.path.exists("runway-mnist"):
            os.mkdir("runway-mnist")

        COMPRESSED_ARCHIVE = "runway-mnist.tar.gz"
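        # removesuffix drops the exact ".gz" suffix; rstrip would strip any trailing '.', 'g', 'z' characters
        TAR_ARCHIVE  = COMPRESSED_ARCHIVE.removesuffix(".gz")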
        DOWNLOAD_URL = "https://drive.google.com/uc?id=1oNAFYdJyCVe3Po90KLUPxAG9HGuL7NSw"

        try:
            import gdown
        except ImportError as err:
            raise RuntimeError("To download the dataset you need a local installation of the gdown package: `pip install gdown`") from err

        gdown.download(DOWNLOAD_URL, str(COMPRESSED_ARCHIVE), quiet=False)

        with gzip.open(COMPRESSED_ARCHIVE, "rb") as compressed:
            with open(TAR_ARCHIVE, "wb") as archive:
                shutil.copyfileobj(compressed, archive)

        os.remove(COMPRESSED_ARCHIVE)
        print(f"Decompressed: {TAR_ARCHIVE}")

        with tarfile.open(TAR_ARCHIVE, "r") as tar:
            tar.extractall("runway-mnist")

        os.remove(TAR_ARCHIVE)
        print(f"Unpacked: {TAR_ARCHIVE}")

### Loading data

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

SEED: int = 42
seed_everything(SEED)

def load_x(usage) -> torch.Tensor:
    path = f"runway-mnist/{usage}-x.npz"
    return torch.tensor(np.load(path)["images"], dtype=torch.float32).unsqueeze(1)

def load_y(usage) -> torch.Tensor:
    path = f"runway-mnist/{usage}-y.npz"
    return torch.tensor(np.load(path)["labels"], dtype=torch.long)

train_dataset = TensorDataset(load_x("train"), load_y("train"))
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

val_dataset = TensorDataset(load_x("val"), load_y("val"))
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

## Evaluation Function

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

def compute_score(f1: float) -> int:
    """Calculates a score based on the F1 metric value."""
    lower_bound = 0.57
    upper_bound = 0.87

    if f1 <= lower_bound:
        return 0
    elif lower_bound < f1 < upper_bound:
        return int(round(100 * (f1 - lower_bound) / (upper_bound - lower_bound)))
    else:
        return 100

def evaluate_algorithm(model, predict, loader) -> float:
    """Calculates metrics and evaluates your solution based on them. Returns the calculated F1 metric value."""
    preds = []
    labels = []
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            prediction = predict(model, x.to(device)).cpu()
            preds.append(prediction)
            labels.append(y)

    predictions = torch.cat(preds)
    labels = torch.cat(labels)

    f1 = f1_score(labels.numpy(), predictions.numpy(), average="macro")
    points = compute_score(f1)
    print(f"Your F1 score: {f1:.3f}, which gives {points} points.")
    return f1

## Example solution

Below we present a simplified solution that demonstrates the basic functionality
of the notebook. It can serve as a starting point for developing your solution.
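
For orientation, a trainable starting point could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `TinyCNN` architecture, the helper names, the learning rate, the 0.5 threshold, and the random stand-in data. It is not the reference solution. The one detail that follows from the notebook itself is that the labels are loaded as `torch.long`, so they must be cast to float before being passed to `nn.BCEWithLogitsLoss`.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

N_CLASSES = 10  # same value as in the notebook


class TinyCNN(nn.Module):
    """Hypothetical small network that outputs one raw logit per class."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, N_CLASSES)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))


def train_one_epoch(model, loader, optimizer, device) -> float:
    """Run one pass over the loader and return the mean loss per sample."""
    criterion = nn.BCEWithLogitsLoss()  # independent sigmoid per class
    model.train()
    total_loss = 0.0
    for x, y in loader:
        x, y = x.to(device), y.to(device).float()  # labels are long, loss needs float
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * x.size(0)
    return total_loss / len(loader.dataset)


def predict_with_threshold(model, x, threshold: float = 0.5) -> torch.Tensor:
    """Turn logits into 0/1 predictions; the threshold value is an assumption."""
    return (torch.sigmoid(model(x)) >= threshold).to(torch.long)


# Demo on random tensors standing in for the real data.
torch.manual_seed(0)
demo_x = torch.rand(8, 1, 168, 168)
demo_y = torch.randint(0, 2, (8, N_CLASSES))
demo_loader = DataLoader(TensorDataset(demo_x, demo_y), batch_size=4)

demo_model = TinyCNN()
demo_optimizer = torch.optim.Adam(demo_model.parameters(), lr=1e-3)
demo_loss = train_one_epoch(demo_model, demo_loader, demo_optimizer, torch.device("cpu"))

demo_model.eval()
with torch.no_grad():
    demo_preds = predict_with_threshold(demo_model, demo_x)
print(demo_preds.shape)  # torch.Size([8, 10])
```

Keeping the thresholding in the prediction function rather than in `forward` leaves the model free to output logits during training while the evaluation code still receives 0/1 vectors.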

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################

class NaiveSolution(nn.Module):
    """Naive solution."""

    def __init__(self):
        super().__init__()

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        """This naive model predicts that all classes are in the image."""
        BATCH_SIZE = input.size(0)
        return torch.Tensor([1] * BATCH_SIZE * N_CLASSES).reshape(BATCH_SIZE, -1)

def train_naive(_: NaiveSolution):
    """The naive model requires no training."""
    pass

def predict_naive(model: NaiveSolution, input: torch.Tensor) -> torch.Tensor:
    """The naive model returns final predictions directly, so no additional post-processing is needed."""
    return model(input).to(torch.long)

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING ##########################
if not FINAL_EVALUATION_MODE:
    naive_solution = NaiveSolution().to(device)
    naive_solution.train()
    train_naive(naive_solution)
    evaluate_algorithm(naive_solution, predict_naive, val_loader)

## Your solution
Place your solution in the cell below. Make changes only here!

In [None]:
class Solution(nn.Module):
    """Your solution."""

    def __init__(self):
        super().__init__()

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        """Model inference"""
        BATCH_SIZE = input.size(0)
        return torch.rand(BATCH_SIZE * N_CLASSES).reshape(BATCH_SIZE, -1)

def train_solution(_: Solution):
    """Training loop for your model."""
    pass

def predict_solution(model: Solution, input: torch.Tensor) -> torch.Tensor:
    """Classification using a model.
    The purpose of separating this function is to enable simple post-processing of model results."""
    predictions = model(input).round().to(torch.long)
    return predictions

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING #########################

solution = Solution().to(device)
solution.train()

train_solution(solution)

## Evaluation

The following code will be used to evaluate the solution. After you submit your solution, `evaluate_algorithm` will be called with `solution`, `predict_solution`, and a loader for the test set; that is, code almost identical to the cell below will be run on a test set available only to the graders.

Before submitting, make sure that the entire notebook runs from start to finish without errors and without user intervention after executing the `Run All` command.

In [None]:
######################### DO NOT CHANGE THIS CELL WHEN SENDING #########################

if not FINAL_EVALUATION_MODE:
    evaluate_algorithm(solution, predict_solution, val_loader)