# Classification task

Hi! This is a baseline notebook for the classification task.
It includes a data reader, a baseline model, and a submission generator.

You should use a GPU to train your model, so we recommend using [Kaggle Notebooks](https://www.kaggle.com/docs/notebooks).
To get the maximum score for the task, your model should reach an accuracy greater than `0.8`.

You can use anything that complies with the rules in `README.md`.

In [1]:
from pathlib import Path
from datetime import datetime
import numpy as np

import torch
import torch.nn as nn
from torch.utils import data

import catalyst
from catalyst import dl
from catalyst.utils import metrics, set_global_seed, get_available_gpus
from catalyst.utils import (
    create_dataset, create_dataframe, get_dataset_labeling, map_dataframe
)
from catalyst.utils import split_dataframe_train_test
from catalyst.data.cv.reader import ImageReader
from catalyst.dl import utils
from catalyst.data import ScalarReader, ReaderCompose
from catalyst.data import Augmentor
import albumentations as albu
from albumentations.pytorch import ToTensorV2 as ToTensor

from catalyst.dl import SupervisedRunner

import pandas as pd
from PIL import Image
from tqdm.notebook import tqdm

from GaborNet import GaborConv2d

In [2]:
set_global_seed(42)
get_available_gpus()

[0]

## Dataset

The code below helps you build the dataset, provided your data has the following folder structure:

```
dataset/
    class_1/
        *.ext
        ...
    class_2/
        *.ext
        ...
    ...
    class_N/
        *.ext
        ...
```
First, the `create_dataset` function walks the given directory and builds a dictionary `Dict[class_name, List[image]]`.
Then `create_dataframe` turns it into a regular `pandas.DataFrame` for further analysis.
After that, `get_dataset_labeling` assigns a numerical label to each unique class name.
Finally, `map_dataframe` adds a column with the numerical label to the DataFrame.

Additionally, let's save `class_names` for later use.
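
As a rough illustration of the four steps above, here is a standard-library sketch that mimics them on a temporary folder. It is not Catalyst's implementation: the helper names `scan_class_folders`, `build_labeling` and `to_records` are hypothetical, and assigning labels in sorted order of class name is an assumption.

```python
import tempfile
from pathlib import Path


def scan_class_folders(root, pattern="*.jpg"):
    """Map each class folder name to the sorted list of files matching `pattern`."""
    return {
        folder.name: sorted(str(p) for p in folder.glob(pattern))
        for folder in sorted(Path(root).iterdir())
        if folder.is_dir()
    }


def build_labeling(class_names):
    """Assign an integer label to every unique class name (assumed: sorted order)."""
    return {name: label for label, name in enumerate(sorted(set(class_names)))}


def to_records(dataset, tag_to_label):
    """Flatten the dictionary into rows with class, filepath and label fields."""
    return [
        {"class": name, "filepath": path, "label": tag_to_label[name]}
        for name, paths in dataset.items()
        for path in paths
    ]


# Build a tiny throwaway dataset so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for class_name, count in [("cats", 2), ("dogs", 1)]:
        folder = Path(tmp) / class_name
        folder.mkdir()
        for i in range(count):
            (folder / f"{i}.jpg").touch()

    demo_dataset = scan_class_folders(tmp)
    demo_labeling = build_labeling(demo_dataset)
    demo_records = to_records(demo_dataset, demo_labeling)
    # Class names ordered by label, so that a label can be used as a list index.
    demo_class_names = [
        name for name, _ in sorted(demo_labeling.items(), key=lambda x: x[1])
    ]

print(demo_labeling)
print(len(demo_records))
```

Keeping the class names ordered by label is what later lets a predicted class index be turned back into a class name with plain list indexing.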

In [3]:
path_to_dataset = str(Path.cwd().parent/"data2")

In [4]:
dataset = create_dataset(dirs=f"{path_to_dataset}/train/*", extension="*.jpg")
df = create_dataframe(dataset, columns=["class", "filepath"])

tag_to_label = get_dataset_labeling(df, "class")
class_names = [
    name for name, id_ in sorted(tag_to_label.items(), key=lambda x: x[1])
]

df_with_labels = map_dataframe(
    df, 
    tag_column="class", 
    class_column="label", 
    tag2class=tag_to_label, 
    verbose=False
)
df_with_labels.head()

| | class | filepath | label |
|---|---|---|---|
| 0 | c1d6f6c4 | /home/analekseev/PycharmProjects/dl-course/dat... | 0 |
| 1 | c1d6f6c4 | /home/analekseev/PycharmProjects/dl-course/dat... | 0 |
| 2 | c1d6f6c4 | /home/analekseev/PycharmProjects/dl-course/dat... | 0 |
| 3 | c1d6f6c4 | /home/analekseev/PycharmProjects/dl-course/dat... | 0 |
| 4 | c1d6f6c4 | /home/analekseev/PycharmProjects/dl-course/dat... | 0 |


Next, split the data into `train / valid / test` parts.
The split below produces only the `train` and `valid` parts, so the test data must be loaded separately, as shown in the submission code cell.

In [5]:
train_data, valid_data = split_dataframe_train_test(
    df_with_labels, test_size=0.2, random_state=42
)
train_data, valid_data = (
    train_data.to_dict("records"),
    valid_data.to_dict("records"),
)
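
The idea behind the cell above is a seeded hold-out split. A minimal standard-library sketch of that idea follows; `split_records` is a hypothetical helper, not the Catalyst function, and it will not reproduce the exact partition produced by `split_dataframe_train_test`.

```python
import random


def split_records(records, test_size=0.2, random_state=42):
    """Shuffle a copy of `records` with a fixed seed and cut off a validation part."""
    shuffled = list(records)
    random.Random(random_state).shuffle(shuffled)
    n_valid = int(round(len(shuffled) * test_size))
    # The first `n_valid` shuffled rows become validation, the rest training.
    return shuffled[n_valid:], shuffled[:n_valid]


# Made-up rows that only imitate the structure of the real records.
demo_rows = [{"filepath": f"img_{i}.jpg", "label": i % 10} for i in range(100)]
demo_train, demo_valid = split_records(demo_rows)
print(len(demo_train), len(demo_valid))
```

Fixing the seed keeps the validation set the same between runs, so validation scores of different models stay comparable.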

In [6]:
num_classes = len(tag_to_label)

open_fn = ReaderCompose(
    [
        ImageReader(
            input_key="filepath", output_key="features", rootpath="train"
        ),
        ScalarReader(
            input_key="label",
            output_key="targets",
            default_value=-1,
            dtype=np.int64,
        ),
        ScalarReader(
            input_key="label",
            output_key="targets_one_hot",
            default_value=-1,
            dtype=np.int64,
            one_hot_classes=num_classes,
        ),
    ]
)

## Augmentation

Use augmentations to increase the variety of images the model sees during training.

In [7]:
IMAGE_SIZE = 224

train_transform = albu.Compose([
    albu.HorizontalFlip(p=0.25),
    albu.VerticalFlip(p=0.25),
    albu.RandomRotate90(p=0.25),
    albu.RandomBrightnessContrast(p=0.25),
    albu.LongestMaxSize(IMAGE_SIZE),
    albu.PadIfNeeded(IMAGE_SIZE, IMAGE_SIZE, border_mode=0),
    albu.RandomResizedCrop(IMAGE_SIZE, IMAGE_SIZE, p=0.5),
    albu.Normalize(),
    ToTensor()
])

valid_transform = albu.Compose([
    albu.LongestMaxSize(IMAGE_SIZE),
    albu.PadIfNeeded(IMAGE_SIZE, IMAGE_SIZE, border_mode=0),
    albu.RandomResizedCrop(IMAGE_SIZE, IMAGE_SIZE, p=0.25),
    albu.Normalize(),
    ToTensor()
])
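
Both pipelines first bring every image to a square `IMAGE_SIZE` canvas with `LongestMaxSize` followed by `PadIfNeeded`. The sketch below works through that geometry in plain Python. The helper names are hypothetical, and the rounding and the centered placement are assumptions, so the real library may differ by a pixel.

```python
def longest_max_size_shape(height, width, max_size):
    """Scale so the longest side equals `max_size`, keeping the aspect ratio."""
    scale = max_size / max(height, width)
    return int(round(height * scale)), int(round(width * scale))


def centered_padding(size, target):
    """Split the missing pixels between the two sides (extra pixel goes last)."""
    missing = max(target - size, 0)
    before = missing // 2
    return before, missing - before


def resize_and_pad_plan(height, width, image_size=224):
    """Describe the resize result and the padding needed to reach a square."""
    new_h, new_w = longest_max_size_shape(height, width, image_size)
    return {
        "resized": (new_h, new_w),
        "pad_top_bottom": centered_padding(new_h, image_size),
        "pad_left_right": centered_padding(new_w, image_size),
    }


print(resize_and_pad_plan(300, 600))  # landscape image
print(resize_and_pad_plan(640, 480))  # portrait image
```

Resizing by the longest side and then padding keeps the aspect ratio intact, at the cost of constant-colored borders on the shorter side.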


In [8]:
train_data_transform = Augmentor(
    dict_key="features", augment_fn=lambda x: train_transform(image=x)["image"]
)

valid_data_transform = Augmentor(
    dict_key="features", augment_fn=lambda x: valid_transform(image=x)["image"]
)


Don't forget to create a test loader.

In [9]:
batch_size = 128
num_workers = 4

train_loader = utils.get_loader(
    train_data,
    open_fn=open_fn,
    dict_transform=train_data_transform,
    batch_size=batch_size,
    num_workers=num_workers,
    shuffle=True,
    sampler=None,
    drop_last=True,
)

valid_loader = utils.get_loader(
    valid_data,
    open_fn=open_fn,
    dict_transform=valid_data_transform,
    batch_size=batch_size,
    num_workers=num_workers,
    shuffle=False, 
    sampler=None,
    drop_last=True,
)

loaders = {
    "train": train_loader,
    "valid": valid_loader
}

## Model

For the baseline, we use a ResNet-style model that we have already examined in the seminar.
Improve the model with any* tools or modules you like.

*(Don't forget about the rules!)

In [10]:
class ResNetBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride, p=0.1):
        super().__init__()

        self.input = nn.Sequential(
            nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size=3,
                stride=stride,
                padding=1,
            ),
            nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(),
            nn.Conv2d(
                out_channels, out_channels, kernel_size=3, stride=1, padding=1
            ),
            nn.BatchNorm2d(out_channels),
        )
        self.res = nn.Conv2d(
            in_channels, out_channels, kernel_size=1, stride=stride
        )
        self.output = nn.Sequential(
            nn.BatchNorm2d(out_channels), 
            nn.LeakyReLU()
        )

    def forward(self, x):
        input_ = self.input(x)
        res = self.res(x)
        return self.output(res + input_)


class BaselineModel(nn.Module):
    def __init__(self, channels=3, in_features=64, num_classes=10, p=0.1):
        super().__init__()
        
        self.input = nn.Sequential(
            GaborConv2d(
                channels, in_features, kernel_size=17, stride=2, padding=8,
            ),
            nn.BatchNorm2d(in_features),
            nn.LeakyReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )

        self.layer_0 = self._make_layer(in_features, 1)
        self.layer_1 = self._make_layer(in_features)
        in_features *= 2
        
        self.layer_2 = self._make_layer(in_features, 1)
        self.layer_3 = self._make_layer(in_features)
        in_features *= 2
        
        self.layer_4 = self._make_layer(in_features, 1)

        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d((1, 1)),
            nn.Flatten(),
            nn.Linear(in_features, num_classes),
        )

    def _make_layer(self, in_features, multiplier=2, p=0.1):
        return nn.Sequential(
            ResNetBlock(in_features, in_features * multiplier, stride=2, p=p),
            ResNetBlock(
                in_features * multiplier,
                in_features * multiplier,
                stride=1,
                p=p,
            ),
        )

    def forward(self, x):
        x = self.input(x)
        x = self.layer_0(x)
        x = self.layer_1(x)
        x = self.layer_2(x)
        x = self.layer_3(x)
        x = self.layer_4(x)
        return self.fc(x)
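
To see how the feature map shrinks inside `BaselineModel`, the standard convolution size formula can be applied layer by layer. This is a sketch in plain Python, assuming that `GaborConv2d` follows the same output-size arithmetic as `nn.Conv2d`; the helpers `conv_out` and `trace_baseline` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_baseline(size=224, in_features=64):
    """Follow the spatial size and channel count through BaselineModel."""
    trace = []
    size = conv_out(size, kernel=17, stride=2, padding=8)  # GaborConv2d stem
    size = conv_out(size, kernel=3, stride=2, padding=0)  # MaxPool2d
    channels = in_features
    trace.append(("input", channels, size))
    # Multipliers passed to _make_layer for layer_0 .. layer_4.
    for index, multiplier in enumerate([1, 2, 1, 2, 1]):
        main = conv_out(size, kernel=3, stride=2, padding=1)  # 3x3 branch
        skip = conv_out(size, kernel=1, stride=2, padding=0)  # 1x1 shortcut
        assert main == skip, "both branches must agree before they are added"
        size, channels = main, channels * multiplier
        trace.append((f"layer_{index}", channels, size))
    return trace


for name, channels, size in trace_baseline():
    print(f"{name}: {channels} channels, {size}x{size}")
```

For a 224x224 input the trace ends with 256 channels on a 2x2 map, which matches the `nn.Linear(in_features, num_classes)` head after global average pooling.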

In [11]:
class ClassificationRunner(SupervisedRunner):
    def predict_batch(self, batch):
        prediction = {
            "filepath": batch["filepath"],
            "log_probs": self.model(batch[self.input_key].to(self.device))
        }
        return prediction

In [12]:
model = BaselineModel(num_classes=num_classes)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

runner = ClassificationRunner(input_key="features", input_target_key="targets")
runner.train(
    model=model,
    optimizer=optimizer,
    criterion=criterion,
    loaders=loaders,
    logdir=Path("logs") / datetime.now().strftime("%Y%m%d-%H%M%S"),
    num_epochs=100,
    verbose=True,
    load_best_on_end=True,
    callbacks={
        "optimizer": dl.OptimizerCallback(
            metric_key="loss", accumulation_steps=1, grad_clip_params=None,
        ),
        "criterion": dl.CriterionCallback(
            input_key="targets", output_key="logits", prefix="loss",
        ),
        "accuracy": dl.AccuracyCallback(num_classes=num_classes),
    },
)

1/100 * Epoch (train):   0% 0/56 [00:00<?, ?it/s]


This overload of add is deprecated:
	add(Number alpha, Tensor other)
Consider using one of the following signatures instead:
	add(Tensor other, *, Number alpha) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)



1/100 * Epoch (train): 100% 56/56 [00:32<00:00,  1.74it/s, accuracy01=0.359, accuracy03=0.664, accuracy05=0.820, loss=1.876]
1/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.82it/s, accuracy01=0.320, accuracy03=0.633, accuracy05=0.789, loss=1.953]
[2020-10-17 17:37:55,659] 
1/100 * Epoch 1 (_base): lr=0.0003 | momentum=0.9000
1/100 * Epoch 1 (train): accuracy01=0.2363 | accuracy03=0.5113 | accuracy05=0.7172 | loss=2.0883
1/100 * Epoch 1 (valid): accuracy01=0.2746 | accuracy03=0.6038 | accuracy05=0.7818 | loss=2.1236
2/100 * Epoch (train): 100% 56/56 [00:32<00:00,  1.72it/s, accuracy01=0.500, accuracy03=0.734, accuracy05=0.914, loss=1.586]
2/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.73it/s, accuracy01=0.414, accuracy03=0.719, accuracy05=0.875, loss=1.690]
[2020-10-17 17:38:32,314] 
2/100 * Epoch 2 (_base): lr=0.0003 | momentum=0.9000
2/100 * Epoch 2 (train): accuracy01=0.3892 | accuracy03=0.7181 | accuracy05=0.8641 | loss=1.7206
2/100 * Epoch 2 (valid): accuracy01=0.4174 | accu

17/100 * Epoch (train): 100% 56/56 [00:34<00:00,  1.62it/s, accuracy01=0.656, accuracy03=0.883, accuracy05=0.969, loss=1.114]
17/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.61it/s, accuracy01=0.641, accuracy03=0.859, accuracy05=0.938, loss=1.072]
[2020-10-17 17:48:13,669] 
17/100 * Epoch 17 (_base): lr=0.0003 | momentum=0.9000
17/100 * Epoch 17 (train): accuracy01=0.7069 | accuracy03=0.9030 | accuracy05=0.9584 | loss=0.9094
17/100 * Epoch 17 (valid): accuracy01=0.6440 | accuracy03=0.8689 | accuracy05=0.9492 | loss=1.1087
18/100 * Epoch (train): 100% 56/56 [00:33<00:00,  1.69it/s, accuracy01=0.695, accuracy03=0.898, accuracy05=0.961, loss=0.865]
18/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.64it/s, accuracy01=0.734, accuracy03=0.891, accuracy05=0.945, loss=0.975]
[2020-10-17 17:48:51,162] 
18/100 * Epoch 18 (_base): lr=0.0003 | momentum=0.9000
18/100 * Epoch 18 (train): accuracy01=0.6999 | accuracy03=0.9065 | accuracy05=0.9618 | loss=0.9113
18/100 * Epoch 18 (valid): accuracy

48/100 * Epoch (train): 100% 56/56 [00:32<00:00,  1.70it/s, accuracy01=0.898, accuracy03=0.969, accuracy05=0.992, loss=0.415]
48/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.68it/s, accuracy01=0.703, accuracy03=0.914, accuracy05=0.961, loss=0.929]
[2020-10-17 18:07:43,687] 
48/100 * Epoch 48 (_base): lr=0.0003 | momentum=0.9000
48/100 * Epoch 48 (train): accuracy01=0.8258 | accuracy03=0.9516 | accuracy05=0.9806 | loss=0.5395
48/100 * Epoch 48 (valid): accuracy01=0.6970 | accuracy03=0.8923 | accuracy05=0.9520 | loss=1.0760
49/100 * Epoch (train): 100% 56/56 [00:33<00:00,  1.70it/s, accuracy01=0.805, accuracy03=0.977, accuracy05=0.992, loss=0.518]
49/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.75it/s, accuracy01=0.766, accuracy03=0.945, accuracy05=0.969, loss=0.733]
[2020-10-17 18:08:20,864] 
49/100 * Epoch 49 (_base): lr=0.0003 | momentum=0.9000
49/100 * Epoch 49 (train): accuracy01=0.8199 | accuracy03=0.9530 | accuracy05=0.9816 | loss=0.5471
49/100 * Epoch 49 (valid): accuracy

79/100 * Epoch (train): 100% 56/56 [00:32<00:00,  1.70it/s, accuracy01=0.867, accuracy03=0.977, accuracy05=1.000, loss=0.388]
79/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.79it/s, accuracy01=0.766, accuracy03=0.922, accuracy05=0.953, loss=0.810]
[2020-10-17 18:27:03,191] 
79/100 * Epoch 79 (_base): lr=0.0003 | momentum=0.9000
79/100 * Epoch 79 (train): accuracy01=0.8680 | accuracy03=0.9637 | accuracy05=0.9856 | loss=0.4148
79/100 * Epoch 79 (valid): accuracy01=0.8013 | accuracy03=0.9358 | accuracy05=0.9699 | loss=0.6727
80/100 * Epoch (train): 100% 56/56 [00:32<00:00,  1.70it/s, accuracy01=0.906, accuracy03=0.969, accuracy05=0.992, loss=0.311]
80/100 * Epoch (valid): 100% 14/14 [00:03<00:00,  3.81it/s, accuracy01=0.773, accuracy03=0.914, accuracy05=0.953, loss=0.676]
[2020-10-17 18:27:40,483] 
80/100 * Epoch 80 (_base): lr=0.0003 | momentum=0.9000
80/100 * Epoch 80 (train): accuracy01=0.8723 | accuracy03=0.9626 | accuracy05=0.9872 | loss=0.4019
80/100 * Epoch 80 (valid): accuracy
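
The `accuracy01`, `accuracy03` and `accuracy05` columns in the log read as top-1, top-3 and top-5 accuracy. That reading is an assumption based on the metric names, and the helper below is a hypothetical plain-Python sketch of the metric, not the callback's code.

```python
def topk_accuracy(scores, targets, ks=(1, 3, 5)):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = {k: 0 for k in ks}
    for row, target in zip(scores, targets):
        # Class indices ordered from the highest score to the lowest.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        for k in ks:
            hits[k] += target in ranked[:k]
    return {f"accuracy{k:02d}": hits[k] / len(targets) for k in ks}


# Made-up scores for three samples and six classes.
demo_scores = [
    [0.1, 0.7, 0.2, 0.0, 0.0, 0.0],
    [0.5, 0.1, 0.3, 0.05, 0.05, 0.0],
    [0.05, 0.1, 0.15, 0.2, 0.2, 0.3],
]
demo_targets = [1, 2, 0]
print(topk_accuracy(demo_scores, demo_targets))
```

Under this reading, the `0.8` target from the task description most plausibly refers to `accuracy01`, since top-1 is the strictest of the three.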

The code below generates a submission.
It reads images from the `test` folder and gathers predictions from the trained model.
Check your submission before uploading it to Kaggle.

In [13]:
submission = {"Id": [], "Category": []}
model.eval()

test_dataset = create_dataset(dirs=f"{path_to_dataset}/test", extension="*.jpg")
test_data = list({"filepath": filepath} for filepath in test_dataset["test"])

test_open_fn = ReaderCompose(
    [
        ImageReader(
            input_key="filepath", output_key="features", rootpath=""
        ),
        ScalarReader(
            input_key="filepath",
            output_key="filepath",
            default_value="",
            dtype=str,
        ),
    ]
)

test_loader = utils.get_loader(
    test_data,
    open_fn=test_open_fn,
    dict_transform=valid_data_transform,
    batch_size=batch_size,
    num_workers=num_workers,
    shuffle=False,
    sampler=None,
    drop_last=False,
)

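for prediction in runner.predict_loader(loader=test_loader):
    # "log_probs" holds the raw model outputs; argmax picks the predicted class index.
    predicted = prediction["log_probs"].argmax(dim=1).tolist()
    prediction["labels"] = [class_names[c] for c in predicted]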
    submission["Id"].extend(Path(f).stem for f in prediction["filepath"])
    submission["Category"].extend(prediction["labels"])

In [14]:
pd.DataFrame(submission).to_csv(f"sub_cls_{datetime.now().strftime('%Y%m%d-%H%M%S')}.csv", index=False)
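
Before uploading, a quick sanity check of the saved file can catch obvious mistakes. The sketch below uses only the standard library; `check_submission` is a hypothetical helper, and the checks are assumptions derived from the `Id` and `Category` columns written above, not the competition's official validation rules.

```python
import csv
import tempfile
from pathlib import Path


def check_submission(csv_path, allowed_categories):
    """Return a list of problems found in a submission file (empty if none)."""
    problems = []
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != ["Id", "Category"]:
            problems.append(f"unexpected header: {reader.fieldnames}")
            return problems
        rows = list(reader)
    ids = [row["Id"] for row in rows]
    if len(ids) != len(set(ids)):
        problems.append("duplicate Id values")
    unknown = {row["Category"] for row in rows} - set(allowed_categories)
    if unknown:
        problems.append(f"unknown categories: {sorted(unknown)}")
    if not rows:
        problems.append("no rows")
    return problems


# Demo on a throwaway file with made-up ids and class names.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_submission.csv"
    demo_path.write_text("Id,Category\nimg_001,cats\nimg_002,dogs\nimg_002,birds\n")
    demo_problems = check_submission(demo_path, allowed_categories=["cats", "dogs"])

print(demo_problems)
```

In the notebook, the same helper could be called with the path of the saved CSV and `class_names` as the allowed categories.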