### Overview of the methods I tried
1. My first solution was fine-tuning only the linear layer of efficientnet_b0 (Public LB 0.93).
2. Next, I tried various sharpness or blur metrics (the ROC AUC obtained by applying each one to the train set, in the same way as in the baseline, is given in parentheses):

- edge-based statistics from the Sobel, Canny, and Roberts filters (from skimage.filters import laplace, sobel, roberts)
- Laplacian variance (0.93)
- tenengrad, modified_laplacian, focused_sum
- Haar Wavelet Transform (0.88) - https://github.com/pedrofrodenas/blur-Detection-Haar-Wavelet
- DCT algorithm https://github.com/linghu8812/blur_detector
- Singular Value Decomposition (SVD) https://github.com/fled/blur_detection
- skimage.measure.blur_effect (**0.966**)
- cpbd (0.93) https://github.com/daniilkk/python-cpbd
- No-reference blurred image quality assessment by structural similarity index (0.916) https://github.com/ISipi/nssim_implementation
- Sharpness Estimation for Document and Scene Images (**0.97 !**) https://github.com/umang-singhal/pydom

Selecting the best of them and feeding them to catboost gave Public LB 0.9797.

3. I went back to efficientnet_b0, but trained it together with the best metrics from the previous step (Public LB 0.992).
4. I split each image into 5 parts and averaged the model's predictions (Public LB 0.987).
5. In the end I dropped the metrics and the image splitting and trained the whole efficientnet_b0 end to end; that is the solution presented in this notebook (Public LB 0.996).
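
To make the idea behind the "Laplacian variance" metric from step 2 concrete, here is a minimal pure-Python sketch. It is not the skimage-based code used for the scores above: the helper names `laplacian_variance` and `box_blur` and the synthetic checkerboard image are assumptions made only for illustration. A sharp image has strong second derivatives at edges, so the variance of its Laplacian response is high; blurring flattens those responses.

```python
from statistics import pvariance

def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `img` is a 2D grayscale image given as a list of rows of floats.
    """
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)

def box_blur(img):
    """3x3 mean filter on interior pixels; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ) / 9
    return out

# Synthetic "sharp" image: a 16x16 checkerboard made of 4-pixel squares
sharp = [
    [255.0 if ((x // 4) + (y // 4)) % 2 == 0 else 0.0 for x in range(16)]
    for y in range(16)
]
blurred = box_blur(sharp)

sharp_score = laplacian_variance(sharp)
blurred_score = laplacian_variance(blurred)
print(sharp_score > blurred_score)  # the blurred copy scores lower
```

Because a lower score means a blurrier image, a score like this would have to be negated (or otherwise inverted) before being evaluated with ROC AUC against a `blur` label where 1 means blurred.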

I chose efficientnet_b0 because we need good feature extraction but not the most powerful network, to limit overfitting: this task is much simpler than the one efficientnet_b0 was originally trained on.
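
Step 4 in the list above (split each image into 5 parts and average the predictions) can be sketched as follows. The text does not say how the 5 parts were chosen, so this sketch assumes four corner crops plus a centre crop; `five_crop_boxes`, `predict_averaged`, and `fake_model` are hypothetical names, and `fake_model` is a toy stand-in for a real network.

```python
def five_crop_boxes(width, height, crop):
    """Return (left, top, right, bottom) boxes: four corners plus the centre."""
    if crop > width or crop > height:
        raise ValueError("crop size must not exceed the image size")
    cx, cy = (width - crop) // 2, (height - crop) // 2
    origins = [
        (0, 0),                            # top-left
        (width - crop, 0),                 # top-right
        (0, height - crop),                # bottom-left
        (width - crop, height - crop),     # bottom-right
        (cx, cy),                          # centre
    ]
    return [(x, y, x + crop, y + crop) for x, y in origins]

def predict_averaged(predict_fn, width, height, crop=224):
    """Average a per-crop predictor over the five boxes."""
    boxes = five_crop_boxes(width, height, crop)
    return sum(predict_fn(box) for box in boxes) / len(boxes)

# Toy stand-in for the model: the "probability" grows with the left coordinate
fake_model = lambda box: box[0] / 1000

boxes = five_crop_boxes(640, 480, 224)
print(boxes)
print(predict_averaged(fake_model, 640, 480))
```

With a real model, `predict_fn` would crop the image to the box, apply the validation transform, and return the sigmoid of the logit.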

In [1]:
import sys
import numpy as np
import pandas as pd

import os
import gc
import matplotlib.pyplot as plt
import importlib
import pickle

from tqdm.notebook import tqdm

pd.set_option('display.max_rows', 200)
pd.set_option("max_colwidth", 45)
pd.set_option("display.precision", 1)
pd.options.display.float_format = "{:.3f}".format
# pd.set_option("display.max_rows", 5)
# pd.reset_option("display.max_rows")

from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold

# from pandarallel import pandarallel
# pandarallel.initialize(progress_bar=True)

SEED = 34
N_CPU = os.cpu_count()

In [2]:
import cv2
import random
from torch.utils.data import DataLoader,Dataset
from typing import Union, Any, Optional, Tuple, Dict, List
import torchvision.transforms as T
import torchvision
from PIL import Image

import torch
from torch import nn, Tensor
from torch.nn.modules.loss import BCEWithLogitsLoss
from torchmetrics.classification import BinaryAUROC
import torch.optim as optim

torch.__version__, torchvision.__version__

('1.11.0', '0.12.0')

In [3]:
dir_data = '/kaggle/input/shift-cv-winter-2023/'

DIR_TRAIN = dir_data + 'train/train/'
DIR_TEST = dir_data + 'test/test/'

df_train = pd.read_csv(dir_data + 'train.csv')
df_test = pd.read_csv(dir_data + 'sample_submission.csv')

df_train['filename'] = DIR_TRAIN + df_train['filename']
df_test['filename'] = DIR_TEST + df_test['filename']
df_train[:3]

Unnamed: 0,filename,blur
0,/kaggle/input/shift-cv-winter-2023/train/...,0.0
1,/kaggle/input/shift-cv-winter-2023/train/...,0.0
2,/kaggle/input/shift-cv-winter-2023/train/...,0.0


In [4]:
df_train['blur'].value_counts()

0.000    1367
1.000    1297
Name: blur, dtype: int64

In [6]:
def add_folds(df:pd.DataFrame, n_folds:int=4, seed:int=34) -> pd.DataFrame:

    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)

    df['fold'] = -1
    for fold, (trn_, val_) in enumerate(skf.split(df,df['blur'])):
        df.loc[val_,'fold'] = fold

    return df
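
For intuition about what stratification buys here, the following pure-Python sketch deals the indices of each class round-robin into folds. It is a simplified stand-in for `StratifiedKFold`, not its actual algorithm, and `stratified_fold_ids` is a name invented for this example. The class counts are the ones shown by `value_counts()` above.

```python
import random
from collections import Counter

def stratified_fold_ids(labels, n_folds=4, seed=34):
    """Assign a fold id to every sample so that each fold keeps the class ratio.

    Simplified idea: shuffle the indices of each class separately,
    then deal them out to the folds round-robin.
    """
    rng = random.Random(seed)
    fold_of = [-1] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % n_folds
    return fold_of

# Same class counts as in the train set: 1367 with blur == 0, 1297 with blur == 1
labels = [0] * 1367 + [1] * 1297
folds = stratified_fold_ids(labels)

# Count (fold, label) pairs to check the balance of every fold
counts = Counter(zip(folds, labels))
for fold in range(4):
    print(fold, counts[(fold, 0)], counts[(fold, 1)])
```

Every fold ends up with 341 or 342 samples of class 0 and 324 or 325 of class 1, so the validation ROC AUC of each fold is computed on nearly the same class mix.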

In [7]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

def set_seed(seed: int = 34) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    os.environ["PYTHONHASHSEED"] = str(seed)
    print(f"Random seed set as {seed}")

set_seed(SEED)

cuda
Random seed set as 34


In [8]:
class Dset(Dataset):
    def __init__(self, df:pd.DataFrame, augmentation:Optional[T.Compose]=None):
        
        self.df = df.reset_index(drop=True)
        self.labels = torch.tensor(self.df['blur'].to_numpy(), dtype=torch.float32).unsqueeze(1)
        self.aug = augmentation

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):

        img_path = self.df['filename'].iloc[idx]
        image = Image.open(img_path).convert('RGB')  # force 3 channels so Normalize always sees RGB

        if self.aug:
            image = self.aug(image)

        return {
            "image": image,
            "label": self.labels[idx],
            }

In [9]:
OUTPUT_SHAPE = (224, 224)

train_transform = T.Compose([
    T.RandomCrop(OUTPUT_SHAPE), 
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    T.RandomVerticalFlip(p=0.5),
    T.RandomHorizontalFlip(p=0.5)]) 

val_transform = T.Compose([
    T.CenterCrop(OUTPUT_SHAPE), 
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])
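
As a rough illustration of what `T.Normalize` does to each pixel after `T.ToTensor()` has scaled it to [0, 1], the sketch below applies the same per-channel formula, (x - mean) / std, in plain Python. The helper `normalize_pixel` is made up for this example and is not part of torchvision.

```python
# ImageNet channel statistics, the same values passed to T.Normalize above
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to one RGB pixel scaled to [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(rgb, MEAN, STD))

# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel(MEAN))
# A pure white pixel ends up above 2 in every channel
print(normalize_pixel((1.0, 1.0, 1.0)))
```

These are the statistics the pretrained efficientnet_b0 weights expect, which is why both the train and validation pipelines use them.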

In [10]:
class Model(nn.Module):
    def __init__(self, hid_dim:int = 10) -> None:
        super().__init__()

        self.feats_extractor = torchvision.models.efficientnet_b0(pretrained=True)
        self.feats_extractor.classifier = nn.Sequential(
            torch.nn.Dropout(p=0.4, inplace=True), 
            torch.nn.Linear(in_features=1280, out_features=hid_dim),
            torch.nn.ReLU(),
            torch.nn.Dropout(p=0.1), 
            torch.nn.Linear(in_features=hid_dim, out_features=1))

    def forward(self, batch:Dict[str,Tensor]):
        x = self.feats_extractor(batch['image'])
        return x

class Trainer():
    def __init__(
        self,
        model: torch.nn.Module,
        train_dataloader: DataLoader,
        val_dataloader: DataLoader,
        loss_fn: torch.nn.Module,
        optimizer: torch.optim.Optimizer,
        scheduler,
        device: Union[torch.device, str],
        max_tol: int = 14,
        load_best: bool = True,
        fold: Union[int, str] = 111,  # the training loop passes a string such as '0_1'
    ) -> None:

        self.model = model.to(device)
        self.optimizer = optimizer
        self.scheduler = scheduler
        self.device = device

        self.train_dl = train_dataloader
        self.val_dl = val_dataloader

        self.loss_fn = loss_fn
        self.metric = BinaryAUROC()

        self.train_losses = []
        self.val_losses = []

        self.best_metric = None
        self.max_tol = max_tol
        self.tol = 0

        self.total_epochs = 0
        self.load_best = load_best
        self.best_ckpt = None
        self.fold = fold
        
        self.ckpt_dir = 'ckpts/'    
        os.makedirs(self.ckpt_dir,exist_ok=True)

    def training_step(self, batch:Dict[str,Tensor]) -> None:
        batch = self.dict_to_device(batch)
        y = batch['label']

        y_pred = self.model(batch)
        loss = self.loss_fn(y_pred, y)

        loss.backward()
        self.optimizer.step()
        self.optimizer.zero_grad()

        self.train_losses.append(loss.item())

    def validation_step(self, batch:Dict[str,Tensor]) -> None:
        batch = self.dict_to_device(batch)
        y = batch['label']

        y_pred = self.model(batch)
        loss = self.loss_fn(y_pred, y)
        
        #  If preds has values outside [0,1] range we consider the input to be logits and will auto apply sigmoid per element.
        self.metric(y_pred,y)

        self.val_losses.append(loss.item())

    def train(self, max_epochs:int=1000) -> None:

        for epoch in tqdm(range(max_epochs)):
            self.total_epochs += 1

            self.model.train()
            for batch in tqdm(self.train_dl, leave=False):
                self.training_step(batch)

            self.model.eval()
            with torch.inference_mode():
                for batch in self.val_dl:
                    self.validation_step(batch)

            train_epoch_loss = np.mean(self.train_losses)
            val_epoch_loss = np.mean(self.val_losses)
            val_metric = self.metric.compute()
            self.reset_losses_metrics()

            self.scheduler.step(val_metric)

            print(f"Epoch: {epoch+1} | "
                    f"train_loss: {train_epoch_loss:.4f} | "
                    f"val_loss: {val_epoch_loss:.4f} | "
                    f"ROC_AUC: {val_metric:.5f}")

            if self.early_stopping(val_metric):
                break
            elif self.is_best_metric(val_metric):
                self.save_checkpoint(os.path.join(self.ckpt_dir, f'best_fold_{self.fold}.ckpt'))
            

        print(f'[BEST_METRIC]: {self.best_metric}')

        if self.load_best:
            self.load_checkpoint(self.best_ckpt)

    def predict(self, test_dl:DataLoader):
        preds = []
        self.model.eval()
        with torch.inference_mode():
            for batch in test_dl:
                batch = self.dict_to_device(batch)
                y_probs = torch.sigmoid(self.model(batch)).cpu()
                preds.append(y_probs)

        return torch.cat(preds).numpy()

    def dict_to_device(self, batch:Dict[str,Tensor]) -> Dict[str,Tensor]:
        return {k: v.to(self.device) if hasattr(v, 'to') else v for k, v in batch.items()}

    def reset_losses_metrics(self) -> None:
        self.train_losses = []
        self.val_losses = []
        self.metric.reset()

    def is_best_metric(self, val_metric:float) -> bool:
        if (self.best_metric is None) or val_metric > self.best_metric:
            self.best_metric = val_metric
            return True
        return False

    def early_stopping(self, val_metric:float) -> bool:
        if (self.best_metric is not None) and val_metric <= self.best_metric:
            self.tol += 1
        else:
            self.tol = 0
        if self.tol >= self.max_tol:
            print('early_stopping')
            return True
        return False

    def save_checkpoint(self, filename:str = 'best.ckpt') -> None:
        checkpoint = {'epoch': self.total_epochs,
                      'model_state_dict': self.model.state_dict(),
                      'optimizer_state_dict': self.optimizer.state_dict(),
                      'best_metric':self.best_metric}
        torch.save(checkpoint, filename)
        self.best_ckpt = filename

    def load_checkpoint(self, filename:str = 'best.ckpt') -> None:
        print('load checkpoint: ', filename)
        checkpoint = torch.load(filename)

        self.model.load_state_dict(checkpoint['model_state_dict'])
        self.optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
        self.model.train()
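
The interplay between `early_stopping` and `is_best_metric` in the `Trainer` above can be replayed without any model. The sketch below mirrors the two comparisons in the same order as `Trainer.train`; the function name `run_with_early_stopping` and the metric values are made up for illustration.

```python
def run_with_early_stopping(val_metrics, max_tol=14):
    """Replay the Trainer's patience logic over a list of per-epoch metrics.

    Returns (stop_epoch, best_epoch, best_metric) with 1-based epochs;
    stop_epoch is None if the patience never ran out.
    """
    best, best_epoch, tol = None, None, 0
    for epoch, metric in enumerate(val_metrics, start=1):
        # Same test as Trainer.early_stopping: a tie counts as "no improvement"
        if best is not None and metric <= best:
            tol += 1
        else:
            tol = 0
        if tol >= max_tol:
            return epoch, best_epoch, best
        # Same test as Trainer.is_best_metric: only a strict improvement is saved
        if best is None or metric > best:
            best, best_epoch = metric, epoch
    return None, best_epoch, best

# Toy metric curve with a patience of 3 epochs
history = [0.90, 0.95, 0.94, 0.95, 0.93, 0.99]
result = run_with_early_stopping(history, max_tol=3)
print(result)  # (5, 2, 0.95)
```

Training stops at epoch 5, so the late improvement at epoch 6 is never seen; this is the usual trade-off of a patience-based rule, and the reason `max_tol` defaults to a fairly generous 14 here.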

In [None]:
# import shutil
# shutil.rmtree('./ckpts')

In [11]:
from itertools import product

N_REPEATS = 1
N_FOLDS = 4
TRAIN_BATCH_SIZE = 64
VAL_BATCH_SIZE = 128

test_ds = Dset(df_test, augmentation=val_transform)
test_dl = DataLoader(test_ds, batch_size=VAL_BATCH_SIZE, num_workers=N_CPU)

preds = np.zeros(len(df_test))

for repeat, fold in product(range(N_REPEATS),range(N_FOLDS)):
    
    print(f'\n[START TRAINING]: REPEAT {repeat} FOLD {fold}')
    df_train = add_folds(df_train, n_folds=N_FOLDS, seed=SEED*(repeat+1))

    Xy_train = df_train[df_train['fold']!=fold]
    Xy_val = df_train[df_train['fold']==fold]

    train_ds = Dset(Xy_train, augmentation=train_transform)
    val_ds = Dset(Xy_val, augmentation=val_transform)

    train_dl = DataLoader(train_ds, batch_size=TRAIN_BATCH_SIZE, shuffle=True, num_workers=N_CPU)
    val_dl = DataLoader(val_ds, batch_size=VAL_BATCH_SIZE, num_workers=N_CPU)

    model = Model().to(device)

    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'max', factor=0.5, patience=5, threshold=0.0001, min_lr=0.00005,verbose=True)

    loss_fn = BCEWithLogitsLoss()

    trainer = Trainer(model=model,
                       train_dataloader=train_dl,
                       val_dataloader=val_dl,
                       optimizer=optimizer,
                       scheduler=scheduler,
                       loss_fn=loss_fn,
                       device=device,
                       fold=f'{repeat}_{fold}')

    trainer.train(max_epochs=1000)

    fold_preds = trainer.predict(test_dl)
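    # Note: this accumulates the SUM of the fold probabilities, not their mean,
    # so values in `preds` can exceed 1. ROC AUC depends only on the ranking,
    # so the score is unaffected; divide by N_REPEATS * N_FOLDS for probabilities.
    preds += fold_preds.flatten()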

    del model, trainer
    gc.collect()

    torch.cuda.empty_cache()


[START TRAINING]: REPEAT 0 FOLD 0

Downloading: "https://download.pytorch.org/models/efficientnet_b0_rwightman-3dd342df.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-3dd342df.pth

Epoch: 1 | train_loss: 0.3469 | val_loss: 0.1448 | ROC_AUC: 0.99080
Epoch: 2 | train_loss: 0.1919 | val_loss: 0.1050 | ROC_AUC: 0.98860
Epoch: 3 | train_loss: 0.1796 | val_loss: 0.0743 | ROC_AUC: 0.99462
Epoch: 4 | train_loss: 0.1531 | val_loss: 0.0632 | ROC_AUC: 0.99747
Epoch: 5 | train_loss: 0.1472 | val_loss: 0.0586 | ROC_AUC: 0.99788
Epoch: 6 | train_loss: 0.1393 | val_loss: 0.0618 | ROC_AUC: 0.99726
Epoch: 7 | train_loss: 0.1016 | val_loss: 0.0558 | ROC_AUC: 0.99625
Epoch: 8 | train_loss: 0.1103 | val_loss: 0.0399 | ROC_AUC: 0.99874
Epoch: 9 | train_loss: 0.0917 | val_loss: 0.0514 | ROC_AUC: 0.99798
Epoch: 10 | train_loss: 0.1427 | val_loss: 0.0499 | ROC_AUC: 0.99814
Epoch: 11 | train_loss: 0.1040 | val_loss: 0.0307 | ROC_AUC: 0.99913
Epoch: 12 | train_loss: 0.1263 | val_loss: 0.0261 | ROC_AUC: 0.99980
Epoch: 13 | train_loss: 0.1026 | val_loss: 0.0824 | ROC_AUC: 0.99683
Epoch: 14 | train_loss: 0.0861 | val_loss: 0.0640 | ROC_AUC: 0.99696
Epoch: 15 | train_loss: 0.1113 | val_loss: 0.0367 | ROC_AUC: 0.99887
Epoch: 16 | train_loss: 0.0952 | val_loss: 0.0392 | ROC_AUC: 0.99868
Epoch: 17 | train_loss: 0.0728 | val_loss: 0.0333 | ROC_AUC: 0.99836
Epoch 00018: reducing learning rate of group 0 to 5.0000e-04.
Epoch: 18 | train_loss: 0.1011 | val_loss: 0.0488 | ROC_AUC: 0.99796
Epoch: 19 | train_loss: 0.0873 | val_loss: 0.0266 | ROC_AUC: 0.99903
Epoch: 20 | train_loss: 0.0427 | val_loss: 0.0217 | ROC_AUC: 0.99949
Epoch: 21 | train_loss: 0.0443 | val_loss: 0.0239 | ROC_AUC: 0.99963
Epoch: 22 | train_loss: 0.0670 | val_loss: 0.0344 | ROC_AUC: 0.99920
Epoch: 23 | train_loss: 0.0688 | val_loss: 0.0326 | ROC_AUC: 0.99894
Epoch 00024: reducing learning rate of group 0 to 2.5000e-04.
Epoch: 24 | train_loss: 0.0540 | val_loss: 0.0263 | ROC_AUC: 0.99953
Epoch: 25 | train_loss: 0.0415 | val_loss: 0.0440 | ROC_AUC: 0.99833
Epoch: 26 | train_loss: 0.0757 | val_loss: 0.0366 | ROC_AUC: 0.99947
early_stopping
[BEST_METRIC]: 0.9998014569282532
load checkpoint:  ckpts/best_fold_0_0.ckpt

[START TRAINING]: REPEAT 0 FOLD 1

Epoch: 1 | train_loss: 0.3677 | val_loss: 0.2257 | ROC_AUC: 0.97473
Epoch: 2 | train_loss: 0.1912 | val_loss: 0.1432 | ROC_AUC: 0.99366
Epoch: 3 | train_loss: 0.1671 | val_loss: 0.0715 | ROC_AUC: 0.99699
Epoch: 4 | train_loss: 0.1891 | val_loss: 0.0934 | ROC_AUC: 0.99735
Epoch: 5 | train_loss: 0.1471 | val_loss: 0.0590 | ROC_AUC: 0.99842
Epoch: 6 | train_loss: 0.1106 | val_loss: 0.0437 | ROC_AUC: 0.99901
Epoch: 7 | train_loss: 0.1069 | val_loss: 0.0564 | ROC_AUC: 0.99743
Epoch: 8 | train_loss: 0.1175 | val_loss: 0.0750 | ROC_AUC: 0.99774
Epoch: 9 | train_loss: 0.1105 | val_loss: 0.0436 | ROC_AUC: 0.99847
Epoch: 10 | train_loss: 0.1060 | val_loss: 0.0754 | ROC_AUC: 0.99720
Epoch: 11 | train_loss: 0.1019 | val_loss: 0.0656 | ROC_AUC: 0.99814
Epoch: 12 | train_loss: 0.0813 | val_loss: 0.0339 | ROC_AUC: 0.99958
Epoch: 13 | train_loss: 0.0885 | val_loss: 0.0486 | ROC_AUC: 0.99898
Epoch: 14 | train_loss: 0.1182 | val_loss: 0.0572 | ROC_AUC: 0.99782
Epoch: 15 | train_loss: 0.1058 | val_loss: 0.0331 | ROC_AUC: 0.99898
Epoch: 16 | train_loss: 0.0702 | val_loss: 0.0610 | ROC_AUC: 0.99800
Epoch: 17 | train_loss: 0.1084 | val_loss: 0.0512 | ROC_AUC: 0.99911
Epoch 00018: reducing learning rate of group 0 to 5.0000e-04.
Epoch: 18 | train_loss: 0.0987 | val_loss: 0.0335 | ROC_AUC: 0.99883
Epoch: 19 | train_loss: 0.0899 | val_loss: 0.0309 | ROC_AUC: 0.99926
Epoch: 20 | train_loss: 0.0616 | val_loss: 0.0205 | ROC_AUC: 0.99969
Epoch: 21 | train_loss: 0.0616 | val_loss: 0.0329 | ROC_AUC: 0.99941
Epoch: 22 | train_loss: 0.0809 | val_loss: 0.0276 | ROC_AUC: 0.99909
Epoch: 23 | train_loss: 0.0432 | val_loss: 0.0149 | ROC_AUC: 0.99968
Epoch: 24 | train_loss: 0.0490 | val_loss: 0.0255 | ROC_AUC: 0.99970
Epoch: 25 | train_loss: 0.0431 | val_loss: 0.0144 | ROC_AUC: 0.99978
Epoch: 26 | train_loss: 0.0453 | val_loss: 0.0166 | ROC_AUC: 0.99988
Epoch: 27 | train_loss: 0.0483 | val_loss: 0.0289 | ROC_AUC: 0.99991
Epoch: 28 | train_loss: 0.0643 | val_loss: 0.0341 | ROC_AUC: 0.99889
Epoch: 29 | train_loss: 0.0632 | val_loss: 0.0160 | ROC_AUC: 0.99980
Epoch: 30 | train_loss: 0.0721 | val_loss: 0.0335 | ROC_AUC: 0.99928
Epoch: 31 | train_loss: 0.0412 | val_loss: 0.0285 | ROC_AUC: 0.99900
Epoch 00032: reducing learning rate of group 0 to 2.5000e-04.
Epoch: 32 | train_loss: 0.0299 | val_loss: 0.0323 | ROC_AUC: 0.99903
Epoch: 33 | train_loss: 0.0351 | val_loss: 0.0162 | ROC_AUC: 0.99959
Epoch: 34 | train_loss: 0.0236 | val_loss: 0.0182 | ROC_AUC: 0.99954
Epoch: 35 | train_loss: 0.0331 | val_loss: 0.0278 | ROC_AUC: 0.99951
Epoch: 36 | train_loss: 0.0320 | val_loss: 0.0173 | ROC_AUC: 0.99959
Epoch: 37 | train_loss: 0.0280 | val_loss: 0.0108 | ROC_AUC: 0.99982
Epoch 00038: reducing learning rate of group 0 to 1.2500e-04.
Epoch: 38 | train_loss: 0.0230 | val_loss: 0.0115 | ROC_AUC: 0.99973
Epoch: 39 | train_loss: 0.0279 | val_loss: 0.0147 | ROC_AUC: 0.99964
Epoch: 40 | train_loss: 0.0298 | val_loss: 0.0166 | ROC_AUC: 0.99950
Epoch: 41 | train_loss: 0.0221 | val_loss: 0.0171 | ROC_AUC: 0.99939
early_stopping
[BEST_METRIC]: 0.9999097585678101
load checkpoint:  ckpts/best_fold_0_1.ckpt

[START TRAINING]: REPEAT 0 FOLD 2

Epoch: 1 | train_loss: 0.3575 | val_loss: 0.2585 | ROC_AUC: 0.96741
Epoch: 2 | train_loss: 0.2001 | val_loss: 0.1543 | ROC_AUC: 0.98029
Epoch: 3 | train_loss: 0.1747 | val_loss: 0.1583 | ROC_AUC: 0.98644
Epoch: 4 | train_loss: 0.1546 | val_loss: 0.1869 | ROC_AUC: 0.99320
Epoch: 5 | train_loss: 0.1136 | val_loss: 0.0788 | ROC_AUC: 0.99435
Epoch: 6 | train_loss: 0.1130 | val_loss: 0.1347 | ROC_AUC: 0.99280
Epoch: 7 | train_loss: 0.1264 | val_loss: 0.0953 | ROC_AUC: 0.99229
Epoch: 8 | train_loss: 0.0993 | val_loss: 0.0885 | ROC_AUC: 0.99346
Epoch: 9 | train_loss: 0.1171 | val_loss: 0.0723 | ROC_AUC: 0.99625
Epoch: 10 | train_loss: 0.1040 | val_loss: 0.0818 | ROC_AUC: 0.99459
Epoch: 11 | train_loss: 0.0886 | val_loss: 0.0894 | ROC_AUC: 0.99765
Epoch: 12 | train_loss: 0.0915 | val_loss: 0.0710 | ROC_AUC: 0.99543
Epoch: 13 | train_loss: 0.0798 | val_loss: 0.0713 | ROC_AUC: 0.99586
Epoch: 14 | train_loss: 0.0670 | val_loss: 0.0718 | ROC_AUC: 0.99646
Epoch: 15 | train_loss: 0.0804 | val_loss: 0.1081 | ROC_AUC: 0.99464
Epoch: 16 | train_loss: 0.0889 | val_loss: 0.0516 | ROC_AUC: 0.99820
Epoch: 17 | train_loss: 0.0683 | val_loss: 0.0801 | ROC_AUC: 0.99580
Epoch: 18 | train_loss: 0.0904 | val_loss: 0.0509 | ROC_AUC: 0.99824
Epoch: 19 | train_loss: 0.0660 | val_loss: 0.0848 | ROC_AUC: 0.99724
Epoch: 20 | train_loss: 0.0662 | val_loss: 0.0655 | ROC_AUC: 0.99557
Epoch: 21 | train_loss: 0.0634 | val_loss: 0.0495 | ROC_AUC: 0.99847
Epoch: 22 | train_loss: 0.0523 | val_loss: 0.0581 | ROC_AUC: 0.99699
Epoch: 23 | train_loss: 0.0610 | val_loss: 0.0856 | ROC_AUC: 0.99445
Epoch: 24 | train_loss: 0.0582 | val_loss: 0.0507 | ROC_AUC: 0.99759
Epoch: 25 | train_loss: 0.0634 | val_loss: 0.1447 | ROC_AUC: 0.99615
Epoch: 26 | train_loss: 0.0804 | val_loss: 0.0538 | ROC_AUC: 0.99689
Epoch 00027: reducing learning rate of group 0 to 5.0000e-04.
Epoch: 27 | train_loss: 0.0598 | val_loss: 0.0472 | ROC_AUC: 0.99838
Epoch: 28 | train_loss: 0.0388 | val_loss: 0.0518 | ROC_AUC: 0.99756
Epoch: 29 | train_loss: 0.0461 | val_loss: 0.0437 | ROC_AUC: 0.99867
Epoch: 30 | train_loss: 0.0392 | val_loss: 0.0439 | ROC_AUC: 0.99893
Epoch: 31 | train_loss: 0.0386 | val_loss: 0.0340 | ROC_AUC: 0.99886
Epoch: 32 | train_loss: 0.0333 | val_loss: 0.0409 | ROC_AUC: 0.99857
Epoch: 33 | train_loss: 0.0275 | val_loss: 0.0321 | ROC_AUC: 0.99842
Epoch: 34 | train_loss: 0.0312 | val_loss: 0.0402 | ROC_AUC: 0.99797
Epoch: 35 | train_loss: 0.0308 | val_loss: 0.0331 | ROC_AUC: 0.99881
Epoch 00036: reducing learning rate of group 0 to 2.5000e-04.
Epoch: 36 | train_loss: 0.0430 | val_loss: 0.1058 | ROC_AUC: 0.99812
Epoch: 37 | train_loss: 0.0302 | val_loss: 0.0390 | ROC_AUC: 0.99879
Epoch: 38 | train_loss: 0.0300 | val_loss: 0.0302 | ROC_AUC: 0.99925
Epoch: 39 | train_loss: 0.0356 | val_loss: 0.0472 | ROC_AUC: 0.99824
Epoch: 40 | train_loss: 0.0393 | val_loss: 0.0395 | ROC_AUC: 0.99848
Epoch: 41 | train_loss: 0.0293 | val_loss: 0.0444 | ROC_AUC: 0.99910
Epoch: 42 | train_loss: 0.0235 | val_loss: 0.0543 | ROC_AUC: 0.99861
Epoch: 43 | train_loss: 0.0305 | val_loss: 0.0432 | ROC_AUC: 0.99861
Epoch 00044: reducing learning rate of group 0 to 1.2500e-04.
Epoch: 44 | train_loss: 0.0182 | val_loss: 0.0438 | ROC_AUC: 0.99876
Epoch: 45 | train_loss: 0.0266 | val_loss: 0.0391 | ROC_AUC: 0.99885
Epoch: 46 | train_loss: 0.0257 | val_loss: 0.0363 | ROC_AUC: 0.99914
Epoch: 47 | train_loss: 0.0169 | val_loss: 0.0409 | ROC_AUC: 0.99880
Epoch: 48 | train_loss: 0.0211 | val_loss: 0.0413 | ROC_AUC: 0.99912
Epoch: 49 | train_loss: 0.0157 | val_loss: 0.0384 | ROC_AUC: 0.99910
Epoch 00050: reducing learning rate of group 0 to 6.2500e-05.
Epoch: 50 | train_loss: 0.0103 | val_loss: 0.0356 | ROC_AUC: 0.99920
Epoch: 51 | train_loss: 0.0193 | val_loss: 0.0379 | ROC_AUC: 0.99917
Epoch: 52 | train_loss: 0.0159 | val_loss: 0.0344 | ROC_AUC: 0.99898
early_stopping
[BEST_METRIC]: 0.9992509484291077
load checkpoint:  ckpts/best_fold_0_2.ckpt

[START TRAINING]: REPEAT 0 FOLD 3

Epoch: 1 | train_loss: 0.3628 | val_loss: 0.2551 | ROC_AUC: 0.96171
Epoch: 2 | train_loss: 0.1786 | val_loss: 0.1114 | ROC_AUC: 0.98910
Epoch: 3 | train_loss: 0.2007 | val_loss: 0.0975 | ROC_AUC: 0.99320
Epoch: 4 | train_loss: 0.1308 | val_loss: 0.1542 | ROC_AUC: 0.98999
Epoch: 5 | train_loss: 0.1556 | val_loss: 0.0664 | ROC_AUC: 0.99595
Epoch: 6 | train_loss: 0.1248 | val_loss: 0.0583 | ROC_AUC: 0.99724
Epoch: 7 | train_loss: 0.1258 | val_loss: 0.0423 | ROC_AUC: 0.99820
Epoch: 8 | train_loss: 0.0971 | val_loss: 0.0561 | ROC_AUC: 0.99635
Epoch: 9 | train_loss: 0.0995 | val_loss: 0.0811 | ROC_AUC: 0.99759
Epoch: 10 | train_loss: 0.1015 | val_loss: 0.0690 | ROC_AUC: 0.99587
Epoch: 11 | train_loss: 0.1178 | val_loss: 0.0605 | ROC_AUC: 0.99520
Epoch: 12 | train_loss: 0.0956 | val_loss: 0.0570 | ROC_AUC: 0.99681
Epoch 00013: reducing learning rate of group 0 to 5.0000e-04.
Epoch: 13 | train_loss: 0.0766 | val_loss: 0.0607 | ROC_AUC: 0.99726
Epoch: 14 | train_loss: 0.0613 | val_loss: 0.0477 | ROC_AUC: 0.99813
Epoch: 15 | train_loss: 0.0492 | val_loss: 0.0460 | ROC_AUC: 0.99748
Epoch: 16 | train_loss: 0.0540 | val_loss: 0.0542 | ROC_AUC: 0.99694
Epoch: 17 | train_loss: 0.0813 | val_loss: 0.0346 | ROC_AUC: 0.99774
Epoch: 18 | train_loss: 0.0814 | val_loss: 0.0432 | ROC_AUC: 0.99774
Epoch: 19 | train_loss: 0.0476 | val_loss: 0.0332 | ROC_AUC: 0.99915
Epoch: 20 | train_loss: 0.0963 | val_loss: 0.0531 | ROC_AUC: 0.99820
Epoch: 21 | train_loss: 0.0711 | val_loss: 0.0445 | ROC_AUC: 0.99713
Epoch: 22 | train_loss: 0.0705 | val_loss: 0.0365 | ROC_AUC: 0.99844
Epoch: 23 | train_loss: 0.0519 | val_loss: 0.0403 | ROC_AUC: 0.99760
Epoch: 24 | train_loss: 0.0516 | val_loss: 0.0405 | ROC_AUC: 0.99833
Epoch 00025: reducing learning rate of group 0 to 2.5000e-04.
Epoch: 25 | train_loss: 0.0514 | val_loss: 0.0447 | ROC_AUC: 0.99828
Epoch: 26 | train_loss: 0.0378 | val_loss: 0.0357 | ROC_AUC: 0.99896
Epoch: 27 | train_loss: 0.0405 | val_loss: 0.0311 | ROC_AUC: 0.99925
Epoch: 28 | train_loss: 0.0305 | val_loss: 0.0314 | ROC_AUC: 0.99925
Epoch: 29 | train_loss: 0.0276 | val_loss: 0.0339 | ROC_AUC: 0.99904
Epoch: 30 | train_loss: 0.0346 | val_loss: 0.0367 | ROC_AUC: 0.99891
Epoch 00031: reducing learning rate of group 0 to 1.2500e-04.
Epoch: 31 | train_loss: 0.0237 | val_loss: 0.0345 | ROC_AUC: 0.99925
Epoch: 32 | train_loss: 0.0391 | val_loss: 0.0349 | ROC_AUC: 0.99928
Epoch: 33 | train_loss: 0.0249 | val_loss: 0.0308 | ROC_AUC: 0.99926
Epoch: 34 | train_loss: 0.0303 | val_loss: 0.0319 | ROC_AUC: 0.99921
Epoch: 35 | train_loss: 0.0294 | val_loss: 0.0302 | ROC_AUC: 0.99923
Epoch: 36 | train_loss: 0.0231 | val_loss: 0.0282 | ROC_AUC: 0.99928
Epoch: 37 | train_loss: 0.0231 | val_loss: 0.0264 | ROC_AUC: 0.99933
Epoch 00038: reducing learning rate of group 0 to 6.2500e-05.
Epoch: 38 | train_loss: 0.0204 | val_loss: 0.0305 | ROC_AUC: 0.99914
Epoch: 39 | train_loss: 0.0223 | val_loss: 0.0311 | ROC_AUC: 0.99912
Epoch: 40 | train_loss: 0.0216 | val_loss: 0.0308 | ROC_AUC: 0.99915
Epoch: 41 | train_loss: 0.0174 | val_loss: 0.0326 | ROC_AUC: 0.99904
Epoch: 42 | train_loss: 0.0233 | val_loss: 0.0307 | ROC_AUC: 0.99925
Epoch: 43 | train_loss: 0.0250 | val_loss: 0.0288 | ROC_AUC: 0.99931
Epoch 00044: reducing learning rate of group 0 to 5.0000e-05.
Epoch: 44 | train_loss: 0.0148 | val_loss: 0.0297 | ROC_AUC: 0.99933
Epoch: 45 | train_loss: 0.0185 | val_loss: 0.0298 | ROC_AUC: 0.99921
Epoch: 46 | train_loss: 0.0210 | val_loss: 0.0309 | ROC_AUC: 0.99922
Epoch: 47 | train_loss: 0.0237 | val_loss: 0.0300 | ROC_AUC: 0.99925
Epoch: 48 | train_loss: 0.0184 | val_loss: 0.0277 | ROC_AUC: 0.99944
Epoch: 49 | train_loss: 0.0200 | val_loss: 0.0283 | ROC_AUC: 0.99940
Epoch: 50 | train_loss: 0.0145 | val_loss: 0.0290 | ROC_AUC: 0.99946
Epoch: 51 | train_loss: 0.0239 | val_loss: 0.0294 | ROC_AUC: 0.99939
Epoch: 52 | train_loss: 0.0233 | val_loss: 0.0296 | ROC_AUC: 0.99937
Epoch: 53 | train_loss: 0.0166 | val_loss: 0.0311 | ROC_AUC: 0.99929
Epoch: 54 | train_loss: 0.0158 | val_loss: 0.0317 | ROC_AUC: 0.99921
Epoch: 55 | train_loss: 0.0134 | val_loss: 0.0331 | ROC_AUC: 0.99928
Epoch: 56 | train_loss: 0.0171 | val_loss: 0.0334 | ROC_AUC: 0.99928
Epoch: 57 | train_loss: 0.0113 | val_loss: 0.0338 | ROC_AUC: 0.99930
Epoch: 58 | train_loss: 0.0153 | val_loss: 0.0329 | ROC_AUC: 0.99933
Epoch: 59 | train_loss: 0.0105 | val_loss: 0.0326 | ROC_AUC: 0.99933
Epoch: 60 | train_loss: 0.0152 | val_loss: 0.0305 | ROC_AUC: 0.99941
Epoch: 61 | train_loss: 0.0181 | val_loss: 0.0288 | ROC_AUC: 0.99942
Epoch: 62 | train_loss: 0.0198 | val_loss: 0.0308 | ROC_AUC: 0.99939
Epoch: 63 | train_loss: 0.0154 | val_loss: 0.0332 | ROC_AUC: 0.99927
Epoch: 64 | train_loss: 0.0173 | val_loss: 0.0321 | ROC_AUC: 0.99929
early_stopping
[BEST_METRIC]: 0.9994586110115051
load checkpoint:  ckpts/best_fold_0_3.ckpt
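
The "reducing learning rate" lines in the logs follow from the `ReduceLROnPlateau` settings used above (`'max'`, `factor=0.5`, `patience=5`, `threshold=0.0001`, `min_lr=0.00005`). The sketch below is a simplified pure-Python replay of that rule, assuming the default relative threshold mode and no cooldown; it is meant for intuition and is not PyTorch's implementation. `simulate_plateau_scheduler` is a name made up for this example, and the metric values are the rounded ROC AUC numbers printed for fold 0.

```python
def simulate_plateau_scheduler(metrics, lr=1e-3, factor=0.5, patience=5,
                               threshold=1e-4, min_lr=5e-5):
    """Return [(epoch, new_lr), ...] for every learning-rate reduction."""
    best = float("-inf")
    bad_epochs = 0
    reductions = []
    for epoch, metric in enumerate(metrics, start=1):
        # 'max' mode with a relative threshold: must beat the best by 0.01 %
        if metric > best * (1 + threshold):
            best = metric
            bad_epochs = 0
        else:
            bad_epochs += 1
        # Reduce once the number of bad epochs exceeds the patience
        if bad_epochs > patience:
            new_lr = max(lr * factor, min_lr)
            if new_lr < lr:
                lr = new_lr
                reductions.append((epoch, lr))
            bad_epochs = 0
    return reductions

# Validation ROC AUC per epoch, copied from the fold 0 log
fold0_auc = [
    0.99080, 0.98860, 0.99462, 0.99747, 0.99788, 0.99726, 0.99625,
    0.99874, 0.99798, 0.99814, 0.99913, 0.99980, 0.99683, 0.99696,
    0.99887, 0.99868, 0.99836, 0.99796, 0.99903, 0.99949, 0.99963,
    0.99920, 0.99894, 0.99953, 0.99833, 0.99947,
]
reductions = simulate_plateau_scheduler(fold0_auc)
print(reductions)
```

Under these assumptions the replay reduces the rate at epochs 18 and 24, matching the fold 0 log. Since the replay works from values rounded to five decimals, epochs that sit right at the threshold (as can happen in other folds) may come out differently from the real run.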


In [13]:
sub = pd.DataFrame()
sub['filename'] = df_test['filename'].apply(lambda x : os.path.split(x)[1])
sub['blur'] = pd.Series(preds).round(5)
sub.to_csv('submission18.csv', index=False)
sub[:3]

Unnamed: 0,filename,blur
0,bnxzvzqlzlnnbxfkcuin.jpg,0.001
1,powqsnpoynygwqsciedp.jpg,2.041
2,zpjlbfhurhygjnqccpii.jpg,0.005
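
The value 2.041 in the submission preview is larger than 1 because `preds` holds the sum of the four fold probabilities rather than their mean. The small sketch below, with made-up labels and probabilities and a hypothetical pairwise `roc_auc` helper, illustrates why this does not change a ROC AUC score: dividing every score by the same positive constant keeps the ranking intact.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and per-fold probabilities for five images (4 folds)
labels = [0, 1, 0, 1, 1]
fold_probs = [
    [0.01, 0.60, 0.30, 0.20, 0.90],
    [0.02, 0.55, 0.35, 0.25, 0.85],
    [0.00, 0.50, 0.40, 0.30, 0.95],
    [0.03, 0.40, 0.20, 0.45, 0.80],
]

summed = [sum(col) for col in zip(*fold_probs)]    # what `preds +=` accumulates
averaged = [s / len(fold_probs) for s in summed]   # probabilities back in [0, 1]

print(roc_auc(labels, summed) == roc_auc(labels, averaged))  # True: same ranking
print(max(summed) > 1, max(averaged) <= 1)
```

If calibrated probabilities are needed (for example for a log-loss metric or a fixed decision threshold), the averaged version is the one to submit.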
