# CytologIA Data Challenge - Inference notebook

My approach uses a two-stage pipeline to detect and classify white blood cells.

- **Stage 1: class-agnostic white blood cell detection.** An object detection model locates white blood cells regardless of their class. The goal is to robustly separate white blood cells from the other elements in the image.
- **Stage 2: classification.** Once white blood cells are detected, a second-stage model classifies each detected cell into a specific category. Alternatively, a model at this stage can learn class labels directly from full images, without explicit white blood cell detections.

This modular design separates localization from classification, so each stage can be tuned independently and the solution adapts to different data scenarios.

<i>About inference timing (extract from online requirements):
Each image must be processed with an inference time of less than 500 ms. While GPU usage is permitted to accelerate processing, the solution must be compatible with regular GPUs typically available in standard setups, without requiring specialized hardware like high-end server-grade GPUs. Solutions optimized for CPU-based inference are preferred to ensure broader applicability, but CPU optimization is not mandatory.</i>

- Data: 20751 RGB images, 22689 boxes to submit (i.e. at 500 ms per image, the maximum allowed inference time is around **2h52min**)
- Hardware (standard): single RTX 3090 (24 GB VRAM), SSD, 48 GB RAM, Intel Core i9 CPU.
- Software: Python 3.10 / PyTorch 2.5 / timm.
- Models: YoloX, CNN, Transformers.
- Licenses: all libraries and model weights are under the open-source Apache 2.0 license.

Source code and trained weights to use for this inference notebook can be downloaded from: https://cytologia.s3.amazonaws.com/submission_inference_6.zip
- Test images must be placed in the `TRUSTII/images_cytologia` folder
- The test CSV file must be located at `TRUSTII/test.csv`
- CV: 0.93125
- Public LB: 0.93785
- Private LB: 0.93716

### Stage1: Bounding Box detection (white blood cells) - Total time = 1h
- Execute YoloX models
- Apply weighted boxes fusion to merge predictions
- Fit predictions to submission format
- Dump predictions in a folder as PNG images

### Stage2: White blood cells classification - Total time = 1h 41min
- Execute multi-class models (on detected bounding boxes)
- Execute multi-label models (on full images)
- Ensemble multi-class and multi-label models
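
As a quick sanity check of the timing requirement, the sketch below recomputes the global budget from the 500 ms-per-image limit and compares it with the two stage timings reported above (about 1h and 1h41min). The helper `to_hms` is introduced here for illustration only and is not part of the submission code.

```python
# Time budget implied by the 500 ms-per-image requirement.
N_IMAGES = 20751             # RGB images in the test set (from the data summary)
MAX_SECONDS_PER_IMAGE = 0.5  # limit stated in the challenge requirements


def to_hms(seconds):
    """Convert a duration in seconds to (hours, minutes, seconds), truncating fractions."""
    seconds = int(seconds)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


budget_s = N_IMAGES * MAX_SECONDS_PER_IMAGE

# Reported wall times: Stage 1 is about 1h, Stage 2 is about 1h41min.
measured_s = 1 * 3600 + (1 * 3600 + 41 * 60)
per_image_s = measured_s / N_IMAGES

print("Budget (h, m, s):", to_hms(budget_s))
print("Measured (h, m, s):", to_hms(measured_s))
print("Average per image: %.3f s" % per_image_s)
```

With these figures the budget is 2h52min55s and the two stages together take about 2h41min, which stays under 0.5 s per image on average.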

In [1]:
# If you get an error about a missing YoloX installation: uncomment and run this cell, restart the kernel, then run it again. It should work on the second run.
# !pip install loguru
# !pip install seaborn
# !pip install scikit-learn
# !pip install wandb
# !pip install pytorch-lightning
# !pip install timm
# !pip install albumentations
# !pip install scikit-image
# !pip install pyarrow
# !cd code/src_object_detector/YOLOX; pip install -v -e .

In [2]:
import sys
sys.path.append("./code/src_object_detector/YOLOX")  # sys.path.append("../code/src_object_detector/YOLOX")
sys.path.append("./code/src")  # sys.path.append("../code/src")
import cdc
from cdc.common.utils import *
from cdc.common.constants import *
from cdc.models.pl.classifier import *
from cdc.models.pl.dataset import *
from cdc.utils.imaging import *
from cdc.script.inferv1 import *
from cdc.yolo.tools import *
from cdc.yolo.inference import *

  check_for_updates()


In [3]:
import glob, os, time, random, gc, sys, math, re
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
from tqdm import tqdm
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_colwidth', None)
sns.set(style='whitegrid', rc={"grid.linewidth": 0.1})
sns.set_context("paper", font_scale=0.8) 
import PIL
from PIL import Image
import cv2
import torch
import torch.nn as nn
import torch.nn.functional as F
import transformers
import wandb
import albumentations as A
from albumentations.pytorch import ToTensorV2

import pytorch_lightning as L
from pytorch_lightning.loggers import WandbLogger
from pytorch_lightning.loggers import CSVLogger
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.callbacks import LearningRateMonitor
from pytorch_lightning import seed_everything

In [4]:
print("Python", sys.version)
print("Numpy", np.__version__)
print("Pandas", pd.__version__)
print("Torch", torch.__version__)
print("Transformers", transformers.__version__)
print("Lightning", L.__version__)
print("Albumentations", A.__version__)
print("CDC", cdc.__version__)

Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Numpy 1.26.4
Pandas 2.2.2
Torch 2.5.1+cu124
Transformers 4.44.2
Lightning 2.4.0
Albumentations 1.4.22
CDC 1.0.0


## Stage1: WBC bounding boxes detection
- Execute YoloX models
- Apply weighted boxes fusion to merge predictions
- Fit predictions to submission format
- Dump predictions in a folder as PNG images
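
To make the fusion step more concrete, here is a simplified, single-class sketch of the weighted boxes fusion idea: overlapping boxes from several models are clustered by IoU, and each cluster is replaced by the score-weighted average of its members. This is not the `run_wbf` implementation used in this notebook; `iou` and `simple_wbf` are hypothetical helpers, and the confidence rescaling by the number of contributing models is one common variant, assumed here.

```python
import numpy as np


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def simple_wbf(boxes, scores, n_models, iou_thr=0.25):
    """Cluster overlapping boxes, then fuse each cluster by score-weighted averaging."""
    order = np.argsort(scores)[::-1]  # process the most confident boxes first
    clusters = []                     # each cluster: {"members": [(box, score)], "fused": box}
    for i in order:
        box, score = np.asarray(boxes[i], dtype=float), float(scores[i])
        best, best_iou = None, iou_thr
        for cluster in clusters:
            overlap = iou(box, cluster["fused"])
            if overlap > best_iou:
                best, best_iou = cluster, overlap
        if best is None:
            clusters.append({"members": [(box, score)], "fused": box})
        else:
            best["members"].append((box, score))
            weights = np.array([s for _, s in best["members"]])
            stacked = np.stack([b for b, _ in best["members"]])
            best["fused"] = (stacked * weights[:, None]).sum(axis=0) / weights.sum()
    fused = []
    for cluster in clusters:
        member_scores = np.array([s for _, s in cluster["members"]])
        # Lower the confidence of boxes found by only a few of the models.
        conf = member_scores.mean() * min(len(member_scores), n_models) / n_models
        fused.append((cluster["fused"], conf))
    return fused


# Two models agree on one cell; only one model reports a second, weak box.
example = simple_wbf(
    boxes=[[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 120, 120]],
    scores=[0.9, 0.7, 0.3],
    n_models=2,
)
for box, conf in example:
    print(np.round(box, 3), round(conf, 3))
```

The design point is that, unlike NMS, no box is discarded inside a cluster: every fold contributes to the final coordinates in proportion to its confidence.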

In [5]:
%%time
SEEDS = [42]
seed_everything_now(SEEDS[0])
seed_everything(SEEDS[0], workers=True)
torch.set_float32_matmul_precision('high')

DATA_ROOT = "./data" # "../data"
TRUSTII_HOME = os.path.join(DATA_ROOT, "TRUSTII")  # DATA_ROOT
TEST_FILE = os.path.join(TRUSTII_HOME, "test.csv")
TEST_HOME = os.path.join(TRUSTII_HOME, "images_cytologia")
MODELS_HOME = "./models" #  "../models"

VERSION = "v3.1"
INFERENCE_NAME = "top1_inference_yolox%s_5models"%VERSION
MODEL_YOLOX_HOME = "./code/src_object_detector/YOLOX/YOLOX_outputs"  # "../code/src_object_detector/YOLOX/YOLOX_outputs"
IMG_SIZE = 512 if VERSION == "v3.1" else 640
DEVICE = "gpu" # "cpu"
TEST_CONFIDENCE = 0.001
PADDING_POLICY = "full"
WBF_IOU = 0.25
ROOT = "yolox_s"
NMS_THRESHOLD = 0.30

# Dump BBX
CROP_MARGINS = 16
CROP_FILE_TEST_CLEANED = os.path.join(TRUSTII_HOME, "boxes_%s"%INFERENCE_NAME, "%s_%d_%s_%.4f_%.3f_%.3f_%s_m%d_seed42_test_cleaned.parquet"%(ROOT, IMG_SIZE, VERSION, TEST_CONFIDENCE, NMS_THRESHOLD, WBF_IOU, PADDING_POLICY, CROP_MARGINS))
CROP_HOME_TEST = os.path.join(TRUSTII_HOME, "boxes_%s"%INFERENCE_NAME, "%s_%d_%s_%.4f_%.3f_%.3f_%s_m%d_seed42_test_cleaned"%(ROOT, IMG_SIZE, VERSION, TEST_CONFIDENCE, NMS_THRESHOLD, WBF_IOU, PADDING_POLICY, CROP_MARGINS))
os.makedirs(CROP_HOME_TEST, exist_ok=True)

MODELS = {
    "yolox_s_%s_%d_seed_42"%(VERSION, IMG_SIZE): {
        "fold0":  (MODEL_YOLOX_HOME + "/yolox_s_%s_%d_seed_42_fold0"%(VERSION.replace(".", ""), IMG_SIZE), IMG_SIZE),
        "fold1":  (MODEL_YOLOX_HOME + "/yolox_s_%s_%d_seed_42_fold1"%(VERSION.replace(".", ""), IMG_SIZE), IMG_SIZE),
        "fold2":  (MODEL_YOLOX_HOME + "/yolox_s_%s_%d_seed_42_fold2"%(VERSION.replace(".", "") ,IMG_SIZE), IMG_SIZE),
        "fold3":  (MODEL_YOLOX_HOME + "/yolox_s_%s_%d_seed_42_fold3"%(VERSION.replace(".", ""), IMG_SIZE), IMG_SIZE),
    }
}

Seed set to 42


CPU times: user 3.01 ms, sys: 1.18 ms, total: 4.19 ms
Wall time: 16.3 ms
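
The configuration cell builds several names with `%` formatting, which can be hard to read at a glance. The short sketch below re-evaluates two of them with the same constants, to show which folder and checkpoint names are expected on disk. The variable names `run_tag` and `fold_dir` are introduced here for illustration only.

```python
# Same constants as in the configuration cell above.
VERSION = "v3.1"
IMG_SIZE = 512 if VERSION == "v3.1" else 640
ROOT = "yolox_s"
TEST_CONFIDENCE, NMS_THRESHOLD, WBF_IOU = 0.001, 0.30, 0.25
PADDING_POLICY, CROP_MARGINS = "full", 16

# Base name shared by the crops folder and the parquet file.
run_tag = "%s_%d_%s_%.4f_%.3f_%.3f_%s_m%d_seed42_test_cleaned" % (
    ROOT, IMG_SIZE, VERSION, TEST_CONFIDENCE, NMS_THRESHOLD, WBF_IOU,
    PADDING_POLICY, CROP_MARGINS,
)

# Checkpoint folder of the first fold (the dot is removed from the version).
fold_dir = "yolox_s_%s_%d_seed_42_fold0" % (VERSION.replace(".", ""), IMG_SIZE)

print(run_tag)
print(fold_dir)
```

The second name matches the `Path:` lines printed when the detectors are executed.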


In [6]:
%%time
# Execute BBx models
test_pd = pd.read_csv(TEST_FILE)
files = [os.path.join(TEST_HOME, f) for f in test_pd["NAME"].values]
for name, _ in MODELS.items():
    test_roi_predictions = []        
    for fold, info in MODELS[name].items():
        path, imgsize = info
        print("Path:", path)
        if "yolox" in path:
            roi_predictions_ = predict_yolox(None, path + "/best_ckpt.pth", test_conf=TEST_CONFIDENCE, nmsthre=NMS_THRESHOLD, image_size=imgsize, files=files, device=DEVICE)
        else:
            roi_predictions_ = predict_yolo(None, path + "/weights/best.pt", test_conf=TEST_CONFIDENCE, nmsthre=NMS_THRESHOLD, image_size=imgsize, files=files, device=DEVICE)                
        test_roi_predictions.append(roi_predictions_)
    test_roi_predictions = pd.concat(test_roi_predictions, ignore_index=True)
test_roi_predictions_pd = test_roi_predictions.copy()
test_roi_predictions_pd["roi_width"] = test_roi_predictions_pd["bbx_xbr"] - test_roi_predictions_pd["bbx_xtl"]
test_roi_predictions_pd["roi_height"] = test_roi_predictions_pd["bbx_ybr"] - test_roi_predictions_pd["bbx_ytl"]
test_roi_predictions_pd["roi_surface"] = test_roi_predictions_pd["roi_width"]*test_roi_predictions_pd["roi_height"]
test_roi_predictions_pd["roi_surface_ratio"] = test_roi_predictions_pd["roi_surface"]*100./(test_roi_predictions_pd["slide_width"]*test_roi_predictions_pd["slide_height"])
print(test_roi_predictions_pd.shape)
test_roi_predictions_pd.head()

Path: ./code/src_object_detector/YOLOX/YOLOX_outputs/yolox_s_v31_512_seed_42_fold0


  ckpt = torch.load(ckpt_file, map_location="cpu")
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22689/22689 [12:11<00:00, 31.00it/s]


Path: ./code/src_object_detector/YOLOX/YOLOX_outputs/yolox_s_v31_512_seed_42_fold1


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22689/22689 [11:18<00:00, 33.42it/s]


Path: ./code/src_object_detector/YOLOX/YOLOX_outputs/yolox_s_v31_512_seed_42_fold2


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22689/22689 [12:03<00:00, 31.35it/s]


Path: ./code/src_object_detector/YOLOX/YOLOX_outputs/yolox_s_v31_512_seed_42_fold3


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22689/22689 [12:08<00:00, 31.13it/s]

(142213, 13)
CPU times: user 32min 12s, sys: 7min 2s, total: 39min 15s
Wall time: 47min 47s





| | filename | slide_width | slide_height | bbx_xtl | bbx_ytl | bbx_xbr | bbx_ybr | score | class | roi_width | roi_height | roi_surface | roi_surface_ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 000455d4-8.jpg | 360 | 363 | 93.142822 | 97.396729 | 265.869141 | 265.160156 | 0.967818 | 0.0 | 172.726318 | 167.763428 | 28977.160156 | 22.174135 |
| 1 | 000455d4-8.jpg | 360 | 363 | 235.205566 | 0.0 | 325.601074 | 6.558105 | 0.00262 | 0.0 | 90.395508 | 6.558105 | 592.823303 | 0.453645 |
| 2 | 0007ccec-2.jpg | 360 | 367 | 91.078003 | 82.700439 | 272.203613 | 283.134766 | 0.97317 | 0.0 | 181.12561 | 200.434326 | 36303.789062 | 27.477891 |
| 3 | 00080027-c.jpg | 368 | 369 | 111.348633 | 106.664062 | 257.471191 | 261.254883 | 0.976595 | 0.0 | 146.122559 | 154.59082 | 22589.207031 | 16.635153 |
| 4 | 00084489-e.jpg | 368 | 370 | 117.792969 | 114.902344 | 239.921875 | 284.726562 | 0.89666 | 0.0 | 122.128906 | 169.824219 | 20740.445312 | 15.232407 |
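
The `roi_surface_ratio` column is the box area expressed as a percentage of the image area. As a worked check, the sketch below recomputes it for the first row of the table above; `roi_surface_ratio` is written here as a standalone function for illustration, whereas the notebook computes it column-wise with pandas.

```python
def roi_surface_ratio(xtl, ytl, xbr, ybr, slide_width, slide_height):
    """Box area as a percentage of the full image area."""
    width = xbr - xtl
    height = ybr - ytl
    return width * height * 100.0 / (slide_width * slide_height)


# First row of the predictions table: image 000455d4-8.jpg (360 x 363 pixels).
ratio = roi_surface_ratio(93.142822, 97.396729, 265.869141, 265.160156, 360, 363)
print(round(ratio, 3))
```

The result agrees with the value of about 22.17 shown in the table, so a confident white blood cell detection typically covers a large share of these small images, while spurious low-score boxes cover well under 1%.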


In [7]:
%%time
# Run Weighted-Boxes-Fusion over the predictions
roi_predictions_wbf = run_wbf(test_roi_predictions_pd, iou_thr=WBF_IOU, skip_box_thr=0.0001)
# Round box corners to integer pixel coordinates
for col in ["bbx_xtl", "bbx_xbr", "bbx_ybr", "bbx_ytl"]:
    roi_predictions_wbf[col] = roi_predictions_wbf[col].round().astype(np.int32)
roi_predictions_wbf["roi_width"] = roi_predictions_wbf["bbx_xbr"] - roi_predictions_wbf["bbx_xtl"]
roi_predictions_wbf["roi_height"] = roi_predictions_wbf["bbx_ybr"] - roi_predictions_wbf["bbx_ytl"]
roi_predictions_wbf["roi_surface"] = roi_predictions_wbf["roi_width"]*roi_predictions_wbf["roi_height"]
roi_predictions_wbf["roi_surface_ratio"] = roi_predictions_wbf["roi_surface"]*100./(roi_predictions_wbf["slide_width"]*roi_predictions_wbf["slide_height"])
roi_predictions_wbf["predict_bb"] = roi_predictions_wbf[["bbx_xtl", "bbx_xbr", "bbx_ytl", "bbx_ybr", "class", "roi_surface_ratio", "score"]].apply(lambda x: (x[0], x[1], x[2], x[3], x[4], x[5], x[6]), axis=1)
print(roi_predictions_wbf.shape)
roi_predictions_wbf.head()



(29460, 14)
CPU times: user 10.7 s, sys: 6.3 ms, total: 10.7 s
Wall time: 10.7 s


| | filename | slide_width | slide_height | bbx_xtl | bbx_ytl | bbx_xbr | bbx_ybr | score | class | roi_width | roi_height | roi_surface | roi_surface_ratio | predict_bb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 000455d4-8.jpg | 360 | 363 | 93 | 98 | 266 | 265 | 0.96794 | 0.0 | 173 | 167 | 28891 | 22.108203 | (93.0, 266.0, 98.0, 265.0, 0.0, 22.10820324456688, 0.9679403901100159) |
| 1 | 000455d4-8.jpg | 360 | 363 | 234 | 0 | 325 | 7 | 0.00202 | 0.0 | 91 | 7 | 637 | 0.48745 | (234.0, 325.0, 0.0, 7.0, 0.0, 0.4874502601775329, 0.002019635634496808) |
| 2 | 0007ccec-2.jpg | 360 | 367 | 91 | 82 | 272 | 283 | 0.971822 | 0.0 | 181 | 201 | 36381 | 27.536331 | (91.0, 272.0, 82.0, 283.0, 0.0, 27.536330608537693, 0.9718216061592102) |
| 3 | 00080027-c.jpg | 368 | 369 | 111 | 107 | 257 | 261 | 0.97695 | 0.0 | 146 | 154 | 22484 | 16.557676 | (111.0, 257.0, 107.0, 261.0, 0.0, 16.55767644632968, 0.9769502282142639) |
| 4 | 00084489-e.jpg | 368 | 370 | 118 | 115 | 241 | 269 | 0.879252 | 0.0 | 123 | 154 | 18942 | 13.911575 | (118.0, 241.0, 115.0, 269.0, 0.0, 13.911574618096358, 0.8792518973350525) |


In [8]:
%%time
print(test_pd.shape)
test_image_pd = test_pd.groupby("NAME")[["trustii_id"]].agg(trustii_ids=("trustii_id", list), trustii_bbs=("trustii_id", 'count')).reset_index()
# Fit to trustii IDs
roi_image_predictions_wbf = roi_predictions_wbf.sort_values(["filename", "score", "roi_surface_ratio"], ascending=[True, False, False]).reset_index(drop=True)
roi_image_predictions_wbf = roi_image_predictions_wbf.groupby(["filename", "slide_width", "slide_height"])[["predict_bb"]].agg(list).reset_index().rename(columns={'filename':'NAME'})
roi_image_predictions_wbf["predict_bbs"] = roi_image_predictions_wbf["predict_bb"].apply(lambda x: len(x))
roi_image_predictions_wbf = pd.merge(roi_image_predictions_wbf, test_image_pd, on="NAME", how="inner")
print("Too much predictions:", roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] > roi_image_predictions_wbf["trustii_bbs"]].shape)
print("Not enough predictions:", roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] < roi_image_predictions_wbf["trustii_bbs"]].shape)
print("Equal predictions:", roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] == roi_image_predictions_wbf["trustii_bbs"]].shape)
not_enough_predict = roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] < roi_image_predictions_wbf["trustii_bbs"]]["NAME"].unique()
# Keep top N predictions based on confidence.
roi_image_predictions_wbf["predict_bb"] = roi_image_predictions_wbf[["predict_bb", "trustii_bbs", "slide_width", "slide_height"]].apply(lambda x: fit_format(x[0], x[1], x[2], x[3], padding=True, padding_policy=PADDING_POLICY), axis=1)
roi_image_predictions_wbf["predict_bbs"] = roi_image_predictions_wbf["predict_bb"].apply(lambda x: len(x))
roi_image_predictions_wbf["predict_score_avg"] = roi_image_predictions_wbf["predict_bb"].apply(lambda x: compute_bbx_avg_score(x))
print("Too much predictions:", roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] > roi_image_predictions_wbf["trustii_bbs"]].shape)
print("Not enough predictions:", roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] < roi_image_predictions_wbf["trustii_bbs"]].shape)
print("Equal predictions:", roi_image_predictions_wbf[roi_image_predictions_wbf["predict_bbs"] == roi_image_predictions_wbf["trustii_bbs"]].shape)
roi_image_predictions_wbf.head()

(22689, 2)
Too much predictions: (4497, 7)
Not enough predictions: (10, 7)
Equal predictions: (16244, 7)




Too much predictions: (0, 8)
Not enough predictions: (0, 8)
Equal predictions: (20751, 8)
CPU times: user 1.27 s, sys: 9.65 ms, total: 1.28 s
Wall time: 1.28 s


| | NAME | slide_width | slide_height | predict_bb | predict_bbs | trustii_ids | trustii_bbs | predict_score_avg |
|---|---|---|---|---|---|---|---|---|
| 0 | 000455d4-8.jpg | 360 | 363 | [(93.0, 266.0, 98.0, 265.0, 0.0, 22.10820324456688, 0.9679403901100159)] | 1 | [23798] | 1 | 0.96794 |
| 1 | 0007ccec-2.jpg | 360 | 367 | [(91.0, 272.0, 82.0, 283.0, 0.0, 27.536330608537693, 0.9718216061592102)] | 1 | [22386] | 1 | 0.971822 |
| 2 | 00080027-c.jpg | 368 | 369 | [(111.0, 257.0, 107.0, 261.0, 0.0, 16.55767644632968, 0.9769502282142639)] | 1 | [59769] | 1 | 0.97695 |
| 3 | 00084489-e.jpg | 368 | 370 | [(118.0, 241.0, 115.0, 269.0, 0.0, 13.911574618096358, 0.8792518973350525)] | 1 | [61484] | 1 | 0.879252 |
| 4 | 000cfe84-e.jpg | 352 | 357 | [(100.0, 251.0, 91.0, 264.0, 0.0, 20.787974280621338, 0.9656392931938171)] | 1 | [36896] | 1 | 0.965639 |
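
The fitting step makes the number of predicted boxes per image match the number of boxes expected by the submission file. The sketch below illustrates the idea with a hypothetical helper, `fit_to_count`: keep the highest-scoring boxes when there are too many, and pad when there are too few. It is not the `fit_format` function used above, and padding with a box covering the full image is only an assumption about what the `"full"` padding policy means.

```python
def fit_to_count(boxes, n_expected, width, height):
    """Keep the n_expected highest-scoring boxes; pad with full-image boxes if too few.

    Boxes follow the tuple layout of `predict_bb`:
    (xtl, xbr, ytl, ybr, class, surface_ratio, score).
    """
    kept = sorted(boxes, key=lambda b: b[-1], reverse=True)[:n_expected]
    while len(kept) < n_expected:
        # Assumed "full" policy: fall back to a zero-score box covering the whole image.
        kept.append((0.0, float(width), 0.0, float(height), 0.0, 100.0, 0.0))
    return kept


# Image 000455d4-8.jpg: two fused boxes, but only one box is expected.
candidates = [
    (234.0, 325.0, 0.0, 7.0, 0.0, 0.487, 0.002),
    (93.0, 266.0, 98.0, 265.0, 0.0, 22.108, 0.968),
]
print(fit_to_count(candidates, 1, 360, 363))     # too many: keep the best one
print(len(fit_to_count(candidates, 3, 360, 363)))  # too few: pad up to three
```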


In [9]:
%%time
# To submission format
submission_pd = roi_image_predictions_wbf[["NAME", "slide_width", "slide_height", "predict_score_avg", "predict_bb", "trustii_ids"]].explode(['predict_bb', 'trustii_ids']).rename(columns={'trustii_ids':'trustii_id', 'slide_height':'img_height', 'slide_width': 'img_width'})
# predict_bb tuple layout: (xtl, xbr, ytl, ybr, class, surface_ratio, score)
submission_pd["pred_x1"] = submission_pd["predict_bb"].apply(lambda x: x[0]).astype(np.int32)
submission_pd["pred_y1"] = submission_pd["predict_bb"].apply(lambda x: x[2]).astype(np.int32)
submission_pd["pred_x2"] = submission_pd["predict_bb"].apply(lambda x: x[1]).astype(np.int32)
submission_pd["pred_y2"] = submission_pd["predict_bb"].apply(lambda x: x[3]).astype(np.int32)
submission_pd["pred_score"] = submission_pd["predict_bb"].apply(lambda x: x[-1])
submission_pd["pred_width"] = submission_pd["pred_x2"] - submission_pd["pred_x1"]
submission_pd["pred_height"] = submission_pd["pred_y2"] - submission_pd["pred_y1"]
submission_pd = submission_pd.reset_index(drop=True)

# Dump BBx in a folder
dump_boxes(submission_pd, TEST_HOME, CROP_HOME_TEST, margins=CROP_MARGINS)
submission_pd.to_parquet(CROP_FILE_TEST_CLEANED)
display(submission_pd)

# Debug only
# submission_csv_pd = submission_pd[["trustii_id", "NAME", "pred_x1", "pred_y1", "pred_x2", "pred_y2"]].copy().rename(columns={'pred_x1':'x1', 'pred_y1':'y1', 'pred_x2':'x2', 'pred_y2':'y2'})
# submission_csv_pd["class"] = "PNN"
# # Keep same order
# submission_csv_pd = pd.merge(test_pd, submission_csv_pd, on=["trustii_id", "NAME"], how="left")
# submission_csv_pd.to_csv(CROP_FILE_TEST_CLEANED.replace(".parquet", ".csv"), index=False)
# submission_csv_pd.head()

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22689/22689 [09:58<00:00, 37.89it/s]


| | NAME | img_width | img_height | predict_score_avg | predict_bb | trustii_id | pred_x1 | pred_y1 | pred_x2 | pred_y2 | pred_score | pred_width | pred_height | filename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 000455d4-8.jpg | 360 | 363 | 0.967940 | (93.0, 266.0, 98.0, 265.0, 0.0, 22.10820324456688, 0.9679403901100159) | 23798 | 93 | 98 | 266 | 265 | 0.967940 | 173 | 167 | 000455d4-8-23798.png |
| 1 | 0007ccec-2.jpg | 360 | 367 | 0.971822 | (91.0, 272.0, 82.0, 283.0, 0.0, 27.536330608537693, 0.9718216061592102) | 22386 | 91 | 82 | 272 | 283 | 0.971822 | 181 | 201 | 0007ccec-2-22386.png |
| 2 | 00080027-c.jpg | 368 | 369 | 0.976950 | (111.0, 257.0, 107.0, 261.0, 0.0, 16.55767644632968, 0.9769502282142639) | 59769 | 111 | 107 | 257 | 261 | 0.976950 | 146 | 154 | 00080027-c-59769.png |
| 3 | 00084489-e.jpg | 368 | 370 | 0.879252 | (118.0, 241.0, 115.0, 269.0, 0.0, 13.911574618096358, 0.8792518973350525) | 61484 | 118 | 115 | 241 | 269 | 0.879252 | 123 | 154 | 00084489-e-61484.png |
| 4 | 000cfe84-e.jpg | 352 | 357 | 0.965639 | (100.0, 251.0, 91.0, 264.0, 0.0, 20.787974280621338, 0.9656392931938171) | 36896 | 100 | 91 | 251 | 264 | 0.965639 | 151 | 173 | 000cfe84-e-36896.png |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 22684 | fffff491-3.jpg | 360 | 360 | 0.247647 | (132.0, 242.0, 126.0, 244.0, 0.0, 10.015432098765432, 0.9708890914916992) | 45140 | 132 | 126 | 242 | 244 | 0.970889 | 110 | 118 | fffff491-3-45140.png |
| 22685 | fffff491-3.jpg | 360 | 360 | 0.247647 | (0.0, 46.0, 0.0, 42.0, 0.0, 1.4907407407407407, 0.1545945703983307) | 16280 | 0 | 0 | 46 | 42 | 0.154595 | 46 | 42 | fffff491-3-16280.png |
| 22686 | fffff491-3.jpg | 360 | 360 | 0.247647 | (334.0, 359.0, 3.0, 94.0, 0.0, 1.7554012345679013, 0.05295475572347641) | 44984 | 334 | 3 | 359 | 94 | 0.052955 | 25 | 91 | fffff491-3-44984.png |
| 22687 | fffff491-3.jpg | 360 | 360 | 0.247647 | (277.0, 356.0, 63.0, 148.0, 0.0, 5.181327160493828, 0.037306852638721466) | 41305 | 277 | 63 | 356 | 148 | 0.037307 | 79 | 85 | fffff491-3-41305.png |


CPU times: user 3min 3s, sys: 17.1 s, total: 3min 20s
Wall time: 9min 59s
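
The crops written to disk are taken around each predicted box with a margin of `CROP_MARGINS` pixels. The sketch below shows one way to do this on a NumPy image array. `crop_with_margin` is a hypothetical helper, not the `dump_boxes` function used above, and clipping the enlarged box to the image borders is an assumption about how boxes near the edge are handled.

```python
import numpy as np


def crop_with_margin(image, x1, y1, x2, y2, margin=16):
    """Crop the box (x1, y1, x2, y2) enlarged by `margin` pixels, clipped to the image."""
    height, width = image.shape[:2]
    left, top = max(0, x1 - margin), max(0, y1 - margin)
    right, bottom = min(width, x2 + margin), min(height, y2 + margin)
    return image[top:bottom, left:right]


# Dummy image with the size of 000455d4-8.jpg (height 363, width 360, 3 channels).
image = np.zeros((363, 360, 3), dtype=np.uint8)

center_crop = crop_with_margin(image, 93, 98, 266, 265)  # box well inside the image
corner_crop = crop_with_margin(image, 0, 0, 46, 42)      # box touching the top-left corner
print(center_crop.shape, corner_crop.shape)
```

A box away from the borders grows by 32 pixels in each dimension, while a box at the corner only grows on the sides where image content exists.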


## Stage2: White blood cells classification
- Execute multi-class models (on detected bounding boxes)
- Execute multi-label models (on full images)
- Ensemble multi-class and multi-label models

In [10]:
# YoloX v3.1 - CV=0.9313, LB=0.93785
models_dict = {

    # Multiclasses models
    'cnn_and_transformers-bb-multiclass': {
        # CV=0.9182
        'root_dir_mc_512/0.25': [
            (f'{MODELS_HOME}/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42', None), # CV=0.9182, rd2, LB=0.9323 - 10min,
        ],
        # CV=0.9191 with Dino
        'root_dir_mc_224/0.25': [
            (f'{MODELS_HOME}/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42', [[hflip]]), # with background, CV=0.9037 LB(HFlip)=0.9254 LB(NoTTA)=0.9260 - 16min
            (f'{MODELS_HOME}/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42', [[hflip]]), # CV=0.9061 LB(HFlip)=0.9265 LB(NoTTA)=0.9269 - 16min
            (f'{MODELS_HOME}/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42', [[hflip]]),  # CV=0.9133, rd2, LB(HFlip)=0.9277 - 7min
        ],
        # CV=0.9165
        'root_dir_mc_384/0.25': [
            (f'{MODELS_HOME}/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42', [[hflip]]), # CV=0.9165 rd2, LB(HFlip)=0.9279 - 24min
        ],
        # CV=0.9181
        'root_dir_mc_tr_512/0.25': [
            (f'{MODELS_HOME}/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42', None), # CV=0.9181 rd2, LB=0.9291 - 8min
        ],          
    },    

    # Multilabels models
    'cnn_and_transformers-multilabel': {
        # CV=0.9279
        'root_dir_ml_512/1.0': [
            (f'{MODELS_HOME}/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.3.0-pl/stage3/seed42', [[hflip]]), # with background rd2, CV=0.9301/0.9279, HFlip - 20min            
        ]
    },
}

ALPHA = 0.457
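
The cell that combines both model families is not part of this section, so how `ALPHA` is applied is not shown here. A minimal sketch, assuming `ALPHA` is the weight of the multi-class predictions in a convex blend with the multi-label predictions, could look like the following. The function `blend` and the toy probabilities are hypothetical.

```python
import numpy as np

ALPHA = 0.457


def blend(probs_multiclass, probs_multilabel, alpha=ALPHA):
    """Convex combination of two (n_samples, n_classes) probability arrays.

    Assumption: alpha weights the multi-class models, (1 - alpha) the multi-label ones.
    """
    probs_multiclass = np.asarray(probs_multiclass, dtype=float)
    probs_multilabel = np.asarray(probs_multilabel, dtype=float)
    return alpha * probs_multiclass + (1.0 - alpha) * probs_multilabel


# Toy example with three classes where the two model families disagree.
p_mc = np.array([[0.7, 0.2, 0.1]])
p_ml = np.array([[0.1, 0.8, 0.1]])
blended = blend(p_mc, p_ml)
print(blended, blended.argmax(axis=1))
```

With a weight below 0.5, the multi-label prediction dominates in this toy case, which is consistent with the multi-label model having the higher individual CV score in the comments above.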

### Execute models

In [11]:
%%time
for name, info in models_dict.items():
    root_dirs = [c for c in info.keys() if c.startswith('root_dir')]
    for root_dir_name in root_dirs:
        root_dir = info.get(root_dir_name)        
        if isinstance(root_dir, list):
            for root_dir_, tta in root_dir:
                models_pl = {name: {"root_dir": root_dir_}}
                if '-bb-multiclass' in name:
                    test_pd = pd.read_parquet(CROP_FILE_TEST_CLEANED).reset_index(drop=True) # .head(512)
                    print("Executing multi-classes model:", models_pl, test_pd.shape, "TTA:", tta)
                    final_test_pd = infer_model(models_pl, test_pd, tta=tta, images_home=CROP_HOME_TEST)                
                    final_test_pd.to_parquet(os.path.join(root_dir_, "test_predictions_%s.parquet"%INFERENCE_NAME))
                    # Debug
                    logits_col = [c for c in final_test_pd.columns if "logits_" in c]
                    if len(logits_col) > 23:        
                        final_test_pd["preds"] = final_test_pd[logits_col[0:23]].values.argmax(axis=1).astype(np.int32)                    
                    submission_csv_pd = final_test_pd[["trustii_id", "NAME", "pred_x1", "pred_y1", "pred_x2", "pred_y2", "preds"]].copy().rename(columns={'pred_x1':'x1', 'pred_y1':'y1', 'pred_x2':'x2', 'pred_y2':'y2', 'preds':'class'})
                    submission_csv_pd["x1"] = submission_csv_pd["x1"].astype(np.int32)
                    submission_csv_pd["y1"] = submission_csv_pd["y1"].astype(np.int32)
                    submission_csv_pd["x2"] = submission_csv_pd["x2"].astype(np.int32)
                    submission_csv_pd["y2"] = submission_csv_pd["y2"].astype(np.int32)
                    submission_csv_pd["class"] = submission_csv_pd["class"].astype(np.int32)
                    submission_csv_pd["class"] = submission_csv_pd["class"].map(class_mapping)
                    submission_csv_pd = pd.merge(pd.read_csv(TEST_FILE), submission_csv_pd, on=["trustii_id", "NAME"], how="left")
                    submission_csv_pd.to_csv(os.path.join(root_dir_, "submission_%s.csv"%INFERENCE_NAME), index=False)
                elif '-multilabel' in name:
                    test_pd = pd.read_csv(TEST_FILE) # .head(512)
                    test_pd["filename"] = test_pd["NAME"]           
                    print("Executing multi-labels model:", models_pl, test_pd.shape, "TTA:", tta)
                    final_test_pd = infer_model(models_pl, test_pd, tta=tta, images_home=TEST_HOME)
                    final_test_pd.to_parquet(os.path.join(root_dir_, "test_predictions_%s.parquet"%INFERENCE_NAME))                     
        else:
            raise Exception("List expected: %s" % root_dir)
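
Each entry above is a set of fold checkpoints, optionally evaluated with a horizontal-flip test-time augmentation (TTA). The internals of `infer_model` are not shown in this section, so the sketch below only illustrates the general idea under stated assumptions: predictions are averaged over the fold models and over the original and flipped views. `predict_with_tta` and `toy_model` are hypothetical, the `hflip` defined here is a stand-in for the one imported by the notebook, and the flat `tta` tuple is simpler than the nested `[[hflip]]` lists used in `models_dict`.

```python
import numpy as np


def hflip(batch):
    """Flip a batch of images shaped (N, H, W, C) along the width axis."""
    return batch[:, :, ::-1, :]


def predict_with_tta(models, batch, tta=(hflip,)):
    """Average class probabilities over all models and over original + augmented views."""
    views = [batch] + [augment(batch) for augment in tta]
    outputs = [model(view) for model in models for view in views]
    return np.mean(outputs, axis=0)


def toy_model(batch):
    """Stand-in classifier: softmax over mean brightness of the left and right halves."""
    half = batch.shape[2] // 2
    left = batch[:, :, :half, :].mean(axis=(1, 2, 3))
    right = batch[:, :, half:, :].mean(axis=(1, 2, 3))
    logits = np.stack([left, right], axis=1)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
batch = rng.random((2, 4, 4, 3))  # two tiny 4x4 RGB images

print(toy_model(batch))                      # depends on left/right content
print(predict_with_tta([toy_model], batch))  # flip-averaged prediction
```

The toy model is deliberately sensitive to left/right orientation, so averaging the two views removes that dependence entirely. Real classifiers are only partly orientation-sensitive, which is why the leaderboard comments above report small differences between the flipped and non-flipped variants.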

Executing multi-classes model: {'cnn_and_transformers-bb-multiclass': {'root_dir': './models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42'}} (22689, 14) TTA: None
Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42/fold0/best_epoch=21-val_f1=0.9192.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42/fold1/best_epoch=19-val_f1=0.9202.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42/fold2/best_epoch=23-val_f1=0.9187.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42/fold3/best_epoch=22-val_f1=0.9150.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Duration: 596.0879874229431
Executing multi-classes model: {'cnn_and_transformers-bb-multiclass': {'root_dir': './models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42'}} (22689, 14) TTA: [[<function hflip at 0x7fae90bcb400>]]
Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42/fold0/best_epoch=28-val_f1=0.9114.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42/fold1/best_epoch=28-val_f1=0.9116.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42/fold2/best_epoch=30-val_f1=0.9037.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42/fold3/best_epoch=31-val_f1=0.9040.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Duration: 1004.2678382396698
Executing multi-classes model: {'cnn_and_transformers-bb-multiclass': {'root_dir': './models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42'}} (22689, 14) TTA: [[<function hflip at 0x7fae90bcb400>]]
Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42/fold0/best_epoch=28-val_f1=0.9027.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42/fold1/best_epoch=31-val_f1=0.9135.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42/fold2/best_epoch=28-val_f1=0.9042.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Loading: ./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42/fold3/best_epoch=31-val_f1=0.9037.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: |                                                                                                 …

Duration: 1005.8970401287079
Executing multi-classes model: {'cnn_and_transformers-bb-multiclass': {'root_dir': './models/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42'}} (22689, 14) TTA: [[<function hflip at 0x7fae90bcb400>]]
Loading: ./models/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42/fold0/best_epoch=28-val_f1=0.9137.ckpt


Using cache found in /home/mpware/.cache/torch/hub/facebookresearch_dinov2_main


Loading (dinov2_vitb14) weights: DinoBloom-B.pth


  pretrained = torch.load(modelpath, map_location=torch.device('cpu'))


Freezing model, dinov2_vitb14/DinoBloom-B.pth device: cuda
Freezing full model
Unfreezing -5 blocks, last 5 over 12
Override prepare stage


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42/fold1/best_epoch=30-val_f1=0.9170.ckpt


Using cache found in /home/mpware/.cache/torch/hub/facebookresearch_dinov2_main


Loading (dinov2_vitb14) weights: DinoBloom-B.pth
Freezing model, dinov2_vitb14/DinoBloom-B.pth device: cuda
Freezing full model
Unfreezing -5 blocks, last 5 over 12
Override prepare stage


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42/fold2/best_epoch=29-val_f1=0.9129.ckpt


Using cache found in /home/mpware/.cache/torch/hub/facebookresearch_dinov2_main


Loading (dinov2_vitb14) weights: DinoBloom-B.pth
Freezing model, dinov2_vitb14/DinoBloom-B.pth device: cuda
Freezing full model
Unfreezing -5 blocks, last 5 over 12
Override prepare stage


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42/fold3/best_epoch=28-val_f1=0.9095.ckpt


Using cache found in /home/mpware/.cache/torch/hub/facebookresearch_dinov2_main


Loading (dinov2_vitb14) weights: DinoBloom-B.pth
Freezing model, dinov2_vitb14/DinoBloom-B.pth device: cuda
Freezing full model
Unfreezing -5 blocks, last 5 over 12
Override prepare stage


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Duration: 432.63022470474243
Executing multi-classes model: {'cnn_and_transformers-bb-multiclass': {'root_dir': './models/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42'}} (22689, 14) TTA: [[<function hflip at 0x7fae90bcb400>]]
Loading: ./models/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42/fold0/best_epoch=23-val_f1=0.9154.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42/fold1/best_epoch=22-val_f1=0.9217.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42/fold2/best_epoch=21-val_f1=0.9153.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42/fold3/best_epoch=23-val_f1=0.9138.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Duration: 1347.8589000701904
Executing multi-classes model: {'cnn_and_transformers-bb-multiclass': {'root_dir': './models/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42'}} (22689, 14) TTA: None
Loading: ./models/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42/fold0/best_epoch=35-val_f1=0.9192.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42/fold1/best_epoch=31-val_f1=0.9235.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42/fold2/best_epoch=32-val_f1=0.9174.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42/fold3/best_epoch=34-val_f1=0.9127.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Duration: 633.6524655818939
Executing multi-labels model: {'cnn_and_transformers-multilabel': {'root_dir': './models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.3.0-pl/stage3/seed42'}} (22689, 3) TTA: [[<function hflip at 0x7fae90bcb400>]]
Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.3.0-pl/stage3/seed42/fold0/best_epoch=22-val_f1=0.9300.ckpt


  model_dump = torch.load(best_weights, map_location='cpu')
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.3.0-pl/stage3/seed42/fold1/best_epoch=21-val_f1=0.9343.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.3.0-pl/stage3/seed42/fold2/best_epoch=22-val_f1=0.9302.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Loading: ./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.3.0-pl/stage3/seed42/fold3/best_epoch=23-val_f1=0.9289.ckpt


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Ensemble of 1 model(s), tta=[[<function hflip at 0x7fae90bcb400>]]


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Duration: 1158.1916420459747
CPU times: user 1h 34min 56s, sys: 10min 49s, total: 1h 45min 45s
Wall time: 1h 43min
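
The challenge allows less than 500 ms per image, which for the 20751 test images gives the overall budget of roughly 2h52min quoted in the introduction. The arithmetic can be checked with a minimal sketch like the one below. The helper `format_hms` and the constant names are illustrative and are not part of the notebook, and the 1h 43min figure is only the wall time reported for the cell above, not necessarily the full pipeline.

```python
# Hypothetical helper and constants, used only to check the timing budget.
N_IMAGES = 20751          # number of test images (from the introduction)
LIMIT_S_PER_IMAGE = 0.5   # challenge requirement: under 500 ms per image


def format_hms(seconds):
    """Format a duration in seconds as 'XhYYminZZs'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return "%dh%02dmin%02ds" % (hours, minutes, secs)


# Total time allowed if every image uses its full 500 ms
budget_s = N_IMAGES * LIMIT_S_PER_IMAGE

# Wall time reported by the classification cell above: 1h 43min
cell_wall_s = 1 * 3600 + 43 * 60

# Average time spent per image by that cell
per_image_ms = 1000 * cell_wall_s / N_IMAGES

print("Budget:", format_hms(budget_s))
print("Cell wall time:", format_hms(cell_wall_s))
print("Average per image: %.1f ms" % per_image_ms)
```

Under these assumptions the classification cell averages just under 300 ms per image, which leaves headroom for the other stages within the 500 ms limit.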


### Ensemble models

In [12]:
# Convert logits to probabilities and average them
def ensemble_multiclass_multilabels(multiclass_single_wbc_pd, multilabels_pd, weights=(0.5, 0.5)):
    # Move from logits to probabilities (softmax because of multiclasses) - Need to be followed by argmax
    logits_col = [c for c in multiclass_single_wbc_pd.columns if "logits_" in c][0:23]
    sprobs_col = [c.replace("logits_", "sprobs_") for c in multiclass_single_wbc_pd.columns if "logits_" in c][0:23]
    multiclass_single_wbc_pd[sprobs_col] = torch.softmax(torch.from_numpy(multiclass_single_wbc_pd[logits_col].values), dim=1).numpy()
    sprobs_pd = multiclass_single_wbc_pd[["NAME"] + sprobs_col]
    # display(sprobs_pd)

    # Move from logits to probabilities (sigmoid because of multilabels) - Need to be followed by threshold
    logits_col = [c for c in multilabels_pd.columns if "logits_" in c][0:23]
    mprobs_col = [c.replace("logits_", "mprobs_") for c in multilabels_pd.columns if "logits_" in c][0:23]
    multilabels_pd[mprobs_col] = torch.sigmoid(torch.from_numpy(multilabels_pd[logits_col].values)).numpy()
    mprobs_pd = multilabels_pd[["NAME"] + mprobs_col]
    # display(mprobs_pd)

    # Merge both on NAME and ensemble with average
    probs_pd = pd.merge(sprobs_pd, mprobs_pd, on="NAME", how="inner")
    # print(probs_pd.shape)
    ensemble_probs = [probs_pd[sprobs_col].values, probs_pd[mprobs_col].values]
    # ensemble_probs = [probs_pd[mprobs_col].values]
    print(probs_pd[sprobs_col].values.shape, probs_pd[mprobs_col].values.shape)
    # print("Ensemble of:", np.stack(ensemble_probs).shape, weights)
    # ensemble_probs = np.nanmean(np.stack(ensemble_probs), axis=0)
    ensemble_probs = np.average(np.stack(ensemble_probs), axis=0, weights=weights)
    
    # print(ensemble_probs.shape)
    probs_col = [c.replace("sprobs_", "eprobs_") for c in probs_pd.columns if "sprobs_" in c]
    probs_pd[probs_col] = ensemble_probs
    # display(probs_pd[["NAME"] + probs_col])

    return probs_pd
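
The function above blends two kinds of probabilities: softmax outputs from the box-level multiclass models and sigmoid outputs from the image-level multilabel models. A minimal sketch of that blending on toy logits follows. It uses NumPy helpers instead of `torch.softmax` and `torch.sigmoid` so that it runs without PyTorch, and the toy logits, the three-class setup, and `alpha = 0.5` are assumptions made for illustration only.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def sigmoid(logits):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-logits))


# Toy logits for 2 images and 3 classes (illustrative values only)
mc_logits = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.9]])     # multiclass head
ml_logits = np.array([[3.0, -3.0, -3.0], [-2.0, -1.0, 2.0]])  # multilabel head

alpha = 0.5  # assumed weight of the multiclass probabilities
stacked = np.stack([softmax(mc_logits), sigmoid(ml_logits)])
ensemble_probs = np.average(stacked, axis=0, weights=[alpha, 1 - alpha])

mc_only_preds = softmax(mc_logits).argmax(axis=1)
ensemble_preds = ensemble_probs.argmax(axis=1)
print(mc_only_preds, ensemble_preds)
```

In this toy case the second image switches class once the multilabel probabilities are blended in, which is the kind of change the notebook later reports as "Changes argmax".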

In [13]:
%%time
thr = 0.5
best_thrs = None
for name, info in models_dict.items():
    root_dirs = [c for c in info.keys() if c.startswith('root_dir')]
    root_weights = [float(x.split("/")[-1]) for x in root_dirs]
    root_dirs_ensemble_logits = []
    for root_dir_name in root_dirs:
        root_dir = info.get(root_dir_name)
        if isinstance(root_dir, list):
            ensemble_logits = []
            for root_dir_, _ in root_dir:
                test_pd_ = pd.read_parquet(os.path.join(root_dir_, "test_predictions_%s.parquet"%INFERENCE_NAME))
                logits_col = [c for c in test_pd_.columns if "logits_" in c][0:23]
                ensemble_logits.append(test_pd_[logits_col].values)
            ensemble_logits = np.nanmean(np.stack(ensemble_logits), axis=0)
            # Ensemble OOF with new logits and predictions
            test_pd = test_pd_.copy()
            test_pd[logits_col] = ensemble_logits
            if LABEL in name:
                test_pd[[c for c in test_pd.columns if "preds_" in c][0:23]] = (torch.sigmoid(torch.from_numpy(ensemble_logits)).numpy() > thr).astype(int)
            else:
                test_pd["preds"] = ensemble_logits.argmax(1)
        else:
            raise Exception("List expected: %s" % root_dir)
        root_dirs_ensemble_logits.append(test_pd[logits_col].values)
        print("Loading:", root_dir_name, root_dir, test_pd.shape)
    
    # root_dirs_ensemble_logits = np.nanmean(np.stack(root_dirs_ensemble_logits), axis=0)
    root_dirs_ensemble_logits = np.average(np.stack(root_dirs_ensemble_logits), axis=0, weights=root_weights)
    test_pd[logits_col] = root_dirs_ensemble_logits 
    if LABEL in name:
        test_pd[[c for c in test_pd.columns if "preds_" in c][0:23]] = (torch.sigmoid(torch.from_numpy(root_dirs_ensemble_logits)).numpy() > thr).astype(int)
    else:
        test_pd["preds"] = test_pd[logits_col].values.argmax(1)
            
    if LABEL in name:
        test_pd = test_pd.drop(columns=["trustii_id"]).groupby(["NAME","filename"]).first().reset_index()
    
    print("Ensemble(x%d):"%len(root_dirs), name, test_pd.shape, "Weights:", root_weights)
    # display(test_pd.head())
    info["test_pd"] = test_pd

Loading: root_dir_mc_512/0.25 [('./models/TRUSTII-RGB/timm_tf_efficientnetv2_m.in21k_512_None_v1.4.0-pl-crop-m16/stage2/seed42', None)] (22689, 38)
Loading: root_dir_mc_224/0.25 [('./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.5-pl-crop-m16/stage1/seed42', [[<function hflip at 0x7fae90bcb400>]]), ('./models/TRUSTII-RGB/timm_vit_large_patch16_224.augreg_in21k_ft_in1k_224_None_v1.2.4-pl-crop-m16/stage1/seed42', [[<function hflip at 0x7fae90bcb400>]]), ('./models/TRUSTII-RGB/foundation_dinov2_vitb14_DinoBloom-B.pth_224_None_v1.4.0.6-pl-crop-m16/stage1/seed42', [[<function hflip at 0x7fae90bcb400>]])] (22689, 38)
Loading: root_dir_mc_384/0.25 [('./models/TRUSTII-RGB/timm_nextvit_large.bd_ssld_6m_in1k_384_384_None_v1.4.0-pl-crop-m16/stage2/seed42', [[<function hflip at 0x7fae90bcb400>]])] (22689, 38)
Loading: root_dir_mc_tr_512/0.25 [('./models/TRUSTII-RGB/timm_tiny_vit_21m_512.dist_in22k_ft_in1k_512_None_v1.5.0-pl-crop-m16/stage1/seed42', None)] (22689,
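
The cell above averages logits in two steps: an unweighted mean inside each `root_dir_*` group, then a weighted mean across groups, with each group's weight read from the suffix after the last `/` in its key (for example `root_dir_mc_512/0.25`). A minimal sketch of that scheme on toy arrays is shown below. The group names, weights, and logits are made up for illustration, and the helper `weight_from_key` is hypothetical.

```python
import numpy as np


def weight_from_key(key):
    """Read the ensemble weight encoded after the last '/' in a group key."""
    return float(key.split("/")[-1])


# Toy groups: each key maps to a list of per-model logits of shape (boxes, classes)
groups = {
    "root_dir_a/0.75": [
        np.array([[1.0, 0.0], [0.0, 2.0]]),
        np.array([[3.0, 0.0], [0.0, 4.0]]),
    ],
    "root_dir_b/0.25": [
        np.array([[0.0, 8.0], [4.0, 0.0]]),
    ],
}

# Step 1: unweighted mean of the models inside each group
group_logits = [np.nanmean(np.stack(members), axis=0) for members in groups.values()]

# Step 2: weighted mean across groups, using the weights parsed from the keys
weights = [weight_from_key(key) for key in groups]
ensemble_logits = np.average(np.stack(group_logits), axis=0, weights=weights)

# Multiclass prediction: index of the largest ensembled logit per box
preds = ensemble_logits.argmax(axis=1)
print(weights, preds)
```

`np.average` normalizes by the sum of the weights, so the weights do not have to sum to one.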

In [14]:
%%time
multiclasses_key = "cnn_and_transformers-bb-multiclass"
multilabels_key = "cnn_and_transformers-multilabel"
multilabels_pd_ = models_dict[multilabels_key]["test_pd"] # Image level
multiclass_pd_ = models_dict[multiclasses_key]["test_pd"] # Box level
multiclass_pd_["wbc"] = multiclass_pd_.groupby(["NAME"])["trustii_id"].transform('count')
multiclass_single_wbc_pd_ = multiclass_pd_[multiclass_pd_["wbc"] == 1].reset_index(drop=True)
print("Images with single WBC", multiclass_single_wbc_pd_.shape, "%.2f%%"%(multiclass_single_wbc_pd_.shape[0]*100/multiclass_pd_.shape[0]))
# Run ensemble
ensemble_probs_pd = ensemble_multiclass_multilabels(multiclass_single_wbc_pd_, multilabels_pd_, weights=[ALPHA, 1-ALPHA])
eprobs_col = [c for c in ensemble_probs_pd.columns if "eprobs_" in c]
multiclass_single_wbc_ensemble_pd = pd.merge(multiclass_single_wbc_pd_, ensemble_probs_pd[["NAME"] + eprobs_col], on="NAME", how="inner")
mpreds = (multiclass_single_wbc_ensemble_pd[eprobs_col].values > thr).astype(int)
print("Threshold %.2f predictions has %d WBC with multilabels" % (thr, np.sum(np.sum(mpreds, axis=1) > 1)))
multiclass_single_wbc_ensemble_pd["preds_ensemble_argmax"] = multiclass_single_wbc_ensemble_pd[eprobs_col].values.argmax(axis=1)
multiclass_single_wbc_ensemble_pd["preds"] = multiclass_single_wbc_ensemble_pd["preds"].astype(np.int32)
changes_argmax_pd = multiclass_single_wbc_ensemble_pd[multiclass_single_wbc_ensemble_pd["preds"] != multiclass_single_wbc_ensemble_pd["preds_ensemble_argmax"]][["NAME", "preds", "preds_ensemble_argmax"]]
print("Changes argmax (%d): %.2f%%" % (changes_argmax_pd.shape[0], changes_argmax_pd.shape[0]*100/multiclass_single_wbc_ensemble_pd.shape[0]))

if best_thrs is not None:
    # Apply best threshold per label
    for binary_class in all_classes:
        thr_ = best_thrs[binary_class]
        mpreds = (multiclass_single_wbc_ensemble_pd["eprobs_%d"%binary_class].values > thr_).astype(int)
        multiclass_single_wbc_ensemble_pd["epreds_%d"%binary_class] = mpreds
    multiclass_single_wbc_ensemble_pd["preds_ensemble_thr"] = multiclass_single_wbc_ensemble_pd[["epreds_%d"%c for c in all_classes]].values.argmax(axis=1)
    for binary_class in all_classes:
        del multiclass_single_wbc_ensemble_pd["epreds_%d"%binary_class]    
    changes_thr_pd = multiclass_single_wbc_ensemble_pd[multiclass_single_wbc_ensemble_pd["preds"] != multiclass_single_wbc_ensemble_pd["preds_ensemble_thr"]][["NAME", "preds", "preds_ensemble_thr"]]
    print("Changes thr (%d): %.2f%%" % (changes_thr_pd.shape[0], changes_thr_pd.shape[0]*100/multiclass_single_wbc_ensemble_pd.shape[0]))

preds_ensembles_cols = [c for c in multiclass_single_wbc_ensemble_pd.columns if "preds_ensemble_" in c]
multiclass_single_wbc_ensemble_pd[["NAME", "preds"] + preds_ensembles_cols]

Images with single WBC (19373, 39) 85.38%
(19373, 23) (19373, 23)
Threshold 0.50 predictions has 4 WBC with multilabels
Changes argmax (251): 1.30%
CPU times: user 89.4 ms, sys: 2.7 ms, total: 92.1 ms
Wall time: 66.7 ms


| | NAME | preds | preds_ensemble_argmax |
|---|---|---|---|
| 0 | 000455d4-8.jpg | 19 | 19 |
| 1 | 0007ccec-2.jpg | 18 | 18 |
| 2 | 00080027-c.jpg | 0 | 0 |
| 3 | 00084489-e.jpg | 1 | 1 |
| 4 | 000cfe84-e.jpg | 3 | 3 |
| ... | ... | ... | ... |
| 19368 | ffef4aae-c.jpg | 12 | 12 |
| 19369 | fff4d913-4.jpg | 8 | 8 |
| 19370 | fff50857-a.jpg | 0 | 0 |
| 19371 | fff97dfe-0.jpg | 15 | 15 |


In [15]:
%%time
# Merge back to get final predictions
ensemble_multiclass_pd = pd.merge(multiclass_pd_, multiclass_single_wbc_ensemble_pd[["NAME"] + preds_ensembles_cols], on='NAME', how='left')
ensemble_multiclass_pd.loc[ensemble_multiclass_pd["wbc"] > 1, "preds_ensemble_argmax"] = ensemble_multiclass_pd["preds"]
ensemble_multiclass_pd["preds_ensemble_argmax"] = ensemble_multiclass_pd["preds_ensemble_argmax"].astype('Int32')
if 'preds_ensemble_thr' in ensemble_multiclass_pd.columns:
    ensemble_multiclass_pd.loc[ensemble_multiclass_pd["wbc"] > 1, "preds_ensemble_thr"] = ensemble_multiclass_pd["preds"]
    ensemble_multiclass_pd["preds_ensemble_thr"] = ensemble_multiclass_pd["preds_ensemble_thr"].astype('Int32')
ensemble_multiclass_pd[["trustii_id", "NAME", "pred_x1", "pred_y1", "pred_x2", "pred_y2"] + preds_ensembles_cols]

CPU times: user 15.8 ms, sys: 644 µs, total: 16.5 ms
Wall time: 16.6 ms


| | trustii_id | NAME | pred_x1 | pred_y1 | pred_x2 | pred_y2 | preds_ensemble_argmax |
|---|---|---|---|---|---|---|---|
| 0 | 23798 | 000455d4-8.jpg | 93 | 98 | 266 | 265 | 19 |
| 1 | 22386 | 0007ccec-2.jpg | 91 | 82 | 272 | 283 | 18 |
| 2 | 59769 | 00080027-c.jpg | 111 | 107 | 257 | 261 | 0 |
| 3 | 61484 | 00084489-e.jpg | 118 | 115 | 241 | 269 | 1 |
| 4 | 36896 | 000cfe84-e.jpg | 100 | 91 | 251 | 264 | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 22684 | 45140 | fffff491-3.jpg | 132 | 126 | 242 | 244 | 11 |
| 22685 | 16280 | fffff491-3.jpg | 0 | 0 | 46 | 42 | 11 |
| 22686 | 44984 | fffff491-3.jpg | 334 | 3 | 359 | 94 | 11 |
| 22687 | 41305 | fffff491-3.jpg | 277 | 63 | 356 | 148 | 7 |
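
Cells 14 and 15 only apply the image-level ensemble to images that contain a single detected white blood cell, and keep the box-level prediction for every other box. A minimal sketch of that selection and merge-back on a toy DataFrame is given below. The data values are invented, and `fillna` is used here as a compact stand-in for the notebook's `.loc[wbc > 1, ...]` assignment.

```python
import pandas as pd

# Toy box-level predictions: image "b.jpg" holds two boxes (illustrative data)
boxes = pd.DataFrame({
    "trustii_id": [1, 2, 3, 4],
    "NAME": ["a.jpg", "b.jpg", "b.jpg", "c.jpg"],
    "preds": [5, 7, 2, 9],
})

# Number of boxes per image, broadcast back onto every box row
boxes["wbc"] = boxes.groupby("NAME")["trustii_id"].transform("count")

# Ensembled predictions exist only for single-WBC images (assumed values)
single = pd.DataFrame({
    "NAME": ["a.jpg", "c.jpg"],
    "preds_ensemble_argmax": [6, 9],
})

# A left merge keeps every box; multi-WBC images get NaN in the ensemble column
merged = boxes.merge(single, on="NAME", how="left")

# Fall back to the box-level prediction where no ensemble value is available
merged["preds_ensemble_argmax"] = (
    merged["preds_ensemble_argmax"].fillna(merged["preds"]).astype("Int32")
)
print(merged[["trustii_id", "NAME", "wbc", "preds", "preds_ensemble_argmax"]])
```

The nullable `Int32` dtype matches the notebook's cast and keeps the column integer-typed even though the left merge temporarily introduces missing values.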


In [16]:
%%time
# Submission file (argmax)
submission_csv_pd = ensemble_multiclass_pd[["trustii_id", "NAME", "pred_x1", "pred_y1", "pred_x2", "pred_y2", "preds_ensemble_argmax"]].copy().rename(columns={'pred_x1':'x1', 'pred_y1':'y1', 'pred_x2':'x2', 'pred_y2':'y2', 'preds_ensemble_argmax':'class'})
submission_csv_pd["class"] = submission_csv_pd["class"].astype('Int32')
submission_csv_pd["class"] = submission_csv_pd["class"].map(class_mapping)
# Keep same order
submission_csv_pd = pd.merge(pd.read_csv(TEST_FILE), submission_csv_pd, on=["trustii_id", "NAME"], how="left")
folder = os.path.join("submissions", INFERENCE_NAME, "_".join(models_dict.keys()))
os.makedirs(folder, exist_ok=True)
submission_csv_pd.to_csv(os.path.join(folder, "submission_argmax.csv"), index=False)
display(submission_csv_pd)

| | trustii_id | NAME | x1 | y1 | x2 | y2 | class |
|---|---|---|---|---|---|---|---|
| 0 | 43232 | 681daf42-3.jpg | 116 | 116 | 248 | 242 | LLC |
| 1 | 65979 | 172bf8a5-e.jpg | 100 | 96 | 229 | 269 | Lysee |
| 2 | 60083 | 179a21ee-4.jpg | 102 | 94 | 273 | 278 | PM |
| 3 | 7302 | f15e265d-6.jpg | 101 | 99 | 261 | 267 | M |
| 4 | 31846 | 94cbe9cc-3.jpg | 118 | 112 | 244 | 254 | PNN |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 22684 | 13601 | 7d1b6d4a-3.jpg | 10 | 156 | 88 | 233 | LF |
| 22685 | 62911 | 33b1749a-b.jpg | 2 | 1 | 89 | 34 | LyB |
| 22686 | 16139 | d11c35af-a.jpg | 41 | 145 | 171 | 284 | MoB |
| 22687 | 47444 | d0fb06c0-3.jpg | 102 | 112 | 262 | 249 | B |


CPU times: user 47.5 ms, sys: 2.38 ms, total: 49.9 ms
Wall time: 75.5 ms
