# VinBigData detectron2 prediction


**Following on from the training kernel [VinBigData detectron2 train](https://www.kaggle.com/corochann/vinbigdata-detectron2-train), this kernel runs prediction with the trained `detectron2` model.**

`detectron2` is one of the best-known PyTorch object detection libraries. Here I introduce how to use it to predict bounding boxes with the trained model.

 - https://github.com/facebookresearch/detectron2

> Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. It is a ground-up rewrite of the previous version, Detectron, and it originates from maskrcnn-benchmark.
![](https://user-images.githubusercontent.com/1381301/66535560-d3422200-eace-11e9-9123-5535d469db19.png)


## Version history

2021/1/22: Updated to add the 2-class filter introduced in [VinBigData 🌟2 Class Filter🌟](https://www.kaggle.com/awsaf49/vinbigdata-2-class-filter) by @awsaf49. <br/>
I also wrote a kernel to train the 2-class model: [📸VinBigData 2-class classifier complete pipeline](https://www.kaggle.com/corochann/vinbigdata-2-class-classifier-complete-pipeline)

2021/2/6: Updated trained model [vinbigdata-alb-aug-512-cos](https://www.kaggle.com/corochann/vinbigdata-alb-aug-512-cos).<br/>
Updated the prediction kernel to align with the training kernel [VinBigData detectron2 train](https://www.kaggle.com/corochann/vinbigdata-detectron2-train), which uses customized augmentation.<br/>
Applied the 2-class filter from [📸VinBigData 2-class classifier complete pipeline](https://www.kaggle.com/corochann/vinbigdata-2-class-classifier-complete-pipeline).

# Table of Contents

**[Prediction method implementations](#pred_method)** <br/>
**[Prediction scripts](#pred_scripts)** <br/>
**[Apply 2 class filter](#2class)** <br/>
**[Other kernels](#ref)** <br/>

Since the initial setup is the same as in the training kernel, I have left it out of the ToC.

# Dataset preparation

Preprocessing of the X-ray images from DICOM into regular PNG format has already been done by @xhlulu in the discussion below:
 - [Multiple preprocessed datasets: 256/512/1024px, PNG and JPG, modified and original ratio](https://www.kaggle.com/c/vinbigdata-chest-xray-abnormalities-detection/discussion/207955).

Here I will just use the dataset [VinBigData Chest X-ray Resized PNG (256x256)](https://www.kaggle.com/xhlulu/vinbigdata-chest-xray-resized-png-256x256) to skip the preprocessing and focus on the modeling part. Please upvote the dataset as well!

In [2]:
import gc
import os
from pathlib import Path
import random
import sys

from tqdm.notebook import tqdm
import numpy as np
import pandas as pd
import scipy as sp


import matplotlib.pyplot as plt
import seaborn as sns

from IPython.display import display, HTML

# --- plotly ---
from plotly import tools, subplots
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.express as px
import plotly.figure_factory as ff
import plotly.io as pio
pio.templates.default = "plotly_dark"

# --- models ---
from sklearn import preprocessing
from sklearn.model_selection import KFold
import lightgbm as lgb
import xgboost as xgb
import catboost as cb





# Installation

detectron2 is not pre-installed in this Kaggle Docker image, so let's install it.
Following the [installation instructions](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md), we need to know the CUDA and PyTorch versions in order to install a matching `detectron2` build.

In [3]:
!nvidia-smi

Fri May 24 14:19:57 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   44C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla T4                       Off | 00000000:00:05.0 Off |  

In [4]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0


In [5]:
import torch

torch.__version__

'2.1.2'

The outputs above show that this Kaggle Docker image uses CUDA 12.1 (as reported by `nvcc`; the driver reports 12.2) and torch==2.1.2.

See [installation](https://detectron2.readthedocs.io/tutorials/install.html) for details.

<a id="pred_method"></a>
# Prediction method implementations

Basically, we don't need to implement the neural network itself: `detectron2` already implements well-known architectures and provides pre-trained weights for them, which we can fine-tune.

These models are summarized in [MODEL_ZOO.md](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md).

This competition requires an object detection model; I chose [R50-FPN](https://github.com/facebookresearch/detectron2/blob/master/configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml) for this kernel.

## Data preparation

`detectron2` provides a high-level API for training on custom datasets.

To define a custom dataset, we need to create a **list of dicts** where each dict contains the following:

 - file_name: path to the image file.
 - image_id: id of the image; the image ID string from the CSV is used here.
 - height: height of the image.
 - width: width of the image.
 - annotations: the ground truth annotation data for object detection (train dataset only), given as a list with one dict per bounding box, each containing the following:
     - bbox: bounding box pixel location as four values
     - bbox_mode: `BoxMode.XYXY_ABS` is used here, meaning that the `bbox` holds absolute (xmin, ymin, xmax, ymax) values.
     - category_id: class label id of the bounding box

`get_vinbigdata_dicts` is for train dataset preparation and `get_vinbigdata_dicts_test` is for test dataset preparation.
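As a rough illustration of the record layout described above, the sketch below builds one record by hand and rescales a box from the original image size to the resized PNG. The helper `scale_bbox`, the file name, and all numbers are made up for illustration. `bbox_mode` is a plain string placeholder here so the snippet runs without `detectron2`; the real code stores `BoxMode.XYXY_ABS`.

```python
from typing import Dict, List


def scale_bbox(bbox: List[float], orig_h: int, orig_w: int,
               new_h: int, new_w: int) -> List[float]:
    """Rescale an (xmin, ymin, xmax, ymax) box from the original size to the resized one."""
    w_ratio = new_w / orig_w
    h_ratio = new_h / orig_h
    xmin, ymin, xmax, ymax = bbox
    return [xmin * w_ratio, ymin * h_ratio, xmax * w_ratio, ymax * h_ratio]


# One hypothetical record for a 2048x2048 original image resized to 1024x1024.
record: Dict = {
    "file_name": "example.png",  # hypothetical path
    "image_id": "example",       # hypothetical id
    "height": 1024,
    "width": 1024,
    "annotations": [
        {
            "bbox": scale_bbox([500, 1000, 1500, 2000],
                               orig_h=2048, orig_w=2048, new_h=1024, new_w=1024),
            "bbox_mode": "XYXY_ABS",  # placeholder for BoxMode.XYXY_ABS
            "category_id": 3,
        }
    ],
}
print(record["annotations"][0]["bbox"])  # [250.0, 500.0, 750.0, 1000.0]
```

Test records have the same shape but carry no `annotations` key.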

In [7]:
import pickle
from pathlib import Path
from typing import Optional

import cv2
import numpy as np
import pandas as pd
from detectron2.structures import BoxMode
from tqdm import tqdm


def get_vinbigdata_dicts(
    imgdir: Path,
    train_df: pd.DataFrame,
    train_data_type: str = "original",
    use_cache: bool = True,
    debug: bool = True,
    target_indices: Optional[np.ndarray] = None,
    use_class14: bool = False,
):
    debug_str = f"_debug{int(debug)}"
    train_data_type_str = f"_{train_data_type}"
    class14_str = f"_14class{int(use_class14)}"
    cache_path = Path(".") / f"dataset_dicts_cache{train_data_type_str}{class14_str}{debug_str}.pkl"
    if not use_cache or not cache_path.exists():
        print("Creating data...")
        train_meta = pd.read_csv("/kaggle/input/amia-public-challenge-2024/img_size.csv")
        df_train = pd.read_csv("/kaggle/input/amia-public-challenge-2024/train.csv")
        
        
        
        train_meta = train_meta[train_meta['image_id'].isin(df_train['image_id'])]
        
        
        if debug:
            train_meta = train_meta.iloc[:500]  # For debug....

        # Load 1 image to get image size.
        image_id = "00JgsY3R0C6VQrT7VDFcoqW2J7dOfULr"
        image_path = str(imgdir / f"{image_id}.png")
        image = cv2.imread(image_path)
        resized_height, resized_width, ch = image.shape
        print(f"image shape: {image.shape}")

        dataset_dicts = []
        for index, train_meta_row in tqdm(train_meta.iterrows(), total=len(train_meta)):
            record = {}

            image_id, height, width = train_meta_row.values
            filename = str(imgdir /  f"{image_id}.png")
            record["file_name"] = filename
            record["image_id"] = image_id
            record["height"] = resized_height
            record["width"] = resized_width
            objs = []
            for index2, row in train_df.query("image_id == @image_id").iterrows():
                # print(row)
                # print(row["class_name"])
                # class_name = row["class_name"]
                class_id = row["class_id"]
                if class_id == 14:
                    # It is "No finding"
                    if use_class14:
                        # Use this No finding class with the bbox covering all image area.
                        bbox_resized = [0, 0, resized_width, resized_height]
                        obj = {
                            "bbox": bbox_resized,
                            "bbox_mode": BoxMode.XYXY_ABS,
                            "category_id": class_id,
                        }
                        objs.append(obj)
                    else:
                        # This annotator does not find anything, skip.
                        pass
                else:
                    # bbox_original = [int(row["x_min"]), int(row["y_min"]), int(row["x_max"]), int(row["y_max"])]
                    h_ratio = resized_height / height
                    w_ratio = resized_width / width
                    bbox_resized = [
                        float(row["x_min"]) * w_ratio,
                        float(row["y_min"]) * h_ratio,
                        float(row["x_max"]) * w_ratio,
                        float(row["y_max"]) * h_ratio,
                    ]
                    obj = {
                        "bbox": bbox_resized,
                        "bbox_mode": BoxMode.XYXY_ABS,
                        "category_id": class_id,
                    }
                    objs.append(obj)
            record["annotations"] = objs
            dataset_dicts.append(record)
        with open(cache_path, mode="wb") as f:
            pickle.dump(dataset_dicts, f)

    print(f"Load from cache {cache_path}")
    with open(cache_path, mode="rb") as f:
        dataset_dicts = pickle.load(f)
    if target_indices is not None:
        dataset_dicts = [dataset_dicts[i] for i in target_indices]
    return dataset_dicts


def get_vinbigdata_dicts_test(
    imgdir: Path, test_meta: pd.DataFrame, use_cache: bool = True, debug: bool = True,
):
    debug_str = f"_debug{int(debug)}"
    cache_path = Path(".") / f"dataset_dicts_cache_test{debug_str}.pkl"
    if not use_cache or not cache_path.exists():
        print("Creating data...")
        # test_meta = pd.read_csv(imgdir / "test_meta.csv")
        if debug:
            test_meta = test_meta.iloc[:500]  # For debug....

        # Load 1 image to get image size.
        image_id = "00JgsY3R0C6VQrT7VDFcoqW2J7dOfULr"
        image_path = "/kaggle/input/amia-public-challenge-2024/train/train/00JgsY3R0C6VQrT7VDFcoqW2J7dOfULr.png"
        image = cv2.imread(image_path)
        resized_height, resized_width, ch = image.shape
        print(f"image shape: {image.shape}")

        dataset_dicts = []
        for index, test_meta_row in tqdm(test_meta.iterrows(), total=len(test_meta)):
            record = {}

            image_id, height, width = test_meta_row.values
            filename = str(imgdir / "test" / f"{image_id}.png")
            record["file_name"] = filename
            # record["image_id"] = index
            record["image_id"] = image_id
            record["height"] = resized_height
            record["width"] = resized_width
            # objs = []
            # record["annotations"] = objs
            dataset_dicts.append(record)
        with open(cache_path, mode="wb") as f:
            pickle.dump(dataset_dicts, f)

    print(f"Load from cache {cache_path}")
    with open(cache_path, mode="rb") as f:
        dataset_dicts = pickle.load(f)
    return dataset_dicts


The following cell defines the helper methods used for prediction in this competition.

In [8]:
# Methods for prediction for this competition
from math import ceil
from typing import Any, Dict, List

import cv2
import detectron2
import numpy as np
from numpy import ndarray
import pandas as pd
import torch
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.structures import BoxMode
from detectron2.utils.logger import setup_logger
from detectron2.utils.visualizer import ColorMode, Visualizer
from tqdm import tqdm


def format_pred(labels: ndarray, boxes: ndarray, scores: ndarray) -> str:
    pred_strings = []
    for label, score, bbox in zip(labels, scores, boxes):
        xmin, ymin, xmax, ymax = bbox.astype(np.int64)
        pred_strings.append(f"{label} {score} {xmin} {ymin} {xmax} {ymax}")
    return " ".join(pred_strings)


def predict_batch(predictor: DefaultPredictor, im_list: List[ndarray]) -> List:
    with torch.no_grad():  # https://github.com/sphinx-doc/sphinx/issues/4258
        inputs_list = []
        for original_image in im_list:
            # Apply pre-processing to image.
            if predictor.input_format == "RGB":
                # whether the model expects BGR inputs or RGB
                original_image = original_image[:, :, ::-1]
            height, width = original_image.shape[:2]
            # Do not apply original augmentation, which is resize.
            # image = predictor.aug.get_transform(original_image).apply_image(original_image)
            image = original_image
            image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
            inputs = {"image": image, "height": height, "width": width}
            inputs_list.append(inputs)
        predictions = predictor.model(inputs_list)
        return predictions
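To make the submission format concrete, here is a minimal pure-Python sketch of how a `PredictionString` is assembled as repeated `class score xmin ymin xmax ymax` groups. The function name `to_prediction_string` is hypothetical and is not part of the notebook. Unlike `format_pred`, it also folds in the "No finding" fallback that the prediction loop applies when nothing is detected.

```python
from typing import List, Sequence

# Fallback string the notebook uses when no box is detected.
NO_FINDING = "14 1.0 0 0 1 1"


def to_prediction_string(labels: Sequence[int], scores: Sequence[float],
                         boxes: Sequence[Sequence[float]]) -> str:
    """Join detections into 'class score xmin ymin xmax ymax' groups."""
    parts: List[str] = []
    for label, score, (xmin, ymin, xmax, ymax) in zip(labels, scores, boxes):
        # Box coordinates are truncated to integers, scores are kept as floats.
        parts.append(f"{label} {score} {int(xmin)} {int(ymin)} {int(xmax)} {int(ymax)}")
    return " ".join(parts) if parts else NO_FINDING


print(to_prediction_string([0, 3], [0.9, 0.5],
                           [[10.7, 20.2, 110.0, 220.9], [5, 6, 7, 8]]))
# 0 0.9 10 20 110 220 3 0.5 5 6 7 8
print(to_prediction_string([], [], []))
# 14 1.0 0 0 1 1
```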

In [9]:
# --- utils ---
from pathlib import Path
from typing import Any, Union

import yaml


def save_yaml(filepath: Union[str, Path], content: Any, width: int = 120):
    with open(filepath, "w") as f:
        yaml.dump(content, f, width=width)


def load_yaml(filepath: Union[str, Path]) -> Any:
    with open(filepath, "r") as f:
        content = yaml.full_load(f)
    return content


In [10]:
# --- configs ---
thing_classes = [
    "Aortic enlargement",
    "Atelectasis",
    "Calcification",
    "Cardiomegaly",
    "Consolidation",
    "ILD",
    "Infiltration",
    "Lung Opacity",
    "Nodule/Mass",
    "Other lesion",
    "Pleural effusion",
    "Pleural thickening",
    "Pneumothorax",
    "Pulmonary fibrosis"
]
category_name_to_id = {class_name: index for index, class_name in enumerate(thing_classes)}


This `Flags` class is to manage experiments. I will tune these parameters through the competition to improve model's performance.

In [11]:
# --- flags ---
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Flags:
    # General
    debug: bool = True
    outdir: str = "results/det"

    # Data config
    imgdir_name: str = "/kaggle/input/amia-public-challenge-2024/test/test"
    split_mode: str = "all_train"  # all_train or valid20
    seed: int = 111
    train_data_type: str = "original"  # original or wbf
    use_class14: bool = False
    # Training config
    iter: int = 10000
    ims_per_batch: int = 2  # images per batch, this corresponds to "total batch size"
    num_workers: int = 4
    lr_scheduler_name: str = "WarmupMultiStepLR"  # WarmupMultiStepLR (default) or WarmupCosineLR
    base_lr: float = 0.00025
    roi_batch_size_per_image: int = 512
    eval_period: int = 10000
    aug_kwargs: Dict = field(default_factory=lambda: {})

    def update(self, param_dict: Dict) -> "Flags":
        # Overwrite by `param_dict`
        for key, value in param_dict.items():
            if not hasattr(self, key):
                raise ValueError(f"[ERROR] Unexpected key for flag = {key}")
            setattr(self, key, value)
        return self
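A short usage sketch of the `update` pattern follows. `MiniFlags` is a cut-down, hypothetical stand-in for `Flags` with only two fields, so that the snippet is self-contained. It shows that known keys are overwritten while an unknown key (for example a typo in a YAML file) raises immediately instead of being silently ignored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MiniFlags:
    # Hypothetical two-field stand-in for the notebook's Flags class.
    debug: bool = True
    base_lr: float = 0.00025

    def update(self, param_dict: Dict) -> "MiniFlags":
        # Overwrite fields from `param_dict`, rejecting unknown keys.
        for key, value in param_dict.items():
            if not hasattr(self, key):
                raise ValueError(f"[ERROR] Unexpected key for flag = {key}")
            setattr(self, key, value)
        return self


flags_demo = MiniFlags().update({"debug": False, "base_lr": 0.001})
print(flags_demo)  # MiniFlags(debug=False, base_lr=0.001)

try:
    MiniFlags().update({"learning_rate": 0.001})  # not a field of MiniFlags
except ValueError as err:
    print(err)  # [ERROR] Unexpected key for flag = learning_rate
```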

<a id="pred_scripts"></a>
# Prediction scripts

Now the methods are ready. The main prediction script starts here.

In [12]:
inputdir = Path("/kaggle/input")
traineddir = inputdir / "tunned-model-v1"

# flags = Flags()
flags: Flags = Flags().update(load_yaml(str(traineddir/"tunned_flags.yaml")))
flags.imgdir_name = "amia-public-challenge-2024/test/"
print("flags", flags)
debug = flags.debug
# flags_dict = dataclasses.asdict(flags)
outdir = Path(flags.outdir)
os.makedirs(str(outdir), exist_ok=True)


# --- Read data ---
datadir = inputdir / "amia-public-challenge-2024"
imgdir = inputdir / flags.imgdir_name

# Read in the data CSV files
# train = pd.read_csv(datadir / "train.csv")
test_meta = pd.read_csv("/kaggle/input/amia-public-challenge-2024/img_size.csv")
df_train = pd.read_csv("/kaggle/input/amia-public-challenge-2024/test.csv")
test_meta = test_meta[test_meta['image_id'].isin(df_train['image_id'])]
sample_submission = pd.read_csv(datadir / "sample_submission.csv")

flags Flags(debug=False, outdir='results/v9', imgdir_name='amia-public-challenge-2024/test/', split_mode='valid20', seed=111, train_data_type='original', use_class14=False, iter=679, ims_per_batch=14, num_workers=4, lr_scheduler_name='WarmupCosineLR', base_lr=0.0023230241019715457, roi_batch_size_per_image=472, eval_period=1000, aug_kwargs={'HorizontalFlip': {'p': 0.5}, 'RandomBrightnessContrast': {'p': 0.5}, 'ShiftScaleRotate': {'p': 0.5, 'rotate_limit': 10, 'scale_limit': 0.15}})


In [13]:
cfg = get_cfg()
original_output_dir = cfg.OUTPUT_DIR
cfg.OUTPUT_DIR = str(outdir)
print(f"cfg.OUTPUT_DIR {original_output_dir} -> {cfg.OUTPUT_DIR}")

cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("vinbigdata_train",)
cfg.DATASETS.TEST = ()
# cfg.DATASETS.TEST = ("vinbigdata_train",)
# cfg.TEST.EVAL_PERIOD = 50
cfg.DATALOADER.NUM_WORKERS = 2
# Let training initialize from model zoo
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = flags.base_lr  # pick a good LR
cfg.SOLVER.MAX_ITER = flags.iter
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = flags.roi_batch_size_per_image
cfg.MODEL.ROI_HEADS.NUM_CLASSES = len(thing_classes)
# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.

### --- Inference & Evaluation ---
# Inference should use the config with parameters that are used in training
# cfg now already contains everything we've set previously. We changed it a little bit for inference:
# path to the model we just trained
cfg.MODEL.WEIGHTS = str(traineddir/"tunning_v1.pth")
print("Original thresh", cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST)  # 0.05
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.15  # set a custom testing threshold
print("Changed  thresh", cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST)
predictor = DefaultPredictor(cfg)

DatasetCatalog.register(
    "vinbigdata_test", lambda: get_vinbigdata_dicts_test(imgdir, test_meta, debug=debug)
)
MetadataCatalog.get("vinbigdata_test").set(thing_classes=thing_classes)
metadata = MetadataCatalog.get("vinbigdata_test")
dataset_dicts = get_vinbigdata_dicts_test(imgdir, test_meta, debug=debug)

if debug:
    dataset_dicts = dataset_dicts[:100]

results_list = []
index = 0
batch_size = 4

for i in tqdm(range(ceil(len(dataset_dicts) / batch_size))):
    inds = list(range(batch_size * i, min(batch_size * (i + 1), len(dataset_dicts))))
    dataset_dicts_batch = [dataset_dicts[i] for i in inds]
    im_list = [cv2.imread(d["file_name"]) for d in dataset_dicts_batch]
    outputs_list = predict_batch(predictor, im_list)

    for im, outputs, d in zip(im_list, outputs_list, dataset_dicts_batch):
        resized_height, resized_width, ch = im.shape
        # outputs = predictor(im)
        if index < 5:
            # format is documented at https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
            v = Visualizer(
                im[:, :, ::-1],
                metadata=metadata,
                scale=0.5,
                instance_mode=ColorMode.IMAGE_BW
                # remove the colors of unsegmented pixels. This option is only available for segmentation models
            )
            out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
            # cv2_imshow(out.get_image()[:, :, ::-1])
            cv2.imwrite(str(outdir / f"pred_{index}.jpg"), out.get_image()[:, :, ::-1])

        image_id, dim0, dim1 = test_meta.iloc[index].values

        instances = outputs["instances"]
        if len(instances) == 0:
            # No finding, let's set "14 1.0 0 0 1 1".
            result = {"image_id": image_id, "PredictionString": "14 1.0 0 0 1 1"}
        else:
            # Find some bbox...
            # print(f"index={index}, find {len(instances)} bbox.")
            fields: Dict[str, Any] = instances.get_fields()
            pred_classes = fields["pred_classes"]  # (n_boxes,)
            pred_scores = fields["scores"]
            # shape (n_boxes, 4). (xmin, ymin, xmax, ymax)
            pred_boxes = fields["pred_boxes"].tensor

            h_ratio = dim0 / resized_height
            w_ratio = dim1 / resized_width
            pred_boxes[:, [0, 2]] *= w_ratio
            pred_boxes[:, [1, 3]] *= h_ratio

            pred_classes_array = pred_classes.cpu().numpy()
            pred_boxes_array = pred_boxes.cpu().numpy()
            pred_scores_array = pred_scores.cpu().numpy()

            result = {
                "image_id": image_id,
                "PredictionString": format_pred(
                    pred_classes_array, pred_boxes_array, pred_scores_array
                ),
            }
        results_list.append(result)
        index += 1

cfg.OUTPUT_DIR ./output -> results/v9
Original thresh 0.05
Changed  thresh 0.15
Creating data...
image shape: (1024, 1024, 3)


100%|██████████| 6427/6427 [00:00<00:00, 14672.09it/s]


Load from cache dataset_dicts_cache_test_debug0.pkl



torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /usr/local/src/pytorch/aten/src/ATen/native/TensorShape.cpp:3526.)

100%|██████████| 1607/1607 [15:24<00:00,  1.74it/s]


Note that the code above sets `cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.15`, so boxes with a confidence score below 0.15 are dropped. Setting it to `0.0` instead would output **all the detection box predictions, even those with a very low confidence score**.<br/>
This threshold affects the score a lot, since the competition metric is AP (Average Precision), which is calculated over boxes across the whole confidence range of 0~100%.
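To illustrate how the score threshold changes the number of boxes that reach the submission, here is a small pure-Python sketch. The detections and the helper `filter_by_score` are invented for illustration and do not come from the model above.

```python
from typing import List, Tuple

Detection = Tuple[int, float]  # (class_id, confidence score)


def filter_by_score(detections: List[Detection], thresh: float) -> List[Detection]:
    """Keep only detections whose confidence is strictly above `thresh`."""
    return [d for d in detections if d[1] > thresh]


# Hypothetical detections for a single image.
detections = [(0, 0.92), (3, 0.40), (7, 0.12), (11, 0.06), (13, 0.01)]

for thresh in (0.15, 0.05, 0.0):
    kept = filter_by_score(detections, thresh)
    print(f"thresh={thresh}: {len(kept)} boxes kept")
# thresh=0.15: 2 boxes kept
# thresh=0.05: 4 boxes kept
# thresh=0.0: 5 boxes kept
```

A lower threshold keeps more low-confidence boxes, which can only add points at the low-confidence end of the precision-recall curve, so it is worth comparing several values on a validation split.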

In [17]:
# This submission includes only detection model's predictions
submission_det = pd.DataFrame(results_list, columns=['image_id', 'PredictionString'])
#df = df.rename(columns={'image_id': 'ID', 'PredictionString': 'TARGET'})
submission_det.to_csv("/kaggle/working/submission.csv", index=False)
submission_det

| | image_id | PredictionString |
|---|---|---|
| 0 | j8ucb4pF6s210AWzYbWtWDjkHDfEIvqh | 14 1.0 0 0 1 1 |
| 1 | llXsB94LSKfiFDTmVqQ8qK5dYtQEyJdN | 14 1.0 0 0 1 1 |
| 2 | Lvh5LHHZvSPb9N6LVXv1Ez8NMf1XlTki | 14 1.0 0 0 1 1 |
| 3 | 4OPuciU1DBQZJA7kWMFPeCZ58q2bI7w7 | 14 1.0 0 0 1 1 |
| 4 | ZyaXVXMe31nPYUV81ZzE0fe2VAFnhkJu | 14 1.0 0 0 1 1 |
| ... | ... | ... |
| 6422 | iHqtVNSRodeKd1bbFUyjLQnoXpRvzVs8 | 14 1.0 0 0 1 1 |
| 6423 | URqMbGEvFkUabTn3fWS84sUJw7OPbeFF | 14 1.0 0 0 1 1 |
| 6424 | 89jErvxxE1rxBKnDoAYD12exyELelZ1A | 14 1.0 0 0 1 1 |
| 6425 | pSMAnMlp97Jc2sGxIeQJU81jysvRUDn8 | 14 1.0 0 0 1 1 |


# Model tuning

In [13]:
!pip install optuna



In [None]:
import os
import json
import torch
import shutil
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog, DatasetCatalog, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2 import model_zoo
import optuna
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.data.datasets import register_coco_instances
from detectron2.data import MetadataCatalog, DatasetCatalog


import json
import pandas as pd
import numpy as np
import os
from PIL import Image

def convert_csv_to_coco_json(csv_path, img_dir_path, output_json_path):
    df = pd.read_csv(csv_path)

    images = []
    annotations = []
    categories = [{"id": i, "name": name} for i, name in enumerate([
        "Aortic enlargement", "Atelectasis", "Calcification", "Cardiomegaly", "Consolidation", "ILD", "Infiltration",
        "Lung Opacity", "Nodule/Mass", "Other lesion", "Pleural effusion", "Pleural thickening", "Pneumothorax",
        "Pulmonary fibrosis", "No finding"
    ])]

    image_ids = df['image_id'].unique()
    image_id_map = {image_id: idx for idx, image_id in enumerate(image_ids)}

    for image_id in image_ids:
        img_path = os.path.join(img_dir_path, f"{image_id}.png")
        with Image.open(img_path) as img:
            width, height = img.size
        images.append({
            "id": int(image_id_map[image_id]),
            "file_name": f"{image_id}.png",
            "width": int(width),
            "height": int(height)
        })

    for idx, row in df.iterrows():
        annotations.append({
            "id": int(idx),
            "image_id": int(image_id_map[row['image_id']]),
            "category_id": int(row['class_id']),
            "bbox": [float(row['x_min']), float(row['y_min']), float(row['x_max'] - row['x_min']), float(row['y_max'] - row['y_min'])],
            "area": float((row['x_max'] - row['x_min']) * (row['y_max'] - row['y_min'])),
            "iscrowd": 0
        })

    coco_format = {
        "images": images,
        "annotations": annotations,
        "categories": categories
    }

    with open(output_json_path, 'w') as f:
        json.dump(coco_format, f)


# Paths to your files
train_csv_path = "/kaggle/input/amia-public-challenge-2024/train.csv"
train_image_dir = "/kaggle/input/amia-public-challenge-2024/train/train"
test_csv_path = "/kaggle/input/amia-public-challenge-2024/test.csv"
test_image_dir = "/kaggle/input/amia-public-challenge-2024/test/test"
train_json_path = "/kaggle/working/train_annotations.json"
test_json_path = "/kaggle/working/test_annotations.json"

# Convert CSV to COCO JSON
convert_csv_to_coco_json(train_csv_path, train_image_dir, train_json_path)
convert_csv_to_coco_json(test_csv_path, test_image_dir, test_json_path)

# Register the datasets
# Use the JSON paths defined above (train_json_path / test_json_path)
register_coco_instances("lung_train", {}, train_json_path, train_image_dir)
register_coco_instances("lung_val", {}, test_json_path, test_image_dir)

# Define the objective function for Optuna
def objective(trial):
    # Configuration settings
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
    cfg.DATASETS.TRAIN = ("lung_train",)
    cfg.DATASETS.TEST = ("lung_val",)  # Validation set
    cfg.DATALOADER.NUM_WORKERS = trial.suggest_int('num_workers', 2, 8)  # Modify num_workers here
    
    # Hyperparameters to tune
    # suggest_float(..., log=True) replaces the deprecated suggest_loguniform
    cfg.SOLVER.BASE_LR = trial.suggest_float('base_lr', 1e-6, 1e-2, log=True)
    cfg.SOLVER.IMS_PER_BATCH = trial.suggest_int('ims_per_batch', 1, 16)
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = trial.suggest_int('batch_size_per_image', 128, 512)
    cfg.SOLVER.MAX_ITER = trial.suggest_int('max_iter', 100, 1000)
    
    # New parameters
    cfg.SOLVER.CHECKPOINT_PERIOD = trial.suggest_int('eval_period', 500, 2000)  # Checkpoint saving interval
    cfg.SOLVER.LR_SCHEDULER_NAME = trial.suggest_categorical('lr_scheduler_name', ['WarmupCosineLR', 'WarmupMultiStepLR'])
    cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS = trial.suggest_categorical('use_class14', [True, False])
    
    # Other configurations
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 14 if not cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS else 15  # 14 classes or 15 if including "No finding"
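    cfg.OUTPUT_DIR = "./output"
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)  # make sure the output directory exists
    
    # Trainer
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
    
    # Save the model; a save_dir is needed for the checkpoint to be written to disk
    checkpointer = DetectionCheckpointer(trainer.model, save_dir=cfg.OUTPUT_DIR)
    checkpointer.save("model_final")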
    
    # Evaluator
    evaluator = COCOEvaluator("lung_val", cfg, False, output_dir="./output/")
    val_loader = build_detection_test_loader(cfg, "lung_val")
    metrics = inference_on_dataset(trainer.model, val_loader, evaluator)
    
    return metrics["bbox"]["AP"]  # Use the Average Precision metric

# Create and optimize the study
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

# Save the best trial's hyperparameters
best_trial = study.best_trial
best_params = best_trial.params
torch.save(best_params, "tunned_params_v2.pth")

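# Save the model to a file.
# `checkpointer` is local to `objective`, so copy the checkpoint written to ./output instead.
# Note: this file comes from the most recent trial, which is not necessarily the best one.
final_model_path = "/kaggle/working/model_tunning_v2.pth"
shutil.copy("./output/model_final.pth", final_model_path)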




In [None]:
import os
import json
import torch
import shutil
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog, DatasetCatalog, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2 import model_zoo
import optuna
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.data.datasets import register_coco_instances
from detectron2.data import MetadataCatalog, DatasetCatalog

import json
import pandas as pd
import numpy as np
from PIL import Image

def convert_csv_to_coco_json(csv_path, img_dir_path, output_json_path):
    df = pd.read_csv(csv_path)
    
    # Debugging step to print columns
    print("Columns in CSV:", df.columns)

    images = []
    annotations = []
    categories = [{"id": i, "name": name} for i, name in enumerate([
        "Aortic enlargement", "Atelectasis", "Calcification", "Cardiomegaly", "Consolidation", "ILD", "Infiltration",
        "Lung Opacity", "Nodule/Mass", "Other lesion", "Pleural effusion", "Pleural thickening", "Pneumothorax",
        "Pulmonary fibrosis", "No finding"
    ])]

    image_ids = df['image_id'].unique()
    image_id_map = {image_id: idx for idx, image_id in enumerate(image_ids)}

    for image_id in image_ids:
        img_path = os.path.join(img_dir_path, f"{image_id}.png")
        with Image.open(img_path) as img:
            width, height = img.size
        images.append({
            "id": int(image_id_map[image_id]),
            "file_name": f"{image_id}.png",
            "width": int(width),
            "height": int(height)
        })

    for idx, row in df.iterrows():
        annotations.append({
            "id": int(idx),
            "image_id": int(image_id_map[row['image_id']]),
            "category_id": int(row['class_id']),
            "bbox": [float(row['x_min']), float(row['y_min']), float(row['x_max'] - row['x_min']), float(row['y_max'] - row['y_min'])],
            "area": float((row['x_max'] - row['x_min']) * (row['y_max'] - row['y_min'])),
            "iscrowd": 0
        })

    coco_format = {
        "images": images,
        "annotations": annotations,
        "categories": categories
    }

    with open(output_json_path, 'w') as f:
        json.dump(coco_format, f)

# Paths to your files
train_csv_path = "/kaggle/input/amia-public-challenge-2024/train.csv"
train_image_dir = "/kaggle/input/amia-public-challenge-2024/train/train"
test_csv_path = "/kaggle/input/amia-public-challenge-2024/test.csv"
test_image_dir = "/kaggle/input/amia-public-challenge-2024/test/test"
train_json_path = "/kaggle/working/train_annotations.json"
test_json_path = "/kaggle/working/test_annotations.json"

# Convert CSV to COCO JSON
convert_csv_to_coco_json(train_csv_path, train_image_dir, train_json_path)
convert_csv_to_coco_json(test_csv_path, test_image_dir, test_json_path)

# Register the datasets
register_coco_instances("lung_train", {}, train_json_path, train_image_dir)
register_coco_instances("lung_val", {}, test_json_path, test_image_dir)

# Define the objective function for Optuna
def objective(trial):
    # Configuration settings
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
    cfg.DATASETS.TRAIN = ("lung_train",)
    cfg.DATASETS.TEST = ()  # No evaluation during training; "lung_val" is evaluated explicitly below
    cfg.DATALOADER.NUM_WORKERS = trial.suggest_int('num_workers', 2, 8)  # Modify num_workers here
    
    # Hyperparameters to tune
    # suggest_float(..., log=True) replaces the deprecated suggest_loguniform
    cfg.SOLVER.BASE_LR = trial.suggest_float('base_lr', 1e-6, 1e-2, log=True)
    cfg.SOLVER.IMS_PER_BATCH = trial.suggest_int('ims_per_batch', 1, 16)
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = trial.suggest_int('batch_size_per_image', 128, 512)
    cfg.SOLVER.MAX_ITER = trial.suggest_int('max_iter', 100, 1000)
    
    # New parameters
    cfg.SOLVER.CHECKPOINT_PERIOD = trial.suggest_int('eval_period', 500, 2000)  # Checkpoint saving interval
    cfg.SOLVER.LR_SCHEDULER_NAME = trial.suggest_categorical('lr_scheduler_name', ['WarmupCosineLR', 'WarmupMultiStepLR'])
    cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS = trial.suggest_categorical('use_class14', [True, False])
    
    # Other configurations
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 14 if not cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS else 15  # 14 classes or 15 if including "No finding"
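    cfg.OUTPUT_DIR = "./output"
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)  # make sure the output directory exists
    
    # Trainer
    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()
    
    # Save the model; a save_dir is needed for the checkpoint to be written to disk
    checkpointer = DetectionCheckpointer(trainer.model, save_dir=cfg.OUTPUT_DIR)
    checkpointer.save("model_final")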
    
    # Evaluator
    evaluator = COCOEvaluator("lung_val", cfg, False, output_dir="./output/")
    val_loader = build_detection_test_loader(cfg, "lung_val")
    metrics = inference_on_dataset(trainer.model, val_loader, evaluator)
    
    return metrics["bbox"]["AP"]  # Use the Average Precision metric

# Create and optimize the study
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

# Save the best trial's hyperparameters
best_trial = study.best_trial
best_params = best_trial.params
torch.save(best_params, "tunned_params_v2.pth")

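# Save the model to a file.
# `checkpointer` is local to `objective`, so copy the checkpoint written to ./output instead.
# Note: this file comes from the most recent trial, which is not necessarily the best one.
final_model_path = "/kaggle/working/model_tunning_v2.pth"
shutil.copy("./output/model_final.pth", final_model_path)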
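
The box conversion inside `convert_csv_to_coco_json` can be sketched in isolation as follows. COCO stores boxes as `[x, y, width, height]`, while the CSV holds corner coordinates. The helper names are hypothetical. The NaN guard is an assumption on my side: in the original VinBigData CSV, "No finding" rows carry no coordinates, and this dataset may or may not follow the same convention.

```python
import math
from typing import List, Optional


def xyxy_to_coco_xywh(x_min: float, y_min: float,
                      x_max: float, y_max: float) -> Optional[List[float]]:
    """Convert corner coordinates to COCO [x, y, width, height].

    Returns None when any coordinate is NaN (assumed to mean "no box").
    """
    coords = [float(x_min), float(y_min), float(x_max), float(y_max)]
    if any(math.isnan(c) for c in coords):
        return None
    x0, y0, x1, y1 = coords
    return [x0, y0, x1 - x0, y1 - y0]


def bbox_area(xywh: List[float]) -> float:
    """Area of a COCO-style box."""
    return xywh[2] * xywh[3]


box = xyxy_to_coco_xywh(100, 200, 400, 600)
print(box)             # [100.0, 200.0, 300.0, 400.0]
print(bbox_area(box))  # 120000.0
print(xyxy_to_coco_xywh(float("nan"), 0, 1, 1))  # None
```

Rows for which the helper returns `None` could be skipped when building the `annotations` list, so that images without findings are registered with no boxes.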


In [None]:
# Print the best hyperparameters
print("Best hyperparameters: ", study.best_params)
print("Best AP: ", study.best_value)


In [19]:
# Train the final model using the best hyperparameters
best_params = study.best_params
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.SOLVER.BASE_LR = best_params['base_lr']
cfg.SOLVER.IMS_PER_BATCH = best_params['ims_per_batch']
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = best_params['batch_size_per_image']
cfg.SOLVER.MAX_ITER = best_params['max_iter']

cfg.DATASETS.TRAIN = ("lung_train3",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 15

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()