## Bounding Box Fusion
In this competition, each image in the training set comes with bounding boxes annotated independently by 3 radiologists. We will be scored on our predictions for the test set, which was annotated by the **consensus** of **5** radiologists. From the paper:
> Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists.

It may be safe to assume that annotations from a consensus of 5 radiologists would be similar in number and shape to the annotations of only 1 radiologist. If so, it may make sense to transform the annotations in the training set to something approximating the output of only 1 radiologist.


### Implications:
If each radiologist in the training set...
* had an identical # of annotations with equal bounding boxes -> we could use the annotations from 1 radiologist for each image
* had an identical # of annotations but slightly different bounding boxes -> we could merge the bounding boxes for each annotation
* had a differing # of annotations, and thus different bounding boxes -> what to do?

The last scenario above is the one we have. There are many options, most of which come down to how to merge bounding boxes based on how much they overlap and/or how to assign a weight/confidence to each bounding box (e.g. we could assign 33% confidence to each annotation, which would boost overlapping boxes). Or, if we want to get fancy, we could start normalizing radiologists' annotations (perhaps one radiologist sees "Nodules" everywhere but others do not).
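As a rough illustration of the confidence idea, the sketch below computes the IoU of two boxes and scores each box by the fraction of radiologists who drew a sufficiently overlapping box. The function names, the toy boxes, and the use of 0.55 as the agreement threshold are assumptions made for illustration; the sketch also assumes each radiologist draws at most one box per finding.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def agreement_confidence(box, all_boxes, n_readers=3, iou_thr=0.55):
    """Fraction of readers with a box overlapping `box` above iou_thr.

    `box` itself is expected to be in `all_boxes`, so the minimum is 1/n_readers.
    """
    votes = sum(1 for other in all_boxes if iou(box, other) > iou_thr)
    return min(votes, n_readers) / n_readers


# Hypothetical same-class boxes from three readers: r1 and r2 agree, r3 stands alone
r1 = (100, 100, 200, 200)
r2 = (110, 105, 205, 210)
r3 = (400, 400, 450, 450)
boxes = [r1, r2, r3]

for b in boxes:
    print(b, round(agreement_confidence(b, boxes), 2))
```

Here `r1` and `r2` each get 2/3 because they back each other up, while `r3` keeps the baseline 1/3.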

Additionally, as noted in our [discussion](https://www.kaggle.com/c/vinbigdata-chest-xray-abnormalities-detection/discussion/211035), it seems prudent to assume that only 1 aortic enlargement and only 1 cardiomegaly can appear in any image (we only have one aorta and one heart, barring a very rare condition). The aortic enlargement annotations seem to be uniformly located around the aortic arch.

I have written code below to combine annotations so that there is only 1 aortic enlargement and 1 cardiomegaly in each image. The code also merges same-class bounding boxes whose intersection over union (IoU) is > 0.55.
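To make the merging rule concrete, here is a simplified, pure-Python stand-in for the fusion step. It is not the Weighted-Boxes-Fusion implementation: it greedily groups same-class boxes whose IoU with a cluster's running average exceeds a threshold and then averages their coordinates, which is roughly what a score-weighted average reduces to when every box has the same score. `fuse_boxes` and the toy annotations are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(boxes, labels, iou_thr=0.55):
    """Greedily cluster same-label boxes by IoU and average each cluster.

    Returns a list of (label, [x_min, y_min, x_max, y_max]) tuples.
    """
    clusters = []  # each cluster: {"label", "members", "mean"}
    for box, label in zip(boxes, labels):
        best, best_iou = None, iou_thr
        for c in clusters:
            if c["label"] != label:
                continue
            overlap = iou(c["mean"], box)
            if overlap > best_iou:  # strictly greater than the threshold
                best, best_iou = c, overlap
        if best is None:
            clusters.append({"label": label, "members": [box], "mean": list(box)})
        else:
            best["members"].append(box)
            n = len(best["members"])
            best["mean"] = [sum(m[i] for m in best["members"]) / n for i in range(4)]
    return [(c["label"], c["mean"]) for c in clusters]


# Toy example: three readers mark cardiomegaly (class 3), two mark separate nodules (class 8)
boxes = [
    (100, 300, 500, 600), (110, 310, 490, 590), (90, 290, 510, 610),
    (50, 50, 80, 80), (200, 60, 230, 90),
]
labels = [3, 3, 3, 8, 8]
fused = fuse_boxes(boxes, labels)
for label, box in fused:
    print(label, box)
```

The three cardiomegaly boxes collapse into one averaged box, while the two nodule boxes stay apart because they do not overlap. In this sketch, even `iou_thr=0` only merges boxes that overlap at least a little, so disjoint boxes of the same class remain separate.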

Let me know what you think!

Inspiration and code from:

[@backtracking](https://www.kaggle.com/backtracking): https://www.kaggle.com/backtracking/bounding-boxes-optimization

Weighted-Boxes-Fusion: https://github.com/ZFTurbo/Weighted-Boxes-Fusion

https://www.kaggle.com/raddar/convert-dicom-to-np-array-the-correct-way

In [None]:
!pip install -U iterative-stratification
!pip install -U datatable
!pip install -U ensemble-boxes
import copy
import warnings
import numpy as np
import pandas as pd
import datatable as dt
import pydicom
from pydicom.pixel_data_handlers.util import apply_voi_lut
from random import randint
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image
from ensemble_boxes import weighted_boxes_fusion

In [None]:
# Load data
vin_dir = '/kaggle/input/vinbigdata-chest-xray-abnormalities-detection'
sizes_dir = '/kaggle/input/vindrimagesizes'
train = pd.read_csv(f'{vin_dir}/train.csv')
train_orig = copy.deepcopy(train)
sizes = pd.read_csv(f'{sizes_dir}/sizes.csv')
sizes.columns = ['image_id', 'width', 'height']
sizes.image_id = sizes.image_id.str[:-4]
train = train.set_index('image_id').join(sizes.set_index('image_id')).reset_index()
class_id_to_name_dict = train[['class_name', 'class_id']].drop_duplicates().iloc[:,[1,0]].set_index('class_id').to_dict()['class_name']

In [None]:
# Translate our bounding boxes into the format expected by ZFTurbo's
# weighted_boxes_fusion (coordinates normalized to [0, 1]) and merge them
def get_fused_boxes(image_id, records, only_one=False):
    boxes = records[['x_min', 'y_min', 'x_max', 'y_max']].values
    # Per-row (width, height, width, height), used to normalize pixel coordinates
    pix_multiplier = pd.DataFrame([records.width,records.height,records.width,records.height]).T
    # All annotations are passed as a single "model" with equal scores
    boxes = [(boxes/(pix_multiplier)).values.tolist()]
    labels = [records["class_id"].tolist()]
    scores = [[1]*len(records)]
    weights = [1]

    # If we demand only one box of this label per image, we set the IoU threshold to 0
    # so that any overlapping boxes are merged (boxes that do not overlap at all may
    # still remain separate); otherwise we merge boxes with IoU > 0.55
    iou_thr = 0 if only_one else 0.55
    skip_box_thr = 0
    boxes, scores, labels = weighted_boxes_fusion(boxes, scores, labels, weights=weights, iou_thr=iou_thr, skip_box_thr=skip_box_thr)
    # Back to pixel coordinates (all rows of an image share the same width and height)
    boxes = boxes * pix_multiplier.iloc[:len(boxes),:]
    boxes.columns = ['x_min', 'y_min', 'x_max', 'y_max']
    boxes['class_id'] = labels.astype(int)
    boxes['image_id'] = image_id
    return boxes

In [None]:
# Merge annotations
train_w_finding = train.loc[train.class_id != 14]
train_w_finding_image_ids = list(train_w_finding.image_id.unique())

train_fused = []
# For each image with a finding:
for image_id in train_w_finding_image_ids:
    records = copy.deepcopy(train_w_finding.loc[train_w_finding.image_id == image_id,:])
    # Fuse aortic enlargement (class 0) and cardiomegaly (class 3) boxes into one box each
    for class_id in [0,3]:
        idx = records.loc[records.class_id==class_id,:].index
        if len(idx) > 1:
            finding = records.loc[idx,:]
            boxes = get_fused_boxes(image_id, finding, only_one=True)
            train_fused.append(boxes)
            records = records.drop(idx)
    
    boxes = get_fused_boxes(image_id, records)
    train_fused.append(boxes)

train_fused = pd.concat(train_fused)
train_fused['class_name'] = train_fused.class_id.map(class_id_to_name_dict)
train_fused = train_fused[['image_id','class_name', 'class_id', 'x_min', 'y_min', 'x_max', 'y_max']]

# Add 'no findings' back in (one row per image)
no_finding_idx = train.loc[train.class_id == 14].image_id.drop_duplicates().index
# The index holds labels, so select with .loc rather than .iloc
no_finding_df = copy.deepcopy(train.loc[no_finding_idx,:])
no_finding_df[['x_min', 'y_min', 'x_max', 'y_max']] = [0,0,1,1]
no_finding_df = no_finding_df[['image_id','class_name', 'class_id', 'x_min', 'y_min', 'x_max', 'y_max']]
train_fused = pd.concat([no_finding_df,train_fused])
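
A quick sanity check can confirm that the fusion did what was intended for the two single-instance classes. The helper below is a hypothetical sketch in plain Python: it takes `(image_id, class_id)` pairs and reports images that still contain more than one box of a given class. The class IDs 0 and 3 follow the loop above; the toy rows are made up.

```python
from collections import Counter


def images_with_duplicates(pairs, single_classes=(0, 3)):
    """Return {(image_id, class_id): count} for single-instance classes seen more than once."""
    counts = Counter((img, cls) for img, cls in pairs if cls in single_classes)
    return {key: n for key, n in counts.items() if n > 1}


# Toy rows standing in for the (image_id, class_id) pairs of a fused table
toy_pairs = [
    ("img_a", 0), ("img_a", 3), ("img_a", 8), ("img_a", 8),
    ("img_b", 3), ("img_b", 3),
]
duplicates = images_with_duplicates(toy_pairs)
print(duplicates)
```

On the real data this could be called as `images_with_duplicates(zip(train_fused.image_id, train_fused.class_id))`. An empty result would mean every image has at most one aortic enlargement box and one cardiomegaly box; any remaining entries would point to boxes that were not merged.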

In [None]:
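def read_xray(path, rescale_color=True):
    # Read the DICOM file (dcmread is the current name of pydicom's read_file)
    dicom = pydicom.dcmread(path)
    # Apply the VOI LUT (windowing) stored in the file to the raw pixels
    with warnings.catch_warnings():
        warnings.simplefilter('ignore')
        data = apply_voi_lut(dicom.pixel_array, dicom)
    # MONOCHROME1 stores inverted intensities, so flip them
    if dicom.PhotometricInterpretation == "MONOCHROME1":
        data = np.amax(data) - data
    # Rescale to the 0-255 range of an 8-bit image
    if rescale_color:
        data = data - np.min(data)
        data = data / np.max(data)
        data = (data * 255).astype(np.uint8)
    return data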

def plot_scale_compare(imageid, t_old, t_new, size=None):
    img = read_xray('/kaggle/input/vinbigdata-chest-xray-abnormalities-detection/train/{}.dicom'.format(imageid))
    if size is not None:
        # Assumption: size is (width, height), as PIL expects; boxes are not rescaled to match
        img_new = np.array(Image.fromarray(img).resize(size))
    else:
        img_new = img

    fig, (ax1, ax2) = plt.subplots(1, 2)
    for t, im, ax in zip([t_old, t_new], [img, img_new], [ax1, ax2]):
        infos = t[t['image_id'] == imageid].sort_values(by='class_id')
        class_ids = infos['class_id'].unique()
        # One random color per class present in this image
        label2color = {class_id:[randint(0,255)/255 for i in range(3)] for class_id in class_ids}
        for index, row in infos.iterrows():
            rec_min = (row['x_min'], row['y_min'])
            color = label2color[row['class_id']]
            rect = patches.Rectangle(rec_min,row['x_max']-row['x_min'],row['y_max']-row['y_min'],
                                     linewidth=2,edgecolor= color,facecolor='none',label=row['class_name'])
            ax.add_patch(rect)
        # Draw the legend once per panel, and only if there is something to label
        if len(infos) > 0:
            ax.legend()
        ax.imshow(im, 'gray')
    fig.set_size_inches(22,16)
    plt.show()

In [None]:
# Visualize.  Old on left, new on right

if False: # Only show images with more than one aortic enlargement/cardiomegaly annotation (counted together)
    t = train.loc[train.class_id.isin([0,3]),:].groupby('image_id').count().reset_index()
    image_ids = list(t.loc[t.class_name>1,:].image_id.unique())
else:
    image_ids = train.image_id.unique()

for image_id in image_ids[:20]:
    plot_scale_compare(image_id, train, train_fused)    