# Get Model Predictions and Ensemble the Results
In this notebook, we create predictions with the trained Faster R-CNN and Cascade R-CNN model weights. We then ensemble the predictions of the two models using weighted boxes fusion.

**This notebook needs a GPU to generate the model predictions.**

In [2]:
# ensemble_boxes library is required for ensembling the results of the two models
!pip install ensemble_boxes

Collecting ensemble_boxes
  Downloading ensemble_boxes-1.0.9-py3-none-any.whl (23 kB)
Installing collected packages: ensemble-boxes
Successfully installed ensemble-boxes-1.0.9


In [3]:
import cv2
import numpy as np
import pandas as pd
from tqdm import tqdm
import sys

In [4]:
sys.path.append('../src')    # Add the source directory to the module search path so that local functions and modules can be imported.

In [5]:
from detection_util import create_predictions
from gdsc_util import PROJECT_DIR
from training_frcnn_5k_r101 import load_config as load_config_frcnn
from training_crcnn_5k_r101 import load_config as load_config_crcnn
from merge_ensemble_results import generate_test_results

In [6]:
data_folder = str(PROJECT_DIR / 'data')

In [7]:
# Load configs for faster RCNN and cascade RCNN models
cfg_frcnn, base_file_frcnn = load_config_frcnn(data_folder)
cfg_crcnn, base_file_crcnn = load_config_crcnn(data_folder)

In [8]:
# Get the test filenames
test_files_path = f'{data_folder}/test_files.csv'
file_names = pd.read_csv(test_files_path, sep=';', header=None)[0].values

In [9]:
# Read the experiment names, then specify the paths of the two model weights relative to the data folder
with open(f'{PROJECT_DIR}/experiment_frcnn_5k_r101_epoch_24.txt', 'r') as f:
    experiment_name_frcnn = f.read().strip()    # strip() drops a trailing newline, if any

with open(f'{PROJECT_DIR}/experiment_crcnn_5k_r101_epoch_24.txt', 'r') as f:
    experiment_name_crcnn = f.read().strip()    # strip() drops a trailing newline, if any
    
frcnn_model_weight = f'{experiment_name_frcnn}/frcnn_epoch_24.pth'
crcnn_model_weight = f'{experiment_name_crcnn}/crcnn_epoch_24.pth'
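The path handling in the cell above can be sketched in isolation. This is a minimal illustration, not part of the project code: `read_experiment_name` and `checkpoint_path` are hypothetical helper names, and the temporary file only stands in for the experiment name files. It shows why stripping whitespace matters: a trailing newline in the text file would otherwise end up in the middle of the checkpoint path.

```python
from pathlib import Path
import tempfile


def read_experiment_name(path):
    """Return the experiment name stored in a text file, without surrounding whitespace."""
    return Path(path).read_text().strip()


def checkpoint_path(data_folder, experiment_name, weight_file):
    """Join the data folder, the experiment directory and the weight file name."""
    return str(Path(data_folder) / experiment_name / weight_file)


# Demo: a temporary file stands in for an experiment name file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    name_file = Path(tmp) / "experiment.txt"
    name_file.write_text("training-frcnn-demo\n")  # note the trailing newline
    name = read_experiment_name(name_file)

print(checkpoint_path("data", name, "frcnn_epoch_24.pth"))
```

Using `pathlib` instead of f-strings also avoids doubled or missing path separators when one of the parts changes.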

In [10]:
# Create predictions for faster RCNN model
checkpoint_frcnn = f'{data_folder}/{frcnn_model_weight}'
prediction_df_frcnn = create_predictions(file_names, cfg_frcnn, checkpoint_frcnn, device='cuda')

load checkpoint from local path: /home/sagemaker-user/gdsc5-tutorials-public/data/training-frcnn-5k-r101-2022-07-20-08-53-07-587/frcnn_epoch_24.pth




[2022-08-17 06:21:45.810 gdsc5-smstudio-cust-ml-g4dn-xlarge-21531be2e6472c39ba6c0447db92:45 INFO utils.py:27] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2022-08-17 06:21:45.845 gdsc5-smstudio-cust-ml-g4dn-xlarge-21531be2e6472c39ba6c0447db92:45 INFO profiler_config_parser.py:102] Unable to find config at /opt/ml/input/config/profilerconfig.json. Profiler is disabled.


  1%|▏         | 1/73 [00:09<10:54,  9.10s/it]INFO:detection_util:Processing file: 100_C.jpg
  3%|▎         | 2/73 [00:11<06:23,  5.40s/it]INFO:detection_util:Processing file: 100_B.jpg
  4%|▍         | 3/73 [00:13<04:30,  3.86s/it]INFO:detection_util:Processing file: 100_AA.jpg
  5%|▌         | 4/73 [00:16<03:42,  3.23s/it]INFO:detection_util:Processing file: 100_A.jpg
  7%|▋         | 5/73 [00:18<03:24,  3.01s/it]INFO:detection_util:Processing file: 101_DD.jpg
  8%|▊         | 6/73 [00:21<03:05,  2.78s/it]INFO:detection_util:Processing file: 101_C.jpg
 10%|▉         | 7/73 [00:23<02:56,  2.67s/it]INFO:detection_util:Processing file: 101_B.jpg
 11%|█         | 8/73 [00:25<02:47,  2.57s/it]INFO:detection_util:Processing file: 101_AA.jpg
 12%|█▏        | 9/73 [00:28<02:39,  2.49s/it]INFO:detection_util:Processing file: 101_A.jpg
 14%|█▎        | 10/73 [00:30<02:31,  2.41s/it]INFO:detection_util:Processing file: 86_D.jpg
 15%|█▌        | 11/73 [00:33<02:42,  2.62s/it]INFO:detection_util:

In [11]:
# Save Faster RCNN model predictions
frcnn_result_path = f'{data_folder}/results_frcnn_test.csv'
prediction_df_frcnn.to_csv(frcnn_result_path, sep=';')

In [12]:
# Create predictions for Cascade RCNN model
checkpoint_crcnn = f'{data_folder}/{crcnn_model_weight}'
prediction_df_crcnn = create_predictions(file_names, cfg_crcnn, checkpoint_crcnn, device='cuda')

load checkpoint from local path: /home/sagemaker-user/gdsc5-tutorials-public/data/training-crcnn-5k-r101-2022-07-21-09-59-02-369/crcnn_epoch_24.pth


INFO:detection_util:Creating predictions for 73 files
  0%|          | 0/73 [00:00<?, ?it/s]INFO:detection_util:Processing file: 100_D.jpg
  1%|▏         | 1/73 [00:02<03:18,  2.75s/it]INFO:detection_util:Processing file: 100_C.jpg
  3%|▎         | 2/73 [00:05<03:10,  2.69s/it]INFO:detection_util:Processing file: 100_B.jpg
  4%|▍         | 3/73 [00:07<02:47,  2.40s/it]INFO:detection_util:Processing file: 100_AA.jpg
  5%|▌         | 4/73 [00:09<02:42,  2.36s/it]INFO:detection_util:Processing file: 100_A.jpg
  7%|▋         | 5/73 [00:12<02:47,  2.47s/it]INFO:detection_util:Processing file: 101_DD.jpg
  8%|▊         | 6/73 [00:14<02:41,  2.42s/it]INFO:detection_util:Processing file: 101_C.jpg
 10%|▉         | 7/73 [00:17<02:41,  2.44s/it]INFO:detection_util:Processing file: 101_B.jpg
 11%|█         | 8/73 [00:19<02:36,  2.41s/it]INFO:detection_util:Processing file: 101_AA.jpg
 12%|█▏        | 9/73 [00:21<02:32,  2.38s/it]INFO:detection_util:Processing file: 101_A.jpg
 14%|█▎        | 10/7

In [13]:
# Save Cascade RCNN model predictions
crcnn_result_path = f'{data_folder}/results_crcnn_test.csv'
prediction_df_crcnn.to_csv(crcnn_result_path, sep=';')

## Ensemble the results of the two models

In [14]:
ensemble_df = generate_test_results(frcnn_result_path, crcnn_result_path, file_names)

Getting dimensions from Images


100%|██████████| 73/73 [01:05<00:00,  1.11it/s]


Merging Results from the two models


100%|██████████| 73/73 [00:01<00:00, 71.14it/s]
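To make the merging step more concrete, here is a simplified, single-class sketch of the idea behind weighted boxes fusion. It is not the code in `merge_ensemble_results` and not the `ensemble_boxes` implementation; the function names, the demo boxes and the threshold of 0.55 are assumptions for illustration. Boxes from all models are pooled and sorted by score, each box joins the fused box it overlaps most (if the IoU exceeds the threshold), and the fused coordinates are the confidence-weighted average of the cluster members. Note that the `ensemble_boxes` library expects coordinates normalized to the range [0, 1], which is presumably why the image dimensions are collected first.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster):
    """Confidence-weighted average of the coordinates; mean confidence as the score.

    Assumes all scores are strictly positive.
    """
    total = sum(score for _, score in cluster)
    box = tuple(sum(b[i] * s for b, s in cluster) / total for i in range(4))
    return box, total / len(cluster)


def simple_wbf(predictions_per_model, iou_thr=0.55):
    """Fuse boxes from several models.

    predictions_per_model: one list per model, each holding (box, score) tuples.
    Returns a list of (fused_box, fused_score) tuples.
    """
    n_models = len(predictions_per_model)
    pooled = sorted(
        (p for preds in predictions_per_model for p in preds),
        key=lambda p: p[1],
        reverse=True,
    )
    clusters, fused = [], []
    for box, score in pooled:
        # Find the fused box with the highest overlap above the threshold.
        best, best_iou = None, iou_thr
        for idx, (fused_box, _) in enumerate(fused):
            overlap = iou(box, fused_box)
            if overlap > best_iou:
                best, best_iou = idx, overlap
        if best is None:
            clusters.append([(box, score)])
            fused.append((box, score))
        else:
            clusters[best].append((box, score))
            fused[best] = fuse(clusters[best])
    # Down-weight boxes that only some of the models found.
    return [
        (fused_box, fused_score * min(len(cluster), n_models) / n_models)
        for (fused_box, fused_score), cluster in zip(fused, clusters)
    ]


# Hypothetical normalized predictions from two models for one image.
frcnn_demo = [((0.10, 0.10, 0.30, 0.30), 0.9), ((0.60, 0.60, 0.80, 0.80), 0.4)]
crcnn_demo = [((0.12, 0.10, 0.32, 0.30), 0.8)]
fused_boxes = simple_wbf([frcnn_demo, crcnn_demo])
for fused_box, fused_score in fused_boxes:
    print([round(v, 3) for v in fused_box], round(fused_score, 3))
```

In this demo the two overlapping boxes merge into one box with a high score, while the box found by only one model is kept but its score is halved. This behaviour is what makes a score cut-off after the fusion meaningful.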


In [15]:
# Keep only the detections with a detection score above 0.5
restricted_ensemble_df = ensemble_df[ensemble_df.detection_score > 0.5]
restricted_ensemble_df.to_csv(f'{data_folder}/frcnn_crcnn_ensemble_r101_detection_score_50.csv', sep=';')

INFO:numexpr.utils:NumExpr defaulting to 4 threads.
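The effect of the score cut-off can be checked with a small, library-free sketch. The rows below are made-up values, and `kept_per_file` is a hypothetical helper, not a function from this project. It counts how many detections per image survive a given threshold, which helps when choosing a value other than 0.5.

```python
from collections import Counter

# Hypothetical detections; only the two columns needed for the filter are shown.
detections = [
    {"file_name": "a.jpg", "detection_score": 0.99},
    {"file_name": "a.jpg", "detection_score": 0.52},
    {"file_name": "b.jpg", "detection_score": 0.31},
]


def kept_per_file(rows, threshold):
    """Count the detections per file whose score is strictly above the threshold."""
    return Counter(r["file_name"] for r in rows if r["detection_score"] > threshold)


for threshold in (0.3, 0.5, 0.7):
    print(threshold, dict(kept_per_file(detections, threshold)))
```

An image whose detections all fall below the threshold disappears from the filtered result entirely, so it is worth checking whether every test image still appears after filtering.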


In [16]:
restricted_ensemble_df

Unnamed: 0,xmin,ymin,xmax,ymax,detection_score,file_name,section_id
0,382.0,1289.0,678.0,1686.0,0.996147,100_D.jpg,100_D.jpg@382.0-678.0-1289.0-1686.0
1,1152.0,2169.0,1749.0,2706.0,0.996121,100_D.jpg,100_D.jpg@1152.0-1749.0-2169.0-2706.0
2,3070.0,6516.0,3245.0,6701.0,0.994554,100_D.jpg,100_D.jpg@3070.0-3245.0-6516.0-6701.0
3,2902.0,3139.0,3085.0,3325.0,0.994287,100_D.jpg,100_D.jpg@2902.0-3085.0-3139.0-3325.0
4,3049.0,4090.0,3274.0,4314.0,0.994223,100_D.jpg,100_D.jpg@3049.0-3274.0-4090.0-4314.0
...,...,...,...,...,...,...,...
6979,2525.0,3925.0,2699.0,4107.0,0.543512,99_A.jpg,99_A.jpg@2525.0-2699.0-3925.0-4107.0
6980,4627.0,2275.0,4702.0,2362.0,0.538379,99_A.jpg,99_A.jpg@4627.0-4702.0-2275.0-2362.0
6981,4704.0,1410.0,4764.0,1467.0,0.520698,99_A.jpg,99_A.jpg@4704.0-4764.0-1410.0-1467.0
6982,4834.0,1768.0,4926.0,1889.0,0.511173,99_A.jpg,99_A.jpg@4834.0-4926.0-1768.0-1889.0
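Judging from the rows above, `section_id` appears to follow the pattern `file_name@xmin-xmax-ymin-ymax`, so the coordinate order differs from the column order. The following sketch builds and parses such identifiers. The two helper names are hypothetical, and the parser assumes non-negative coordinates, since a minus sign would clash with the `-` separator.

```python
def make_section_id(file_name, xmin, ymin, xmax, ymax):
    """Build an identifier in the order seen in the table: xmin-xmax-ymin-ymax."""
    return f"{file_name}@{float(xmin)}-{float(xmax)}-{float(ymin)}-{float(ymax)}"


def parse_section_id(section_id):
    """Split an identifier back into the file name and the four coordinates."""
    file_name, coords = section_id.rsplit("@", 1)
    xmin, xmax, ymin, ymax = (float(v) for v in coords.split("-"))
    return {"file_name": file_name, "xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax}


# First row of the table above.
section_id = make_section_id("100_D.jpg", 382.0, 1289.0, 678.0, 1686.0)
print(section_id)
print(parse_section_id(section_id))
```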
