# Evaluation verification

This notebook verifies that our evaluation code is equivalent to the code used for the original CCMCT and CMC datasets by reproducing their published baseline results.

## CCMCT dataset
### Download the prediction results from the baseline repository

As an example, we use the detection-stage predictions for the ODAEL variant of the CCMCT dataset.

In [1]:
!wget -P ./ https://github.com/DeepPathology/MITOS_WSI_CCMCT/raw/master/results/RetinaNet-ODAEL-export.pth-CCMCT_ODAEL-inference_results_boxes.p.bz2

Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/DeepPathology/MITOS_WSI_CCMCT/master/results/RetinaNet-ODAEL-export.pth-CCMCT_ODAEL-inference_results_boxes.p.bz2 [following]
--2022-07-18 11:31:00--  https://raw.githubusercontent.com/DeepPathology/MITOS_WSI_CCMCT/master/results/RetinaNet-ODAEL-export.pth-CCMCT_ODAEL-inference_results_boxes.p.bz2
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11145255 (11M) [application/octet-stream]
Saving to: ‘./RetinaNet-ODAEL-export.pth-CCMCT_ODAEL-inference_results_boxes.p.bz2.5’


2022-07-18 11:31:01 (11.0 MB/s) - ‘./RetinaNet-ODAEL-export.pth-CCMCT_ODAEL-inference_results_boxes.p.bz2.5’ s

### Decompress .p.bz2 file and format the output

In [2]:
import bz2
import pickle
import numpy as np

filepath = './RetinaNet-ODAEL-export.pth-CCMCT_ODAEL-inference_results_boxes.p.bz2'
newfilepath = filepath[:-4]  # strip the '.bz2' suffix

# Decompress the archive and write the raw pickle next to it.
with bz2.BZ2File(filepath) as archive:
    data = archive.read()
with open(newfilepath, 'wb') as f:
    f.write(data)

# Load the baseline predictions and the label template.
with open(newfilepath, 'rb') as f:
    pred = pickle.load(f)
with open('../detection/mmdetection/data/database/CCMCT_label.pkl', 'rb') as f:
    template = pickle.load(f)

# Each prediction is [xmin, ymin, xmax, ymax, cls, conf]; keep the box and its confidence.
for i in pred:
    bboxs = []
    for j in pred[i]:
        xmin, ymin, xmax, ymax, cls, conf = j
        bboxs.append(np.array([xmin, ymin, xmax, ymax, conf], dtype=np.float32))
    template[i]['pred_bbox'] = bboxs

with open('./CCMCT_baseline_git.pkl', 'wb') as f:
    pickle.dump(template, f)
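
The cell above writes a decompressed copy of the pickle to disk before loading it. As a minimal sketch of an alternative, the compressed file can also be unpickled directly through `bz2.open`. The helper name `load_bz2_pickle`, the toy slide name, and the toy box values are hypothetical and only serve to make the example self-contained.

```python
import bz2
import os
import pickle
import tempfile


def load_bz2_pickle(path):
    """Load a pickled object from a bz2-compressed file without an intermediate copy."""
    with bz2.open(path, "rb") as f:
        return pickle.load(f)


# Round-trip check on toy data (hypothetical slide name and box values).
toy = {"slide_a.svs": [[10.0, 20.0, 60.0, 70.0, 1, 0.91]]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_boxes.p.bz2")
    with bz2.open(path, "wb") as f:
        pickle.dump(toy, f)
    restored = load_bz2_pickle(path)

print(restored == toy)  # True
```

Only unpickle files from sources you trust, since `pickle.load` can execute arbitrary code.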

### Evaluate the prediction result

At a threshold of 0.79, the model achieves an F1 score of 0.628, a precision of 0.577, and a recall of 0.689. The script prints F1 as a percentage (62.82). This matches the result shown in https://github.com/DeepPathology/MITOS_WSI_CCMCT/blob/master/Evaluation.ipynb.

In [3]:
!python3 tools/eval/calculate_F1.py -ip ./CCMCT_baseline_git.pkl

precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
At thresh = 0.96 TP = 80, FN = 22396, FP = 88, F1 = 0.71
precision = 0.476, recall = 0.004
At thresh = 0.95 TP = 147, FN = 22329, FP = 144, F1 = 1.29
precision = 0.505, recall = 0.007
At thresh = 0.94 TP = 288, FN = 22188, FP = 235, F1 = 2.50
precision = 0.551, recall = 0.013
At thresh = 0.93 TP = 528, FN = 21948, FP = 381, F1 = 4.52
precision = 0.581, recall = 0.023
At thresh = 0.92 TP = 990, FN = 21486, FP = 596, F1 = 8.23
precision = 0.624, recall = 0.044
At thresh = 0.91 TP = 1663, FN = 20813, FP = 871, F1 = 13.30
precision = 0.656, recall = 0.074
At thresh = 0.90 TP = 2595, FN = 19881, FP = 1244, F1 = 19.72
precision = 0.676, recall = 0.115
At thresh = 0.89 TP = 3856, FN = 18620, FP = 1650, F1 = 27.56
precision = 0.700, recall = 0.172
At thresh = 0.88 TP = 5256, FN = 17220, FP = 2209, F1 = 35.11
precision = 0.704, recall = 0.234
At thresh = 0.87 T

At thresh = 0.12 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
At thresh = 0.11 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
At thresh = 0.10 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
At thresh = 0.09 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
At thresh = 0.08 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
At thresh = 0.07 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
At thresh = 0.06 TP = 22202, FN = 274, FP = 228777, F1 = 16.24
precision = 0.088, recall = 0.988
Best F1 = 62.82 at thresh = 0.79
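
The printed values are consistent with the standard definitions of precision, recall, and F1 computed from the TP, FP, and FN counts. The sketch below recomputes the `thresh = 0.96` line of the output above. The function name `precision_recall_f1` and its handling of empty denominators are assumptions for illustration, not taken from `tools/eval/calculate_F1.py`.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts.

    Empty denominators yield 0.0, which is an assumption of this sketch.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1


# Counts copied from the "At thresh = 0.96" line of the CCMCT output.
p, r, f1 = precision_recall_f1(tp=80, fp=88, fn=22396)
print(f"precision = {p:.3f}, recall = {r:.3f}, F1 = {100 * f1:.2f}")
# precision = 0.476, recall = 0.004, F1 = 0.71
```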


## CMC dataset
### Download the prediction results from the baseline repository

As an example, we use the detection-stage predictions for the CODAEL variant of the CMC dataset.

In [4]:
!wget -P ./ https://github.com/DeepPathology/MITOS_WSI_CMC/raw/master/results/test_RetinaNet-CMC-CODAEL-512sh-b1.pth-CODAEL-val-inference_results_boxes.p.bz2

--2022-07-18 11:43:43--  https://github.com/DeepPathology/MITOS_WSI_CMC/raw/master/results/test_RetinaNet-CMC-CODAEL-512sh-b1.pth-CODAEL-val-inference_results_boxes.p.bz2
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/DeepPathology/MITOS_WSI_CMC/master/results/test_RetinaNet-CMC-CODAEL-512sh-b1.pth-CODAEL-val-inference_results_boxes.p.bz2 [following]
--2022-07-18 11:43:44--  https://raw.githubusercontent.com/DeepPathology/MITOS_WSI_CMC/master/results/test_RetinaNet-CMC-CODAEL-512sh-b1.pth-CODAEL-val-inference_results_boxes.p.bz2
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 396507 (387K) 

### Decompress .p.bz2 file and format the output

In [5]:
import bz2
import pickle
import numpy as np

filepath = './test_RetinaNet-CMC-CODAEL-512sh-b1.pth-CODAEL-val-inference_results_boxes.p.bz2'
newfilepath = filepath[:-4]  # strip the '.bz2' suffix

# Decompress the archive and write the raw pickle next to it.
with bz2.BZ2File(filepath) as archive:
    data = archive.read()
with open(newfilepath, 'wb') as f:
    f.write(data)

# Load the baseline predictions and the label template.
with open(newfilepath, 'rb') as f:
    pred = pickle.load(f)
with open('../detection/mmdetection/data/database/CMC_label.pkl', 'rb') as f:
    template = pickle.load(f)

# Each prediction is [xmin, ymin, xmax, ymax, cls, conf]; keep the box and its confidence.
for i in pred:
    bboxs = []
    for j in pred[i]:
        xmin, ymin, xmax, ymax, cls, conf = j
        bboxs.append(np.array([xmin, ymin, xmax, ymax, conf], dtype=np.float32))
    template[i]['pred_bbox'] = bboxs

with open('./CMC_baseline_git.pkl', 'wb') as f:
    pickle.dump(template, f)
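
The reformatting step can be tried in isolation on toy data. The sketch below drops the class column from each prediction and stores the remaining values as `float32` arrays under `pred_bbox`. The helper name `to_xyxy_conf`, the slide name, and the box values are hypothetical, and the real template entries may carry additional fields that are not shown here.

```python
import numpy as np


def to_xyxy_conf(boxes):
    """Convert [xmin, ymin, xmax, ymax, cls, conf] rows to
    float32 [xmin, ymin, xmax, ymax, conf] arrays."""
    return [
        np.array([xmin, ymin, xmax, ymax, conf], dtype=np.float32)
        for xmin, ymin, xmax, ymax, cls, conf in boxes
    ]


# Toy stand-ins for the baseline predictions and the label template.
toy_pred = {"slide_a.svs": [[10, 20, 60, 70, 1, 0.5], [5, 5, 55, 55, 0, 0.25]]}
toy_template = {"slide_a.svs": {}}

for name, boxes in toy_pred.items():
    toy_template[name]["pred_bbox"] = to_xyxy_conf(boxes)

print(toy_template["slide_a.svs"]["pred_bbox"][0])
```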

### Evaluate the prediction result

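At a threshold of 0.62, the model achieves an F1 score of 0.726, a precision of 0.697, and a recall of 0.758. This matches the result shown in https://github.com/DeepPathology/MITOS_WSI_CMC/blob/master/Evaluation.ipynb. The threshold sweep in the output below peaks at a slightly different point, with a best F1 of 73.05 (in percent) at a threshold of 0.63.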

In [6]:
!python3 tools/eval/calculate_F1.py -ip ./CMC_baseline_git.pkl

precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
precision = 0.000, recall = 0.000
At thresh = 0.90 TP = 91, FN = 3321, FP = 0, F1 = 5.20
precision = 1.000, recall = 0.027
At thresh = 0.89 TP = 139, FN = 3273, FP = 1, F1 = 7.83
precision = 0.993, recall = 0.041
At thresh = 0.88 TP = 203, FN = 3209, FP = 1, F1 = 11.23
precision = 0.995, recall = 0.059
At thresh = 0.87 TP = 288, FN = 3124, FP = 3, F1 = 15.55
precision = 0.990, recall = 0.084
At thresh = 0.86 TP = 375, FN = 3037, FP = 7, F1 = 19.77
precision = 0.982, recall = 0.110
At thresh = 0.85 TP = 467, FN = 2945, FP = 11, F1 = 24.01
precision = 0.977, recall = 0.137
At thresh = 0.84 TP = 570, FN = 2842, FP = 13, F1 = 28.54
precision = 0.978, recall = 0.167
At thresh = 0.83 TP = 68

At thresh = 0.06 TP = 3308, FN = 104, FP = 10784, F1 = 37.80
precision = 0.235, recall = 0.970
Best F1 = 73.05 at thresh = 0.63
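
To cross-check a sweep like the one above, the `At thresh = ...` lines can be parsed and their F1 values recomputed from the printed counts. The sketch below does this for three lines copied from the CMC output. The regular expression and the helper `parse_sweep` are assumptions based on the line format shown here, and because the excerpt is only a subset of the sweep, its best row is not the overall best threshold.

```python
import re

# Pattern for lines such as:
# "At thresh = 0.90 TP = 91, FN = 3321, FP = 0, F1 = 5.20"
LINE_RE = re.compile(
    r"At thresh = (?P<thresh>\d+\.\d+) TP = (?P<tp>\d+), FN = (?P<fn>\d+), "
    r"FP = (?P<fp>\d+), F1 = (?P<f1>\d+\.\d+)"
)


def parse_sweep(log_text):
    """Return (threshold, printed F1, recomputed F1) for each sweep line."""
    rows = []
    for m in LINE_RE.finditer(log_text):
        tp, fn, fp = int(m["tp"]), int(m["fn"]), int(m["fp"])
        denom = 2 * tp + fp + fn
        recomputed = 100 * 2 * tp / denom if denom else 0.0
        rows.append((float(m["thresh"]), float(m["f1"]), recomputed))
    return rows


# Excerpt copied from the CMC output above.
log_text = """\
At thresh = 0.90 TP = 91, FN = 3321, FP = 0, F1 = 5.20
precision = 1.000, recall = 0.027
At thresh = 0.89 TP = 139, FN = 3273, FP = 1, F1 = 7.83
precision = 0.993, recall = 0.041
At thresh = 0.06 TP = 3308, FN = 104, FP = 10784, F1 = 37.80
precision = 0.235, recall = 0.970
"""

rows = parse_sweep(log_text)
for thresh, printed, recomputed in rows:
    # Printed values are rounded to two decimals, so allow a small tolerance.
    assert abs(printed - recomputed) < 0.005 + 1e-9

# Best row within this excerpt only.
best = max(rows, key=lambda row: row[1])
print(best[0])  # 0.06
```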
