# Reproducing Test Evaluation in CoronaCases and SemiSeg

Follow the main README to install MEDPSeg.

Once MEDPSeg is installed, that is, the `medpseg` command works in your environment, this guide walks you through reproducing our published results on the public CoronaCases and SemiSeg datasets. These datasets were chosen because they are easy to access without registering for medical imaging segmentation challenges. CoronaCases was used as an external test dataset, not included in training, making it a fair comparison point with other methods. SemiSeg was used for training, validation, and testing, following Inf-Net's splits.

All metrics are calculated using the code in medpseg/seg_metrics.py, except for BD and TD, which use medpseg/atm_evaluation.py.
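
For reference, the sketch below shows how the overlap metrics reported later in this README are commonly defined. It is a simplified NumPy illustration, not the exact implementation in seg_metrics.py, which may handle edge cases (e.g., empty masks) or definitions slightly differently.

```python
# Simplified sketch of common overlap metric definitions on binary masks.
# NOT the exact code in medpseg/seg_metrics.py; shown only for orientation.
import numpy as np

def overlap_metrics(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> dict:
    """Per-case Dice, false negative/positive error, sensitivity and specificity."""
    pred = pred.astype(bool)
    target = target.astype(bool)

    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    tn = np.logical_and(~pred, ~target).sum()

    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        # fraction of ground-truth voxels that were missed
        "false_negative_error": fn / (tp + fn + eps),
        # one common definition (1 - precision); the repository code may differ
        "false_positive_error": fp / (tp + fp + eps),
        "sensitivity": tp / (tp + fn + eps),
        "specificity": tn / (tn + fp + eps),
    }
```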

The remainder of this README will guide you through reproducing our evaluation metrics on CoronaCases and the SemiSeg test split.

## Evaluation Reproduction

The script reproduce_evaluation.py reproduces our evaluation results on the CoronaCases and SemiSeg datasets. Just run:

```
python reproduce_evaluation.py
```

Following is a description of each step (a minimal sketch of the metric aggregation in steps 4 and 5 is shown after this list):

  1. Download data for CoronaCases and SemiSeg.
  2. Unpack and preprocess.
  3. Run predictions on both datasets using the MEDPSeg CLI.
  4. Compute metrics using the implementations in seg_metrics.py.
  5. Generate a results table.
The following table shows the results we obtained by running the script in February 2024:

| Dataset | Metric | Mean | Std |
| --- | --- | --- | --- |
| semiseg_2d_ggo | dice | 0.6493887637341156 | 0.20984685178535936 |
| semiseg_2d_ggo | false_negative_error | 0.3486784022810951 | 0.20904111222696362 |
| semiseg_2d_ggo | false_positive_error | 0.29967016007280345 | 0.194346475130671 |
| semiseg_2d_ggo | sensitivity | 0.6367933309571926 | 0.20060453882862272 |
| semiseg_2d_ggo | specificity | 0.9893210889893039 | 0.011094264979578237 |
| semiseg_2d_consolidation | dice | 0.5769309854556685 | 0.29307313426454484 |
| semiseg_2d_consolidation | false_negative_error | 0.3634770364650775 | 0.2696565459942699 |
| semiseg_2d_consolidation | false_positive_error | 0.3294236091277356 | 0.2810979814012969 |
| semiseg_2d_consolidation | sensitivity | 0.5340037994037468 | 0.21319093055192054 |
| semiseg_2d_consolidation | specificity | 0.994367999682688 | 0.01025442270113634 |
| coronacases_3d_inf | dice | 0.8135291858795479 | 0.06652757121479057 |
| coronacases_3d_inf | false_negative_error | 0.1652826958455593 | 0.1218182096414322 |
| coronacases_3d_inf | false_positive_error | 0.18548197424264676 | 0.08417444667991547 |
| coronacases_3d_inf | sensitivity | 0.8347173041544407 | 0.1218182096414322 |
| coronacases_3d_inf | specificity | 0.9988212383395796 | 0.0010165322636769925 |

The results in this table should be equal to what a user obtains by running the script in their own environment. Evaluation on different datasets can be achieved with minor edits to reproduce_evaluation.py, mainly to the download and data organization code. Note that some results differ slightly (variations on the order of 0.00x) when using pre-processed .png images instead of direct HU values, due to the conversion from int16 to uint8 and back.
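
To illustrate where those small variations come from, the sketch below quantizes a CT-like int16 HU array to uint8 and back. The intensity window used here is an assumed example, not necessarily MEDPSeg's exact preprocessing.

```python
# Illustration of the precision loss when HU values (int16) are stored as
# 8-bit .png and mapped back. The [-1024, 600] window is an assumed example.
import numpy as np

rng = np.random.default_rng(0)
hu = rng.integers(-1024, 600, size=(512, 512), dtype=np.int16)  # fake CT slice

lo, hi = -1024, 600
as_uint8 = np.round((hu.astype(np.float32) - lo) / (hi - lo) * 255).astype(np.uint8)
restored = (as_uint8.astype(np.float32) / 255 * (hi - lo) + lo).astype(np.int16)

# Each uint8 step spans (hi - lo) / 255 ~ 6.4 HU, so small round-trip errors
# are expected, which in turn nudge the resulting masks and metrics slightly.
print("max absolute HU error:", np.abs(hu - restored).max())
```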