# Reproducing Test Evaluation in CoronaCases and SemiSeg

Follow the main README to install MEDPSeg.

Once MEDPSeg is installed, that is, the `medpseg` command works in your environment, this guide walks you through reproducing our published results on the public CoronaCases and SemiSeg datasets. These datasets were chosen because they are easy to access without registering for medical imaging segmentation challenges. CoronaCases was used as an external test dataset, not included in training, making it a fair comparison point with other methods. SemiSeg was used for training, validation, and testing, following Inf-Net's splits.

All metrics are calculated using the code in medpseg/seg_metrics.py, except for BD and TD, which use medpseg/atm_evaluation.py.
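
For reference, the sketch below shows how the overlap metrics reported later in this README are commonly defined. It is a simplified NumPy illustration, not the exact implementation in seg_metrics.py, which may handle edge cases (e.g., empty masks) or definitions slightly differently.

```python
# Simplified sketch of common overlap metric definitions on binary masks.
# NOT the exact code in medpseg/seg_metrics.py; shown only for orientation.
import numpy as np

def overlap_metrics(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> dict:
    """Per-case Dice, false negative/positive error, sensitivity and specificity."""
    pred = pred.astype(bool)
    target = target.astype(bool)

    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    tn = np.logical_and(~pred, ~target).sum()

    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        # fraction of ground-truth voxels that were missed
        "false_negative_error": fn / (tp + fn + eps),
        # one common definition (1 - precision); the repository code may differ
        "false_positive_error": fp / (tp + fp + eps),
        "sensitivity": tp / (tp + fn + eps),
        "specificity": tn / (tn + fp + eps),
    }
```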

The remainder of this README will guide you through reproducing our evaluation metrics on CoronaCases and the SemiSeg test split.

## Evaluation Reproduction

The script reproduce_evaluation.py reproduces our evaluation results on the CoronaCases and SemiSeg datasets. Just run:

```
python reproduce_evaluation.py
```

Following is a description of each step (a minimal sketch of the metric aggregation in steps 4 and 5 is shown after this list):

  1. Download data for CoronaCases and SemiSeg.
  2. Unpack and preprocess.
  3. Run predictions on both datasets using the MEDPSeg CLI.
  4. Compute metrics using the implementations in seg_metrics.py.
  5. Generate a results table.
The following table shows the results we obtained by running the script in February 2024:

| Dataset | Metric | Mean | Std |
| --- | --- | --- | --- |
| semiseg_2d_ggo | dice | 0.6493887637341156 | 0.20984685178535936 |
| semiseg_2d_ggo | false_negative_error | 0.3486784022810951 | 0.20904111222696362 |
| semiseg_2d_ggo | false_positive_error | 0.29967016007280345 | 0.194346475130671 |
| semiseg_2d_ggo | sensitivity | 0.6367933309571926 | 0.20060453882862272 |
| semiseg_2d_ggo | specificity | 0.9893210889893039 | 0.011094264979578237 |
| semiseg_2d_consolidation | dice | 0.5769309854556685 | 0.29307313426454484 |
| semiseg_2d_consolidation | false_negative_error | 0.3634770364650775 | 0.2696565459942699 |
| semiseg_2d_consolidation | false_positive_error | 0.3294236091277356 | 0.2810979814012969 |
| semiseg_2d_consolidation | sensitivity | 0.5340037994037468 | 0.21319093055192054 |
| semiseg_2d_consolidation | specificity | 0.994367999682688 | 0.01025442270113634 |
| coronacases_3d_inf | dice | 0.8135291858795479 | 0.06652757121479057 |
| coronacases_3d_inf | false_negative_error | 0.1652826958455593 | 0.1218182096414322 |
| coronacases_3d_inf | false_positive_error | 0.18548197424264676 | 0.08417444667991547 |
| coronacases_3d_inf | sensitivity | 0.8347173041544407 | 0.1218182096414322 |
| coronacases_3d_inf | specificity | 0.9988212383395796 | 0.0010165322636769925 |

The results in this table should be equal to what a user obtains by running the script in their own environment. Evaluation on different datasets can be achieved with minor edits to reproduce_evaluation.py, mainly to the download and data organization code. Note that some results differ slightly (variations on the order of 0.00x) when using pre-processed .png images instead of direct HU values, due to the conversion from int16 to uint8 and back.
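
To illustrate where those small variations come from, the sketch below quantizes a CT-like int16 HU array to uint8 and back. The intensity window used here is an assumed example, not necessarily MEDPSeg's exact preprocessing.

```python
# Illustration of the precision loss when HU values (int16) are stored as
# 8-bit .png and mapped back. The [-1024, 600] window is an assumed example.
import numpy as np

rng = np.random.default_rng(0)
hu = rng.integers(-1024, 600, size=(512, 512), dtype=np.int16)  # fake CT slice

lo, hi = -1024, 600
as_uint8 = np.round((hu.astype(np.float32) - lo) / (hi - lo) * 255).astype(np.uint8)
restored = (as_uint8.astype(np.float32) / 255 * (hi - lo) + lo).astype(np.int16)

# Each uint8 step spans (hi - lo) / 255 ~ 6.4 HU, so small round-trip errors
# are expected, which in turn nudge the resulting masks and metrics slightly.
print("max absolute HU error:", np.abs(hu - restored).max())
```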