Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies

Aim of this repository

This repository was created to start establishing standards for building reliable AI solutions.

Everyone is welcome to join us in building a community focused on responsible AI for medical applications.

How to add a paper / data resource?

The repository accepts new papers and datasets. To add an item, open a pull request and attach a JSON file. The JSON templates are in the folders datasets_checklist, papers_checklist, and datasets_information. In the JSON file, state exactly which points from the checklist are fulfilled, and justify each statement in the pull request description. Community members will verify your pull request and may ask for corrections or clarifications. All discussions are visible to the public.
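Before opening a pull request, it may help to check that every checklist answer in the JSON file uses one of the symbols that appear in the tables of this README. The snippet below is a minimal sketch of such a check, not part of the repository: the function name `find_invalid_answers`, the field names, and the example entry are hypothetical, and the real field names should be taken from the templates in the folders listed above.

```python
import json

# Answer symbols observed in the summary tables of this README.
ALLOWED_ANSWERS = ("Y", "N", "?", "Y?", "N?", "n/a", "not all")


def find_invalid_answers(entry: dict) -> dict:
    """Return the checklist items whose answer is not a known symbol.

    Keys starting with "[D]" or "[R]" are treated as checklist items
    (an assumption based on the table headers); other keys are ignored.
    """
    return {
        question: answer
        for question, answer in entry.items()
        if question.startswith(("[D]", "[R]")) and answer not in ALLOWED_ANSWERS
    }


# Hypothetical submission; the real structure comes from the folder templates.
submission = json.loads("""
{
  "Study": "A. Author, Example title, Example Journal (2021).",
  "[D] Is the data preprocessing described?": "Y",
  "[D] Are artifacts (such as captions) removed?": "yes",
  "[R] Do explanations accurately indicate lesions?": "Y?"
}
""")

# Prints the items that need fixing before the pull request is opened.
print(find_invalid_answers(submission))
```

In this example the second answer, "yes", is reported because it is not one of the table symbols; an empty result means every checklist answer uses a known symbol.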

Explanation of symbols

The following symbols are used in the tables:

Summary showing which points from the checklist are fulfilled by studies.

	Study	[D] Is the data preprocessing described?	[D] Are artifacts (such as captions) removed?	[D] Are the lungs fully present after transformations?	[R] Are lung structures visible after brightness or contrast transformations?	[D] Are only sensible transformations applied?	[D] Is the transfer learning procedure described?	[D] Is the applied transfer learning appropriate for this case?	[D] Are at least a few metrics used?	[D] Is the model validated on a different database than the one used for training?	[R] Are other structures (i.e., bowel loops) misinterpreted as lungs in segmentation?	[R] All the areas marked as highly explanatory are located inside the lungs?	[R] Are artifacts misidentified as part of the explanations?	[R] Are areas indicated as explanations consistent with opinions of radiologists?	[R] Do explanations accurately indicate lesions?
0	template
1	L. Brunese, F. Mercaldo, A. Reginelli, A. Santone, Explainable Deep Learning for Pulmonary Disease and Coronavirus COVID-19 Detection from X-rays, Computer Methods and Programs in Biomedicine 196 (2020) 105608. doi:10.1016/j.cmpb.2020.105608.	Y	?	?	n/a	Y	Y	N	Y	N	n/a	Y	Y	N	Y?
2	Z. Han, B. Wei, Y. Hong, T. Li, J. Cong, X. Zhu, H. Wei, W. Zhang, Accurate Screening of COVID-19 Using Attention-Based Deep 3D Multiple Instance Learning, IEEE Transactions on Medical Imaging 39 (8) (2020) 2584–2594. doi:10.1109/TMI.2020.2996256.	N	?	?	?	?	n/a	n/a	Y	N	n/a	n/a	n/a	n/a	n/a
3	M. R. Karim, T. Döhmen, M. Cochez, O. Beyan, D. Rebholz-Schuhmann, S. Decker, DeepCOVIDExplainer: Explainable COVID-19 Diagnosis from Chest X-ray Images, in: IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020, pp. 1034–1037. doi:10.1109/BIBM49941.2020.9313304.	Y	Y	?	n/a	Y	n/a	n/a	Y	N	n/a	Y		n/a	Y?
4	E. Matsuyama, A Deep Learning Interpretable Model for Novel Coronavirus Disease (COVID-19) Screening with Chest CT Images, Journal of Biomedical Science and Engineering 13 (7) (2020) 140–152.	Y	Y	n/a	n/a	n/a	Y?	N	Y	N	n/a	n/a	N	n/a	n/a
5	Y. Oh, S. Park, J. C. Ye, Deep Learning COVID-19 Features on CXR using Limited Training Data Sets, IEEE Transactions on Medical Imaging 0062 (c) (2020) 1–1. doi:10.1109/tmi.2020.2993291.	Y	Y	Y?	n/a	Y	Y	N	Y	Y	N	Y	n/a	n/a	Y?
6	X. Ouyang, J. Huo, L. Xia, F. Shan, J. Liu, Z. Mo, F. Yan, Z. Ding, Q. Yang, B. Song, F. Shi, H. Yuan, Y. Wei, X. Cao, Y. Gao, D. Wu, Q. Wang, D. Shen, Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia, IEEE Transactions on Medical Imaging 39 (XX) (2020) 1–1. arXiv:2005.02690, doi:10.1109/tmi.2020.2995508.	Y	n/a	Y	Y	Y	n/a	n/a	Y	Y	Y?	Y	n/a	n/a	Y
7	T. Ozturk, M. Talo, E. A. Yildirim, U. B. Baloglu, O. Yildirim, U. Rajendra Acharya, Automated detection of COVID-19 cases using deep neural networks with X-ray images, Computers in Biology and Medicine 121 (April) (2020) 103792. doi:10.1016/j.compbiomed.2020.103792.	Y	?	n/a	n/a	n/a	Y	N	Y	N	n/a		n/a	n/a	N
8	S. Tabik, A. Gómez-Ríos, J. L. Martín-Rodríguez, I. Sevillano-García, M. Rey-Area, D. Charte, E. Guirado, J. L. Suárez, J. Luengo, M. A. Valero-González, P. García-Villanova, E. Olmedo-Sánchez, F. Herrera, COVIDGR Dataset and COVID-SDNet Methodology for Predicting COVID-19 Based on Chest X-Ray Images, IEEE Journal of Biomedical and Health Informatics 24 (12) (2020) 3595–3605. doi:10.1109/JBHI.2020.3037127.	Y	Y	?	?	?	Y	Y?	Y	N	N	Y?	n/a	n/a	N
9	N. Tsiknakis, E. Trivizakis, E. Vassalou, G. Papadakis, D. Spandidos, A. Tsatsakis, J. Sánchez-García, R. López-González, N. Papanikolaou, A. Karantanas, K. Marias, Interpretable artificial intelligence framework for COVID-19 screening on chest X-rays, Experimental and Therapeutic Medicine (2020) 727–735. doi:10.3892/etm.2020.8797.	Y	N	?	n/a	?	Y	N	Y	N	n/a	Y	N	Y	N
10	F. Ucar, D. Korkmaz, COVIDiagnosis-Net: Deep Bayes-SqueezeNet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images, Medical Hypotheses 140 (April) (2020) 109761. doi:10.1016/j.mehy.2020.109761.	Y	N	?	?	N	Y	N	Y	N	n/a	N	n/a	n/a	Y?
11	L. Wang, Z. Q. Lin, A. Wong, COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images, Scientific Reports 10 (1) (2020) 19549. doi:10.1038/s41598-020-76550-z.	Y	Y	?	?	N	Y	N	Y	N	n/a	Y	n/a	n/a	N
12	Y. H. Wu, S. H. Gao, J. Mei, J. Xu, D. P. Fan, R. G. Zhang, M. M. Cheng, JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation, IEEE Trans Image Process 30 (2021) 3113–3126.	Y	n/a	N?	n/a	N	Y	N	Y	N	Y	Y	n/a	n/a	Y

Summary showing which points from the checklist are fulfilled by data resources.

	Institution	Link to dataset	[D] Does the data and its associated information provide sufficient diagnostic quality?	[R] Are the low quality images rejected?	[D] Is the dataset balanced in terms of sex and age?	[R] Does the dataset contain one type of images (CT or X-ray or the same projection)?	[R] Are the lung structures visible (“lung” window) on CT images?	[D] Are images of children and of adults labelled as such within the dataset?	[R] Are images correctly categorized in relation to class of pathology?	[D] Are AP/PA projections described for every X-ray image?
0	University of Waterloo	github.com/agchung/Figure1-COVID-chestxray-dataset	Y?	N	Y?	Y	n/a	not all	N	N
1	University of Waterloo	github.com/agchung/Actualmed-COVID-chestxray-dataset	N?	N	?	Y	n/a	N	N	Y
2	Qatar & Bangladesh Universities	kaggle.com/tawsifurrahman/covid19-radiography-database	N	N	?	Y	n/a	N	Y	N
3	University of Montreal	github.com/ieee8023/covid-chestxray-dataset	N?	N	Y?	N	n/a	Y?	N	Y
4	National Institutes of Health	kaggle.com/c/rsna-pneumonia-detection-challenge	N?	N	Y?	Y	n/a	Y	N	Y
5	National Institutes of Health	nihcc.app.box.com/v/ChestXray-NIHCC	N	N	Y	Y	n/a	Y	N	Y
6	National Institutes of Health	kaggle.com/nih-chest-xrays/sample	N?	N	Y	N	n/a	Y	N	Y
7	University of Montreal	kaggle.com/praveengovi/coronahack-chest-xraydataset	N	N	?	N	n/a	N	N	N
8	University of California San Diego	kaggle.com/paultimothymooney/chest-xray-pneumonia	N?	N	N	Y	n/a	Y	Y	Y
9	University of California San Diego	github.com/UCSD-AI4H/COVID-CT	N	n/a	N	Y	N	not all	N	n/a
10	University of California San Diego	data.mendeley.com/datasets/rscbjbr9sj/2	N	Y	?	Y	n/a	N	Y	N
11	Elazig in Turkey	github.com/muhammedtalo/COVID-19	N	N	?	Y	n/a	N	N	N
12	National Library of Medicine	openi.nlm.nih.gov/gridquery?it=xg&coll=cxr&m=1&n=100	Y	N	?	N	n/a	N	Y	Y
13	Hospital Universitario San Cecilio	github.com/ari-dasci/OD-covidgr	N?	Y?	Y?	Y	n/a	N	N?	Y
14	generated using data augmentation	kaggle.com/nabeelsajid917/covid-19-x-ray-10000-images	N	N	?	N	n/a	N	N	N
15	template

This table presents the data sources. The JPEG quality factor (QF) for most images is 75; other values are indicated in the table.

	Institution	Link to dataset	Dynamic range of images	Data processing	Prepared for scientific experiments
0	University of Waterloo	github.com/lindawangg/COVID-Net b29	NaN	NaN	NaN
1	University of Waterloo	github.com/agchung/Figure1-COVID-chestxray-dataset	8 bits, 48 cases	JPG, PNG	X-ray database for research purposes only, continuously growing; Metadata: offset, sex, age, finding, survival, temperature, pO2 saturation, view, modality, artifacts/distortion, notes; Categories: covid, pneumonia, no finding
2	University of Waterloo	github.com/agchung/Actualmed-COVID-chestxray-dataset	8 bits, 237 cases	PNG, BMP	X-ray database for research purposes only, continuously growing; Metadata: finding, view, modality, notes; Categories: covid, no finding
3	Qatar & Bangladesh Universities	kaggle.com/tawsifurrahman/covid19-radiography-database	8 bits, 21165 cases	PNG, resized	X-ray database; No metadata; Categories: COVID-19 positive cases (3616), normal (10,192), lung opacity (Non-COVID lung infection - 6,012), viral pneumonia (1,345)
4	University of Montreal	github.com/ieee8023/covid-chestxray-dataset	8 bits, 951 cases	JPG, PNG, resized	X-ray database; Metadata: covid severity scores, sex, age, finding, RT_PCR_positive, survival, intubated, intubation_present, went_icu, in_icu, needed_supplemental_O2, extubated, temperature, pO2_saturation, leukocyte_count, neutrophil_count, lymphocyte_count, clinical_notes, other_notes; Categories: covid, viral, bacterial, fungal, lipoid, aspiration, unknown
5	National Institutes of Health	kaggle.com/c/rsna-pneumonia-detection-challenge	8 bits, 30227 (training)+3000 (test) cases	DICOM, resized	X-ray database of Pneumonia Detection Challenge; No metadata; Categories: normal, lung opacity, no lung opacity/not normal
6	National Institutes of Health	nihcc.app.box.com/v/ChestXray-NIHCC	8 bits, 112120 cases	PNG, resized	X-ray database of Common Thorax Disease; Metadata: finding ROI; Categories: no findings and 14 disease categories (Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax, Consolidation, Edema, Emphysema, Fibrosis, Pleural_Thickening, Hernia)
7	National Institutes of Health	kaggle.com/nih-chest-xrays/sample	8 bits, Random sample of 5606 from 112,120 images of 30,805 unique patients	PNG, resized	X-ray database; Metadata: finding labels, follow-up, age, gender, view; Categories: Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pleural_Thickening, Hernia, Pneumothorax, Consolidation, Edema, Emphysema, Fibrosis
8	University of Montreal	kaggle.com/praveengovi/coronahack-chest-xraydataset	8 bits, 5910 cases (normal-1576, covid 58, SARS-4, virus-1493, bacteria 2777, ARDS-2)	JPG, PNG-resized	Collection of chest X-rays (anterior-posterior) of healthy vs pneumonia (Corona) infected patients, along with a few other categories such as SARS (Severe Acute Respiratory Syndrome), Streptococcus & ARDS (Acute Respiratory Distress Syndrome); No metadata
9	University of California San Diego	kaggle.com/paultimothymooney/chest-xray-pneumonia	8 bits, 5863 cases	JPG	Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients aged one to five years from Guangzhou Women and Children’s Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients’ routine clinical care; Categories: normal and pneumonia; No metadata
10	University of California San Diego	github.com/UCSD-AI4H/COVID-CT	8 bits, 349 cases	Images collected (scanned) from covid-related and medical papers in PNG (covid) or JPG (normal)	This dataset has 349 CT images containing clinical findings of COVID-19 from 216 patients; Categories: covid and noncovid cases; Metadata: age, gender, location, medical history (unfortunately modest), time after the onset of illness, severity, other diseases
11	University of California San Diego	data.mendeley.com/datasets/rscbjbr9sj/2	8 bits, 5233 cases	JPG (QF=95 for normal and QF=75 for pneumonia)	Collection Chest X Ray; Categories: normal (1349 cases) vs pneumonia (3884 cases) including subcategories of bacteria and virus; No metadata
12	Elazig in Turkey	github.com/muhammedtalo/COVID-19	8 bits, 1125 cases	JPG (QF=90, subsampling2x2), PNG (resized)	X-Ray Images collection; No metadata; Categories: covid (125 cases), no findings (500 cases), pneumonia (500 cases)
13	National Library of Medicine	openi.nlm.nih.gov/gridquery?it=xg&coll=cxr&m=1&n=100	8 bits or full bits, 7470 cases	PNG (resized), Full DICOM	Chest X-rays collection with 3,955 radiology reports; Categories: 14 pulmonary categories; Metadata: time after the onset of illness, severity, other diseases, captions of symptoms as unstructured symptom description
14	Stanford University School of Medicine	stanfordmlgroup.github.io/competitions/chexpert	8 bits, 224,316 chest radiographs of 65,240 patients	JPG	Large dataset of chest X-rays which features uncertainty labels and radiologist-labeled reference standard evaluation sets; Categories: each report was labeled for the presence of 14 observations (no finding, enlarged cardiom., cardiomegaly, lesion, opacity, edema, consolidation, pneumonia, atelectasis, pneumothorax, pleural effusion, pleural other, fracture, support devices) as positive, negative, or uncertain; Metadata: related to above categories (blank for unmentioned, 0 for negative, -1 for uncertain, and 1 for positive)
15	Hospital San Juan de Alicante - University of Alicante	bimcv.cipf.es/bimcv-projects/padchest	8 bits, more than 160,000 images from 67,000 patients	PNG	PadChest: A large chest x-ray image dataset with multi-label annotated reports; the reports were labeled with 174 different radiographic findings, 19 differential diagnoses, and 104 anatomic locations; 27% of the reports were manually annotated by trained physicians; Metadata: age, sex
16	Hospital Universitario San Cecilio	github.com/ari-dasci/OD-covidgr	8 bits, 852 images	JPEG (QF=90)	X-ray images: 426 positive covid cases and 426 negative cases; only the posterior-anterior view is considered; Categories: covid severity - normal-PCR+ (76), mild (100), moderate (171), severe (79); General metadata: positive images correspond to patients who have been tested positive with RT-PCR within a time span of at most 24h between the X-ray image and the test; every image has been taken using the same type of equipment and with the same format
17	Beth Israel Deaconess Medical Center in Boston	physionet.org/content/mimic-cxr/2.0.0	full bits, 227,835 imaging studies for 65,379 patients	full DICOM	Chest radiographs with metadata: electronic health record data, DICOM metadata, free-text radiology reports; Categories: 14 pulmonary observations with an additional “uncertain” category
18	Società Italiana di Radiologia Medica e Interventistica	sirm.org/category/senza-categoria/covid-19	8 bits	mostly JPG (QF=95, subsampling2x2)	Chest radiographs with free-text radiology and clinical reports, covid confirmation; Metadata includes selected information from electronic health record (e.g. symptoms, lab exams, ARDS, ventilatory assistance, previous exams); Categories: covid confirmed or not, with 14 pulmonary observations
19	National Cancer Institute	wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI	full bits, 1308 cases	full DICOM	The Lung Image Database consists of diagnostic and lung cancer screening thoracic CT scans with marked-up annotated lesions (XML); it includes three categories ("nodule > or =3 mm," "nodule <3 mm," and "non-nodule > or =3 mm")
20	University of Brescia	brixia.github.io/#dataset	full bits, 4,707 cases	full DICOM	COVID-19 subjects, acquired with both CR and DX modalities, in AP or PA projection with highly expressive multi-zone COVID-19 severity score, fully annotated; Metadata: the multi-region 6-valued Brixia-score defined for six zones, sex, age
21	open-edit radiology resource	radiopaedia.org	8 bits, a significant number of cases, constantly updated	JPG with different QF, resized	Database for general radiological purposes; free-text radiology and clinical reports in selected cases; case reports generally differ in quantity and quality
22	generated using data augmentation	kaggle.com/nabeelsajid917/covid-19-x-ray-10000-images	8 bits, 104 cases	JPEG with different QF, resized	Corona Virus X-ray Dataset; Categories: covid and normal; No metadata
23	template

Reference

The paper for this work is available at: https://www.sciencedirect.com/science/article/pii/S0031320321002223.

If you find our work useful, please cite our paper:

    @article{Hryniewska2021review,
          title = {Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies},
          journal = {Pattern Recognition},
          volume = {118},
          pages = {108035},
          year = {2021},
          issn = {0031-3203},
          doi = {https://doi.org/10.1016/j.patcog.2021.108035},
          url = {https://www.sciencedirect.com/science/article/pii/S0031320321002223},
          author = {Weronika Hryniewska and Przemysław Bombiński and Patryk Szatkowski and Paulina Tomaszewska and Artur Przelaskowski and Przemysław Biecek},
          keywords = {COVID-19, Lungs, Computed tomography, X-ray, Explainable AI, Deep learning},
    }
