Checklist for an analysis of various aspects of responsibility of models and data resources

Hryniewska/checklist

Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies

Aim of this repository

This repository was created to start the process of establishing standards for building reliable AI solutions.

Everyone can join us to build a community focusing on responsible AI for medical applications.

How to add a paper or data resource?

You can add new papers and datasets to this repository. To add a new item, open a pull request and attach a JSON file. JSON templates are available in the folders datasets_checklist, papers_checklist, and datasets_information. In the JSON file, state exactly which points of the checklist are fulfilled, and justify each statement in the pull request description. Community members will verify your pull request and may ask for corrections or clarifications. All discussions will be publicly visible.
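As a rough illustration of preparing such a JSON file, the Python sketch below builds an entry and serializes it. The field names `study` and `answers` are assumptions made for this example, not the repository's actual schema; the real keys should be copied from the template in the relevant folder. The question texts are taken from the checklist for studies.

```python
import json

# Hypothetical structure: "study" and "answers" are illustrative field names.
# Copy the real keys from the template in papers_checklist before submitting.
entry = {
    "study": "<full citation of the paper>",
    "answers": {
        "[D] Is the data preprocessing described?": "Y",
        "[D] Are artifacts (such as captions) removed?": "?",
        "[D] Are at least a few metrics used?": "Y",
        "[D] Is the model validated on a different database than the one used for training?": "N",
    },
}


def to_json(item):
    """Serialize an entry as indented JSON, keeping non-ASCII characters readable."""
    return json.dumps(item, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Print the JSON text that would be saved to a file and attached to the pull request.
    print(to_json(entry))
```

The justification for each answer still belongs in the pull request description, not in the JSON file.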

Explanation of symbols

The following symbols are used in the tables:

    Y - yes ('Y?' if the answer is probably yes)
    N - no ('N?' if the answer is probably no)
    ? - no information provided
    n/a - issue does not apply to a particular publication / data resource
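Before submitting, it can help to check that every answer uses one of these symbols. The sketch below is a minimal example, assuming the answers are held in a plain dictionary that maps a checklist question to its symbol; the function name `invalid_answers` and the dictionary layout are hypothetical and not part of the repository.

```python
# Symbols listed above, including the "probable" variants Y? and N?.
ALLOWED_SYMBOLS = {"Y", "Y?", "N", "N?", "?", "n/a"}


def invalid_answers(answers):
    """Return the questions whose answer is not one of the allowed symbols."""
    return [question for question, symbol in answers.items()
            if symbol not in ALLOWED_SYMBOLS]


if __name__ == "__main__":
    example = {
        "[D] Is the data preprocessing described?": "Y",
        "[D] Are only sensible transformations applied?": "maybe",
    }
    # Lists the questions that need a corrected symbol.
    print(invalid_answers(example))
```

Note that the data resources table also contains the value "not all" in a few cells, so a stricter or looser set may be appropriate depending on the table.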

Summary showing which points from the checklist are fulfilled by studies.

Study [D] Is the data preprocessing described? [D] Are artifacts (such as captions) removed? [D] Are the lungs fully present after transformations? [R] Are lung structures visible after brightness or contrast transformations? [D] Are only sensible transformations applied? [D] Is the transfer learning procedure described? [D] Is the applied transfer learning appropriate for this case? [D] Are at least a few metrics used? [D] Is the model validated on a different database than the one used for training? [R] Are other structures (i.e., bowel loops) misinterpreted as lungs in segmentation? [R] All the areas marked as highly explanatory are located inside the lungs? [R] Are artifacts misidentified as part of the explanations? [R] Are areas indicated as explanations consistent with opinions of radiologists? [R] Do explanations accurately indicate lesions?
0 template
1 L. Brunese, F. Mercaldo, A. Reginelli, A. Santone, Explainable Deep Learning for Pulmonary Disease and Coronavirus COVID-19 Detection from X-rays, Computer Methods and Programs in Biomedicine 196 (2020) 105608. doi:10.1016/j.cmpb.2020.105608. Y ? ? n/a Y Y N Y N n/a Y Y N Y?
2 Z. Han, B. Wei, Y. Hong, T. Li, J. Cong, X. Zhu, H. Wei, W. Zhang, Accurate Screening of COVID-19 Using Attention-Based Deep 3D Multiple Instance Learning, IEEE Transactions on Medical Imaging 39 (8) (2020) 2584–2594. doi:10.1109/TMI.2020.2996256. N ? ? ? ? n/a n/a Y N n/a n/a n/a n/a n/a
3 M. R. Karim, T. Döhmen, M. Cochez, O. Beyan, D. Rebholz-Schuhmann, S. Decker, DeepCOVIDExplainer: Explainable COVID-19 Diagnosis from Chest X-ray Images, in: IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020, pp. 1034–1037. doi:10.1109/BIBM49941.2020.9313304. Y Y ? n/a Y n/a n/a Y N n/a Y n/a Y?
4 E. Matsuyama, A Deep Learning Interpretable Model for Novel Coronavirus Disease (COVID-19) Screening with Chest CT Images, Journal of Biomedical Science and Engineering 13 (7) (2020) 140–152. Y Y n/a n/a n/a Y? N Y N n/a n/a N n/a n/a
5 Y. Oh, S. Park, J. C. Ye, Deep Learning COVID-19 Features on CXR using Limited Training Data Sets, IEEE Transactions on Medical Imaging 0062 (c) (2020) 1–1. doi:10.1109/tmi.2020.2993291. Y Y Y? n/a Y Y N Y Y N Y n/a n/a Y?
6 X. Ouyang, J. Huo, L. Xia, F. Shan, J. Liu, Z. Mo, F. Yan, Z. Ding, Q. Yang, B. Song, F. Shi, H. Yuan, Y. Wei, X. Cao, Y. Gao, D. Wu, Q. Wang, D. Shen, Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia, IEEE Transactions on Medical Imaging 39 (XX) (2020) 1–1. arXiv:2005.02690, doi:10.1109/tmi.2020.2995508. Y n/a Y Y Y n/a n/a Y Y Y? Y n/a n/a Y
7 T. Ozturk, M. Talo, E. A. Yildirim, U. B. Baloglu, O. Yildirim, U. Rajendra Acharya, Automated detection of COVID-19 cases using deep neural networks with X-ray images, Computers in Biology and Medicine 121 (April) (2020) 103792. doi:10.1016/j.compbiomed.2020.103792. Y ? n/a n/a n/a Y N Y N n/a n/a n/a N
8 S. Tabik, A. Gómez-Ríos, J. L. Martín-Rodríguez, I. Sevillano-García, M. Rey-Area, D. Charte, E. Guirado, J. L. Suárez, J. Luengo, M. A. Valero-González, P. García-Villanova, E. Olmedo-Sánchez, F. Herrera, COVIDGR Dataset and COVID-SDNet Methodology for Predicting COVID-19 Based on Chest X-Ray Images, IEEE Journal of Biomedical and Health Informatics 24 (12) (2020) 3595–3605. doi:10.1109/JBHI.2020.3037127. Y Y ? ? ? Y Y? Y N N Y? n/a n/a N
9 N. Tsiknakis, E. Trivizakis, E. Vassalou, G. Papadakis, D. Spandidos, A. Tsatsakis, J. Sánchez-García, R. López-González, N. Papanikolaou, A. Karantanas, K. Marias, Interpretable artificial intelligence framework for COVID-19 screening on chest X-rays, Experimental and Therapeutic Medicine (2020) 727–735. doi:10.3892/etm.2020.8797. Y N ? n/a ? Y N Y N n/a Y N Y N
10 F. Ucar, D. Korkmaz, COVIDiagnosis-Net: Deep Bayes-SqueezeNet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images, Medical Hypotheses 140 (April) (2020) 109761. doi:10.1016/j.mehy.2020.109761. Y N ? ? N Y N Y N n/a N n/a n/a Y?
11 L. Wang, Z. Q. Lin, A. Wong, COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images, Scientific Reports 10 (1) (2020) 19549. doi:10.1038/s41598-020-76550-z. Y Y ? ? N Y N Y N n/a Y n/a n/a N
12 Y. H. Wu, S. H. Gao, J. Mei, J. Xu, D. P. Fan, R. G. Zhang, M. M. Cheng, JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation, IEEE Trans Image Process 30 (2021) 3113–3126. Y n/a N? n/a N Y N Y N Y Y n/a n/a Y

Summary showing which points from the checklist are fulfilled by data resources.

Institution Link to dataset [D] Does the data and its associated information provide sufficient diagnostic quality? [R] Are the low quality images rejected? [D] Is the dataset balanced in terms of sex and age? [R] Does the dataset contain one type of images (CT or X-ray or the same projection)? [R] Are the lung structures visible (“lung” window) on CT images? [D] Are images of children and of adults labelled as such within the dataset? [R] Are images correctly categorized in relation to class of pathology? [D] Are AP/PA projections described for every X-ray image?
0 University of Waterloo github.com/agchung/Figure1-COVID-chestxray-dataset Y? N Y? Y n/a not all N N
1 University of Waterloo github.com/agchung/Actualmed-COVID-chestxray-dataset N? N ? Y n/a N N Y
2 Qatar & Bangladesh Universities kaggle.com/tawsifurrahman/covid19-radiography-database N N ? Y n/a N Y N
3 University of Montreal github.com/ieee8023/covid-chestxray-dataset N? N Y? N n/a Y? N Y
4 National Institutes of Health kaggle.com/c/rsna-pneumonia-detection-challenge N? N Y? Y n/a Y N Y
5 National Institutes of Health nihcc.app.box.com/v/ChestXray-NIHCC N N Y Y n/a Y N Y
6 National Institutes of Health kaggle.com/nih-chest-xrays/sample N? N Y N n/a Y N Y
7 University of Montreal kaggle.com/praveengovi/coronahack-chest-xraydataset N N ? N n/a N N N
8 University of California San Diego kaggle.com/paultimothymooney/chest-xray-pneumonia N? N N Y n/a Y Y Y
9 University of California San Diego github.com/UCSD-AI4H/COVID-CT N n/a N Y N not all N n/a
10 University of California San Diego data.mendeley.com/datasets/rscbjbr9sj/2 N Y ? Y n/a N Y N
11 Elazig in Turkey github.com/muhammedtalo/COVID-19 N N ? Y n/a N N N
12 National Library of Medicine openi.nlm.nih.gov/gridquery?it=xg&coll=cxr&m=1&n=100 Y N ? N n/a N Y Y
13 Hospital Universitario San Cecilio github.com/ari-dasci/OD-covidgr N? Y? Y? Y n/a N N? Y
14 generated using data augmentation kaggle.com/nabeelsajid917/covid-19-x-ray-10000-images N N ? N n/a N N N
15 template

This table presents the data sources. The JPEG quality factor (QF) for most images has been set to 75; other cases are indicated.

Institution Link to dataset Dynamic range of images Data processing Prepared for scientific experiments
0 University of Waterloo github.com/lindawangg/COVID-Net b29 NaN NaN NaN
1 University of Waterloo github.com/agchung/Figure1-COVID-chestxray-dataset 8 bits, 48 cases JPG, PNG X-ray database for research purposes only, continuously growing; Metadata: offset, sex, age, finding, survival, temperature, pO2 saturation, view, modality, artifacts/distortion, notes; Categories: covid, pneumonia, no finding
2 University of Waterloo github.com/agchung/Actualmed-COVID-chestxray-dataset 8 bits, 237 cases PNG, BMP X-ray database for research purposes only, continuously growing; Metadata: finding, view, modality, notes; Categories: covid, no finding
3 Qatar & Bangladesh Universities kaggle.com/tawsifurrahman/covid19-radiography-database 8 bits, 21165 cases PNG, resized X-ray database; No metadata; Categories: COVID-19 positive cases (3616), normal (10,192), lung opacity (Non-COVID lung infection - 6,012), viral pneumonia (1,345)
4 University of Montreal github.com/ieee8023/covid-chestxray-dataset 8 bits, 951 cases JPG, PNG, resized X-ray database; Metadata: covid severity scores, sex, age, finding, RT_PCR_positive, survival, intubated, intubation_present, went_icu, in_icu, needed_supplemental_O2, extubated, temperature, pO2_saturation, leukocyte_count, neutrophil_count, lymphocyte_count, clinical_notes, other_notes; Categories: covid, viral, bacterial, fungal, lipoid, aspiration, unknown
5 National Institutes of Health kaggle.com/c/rsna-pneumonia-detection-challenge 8 bits, 30227 (training)+3000 (test) cases DICOM, resized X-ray database of Pneumonia Detection Challenge; No metadata; Categories: normal, lung opacity, no lung opacity/not normal
6 National Institutes of Health nihcc.app.box.com/v/ChestXray-NIHCC 8 bits, 112120 cases PNG, resized X-ray database of Common Thorax Disease; Metadata: finding ROI; Categories: no findings and 14 disease categories (Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax, Consolidation, Edema, Emphysema, Fibrosis, Pleural_Thickening, Hernia)
7 National Institutes of Health kaggle.com/nih-chest-xrays/sample 8 bits, Random sample of 5606 from 112,120 images of 30,805 unique patients PNG, resized X-ray database; Metadata: finding labels, follow-up, age, gender, view; Categories: Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pleural_Thickening, Hernia, Pneumothorax, Consolidation, Edema, Emphysema, Fibrosis
8 University of Montreal kaggle.com/praveengovi/coronahack-chest-xraydataset 8 bits, 5910 cases (normal-1576, covid 58, SARS-4, virus-1493, bacteria 2777, ARDS-2) JPG,PNG-resized Collection Chest X Ray (anterior-posterior) of Healthy vs Pneumonia (Corona) affected patients infected patients along with few other categories such as SARS (Severe Acute Respiratory Syndrome), Streptococcus & ARDS (Acute Respiratory Distress Syndrome); No metadata
9 University of California San Diego kaggle.com/paultimothymooney/chest-xray-pneumonia 8 bits, 5863 cases JPG Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients of one to five years old from Guangzhou Women and Children’s Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients’ routine clinical care.; Categories: normal and pneumonia; No metadata
10 University of California San Diego github.com/UCSD-AI4H/COVID-CT 8 bits, 349 cases Images collected (scanned) from covid-related and medical papers in PNG (covid) or JPG (normal) This dataset has 349 CT images containing clinical findings of COVID-19 from 216 patients; Categories: covid and noncovid cases; Metadata: age, gender, location, medical history (unfortunately modest), time after the onset of illness, severity, other diseases
11 University of California San Diego data.mendeley.com/datasets/rscbjbr9sj/2 8 bits, 5233 cases JPG (QF=95 for normal and QF=75 for pneumonia) Collection Chest X Ray; Categories: normal (1349 cases) vs pneumonia (3884 cases) including subcategories of bacteria and virus; No metadata
12 Elazig in Turkey github.com/muhammedtalo/COVID-19 8 bits, 1125 cases JPG (QF=90, subsampling2x2), PNG (resized) X-Ray Images collection; No metadata; Categories: covid (125 cases), no findings (500 cases), pneumonia (500 cases)
13 National Library of Medicine openi.nlm.nih.gov/gridquery?it=xg&coll=cxr&m=1&n=100 8 bits or full bits, 7470 cases PNG (resized), Full DICOM Chest X-rays collection with 3,955 radiology reports; Categories: 14 pulmonary categories; Metadata: time after the onset of illness, severity, other diseases, captions of symptoms as unstructured symptom description
14 Stanford University School of Medicine stanfordmlgroup.github.io/competitions/chexpert 8 bits, 224,316 chest radiographs of 65,240 patients JPG Large dataset of chest X-rays which features uncertainty labels and radiologist-labeled reference standard evaluation sets; Categories: each report was labeled for the presence of 14 observations (no finding, enlarged cardiom., cardiomegaly, lesion, opacity, edema, consolidation, pneumonia, atelectasis, pneumothorax, pleural effusion, pleural other, fracture, support devices) as positive, negative, or uncertain; Metadata: related to above categories (blank for unmentioned, 0 for negative, -1 for uncertain, and 1 for positive)
15 Hospital San Juan de Alicante - University of Alicante bimcv.cipf.es/bimcv-projects/padchest 8 bits, more than 160,000 images from 67,000 patients PNG PadChest: A large chest x-ray image dataset with multi-label annotated reports; the reports were labeled with 174 different radiographic findings, 19 differential diagnoses, and 104 anatomic locations; 27% of the reports were manually annotated by trained physicians; Metadata: age, sex
16 Hospital Universitario San Cecilio github.com/ari-dasci/OD-covidgr 8 bits, 852 images JPEG (QF=90) X-ray images: 426 positive covid cases and 426 negative cases; only the posterior-anterior view is considered; Categories: covid severity - normal-PCR+ (76), mild (100), moderate (171), severe (79); General metadata: positive images correspond to patients who have been tested positive with RT-PCR within a time span of at most 24h between the X-ray image and the test; every image has been taken using the same type of equipment and with the same format
17 Beth Israel Deaconess Medical Center in Boston physionet.org/content/mimic-cxr/2.0.0 full bits, 227,835 imaging studies for 65,379 patients full DICOM Chest radiographs with metadata: electronic health record data, dicom metadata, free-text radiology reports; Categories: 14 pulmonary observations with an additional “uncertain” category
18 Società Italiana di Radiologia Medica e Interventistica sirm.org/category/senza-categoria/covid-19 8 bits mostly JPG (QF=95, subsampling2x2) Chest radiographs with free-text radiology and clinical reports, covid confirmation; Metadata includes selected information from electronic health record (e.g. symptoms, lab exams, ARDS, ventilatory assistance, previous exams); Categories: covid confirmation or no with 14 pulmonary observations
19 National Cancer Institute wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI full bits, 1308 cases full DICOM The Lung Image Database consists of diagnostic and lung cancer screening thoracic CT scans with marked-up annotated lesions (XML); it includes three categories ("nodule > or =3 mm," "nodule <3 mm," and "non-nodule > or =3 mm");
20 University of Brescia brixia.github.io/#dataset full bits, 4,707 cases full DICOM COVID-19 subjects, acquired with both CR and DX modalities, in AP or PA projection with highly expressive multi-zone COVID-19 severity score, fully annotated; Metadata: the multi-region 6-valued Brixia-score defined for six zones, sex, age
21 open-edit radiology resource radiopaedia.org 8 bits, a significant number of cases, constantly updated JPG with different QF, resized Database of general radiological purposes; in selected cases free-text radiology and clinical reports, selected; generally, quantitatively and qualitatively differentiated case reports
22 generated using data augmentation kaggle.com/nabeelsajid917/covid-19-x-ray-10000-images 8 bits, 104 cases JPEG with different QF, resized Corona Virus X-ray Dataset; Categories: covid and normal; No metadata
23 template

Reference

The paper for this work is available at: https://www.sciencedirect.com/science/article/pii/S0031320321002223.

If you find our work useful, please cite our paper:

    @article{Hryniewska2021review,
          title = {Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies},
          journal = {Pattern Recognition},
          volume = {118},
          pages = {108035},
          year = {2021},
          issn = {0031-3203},
          doi = {https://doi.org/10.1016/j.patcog.2021.108035},
          url = {https://www.sciencedirect.com/science/article/pii/S0031320321002223},
          author = {Weronika Hryniewska and Przemysław Bombiński and Patryk Szatkowski and Paulina Tomaszewska and Artur Przelaskowski and Przemysław Biecek},
          keywords = {COVID-19, Lungs, Computed tomography, X-ray, Explainable AI, Deep learning},
    }