yun-liu/Tuberculosis

This is the official repository for "Revisiting Computer-Aided Tuberculosis Diagnosis".

Introduction

Tuberculosis (TB) is a major global health threat, causing millions of deaths annually. Although early diagnosis and treatment can greatly improve the chances of survival, it remains a major challenge, especially in developing countries. Recently, computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data. To address this, we establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas. This dataset enables the training of sophisticated detectors for high-quality CTD. Furthermore, we leverage the bilateral symmetry property of CXR images to propose a strong baseline, SymFormer, for simultaneous CXR image classification and TB infection area detection. To promote future research on CTD, we build a benchmark by introducing evaluation metrics, evaluating baseline models reformed from existing detectors, and running an online challenge.

This work extends the preliminary CVPR 2020 version ("Rethinking Computer-aided Tuberculosis Diagnosis", CVPR 2020, Oral) by proposing a novel SymFormer framework for CTD and validating its effectiveness with extensive experiments.

Related Links

[PDF] [Project Page] [Dataset on Google Drive] [Dataset on Baidu Yunpan] [Online Challenge] [Chinese Translation]

Requirements:

  • torch==1.9.0
  • torchvision==0.10.0
  • mmcv==1.3.12

Run pip install -v -e . to install this repository.
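Before installing, it can help to confirm that the environment matches the pinned versions above. The following is a minimal sketch using only the Python standard library; `PINNED`, `base_version`, and `check_pins` are illustrative names, not part of this repository. It assumes the packages are registered under the distribution names listed above (an `mmcv-full` installation, for instance, would be reported as missing under the name `mmcv`).

```python
from importlib import metadata

# Pinned versions copied from the Requirements list above.
PINNED = {"torch": "1.9.0", "torchvision": "0.10.0", "mmcv": "1.3.12"}


def base_version(version):
    """Drop a local build suffix such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatch or missing package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = base_version(get_version(name))
        except metadata.PackageNotFoundError:
            found = None  # package is not installed under this name
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means all three pins are satisfied; any printed line names a package to reinstall.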

TBX11K Dataset

Summary of publicly available TB datasets. The size of our dataset is about $17\times$ larger than that of the previous largest dataset. Besides, our dataset annotates TB infection areas with bounding boxes, instead of only image-level labels.

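| Datasets | Pub. Year | #Classes | Annotations | #Samples |
| --- | --- | --- | --- | --- |
| MC | 2014 | 2 | Image-level | 138 |
| Shenzhen | 2014 | 2 | Image-level | 662 |
| DA | 2014 | 2 | Image-level | 156 |
| DB | 2014 | 2 | Image-level | 150 |
| TBX11K (Ours) | 2020 & 2023 | 4 | Bounding box | 11,200 |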
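The "about $17\times$" figure can be checked directly from the sample counts in the summary table. A small sketch (the variable names are illustrative):

```python
# Sample counts copied from the dataset summary table.
SAMPLES = {"MC": 138, "Shenzhen": 662, "DA": 156, "DB": 150, "TBX11K": 11200}

# Find the largest dataset other than TBX11K and compare sizes.
previous = {name: n for name, n in SAMPLES.items() if name != "TBX11K"}
largest_name = max(previous, key=previous.get)
ratio = SAMPLES["TBX11K"] / previous[largest_name]

print(f"{largest_name}: {previous[largest_name]} images, ratio = {ratio:.1f}x")
```

The previous largest dataset is Shenzhen with 662 images, giving a ratio of roughly 16.9, which rounds to 17.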

Split for the TBX11K dataset. Active & Latent TB refers to CXR images with both active and latent TB; Active TB refers to CXR images with only active TB; Latent TB refers to CXR images with only latent TB; Uncertain TB refers to TB CXR images whose type of TB infection cannot be determined under current medical conditions.

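| Classes | | Train | Val | Test | Total |
| --- | --- | --- | --- | --- | --- |
| Non-TB | Healthy | 3,000 | 800 | 1,200 | 5,000 |
| Non-TB | Sick & Non-TB | 3,000 | 800 | 1,200 | 5,000 |
| TB | Active TB | 473 | 157 | 294 | 924 |
| TB | Latent TB | 104 | 36 | 72 | 212 |
| TB | Active & Latent TB | 23 | 7 | 24 | 54 |
| TB | Uncertain TB | 0 | 0 | 10 | 10 |
| Total | | 6,600 | 1,800 | 2,800 | 11,200 |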
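When writing a data loader, a quick sanity check against the split table can catch missing or duplicated images. The sketch below hard-codes the counts from the table and recomputes the totals; `SPLIT`, `split_totals`, and `class_totals` are illustrative names, not part of this repository.

```python
# Image counts per class and split, copied from the split table.
SPLIT = {
    "Healthy": {"train": 3000, "val": 800, "test": 1200},
    "Sick & Non-TB": {"train": 3000, "val": 800, "test": 1200},
    "Active TB": {"train": 473, "val": 157, "test": 294},
    "Latent TB": {"train": 104, "val": 36, "test": 72},
    "Active & Latent TB": {"train": 23, "val": 7, "test": 24},
    "Uncertain TB": {"train": 0, "val": 0, "test": 10},
}


def split_totals(split):
    """Sum the image counts of all classes for each of train/val/test."""
    totals = {"train": 0, "val": 0, "test": 0}
    for counts in split.values():
        for part, n in counts.items():
            totals[part] += n
    return totals


def class_totals(split):
    """Sum the image counts over train/val/test for each class."""
    return {name: sum(counts.values()) for name, counts in split.items()}


totals = split_totals(SPLIT)
print(totals)
print(sum(totals.values()))
```

The recomputed totals are 6,600 / 1,800 / 2,800 images for train / val / test, and 11,200 overall, matching the last row of the table.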

SymFormer

CXR image classification results on the TBX11K test data.

| Methods | Backbones | Accuracy | AUC (TB) | Sensitivity | Specificity | AP | AR | Result |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Deformable DETR | ResNet-50 w/ FPN | 91.3 | 97.6 | 89.2 | 95.3 | 89.8 | 91.0 | [JSON] [TXT] |
| SymFormer w/ Deformable DETR | ResNet-50 w/ FPN | 94.3 | 98.5 | 87.3 | 97.3 | 93.2 | 93.2 | [JSON] [TXT] |
| SymFormer w/ RetinaNet | ResNet-50 w/ FPN | 94.5 | 98.9 | 91.0 | 96.8 | 93.3 | 94.0 | [JSON] [TXT] |
| SymFormer w/ RetinaNet | P2T-Small w/ FPN | 94.6 | 99.1 | 92.1 | 96.7 | 93.4 | 94.2 | [JSON] [TXT] |

TP: True Positives; TN: True Negatives; FP: False Positives; FN: False Negatives.

#Total denotes the total number of test CXR images. We test FPS on a single TITAN XP GPU. For the ground truths, the ratio of positives (TP + FN) is 19.6%, and the ratio of negatives (TN + FP) is 80.4%.

| Methods | Backbones | #FLOPs | #Params | FPS | $F_1$ | TP/#Total | TN/#Total | FP/#Total | FN/#Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Deformable DETR | ResNet-50 w/ FPN | 54.07 | 52.67 | 23.0 | 85.6 | 17.5 | 76.6 | 3.8 | 2.1 |
| SymFormer w/ Deformable DETR | ResNet-50 w/ FPN | 54.08 | 52.69 | 22.5 | 87.9 | 17.1 | 78.2 | 2.2 | 2.5 |
| SymFormer w/ RetinaNet | ResNet-50 w/ FPN | 59.14 | 50.03 | 24.3 | 89.0 | 17.8 | 77.8 | 2.6 | 1.8 |
| SymFormer w/ RetinaNet | P2T-Small w/ FPN | 55.46 | 45.10 | 17.9 | 89.6 | 18.1 | 77.7 | 2.7 | 1.5 |
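The $F_1$ column follows from the TP/TN/FP/FN ratios using the standard definitions. The sketch below recomputes $F_1$, sensitivity, and specificity from the percentages in the table; `confusion_metrics` and `ROWS` are illustrative names, and the official numbers come from the challenge evaluation, not from this snippet. Because the inputs are already rounded to one decimal place, recomputed sensitivity and specificity may differ slightly from the reported values.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute F1, sensitivity and specificity (all in percent).

    The arguments may be raw counts or ratios of the total; only their
    relative sizes matter.
    """
    return {
        "f1": 100.0 * 2 * tp / (2 * tp + fp + fn),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


# (TP, TN, FP, FN) as percentages of #Total, copied from the table above.
ROWS = {
    "Deformable DETR, ResNet-50": (17.5, 76.6, 3.8, 2.1),
    "SymFormer w/ Deformable DETR, ResNet-50": (17.1, 78.2, 2.2, 2.5),
    "SymFormer w/ RetinaNet, ResNet-50": (17.8, 77.8, 2.6, 1.8),
    "SymFormer w/ RetinaNet, P2T-Small": (18.1, 77.7, 2.7, 1.5),
}

for name, (tp, tn, fp, fn) in ROWS.items():
    m = confusion_metrics(tp, tn, fp, fn)
    print(f"{name}: F1={m['f1']:.1f} "
          f"Sens={m['sensitivity']:.1f} Spec={m['specificity']:.1f}")
```

For every row, TP + FN sums to 19.6 and TN + FP to 80.4, consistent with the ground-truth ratios stated above.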

TB infection area detection results on our TBX11K test set.

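| Methods | Test Data | Backbones | Category-agnostic TB $AP_{50}^{bb}$ | Category-agnostic TB $AP^{bb}$ | Active TB $AP_{50}^{bb}$ | Active TB $AP^{bb}$ | Latent TB $AP_{50}^{bb}$ | Latent TB $AP^{bb}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Deformable DETR | ALL | ResNet-50 w/ FPN | 51.7 | 22.0 | 48.9 | 21.2 | 7.1 | 1.9 |
| SymFormer w/ Deformable DETR | ALL | ResNet-50 w/ FPN | 57.0 | 23.3 | 52.1 | 22.7 | 7.1 | 2.0 |
| SymFormer w/ RetinaNet | ALL | ResNet-50 w/ FPN | 68.0 | 29.5 | 62.0 | 27.3 | 13.3 | 4.4 |
| SymFormer w/ RetinaNet | ALL | P2T-Small w/ FPN | 70.4 | 30.0 | 63.6 | 26.9 | 11.4 | 4.3 |
| Deformable DETR | Only TB | ResNet-50 w/ FPN | 57.4 | 24.2 | 54.5 | 23.5 | 7.6 | 2.3 |
| SymFormer w/ Deformable DETR | Only TB | ResNet-50 w/ FPN | 60.8 | 24.5 | 55.2 | 23.8 | 9.2 | 2.6 |
| SymFormer w/ RetinaNet | Only TB | ResNet-50 w/ FPN | 73.4 | 31.5 | 67.1 | 29.2 | 14.7 | 4.8 |
| SymFormer w/ RetinaNet | Only TB | P2T-Small w/ FPN | 75.7 | 32.1 | 68.9 | 28.9 | 13.0 | 4.7 |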
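Assuming the AP50 columns follow the usual COCO-style convention, a detected box counts as a match when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. The sketch below shows only that overlap test, not the full AP computation; `box_iou` and `is_match_at_50` are illustrative names, and the paper and challenge page remain the authoritative description of the protocol.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """True if the predicted box overlaps the ground truth with IoU >= 0.5."""
    return box_iou(pred_box, gt_box) >= 0.5


# Toy boxes for illustration only; these are not dataset annotations.
print(box_iou((0, 0, 10, 10), (0, 0, 10, 5)))           # half-height box
print(is_match_at_50((0, 0, 10, 10), (5, 5, 15, 15)))   # small corner overlap
```

In the first toy case the intersection is 50 and the union is 100, so the IoU is exactly at the threshold; in the second the IoU is 25/175, well below it.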

Visualization

Visualization of the learned deep features from CXR images using SymFormer w/ RetinaNet. We randomly select CXR images from the TBX11K test set. In each example, the infection areas of active TB, latent TB, and uncertain TB are indicated by boxes colored in green, red, and blue, respectively. The ground-truth boxes are displayed with thick lines, while the detected boxes are shown with thin lines.

Train

Here, we show the training/testing commands by using P2T-Small as the backbone network and RetinaNet as the base detector.

Download the ImageNet-pretrained model first: P2T-Small.

Use the following commands to train SymFormer:

# step I: train detection
CUDA_VISIBLE_DEVICES=0 python tools/train.py \
    configs/symformer/symformer_retinanet_p2t_fpn_2x_TBX11K.py \
    --work-dir work_dirs/symformer_retinanet_p2t/ \
    --no-validate

# step II: train classification
CUDA_VISIBLE_DEVICES=0 python tools/train.py \
    configs/symformer/symformer_retinanet_p2t_cls_fpn_1x_TBX11K.py \
    --work-dir work_dirs/symformer_retinanet_p2t_cls/ \
    --no-validate

Test

Use the following commands to generate results for the TBX11K test set:

CUDA_VISIBLE_DEVICES=0 python -W ignore tools/test.py \
    configs/symformer/symformer_retinanet_p2t_cls_fpn_1x_TBX11K.py \
    work_dirs/symformer_retinanet_p2t_cls/latest.pth \
    --out work_dirs/symformer_retinanet_p2t_cls/result/result.pkl \
    --format-only --cls-filter True \
    --options "jsonfile_prefix=work_dirs/symformer_retinanet_p2t_cls/result/bbox_result" \
    --txt work_dirs/symformer_retinanet_p2t_cls/result/cls_result.txt
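
To inspect the detection output before uploading, the result file can be loaded and filtered by confidence. The sketch below rests on two assumptions that should be checked against the files actually produced: that the detection results are written as COCO-style JSON (a list of records with `image_id`, `bbox` as `[x, y, w, h]`, `score`, and `category_id`), and that the file name is the `jsonfile_prefix` followed by `.bbox.json`, as in stock MMDetection. `load_detections`, `group_by_image`, and the `score_thr` value are hypothetical helpers and settings, not part of this repository.

```python
import json
from collections import defaultdict


def load_detections(path):
    """Load a COCO-style detection list from a JSON file."""
    with open(path) as f:
        return json.load(f)


def group_by_image(detections, score_thr=0.3):
    """Keep detections with score >= score_thr, grouped by image_id.

    Within each image, detections are sorted by descending score.
    """
    grouped = defaultdict(list)
    for det in detections:
        if det["score"] >= score_thr:
            grouped[det["image_id"]].append(det)
    for dets in grouped.values():
        dets.sort(key=lambda d: d["score"], reverse=True)
    return dict(grouped)


# Made-up records for illustration; real ones would come from, e.g.,
# load_detections(".../result/bbox_result.bbox.json")  (assumed file name).
sample = [
    {"image_id": 1, "bbox": [10, 20, 50, 40], "score": 0.42, "category_id": 1},
    {"image_id": 1, "bbox": [12, 22, 48, 38], "score": 0.91, "category_id": 1},
    {"image_id": 2, "bbox": [30, 30, 20, 20], "score": 0.05, "category_id": 2},
]
for image_id, dets in group_by_image(sample).items():
    print(image_id, [d["score"] for d in dets])
```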

Online Challenge

We only release the ground truths for the training and validation sets of our TBX11K dataset. The test set is retained as an online challenge for TB X-ray classification and TB infection area detection. To participate in this challenge, you need to create an account on CodaLab and register for the TBX11K Tuberculosis Classification and Detection Challenge. Please refer to this webpage or our paper for the evaluation metrics. Then, open the "Participate" tab and read the submission guidelines carefully. Next, you can upload your submission. Once uploaded, your submissions will be evaluated automatically.

Citation

If you are using the code/model/data provided here in a publication, please consider citing our papers:

@article{liu2023revisiting,
  title={Revisiting Computer-Aided Tuberculosis Diagnosis},
  author={Liu, Yun and Wu, Yu-Huan and Zhang, Shi-Chen and Liu, Li and Wu, Min and Cheng, Ming-Ming},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2023}
}

@inproceedings{liu2020rethinking,
  title={Rethinking Computer-aided Tuberculosis Diagnosis},
  author={Liu, Yun and Wu, Yu-Huan and Ban, Yunfeng and Wang, Huifang and Cheng, Ming-Ming},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2646--2655},
  year={2020}
}

The training/testing commands in this repository use P2T-Small as the backbone network and RetinaNet as the base detector, so please also consider citing:

@article{wu2022p2t,
  title={P2T: Pyramid Pooling Transformer for Scene Understanding},
  author={Wu, Yu-Huan and Liu, Yun and Zhan, Xin and Cheng, Ming-Ming},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={45},
  number={11},
  pages={12760--12771},
  year={2023},
  publisher={IEEE}
}

@inproceedings{lin2017focal,
  title={Focal Loss for Dense Object Detection},
  author={Lin, Tsung-Yi and Goyal, Priya and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr},
  booktitle={IEEE International Conference on Computer Vision},
  pages={2980--2988},
  year={2017}
}

About

Revisiting Computer-Aided Tuberculosis Diagnosis (IEEE TPAMI)
