# Multi-Model Approach for Chest X-ray Classification: CNNs, Transformers, and Generative Models


Group 29: Wan Yiliang, Lam Wei Ren, Maria-Lorena Poupet, Shao Yanming

## Project Motivation

Medical image classification has become a critical application of deep learning, particularly for diagnosing diseases from radiological scans. The NIH ChestX-ray14 dataset, a widely used benchmark for thoracic disease classification, is challenging because of its multi-label nature, imbalanced class distribution, and noisy annotations. Convolutional neural networks (CNNs) and Transformers have demonstrated strong feature extraction capabilities for classification tasks, and generative models show promise for detecting abnormalities. Our project aims to explore the effectiveness of CNNs, Transformers, and generative models in classifying abnormalities in the NIH ChestX-ray14 dataset.

## Dataset Overview (Maria)


## Sampling Strategies (Shao Yanming)


## Preprocessing and Data Analysis (Maria & Shao Yanming)


## CNN Models (Yiliang & Maria)


## Transformer Models (Yiliang & Shao Yanming)


## Generative Models (Wei Ren & Shao Yanming)


## Conclusion and Future Work (All)

## ***Steps adapted from the requirement slides***

1. Identify a data analysis problem that can be solved with deep learning.

   - You may use your own field of expertise or your personal interests.

   - The problem is neither too easy nor too difficult!

2. Dataset collection

   - Use existing dataset(s)

   - Develop new dataset(s)

3. Data exploration (analyze your data, get insights)

   - Use statistics

   - Use visualization libraries

4. Pre-processing

   - Data cleaning (missing features)

   - Data normalization (unbalanced scaling)

   - An important and time-consuming step: prepare data that is as clean as possible for analysis

5. Data analysis with deep learning

   - Apply machine learning to solve your data problem

   - Compare different models

6. Numerical results

   - Analysis, interpretation, conclusion


## ***Review from Teaching Team***

The project plan outlines a very good baseline for the different steps in the deep learning pipeline, and also shows additional initiative regarding the generative model. However, the project does not seem to consider the specific challenges of applying AI in healthcare:

1. Dealing with limited labelled data. Could explore generative models to obtain more data, or self-supervised learning techniques.

2. Dealing with domain shift (Non-IID). The data generated from the different hospitals creates variations in the data distribution, which the model should be invariant to.

3. Explainability. This is very important for the application of AI in healthcare since we want to know why the model made a certain prediction. In the dataset, some images also include bounding boxes, which could be used to explore localization.

4. Evaluation metric. Since false negatives are significantly more expensive than false positives, the training of the model should take this into account. This is also exacerbated by the imbalanced class problem.
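One common way to act on point 4 is to weight the positive term of the binary cross-entropy loss separately for each class, so that a missed finding costs more than a false alarm and rare classes are not drowned out. The sketch below is a minimal, framework-free illustration under stated assumptions, not the project's chosen method: the helper names (`positive_weights`, `weighted_bce`, `softplus`), the negatives-to-positives weighting rule, and the toy labels are all hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def positive_weights(labels):
    """Per-class weight = #negatives / #positives (assumed heuristic).

    `labels` is a list of multi-hot rows, one row per image.
    """
    n = len(labels)
    weights = []
    for c in range(len(labels[0])):
        pos = sum(row[c] for row in labels)
        # Fall back to 1.0 when a class has no positive example.
        weights.append((n - pos) / pos if pos > 0 else 1.0)
    return weights


def weighted_bce(logits, targets, pos_weight):
    """Mean binary cross-entropy over all (image, class) pairs.

    Only the positive term is scaled by pos_weight[c], so false
    negatives are penalized more while false positives are unchanged.
    """
    total, count = 0.0, 0
    for z_row, y_row in zip(logits, targets):
        for z, y, w in zip(z_row, y_row, pos_weight):
            log_p = -softplus(-z)      # log(sigmoid(z))
            log_not_p = -softplus(z)   # log(1 - sigmoid(z))
            total -= w * y * log_p + (1 - y) * log_not_p
            count += 1
    return total / count


# Toy example (hypothetical data): 4 images, 2 classes.
toy_labels = [[1, 0], [0, 0], [0, 1], [0, 1]]
toy_weights = positive_weights(toy_labels)
```

With weights like these, a confident miss on a rare class contributes proportionally more to the loss than the same miss on a common class. Deep learning frameworks typically offer an equivalent option (for example, the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`), and the weights could also be tuned against a recall-oriented validation metric rather than set from class counts alone.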

Besides implementing the baseline steps in the deep learning pipeline, do consider more advanced and problem-specific techniques/problems to tackle.


## References

+ Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, and Ronald M Summers. ChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2097–2106, 2017.

+ Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

+ Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 2012.

+ Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.

+ Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

+ Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.

+ Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.

+ Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.

+ Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of The ACM, 63(11):139–144, 2020.

+ Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.