Nandita Gautam*, Abhishek Basu*, and Ram Sarkar
* Equally contributing first authors
Published in Neural Computing and Applications (Nov 2023), Springer
Lung cancer remains a prevalent and deadly disease, claiming numerous lives annually. Early detection plays a pivotal role in significantly improving survival rates, by up to 50–70%. Therefore, developing a robust lung cancer detection system holds immense potential to positively impact human survival. Computed tomography (CT) scan images offer invaluable information about lung nodules, and the emergence of machine learning and deep learning techniques has empowered radiologists in their diagnostic tasks. In this study, we propose a new ensemble of deep learning models to accurately classify the severity of lung nodules. Our approach combines deep transfer learning with ensemble learning. Specifically, three state-of-the-art convolutional neural network (CNN) models, namely ResNet-152, DenseNet-169, and EfficientNet-B7, are employed. To enhance the ensemble method's performance, we introduce a novel scheme for selecting and assigning weights to each base model. Unlike conventional methods that often rely on manual experimentation to set weights, our approach fuses the scores of two standard assessment metrics, the ROC-AUC score and the F1-score, for a more accurate weight vector determination. To evaluate the effectiveness of our method, we conduct extensive testing using the publicly available CT scan dataset, LIDC-IDRI. Our proposed ensemble achieves an accuracy of 97.23%, surpassing various recent methods and outperforming commonly used ensemble techniques. Furthermore, our novel weight optimization strategy significantly reduces false negatives, leading to a sensitivity of 98.6%.
- A weighted average ensemble technique is proposed for boosting the performance of the base CNN models in lung cancer classification using CT scan images.
- The weights assigned to the classifiers are determined by fusing two evaluation metrics: the F1-score and the ROC-AUC score. Rather than setting the weights solely from classifier accuracy or by manual experimentation, we derive them with a hyperbolic tangent function and then optimize them with a novel technique based on the recall scores of the base models.
- The proposed model has been evaluated on the publicly available lung CT dataset, called the LIDC-IDRI dataset. The obtained results are found to be superior to those from state-of-the-art techniques, demonstrating the method’s applicability in the real world.
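
The weighted-average idea in the points above can be illustrated with a minimal sketch. It is not the exact formulation from the paper: the fusion rule `tanh(F1) + tanh(AUC)`, the normalization, and the function names `fused_weights` and `weighted_average_ensemble` are assumptions chosen for illustration, and the recall-based weight optimization step is not shown. The scores and probabilities in the usage lines are made-up placeholders, not results from LIDC-IDRI.

```python
import numpy as np


def fused_weights(f1_scores, auc_scores):
    """Turn per-model F1 and ROC-AUC scores into normalized ensemble weights.

    Assumption: each metric is passed through tanh and the two are summed.
    This is only one plausible way to fuse the metrics.
    """
    f1 = np.asarray(f1_scores, dtype=float)
    auc = np.asarray(auc_scores, dtype=float)
    raw = np.tanh(f1) + np.tanh(auc)
    return raw / raw.sum()  # weights sum to 1


def weighted_average_ensemble(probabilities, weights):
    """Combine class probabilities from several base models.

    probabilities: array of shape (n_models, n_samples, n_classes)
    weights:       array of shape (n_models,)
    Returns (predicted_labels, fused_probabilities).
    """
    probs = np.asarray(probabilities, dtype=float)
    w = np.asarray(weights, dtype=float)
    fused = np.tensordot(w, probs, axes=1)  # shape (n_samples, n_classes)
    return fused.argmax(axis=1), fused


# Placeholder validation scores for three hypothetical base models.
weights = fused_weights([0.95, 0.90, 0.97], [0.98, 0.93, 0.99])

# Placeholder softmax outputs: 3 models, 2 samples, 2 classes.
probs = [
    [[0.6, 0.4], [0.2, 0.8]],
    [[0.4, 0.6], [0.3, 0.7]],
    [[0.7, 0.3], [0.1, 0.9]],
]
labels, fused = weighted_average_ensemble(probs, weights)
```

In this sketch, a base model with higher F1 and ROC-AUC scores receives a larger share of the vote, so the ensemble relies on it more heavily when the models disagree.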
If you're using this article or code in your research or applications, please consider citing using this BibTeX:
@article{gautam2024lung,
title={Lung cancer detection from thoracic CT scans using an ensemble of deep learning models},
author={Gautam, Nandita and Basu, Abhishek and Sarkar, Ram},
journal={Neural Computing and Applications},
volume={36},
number={5},
pages={2459--2477},
year={2024},
publisher={Springer}
}