Pneumonia-XRay-Differentiation-from-Kaggle-Dataset

DATASET

The dataset for this notebook is available here: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

The data contains chest X-ray images with three different labels:

NORMAL
BACTERIAL (pneumonia)
VIRAL (pneumonia)

IMPORTANT NOTE

Nothing in this project has clinical significance without clinical trials. These models should never be used diagnostically.

TASK

The original task was to differentiate between Normal and Pneumonia. This turned out to be quite simple to accomplish.
I chose the harder task of distinguishing Bacterial pneumonia from Viral pneumonia.

CHALLENGES

Several challenges presented themselves. The first was the lack of data. The training set I used contained the following number of images per category:

len(bac_fnames) : 2772
len(viral_fnames) : 1493

The difference was 1279 images, slightly less than half of the larger set (about 46%). I oversampled the smaller (viral) class to balance the dataset.
(See "A systematic study of the class imbalance problem in convolutional neural networks" by Buda et al. (arXiv:1710.05381v2) for why oversampling is often the best method for correcting imbalance issues.)
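
As a rough illustration of the balancing step, here is a minimal sketch of random oversampling with replacement. The `oversample` helper and the placeholder file names are hypothetical and are not taken from the notebook. Only the two counts come from this README.

```python
import random


def oversample(minority, target_size, seed=0):
    """Return the minority items plus random duplicates, `target_size` items in total.

    Hypothetical helper for illustration; the notebook may balance the classes differently.
    """
    if not minority:
        raise ValueError("minority list must not be empty")
    if target_size < len(minority):
        raise ValueError("target_size must be at least len(minority)")
    rng = random.Random(seed)  # fixed seed so the duplicated files are reproducible
    extra = rng.choices(minority, k=target_size - len(minority))  # sample with replacement
    return list(minority) + extra


# Placeholder file names standing in for the real lists (counts taken from this README).
bac_fnames = [f"bacterial_{i}.jpeg" for i in range(2772)]
viral_fnames = [f"viral_{i}.jpeg" for i in range(1493)]

delta = len(bac_fnames) - len(viral_fnames)  # 1279 images
viral_balanced = oversample(viral_fnames, len(bac_fnames))

print(delta, len(viral_balanced))  # prints: 1279 2772
```

If this approach is used, it is usually applied to the training split only, so that duplicated images do not appear in both the training and validation sets.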

The second challenge was overfitting. I used a ResNet-50 architecture, which tended to fit the training set much better than the validation set.
To counter this, I applied the standard set of augmentation transforms provided by FastAI, with the exception of flipping (because the left and right lungs need to remain distinguishable).
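
The augmentation choice described above might look roughly like the following sketch. It assumes fastai v2, where `aug_transforms` accepts a `do_flip` argument. The README does not state which fastai version the notebook uses, and `AUG_KWARGS` and `build_batch_tfms` are hypothetical names introduced here for illustration.

```python
# Assumption: fastai v2 is installed when build_batch_tfms() is called.
# All other augmentation settings are left at the library defaults.
AUG_KWARGS = {"do_flip": False}  # keep left and right lungs on their original sides


def build_batch_tfms():
    """Build fastai's standard augmentation transforms with flipping disabled."""
    # Imported lazily so this module can be loaded without fastai present.
    from fastai.vision.all import aug_transforms

    return aug_transforms(**AUG_KWARGS)
```

The resulting list would typically be passed as `batch_tfms` when constructing the data loaders.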

CURRENT RESULTS

At this point, I am only able to reach about 83% accuracy.

FUTURE PROJECTS

Recombine the NORMAL category with the two pneumonia categories.

frankfletcher/Pneumonia-XRay-Differentiation-from-Kaggle-Dataset