
CheXray: Automatic Diagnosis of Chest X-Rays with AI

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. Contact
  7. Acknowledgements

About The Project

The four teaching hospitals on Longwood Avenue in Boston, Massachusetts employ more radiologists than all of West Africa. In addition, it can take up to two years to train a radiologist, and much of that time is spent learning to write diagnostic reports for patients and to determine which diseases those patients need to be treated for. How do we bring this service to the populations who need it?

This project uses AI to write and summarize these diagnostic reports so that everyone can start on an equal playing field. Using PyTorch and fast.ai, I built two separate models: one, based on this research paper, generates radiology reports; the other, based on this forum thread and the fast.ai courses, summarizes the generated report, the images, and other clinical data into a list of diseases the patient most likely needs to be checked for. To use the service, visit the Binder website: Binder
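The two-model pipeline described above can be sketched in plain Python. This is a hypothetical outline only: the function names and signatures are illustrative placeholders for the two fastai learners, not the repository's actual API.

```python
from typing import Dict, List


def generate_report(xray_views: List[str]) -> str:
    """Model 1 (placeholder): generate a free-text radiology report
    from one or more chest X-ray views."""
    return f"Report generated from {len(xray_views)} view(s)."


def summarize_findings(report: str, xray_views: List[str],
                       clinical_data: Dict[str, str]) -> List[str]:
    """Model 2 (placeholder): condense the report, the images, and other
    clinical data into a list of diseases to check the patient for."""
    return ["<candidate disease>"]


def chexray_pipeline(xray_views: List[str],
                     clinical_data: Dict[str, str]) -> List[str]:
    """Run both stages end to end: images -> report -> disease list."""
    report = generate_report(xray_views)
    return summarize_findings(report, xray_views, clinical_data)
```

In the real project each placeholder would be a trained fastai learner; the sketch only shows how the second model consumes the first model's output alongside the images and clinical data.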

Built With

  • PyTorch
  • fast.ai

Getting Started

  • Go to the website link in the "About The Project" section if you just want to see the website. Otherwise, please wait: the files needed to reproduce the website have not yet been uploaded, so in the meantime the following instructions are not valid.

Prerequisites

  • Currently supported on macOS only
  • Latest version of Python
  • fastai==2.0.0
    pip install fastai==2.0.0
  • fastcore==1.0.0
    pip install fastcore==1.0.0
  • fastbook==0.0.8
    pip install fastbook==0.0.8
  • torch==1.6.0 for Jupyter Notebook
    pip install torch==1.6.0
  • torch==1.7.0 for Google Colab
    pip install torch==1.7.0
  • spacy==2.2.4
    pip install spacy==2.2.4
  • Latest version of FastText
    git clone https://github.com/facebookresearch/fastText.git
    cd fastText
    pip install .
  • Latest version of NLTK
    pip install nltk
  • Latest version of rouge-score
    pip install rouge-score
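For convenience, the pinned dependencies above can be collected into a single requirements.txt. This is a sketch under the versions listed above; FastText still needs the source install shown earlier, and the unpinned entries track their latest releases.

```
fastai==2.0.0
fastcore==1.0.0
fastbook==0.0.8
torch==1.6.0  # use torch==1.7.0 on Google Colab
spacy==2.2.4
nltk
rouge-score
# FastText is installed from source (see above), not from PyPI
```

With this file in place, `pip install -r requirements.txt` installs everything except FastText in one step.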

Installation

  1. Follow the instructions on physionet.org to become a credentialed user and sign the data use agreement to get the MIMIC-CXR dataset (make sure you have around 960 GB of storage for it)
  2. Clone the repo
    git clone https://github.com/AndrewJHinh/CheXray.git
  3. Go to local repo
    cd CheXray
  4. Install Dependencies (as mentioned in Prerequisites section)

Usage

  1. Open Jupyter Notebook

    jupyter notebook
  2. Follow mimic-cxr.ipynb
    2a. For each "training model" section, go to the corresponding Google Colab notebook, change the hardware accelerator to GPU, and train the model. (For the Google Colab notebooks imgcapall.ipynb and sum.ipynb, after changing the hardware, run the "editing files" cells, restart the runtime, and then continue to train the model.)

  3. Go to production.ipynb to test your models and check that everything works in Voila (replace "notebooks" in the URL with "voila/render")

  4. Make a separate local repo (named CheXray) within the current directory, containing everything in the sample repo

  5. Push to GitHub (using Git LFS for files larger than 25 MB)

  6. Copy the GitHub repo link into Binder and deploy

Roadmap

  • Support models that take in more than two views
  • Improve model performance
  • Use other datasets such as CheXpert and Open-I

Contributing

Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contact

Andrew Hinh - ajhinh@gmail.com - 4158103676

Project Link: https://github.com/AndrewJHinh/CheXray

Acknowledgements