This repository contains a dataset of MRI scans used for the detection of brain tumors. The dataset, sourced from Kaggle, includes a variety of MRI images that can be instrumental in training machine learning models for medical diagnostic processes.
[Video: mri_generated_video.mp4]
The dataset from Kaggle includes MRI images that are labeled based on the presence of brain tumors. For detailed information and to download the dataset, please visit the Kaggle Brain Tumor MRI Dataset.
I am utilizing Google's Gemini Pro Vision to process the MRI images in this dataset. The purpose is to create a proof of concept that can assist radiologists in generating template reports.
The Python code snippet included in this repository demonstrates how to use the Gemini Pro Vision model with LangChain to analyze MRI images and generate preliminary medical reports, which medical professionals such as radiologists can then review and finalize. This work can help with:
- Efficiency: Saves time for medical professionals by automating the initial analysis process
- Triaging: Helps prioritize urgent cases by quickly identifying abnormalities
- Workflow optimization: Increases efficiency in the healthcare system by streamlining workflows
- Diagnosis facilitation: Provides a starting point for further investigation and treatment planning
- Accuracy and consistency: Enhances accuracy and consistency in report generation
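The LangChain call described above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the prompt wording, the function names, and the choice to pass the image as a base64 data URI are all assumptions made for illustration, and the model name is simply taken from the text. The LangChain imports are placed inside `generate_report` so the helper functions can be used without the package installed.

```python
import base64
import mimetypes
from pathlib import Path

# Hypothetical prompt; the wording used in the notebook may differ.
REPORT_PROMPT = (
    "You are assisting a radiologist. Describe the findings in this brain MRI "
    "and draft a template report with Findings, Impression, Differential "
    "Diagnosis, and Recommendations sections."
)


def image_to_data_uri(image_path):
    """Encode a local image file as a base64 data URI."""
    path = Path(image_path)
    # Fall back to JPEG when the extension is not recognized.
    mime_type = mimetypes.guess_type(path.name)[0] or "image/jpeg"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_message_content(image_path, prompt=REPORT_PROMPT):
    """Build the multimodal content list: one text part and one image part."""
    return [
        {"type": "text", "text": prompt},
        {"type": "image_url", "image_url": image_to_data_uri(image_path)},
    ]


def generate_report(image_path, model_name="gemini-pro-vision"):
    """Send one MRI image to Gemini through LangChain and return the draft text.

    Assumes the langchain-google-genai package is installed and that
    GOOGLE_API_KEY is set in the environment.
    """
    # Imported lazily so the helpers above work without the package installed.
    from langchain_core.messages import HumanMessage
    from langchain_google_genai import ChatGoogleGenerativeAI

    llm = ChatGoogleGenerativeAI(model=model_name)
    message = HumanMessage(content=build_message_content(image_path))
    return llm.invoke([message]).content
```

Calling `generate_report("path/to/scan.jpg")` (a placeholder path) would return the model's draft as a string, which a radiologist would still need to review before any clinical use.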
Once the reports are created, I use OpenAI's text-to-speech (TTS) model (referred to in this repository as Whisper) to generate audio files, so that radiologists can listen to the radiological reports instead of reading them. Audio reports offer several potential benefits in the medical field:
- Learning and Training: Audio files can be used in educational settings, allowing students to listen to sample reports and familiarize themselves with radiological terminology and diagnosis processes.
- Patient Communication: Simplified audio reports could be shared with patients to help them better understand their diagnoses in a more accessible format.
- Language Translation: Coupled with language translation models, the TTS model can generate audio reports in multiple languages, improving accessibility for non-English-speaking patients and professionals.
- Documentation and Archiving: Audio archives of radiological reports can complement written records, offering an additional layer of documentation for clinical cases.
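As a rough illustration of the audio step, the sketch below sends report text to OpenAI's speech endpoint and writes MP3 files. It is an assumption-laden example rather than the notebook's code: the model name `tts-1`, the voice `alloy`, the output file naming, and both function names are illustrative choices. Because the endpoint limits each request's input to 4096 characters, longer reports are split at sentence boundaries first.

```python
import re
from pathlib import Path

# OpenAI's speech endpoint limits each request's input to 4096 characters.
MAX_TTS_CHARS = 4096


def split_for_tts(text, max_chars=MAX_TTS_CHARS):
    """Split text into chunks of at most max_chars, breaking after sentences where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_report(report_text, out_dir, api_key=None, model="tts-1", voice="alloy"):
    """Write one MP3 file per chunk and return the file paths.

    Assumes the openai package is installed. If api_key is None, the client
    falls back to the OPENAI_API_KEY environment variable.
    """
    from openai import OpenAI  # imported lazily so split_for_tts works without it

    client = OpenAI(api_key=api_key)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, chunk in enumerate(split_for_tts(report_text)):
        target = out_dir / f"report_part_{index:02d}.mp3"
        with client.audio.speech.with_streaming_response.create(
            model=model, voice=voice, input=chunk
        ) as response:
            response.stream_to_file(target)
        paths.append(target)
    return paths
```

Splitting at sentence boundaries keeps each audio segment from cutting off mid-sentence, which matters more for spoken reports than for text.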
- Set your GOOGLE_API_KEY and api_key (OpenAI's API key) in the environment.
- Process the MRI images using the Gemini Pro Vision model.
- Generate a medical report for each image in the gemini_w_mri_data.ipynb notebook.
- Generate an audio file with OpenAI's text-to-speech model in the whisper_audio_report_generation.ipynb notebook.
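Before running either notebook, it can help to confirm that both keys from the first step are actually set. The check below is a small sketch; the variable names come from the step above, while the function names are hypothetical.

```python
import os

# Environment variable names as given in the setup step above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "api_key")


def missing_api_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def require_api_keys(env=None):
    """Raise a RuntimeError that lists every missing variable."""
    missing = missing_api_keys(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_api_keys()` at the top of a notebook fails early with a clear message instead of partway through a batch of images.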
Below is a conceptual example of how the output might look:
Audio version of the radiology report, generated with OpenAI's text-to-speech model:
[Video: example_radiology_audio.mp4]
Case 1
- Patient: John Smith
- Age: 45
- Sex: Male
- Date: 01/01/2023
- Study: T2 MRI of the brain
- Diagnosis: Encephalomalacia
Findings:
- There is a large area of encephalomalacia in the right frontal lobe.
- The encephalomalacia is surrounded by a rim of gliosis.
- There is no evidence of mass effect.
Impression:
- Encephalomalacia in the right frontal lobe.
Differential Diagnosis:
- Stroke
- Trauma
- Infection
- Tumor
Recommendations:
- The patient should be evaluated by a neurologist.
- The patient should undergo a CT scan of the brain to rule out any other pathology.
- The patient should be started on a course of corticosteroids to reduce inflammation.
[Video: radiology_report.mp4]
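The sections of the sample report above map naturally onto a small data structure, which can make it easier to render the same report as text for review or as input to the audio step. The class and field names below are hypothetical and only mirror the headings in the example; the repository may represent reports differently.

```python
from dataclasses import dataclass, field


@dataclass
class RadiologyReport:
    """Illustrative container mirroring the sections of the sample report."""

    study: str
    findings: list = field(default_factory=list)
    impression: list = field(default_factory=list)
    differential_diagnosis: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)

    def to_text(self):
        """Render the report as plain text, skipping empty sections."""
        sections = [
            ("Findings", self.findings),
            ("Impression", self.impression),
            ("Differential Diagnosis", self.differential_diagnosis),
            ("Recommendations", self.recommendations),
        ]
        lines = [f"Study: {self.study}"]
        for title, items in sections:
            if not items:
                continue
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```

For example, `RadiologyReport(study="T2 MRI of the brain", findings=["There is no evidence of mass effect."]).to_text()` produces a three-line text with the study, a Findings heading, and one bullet.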
Contributions to this project are welcome. Please ensure that you update tests as appropriate.
This project is licensed under the MIT License - see the LICENSE.md file for details.