A pretrained OmniFM-DR model has been available since June 9, 2023!
Click the image to try OmniFM-DR on chest DR images.
This repository provides the official implementation of OmniFM-DR.
Key features:
- First multi-modality model for multi-task analysis of chest DR images
- The largest fully labeled chest DR dataset
- Supports 4 types of downstream tasks
  - Classification
  - Disease Localization
  - Segmentation
  - Report Generation
We have built a multimodal, multi-task model for DR data that aims to handle all tasks in this field with a single model, including report generation, disease detection, disease question answering, and segmentation. Without any fine-tuning, the model achieves satisfactory results in report generation, disease detection, and question answering.
Instruction | Task |
---|---|
What disease does this image have? | Disease entity classification |
What is the level of {} ? | Disease severity classification |
Where is {} ? | Disease location classification |
Give the accurate bounding box of {}. | Disease localization |
Please segment the {} from the given image. | Segmentation |
Describe the image. | Report generation |
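
The `{}` in the instructions above is a placeholder for a target such as a disease name. A minimal sketch of filling it in, assuming plain string substitution is all that is needed; the dictionary keys, the helper `build_instruction`, and the example target `pneumothorax` are hypothetical names chosen for illustration and are not part of this repository:

```python
# Instruction templates copied from the table above.
# The dictionary keys are hypothetical labels chosen for this sketch.
INSTRUCTION_TEMPLATES = {
    "entity_classification": "What disease does this image have?",
    "severity_classification": "What is the level of {} ?",
    "location_classification": "Where is {} ?",
    "localization": "Give the accurate bounding box of {}.",
    "segmentation": "Please segment the {} from the given image.",
    "report_generation": "Describe the image.",
}


def build_instruction(task, target=None):
    """Return the instruction for `task`, filling `{}` with `target` if present."""
    template = INSTRUCTION_TEMPLATES[task]
    if "{}" in template:
        if not target:
            raise ValueError(f"task {task!r} needs a target, e.g. a disease name")
        return template.format(target)
    # Templates without a placeholder are returned unchanged.
    return template


print(build_instruction("localization", "pneumothorax"))
print(build_instruction("report_generation"))
```

The resulting string can then be passed as the instruction alongside an image.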
We use 10 public and 6 private datasets for pre-training. The public datasets can be downloaded via the following links:
Public datasets:
- MIMIC-CXR
- VinDr-CXR
- ChestX-Det-Dataset
- ChestX-ray14
- CheXpert
- TBX11K
- object-CXR
- JSRT Database
- Shenzhen chest X-ray Set
- Montgomery County chest X-ray Set
Main Requirements
- python 3.7.4
- pytorch 1.8.1
- torchvision 0.9.1
- gradio 3.34.0
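
A quick way to compare an environment against the pinned versions above is sketched below. It is an illustration only: the helper names `base_version` and `find_mismatches` are hypothetical, the pip package name `torch` is used for pytorch, and the check assumes exact version matches are wanted, which the list above does not state explicitly.

```python
from typing import Dict, List

# Versions listed under "Main Requirements" (pytorch is installed as "torch").
REQUIRED = {
    "torch": "1.8.1",
    "torchvision": "0.9.1",
    "gradio": "3.34.0",
}


def base_version(version: str) -> str:
    """Drop a local build suffix such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(installed: Dict[str, str],
                    required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return one message per package that is missing or has a different version."""
    problems = []
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed (need {wanted})")
        elif base_version(found) != wanted:
            problems.append(f"{name}: found {found}, need {wanted}")
    return problems


# Hand-written version strings for demonstration; in a real environment
# they could be read from attributes such as torch.__version__.
for line in find_mismatches({"torch": "1.8.1+cu111", "torchvision": "0.10.0"}):
    print(line)
```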
Installation
git clone https://github.com/MedHK23/OmniFM-DR.git
cd OmniFM-DR
pip install -r requirements.txt
Training
# Before training, download the pretrained models and datasets and place them in their respective folders.
bash ./run_scripts/multi_tasks/train.sh
Testing
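from demo_base import init_task, ask_answer
from PIL import Image

# Load the model and task configuration once before asking questions.
print('Initializing Chat')
init_task()
print('Initialization Finished')

# Ask for a report on a single chest DR image.
instruction = 'describe this image'
image = Image.open('test.png').convert('RGB')
report = ask_answer(image, instruction)
print(report)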
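
To run several instructions against the same image, a small wrapper can help. The sketch below is a hypothetical helper, not part of this repository: `run_instructions` takes any callable with the same `(image, instruction)` calling convention as `ask_answer`, and it assumes the callable returns one answer per call. A stand-in function is used here so the example runs without the model.

```python
from typing import Any, Callable, Dict, Iterable


def run_instructions(ask_fn: Callable[[Any, str], Any],
                     image: Any,
                     instructions: Iterable[str]) -> Dict[str, Any]:
    """Send each instruction with the same image and collect the answers."""
    return {instruction: ask_fn(image, instruction) for instruction in instructions}


def fake_ask(image, instruction):
    """Stand-in for ask_answer so this sketch runs without the model."""
    return f"answer to: {instruction}"


results = run_instructions(
    fake_ask,
    None,  # a PIL image would be passed here in real use
    ["Describe the image.", "What disease does this image have?"],
)
for instruction, answer in results.items():
    print(instruction, "->", answer)
```

In real use, `ask_answer` would be passed in place of `fake_ask` after `init_task()` has been called.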
This project is under the Apache License. See LICENSE for details.
Much of the code is adapted from OFA.
If you find this repository useful, please consider citing this paper:
@misc{xu2023learning,
title={Learning A Multi-Task Transformer Via Unified And Customized Instruction Tuning For Chest Radiograph Interpretation},
author={Lijian Xu and Ziyu Ni and Xinglong Liu and Xiaosong Wang and Hongsheng Li and Shaoting Zhang},
year={2023},
eprint={2311.01092},
archivePrefix={arXiv},
primaryClass={cs.CV}
}