OmniFM-DR

News

June 9, 2023: a pretrained OmniFM-DR model is now available!

Online Demo

Click the image to try OmniFM-DR on chest DR images.

Key Features

This repository provides the official implementation of OmniFM-DR.

  • First multi-modality model for multi-task analysis of chest DR images
  • The largest fully labeled chest DR dataset
  • Supports 4 types of downstream tasks
    • Classification
    • Disease Localization
    • Segmentation
    • Report Generation

Links

Details

We have built a multimodal, multitask model for DR data that aims to handle the tasks in this field with a single model, including report generation, disease detection, disease question answering, and segmentation. Without any fine-tuning, the model achieves satisfactory results in report generation, disease detection, and question answering.

Instructions

| Instruction | Task |
| --- | --- |
| What disease does this image have? | Disease entity classification |
| What is the level of {} ? | Disease severity classification |
| Where is {} ? | Disease location classification |
| Give the accurate bounding box of {}. | Disease localization |
| Please segment the {} from the given image. | Segmentation |
| Describe the image. | Report generation |
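
The `{}` placeholder in the instructions listed above stands for the disease or structure being asked about. As a minimal sketch of how such prompts could be assembled, the helper below fills the placeholder with a target name. The task keys, the `build_instruction` function, and the example target are hypothetical names chosen for illustration; they are not part of this repository.

```python
# Instruction templates copied from the instruction list above.
# The dictionary keys are hypothetical task names used only in this sketch.
INSTRUCTION_TEMPLATES = {
    "entity_classification": "What disease does this image have?",
    "severity_classification": "What is the level of {} ?",
    "location_classification": "Where is {} ?",
    "localization": "Give the accurate bounding box of {}.",
    "segmentation": "Please segment the {} from the given image.",
    "report_generation": "Describe the image.",
}


def build_instruction(task, target=None):
    """Return the instruction for `task`, filling `{}` with `target` if needed."""
    template = INSTRUCTION_TEMPLATES[task]
    if "{}" in template:
        if not target:
            raise ValueError(f"task '{task}' needs a target, e.g. a disease name")
        return template.format(target)
    # Templates without a placeholder are returned unchanged.
    return template


if __name__ == "__main__":
    # "pneumothorax" is only an example target.
    print(build_instruction("localization", "pneumothorax"))
    print(build_instruction("report_generation"))
```

The resulting string can then be used as the `instruction` argument in the Testing snippet further down.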

Dataset Links

We utilize 10 public and 6 private datasets for pre-training and provide the download via the following links:

Public dataset:

Get Started

Main Requirements

  • python 3.7.4
  • pytorch 1.8.1
  • torchvision 0.9.1
  • gradio 3.34.0

Installation

git clone https://github.com/MedHK23/OmniFM-DR.git
cd OmniFM-DR
pip install -r requirements.txt

Training

# Before training, download the pretrained models and datasets and place them in their respective folders.
bash ./run_scripts/multi_tasks/train.sh

Testing

from demo_base import init_task, ask_answer
from PIL import Image

# Initialize the task once before asking any questions.
print('Initializing Chat')
init_task()
print('Initialization Finished')

# Ask for a report on a single chest DR image.
instruction = 'describe this image'
image = Image.open('test.png').convert('RGB')
report = ask_answer(image, instruction)
print(report)
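
Because the same image can be queried with different instructions, it may be convenient to loop over several of them. The sketch below assumes only the call pattern `ask_answer(image, instruction)` shown above. The helper name `ask_many` and the stand-in `fake_ask` are hypothetical, and the answering function is passed in as a parameter so the example runs without the repository installed.

```python
def ask_many(image, instructions, ask_fn):
    """Ask each instruction about the same image and collect the answers.

    `ask_fn` is expected to be called like `ask_answer(image, instruction)`.
    """
    answers = {}
    for instruction in instructions:
        answers[instruction] = ask_fn(image, instruction)
    return answers


if __name__ == "__main__":
    # Stand-in for `ask_answer`, used only to show the call pattern.
    def fake_ask(image, instruction):
        return f"<answer to: {instruction}>"

    results = ask_many(
        image=None,  # in real use: Image.open('test.png').convert('RGB')
        instructions=["Describe the image.", "What disease does this image have?"],
        ask_fn=fake_ask,
    )
    for instruction, answer in results.items():
        print(instruction, "->", answer)
```

In real use, `fake_ask` would be replaced by `ask_answer` after `init_task()` has been called.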

🛡️ License

This project is under the Apache License. See LICENSE for details.

🙏 Acknowledgement

A lot of code is modified from OFA.

📝 Citation

If you find this repository useful, please consider citing this paper:

@misc{xu2023learning,
      title={Learning A Multi-Task Transformer Via Unified And Customized Instruction Tuning For Chest Radiograph Interpretation}, 
      author={Lijian Xu and Ziyu Ni and Xinglong Liu and Xiaosong Wang and Hongsheng Li and Shaoting Zhang},
      year={2023},
      eprint={2311.01092},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}