forked from jbdel/vilmedic

ViLMedic (Vision-and-Language medical research) is a modular framework for vision and language multimodal research in the medical field


CUCHon/vilmedic

 
 


New!🔥 Check out our live radiology report generation 📝 space on HuggingFace🤗





ViLMedic: a framework for research at the intersection of vision and language in medical AI

Installation

conda create --name vilmedic python==3.9 -y
conda activate vilmedic
git clone https://github.com/jbdel/vilmedic
cd vilmedic
python setup.py develop
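
To confirm that the install is visible to the active interpreter, the package can be looked up without importing it. This is a minimal sketch: the helper name `is_installed` is hypothetical and not part of ViLMedic, and `vilmedic` is assumed to be the import name because the Model Zoo example imports from it.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the top-level `package` can be located by this interpreter."""
    # find_spec returns None for a missing top-level package instead of raising.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("vilmedic importable:", is_installed("vilmedic"))
```

If this prints `False`, the usual suspects are a conda environment that was not activated or `setup.py` being run outside the cloned directory.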

Documentation

Learn more about ViLMedic here.

Model Zoo

ViLMedic hosts a zoo of pretrained models.

from vilmedic import AutoModel
model, processor = AutoModel.from_pretrained("selfsup/convirt-mimic")
batch = processor.inference(seq=["no acute cardiopulmonary process"],
                            image=["my_chest_xray.jpg"])

out = model(**batch)
print(out.keys())
# dict_keys(['loss', 'loss_l', 'loss_v', 'linguistic', 'visual'])
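
The `linguistic` and `visual` entries can be compared with a similarity score. The sketch below assumes, without confirmation from the snippet above, that each entry is a batch of embeddings with one row per input and the same width in both. Plain Python lists stand in for the model outputs so the example runs without downloading a checkpoint, and `cosine_similarity` is a hypothetical helper, not a ViLMedic function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in values. With a real model one might instead use
# out["linguistic"].tolist() and out["visual"].tolist() (assumed to be tensors).
linguistic = [[1.0, 0.0, 1.0]]
visual = [[1.0, 0.0, 0.0]]

# One score per (report, image) pair in the batch.
scores = [cosine_similarity(t, v) for t, v in zip(linguistic, visual)]
print(scores)
```

A score near 1 would indicate that a report and an image map to nearby points in the shared space, provided the assumption about the output shapes holds.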

| Name | Dataset | Report preprocessing |
| --- | --- | --- |
| Radiology report generation | | |
| rrg/biomed-roberta-baseline-mimic | mimic-cxr | r2gen |
| rrg/biomed-roberta-baseline-indiana | indiana | r2gen |
| rrg/baseline-padchest | padchest | - |
| Radiology report summarization | | |
| rrs/biomed-roberta-baseline-mimic | mimic-cxr | rouge |
| rrs/biomed-roberta-baseline-indiana | indiana | r2gen |
| Self-supervision | | |
| selfsup/convirt-mimic | mimic-cxr | r2gen |
| selfsup/convirt-mimic-balanced | mimic-cxr | r2gen |
| selfsup/convirt-padchest-16 | padchest | gloria |
| selfsup/convirt-padchest-32 | padchest | gloria |
| selfsup/convirt-indiana-16 | indiana | r2gen |
| selfsup/convirt-indiana-32 | indiana | r2gen |
| selfsup/convirt-indiana-64 | indiana | r2gen |
| selfsup/gloria-chexpert | CheXpert | gloria |
| selfsup/gloria-mimic-48 | mimic-cxr | r2gen |
| selfsup/simclr-mimic-16 | mimic-cxr | |
| selfsup/simclr-mimic-32 | mimic-cxr | |
| selfsup/simclr-mimic-64 | mimic-cxr | |
| selfsup/vae-mimic | mimic-cxr | |
| selfsup/vae-indiana | indiana | |
| selfsup/vae-padchest | padchest | |
| Medical VQA | | |
| mvqa/mvqa-imageclef | ImageCLEF-VQAMed | |
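
Model names in the zoo follow a `task/model` pattern, so the prefix can be used to filter the list by task. A minimal sketch, using a few names copied from the zoo listing above; `group_by_task` is a hypothetical helper, not part of ViLMedic.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_task(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group zoo names by the prefix before the first '/'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        task, _, model = name.partition("/")
        if not model:
            raise ValueError(f"expected 'task/model', got {name!r}")
        groups[task].append(name)
    return dict(groups)


zoo = [
    "rrg/biomed-roberta-baseline-mimic",
    "rrs/biomed-roberta-baseline-mimic",
    "selfsup/convirt-mimic",
    "selfsup/gloria-chexpert",
    "mvqa/mvqa-imageclef",
]
groups = group_by_task(zoo)
print(groups["selfsup"])
```

Any name returned this way can be passed to `AutoModel.from_pretrained` as in the example above.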

Implemented solutions

ViLMedic replicates solutions from the multimodal medical literature.

Medical Visual Question Answering
  • SYSU-HCP at VQA-Med 2021

Radiology report generation
  • Generating Radiology Reports via Memory-driven Transformer
  • Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports
  • Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation

Radiology report summarization
  • Multimodal Radiology Report Summarization

Multimodal self-supervised learning
  • Contrastive Learning of Medical Visual Representations from Paired Images and Text
  • DALLE: Zero-Shot Text-to-Image Generation
  • CLIP: Learning Transferable Visual Models From Natural Language Supervision
  • SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
  • GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition

Citation

If you use ViLMedic in your work or use any models published in ViLMedic, please cite:

@inproceedings{delbrouck-etal-2022-vilmedic,
    title = "{V}i{LM}edic: a framework for research at the intersection of vision and language in medical {AI}",
    author = "Delbrouck, Jean-benoit  and
      Saab, Khaled  and
      Varma, Maya  and
      Eyuboglu, Sabri  and
      Chambon, Pierre  and
      Dunnmon, Jared  and
      Zambrano, Juan  and
      Chaudhari, Akshay  and
      Langlotz, Curtis",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-demo.3",
    pages = "23--34",
}

License

ViLMedic is MIT-licensed. The license applies to the pre-trained models as well.
