Yuwei (Evelyn) Zhang edited this page Jun 29, 2024 · 7 revisions

PneumoLLM

  • Use a pre-trained CLIP model as both the text and image encoder (CLAP in the audio case)
  • Add three trainable parts: an adapter, a trainable prompt, and a classifier
  • Mask attention for the prompt (unidirectional attention)
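
The note does not say which direction the unidirectional attention runs, so the following is only a minimal sketch of one possible reading: prompt tokens may attend to every token, while image tokens attend only to other image tokens and never see the prompt. The function name `build_prompt_attention_mask`, the token layout (image tokens first, prompt tokens last), and the direction itself are assumptions for illustration, not details taken from the paper.

```python
def build_prompt_attention_mask(num_image_tokens, num_prompt_tokens):
    """Return mask[i][j] == True when query token i may attend to key token j.

    Assumed layout (hypothetical): image tokens first, then prompt tokens.
    Assumed direction (hypothetical): prompt -> image is allowed,
    image -> prompt is blocked.
    """
    total = num_image_tokens + num_prompt_tokens
    mask = []
    for query in range(total):
        query_is_prompt = query >= num_image_tokens
        row = []
        for key in range(total):
            key_is_prompt = key >= num_image_tokens
            # Prompt queries see everything; image queries see only image keys.
            row.append(query_is_prompt or not key_is_prompt)
        mask.append(row)
    return mask


mask = build_prompt_attention_mask(num_image_tokens=3, num_prompt_tokens=2)
for row in mask:
    # One line per query token: 1 = attention allowed, 0 = masked out.
    print("".join("1" if allowed else "0" for allowed in row))
```

Under this reading the frozen encoder's image features are unaffected by the prompt, and only the prompt rows aggregate information from the image. If the paper masks in the opposite direction, the condition inside the inner loop would be flipped.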

A foundation model for generalizable disease detection from retinal images (Nature, 2023)

  • Curate open datasets to train 2 models, one per modality
  • 8 datasets and three groups of fine-tuning evaluation tasks, including external evaluation (fine-tune on A, test on B)
  • MAE and 4 contrastive methods
  • Interpretability through a tool using saliency maps
  • some analysis on distribution shift
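
The "fine-tune on A, test on B" protocol can be sketched as an enumeration of train/test dataset pairs, where a pair counts as external whenever the test dataset differs from the one used for fine-tuning. This is a rough illustration only: `evaluation_plan` and the placeholder names "A" and "B" are hypothetical, and the actual pairing used in the paper may be more selective than a full cross product.

```python
from itertools import product


def evaluation_plan(dataset_names):
    """List (train, test, kind) triples for every ordered dataset pair.

    kind is "internal" when fine-tuning and testing use the same dataset,
    and "external" when the model is tested on a different dataset.
    """
    plan = []
    for train_name, test_name in product(dataset_names, repeat=2):
        kind = "internal" if train_name == test_name else "external"
        plan.append((train_name, test_name, kind))
    return plan


# Placeholder dataset names, not the datasets from the paper.
for train_name, test_name, kind in evaluation_plan(["A", "B"]):
    print(f"fine-tune on {train_name}, test on {test_name}: {kind}")
```

Comparing the internal and external results for the same fine-tuned model gives a rough view of how much performance drops under distribution shift.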
