LLM

Yuwei (Evelyn) Zhang edited this page Jun 29, 2024 · 7 revisions

PneumoLLM

Use a pre-trained CLIP model for both the text and image encoders (CLAP in the audio case)
Add an adapter, a trainable prompt, and a classifier: three trainable parts
Masked attention for the prompt: unidirectional attention
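
The unidirectional attention noted above can be sketched as a boolean mask. This is only one possible reading of the note, not the paper's implementation: it assumes the sequence is laid out as source tokens followed by prompt tokens, and that prompt tokens may read from source tokens while source tokens never read from the prompt. The function name and the token layout are hypothetical.

```python
from typing import List


def build_unidirectional_mask(n_source: int, n_prompt: int) -> List[List[bool]]:
    """Return mask[i][j] == True when query token i may attend to key token j.

    Assumed (hypothetical) token order: [source tokens..., prompt tokens...].
    """
    total = n_source + n_prompt
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            query_is_prompt = i >= n_source
            key_is_prompt = j >= n_source
            # Source tokens never read from prompt tokens;
            # every other query/key pairing is allowed.
            mask[i][j] = query_is_prompt or not key_is_prompt
    return mask


if __name__ == "__main__":
    for row in build_unidirectional_mask(n_source=3, n_prompt=2):
        print("".join("1" if allowed else "0" for allowed in row))
```

With 3 source tokens and 2 prompt tokens, the source rows come out as 11100 and the prompt rows as 11111, so information flows one way, from source to prompt. In a real model the mask would be converted to whatever form the attention layer expects.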

A foundation model for generalizable disease detection from retinal images (Nature, 2023)

Curate open datasets to train two models, one per modality
8 datasets and three groups of evaluation tasks with fine-tuning, including external evaluation (fine-tune on A, test on B)
Self-supervised pre-training: MAE and 4 contrastive methods
Interpretability through a tool based on saliency maps
Some analysis of distribution shift
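
As a reminder of what MAE-style pre-training does to its input, here is a minimal sketch of the random patch masking step: most patches are hidden and the model is trained to reconstruct them from the visible ones. The 0.75 mask ratio is the default from the original MAE paper and the 196-patch grid (14 x 14) is a common ViT setting; both are assumptions here, not values taken from these notes. The function name is hypothetical.

```python
import random
from typing import List, Tuple


def random_patch_mask(
    num_patches: int, mask_ratio: float = 0.75, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split patch indices into (visible, masked) for a single image."""
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)  # seeded so the split is reproducible
    num_masked = int(num_patches * mask_ratio)
    # Sample the masked patch indices without replacement.
    masked = sorted(rng.sample(range(num_patches), num_masked))
    masked_set = set(masked)
    # Everything not masked stays visible to the encoder.
    visible = [i for i in range(num_patches) if i not in masked_set]
    return visible, masked


if __name__ == "__main__":
    visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75)
    print(len(visible), "visible patches,", len(masked), "masked patches")
```

The encoder would see only the visible patches, and the reconstruction loss would be computed on the masked ones. Contrastive methods instead compare augmented views of whole images, which is the main difference between the two pre-training families listed above.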