DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology

Repository of DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology, which uses DINOv2 and is adapted from the original DINOv2 GitHub repository. DinoBloom is a family of vision transformer (ViT) models trained on a large cohort of 13 diverse, publicly available datasets of single cells in peripheral blood and bone marrow. The trained models can be downloaded from Zenodo in four variants: DinoBloom-S, DinoBloom-B, DinoBloom-L, and DinoBloom-G. We show that our models outperform existing medical and non-medical vision models in (i) linear probing and k-nearest neighbor evaluations for cell-type classification on peripheral blood and bone marrow smears and (ii) weakly supervised multiple instance learning for acute myeloid leukemia subtyping by a large margin.

Data and pipeline overview

Model farm

| Model | Feature dim | #params | Weights |
|---|---|---|---|
| DinoBloom-S | 384 | 22M | Download |
| DinoBloom-B | 768 | 86M | Download |
| DinoBloom-L | 1024 | 304M | Download |
| DinoBloom-G | 1536 | 1136M | Download |

To train the model, specify the folder containing the .txt files that list the paths of your training images in dinov2/configs/train/custom.yaml. For training on a single GPU, run:

python dinov2/train/train.py --config-file dinov2/configs/train/custom.yaml

For multiple GPUs on one node, run the following, replacing #num_gpus with the number of GPUs to use:

torchrun --nproc_per_node=#num_gpus dinov2/train/train.py --config-file dinov2/configs/train/custom.yaml
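
The training config expects a folder of .txt files that hold image paths. As a minimal sketch of how such a file could be produced, the helper below walks an image folder and writes the paths it finds to a .txt file. The one-path-per-line layout, the use of absolute paths, the list of file extensions, and the function name `write_path_list` are all assumptions made for illustration; check them against the data loader in this repository before relying on them.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your datasets.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def write_path_list(image_dir, output_txt):
    """Write the paths of all images under image_dir to output_txt.

    Assumption: the training code reads one image path per line.
    Returns the number of paths written.
    """
    image_dir = Path(image_dir)
    # Recursively collect image files and sort them for a reproducible order.
    paths = sorted(
        p.resolve()
        for p in image_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    output_txt = Path(output_txt)
    output_txt.parent.mkdir(parents=True, exist_ok=True)
    output_txt.write_text("".join(f"{p}\n" for p in paths))
    return len(paths)


# Example call with hypothetical folder names:
# write_path_list("data/my_dataset", "path_lists/my_dataset.txt")
```

The folder that receives the generated file (here the hypothetical `path_lists/`) would then be the one referenced in dinov2/configs/train/custom.yaml.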

Sample Notebook

We provide a sample Google Colab notebook that shows how to extract features and visualize them with PCA.
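
For readers who want the gist of the PCA step without opening the notebook, here is a minimal sketch that projects a matrix of embeddings onto its top principal components using NumPy. It is not taken from the notebook and may differ from what the notebook does; the function name `pca_project` is hypothetical, and the random array only stands in for real DinoBloom features (its width of 384 matches the DinoBloom-S feature dimension from the table above).

```python
import numpy as np


def pca_project(features: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Project (n_samples, feature_dim) embeddings onto their top principal components."""
    # Center each feature dimension so the components capture variance, not the mean.
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Random stand-in for real embeddings: 100 cells with 384-dimensional features.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 384))
projected = pca_project(features, n_components=3)
print(projected.shape)  # (100, 3)
```

The three projected columns can then be plotted directly or, for patch-level features, rescaled and mapped to RGB channels for visualization.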

Citing DinoBloom

If you find this repository useful, please consider citing our work:

@misc{koch2024dinobloom,
      title={DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology}, 
      author={Valentin Koch and Sophia J. Wagner and Salome Kazeminia and Ece Sancar and Matthias Hehr and Julia Schnabel and Tingying Peng and Carsten Marr},
      year={2024},
      eprint={2404.05022},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
