Note
This is a work in progress; we are currently working on the alpha release. Please check back soon for updates.
Harmonized Oncology Network Enhancing Yield through Big Data Exploration and Evaluation (HONEYBEE) aims to provide a platform for developing AI models for oncology. It includes tools for medical data loading, embedding generation, Hugging Face instruction-tuning dataset creation, and advanced retrieval-augmented generation (RAG) support. The current version includes the following dataloaders:
- SVS
- DICOM
- NIFTI
- TIFF
- Images
- MINDS
- ... and more
Additionally, it includes the following Sentence Transformer-style embedding functions for foundational medical models:
- Hugging Face text embedding models (e.g., GatorTron, BioBERT)
- REMEDIS
- RadImageNet
- SeNMo
- ... and more
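To illustrate what "Sentence Transformer-style" means here, the sketch below shows masked mean pooling, a common way to reduce per-token model outputs to one fixed-length vector per input. It is not HONEYBEE's API: the function name `mean_pool` and the toy arrays are assumptions made for illustration, and a real pipeline would take the token embeddings and attention mask from a model such as those listed above.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sequence, ignoring padded positions.

    Hypothetical helper for illustration only; not part of HONEYBEE.
    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len), 1 = real token, 0 = padding
    """
    token_embeddings = np.asarray(token_embeddings, dtype=np.float64)
    # Add a trailing axis so the mask broadcasts over the embedding dimension.
    mask = np.asarray(attention_mask, dtype=np.float64)[..., None]
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for sequences that are all padding.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy batch: two sequences of three tokens with two-dimensional embeddings.
tokens = np.array([
    [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],  # last position is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
])
mask = np.array([[1, 1, 0], [1, 1, 1]])

embeddings = mean_pool(tokens, mask)
print(embeddings.shape)  # (2, 2): one vector per input sequence
```

Each input sequence yields a single vector regardless of its length, which is what allows embeddings from different records to be stored and compared uniformly.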
If you use this code, please cite the following paper:
@article{honeybee,
title={HoneyBee: A Scalable Modular Framework for Creating Multimodal Oncology Datasets with Foundational Embedding Models},
author={Aakash Tripathi and Asim Waqas and Yasin Yilmaz and Ghulam Rasool},
year={2024},
eprint={2405.07460},
archivePrefix={arXiv},
primaryClass={cs.LG}
}