v1.0.1
Update
- Evaluation of vision-language multimodal large models is now supported, for example on MathVista and MMMU. See the supported-datasets documentation for the full list.
- Image editing task evaluation is now supported through the GEdit-Bench benchmark. See the usage guide for instructions.
- `torch` has been removed from the core dependencies and moved to the `rag` and `aigc` optional dependencies.
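
Because torch is no longer installed with the core package, code paths that need it may want to check for it before importing. The following is a minimal sketch of such a guard, not the project's own implementation: the helper names `has_module` and `require_optional` are hypothetical, and `<package>` in the error message is a placeholder for the distribution name, which these notes do not state.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


def require_optional(module: str, extra: str) -> None:
    """Raise a helpful ImportError if an optional dependency is missing.

    `extra` is the name of the optional dependency group that provides
    `module` (for torch, the notes above name `rag` and `aigc`).
    """
    if not has_module(module):
        # "<package>" is a placeholder, not a real distribution name.
        raise ImportError(
            f"'{module}' is required for this feature but is not installed. "
            f"Install it with the '{extra}' extra, e.g. "
            f"pip install '<package>[{extra}]'."
        )


if __name__ == "__main__":
    # Hypothetical usage: guard a torch-dependent feature.
    try:
        require_optional("torch", "rag")
        print("torch is available")
    except ImportError as err:
        print(err)
```

Checking with `importlib.util.find_spec` avoids actually importing torch, so the check stays cheap when the dependency is present.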
What's Changed
- [DOC] Update 1.0 custom doc by @Yunnglin in #793
- [Fix] Fix reasoning content by @Yunnglin in #797
- [Fix] Change old collection to new version by @Yunnglin in #798
- Reduce dataset loading time by @mmdbhs in #805
- [Fix] fix reranker pad token and embedding max tokens by @Yunnglin in #806
- [Feature] Add image edit task by @Yunnglin in #804
- [Benchmark] Add mmmu by @Yunnglin in #812
- add math_vista by @mushenL in #813
- [Fix] tau-bench zero scores by @Yunnglin in #814
- [Fix] collection eval by @Yunnglin in #816
- [Feature] add vlm adapter by @Yunnglin in #817
- [Feature] remove torch from framework by @Yunnglin in #818
- add MMMU_Pro by @mushenL in #819
Full Changelog: v1.0.0...v1.0.1