v1.0.1

Yunnglin released this 05 Sep 09:11

bebf960

Update

Evaluation of vision-language multimodal large models is now supported, with benchmarks such as MathVista and MMMU. See the project documentation for the full list of supported datasets.
Image editing evaluation is now supported through the GEdit-Bench benchmark. See the project documentation for usage instructions.
torch is no longer a core dependency; it has moved to the optional rag and aigc dependency groups.
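
Because torch is no longer installed with the core package, code that relies on the rag or aigc features may want to check for it before importing. The following is a minimal sketch using only the Python standard library. The helper name `missing_optional_packages` is hypothetical, and the package name `evalscope` in the suggested install command is an assumption, since these notes do not state it.

```python
import importlib.util


# Hypothetical helper for illustration; it is not part of the released package.
def missing_optional_packages(packages=("torch",)):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_packages()
    if missing:
        # Assumed install commands: the extras names rag and aigc come from the
        # notes above, but the distribution name `evalscope` is an assumption.
        print(f"Missing {missing}; try: pip install 'evalscope[rag]' or pip install 'evalscope[aigc]'")
    else:
        print("torch is available, so the rag and aigc features can import it.")
```

The check uses `importlib.util.find_spec` rather than a bare `import torch`, so it does not pay the cost of loading the library just to confirm it is present.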

What's Changed

[DOC] Update 1.0 custom doc by @Yunnglin in #793
[Fix] Fix reasoning content by @Yunnglin in #797
[Fix] Change old collection to new version by @Yunnglin in #798
Reduce dataset loading time by @mmdbhs in #805
[Fix] fix reranker pad token and embedding max tokens by @Yunnglin in #806
[Feature] Add image edit task by @Yunnglin in #804
[Benchmark] Add mmmu by @Yunnglin in #812
add math_vista by @mushenL in #813
[Fix] tau-bench zero scores by @Yunnglin in #814
[Fix] collection eval by @Yunnglin in #816
[Feature] add vlm adapter by @Yunnglin in #817
[Feature] remove torch from framework by @Yunnglin in #818
add MMMU_Pro by @mushenL in #819

New Contributors

@mmdbhs made their first contribution in #805

Full Changelog: v1.0.0...v1.0.1

Contributors

Yunnglin, mmdbhs, and mushenL
