v1.4.2
Benchmark Datasets
- Code Evaluation: Added the HumanEvalPlus and MBPPPlus benchmarks for evaluating coding ability
Feature Enhancements
- Performance Testing: Added performance testing support for Embedding and Rerank models
Documentation
- Added general_fc best practice documentation
- Updated collection documentation, which now covers building a custom evaluation index
- Updated performance testing documentation with Embedding and Rerank model evaluation instructions
Bug Fixes
- Fixed performance testing log output issues
- Fixed SimpleVQA image loading issues
What's Changed
- [Doc] Add general_fc best practice by @Yunnglin in #1130
- [Doc] update index collection doc by @Yunnglin in #1132
- [Fix] update perf log by @Yunnglin in #1135
- [Draft] feat(perf): add support for embedding and rerank models by @gbdjxgp in #1140
- add humanevalplus and mbppplus benchmarks by @mushenL in #1144
- [Doc]update perf embedding and rerank by @Yunnglin in #1147
- [fix] simplevqa image load by @Yunnglin in #1153
Full Changelog: v1.4.1...v1.4.2