v1.6.1
Benchmark Datasets
- Added TIR-Bench benchmark
Feature Enhancements
- Tokenize Prompt: Added a tokenize-prompt switch for flexible control over prompt tokenization
- Percentile Metrics: Added support for P50 and P90 percentile statistics
- Multi-turn Performance: Added multi-turn conversation performance testing (multi turn perf)
- Custom Multi-turn Performance: Added custom multi-turn performance testing (custom multi_turn perf)
- Perf in Evaluation: Integrated performance testing in evaluation workflow (perf in eval)
- Speculative Metrics: Added speculative decoding performance metrics
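As a rough illustration of what P50 and P90 statistics mean for latency measurements, the sketch below computes percentiles with linear interpolation between ranks. It is not the project's implementation: the `percentile` helper and the sample latencies are hypothetical, and the interpolation method is an assumption (other tools may use nearest-rank or a different convention).

```python
import math


def percentile(values, p):
    """Return the p-th percentile (0-100) of values.

    Hypothetical helper: uses linear interpolation between the two
    closest ranks, which is one common convention among several.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Fractional index into the sorted list.
    rank = (len(ordered) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    if lower == upper:
        return ordered[lower]
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Made-up per-request latencies in milliseconds, for illustration only.
latencies_ms = [120.0, 95.0, 310.0, 150.0, 101.0, 98.0, 205.0, 130.0, 99.0, 110.0]
p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
print(f"P50={p50:.1f} ms, P90={p90:.1f} ms")
```

P50 is the median request, while P90 shows how slow the slowest tenth of requests begins to be, so the two together give a quick view of typical versus tail latency.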
Bug Fixes
- Fixed an issue with loading the default local dataset
- Fixed the tokenize-prompt length semantics
- Fixed a tokenize template issue
- Updated plot CDN address to avoid access issues after network acceleration
What's Changed
- [Feature] Add tokenize prompt switch by @Yunnglin in #1289
- Feat/support p50 p90 percentiles by @yonlunwu in #1283
- update log filehandler by @Yunnglin in #1292
- [Fix] load default local dataset by @Yunnglin in #1293
- [Benchmark] Add TIR-Bench by @Yunnglin in #1295
- [Feature]Add multi turn perf by @Yunnglin in #1298
- Ensure output directory is created automatically when dumping JSONL files by @ShaohonChen in #1296
- Fix tokenize-prompt length semantics by @zongjing1998 in #1301
- [Feature]update time zone by @Yunnglin in #1303
- [Feature] Add speculative perf metrics by @Yunnglin in #1306
- feat: update the plot CDN address to avoid access issues after network acceleration by @ZhengYingqian in #1308
- [Feature] Add custom multi_turn perf by @Yunnglin in #1309
- [Feature] Add perf in eval by @Yunnglin in #1310
- [Fix] tokenize template issue by @Yunnglin in #1311
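One change in the list above makes JSONL dumps create their output directory automatically. A minimal sketch of that general pattern follows; `dump_jsonl` is a hypothetical name used for illustration, not the project's actual function, and the paths and records are made up.

```python
import json
import os
import tempfile


def dump_jsonl(records, path):
    """Write records to path as JSON Lines, creating parent directories first.

    Hypothetical helper illustrating the pattern, not the project's code.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Usage: the nested "outputs/run1" directory does not exist beforehand.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "outputs", "run1", "predictions.jsonl")
    dump_jsonl([{"id": 1, "answer": "A"}, {"id": 2, "answer": "B"}], target)
    print(os.path.isfile(target))
```

Creating the directory inside the writer means callers do not need to prepare the output path themselves before dumping results.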
New Contributors
- @yonlunwu made their first contribution in #1283
- @zongjing1998 made their first contribution in #1301
Full Changelog: v1.6.0...v1.6.1