v0.5.0
Release with a pip package
What's Changed
New UQ methods added:
- CoCoA by @rvashurin in #295
- Focus by @alfekka in #304
- KLE by @SpeedOfMagic in #278
- EigenScore by @ArtemVazh in #305
- LUQ by @ArtemVazh in #302
- CSL and RAUQ by @ArtemVazh in #347
- SelfCertainty by @cant-access-rediska0123 in #350
Other changes:
- Support for Python 3.11 by @SpeedOfMagic in #259
- vLLM inference support by @ArtemVazh in #308
- Visual LLM support by @alfekka in #341
- Calibration metrics (ECE and PCC) added to UEMetrics by @rvashurin in #386
- Support for greybox models from OpenAI API by @yobeen in #303
- Any statistic can be added to the saved manager during a benchmark run using the save_stats parameter, added by @silvimica in #371
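
For readers unfamiliar with the calibration metrics listed above, here is a minimal sketch of ECE, assuming it refers to the usual expected calibration error with equal-width bins. This is not the UEMetrics implementation from #386; the function name, its arguments, and the default of 10 bins are hypothetical and chosen only for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE (illustrative, not the library's code).

    confidences: floats in [0, 1], one per prediction.
    correct: 0/1 flags, 1 if the corresponding prediction was right.
    Returns the bin-size-weighted mean of |accuracy - mean confidence|.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Assign each prediction to a bin; confidence 1.0 goes to the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, is_correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical usage: confidence scores and correctness flags for four answers.
score = expected_calibration_error([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0])
```

A lower value means the confidence scores track the observed accuracy more closely; a value of 0 means every non-empty bin is perfectly calibrated.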
Internal changes:
- Stat Calculators are now built by factories and share resources, added by @IINemo in #249
- Better logging by @IINemo in #266
Full Changelog: https://github.com/IINemo/lm-polygraph/compare/v0.4.0..v0.5.0