2.0.0

Latest

@katstankiewicz released this 15 May 14:05
· 270 commits to main since this release
e0e60cd

What's Changed

  • Save more information on judge failures and rename final failed folder by @gabegma in #88
  • Modify user simulator prompt and improve user behavioral fidelity metric by @gabegma in #89
  • Add elevenlabs framework by @katstankiewicz in #90
  • Improvements to the results analysis app by @fanny-riols in #77
  • Fix metric bugs by @gabegma in #91
  • Improve faithfulness and conversation progression for S2S by @gabegma in #93
  • process elevenlabs as cascade by @katstankiewicz in #94
  • Run all metrics by default by @JosephMarinier in #96
  • Enhance robustness for turn-taking metrics by @nhhoang96 in #95
  • Omit new names for pass@k by @gabegma in #98
  • individualize override parameters per backend by @raghavm243512 in #97
  • Scratch/gemini tool fix by @tara-servicenow in #99
  • Bump metric version to account for recent changes by @tara-servicenow in #101
  • Update nvidia websocket by @katstankiewicz in #103
  • Classify Ultravox as AUDIO_LLM in get_pipeline_type by @tara-servicenow in #104
  • Add assets directory in dockerfile so that background noise files are… by @tara-servicenow in #102
  • Don't allow multiple domains with existing run id by @tara-servicenow in #105
  • add gemini ALM support by @raghavm243512 in #106
  • Update OpenAI by @katstankiewicz in #110
  • Various bug fixes and improvements to metrics by @gabegma in #112
  • Update documentation for ElevenLabs User Simulator and leaderboard model configs by @fanny-riols in #111
  • Add metric versioning by @gabegma in #113
  • Update website with v2 by @tara-servicenow in #114
  • Refactor model config by @JosephMarinier in #109
  • Website fixes by @tara-servicenow in #115
  • Change how to run multiple domains by @JosephMarinier in #100
  • Website fixes: Perturbation significance asterisks; scope domain toggle to scatter plot by @lindsaydbrin in #116
  • Elevenlabs not saving by @katstankiewicz in #107
  • Pr/tara/website names by @tara-servicenow in #118
  • bump version for release by @katstankiewicz in #117

Full Changelog: 0.1.3...2.0.0