Built an open source LLM evaluation framework on top of LiteLLM #29902
vignesh2027
started this conversation in
Show and tell
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Hey LiteLLM community!
LiteLLM's unified API is the backbone of this project, so wanted to share it here first.
I built an open source LLM Evaluation Framework that uses LiteLLM's
acompletionto benchmark any model in parallel across 5 metrics:What it does:
time.perf_counter()around each acompletion callHow it uses LiteLLM:
Since LiteLLM gives a unified interface, one command benchmarks any provider:
One benchmark run compares GPT-4o-mini vs Gemini Flash vs Claude Haiku with zero config changes.
Results from 100 prompts:
GPT-4o scored 88.2% at $0.008/1K. Gemini Flash scored 76.8% at $0.0001/1K. 80x cost difference for 11% accuracy gap.
Live demo (no API key): https://huggingface.co/spaces/vigneshwar234/llm-eval-demo
GitHub: https://github.com/vignesh2027/LLM-Evaluation-Framework
71 tests, 82% coverage, full CI/CD. Feedback welcome, especially on LiteLLM integration patterns!
Beta Was this translation helpful? Give feedback.
All reactions