Eval-Driven Development (EDD) is a methodology for guiding the development of LLM-backed apps via a set of task-specific evals (i.e., a prompt, context, and expected output used as a reference).*
These evals guide prompt engineering, model selection, fine-tuning, and so on. We can then re-run them to quickly measure improvements or regressions as the app changes.
It's Test-Driven Development (TDD) for LLM-backed apps.
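The core loop can be sketched in a few lines of Python. This is a minimal illustration, not any particular framework's API: `call_model` is a hypothetical stand-in for whatever model call the app makes, and exact-match scoring is the simplest possible metric.

```python
def call_model(prompt: str, context: str) -> str:
    # Hypothetical stand-in for a real LLM call; a real app would
    # send `prompt` and `context` to a model and return its output.
    return "Paris" if "France" in prompt else "unknown"

# Each eval pairs a prompt and context with a reference output.
EVALS = [
    {
        "prompt": "What is the capital of France?",
        "context": "France is a country in Europe.",
        "expected": "Paris",
    },
]

def run_evals(evals: list[dict]) -> float:
    """Return the fraction of evals whose output matches the reference."""
    passed = sum(
        call_model(e["prompt"], e["context"]).strip() == e["expected"]
        for e in evals
    )
    return passed / len(evals)
```

Running `run_evals(EVALS)` after every prompt or model change gives a single pass-rate number to track; real frameworks replace exact match with semantic or model-graded scoring.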
| Name | Description |
|---|---|
| Auto Evaluator | Evaluation tool for LLM QA chains |
| DeepEval | Evaluation and unit testing for LLMs |
| Evals | A framework for evaluating LLMs and LLM systems |
| Phoenix | Evaluate, troubleshoot, and fine-tune your LLM in a notebook |
| Ragas | Evaluation framework for your Retrieval-Augmented Generation (RAG) pipelines |
| Uptrain | Your open-source LLM evaluation toolkit |

| Name | Distribution | Maturity | Self-service signup |
|---|---|---|---|
| Freeplay | SaaS | Private Beta | No |
| Patronus AI | SaaS | Released | No |
\* Definition adapted from *Patterns for Building LLM-based Systems & Products* by Eugene Yan.