itsderek23/awesome-eval-driven-development

Awesome Eval Driven Development (EDD)

Eval-Driven-Development (EDD) is a methodology for guiding the development of LLM-backed apps with a set of task-specific evals, where each eval pairs a prompt and its context with an expected output used as a reference.*

These evals guide prompt engineering, model selection, fine-tuning, and so on. We can then run these evals to quickly measure improvements or regressions as the app changes.

It's Test Driven Development (TDD) for LLM-backed apps.
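As a rough illustration of the loop described above, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `EvalCase` type, the `score` and `pass_rate` helpers, and the two stand-in apps are hypothetical and do not come from any of the tools listed below. A real app would call an LLM, and a real eval would usually need a fuzzier metric than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    """One task-specific eval: a prompt, its context, and a reference output."""
    prompt: str
    context: str
    expected: str


# Hypothetical shape of an LLM-backed app: (prompt, context) -> answer.
App = Callable[[str, str], str]


def score(output: str, expected: str) -> bool:
    """Pass if the reference appears in the output, ignoring case."""
    return expected.strip().lower() in output.strip().lower()


def pass_rate(app: App, cases: List[EvalCase]) -> float:
    """Run every eval case through the app and return the fraction that pass."""
    if not cases:
        raise ValueError("at least one eval case is required")
    passed = sum(score(app(c.prompt, c.context), c.expected) for c in cases)
    return passed / len(cases)


CASES = [
    EvalCase("What is the capital of France?",
             "France's capital is Paris.", "Paris"),
    EvalCase("Who wrote Hamlet?",
             "Hamlet is a play by William Shakespeare.", "Shakespeare"),
]


def baseline_app(prompt: str, context: str) -> str:
    # Stand-in for a first version of the app that ignores its context.
    return "I don't know."


def candidate_app(prompt: str, context: str) -> str:
    # Stand-in for a changed version that answers from the context.
    return context


baseline = pass_rate(baseline_app, CASES)
candidate = pass_rate(candidate_app, CASES)
print(f"baseline={baseline:.2f} candidate={candidate:.2f}")
# prints "baseline=0.00 candidate=1.00"
```

Re-running the same cases after each prompt, model, or fine-tuning change gives a single number to compare, which is what makes an improvement or regression visible.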

Open-source LLM-backed app evaluation products

| Name | Description |
| --- | --- |
| Auto Evaluator | Evaluation tool for LLM QA chains |
| DeepEval | Evaluation and Unit Testing for LLMs |
| Evals | A framework for evaluating LLMs and LLM systems |
| Phoenix | Evaluate, troubleshoot, and fine tune your LLM in a notebook |
| Ragas | Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines |
| Uptrain | Your open-source LLM evaluation toolkit |

Paid LLM-backed app evaluation products

| Name | Distribution | Maturity | Self-service signup |
| --- | --- | --- | --- |
| Freeplay | SaaS | Private Beta | No |
| Patronus AI | SaaS | Released | No |

References

\* Definition adapted from "Patterns for Building LLM-based Systems & Products" by Eugene Yan.

About

A curated list of resources, projects, and products to help implement Eval-Driven-Development (EDD) for LLM-backed apps.
