This repo contains cookbooks demonstrating how to evaluate AI agents with the judgeval package from Judgment Labs.
Before running these examples, make sure you have:

- Installed the latest version of the judgeval package:

  ```bash
  pip install judgeval
  ```

- Set up your Judgeval API key and organization ID as environment variables:

  ```bash
  export JUDGMENT_API_KEY="your_api_key"
  export JUDGMENT_ORG_ID="your_org_id"
  ```
To get your API key and Organization ID, make an account on the Judgment Labs platform.
This repository provides a collection of cookbooks to demonstrate various evaluation techniques and agent implementations using Judgeval.
These cookbooks feature agents that interact directly with LLM APIs (e.g., OpenAI, Anthropic), often implementing custom logic for tool use, function calling, and RAG.
- `multi-agent/`: A flexible multi-agent framework for orchestrating and evaluating the collaboration of multiple agents and tools on complex tasks such as financial analysis. The agents' outputs are evaluated for factual adherence to their retrieval context (see the sketch below).
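To give a flavor of the evaluation pattern these cookbooks use, here is a minimal sketch of scoring an agent's answer for faithfulness to its retrieval context. It follows the `JudgmentClient` / `Example` / `FaithfulnessScorer` pattern from judgeval's docs; exact signatures may vary between versions, and the task data below is hypothetical.

```python
# Minimal sketch: checking an answer's factual adherence to its retrieval
# context with judgeval. Names follow the judgeval docs at the time of
# writing; check your installed version. The example data is hypothetical.
from judgeval import JudgmentClient
from judgeval.data import Example
from judgeval.scorers import FaithfulnessScorer

# Reads JUDGMENT_API_KEY / JUDGMENT_ORG_ID from the environment.
client = JudgmentClient()

example = Example(
    input="What was ACME Corp's revenue in Q2?",            # hypothetical task
    actual_output="ACME Corp reported $12.4M in Q2.",        # the agent's answer
    retrieval_context=["ACME Corp Q2 earnings: revenue of $12.4M, up 8% YoY."],
)

results = client.run_evaluation(
    examples=[example],
    scorers=[FaithfulnessScorer(threshold=0.8)],
    model="gpt-4o",  # the judge model
)
print(results)
```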
These cookbooks showcase agents built using the LangGraph framework, demonstrating complex state management and chained operations.
- `langgraph_music_recommender/`: An agent that generates song recommendations based on a user's music taste (see the tracing sketch below).
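Because LangGraph pipelines chain many steps, judgeval's tracing is useful for capturing each step as a span. Below is a hedged sketch of the decorator-based tracing pattern from judgeval's docs; the project name, functions, and prompt are hypothetical placeholders, not the cookbook's actual code.

```python
# Sketch of judgeval's tracing pattern (per its docs; all names here are
# hypothetical). Each @observe-decorated function becomes a span in the
# trace, which helps inspect multi-step graphs like the music recommender.
from judgeval.tracer import Tracer, wrap
from openai import OpenAI

judgment = Tracer(project_name="music_recommender")  # hypothetical project name
client = wrap(OpenAI())  # wrapped so LLM calls are recorded in the trace

@judgment.observe(span_type="tool")
def fetch_user_taste(user_id: str) -> str:
    # Hypothetical tool: look up the user's listening history.
    return "melancholic indie folk"

@judgment.observe(span_type="function")
def recommend(user_id: str) -> str:
    taste = fetch_user_taste(user_id)
    res = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Recommend two songs for a fan of {taste}."}],
    )
    return res.choices[0].message.content

print(recommend("user-123"))
```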
These cookbooks focus on implementing and using custom scorers:

- `custom_scorers/`: Examples of writing custom scorers that tailor evaluations to needs the built-in scorers don't cover (see the sketch below).
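As an illustration of the pattern those examples follow, here is a hedged sketch of a custom scorer. The `JudgevalScorer` base class and the `score_example` / `a_score_example` / `_success_check` hooks follow judgeval's custom-scorer docs, but the exact base-class and attribute names may differ in your installed version; the keyword-matching logic is purely illustrative.

```python
# Hedged sketch of a custom scorer, following the subclassing pattern in
# judgeval's docs (base-class and hook names may differ in your version).
from judgeval.scorers import JudgevalScorer

class KeywordScorer(JudgevalScorer):
    """Scores 1.0 if the output mentions a required keyword, else 0.0."""

    def __init__(self, keyword: str, threshold: float = 0.5):
        super().__init__(score_type="Keyword Match", threshold=threshold)
        self.keyword = keyword

    def score_example(self, example, *args, **kwargs):
        # Simple deterministic check against the agent's output.
        self.score = 1.0 if self.keyword.lower() in example.actual_output.lower() else 0.0
        self.success = self.score >= self.threshold
        return self.score

    async def a_score_example(self, example, *args, **kwargs):
        # No async work needed here, so delegate to the sync path.
        return self.score_example(example)

    def _success_check(self) -> bool:
        return self.success

    @property
    def __name__(self):
        return "Keyword Match"
```

A custom scorer like this can then be passed to `client.run_evaluation(...)` alongside the built-in scorers shown earlier.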