ekon15/bt-langgraph

bt-langgraph

Analyzes a Python LangGraph agent file and generates a ready-to-run Braintrust Remote eval — no execution required. It discovers node names, system prompts, and model configuration by reading source code directly, then wires them as tunable parameters so you can iterate on any node's prompt or model from the Braintrust Playground and observe end-to-end agent behavior without experimenting in production.

Requirements

  • Python 3.10+
  • Node.js 18+ (for the generated TypeScript eval)
  • A Braintrust account
  • BRAINTRUST_API_KEY and OPENAI_API_KEY

Setup

pip install -r requirements.txt
npm install

cp .env.example .env
# fill in your API keys
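The two keys listed under Requirements go in `.env`. A placeholder sketch (values are yours to fill in):

```shell
# .env — replace the placeholders with your real keys
BRAINTRUST_API_KEY=your-braintrust-api-key
OPENAI_API_KEY=your-openai-api-key
```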

Usage

Try it against the included example agent:

python run.py --agent langgraph_agent.py

Or point it at your own agent:

python run.py --agent path/to/my_agent.py

This generates eval.py. Start the Remote eval dev server:

braintrust eval eval.py --dev

Then in Braintrust: Configuration → Remote evals → add http://localhost:8300, create a Playground, add a Task → Remote eval, and start iterating.

Options

--agent        Path to your Python LangGraph agent file (default: langgraph_agent.py)
--factory      Factory function name (default: build_graph)
--params       Per-node params to expose. Default: prompt,model
               Available: max_tokens, model, prompt, temperature
               Use "all" to expose everything
--extra-param  App-specific parameter (repeatable). Format: key:type:default:description
               Types: string, number, boolean
--lang         Output language: python (default) or typescript
--project      Braintrust project name (default: "LangGraph Agent")
--output-key   State field to return as output

Examples

# Default — generates eval.py with prompt and model parameters
python run.py --agent my_agent.py

# Opt in to temperature tuning as well
python run.py --agent my_agent.py --params prompt,model,temperature

# Expose everything
python run.py --agent my_agent.py --params all

# Generate a TypeScript eval instead (self-contained, no Python infra needed)
python run.py --agent my_agent.py --lang typescript

Adding app-specific parameters

For parameters that can't be auto-discovered (RAG search depth, feature flags, tool toggles), pass them at the command line with --extra-param. These are injected into the generated eval and passed through to build_graph at runtime.

Format: key:type:default:description — types are string, number, boolean.

python run.py --agent my_agent.py \
  --extra-param "ragSearchDepth:number:5:Number of chunks to retrieve" \
  --extra-param "webSearchEnabled:boolean:false:Enable the web search tool"
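On the agent side, the factory receives these values at runtime, so its parameter names should match the CLI keys. A hypothetical sketch (the body and the exact pass-through mechanism are illustrative assumptions, not the tool's generated code — check the generated eval for the real wiring):

```python
# Hypothetical factory: bt-langgraph forwards --extra-param values to
# build_graph, so the parameter names must match the keys used on the CLI.
def build_graph(ragSearchDepth: int = 5, webSearchEnabled: bool = False):
    tools = ["rag_search"]  # illustrative tool list
    if webSearchEnabled:
        tools.append("web_search")
    # ...construct and return the compiled LangGraph here...
    return {"tools": tools, "rag_search_depth": ragSearchDepth}

print(build_graph(webSearchEnabled=True))
```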

Tool descriptions are auto-discovered from @tool docstrings and Tool()/StructuredTool() constructors. Each tool gets a <toolName>Description parameter. You can override a tool's auto-discovered description with an explicit --extra-param:

python run.py --agent my_agent.py \
  --extra-param "searchDescription:string:Search recent news only.:Web search tool"
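Because discovery is purely static, extracting those docstrings can be sketched with the stdlib `ast` module alone. A minimal illustration of the idea (not the tool's actual implementation):

```python
import ast

# Example agent source containing a discoverable @tool docstring.
AGENT_SRC = '''
@tool
def search(query: str) -> str:
    """Search the web and return the top results as text."""
    ...
'''

def tool_descriptions(source: str) -> dict:
    """Collect {<toolName>Description: docstring} for @tool-decorated functions."""
    out = {}
    for node in ast.walk(ast.parse(source)):
        is_tool = isinstance(node, ast.FunctionDef) and any(
            isinstance(d, ast.Name) and d.id == "tool" for d in node.decorator_list
        )
        if is_tool:
            out[f"{node.name}Description"] = ast.get_docstring(node)
    return out

print(tool_descriptions(AGENT_SRC))
# {'searchDescription': 'Search the web and return the top results as text.'}
```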

How discovery works

The tool uses Python's ast module to parse your agent file without executing it. It:

  1. Finds graph.add_node("name", fn) calls to map node names to functions
  2. Finds SystemMessage(content=...) inside each function and resolves the value — handles string literals, module-level constants, and {**DEFAULT_PROMPTS, ...} dict patterns
  3. Finds ChatOpenAI(model=...) (and other supported LLM classes) for model and temperature
  4. Finds tool descriptions from @tool docstrings and Tool()/StructuredTool() constructors — each becomes a {toolName}Description parameter, always included automatically

Limitations: prompts imported from other modules, loaded from files/env vars, or built dynamically (f-strings, string concatenation) won't be auto-discovered. Pass them via --extra-param or edit the generated eval file directly.
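Step 1 above can be sketched in a few lines of stdlib `ast` code — a simplified illustration of the approach, not the tool's actual implementation:

```python
import ast

# Example agent source; only parsed, never executed.
SOURCE = '''
PLANNER_PROMPT = "You are a planning assistant."

def planner(state):
    return {"messages": [SystemMessage(content=PLANNER_PROMPT)]}

graph.add_node("planner", planner)
graph.add_node("writer", writer)
'''

def discover_nodes(source: str) -> dict:
    """Map node names to function names registered via graph.add_node("name", fn)."""
    nodes = {}
    for call in ast.walk(ast.parse(source)):
        if (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and call.func.attr == "add_node"
            and len(call.args) >= 2
            and isinstance(call.args[0], ast.Constant)
            and isinstance(call.args[1], ast.Name)
        ):
            nodes[call.args[0].value] = call.args[1].id
    return nodes

print(discover_nodes(SOURCE))
# {'planner': 'planner', 'writer': 'writer'}
```

The same walk-and-match pattern extends to steps 2–4: match `SystemMessage(...)` and LLM-class constructor calls, then resolve constant arguments within the module.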

Supported LLM classes

ChatOpenAI, AzureChatOpenAI, ChatAnthropic, ChatBedrock, ChatGoogleGenerativeAI, ChatCohere
