Composable reward signals from agent trajectories using programmatic verification. agentproof runs real tools against agent outputs and produces deterministic scores for RL training (GRPO, DPO, SFT filtering) or agent quality analysis.
pip install agentproof

from agentproof import RewardComposer, verifier
from agentproof.verifiers import CodeExecution, FormatCheck, StepEfficiency
# Define a custom verifier -- just decorate a function
@verifier(deterministic=True)
def vuln_eliminated(target) -> float:
    """Check if SAST findings decreased after the agent's patch."""
    # run_scanner is not defined in this snippet; substitute your own SAST scanner wrapper
    before = run_scanner(target.context["original_code"])
    after = run_scanner(target.outcome["patched_code"])
    return max(0, len(before.findings) - len(after.findings)) / max(len(before.findings), 1)

# Compose verifiers with weights and gating
composer = RewardComposer([
    vuln_eliminated.with_weight(0.5, required=True),  # required: total=0 if this fails
    CodeExecution(cmd="pytest").with_weight(0.3, required=True),
    FormatCheck(schema="output_schema.json").with_weight(0.1),
    StepEfficiency(max_steps=15).with_weight(0.1),
])
# Score a trajectory
result = my_agent.run(task)
scored = composer.score(trajectory=result.trajectory, context={"original_code": src})
print(scored.reward) # 0.0 if any required verifier failed, weighted sum otherwise
print(scored.breakdown)  # per-verifier scores and evidence

required=True is the anti-reward-hacking mechanism. If a required verifier returns 0, the entire composed reward is 0, no matter how well the other verifiers score. An agent cannot game easy signals while failing the ones that matter.
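
The gating rule described above can be sketched in plain Python. This is a minimal illustration of the stated behavior (0 if any required verifier returns 0, weighted sum otherwise), not agentproof's implementation; `WeightedScore` and `compose_reward` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class WeightedScore:
    """One verifier's result (hypothetical structure, for illustration only)."""
    name: str
    score: float          # verifier output, assumed to lie in [0.0, 1.0]
    weight: float         # contribution to the weighted sum
    required: bool = False


def compose_reward(results: list[WeightedScore]) -> float:
    """Return 0.0 if any required verifier scored 0, else the weighted sum."""
    if any(r.required and r.score == 0.0 for r in results):
        return 0.0
    return sum(r.score * r.weight for r in results)


# An agent that aces the easy signals but leaves the vulnerability in place.
gamed = [
    WeightedScore("vuln_eliminated", 0.0, 0.5, required=True),
    WeightedScore("tests", 1.0, 0.3, required=True),
    WeightedScore("format", 1.0, 0.1),
    WeightedScore("steps", 1.0, 0.1),
]

# An agent that fixes the issue but is sloppy on the optional signals.
honest = [
    WeightedScore("vuln_eliminated", 1.0, 0.5, required=True),
    WeightedScore("tests", 1.0, 0.3, required=True),
    WeightedScore("format", 0.0, 0.1),
    WeightedScore("steps", 0.5, 0.1),
]

print(compose_reward(gamed))
print(compose_reward(honest))
```

Without the gate, the first trajectory would still collect half of the total weight from the optional and test signals; with it, the reward collapses to zero.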
# Generate a config file
agentproof init
# Score trajectories from a JSONL file
agentproof check traces.jsonl
# JSON output for scripts
agentproof check --format json | jq .summary
# JUnit XML for CI dashboards (GitHub Actions, Jenkins)
agentproof check --format junit > results.xml
# Override threshold from CLI
agentproof check -t 0.8 traces.jsonl

from agentproof import to_sft, to_dpo
from agentproof.sources import JSONLSource
# Load historical trajectories and score them
trajectories = JSONLSource("./agent_runs.jsonl").fetch()
scored = composer.score_batch(trajectories)
# Export for different training methods
to_sft(scored, "./data", min_reward=0.7) # filter to good examples
to_dpo(scored, "./data")  # preference pairs

pip install agentproof  # core + built-in verifiers
pip install agentproof[jsonschema] # with JSON schema validation support
pip install agentproof[langsmith]  # with LangSmith source adapter

| Verifier | What it does |
|---|---|
| CodeExecution | Runs a shell command against trajectory output. Score 1.0 for exit code 0. |
| FormatCheck | Validates trajectory outcome against a JSON schema. |
| RegexMatch | Checks if trajectory outcome matches a regex pattern. |
| StepEfficiency | Penalizes trajectories with too many steps. |

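
To make two of the built-in behaviors concrete, here is a rough standalone sketch. The exit-code rule follows the description above; the step-penalty shape is an assumption, since the actual formula used by StepEfficiency is not given here. Both function names are hypothetical and not part of agentproof.

```python
import subprocess
import sys


def exit_code_score(cmd: list[str]) -> float:
    """Score 1.0 when the command exits with code 0, else 0.0."""
    completed = subprocess.run(cmd, capture_output=True)
    return 1.0 if completed.returncode == 0 else 0.0


def step_penalty_score(num_steps: int, max_steps: int) -> float:
    """Assumed penalty shape: full score within budget, then max_steps / num_steps."""
    if num_steps <= max_steps:
        return 1.0
    return max_steps / num_steps


# Use the current interpreter as a portable stand-in for a real test command.
passing = exit_code_score([sys.executable, "-c", "raise SystemExit(0)"])
failing = exit_code_score([sys.executable, "-c", "raise SystemExit(3)"])

print(passing, failing)
print(step_penalty_score(10, 15), step_penalty_score(30, 15))
```
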
The @verifier decorator is the primary extension point:

import subprocess

from agentproof import verifier

@verifier(deterministic=True)
def tests_pass(target) -> float:
    """Run pytest and return 1.0 if all tests pass."""
    result = subprocess.run(["pytest", target.context["test_path"]], capture_output=True)
    return 1.0 if result.returncode == 0 else 0.0

For verifiers needing configuration or state, use the class form:

from agentproof import Verifier, VerifyResult

class SASTDiffVerifier(Verifier):
    name = "sast_diff"
    deterministic = True

    def __init__(self, scanner_cmd: str):
        self.scanner_cmd = scanner_cmd

    def verify(self, target) -> VerifyResult:
        # Run scanner before/after comparison
        ...

Third-party verifier packs can register via entry_points:

[project.entry-points."agentproof.verifiers"]
my_verifier = "my_package:MyVerifier"

Export adapters produce JSONL files that TRL, veRL, and OpenRLHF can consume. agentproof never imports training libraries.

| Export | Format | Use case |
|---|---|---|
| to_sft | instruction/response JSONL | Filter high-scoring trajectories for supervised fine-tuning |
| to_dpo | prompt/chosen/rejected JSONL | Create preference pairs for DPO training |
| to_grpo | grouped completions JSONL | Batch GRPO training with group-level reward normalization |

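
The three export shapes in the table can be approximated with a few lines of standalone Python. This is a sketch under stated assumptions: the record fields (`prompt`, `response`, `reward`), the best-versus-worst pairing rule, and the mean/std normalization are illustrative choices, and none of the function names below belong to agentproof.

```python
import json
import statistics
from collections import defaultdict

# Hypothetical scored records; real agentproof objects may look different.
scored = [
    {"prompt": "fix bug A", "response": "patch v1", "reward": 0.9},
    {"prompt": "fix bug A", "response": "patch v2", "reward": 0.2},
    {"prompt": "fix bug B", "response": "patch v3", "reward": 0.75},
]


def sft_rows(records, min_reward):
    """Keep only records at or above the reward threshold."""
    return [
        {"instruction": r["prompt"], "response": r["response"]}
        for r in records
        if r["reward"] >= min_reward
    ]


def dpo_rows(records):
    """Pair the best and worst response per prompt when their rewards differ."""
    by_prompt = defaultdict(list)
    for r in records:
        by_prompt[r["prompt"]].append(r)
    rows = []
    for prompt, group in by_prompt.items():
        best = max(group, key=lambda r: r["reward"])
        worst = min(group, key=lambda r: r["reward"])
        if best["reward"] > worst["reward"]:
            rows.append(
                {"prompt": prompt, "chosen": best["response"], "rejected": worst["response"]}
            )
    return rows


def group_normalize(rewards, eps=1e-8):
    """One common form of group-level normalization: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def to_jsonl(rows):
    """Serialize rows as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


print(to_jsonl(sft_rows(scored, min_reward=0.7)))
print(to_jsonl(dpo_rows(scored)))
print(group_normalize([0.9, 0.2]))
```

Prompts with a single response, or with tied rewards, produce no preference pair in this sketch, because there is nothing to prefer.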
v1.3 added closed-loop validation between agentproof scoring and real training frameworks.
Fetch human and automated feedback from LangSmith and use it as a reward signal:
from agentproof import LangSmithSource, get_feedback, verifier
source = LangSmithSource(project_name="my-agent")
trajectories = source.fetch(include_feedback=True)
@verifier(deterministic=True)
def human_score(target) -> float:
    """Use human correctness feedback as the reward signal."""
    fb = get_feedback(target.context, "correctness")
    return float(fb.get("score", 0.0)) if fb is not None else 0.5

Match trajectories against a LangSmith dataset and verify against expected outputs:

trajectories = source.fetch(dataset_name="my-labeled-dataset")
from agentproof import get_ground_truth
@verifier(deterministic=True)
def exact_match(target) -> float:
    gt = get_ground_truth(target.context)
    if gt is None:
        return 0.0
    return 1.0 if str(target.trajectory.outcome) == str(gt.get("answer", "")) else 0.0

Wrap a composer as a live TRL reward function for online GRPO training:

from agentproof.export.grpo import as_trl_reward_func
reward_fn = as_trl_reward_func(composer)
# Pass to TRL: GRPOTrainer(reward_funcs=[reward_fn])

Full documentation: https://ogulcanarbc.github.io/agentproof/
- WHY.md -- Design rationale and motivation
- SPEC.md -- Technical specification for all interfaces
- ARCHITECTURE.md -- System design and layer breakdown
- See .planning/ROADMAP.md for the development roadmap.
See CONTRIBUTING.md for setup instructions and development workflow.
git clone https://github.com/ogulcanarbc/agentproof.git
cd agentproof
make install # installs dev deps via uv
make all  # runs lint + format-check + typecheck + test

License: MIT