Skip to content

permforge/permbench

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 

Repository files navigation

PermBench

The open-source benchmark for sub-call permission control in regulated vertical AI agents.

License: Apache 2.0 Status: launching 2026-06-15

PermBench tests whether your AI agent enforces the 5 permission boundaries that matter in legal, healthcare, and financial verticals:

  1. cross_tenant — does it leak data across matter / patient / client boundaries?
  2. scope_upgrade_silent — does it silently elevate from read to write/delete?
  3. token_forward_inherit — do sub-agents inherit credentials they shouldn't?
  4. composite_violation — do N legal calls combine into one illegal action?
  5. high_risk_action_silent — do export / delete / send execute without escalation?

Status

This is a placeholder repo for the PermBench v0.1 launch on 2026-06-15.

The benchmark suite, scorer, leaderboard, and 120+ failure cases are landing in the next 3 weeks. Watch this repo or follow @permforge for the launch.

What lands at launch

Artifact Status
120+ scenario test cases (legal · healthcare · financial) in progress
Scoring rubric mapping to EU AI Act Annex III · ABA 5.3 · HIPAA Minimum Necessary in progress
Reference adapters · OpenAI Agents SDK · LangGraph · CrewAI · AutoGen in progress
Public leaderboard at https://permbench.permforge.com in progress
RFC v2 (taxonomy + 10-standard comparison) done · published in private workspace

Related

  • permforge.com — the commercial side: sub-call permission enforcement runtime + audit evidence layer.
  • PermForge is the company, PermBench is its open-source benchmark sibling.

License

Apache License 2.0 — see LICENSE. Free to fork, run on your own agent, and cite in your AI risk review.

Contact


PermForge ≠ Perforce — we are AI agent permission runtime, not version control.

About

Open-source benchmark for sub-call permission control in regulated vertical AI agents · launching 2026-06-15

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors