v0.2.0a1: softer landings (alpha)

Pre-release

@yfxiao16 yfxiao16 released this 07 Jun 01:24
· 28 commits to main since this release

Sponsio 0.2.0a1: softer landings

Released: 2026-06-06 · Status: alpha · pip install --pre sponsio==0.2.0a1

Note on the version. The "softer landings" work was developed against 0.2.0a0; the alpha that actually shipped to PyPI is 0.2.0a1. The bump exists because the 0.2.0a0 upload to TestPyPI had relative image paths in its README that PyPI's renderer does not resolve, and PyPI does not allow re-uploading a version even after deletion. No runtime changes between 0.2.0a0 and 0.2.0a1.

Until 0.2, every Sponsio contract had effectively one failure mode: block the call and let the agent figure it out. That worked for the "AI tried to rm -rf /" demo, but in production it meant brittle agent loops bouncing off refusals every time the policy fired.

0.2 ships three softer landings that keep the agent making progress while still gating the unsafe behavior, plus a few smaller fixes that round out the failure-strategy surface.


What's new

1. tool_policy: default-deny tool access

What it is. A declarative YAML block (or inline kwarg) that says "the agent can only call tools in approved:. Anything else is denied."

tool_policy:
  default: deny
  approved: [search, read_file, list_dir]

Why it exists. Without an allowlist, adding a new tool to your agent framework silently expands the agent's authority. With tool_policy, the policy is the single source of truth for what the agent can reach. Adding a tool to your codebase becomes a deliberate act of trust: you have to put its name in approved: to make it callable.

Why it's good for users.

  • Audit-friendly. The allowlist is the artifact you show in a security review. One file, one list, one source of truth.
  • Prompt-injection-resistant. Combined with enforcement: proactive (below), denied tools never reach the agent's prompt. An attacker who tricks the model into asking for shell_exec finds that shell_exec does not exist in the model's available tools.
  • Backwards-compatible. Default is allow, so existing yaml files keep working byte-for-byte. Users opt in to deny.
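
The allow/deny semantics described above can be sketched in a few lines. This is an illustrative model, not Sponsio's implementation: ToolPolicy and is_allowed are hypothetical names, and only the default and approved fields come from the YAML above. How approved interacts with default: allow is an assumption here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical stand-in for the tool_policy block; not Sponsio's class."""

    default: str = "allow"  # per the notes: allow is the default, deny is opt-in
    approved: frozenset = field(default_factory=frozenset)

    def is_allowed(self, tool_name: str) -> bool:
        # Assumption: approved tools are always callable, and every other
        # tool falls through to the default.
        if tool_name in self.approved:
            return True
        return self.default == "allow"


policy = ToolPolicy(default="deny", approved=frozenset({"search", "read_file", "list_dir"}))
legacy = ToolPolicy()  # models a config with no tool_policy block
```

Under this model, policy.is_allowed("shell_exec") is False while legacy.is_allowed("shell_exec") is True, which is the opt-in behavior the compatibility bullet describes.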

2. enforcement: proactive + filter_tools: proactive tool filtering

What it is. Two paths to the same outcome: shrink the tool menu the agent sees down to the subset that is currently legal.

  • enforcement: proactive (wrap-time). Set on tool_policy. The LangGraph, CrewAI, OpenAI Agents SDK, and Google ADK adapters strip denied tools from the bound toolset at wrap() time. The model literally never sees them.
  • filter_tools(candidates) (per-turn). Pure-probe API on the guard. Returns the subset of tool names that will not be blocked given the live trace. Useful in custom loops where the application owns the LLM call site.

Why it exists. Reactive blocking (the agent tries, gets refused, tries again) wastes tokens and turns. For static rules (default-deny allowlist) the answer does not change between turns; for temporal rules (must_precede(A, B) only allows B after A) the answer changes per turn. Both should be reflected in what the agent sees, not what gets refused on the back end.

Why it's good for users.

  • No wasted attempts. The model does not burn turns on tools it cannot actually call.
  • Cleaner prompts. Fewer tools in the prompt means fewer distractors and a smaller token bill.
  • Works with any framework that supports custom loops. filter_tools is the universal hook; the proactive wrap-time variant is the zero-configuration version for the four adapters above.
  • Side-effect free. filter_tools is a pure probe: no log entry, no callback fanout, no perf sample contamination. Safe to call before every model turn.
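
To make the per-turn behavior concrete, here is a minimal sketch of a pure probe over a temporal rule. It assumes a simplified model in which a rule is a function of the candidate tool and the trace so far. The real filter_tools is a method on the guard and takes only candidates, so the signature and the Rule type below are hypothetical.

```python
from typing import Callable, Iterable, Sequence

# Hypothetical rule shape: returns True when calling `tool` would be blocked
# given the trace so far. Sponsio's real contracts are richer than this.
Rule = Callable[[str, Sequence[str]], bool]


def must_precede(first: str, then: str) -> Rule:
    """Block `then` until `first` has appeared in the trace."""
    def blocked(tool: str, trace: Sequence[str]) -> bool:
        return tool == then and first not in trace
    return blocked


def filter_tools(
    candidates: Iterable[str], trace: Sequence[str], rules: Sequence[Rule]
) -> list[str]:
    """Pure probe: reads the trace, never appends to it or logs anything."""
    return [t for t in candidates if not any(rule(t, trace) for rule in rules)]


rules = [must_precede("check_policy", "issue_refund")]
tools = ["check_policy", "issue_refund"]

before = filter_tools(tools, trace=[], rules=rules)
after = filter_tools(tools, trace=["check_policy"], rules=rules)
```

Here before contains only check_policy, and after contains both tools. In a custom loop, the returned list would be what you bind to the model for that turn.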

3. redirect_to_safe: substitute, do not block

What it is. A pattern + strategy combo that, on violation, substitutes the model's chosen tool with a pre-declared safe alternative.

(
    contract("trash instead of rm")
    .guarantees(redirect_to_safe("rm_rf", "trash"))
)

The model calls rm_rf; Sponsio rolls that event back from the trace, the LangGraph adapter invokes trash with the same arguments, the trace records the substitute call. From the model's perspective, the call succeeded.

Why it exists. A hard block forces the agent to bail out of the current task. A redirect keeps it making progress on a safer path. Most "destructive vs recoverable" tool pairs (rm_rf vs trash, issue_refund vs log_refund_request, force_push vs open_pull_request) are good candidates for this.

Why it's good for users.

  • Agent does not have to learn to recover from policy violations. The recovery is built into the policy.
  • Audit trail reflects what actually executed. The trace records the safe substitute, not the attempted-and-blocked unsafe call. Counters (rate_limit(unsafe, N)) do not tick on the rollback.
  • Composes with conditional contracts. assume(...).guarantees(redirect_to_safe(...)) makes the substitution conditional on a precondition (for example, redirect refunds over $10k while letting smaller ones through).
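
The rollback-and-substitute flow can be sketched as follows. This is a toy dispatcher under stated assumptions, not the LangGraph adapter's code: TOOLS, REDIRECTS, and dispatch are hypothetical names, the trace is a plain list of tool names, and every call to a tool listed in REDIRECTS is treated as a violation.

```python
from typing import Any, Callable


# Hypothetical tools, for illustration only.
def rm_rf(path: str) -> str:
    return f"permanently deleted {path}"


def trash(path: str) -> str:
    return f"moved {path} to trash"


TOOLS: dict[str, Callable[..., Any]] = {"rm_rf": rm_rf, "trash": trash}
REDIRECTS = {"rm_rf": "trash"}  # unsafe tool -> pre-declared safe substitute


def dispatch(tool: str, trace: list[str], **kwargs: Any) -> Any:
    """Record the call; on a redirect, roll it back and run the substitute."""
    trace.append(tool)
    safe = REDIRECTS.get(tool)
    if safe is not None:
        trace.pop()         # roll the unsafe event back out of the trace
        trace.append(safe)  # the trace records only what actually runs
        tool = safe
    return TOOLS[tool](**kwargs)  # same arguments, safer tool


trace: list[str] = []
result = dispatch("rm_rf", trace, path="/tmp/build")
```

After the call, trace holds only the trash event, so a counter over rm_rf events would not tick, matching the audit-trail bullet above.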

4. EscalateToHuman(notify=[...]): notifier hooks

What it is. The escalate strategy now accepts a callable or list of callables (Slack webhook, email sender, oncall pager) that fire synchronously when the contract trips.

EscalateToHuman(
    reason="refund > $10k requires CFO approval",
    notify=[slack_oncall, email_finance_lead],
)

Why it exists. Until 0.2, EscalateToHuman differed from DetBlock only in the action literal and the agent-facing message. No actual side effect, no notification, no out-of-band reach to a human. 0.2 makes the notification real.

Why it's good for users.

  • Isolated failures. A broken Slack webhook does not crash the agent loop and does not silence the remaining notifiers; the exception becomes a RuntimeWarning naming the offending callable.
  • Composable with DetBlock for hard refuse + notify. If you want the call gated AND the page fired, pair DetBlock with monitor.register_callback. The case study at examples/integrations/python/v0_2_finance_escalate_vanilla.py shows the pattern.
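
The isolation behavior can be illustrated with a short sketch. It assumes each notifier is a callable that receives the reason string; the arguments Sponsio actually passes to notifiers are not stated in these notes, and fire_notifiers is a hypothetical helper, not a Sponsio function.

```python
import warnings
from typing import Callable, Iterable


def fire_notifiers(notifiers: Iterable[Callable[[str], None]], reason: str) -> None:
    """Call every notifier in order; a failure in one never stops the rest."""
    for notifier in notifiers:
        try:
            notifier(reason)
        except Exception as exc:
            # Name the offending callable instead of propagating the error.
            name = getattr(notifier, "__name__", repr(notifier))
            warnings.warn(f"notifier {name!r} failed: {exc}", RuntimeWarning, stacklevel=2)


delivered: list[str] = []


def slack_oncall(reason: str) -> None:
    raise ConnectionError("webhook unreachable")  # simulated outage


def email_finance_lead(reason: str) -> None:
    delivered.append(reason)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fire_notifiers([slack_oncall, email_finance_lead], "refund > $10k requires CFO approval")
```

The simulated Slack failure surfaces as one RuntimeWarning naming slack_oncall, and the email notifier still runs.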

Smaller fixes

  • sponsio mode <observe|enforce> CLI is now parent-aware. Prefers updating runtime.mode (the only line the TS loader reads), falls back to defaults.mode, refuses to append a fresh enforce block when neither exists. CI scripts that relied on the old exit-1 behavior for malformed configs keep working.
  • LangGraph adapter rejects chained redirects and self-redirects. A contract that says "redirect A to B" combined with another saying "redirect B to C" no longer silently executes B; chained redirects and self-redirects both raise ToolCallBlocked with a clear chain-naming error.
  • Pattern factories uniformly accept desc=. Including redirect_to_safe, which previously did not and silently broke LLM-extracted rules.
  • TS SDK gets redirectToSafe (formula side; runtime strategy bundle is Python-only for now).
  • Discovery replay_formula now passes content_atoms to grounding. Historical-trace replay against contracts referencing contains(pii) / arg_has(...) no longer silently returns false negatives.
  • render/components.contracts_table wraps the name column in Text(name). Rich was eating bracketed contract descriptions (only [search, read_file] approved) as malformed markup.
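
As an illustration of the redirect validation mentioned in the list above, the following sketch checks a redirect table for self-redirects and chains. It is an assumption-laden model: ToolCallBlocked is redefined locally as a stand-in, validate_redirects is a hypothetical helper, and the check runs up front over a plain dict, whereas the notes do not say when the adapter performs it.

```python
class ToolCallBlocked(Exception):
    """Local stand-in for the exception named in the release notes."""


def validate_redirects(redirects: dict[str, str]) -> None:
    """Reject self-redirects (A -> A) and chains (A -> B while B -> C)."""
    for source, target in redirects.items():
        if source == target:
            raise ToolCallBlocked(f"self-redirect: {source} -> {target}")
        if target in redirects:
            raise ToolCallBlocked(
                f"chained redirect: {source} -> {target} -> {redirects[target]}"
            )


def rejects(redirects: dict[str, str]) -> bool:
    """Convenience wrapper: True when the table is refused."""
    try:
        validate_redirects(redirects)
    except ToolCallBlocked:
        return True
    return False
```

A single rm_rf to trash mapping passes, while {"a": "b", "b": "c"} and {"rm_rf": "rm_rf"} are both refused with a message naming the chain.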

Upgrading

This is an alpha, so pip install sponsio still pulls 0.1.1. To try 0.2.0a1:

pip install --pre sponsio==0.2.0a1

Run the verification script to confirm:

python scripts/verify_v0_2.py

15 checks across core runtime + four adapters. Adapters with the SDK not installed are skipped rather than failed.
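
The skip-rather-than-fail behavior is a common pattern for optional dependencies. A minimal sketch of it, assuming each check declares the top-level module it needs, is shown here. run_check is a hypothetical helper and is not taken from scripts/verify_v0_2.py.

```python
import importlib.util
from typing import Callable


def run_check(required_module: str, check: Callable[[], None]) -> str:
    """Return 'skipped' when the module is absent, else 'passed' or 'failed'."""
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(required_module) is None:
        return "skipped"
    try:
        check()
    except Exception:
        return "failed"
    return "passed"


def always_fails() -> None:
    raise AssertionError("simulated regression")


results = {
    "core": run_check("json", lambda: None),  # stdlib module, always present
    "broken": run_check("json", always_fails),
    "missing-sdk": run_check("sdk_that_is_not_installed", lambda: None),
}
```

With this shape, a missing SDK is reported separately from a real failure, so a partial install does not turn the whole run red.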

Compatibility

  • No breaking changes to the 0.1.x API. Every yaml file, every Sponsio(...) call, every contract factory call from 0.1.1 still works.
  • tool_policy.default is allow by default. You opt into deny.
  • enforcement is reactive by default. You opt into proactive.
  • EscalateToHuman() with no notify= argument behaves exactly as in 0.1.x.

Real-LLM verification

The v0.2 surface was end-to-end verified against Gemini 2.5 Flash through a LangGraph react agent (not just scripted tool calls). See examples/integrations/python/v0_2_real_llm_refund_langgraph.py for the runnable script.

What the verification confirmed under a real model:

  • enforcement: proactive strips denied tools from the bound tool set, so they never reach the prompt. The model saw 3 tools (check_policy, issue_refund, log_refund_request), not 4. delete_customer was completely absent, so prompt-injection attempts to call it have nothing to bind to.
  • redirect_to_safe is transparent to the model. Gemini called issue_refund(customer_id="C-42", amount=5000), the LangGraph adapter substituted log_refund_request, the model read back the ticket-opened result, and adapted its final reply to "Your refund for $5,000 has been submitted and is currently under review". The model did not claim a successful refund. It described what actually ran (the substitute call), not the original unsafe call.
  • Trace integrity. Only log_refund_request events were recorded; zero issue_refund events survived. Downstream counters and rate limits would see only the substitute call.

The script auto-loads .env from the repo root, so a GOOGLE_API_KEY=AIza... line there is all you need. Alternatively, pass the key inline:

GOOGLE_API_KEY=AIza... python examples/integrations/python/v0_2_real_llm_refund_langgraph.py

Cross-check with the verification harness for cross-integration sanity:

python scripts/verify_v0_2.py

15 checks across the core runtime and four adapters. Adapters with the SDK not installed skip rather than fail.

Known limitations

  • redirect_to_safe runtime dispatch is implemented only in the LangGraph adapter. CrewAI / Agents SDK / Google ADK / Vercel AI / Claude Agent surface result.redirected_to for the application to consume manually. Full multi-adapter dispatch is on the 0.2.x track.
  • enforcement: proactive is supported only by the four wrap-based adapters (LangGraph, CrewAI, OpenAI Agents SDK, Google ADK). Claude Agent SDK is hooks-based and would need a different mechanism. Documented in docs/integrations/index.md.
  • TS SDK has the redirectToSafe pattern factory but no strategy / dispatch system. A violation surfaces as a plain block on TS.

What's next

The 0.2.x track will:

  • Extend redirect_to_safe dispatch to the remaining wrap-based adapters.
  • Add a SponsioOpenAI client wrapper so OpenAI / Vercel AI users get per-turn proactive filtering without writing a custom loop.
  • Bring the TS SDK's strategy system to parity with Python.
  • Land the sponsio scan proposed-approval flow for the W1 observe-tune cycle.

If you are using 0.2.0a1 and hit something we did not predict, open an issue.