
feat(agentcore-strands): U5 Skill meta-tool + session allowlist (ships inert) #511

Merged
ericodom merged 1 commit into main from feat/v1-agent-arch-u5-skill-meta-tool on Apr 24, 2026


@ericodom
Contributor

Summary

Adds the single Skill(name, args) meta-tool, which will become the sole invocation path for every skill-with-scripts once U6 makes the cutover. It ships inert: the module and tests land, the Dockerfile COPY picks it up, and _boot_assert registers it, but nothing in server.py calls it yet.

Part of the V1 agent-architecture plan (docs/plans/2026-04-23-007-feat-v1-agent-architecture-final-call-plan.md §U5).

Why ship inert

The plan's own sequencing gates U6 (the live cutover) on U7 PASS — U7 is the shadow harness that dual-dispatches old and new paths on real invocations and measures divergence. Wiring U5 into the live Agent(tools=...) before U7 exists would swap the invocation path without the safety net the plan itself requires. This PR therefore lands the module + tests and defers wiring to U7.

Same pattern as U4 (#510) — the big security-sensitive chunks ship as reviewable, testable code without touching production flow.

What lands

  • container-sources/skill_meta_tool.py: SessionAllowlist (R6/R7 intersection), invoke_skill (pure entry point), build_skill_meta_tool (Strands @tool factory), intersect_allowed_tools (narrow-only frontmatter intersection), SkillUnauthorized (distinct from SkillNotFound).
  • test_skill_meta_tool.py — 12 tests covering plan AE4 + every listed test scenario.
  • _boot_assert.EXPECTED_CONTAINER_SOURCES — adds skill_meta_tool.
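
As an illustration of the narrow-only intersection performed by intersect_allowed_tools, here is a minimal sketch. It assumes `declared` is a list of tool names and `session_tools` is any iterable of names; the return type, the logger name, and the warning text are assumptions, not the module's actual code. The warning mirrors the commit message's note that declared-but-missing tools are reported to operators.

```python
import logging
from typing import Iterable, List

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("skill_meta_tool_sketch")


def intersect_allowed_tools(declared: List[str], session_tools: Iterable[str]) -> List[str]:
    """Return the declared tools that the session actually offers.

    Narrow-only: the result is always a subset of session_tools, so a
    skill's frontmatter can never grant a tool the session lacks.
    """
    session = set(session_tools)
    allowed = [tool for tool in declared if tool in session]
    missing = [tool for tool in declared if tool not in session]
    if missing:
        # Surface disabled dependencies instead of silently dropping them.
        logger.warning("declared tools not available in session: %s", ", ".join(missing))
    return allowed
```

For example, a skill declaring `["read", "shell"]` in a session that only offers `read` and `write` ends up with `["read"]`, and `shell` is logged as missing.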

Security choice: SkillUnauthorized vs SkillNotFound

SkillNotFound = slug not in catalog anywhere. SkillUnauthorized = slug exists in catalog but this session's allowlist filters it out. Distinct classes so the model cannot enumerate tenant-scoped catalog membership by probing slugs and watching which error type comes back.
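
The two definitions above can be sketched as a lookup that checks the catalog first and the session allowlist second. This is only an illustration: `resolve_skill`, the dict-shaped catalog, and the set-shaped allowlist are hypothetical stand-ins, and how each error is surfaced to the model versus the audit log is not shown.

```python
class SkillNotFound(Exception):
    """The slug is not in the catalog anywhere."""


class SkillUnauthorized(Exception):
    """The slug is in the catalog, but this session's allowlist filters it out."""


def resolve_skill(slug, catalog, session_allowlist):
    """Return the catalog entry for slug, or raise one of the two errors.

    `catalog` is assumed to be a mapping of slug -> entry and
    `session_allowlist` a set of permitted slugs (both hypothetical shapes).
    """
    if slug not in catalog:
        raise SkillNotFound(slug)
    if slug not in session_allowlist:
        raise SkillUnauthorized(slug)
    return catalog[slug]
```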

Session allowlist precedence (plan R6/R7)

SessionAllowlist = (tenant_skills ∩ template_skills) − template_blocks − tenant_kill_switches

Tenant kill-switches always trump template enablement — a template cannot widen past what the tenant disabled. Test test_allowlist_template_cannot_unblock_a_tenant_kill_switch enforces this invariant.
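
The precedence formula maps directly onto set operations. The sketch below assumes each input is an iterable of skill slugs; the function name `compute_session_allowlist` is hypothetical and stands in for however SessionAllowlist is actually constructed.

```python
from typing import FrozenSet, Iterable


def compute_session_allowlist(
    tenant_skills: Iterable[str],
    template_skills: Iterable[str],
    template_blocks: Iterable[str],
    tenant_kill_switches: Iterable[str],
) -> FrozenSet[str]:
    """(tenant_skills ∩ template_skills) − template_blocks − tenant_kill_switches."""
    enabled = frozenset(tenant_skills) & frozenset(template_skills)
    # Subtraction runs last, so a kill-switched slug is removed even when
    # the template enables it: a template can narrow but never widen.
    return enabled - frozenset(template_blocks) - frozenset(tenant_kill_switches)


# Slugs other than "sales-prep" are made up for the example.
allowlist = compute_session_allowlist(
    tenant_skills={"sales-prep", "crm-sync", "digest"},
    template_skills={"sales-prep", "crm-sync", "digest", "extra"},
    template_blocks={"crm-sync"},
    tenant_kill_switches={"digest"},
)
```

Here `allowlist` is `frozenset({"sales-prep"})`: `extra` is dropped because the tenant never enabled it, `crm-sync` because the template blocks it, and `digest` because the tenant kill-switch wins over template enablement.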

What this PR does NOT do

  • Does not wire Skill into server.py's Agent(tools=...). Deferred to U7.
  • Does not drop the AGENTS.md-conditional around AgentSkills. Entangled with the live-path swap — lands alongside the cutover.
  • Does not suppress AgentSkills' built-in skills tool. Same reason — suppression only makes sense once Skill is canonical.

Test plan

  • test_skill_meta_tool.py — 12 tests pass
  • Full agent-container pytest — 223 green (12 new + 211 existing)
  • ruff import-sort clean
  • Ships inert — grep confirms no production import path reaches skill_meta_tool
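
The PR verifies inertness with grep. An equivalent check could also be scripted; the sketch below is one hypothetical way to do it with Python's `ast` module, and is not part of this PR. The function name and the convention of skipping `test_` files are assumptions.

```python
import ast
from pathlib import Path


def files_importing(module, root, skip_prefixes=("test_",)):
    """List .py files under root that import `module`.

    Test files and the module's own file are skipped. An empty result
    means no production import path reaches the module.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        if path.name.startswith(skip_prefixes) or path.stem == module:
            continue
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                # Covers both `from module import x` and `from pkg import module`.
                names = [node.module or ""] + [alias.name for alias in node.names]
            else:
                continue
            if any(n == module or n.endswith("." + module) for n in names):
                hits.append(path.name)
                break
    return hits
```

Calling `files_importing("skill_meta_tool", "container-sources")` would be expected to return an empty list while the module ships inert.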

What's next

U7 (characterization + shadow harness) consumes this PR. It dual-dispatches real invocations through run_skill_dispatch (old) and Skill → skill_dispatcher (new) and logs divergence. Once the shadow run has been clean for 30+ days per LLM slug, U6 flips the cutover.

🤖 Generated with Claude Code

…s inert)

The single `Skill(name, args)` meta-tool that U6 flips to be the sole
invocation path once U7's shadow harness validates equivalence. Today it
ships as inert code — the Dockerfile wildcard COPY picks it up (via U2a)
and _boot_assert registers it, but server.py's live Agent(tools=...) path
still routes through the existing run_skill_dispatch / composition_runner
code.

## Why ship inert

The plan (#7 §U4/U5/U6/U7) explicitly gates U6's cutover on U7 PASS
— U7 is the shadow harness that dual-dispatches both the old and new
paths on real invocations and measures divergence. Wiring U5 into the
live Agent(tools=...) before U7 exists would swap the invocation path
without the safety net the plan itself calls for. This PR therefore
ships the module + tests and defers server.py wiring to U7.

## What lands

### `container-sources/skill_meta_tool.py`
- `SessionAllowlist` — intersection of
  `tenant_skills ∩ template_skills ∩ ¬template_blocks ∩ ¬tenant_kill_switches`
  pre-computed once at Agent(tools=...). Narrow-only: a template cannot
  widen past what the tenant enabled (plan R6/R7).
- `invoke_skill(name, args, *, ctx)` — pure entry point the Strands @tool
  wrapper calls. Routes script-bundle skills to U4's `dispatch_skill_script`;
  pure-SKILL.md skills return their body for in-prompt consumption
  (no sandbox roundtrip).
- `build_skill_meta_tool(ctx)` — factory returning the coroutine the
  `@strands.tool` decorator wraps. Decoupled from the SDK so unit tests
  exercise the full decision tree without importing strands.
- `intersect_allowed_tools(declared, session_tools)` — narrow-only
  intersection of a skill's declared `allowed-tools` frontmatter against
  the session's effective tool set. Warns on declared-but-missing so
  operators can spot disabled dependencies.
- `SkillUnauthorized` — distinct error from `SkillNotFound` so the model
  cannot enumerate tenant-scoped catalog membership by probing slugs.
  Both raise; the audit log gets full context.
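
The routing described above for `invoke_skill` and `build_skill_meta_tool` can be sketched as follows. The `SkillEntry` and `SkillContext` shapes, the `has_scripts` flag, and passing the dispatcher in through `ctx` are assumptions made so the example runs without Strands or the U4 dispatcher; only the function names and the `invoke_skill(name, args, *, ctx)` signature come from this PR.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, FrozenSet, Optional


class SkillNotFound(Exception):
    pass


class SkillUnauthorized(Exception):
    pass


@dataclass(frozen=True)
class SkillEntry:  # hypothetical catalog record
    slug: str
    body: str
    has_scripts: bool = False


@dataclass
class SkillContext:  # hypothetical stand-in for ctx
    catalog: Dict[str, SkillEntry]
    allowlist: FrozenSet[str]
    dispatch_skill_script: Callable[[SkillEntry, dict], Awaitable[Any]]


async def invoke_skill(name: str, args: dict, *, ctx: SkillContext) -> Any:
    entry = ctx.catalog.get(name)
    if entry is None:
        raise SkillNotFound(name)
    if name not in ctx.allowlist:
        raise SkillUnauthorized(name)
    if entry.has_scripts:
        # Script-bundle skill: hand off to the sandboxed dispatcher.
        return await ctx.dispatch_skill_script(entry, args)
    # Pure-SKILL.md skill: return the body, no sandbox roundtrip.
    return entry.body


def build_skill_meta_tool(ctx: SkillContext):
    """Return a coroutine function closed over ctx; an SDK decorator can wrap it."""
    async def Skill(name: str, args: Optional[dict] = None) -> Any:
        return await invoke_skill(name, args or {}, ctx=ctx)
    return Skill
```

Because the factory returns a plain coroutine function, a unit test can drive it with `asyncio.run(tool("sales-prep", {...}))` and a fake dispatcher, without importing strands.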

### `test_skill_meta_tool.py` — 12 cases
Covers plan AE4 + every listed test scenario:
- happy path: Skill("sales-prep") routes to dispatcher with correct args
- nested Skill() threads the same TurnCounters through
- pure-SKILL.md slug returns body, no sandbox
- unknown slug → SkillNotFound
- in catalog but not in session → SkillUnauthorized
- SessionAllowlist triple-constraint intersection correctness
- tenant kill-switch trumps template enablement (R7 precedence)
- allowed-tools frontmatter narrows (never widens) past session tools
- build_skill_meta_tool closure captures ctx correctly

### `_boot_assert.EXPECTED_CONTAINER_SOURCES`
Adds skill_meta_tool so the Dockerfile RUN asserts it landed.
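
A build-time assertion of this kind might look like the sketch below. It is a guess at the general shape, not the contents of `_boot_assert`: the helper names, the one-file-per-module layout, and the error message are all assumptions, and the real `EXPECTED_CONTAINER_SOURCES` lists more modules than shown.

```python
from pathlib import Path

# Only the entry added by this PR is shown; the real tuple is longer.
EXPECTED_CONTAINER_SOURCES = ("skill_meta_tool",)


def missing_container_sources(source_dir, expected=EXPECTED_CONTAINER_SOURCES):
    """Return the expected module names that have no .py file in source_dir."""
    root = Path(source_dir)
    return [name for name in expected if not (root / (name + ".py")).is_file()]


def boot_assert(source_dir):
    """Fail loudly if the COPY step did not bring in an expected module."""
    missing = missing_container_sources(source_dir)
    if missing:
        raise RuntimeError("container sources missing from image: %s" % missing)
```

Running `boot_assert` from a Dockerfile RUN step would fail the image build, rather than the first request, if a wildcard COPY ever stopped matching the file.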

## What this PR does NOT do

- Does NOT wire `Skill` into server.py's Agent(tools=...). Deferred to
  U7 (shadow wiring) then U6 (canonical cutover).
- Does NOT drop the AGENTS.md-conditional around AgentSkills. Plan calls
  for this at U5 but it's entangled with the live-path swap — lands
  alongside the cutover.
- Does NOT suppress AgentSkills' built-in `skills` tool. Same reason —
  suppression only makes sense once `Skill` is the canonical path.

## Test counts

- `test_skill_meta_tool.py` — 12 cases
- Full agent-container suite: 223 green (12 new + 211 existing)
- ruff import-sort clean on new files

Part of the V1 agent-architecture plan
(`docs/plans/2026-04-23-007-feat-v1-agent-architecture-final-call-plan.md` §U5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom merged commit 6eebe6d into main on Apr 24, 2026
4 checks passed
ericodom deleted the feat/v1-agent-arch-u5-skill-meta-tool branch on April 24, 2026 at 11:32
ericodom added a commit that referenced this pull request May 5, 2026
…s inert) (#511)
