fix(spawn): resolve parent agent from context in managed mode #4

Merged
viettranx merged 1 commit into nextlevelbuilder:main from xthanhn91:fix/spawn-parent-agent-managed-mode
Feb 27, 2026

Conversation

@xthanhn91
Contributor

Problem

In managed mode, NewSpawnTool and NewSubagentTool are constructed with a hardcoded parentID of "default" (gateway.go:246-247). When a non-default agent spawns a subagent, the announce routes back through gateway_consumer.go using parent_agent metadata — which is "default". The session key resolves to agent:default:..., and since agent "default" does not exist in managed mode, the announce fails:

level=INFO  msg="subagent spawned" parent=default ...
level=INFO  msg="subagent announce → scheduler" session=agent:default:rua-telegram:group:...
level=ERROR msg="subagent announce: agent run failed" error="agent default not found: agent not found: default"

This affects all non-default agents in managed mode deployments.

Fix

Add ctxAgentKey to tool context keys (same pattern as ctxChannel, ctxChatID, etc.):

  1. context_keys.go — new ctxAgentKey constant + WithToolAgentKey / ToolAgentKeyFromCtx
  2. loop.go — inject l.id (agent key) into ctx before tool execution
  3. subagent_spawn_tool.go — read agent key from ctx, fall back to construction-time parentID
  4. subagent_tool.go — same ctx-based resolution for RunSync

Standalone mode is unaffected: when ctxAgentKey is empty, the fallback "default" is used.

Test plan

  • go build ./... passes
  • Managed mode: secondary agent spawns subagent → announce routes to correct parent session
  • Standalone mode: spawn still works (fallback to "default")

In managed mode, multiple agents share a single tool registry. The spawn
and subagent tools were constructed with a hardcoded parentID of "default",
which does not exist as an agent in managed mode deployments. This caused
subagent announces to fail with "agent default not found" when any
non-default agent (e.g. a secondary agent) tried to spawn a subagent.

Fix: add ctxAgentKey to tool context keys (same pattern as channel/chatID),
inject the calling agent's key from the loop, and read it in spawn/subagent
tools with a fallback to the construction-time default for backward
compatibility with standalone mode.
@viettranx merged commit 0105bab into nextlevelbuilder:main on Feb 27, 2026
MiltonSilvaJr referenced this pull request in vellus-ai/argoclaw Mar 22, 2026
Sprint 0 — Security hardening before feature development.

HIGH fixes:
- #1: Whitelist table names in execMapUpdate() — prevents SQL injection
  via dynamic table name (store/pg/helpers.go)
- #2: Log invalid groupBy values in snapshot queries (store/pg/snapshot.go)
- #3: Validated shellEscape() — single-quote wrapping is correct;
  added PBT tests for shell injection (tools/dynamic_tool_security_test.go)

MEDIUM fixes:
- #4-5: Log security warnings for no-token and viewer-fallback auth
  (gateway/router.go)
- #6: Restrict CORS on OpenAPI endpoint — removed wildcard, allow only
  localhost origins (http/openapi.go)
- #7: Add CheckSSRFWithPinning() for DNS rebinding TOCTOU prevention
  (tools/web_shared.go)
- #8: Log warning when TLS verification is disabled
  (tracing/otelexport/exporter.go)
- #9: Pin all Python package versions in Dockerfile — prevents
  supply chain attacks via unpinned dependencies
- #10: Change HOME fallback from /tmp to /app — prevents temp dir
  abuse (tools/credentialed_exec.go)

Also fixes arargoclaw double-rename bug in 356 Go import paths.

Tests: PBT tests for table whitelist and shell escaping (testing/quick).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@viettranx mentioned this pull request on Mar 23, 2026
blackbirdzzzz365-gif pushed a commit to blackbirdzzzz365-gif/goclaw that referenced this pull request Apr 12, 2026
9 checkpoint documents covering the upgrade from 43% to ~85% pattern
matching with Claude Code's architectural patterns.

Checkpoints:
- CP-00: Current state analysis
- CP-01: Context defense 5 layers (Pattern #9)
- CP-02: Concurrency-safe partitioning (Pattern #4)
- CP-03: Streaming tool execution (Pattern #5)
- CP-04: Escalating recovery (Pattern #3)
- CP-05: Context modifier chain + fork isolation (Patterns #6, #8)
- CP-06: Permission classification pipeline (Pattern #10)
- CP-07: Skill system upgrade (Patterns #11-13)
- CP-08: Plugin ecosystem (Patterns #14-16)

Based on analysis from "Giai phau mot Agentic Operating System"
(18 patterns from 513K LOC Claude Code source).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
viettranx added a commit that referenced this pull request May 7, 2026
…scord

Lifts the L106 deferral from Plan #4 P05. Users in chat can now bind, query,
and clear the per-session project binding without an admin RPC roundtrip.

Subcommands (all 3 channels):
- /project list                — projects the user has access to
- /project current             — show current binding
- /project switch <slug>       — bind this session to <slug> (RBAC-gated)
- /project clear               — clear the binding (fall back to channel
                                 default + parent override)
- /project | /project help     — usage text

Implementation:
- internal/channels/project_command.go: shared HandleProjectCommand entry
  point. Channels parse + format only; permission check
  (ProjectGrantStore.ResolveProjectRole rank>0 OR isOwner) and the
  agent_sessions.project_id write live in one place.
- 3 thin per-channel adapters (commands_project.go) build the canonical
  session_key, resolve users.id from the channel sender via
  ContactCollector.ResolveTenantUserID, and forward to the shared handler.
- Discord channel migrated from positional store args to the Options
  pattern Telegram and Feishu already use; gateway_channels_setup.go now
  wires Sessions/Projects/ProjectGrants stores into all 3 channels.
- Storage: agent_sessions.project_id column directly. Same column the
  WS RPC path (Layer 1, admin) writes — bot command and admin share one
  store API. Override resets when the session rotates; no TTL bookkeeping.
- Resolver unchanged: loop_context.go Source 1 already reads
  session.ProjectID, so the new write surfaces on the next agent run.

Tests:
- 13 unit tests covering help / not-configured / switch success / owner
  bypass / permission deny / no user_id / project not found / clear /
  current bound + unbound / list / slug whitespace / update error.
- Full unit suite + sqliteonly build + integration suite remain green.

Plan: plans/260507-1632-layer2-session-project-override/plan.md
Roadmap: docs/development-roadmap.md updated to reflect Layer 2 landed.