
feat: safety governance layer — kill switch & admin controls #29

Merged
artugro merged 2 commits into feat/multi-directional-orchestration from feat/safety-governance-layer on Apr 9, 2026

artugro commented Apr 9, 2026

Summary

  • Platform-wide emergency halt: Redis-backed flag that instantly blocks all agent operations (broker, streaming, networks, A2A, workflows) with a single admin API call
  • Per-agent kill switch: Admin can force-disable any agent — sets is_active=False in DB and caches in Redis for fast rejection at all chokepoints
  • Admin API: 6 endpoints under /admin for kill/reactivate agents, halt/resume platform, view status, and list disabled agents
  • Distributed halt codes: Trustees hold codes, stored only as bcrypt hashes, that can halt the platform without any JWT; the code is the auth. Asymmetric by design: easy to stop, hard to restart
  • Public safety endpoints: GET /safety/status (anyone can check) and POST /safety/halt (code-authenticated, no login needed)
  • Bug fix: Streaming broker path (invoke_agent_stream) was missing is_active check — streaming agents could be invoked even when inactive
  • Network safety: _validate_communication() now checks agent-level is_active for all network participants, and handle_callback() enforces platform halt
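
To make the halt flag and the per-agent kill cache concrete, here is a minimal sketch of the idea rather than the actual SafetyService. The `InMemoryStore` class stands in for Redis so the example runs without a server, and the key names, class name, and method names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Stand-in for Redis so the sketch runs without a server (assumption)."""
    data: dict = field(default_factory=dict)

    def set(self, key: str, value: str) -> None:
        self.data[key] = value

    def get(self, key: str):
        return self.data.get(key)

    def delete(self, key: str) -> None:
        self.data.pop(key, None)


class SafetySketch:
    """Hypothetical, simplified take on the halt flag and agent kill cache."""

    HALT_KEY = "safety:platform_halted"      # assumed key name
    AGENT_KEY = "safety:agent_disabled:{}"   # assumed key pattern

    def __init__(self, store):
        self.store = store

    def halt_platform(self) -> None:
        self.store.set(self.HALT_KEY, "1")

    def resume_platform(self) -> None:
        self.store.delete(self.HALT_KEY)

    def is_halted(self) -> bool:
        return self.store.get(self.HALT_KEY) is not None

    def kill_agent(self, agent_id: str) -> None:
        # Per the PR, the real service also sets is_active=False in the DB.
        self.store.set(self.AGENT_KEY.format(agent_id), "1")

    def reactivate_agent(self, agent_id: str) -> None:
        self.store.delete(self.AGENT_KEY.format(agent_id))

    def may_invoke(self, agent_id: str) -> bool:
        """Single check a chokepoint could call before doing any work."""
        if self.is_halted():
            return False
        return self.store.get(self.AGENT_KEY.format(agent_id)) is None
```

A chokepoint would call `may_invoke(agent_id)` first and reject the request when it returns `False`. The platform flag is checked before the agent flag, so a halt overrides every per-agent state.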

New files

  • src/services/safety.py — SafetyService (platform halt, agent kill, Redis cache)
  • src/core/admin_auth.py — get_admin_user dependency
  • src/routes/admin.py — Admin API routes
  • src/routes/safety.py — Public halt endpoint + admin halt code management
  • src/models/halt_code.py — HaltCode model (trustee codes)
  • alembic/versions/2026_04_09_0001-add_is_admin_to_users.py — is_admin migration
  • alembic/versions/2026_04_09_0002-add_halt_codes_table.py — halt_codes migration

API Endpoints

Public (no auth)

  • GET /safety/status — platform halt status
  • POST /safety/halt — halt with trustee code

Admin only

  • POST /admin/agents/{uuid}/kill — force-disable agent
  • POST /admin/agents/{uuid}/reactivate — re-enable agent
  • POST /admin/platform/halt — emergency halt
  • POST /admin/platform/resume — resume operations
  • GET /admin/platform/status — full status with disabled count
  • GET /admin/agents/disabled — list disabled agents
  • POST /safety/codes — create halt code (returns plaintext once)
  • GET /safety/codes — list codes (no plaintext)
  • DELETE /safety/codes/{id} — revoke code
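
As an illustration of how a client might call the endpoints listed above, the sketch below builds the requests without sending them. The base URL, the bearer-token header for admin routes, and the `code` field in the halt body are assumptions; the PR does not show the request schemas.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local deployment


def build_request(method, path, token=None, body=None):
    """Build (but do not send) a request for one of the endpoints above."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumes admin routes accept the JWT as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def public_halt(code):
    # The "code" field name is hypothetical; no JWT is attached on purpose.
    return build_request("POST", "/safety/halt", body={"code": code})


def admin_resume(token):
    return build_request("POST", "/admin/platform/resume", token=token)


def admin_kill(token, agent_uuid):
    return build_request("POST", f"/admin/agents/{agent_uuid}/kill", token=token)
```

Passing any of these objects to `urllib.request.urlopen` would send the call. Note the asymmetry the PR describes: `public_halt` needs only the trustee code, while `admin_resume` requires admin credentials.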

Test plan

  • Run migrations: alembic upgrade head
  • Promote admin: UPDATE users SET is_admin = true WHERE email = '...';
  • Create halt code via POST /safety/codes — save the returned code
  • Test public halt: POST /safety/halt with the code — verify 503 on all paths
  • Resume: POST /admin/platform/resume
  • Kill agent: POST /admin/agents/{uuid}/kill — verify broker returns error
  • Reactivate: POST /admin/agents/{uuid}/reactivate
  • Streaming invoke with inactive agent — verify rejection (was a bug)

🤖 Generated with Claude Code

feat: add safety governance layer with kill switch and admin controls

Add platform-wide emergency halt, per-agent kill switch, and admin API
to enable ethical control over AI agent operations. This ensures all
communication paths (broker, streaming, networks, A2A, workflows) can
be shut down instantly when needed.

- Add is_admin to User model with migration
- Add SafetyService with Redis-backed platform halt and agent status cache
- Add admin auth dependency (get_admin_user)
- Add admin API: kill/reactivate agents, halt/resume platform, status
- Expose is_active in AgentUpdate schema
- Enforce platform halt check at all communication chokepoints
- Fix streaming broker path missing is_active check
- Add agent-level active checks in network channel validation
- Add PlatformHaltedException (503) and AgentDisabledException (403)
- Filter inactive agents in workflow resolver discovery
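
The chokepoint checks and the streaming fix listed above can be sketched roughly as follows. The exception names and status codes come from this commit message; the `enforce_safety` helper and the simplified `invoke_agent_stream` signature are hypothetical and only illustrate where the check sits.

```python
class PlatformHaltedException(Exception):
    status_code = 503  # status taken from the commit message


class AgentDisabledException(Exception):
    status_code = 403  # status taken from the commit message


def enforce_safety(platform_halted: bool, agent_active: bool) -> None:
    """Hypothetical guard: raise before any agent work is done."""
    if platform_halted:
        raise PlatformHaltedException("platform is halted")
    if not agent_active:
        raise AgentDisabledException("agent is disabled")


def invoke_agent_stream(chunks, platform_halted, agent_active):
    """Toy streaming path with a simplified, assumed signature.

    The guard runs at call time, not lazily on first iteration, so an
    inactive agent is rejected before any chunk is produced.
    """
    enforce_safety(platform_halted, agent_active)
    return iter(chunks)
```

Returning an iterator instead of writing the function as a generator is a deliberate choice in this sketch: a generator body would not run until the first chunk is requested, which would delay the rejection.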

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

artugro self-assigned this Apr 9, 2026

feat: add distributed halt codes and public safety endpoints

Trustees can halt the platform with a code — no JWT needed. Codes are
bcrypt-hashed, shown once on creation, and managed by admins. Asymmetric
by design: easy to stop (code), hard to restart (admin auth).

- Add HaltCode model and migration
- Add public POST /safety/halt (code-authenticated)
- Add public GET /safety/status
- Add admin endpoints: create/list/revoke halt codes
- Register safety router in main.py
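
The halt code lifecycle (generate, show once, store only a hash, verify on halt) might look like the sketch below. The PR uses bcrypt; this example swaps in PBKDF2 from the standard library so it runs without extra packages, and the function names and record layout are assumptions, not the HaltCode model.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def _digest(code: str, salt: bytes) -> bytes:
    # PBKDF2 is a stand-in here; the PR itself hashes codes with bcrypt.
    return hashlib.pbkdf2_hmac("sha256", code.encode(), salt, ITERATIONS)


def create_halt_code():
    """Return (plaintext, record); only the record would be stored."""
    plaintext = secrets.token_urlsafe(24)
    salt = secrets.token_bytes(16)
    record = {"salt": salt, "digest": _digest(plaintext, salt), "revoked": False}
    return plaintext, record


def verify_halt_code(candidate: str, records) -> bool:
    """True if the candidate matches any stored, non-revoked record."""
    for record in records:
        if record["revoked"]:
            continue
        # Constant-time comparison avoids leaking how many bytes matched.
        if hmac.compare_digest(_digest(candidate, record["salt"]), record["digest"]):
            return True
    return False
```

Because each record has its own salt, verification has to try every active record, which is acceptable when only a handful of trustees hold codes.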

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

artugro merged commit c2cbbb0 into feat/multi-directional-orchestration on Apr 9, 2026

artugro added a commit that referenced this pull request on Apr 9, 2026

* feat: multi-directional agent orchestration with calls, messages, and mailboxes

Add communication network infrastructure enabling bidirectional,
topology-rich agent communication. External agents can now proactively
send messages back via callback URLs (reply_url pattern), solving the
invocability asymmetry where only orchestrators could initiate.

Phase 1 - Network foundation: CommunicationNetwork, NetworkParticipant,
NetworkMessage models with Redis-backed context accumulation.

Phase 2 - Three communication channels: synchronous calls, near-real-time
messages (webhook push), and async mailboxes (polling). Callback endpoint
enables external agents to push messages into the network.

Phase 3 - Complex topologies: loop steps with convergence detection
(similarity, approval, max iterations), fan-in aggregation (merge, vote,
LLM summarize), and topology validation (mesh, star, ring).
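
As a rough illustration of the topology validation mentioned in Phase 3, the sketch below checks whether a set of undirected links forms a mesh, star, or ring. The function names and the representation of links as pairs of participant names are assumptions; the commit does not show how the validator models a network.

```python
from itertools import combinations


def _links(nodes, edges):
    """Normalize edges to undirected links; reject unknown endpoints."""
    known = set(nodes)
    for a, b in edges:
        if a not in known or b not in known:
            raise ValueError(f"edge ({a}, {b}) references an unknown node")
    return {frozenset((a, b)) for a, b in edges if a != b}


def is_mesh(nodes, edges):
    """Every pair of participants is directly linked."""
    links = _links(nodes, edges)
    return all(frozenset(pair) in links for pair in combinations(nodes, 2))


def is_star(nodes, edges):
    """Exactly n-1 links, all sharing one hub."""
    links = _links(nodes, edges)
    if len(links) != len(nodes) - 1:
        return False
    return any(all(hub in link for link in links) for hub in nodes)


def is_ring(nodes, edges):
    """Every participant has two neighbours and all lie on one cycle."""
    links = _links(nodes, edges)
    nodes = list(nodes)
    if len(nodes) < 3 or len(links) != len(nodes):
        return False
    neighbours = {n: set() for n in nodes}
    for a, b in (tuple(link) for link in links):
        neighbours[a].add(b)
        neighbours[b].add(a)
    if any(len(v) != 2 for v in neighbours.values()):
        return False
    # Walk from one node; a single cycle must reach every node.
    seen, stack = set(), [nodes[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return len(seen) == len(nodes)
```

The connectivity walk in `is_ring` matters: two separate triangles give every node two neighbours, yet they do not form a single ring.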

Phase 4 - A2A protocol alignment: Agent Card generation, JSON-RPC
protocol adapter, and A2A-compatible endpoints for interoperability
with Google's Agent-to-Agent protocol.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: A2A agent discovery and import as first-class agents

Add discovery service that fetches remote A2A Agent Cards, registers
them in the Intuno registry, generates embeddings, and indexes them
in Qdrant. Imported A2A agents become fully discoverable, invocable,
and can join communication networks — identical to natively registered
agents.

New endpoints:
- POST /a2a/agents/import — import single agent by URL
- POST /a2a/agents/import/batch — import multiple agents
- POST /a2a/agents/{id}/refresh — re-fetch card and update
- GET /a2a/agents/fetch-card?url= — preview card without importing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add communication networks and A2A documentation

Add NETWORKS.md covering communication channels, topologies, workflow
loops/aggregation, and the reply_url bidirectional pattern. Add A2A.md
covering agent import, discovery, protocol mapping, and examples.
Update PROJECT.md with new concepts and doc index. Update
API_ENDPOINTS.md with all network, channel, callback, and A2A endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add integration tests for communication networks

Tests cover the full network lifecycle: create network, add participants,
exchange messages and mailbox items, verify shared context, bidirectional
callbacks, multi-participant context sharing, A2A platform card, agent
card generation, and agent-linked participants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: safety governance layer — kill switch & admin controls (#29)

* feat: add safety governance layer with kill switch and admin controls

* feat: add distributed halt codes and public safety endpoints

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
