feat: safety governance layer — kill switch & admin controls by artugro · Pull Request #29 · IntunoAI/intuno

artugro · 2026-04-09T19:49:06Z

Summary

Platform-wide emergency halt: Redis-backed flag that instantly blocks all agent operations (broker, streaming, networks, A2A, workflows) with a single admin API call
Per-agent kill switch: Admin can force-disable any agent — sets is_active=False in DB and caches in Redis for fast rejection at all chokepoints
Admin API: 6 endpoints under /admin for kill/reactivate agents, halt/resume platform, view status, and list disabled agents
Distributed halt codes: Trustees hold bcrypt-hashed codes that can halt the platform without any JWT — the code is the auth. Asymmetric by design: easy to stop, hard to restart
Public safety endpoints: GET /safety/status (anyone can check) and POST /safety/halt (code-authenticated, no login needed)
Bug fix: Streaming broker path (invoke_agent_stream) was missing is_active check — streaming agents could be invoked even when inactive
Network safety: _validate_communication() now checks agent-level is_active for all network participants, and handle_callback() enforces platform halt
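The halt and kill checks described above can be pictured with a minimal sketch. It is not the code in src/services/safety.py: the key names, the method names, and the `InMemoryStore` class (a stand-in for a Redis client so the example runs without a server) are all assumptions made for illustration. Only the two exception names and their status codes come from the commit message.

```python
class PlatformHaltedException(Exception):
    """Raised when the platform-wide halt flag is set (maps to HTTP 503)."""
    status_code = 503


class AgentDisabledException(Exception):
    """Raised when a specific agent has been killed (maps to HTTP 403)."""
    status_code = 403


class InMemoryStore:
    """Hypothetical stand-in for a Redis client: get/set/delete on string keys."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


# Hypothetical key names; the real service may use different ones.
HALT_KEY = "safety:platform_halted"
AGENT_KEY = "safety:agent_disabled:{}"


class SafetyService:
    """Sketch of a halt flag plus a per-agent disabled cache."""

    def __init__(self, store):
        self.store = store

    def halt_platform(self, reason="halted"):
        self.store.set(HALT_KEY, reason)

    def resume_platform(self):
        self.store.delete(HALT_KEY)

    def kill_agent(self, agent_id):
        # The PR also sets is_active=False in the database; omitted here.
        self.store.set(AGENT_KEY.format(agent_id), "1")

    def reactivate_agent(self, agent_id):
        self.store.delete(AGENT_KEY.format(agent_id))

    def ensure_allowed(self, agent_id):
        """Call at every chokepoint before doing any agent work."""
        if self.store.get(HALT_KEY) is not None:
            raise PlatformHaltedException("platform is halted")
        if self.store.get(AGENT_KEY.format(agent_id)) is not None:
            raise AgentDisabledException(f"agent {agent_id} is disabled")
```

In this sketch every path (broker, streaming, networks, A2A, workflows) would call `ensure_allowed` first, so a single flag write is enough to reject all of them.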

New files

src/services/safety.py — SafetyService (platform halt, agent kill, Redis cache)
src/core/admin_auth.py — get_admin_user dependency
src/routes/admin.py — Admin API routes
src/routes/safety.py — Public halt endpoint + admin halt code management
src/models/halt_code.py — HaltCode model (trustee codes)
alembic/versions/2026_04_09_0001-add_is_admin_to_users.py — is_admin migration
alembic/versions/2026_04_09_0002-add_halt_codes_table.py — halt_codes migration

API Endpoints

Public (no auth)

GET /safety/status — platform halt status
POST /safety/halt — halt with trustee code

Admin only

POST /admin/agents/{uuid}/kill — force-disable agent
POST /admin/agents/{uuid}/reactivate — re-enable agent
POST /admin/platform/halt — emergency halt
POST /admin/platform/resume — resume operations
GET /admin/platform/status — full status with disabled count
GET /admin/agents/disabled — list disabled agents
POST /safety/codes — create halt code (returns plaintext once)
GET /safety/codes — list codes (no plaintext)
DELETE /safety/codes/{id} — revoke code
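The halt code lifecycle behind the /safety/codes endpoints (create returns the plaintext once, list never exposes it, revoke disables it) might look roughly like the sketch below. The PR states the codes are bcrypt-hashed; to keep this example standard-library only, it swaps in salted PBKDF2 from `hashlib`, which is a different algorithm. The `HaltCodeRegistry` class and its method names are hypothetical and are not the HaltCode model from the PR.

```python
import hashlib
import hmac
import secrets


def _hash_code(code: str, salt: bytes) -> bytes:
    # PBKDF2 stands in for bcrypt here; the PR itself uses bcrypt.
    return hashlib.pbkdf2_hmac("sha256", code.encode("utf-8"), salt, 100_000)


class HaltCodeRegistry:
    """Hypothetical in-memory store of hashed trustee halt codes."""

    def __init__(self):
        self._codes = {}
        self._next_id = 1

    def create(self, label: str):
        """Generate a code, keep only its hash, return the plaintext once."""
        plaintext = secrets.token_urlsafe(24)
        salt = secrets.token_bytes(16)
        code_id = self._next_id
        self._next_id += 1
        self._codes[code_id] = {
            "label": label,
            "salt": salt,
            "digest": _hash_code(plaintext, salt),
            "revoked": False,
        }
        return code_id, plaintext

    def list_codes(self):
        """Metadata only: no plaintext, salt, or digest."""
        return [
            {"id": code_id, "label": c["label"], "revoked": c["revoked"]}
            for code_id, c in self._codes.items()
        ]

    def revoke(self, code_id: int):
        self._codes[code_id]["revoked"] = True

    def verify(self, candidate: str) -> bool:
        """True if the candidate matches any non-revoked code."""
        matched = False
        for c in self._codes.values():
            if c["revoked"]:
                continue
            digest = _hash_code(candidate, c["salt"])
            if hmac.compare_digest(digest, c["digest"]):
                matched = True
        return matched
```

Under these assumptions, a public halt handler would call `verify` on the submitted code and set the halt flag only when it returns `True`; resuming stays behind admin authentication, which preserves the asymmetry described in the summary.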

Test plan

Run migrations: alembic upgrade head
Promote admin: UPDATE users SET is_admin = true WHERE email = '...';
Create halt code via POST /safety/codes — save the returned code
Test public halt: POST /safety/halt with the code — verify 503 on all paths
Resume: POST /admin/platform/resume
Kill agent: POST /admin/agents/{uuid}/kill — verify broker returns error
Reactivate: POST /admin/agents/{uuid}/reactivate
Streaming invoke with inactive agent — verify rejection (was a bug)
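The last step of the test plan could be automated along these lines. This is a sketch only: the `Agent` dataclass, the signature given to `invoke_agent_stream`, and the `collect` helper are assumptions for illustration and do not reflect the real broker code. It shows the shape of the fix, an is_active check before the first chunk is streamed.

```python
import asyncio
from dataclasses import dataclass


class AgentDisabledException(Exception):
    """Rejection for inactive agents (403 per the commit message)."""
    status_code = 403


@dataclass
class Agent:
    # Hypothetical minimal agent record.
    id: str
    is_active: bool = True


async def invoke_agent_stream(agent, chunks):
    """Hypothetical streaming path: reject inactive agents before streaming."""
    if not agent.is_active:
        raise AgentDisabledException(f"agent {agent.id} is inactive")
    for chunk in chunks:
        yield chunk


async def collect(agent, chunks):
    """Drain the stream into a list; the guard fires on the first iteration."""
    return [chunk async for chunk in invoke_agent_stream(agent, chunks)]
```

Because `invoke_agent_stream` is an async generator in this sketch, the exception surfaces when the caller starts iterating rather than when the function is called, so a test needs to consume the stream to observe the rejection.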

🤖 Generated with Claude Code

Add platform-wide emergency halt, per-agent kill switch, and admin API to enable ethical control over AI agent operations. This ensures all communication paths (broker, streaming, networks, A2A, workflows) can be shut down instantly when needed.

- Add is_admin to User model with migration
- Add SafetyService with Redis-backed platform halt and agent status cache
- Add admin auth dependency (get_admin_user)
- Add admin API: kill/reactivate agents, halt/resume platform, status
- Expose is_active in AgentUpdate schema
- Enforce platform halt check at all communication chokepoints
- Fix streaming broker path missing is_active check
- Add agent-level active checks in network channel validation
- Add PlatformHaltedException (503) and AgentDisabledException (403)
- Filter inactive agents in workflow resolver discovery

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Trustees can halt the platform with a code — no JWT needed. Codes are bcrypt-hashed, shown once on creation, and managed by admins. Asymmetric by design: easy to stop (code), hard to restart (admin auth).

- Add HaltCode model and migration
- Add public POST /safety/halt (code-authenticated)
- Add public GET /safety/status
- Add admin endpoints: create/list/revoke halt codes
- Register safety router in main.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: multi-directional agent orchestration with calls, messages, and mailboxes

  Add communication network infrastructure enabling bidirectional, topology-rich agent communication. External agents can now proactively send messages back via callback URLs (reply_url pattern), solving the invocability asymmetry where only orchestrators could initiate.

  Phase 1 - Network foundation: CommunicationNetwork, NetworkParticipant, NetworkMessage models with Redis-backed context accumulation.
  Phase 2 - Three communication channels: synchronous calls, near-real-time messages (webhook push), and async mailboxes (polling). Callback endpoint enables external agents to push messages into the network.
  Phase 3 - Complex topologies: loop steps with convergence detection (similarity, approval, max iterations), fan-in aggregation (merge, vote, LLM summarize), and topology validation (mesh, star, ring).
  Phase 4 - A2A protocol alignment: Agent Card generation, JSON-RPC protocol adapter, and A2A-compatible endpoints for interoperability with Google's Agent-to-Agent protocol.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: A2A agent discovery and import as first-class agents

  Add discovery service that fetches remote A2A Agent Cards, registers them in the Intuno registry, generates embeddings, and indexes them in Qdrant. Imported A2A agents become fully discoverable, invocable, and can join communication networks — identical to natively registered agents.

  New endpoints:
  - POST /a2a/agents/import — import single agent by URL
  - POST /a2a/agents/import/batch — import multiple agents
  - POST /a2a/agents/{id}/refresh — re-fetch card and update
  - GET /a2a/agents/fetch-card?url= — preview card without importing

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add communication networks and A2A documentation

  Add NETWORKS.md covering communication channels, topologies, workflow loops/aggregation, and the reply_url bidirectional pattern. Add A2A.md covering agent import, discovery, protocol mapping, and examples. Update PROJECT.md with new concepts and doc index. Update API_ENDPOINTS.md with all network, channel, callback, and A2A endpoints.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add integration tests for communication networks

  Tests cover the full network lifecycle: create network, add participants, exchange messages and mailbox items, verify shared context, bidirectional callbacks, multi-participant context sharing, A2A platform card, agent card generation, and agent-linked participants.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: safety governance layer — kill switch & admin controls (#29)

  * feat: add safety governance layer with kill switch and admin controls

    Add platform-wide emergency halt, per-agent kill switch, and admin API to enable ethical control over AI agent operations. This ensures all communication paths (broker, streaming, networks, A2A, workflows) can be shut down instantly when needed.

    - Add is_admin to User model with migration
    - Add SafetyService with Redis-backed platform halt and agent status cache
    - Add admin auth dependency (get_admin_user)
    - Add admin API: kill/reactivate agents, halt/resume platform, status
    - Expose is_active in AgentUpdate schema
    - Enforce platform halt check at all communication chokepoints
    - Fix streaming broker path missing is_active check
    - Add agent-level active checks in network channel validation
    - Add PlatformHaltedException (503) and AgentDisabledException (403)
    - Filter inactive agents in workflow resolver discovery

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  * feat: add distributed halt codes and public safety endpoints

    Trustees can halt the platform with a code — no JWT needed. Codes are bcrypt-hashed, shown once on creation, and managed by admins. Asymmetric by design: easy to stop (code), hard to restart (admin auth).

    - Add HaltCode model and migration
    - Add public POST /safety/halt (code-authenticated)
    - Add public GET /safety/status
    - Add admin endpoints: create/list/revoke halt codes
    - Register safety router in main.py

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  ---------

  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

artugro self-assigned this Apr 9, 2026

artugro merged commit c2cbbb0 into feat/multi-directional-orchestration Apr 9, 2026

artugro merged 2 commits into feat/multi-directional-orchestration from feat/safety-governance-layer
