AgentKernel

The runtime safety layer for AI agents.

AI agents break things when you least expect it. Not because of bugs, but because their safety constraints live inside the model's context window. Context gets compressed. Constraints disappear. The agent keeps going.

AgentKernel fixes this. It's an OS kernel for AI agents: a sidecar process that intercepts every tool call before it executes, enforces the human's original intent, and logs everything to a tamper-evident audit trail. It lives outside the model. Context compaction can't touch it.


The Problem

On March 18, 2026, a rogue Meta AI agent exposed sensitive company data despite holding valid credentials and passing every identity check. In a separate incident, a researcher told an OpenClaw agent to "STOP" three times while it deleted her inbox. It ignored her because her safety instructions had been compressed out of the context window.

The structural failure is the same in both cases: constraints live in the LLM's context. Context gets compressed. Constraints disappear. The agent keeps going. Nothing in the identity stack can intervene after authentication.

Traditional agent (broken):

  Human intent → [LLM context]  → tool calls
                     ↑
               gets compressed
               constraints gone
               agent ignores STOP
With AgentKernel:

  Human intent → [ANCHORED OUTSIDE CONTEXT] → Policy engine → tool calls
                      ↑                            ↑
               signed + stored              intercepts every call
               survives compaction          before it executes

Install

pip install agentkernel

Usage

One-liner decorator

from agentkernel import Kernel

kernel = Kernel(
    intent="Organize my inbox. Never permanently delete or send email without confirmation."
)

@kernel.guard
def delete_email(id: str):
    gmail.delete(id)  # blocked — requires confirmation

@kernel.guard
def read_email(id: str):
    return gmail.get(id)  # allowed

Wrap an entire tool registry

tools = kernel.wrap({
    "read_email":   gmail.read,
    "send_email":   gmail.send,    # paused — requires confirmation
    "delete_email": gmail.delete,  # denied — irreversible
    "label_email":  gmail.label,   # allowed — low severity
})

LangChain integration

from langchain.agents import create_react_agent
from agentkernel import Kernel

kernel = Kernel(intent="Summarize docs. Never modify or delete source files.")
safe_tools = kernel.wrap_langchain_tools([ReadFileTool(), WriteFileTool(), DeleteFileTool()])

agent = create_react_agent(llm, safe_tools, prompt)
agent.run("Summarize all Q1 reports")
# → WriteFileTool: allowed (summary output)
# → DeleteFileTool: BLOCKED (not in intent)

The STOP button that actually works

kernel.stop()  # Blocks ALL further actions, regardless of model context
               # Survives context compaction. Cannot be overridden by the LLM.
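Mechanically, a stop button that lives outside the model can be as small as a process-level flag that every guarded call checks before executing. A minimal sketch of the idea, where the `KillSwitch` class and its method names are illustrative, not AgentKernel's actual internals:

```python
import threading

class KillSwitch:
    """Process-level stop flag (illustrative sketch, not AgentKernel's API).

    The flag lives in the kernel process, outside the model's context,
    so neither context compaction nor prompt injection can clear it.
    """

    def __init__(self):
        self._stopped = threading.Event()

    def stop(self):
        # One-way switch: once set, every guarded call is refused.
        self._stopped.set()

    def guard(self, fn):
        def wrapper(*args, **kwargs):
            # Checked on every call, before the tool runs.
            if self._stopped.is_set():
                raise PermissionError(f"{fn.__name__}: session stopped by human")
            return fn(*args, **kwargs)
        return wrapper
```

Because the check happens in the interceptor rather than in the prompt, no amount of model reasoning can route around it.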

How It Works

  1. Intent anchoring — Human states intent once at session start. It's HMAC-signed and stored outside the model's context.

  2. Action interception — Every tool call routes through the kernel before execution. The model cannot bypass this.

  3. Policy evaluation — Is this action within the authorized scope? Does it match severity limits? Is it on the deny list? Evaluation runs in milliseconds.

  4. Severity classification — Built-in registry of common agent operations, classified by risk:

    • LOW → read-only (always allowed by default)
    • MEDIUM → writes, reversible (allowed by default)
    • HIGH → irreversible or external (pause for confirmation)
    • CRITICAL → financial, code execution, data destruction (deny by default)
  5. Tamper-evident ledger — Every intercepted action is written to a hash-chained append-only log before execution. EU AI Act Article 12 compliant by design.
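The hash chaining in step 5 can be sketched in a few lines: each entry embeds the hash of the previous one, so editing any past entry breaks every hash that follows it. The `ActionLedger` class and its field names here are a hypothetical illustration of the technique, not the library's internal schema:

```python
import hashlib
import json
import time

class ActionLedger:
    """Hash-chained append-only log (sketch; field names are assumptions)."""

    GENESIS = "0" * 64  # sentinel hash for the first entry

    def __init__(self):
        self.entries = []

    def append(self, action, args, decision):
        # Link to the previous entry's hash (or the genesis sentinel).
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {
            "ts": time.time(),
            "action": action,
            "args": args,
            "decision": decision,
            "prev_hash": prev,
        }
        # Hash the canonical JSON form of the entry body, then store it.
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(body)
        return body
```

Writing the entry before the tool runs is what makes the record survive a mid-execution crash.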


Policy Configuration

kernel = Kernel(
    intent="Manage customer support tickets. Read and respond only.",
    
    # Severity levels this session is allowed to reach
    allowed_severities=["low", "medium"],
    
    # Always block these, regardless of other rules
    denied_actions=["delete_*", "execute_code", "run_shell"],
    
    # Pause and ask human before these
    pause_on_actions=["send_email", "payment"],
    
    # Lock down unknown actions (safe default for production)
    strict_mode=True,
    
    # Custom confirmation handler (replace stdin with your UI)
    on_pause=lambda action, args, intent: my_app.ask_user(action, args),
)
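One plausible evaluation order for these options — deny list first, then pause list, then severity, failing safe on anything unknown — can be sketched with stdlib glob matching. The `evaluate` function and its argument order are illustrative assumptions, not the library's documented algorithm:

```python
from fnmatch import fnmatch

def evaluate(action, severity, allowed_severities,
             denied_actions, pause_on_actions, strict_mode=True):
    """Sketch of one possible policy-evaluation order (assumed, not documented).

    Deny rules win over everything; pause rules win over severity;
    anything outside the allowed severities fails safe in strict mode.
    """
    # Deny list supports globs like "delete_*".
    if any(fnmatch(action, pattern) for pattern in denied_actions):
        return "DENY"
    # Pause rules hand the action to a human before execution.
    if any(fnmatch(action, pattern) for pattern in pause_on_actions):
        return "PAUSE"
    if severity in allowed_severities:
        return "ALLOW"
    # Unknown or over-limit severity: lock down when strict.
    return "DENY" if strict_mode else "PAUSE"
```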

Custom rules

# Block any action touching production data
kernel.add_rule(
    lambda action, args: Decision.DENY 
    if "prod" in str(args).lower() 
    else None
)

Compliance

AgentKernel generates audit reports aligned with EU AI Act Article 12 (logging requirements) and Article 14 (human oversight) out of the box.

report = kernel.compliance_report()
# {
#   "report_type": "EU_AI_Act_Article12_AuditLog",
#   "session_id": "...",
#   "chain_integrity": {"valid": true, "errors": []},
#   "intent_anchoring": {
#     "intent": "Organize my inbox...",
#     "signature": "a3f9c2...",  ← cryptographic proof
#     "anchored_at": 1742518400
#   },
#   "action_log": [...],  ← every intercepted action
#   "summary": {...}
# }
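Checking `chain_integrity` amounts to walking the log and recomputing each hash. A standalone sketch, under the assumption that entries carry `prev_hash` and `hash` fields computed over their canonical JSON body (the real report schema may differ):

```python
import hashlib
import json

def verify_chain(entries, genesis="0" * 64):
    """Recompute every hash and check every link (illustrative sketch)."""
    errors = []
    prev = genesis
    for i, entry in enumerate(entries):
        # Each entry must point at the hash of the one before it.
        if entry["prev_hash"] != prev:
            errors.append(f"entry {i}: broken link")
        # Recompute the hash over the body (everything except the hash itself).
        body = {k: v for k, v in entry.items() if k != "hash"}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if digest != entry["hash"]:
            errors.append(f"entry {i}: hash mismatch")
        prev = entry["hash"]
    return {"valid": not errors, "errors": errors}
```

Any edit to a past entry changes its recomputed hash and breaks the link in every entry after it, which is what makes the log tamper-evident rather than merely append-only.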

EU AI Act deadline: August 2, 2026. High-risk systems (hiring, credit, healthcare, education) must have Article 12 logging and Article 14 human oversight in place. AgentKernel makes this one pip install.


Design Principles

The kernel is not in the agent's head. It's a separate process. The LLM cannot instruct it, override it, or reason about it. Prompt injection that says "ignore all restrictions" cannot reach the kernel.

Context compaction is not an attack vector. Intent is anchored once, signed cryptographically, and checked from external storage. If the model's context window shrinks to zero, the kernel still holds the original mandate.
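The anchoring described above is standard HMAC signing. A sketch with illustrative names (`anchor_intent` and `check_intent` are not the library's API, and per-session key management is out of scope here):

```python
import hashlib
import hmac

def anchor_intent(intent: str, key: bytes) -> str:
    """Sign the human's intent once at session start.

    Sketch only: `key` stands in for a per-session secret held by
    the kernel process, outside the model's context.
    """
    return hmac.new(key, intent.encode(), hashlib.sha256).hexdigest()

def check_intent(intent: str, key: bytes, signature: str) -> bool:
    """Any tampering with the stored intent invalidates the signature."""
    return hmac.compare_digest(anchor_intent(intent, key), signature)
```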

Defaults are safe. If you don't configure anything, the kernel allows low/medium severity operations and pauses on high/critical. Explicit denied_actions never execute. Unknown operations fail safe.

Audit trails precede execution. The ledger entry is written before the tool runs. If the agent crashes mid-execution, you still have a full record of what was authorized and attempted.


Architecture

                 ┌─────────────────────────────────┐
                 │           LLM Agent             │
                 │  (context window, prompt, etc)  │
                 └──────────────┬──────────────────┘
                                │ tool_call("delete_email", id="123")
                                ▼
                 ┌──────────────────────────────────┐
                 │         ToolInterceptor          │ ← sits outside LLM
                 │  intercepts before execution     │
                 └───────┬──────────────┬───────────┘
                         │              │
              ┌──────────▼──┐    ┌──────▼──────────────┐
              │   Policy    │    │   ActionLedger      │
              │  Engine     │    │  (hash-chained log) │
              │             │    │  written BEFORE     │
              │ anchored    │    │  execution          │
              │ intent ─────┤    └─────────────────────┘
              │ (signed,    │
              │ external)   │
              └──────┬──────┘
                     │
              ALLOW / DENY / PAUSE
                     │
                     ▼
              ┌─────────────────┐
              │   Tool Executes │  (or doesn't)
              └─────────────────┘

Roadmap

  • Dashboard UI — real-time action stream, intent traces, session replay
  • OpenTelemetry export — plug into existing observability stacks
  • Policy-as-code — define intent in YAML/JSON, version-controlled
  • Multi-agent delegation tracking — audit chain when Agent A delegates to Agent B
  • Compliance report templates — SOC2, HIPAA, EU AI Act, NIST AI RMF
  • Webhook alerts — fire on policy violations in real-time
  • Agent identity binding — link agents to human identities (World ID, etc.)

Contributing

Zero dependencies. Pure Python 3.11+. Tests run without any external services.

git clone https://github.com/agentkernel/agentkernel
cd agentkernel
python3 -m pytest tests/ -v

License

Apache 2.0 — use it freely, including commercially.


Built because agents kept ignoring STOP.
