Architecture

Stephen Cross edited this page Jun 3, 2026 · 7 revisions

How It Works

The plugin registers on Hermes startup via register(ctx). During registration, it:

  1. Reads ~/.hermes/custom-dangerous-patterns.yaml (configurable path)
  2. Compiles each user-defined regex pattern
  3. Appends (pattern, description) tuples to tools.approval.DANGEROUS_PATTERNS
  4. Appends compiled regexes to tools.approval.DANGEROUS_PATTERNS_COMPILED
  5. Runs integrity checks: config SHA-256 hash comparison and protected pattern verification
  6. Monkey-patches detect_dangerous_command() to check allow patterns first
  7. Monkey-patches check_all_command_guards() to intercept deny patterns before the approval prompt
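
Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `approval` namespace stands in for `tools.approval`, and `inject_patterns` and the entry dict shape are hypothetical names introduced here.

```python
import re
from types import SimpleNamespace

# Stand-in for tools.approval (assumption: both names are plain module-level lists).
approval = SimpleNamespace(DANGEROUS_PATTERNS=[], DANGEROUS_PATTERNS_COMPILED=[])


def inject_patterns(entries, target=approval):
    """Compile each user-defined pattern and append it to both lists.

    `entries` is a list of dicts with 'pattern' and 'description' keys
    (hypothetical shape). Invalid regexes are skipped rather than raised.
    Returns the number of patterns injected.
    """
    injected = 0
    for entry in entries:
        try:
            compiled = re.compile(entry["pattern"])
        except re.error:
            continue  # one bad pattern must not block the valid ones
        target.DANGEROUS_PATTERNS.append((entry["pattern"], entry["description"]))
        target.DANGEROUS_PATTERNS_COMPILED.append(compiled)
        injected += 1
    return injected


count = inject_patterns([
    {"pattern": r"\bvultr\b", "description": "Vultr command"},
    {"pattern": r"(unclosed", "description": "broken regex"},
])
```

Here only the first entry is injected; the second fails to compile and is skipped.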

Startup Sequence

Hermes startup:
  1. cli.py / run_agent.py starts
  2. Plugin discovery → register(ctx) runs
     → reads config, compiles patterns
     → appends to DANGEROUS_PATTERNS / DANGEROUS_PATTERNS_COMPILED
     → runs integrity checks (SHA-256 hash, protected patterns)
     → monkey-patches check_all_command_guards() to enforce deny patterns
     → monkey-patches detect_dangerous_command() for allow patterns
     → checks for allow pattern shadowing
  3. Tool discovery → terminal_tool.py imports approval.py
  4. approval.py builds DANGEROUS_PATTERNS_COMPILED from DANGEROUS_PATTERNS
     (our patterns are already in the list)
  5. Agent runs → detect_dangerous_command() matches our patterns
  6. Built-in approval flow handles once/session/always/deny
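
The SHA-256 integrity check in step 2 amounts to comparing the config file's current digest with one recorded earlier. The sketch below shows only that comparison; where the plugin stores the recorded digest is not covered here, and `config_digest` and `config_changed` are hypothetical helper names.

```python
import hashlib
import tempfile
from pathlib import Path


def config_digest(path):
    """Return the SHA-256 hex digest of the config file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def config_changed(path, recorded_digest):
    """True when the file no longer matches the digest recorded earlier."""
    return config_digest(path) != recorded_digest


# Demonstration against a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "custom-dangerous-patterns.yaml"
    cfg.write_text("patterns: []\n")
    recorded = config_digest(cfg)
    unchanged = config_changed(cfg, recorded)  # same bytes, same digest

    cfg.write_text("patterns: [edited]\n")
    changed = config_changed(cfg, recorded)    # bytes differ, digest differs
```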

What It Looks Like in Practice

CLI: Block Pattern Triggers Approval

$ hermes chat
> List my Vultr instances

⚠️ Dangerous command detected: Vultr destructive instance/snapshot command
    vultr instance list

  [o]nce    — allow this one time
  [s]ession — allow for this session
  [a]lways  — always allow this pattern
  [d]eny    — block (default)

> s
✓ Approved for this session.

The same command later in the session runs automatically — no prompt.

Gateway (Telegram): Async Approval

User: List my Vultr instances
🤖 This command requires approval:
⚠️ Vultr destructive instance/snapshot command
Command: vultr instance list
Reply with /approve or /deny

User: /approve
✓ Approved for this session.

Allow Pattern: Read-Only Commands Bypass Approval

$ hermes chat
> Show my Vultr account info
(vultr account info — runs immediately, no prompt)

The allow pattern \bvultr\s+(account\s+info|instance\s+list|...)\b is checked first, so the command is exempt even though \bvultr\b would otherwise trigger the block pattern.
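
The ordering can be illustrated with plain `re`. This sketch uses only the two alternatives the text names (the elided ones are left out), and `classify` is a hypothetical helper, not part of the plugin.

```python
import re

# Only the two alternatives named above; the elided ones are intentionally omitted.
ALLOW = re.compile(r"\bvultr\s+(account\s+info|instance\s+list)\b")
BLOCK = re.compile(r"\bvultr\b")


def classify(command):
    """Return 'allow', 'block', or 'none', checking the allow pattern first."""
    if ALLOW.search(command):
        return "allow"
    if BLOCK.search(command):
        return "block"
    return "none"
```

With these patterns, `classify("vultr account info")` gives `'allow'`, while `classify("vultr instance delete")` gives `'block'`.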

Plugin Structure

hermes-custom-dangerous-patterns-plugin/
├── plugin.yaml          # Hermes plugin manifest
├── __init__.py          # register(ctx) — injects patterns, monkey-patches detection
├── config.py            # YAML loading, validation, caching
├── patterns.py          # Pattern compilation and allow-pattern matching
├── examples/
│   └── custom-dangerous-patterns.yaml   # Example config
├── README.md            # User-facing documentation
├── SPEC.md              # Design spec
├── LICENSE              # MIT
└── .gitignore

Module Responsibilities

| Module | Responsibility |
|---|---|
| __init__.py | Plugin entry point. Defines register(ctx), which injects patterns, monkey-patches detect_dangerous_command() for allow patterns, and monkey-patches check_all_command_guards() for deny patterns. |
| config.py | Loads and validates ~/.hermes/custom-dangerous-patterns.yaml (supports directory mode in v0.2.0). Caches the result per process. Runs integrity checks (SHA-256 hash, protected patterns) at load. |
| patterns.py | Compiles raw config patterns into (compiled_regex, description) tuples. Provides is_allow_pattern() and is_deny_pattern() for the monkey-patches. |

Key Design Decisions

1. Relative Imports (Required)

from .config import load_config   # correct
from config import load_config    # fails — Python can't find top-level module

Hermes loads plugins as hermes_plugins.<slug> packages. Absolute imports against plugin-local modules will raise ModuleNotFoundError.

2. Monkey-Patch for Allow Patterns (Justified)

The built-in detect_dangerous_command() doesn't have an allow-pattern concept. Without the monkey-patch, the plugin could inject block patterns but couldn't exempt commands from them. The alternative — registering an approval hook — wouldn't work because:

  1. pre_approval_request is observer-only (return values ignored)
  2. By the time the hook fires, the command is already flagged as dangerous

The monkey-patch is clean: it wraps the original, checks allow patterns first, and falls through to the original for everything else.
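
A minimal sketch of that wrap-and-fall-through shape is shown below. The return tuple `(is_dangerous, pattern_key, description)` is an assumption about the built-in function, and `original_detect`, `ALLOW_PATTERNS`, and `with_allow_patterns` are stand-ins introduced for illustration.

```python
import functools
import re

ALLOW_PATTERNS = [re.compile(r"\bvultr\s+instance\s+list\b")]  # hypothetical config


def original_detect(command):
    """Stand-in for the built-in detector (assumed return shape)."""
    if re.search(r"\bvultr\b", command):
        return (True, "Vultr command", "Vultr command")
    return (False, None, None)


def with_allow_patterns(detect):
    """Wrap a detector so allow patterns short-circuit to 'not dangerous'."""
    @functools.wraps(detect)
    def wrapper(command):
        if any(p.search(command) for p in ALLOW_PATTERNS):
            return (False, None, None)
        return detect(command)  # everything else falls through unchanged
    return wrapper


# In the plugin this assignment would target the function in tools.approval.
detect_dangerous_command = with_allow_patterns(original_detect)
```

Because the wrapper delegates to the original for every non-allowed command, built-in detection behaves exactly as before for anything the allow list does not cover.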

3. No Hooks Used

The plugin doesn't register any pre_tool_call or post_tool_call hooks. All work happens at startup via pattern injection and monkey-patching. This is simpler and avoids the hook allowlisting ceremony.

4. Graceful Degradation

patterns.py tries to import tools.ansi_strip.strip_ansi for ANSI normalization, falling back to a regex if running outside Hermes. config.py tries hermes_constants.get_hermes_home(), falling back to Path.home() / ".hermes".

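The plugin never crashes the agent on bad config: invalid YAML is logged at WARNING and yields an empty pattern list, and an invalid regex is logged and skipped while the valid patterns still load.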
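
The import fallbacks follow the usual try/except ImportError idiom. In this sketch the fallback regex is an approximation that covers common CSI escape sequences; the exact expression the plugin uses is not shown in this document.

```python
import re
from pathlib import Path

try:
    from tools.ansi_strip import strip_ansi  # available when running inside Hermes
except ImportError:
    # Approximate fallback: strips common CSI sequences such as colour codes.
    _ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

    def strip_ansi(text):
        return _ANSI_RE.sub("", text)

try:
    from hermes_constants import get_hermes_home
except ImportError:
    def get_hermes_home():
        """Fallback location used when running outside Hermes."""
        return Path.home() / ".hermes"
```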

5. Config Caching

The module-level _config_cache in config.py avoids re-reading and re-validating the YAML on repeated calls. The force=True parameter exists only for testing, so config edits made mid-session are silently ignored until Hermes restarts.

Approval Mechanism (Code-Level Detail)

The plugin's injected patterns participate in Hermes's existing approval system. There is no custom persistence logic — it's all handled by tools/approval.py:

Session Storage

_session_approved: dict[str, set]:

  • Keyed by session_key (derived from gateway session or CLI process)
  • Each key maps to a set of pattern_key strings (the human-readable description)
  • Populated by approve_session() when user chooses [s]ession
  • Lives only in process memory — cleared when the session ends
  • Thread-safe via _lock

Permanent Allowlist

_permanent_approved: set:

  • Process-global set of pattern_key strings
  • Populated by approve_permanent() when user chooses [a]lways
  • Persisted to ~/.hermes/config.yaml under command_allowlist: [...]
  • Reloaded at startup
  • Survives restarts; entries are keyed by pattern_key (description string)

Pattern Key Mechanics

# When user approves "vultr instance delete":
# pattern_key = "Vultr destructive instance/snapshot command"  (the description)
# This key is stored in _session_approved[session_key] and/or _permanent_approved
# Future calls to detect_dangerous_command for the same pattern return the same key
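
The two stores described above can be modelled in a few lines. This is a simplified sketch: the function signatures are assumptions, `is_approved` is a hypothetical lookup helper, and persistence to config.yaml is omitted.

```python
import threading

_lock = threading.Lock()
_session_approved = {}        # session_key -> set of pattern_key strings
_permanent_approved = set()   # process-global pattern_key strings


def approve_session(session_key, pattern_key):
    """Record a [s]ession approval for one session only."""
    with _lock:
        _session_approved.setdefault(session_key, set()).add(pattern_key)


def approve_permanent(pattern_key):
    """Record an [a]lways approval (writing to config.yaml is not shown)."""
    with _lock:
        _permanent_approved.add(pattern_key)


def is_approved(session_key, pattern_key):
    """Hypothetical lookup: permanent approvals apply to every session."""
    with _lock:
        return (pattern_key in _permanent_approved
                or pattern_key in _session_approved.get(session_key, set()))


KEY = "Vultr destructive instance/snapshot command"
approve_session("cli-1", KEY)
in_same_session = is_approved("cli-1", KEY)
in_other_session = is_approved("cli-2", KEY)
```

Since the key is the description string, approving one command also approves every other command that matches the same pattern.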

Plugin Hooks

The approval system also exposes:

  • pre_approval_request(command, description, pattern_key, pattern_keys, session_key, surface) — fired when an approval is first requested
  • post_approval_response(..., choice) — fired after user responds with once/session/always/deny/timeout

These are observer-only — return values are ignored; plugins cannot veto.
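
What "observer-only" means in practice is sketched below with a toy dispatcher. The `fire` function and `_observers` registry are hypothetical and do not reflect how Hermes registers or invokes hooks; the session_key and surface values are illustrative.

```python
_observers = {"pre_approval_request": [], "post_approval_response": []}


def fire(hook_name, **payload):
    """Notify every observer; whatever they return is discarded."""
    for callback in _observers[hook_name]:
        callback(**payload)


seen = []


def audit(command, pattern_key, **_rest):
    """Example observer that records the request."""
    seen.append((command, pattern_key))
    return "deny"  # has no effect: the dispatcher ignores return values


_observers["pre_approval_request"].append(audit)

result = fire(
    "pre_approval_request",
    command="vultr instance delete",
    description="Vultr destructive instance/snapshot command",
    pattern_key="Vultr destructive instance/snapshot command",
    pattern_keys=["Vultr destructive instance/snapshot command"],
    session_key="cli-1",
    surface="cli",
)
```

The observer runs and can log or audit, but its `"deny"` return never reaches the approval flow.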

Testing

Unit Tests

| Test | What It Covers |
|---|---|
| Config loading | Valid YAML, invalid YAML, missing file, wrong types |
| Pattern compilation | Valid regex, invalid regex (logged and skipped), edge cases |
| Allow pattern matching | Match, no match, overlapping with block patterns |
| Monkey-patch correctness | Allow pattern exempts command, block pattern triggers |
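
As an illustration of the "invalid regex (logged and skipped)" case, a unit test might look like the sketch below. `compile_patterns` and the logger name are hypothetical stand-ins for the plugin's own code.

```python
import logging
import re
import unittest

logger = logging.getLogger("custom_dangerous_patterns")  # hypothetical logger name


def compile_patterns(entries):
    """Stand-in compiler: returns (compiled_regex, description) tuples."""
    compiled = []
    for entry in entries:
        try:
            compiled.append((re.compile(entry["pattern"]), entry["description"]))
        except re.error as exc:
            logger.warning("Skipping invalid pattern %r: %s", entry["pattern"], exc)
    return compiled


class PatternCompilationTest(unittest.TestCase):
    def test_invalid_regex_is_logged_and_skipped(self):
        entries = [
            {"pattern": r"\bvultr\b", "description": "valid"},
            {"pattern": r"(unclosed", "description": "invalid"},
        ]
        with self.assertLogs(logger, level="WARNING") as captured:
            result = compile_patterns(entries)
        self.assertEqual([desc for _, desc in result], ["valid"])
        self.assertEqual(len(captured.output), 1)


# Run the case programmatically so the sketch works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PatternCompilationTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```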

Integration Tests

| Test | What It Covers |
|---|---|
| DANGEROUS_PATTERNS injection | Mock the list, verify custom patterns are appended at register() |
| detect_dangerous_command monkey-patch | Mock the function, verify allow patterns suppress detection |
| Approval flow trigger | Verify unmatched commands still hit the approval prompt |
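
The "mock the list" approach for the injection test can be sketched with `unittest.mock.patch.object`, which swaps the list for the duration of the test and restores it afterwards. The `approval` namespace and this one-line `register` are stand-ins, not the plugin's real objects.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for tools.approval with one pre-existing built-in entry.
approval = SimpleNamespace(DANGEROUS_PATTERNS=[("builtin", "Built-in pattern")])


def register(ctx):
    """Hypothetical minimal register(): appends a single custom pattern."""
    approval.DANGEROUS_PATTERNS.append((r"\bvultr\b", "Vultr command"))


# Replace the list with an empty one so the test sees only what register() adds.
with mock.patch.object(approval, "DANGEROUS_PATTERNS", []):
    register(ctx=None)
    injected = list(approval.DANGEROUS_PATTERNS)

# Outside the context manager the original list is back, untouched.
restored = approval.DANGEROUS_PATTERNS
```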

Manual Test Procedure

  1. Install plugin, create config with a vultr block pattern and an allow pattern for read-only commands such as vultr instance list
  2. hermes chat → ask to run vultr instance list → should run without prompt (allow pattern)
  3. hermes chat → ask to run vultr instance delete → should prompt for approval
  4. Approve with "session" → run again → should be auto-approved
  5. Test gateway: send command via Telegram → should get /approve prompt

Test Pattern Collection

The plugin ships with safe test patterns in examples/test-patterns.yaml (all enabled: false, group: testing):

- pattern: '\becho\s+["'']this\s+is\s+dangerous["'']'
  description: '[TEST] Echo with danger text'
  enabled: false
  group: testing

- pattern: '\brm\s+-rf\s+/tmp/test_\w+\b'
  description: '[TEST] Scoped rm in /tmp'
  enabled: false
  group: testing

- pattern: '\bDROP\s+TABLE\s+test_\w+\b'
  description: '[TEST] Scoped DROP on test tables'
  enabled: false
  group: testing

These are deliberately safe: file operations scoped to /tmp/, database operations on test_ prefixed tables only. Use them to exercise the approval prompt without real dangerous commands.
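
The scoping can be checked directly with `re`. In this sketch the patterns are compiled without flags, which is an assumption about how Hermes compiles them, and `matches` is a hypothetical helper.

```python
import re

# The three test patterns from the YAML above, as Python raw strings.
TEST_PATTERNS = {
    "echo": r'''\becho\s+["']this\s+is\s+dangerous["']''',
    "rm": r"\brm\s+-rf\s+/tmp/test_\w+\b",
    "drop": r"\bDROP\s+TABLE\s+test_\w+\b",
}
COMPILED = {name: re.compile(pattern) for name, pattern in TEST_PATTERNS.items()}


def matches(name, command):
    """True when the named test pattern matches anywhere in the command."""
    return COMPILED[name].search(command) is not None
```

For instance, `matches("rm", "rm -rf /tmp/test_build")` is true while `matches("rm", "rm -rf /home/user")` is false, so only the scoped form reaches the approval prompt.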

Edge Cases

| Scenario | Behavior |
|---|---|
| Config file missing | Plugin loads silently, no patterns injected, log message at INFO |
| Config file invalid YAML | Log WARNING, plugin loads with empty pattern list |
| Invalid regex in pattern | Log WARNING for that pattern, skip it, load valid ones |
| Pattern matches but allow also matches | Allow wins, no prompt |
| Deny pattern match | Blocked immediately, no prompt |
| Config changed since last session | WARNING logged with old/new pattern counts |
| Protected pattern missing/modified | CRITICAL warning logged at startup |
| Allow pattern shadows built-in pattern | WARNING logged with details |
| --yolo mode | Block patterns (custom + built-in) bypassed. Deny patterns still block because they are checked outside the original guard function. |
| approvals.mode: off | Block patterns bypassed. Deny patterns still block because they are checked outside the original guard function. |
| approvals.mode: smart | Custom patterns assessed by auxiliary LLM |
| Cron session + cron_mode: deny | Custom patterns blocked in cron |
| Container backend (docker, etc.) | All approval checks skipped (container is sandboxed) |
| command_allowlist "always" choice | Persisted to config.yaml, survives restarts |
| Plugin loads after approval.py | Patterns not injected (import order dependency: plugins load before tools) |
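
The reason deny patterns survive --yolo and approvals.mode: off is that the wrapper tests them before the original guard function runs at all. The sketch below illustrates this; the guard's signature and dict return value are assumptions, and `original_guards`, `DENY_PATTERNS`, and `with_deny_patterns` are stand-ins.

```python
import re

DENY_PATTERNS = [
    (re.compile(r"\bvultr\s+instance\s+delete\b"), "Vultr instance delete"),  # hypothetical
]


def original_guards(command, yolo=False):
    """Stand-in for the built-in guard (assumed shape): yolo skips every check."""
    if yolo:
        return {"approved": True, "reason": "yolo"}
    return {"approved": False, "reason": "needs approval"}


def with_deny_patterns(guards):
    """Wrap the guard so deny patterns are tested before it is ever called."""
    def wrapper(command, *args, **kwargs):
        for regex, description in DENY_PATTERNS:
            if regex.search(command):
                return {"approved": False, "reason": f"denied: {description}"}
        return guards(command, *args, **kwargs)
    return wrapper


check_all_command_guards = with_deny_patterns(original_guards)
```

With this wrapper, a denied command is rejected even when `yolo=True`, while any other command still reaches the original guard and its bypass logic.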