HERMES meets PAPERCLIP — the coding agent that never forgets, never stops, and never asks twice.
Knowledge graph memory | Deployment orchestration | Service health monitoring | Adaptive throttling | Web UI
Hermes represents the swift, articulate messenger. The Paperclip Maximizer represents the theoretical model of absolute, relentless optimization toward a target goal.
custom-pi is a premium engineering extension suite for the Pi Coding Agent. It equips the host agent with persistent context recall, multi-agent wave orchestration, safe system execution tooling, and full operational autonomy — the ability to proactively monitor, react to external events, and manage the entire development lifecycle without continuous user intervention.
- Knowledge Graph Memory: SQLite-backed triplet store (Subject→Predicate→Object) with confidence scoring, TTL-based pruning, and automatic extraction from conversations and tool outputs.
- Tiered Context Recall: Intent classification decides whether to search FTS5 chat history, the knowledge graph, or system state — with cascading fallback.
- DAG Swarms: Multi-agent pipelines (Researcher, Coder, Reviewer) running in parallel to prevent single-agent execution dead-ends.
- Deployment Orchestration: Stateful CI/CD pipeline management — PR → build → tests → staging → smoke tests → production — with automatic rollback on failure.
- Webhook Ingestion: Receive events from Sentry, Datadog, GitHub, or custom CI. LLM-parsed into failure triplets with proactive triage when 3+ related errors occur.
- Service Health Monitoring: Periodic endpoint checks tracking latency, jitter, and consecutive failures. Contextual advisories modify planning based on service health.
- Adaptive Throttling: Rate limit tracking with exponential backoff circuit breaker. Resource-aware task scoring adjusts parallelism based on CPU/memory load.
- 35+ Built-in Tools: Full-featured OS, browser, LSP, AST-grep, email, cryptographic vault, SSH, and social posting integration.
- Dual Dashboards: Stream reasoning and execution logs in real-time through an interactive fullscreen TUI or a React-based web dashboard.
- Secure Sandbox: Enforced user approval gates, AES-256 encrypted configuration storage, and isolated custom plugins execution.
The workflow diagram below illustrates how custom-pi coordinates task completion through the CEO Orchestrator, the subagent swarm, and the diagnostic verification layer:
graph TD
Goal[User Input / Goal] --> CEO[CEO Orchestrator]
CEO --> Config[DAG Planner / dag-config.yaml]
Config --> Swarm{DAG Swarm Execution}
Swarm -->|Task A| Res[Researcher Agent]
Swarm -->|Task B| Cod[Coder Agent]
Res -.->|Analyzed Data| Cod
Cod --> Rev[Reviewer Agent]
Rev --> Test[Test Suite / LSP Linting]
Test -->|Validation Failed| CEO
Test -->|Validation Passed| Deliver[Final Deliverable]
subgraph Memory System
KG[Knowledge Graph<br/>Triplet Store]
FTS[FTS5 Chat History]
Auto[Auto-Learning<br/>LLM Extraction]
Prune[TTL Pruning<br/>Redundancy Merge]
end
subgraph Operational Autonomy
WH[Webhook Listener<br/>Sentry / Datadog / CI]
SM[Service Health<br/>Latency / Jitter]
RL[Rate Limiter<br/>Circuit Breaker]
RM[Resource Monitor<br/>CPU / RAM]
end
CEO --> KG
CEO --> FTS
Auto --> KG
Prune --> KG
WH --> CEO
SM --> CEO
RL --> CEO
RM --> CEO
Install the package globally using npm:
npm install -g custom-piTo run browser automation tasks, install Playwright's Chromium binary:
npx playwright install chromiumFor full IDE code intelligence support, make sure you have appropriate language servers installed locally:
npm install -g typescript-language-server
pip install pyrightStart the terminal dashboard interface:
custom-piStart the React web server (served locally at http://localhost:4321):
custom-pi-web| Objective | Legacy Limitation | custom-pi Resolution |
|---|---|---|
| Context Retention | Agents lose state and forget decisions across sessions. | Knowledge graph (SQLite triplet store) with confidence scoring, weighted retrieval, and multi-hop graph traversal. |
| Knowledge Synthesis | Memory is manual; no automatic extraction from conversations. | Auto-learning LLM pipeline extracts Subject→Predicate→Object triplets from tool outputs and flushes them on a 5-min cron + shutdown. |
| Memory Maintenance | Stale or redundant facts accumulate and degrade retrieval quality. | TTL-based pruning per entity type (7d–5y) plus redundancy merging via similarity scoring; logged for manual review. |
| Context Retrieval | Single flat search — no strategy for different query types. | Intent classification (conversational / knowledge / system) drives tiered recall: FTS5 → triplets → state, with cascading fallback. |
| Problem Solving | Single agents get stuck in recursive bugs or loops. | DAG swarms that delegate specialized roles (Research, Coding, Reviewing) concurrently. |
| Log Analysis | Errors discovered only when the user reports them. | Webhook ingestion + LLM log parsing creates failure triplets and incidents; 3+ related failures trigger proactive triage. |
| Deployment | Manual CI/CD tracking with no agent oversight. | Stateful pipeline orchestrator tracks 6 stages (PR→build→tests→staging→smoke→prod) with auto-rollback on failure. |
| Rate Limits | API quota exhaustion causes silent failures. | Circuit breaker tracks X-RateLimit-Remaining headers; exponential backoff (1s·2ⁿ) defers calls until reset. |
| Resource Awareness | No visibility into host load when scheduling tasks. | Host metrics (/proc/stat, /proc/meminfo) feed a penalty scoring function that adjusts task parallelism dynamically. |
| Service Health | No awareness of external service degradation. | Health monitor pings endpoints, tracks latency/jitter, and injects contextual warnings into planning. |
| Interface Telemetry | Plain terminal scrolling output makes tracking swarms difficult. | Double-buffered fullscreen TUI alongside a WebSocket-driven React dashboard. |
| Credentials Storage | Storing API keys in plaintext files or config variables. | AES-256-GCM encrypted vault with programmatic key generation. |
| Code Understanding | Regex-based file search lacks structural and type context. | LSP client integration with ast-grep queries for 11+ languages. |
| Web Access | Incapable of logging in or interacting with complex JavaScript apps. | Playwright integration for screenshot capture, selector querying, and action execution. |
memory_store/memory_search/memory_edit: Create, retrieve, and update TF-IDF memory vectors./triplets: Query the knowledge graph — list all triplets or drill into an entity's connections.vault_set/vault_get/vault_delete/vault_list/vault_import: Cryptographically secure storage mapped to AES-256-GCM encryption.
web_search: Multi-tier search fallback chain (DuckDuckGo → Algolia HackerNews → Wikipedia).web_fetch: Programmatic page fetching with HTML-to-Markdown parsing and automatic user-agent rotation.internal_url: URL router handling internal schemas (memory://,vault://,local://,issue://,pr://,skill://,rule://).
browser: Navigate, type, click, screenshot (base64 PNG), and extract node content via headless Chromium.ssh_exec: Execution of remote host commands with secure temporary SSH key and password management.
lsp: Programmatic interface for language servers supporting hover definitions, symbols, rename, and diagnostic arrays.ast_grep: Structural syntax tree search across 11 programming languages.hashline_edit: Content-hash validated code editing to ensure merge safety and prevent corrupt patches.
github: Full integration with the GitHub API for issues, pull request tracking, and branch-specific code search.send_email: Compose and dispatch emails using the Gmail API via OAuth 2.0 Device Flow authorization.
post_to_reddit: Submit post payloads via Reddit OAuth password grant.post_to_bluesky: AT Protocol client helper for publishing text updates.post_to_discord: Webhook integration for system logs broadcasting.post_to_telegram: Bot API integration for secure command updates.
generate_image: Automated image rendering using DALL-E 3, Gemini, or Grok based on active vault credentials.text_to_speech: Edge-tts CLI client returning audio base64 buffers.render_mermaid: Compiles Mermaid diagrams into SVG vectors with ASCII fallbacks.
plan: Formulate, track, and complete multi-step checklists.session: Checkpoint serialization containing execution states, tools, and variables.plugin: Register, configure, and isolate dynamic JavaScript extensions.todo_write: Structured task lists with phased action plans.
Configure parallel multi-agent workflows using ~/.pi/agent/dag-config.yaml:
version: 1
mode: pipeline
pipeline_count: 3
agents:
- id: researcher
role: Research specifications and codebase structure
tools: [web_search, web_fetch, memory_search]
waits_for: []
- id: coder
role: Implement software features and bug fixes
tools: [write, edit, bash, glob, grep]
waits_for: [researcher]
- id: reviewer
role: Validate type checks, tests, and run compiler
tools: [bash, glob, grep, lsp]
waits_for: [coder]- pipeline: Iterative loop. Reviewer validates output, and the CEO Orchestrator routes feedback to the Coder/Researcher if defects are found.
- parallel: Executes all agents concurrently when task definitions do not overlap.
- sequential: Strict single-lane dependency execution.
custom-pi runs a webhook listener at POST /api/webhooks/:source that accepts events from Sentry, Datadog, GitHub Actions, or any custom CI. Each event is:
- Normalized into a standard format by
listener.js. - Parsed by an LLM to extract structured failure data: component, error code, severity.
- Persisted as a failure triplet in the knowledge graph, tagged with severity.
- Aggregated into incidents — if 3+ related failures occur in a short window, a triage task is automatically raised.
The service-health-monitor periodically checks external endpoints, tracking:
- Latency (ms) and jitter (variance between checks)
- Consecutive failures for flapping detection
- Status classification: excellent (<50ms), good, degraded, slow, critical (>2s)
Health data feeds into the planner: "PostgreSQL pool latency is 150ms — prioritizing reads over writes for 2 hours."
The deployment-orchestrator manages stateful pipelines:
- PR Created → Build Started → Unit Tests → Staging Deploy → Smoke Tests → Production Deploy
- Each stage runs verification gates (idempotent bash scripts checking critical invariants).
- On failure, the orchestrator auto-rolls back to the last stable commit SHA.
All tool-calling wrappers use a circuit breaker pattern:
- Tracks
X-RateLimit-Remainingfrom API responses. - When remaining < 5, flips a
RATE_LIMIT_BREACHflag and applies exponential backoff (delay =1s × 2ⁿ, capped at 60s). - Subsequent tool calls wait for the delay or defer to a lower-priority queue.
The resource-monitor reads /proc/stat and /proc/meminfo every check cycle:
- CPU usage % and memory usage %
- Load average compared to CPU core count
- Task priority is penalized:
Score = Importance / (Resources × Cost)— at 90% CPU, parallel subagent cost skyrockets, favoring sequential execution.
custom-pi maintains a SQLite knowledge graph with the schema (Subject) → [Predicate] → (Object), each triplet having:
- Confidence score (0.0–1.0) — how reliable the fact is
- Entity types: tool, file, function, class, concept, dependency, setting, person
- TTL by type: tool data (7d), files (90d), functions/classes (180d), concepts/dependencies (365d), people (5y)
Commands:
/triplets — List top 20 knowledge graph entries
/triplets <entity> — Drill into an entity's connections
Legacy vector memory using cosine similarity with recency decay:
Every significant tool output is queued for LLM-based triplet extraction. The triplet-generator:
- Feeds raw text to an LLM with a structured extraction prompt.
- Validates the JSON output against the TripletRecord schema.
- Deduplicates and upserts into the knowledge graph (min confidence: 0.4).
A daily cron job runs:
- Staleness pruning: Deletes triplets past their TTL.
- Redundancy merging: Finds near-duplicate triplets (same subject/object, similar predicate), keeps the highest confidence entry.
- Prune logging: All actions are logged to
prune-log.jsonfor manual review.
The React dashboard provides real-time visibility into all subsystems:
| Tab | View |
|---|---|
| Chat | Real-time conversation with streaming agent output |
| Dashboard | System telemetry, budget, vault, MCP servers, work products |
| Memory | TF-IDF semantic memory search and storage |
| Knowledge Graph | Triplet table with confidence slider, entity drill-down |
| Pipeline | Deployment pipeline status across all stages |
| Health | Service health, host CPU/RAM, rate limit status |
| Sub-Agents | Swarm agent logs, tool calls, status |
| Secrets Vault | Encrypted credential management |
| Budget | Token/cost tracking with daily and per-session limits |
All assets, extensions, and configurations are synchronized locally in your home directory:
~/.pi/agent/
├── SOUL.md # Identity definition injected on turn start
├── SYSTEM.md # Core programming and formatting rules
├── settings.json # Model profiles and active configurations
├── models.json # API keys and provider targets
├── session-state.db # SQLite database (messages, triplets, health, rate limits)
├── session-state.json # Periodic state snapshots (every 10 turns)
├── dag-config.yaml # Active swarm configurations
├── mcp-servers.json # MCP server definition list
├── lsp-servers.json # LSP server executables mapping
├── prune-log.json # Triplet pruning audit log
├── checkpoints/ # Directory containing past session snapshots
├── costs/ # Log files tracking token usage and cost bounds
├── work-products/ # Ledger mapping created and modified files
├── webhooks/ # Incoming webhook event storage
├── plugins/ # Directory containing dynamic script extensions
├── .vault/ # Secure storage directory
│ ├── master.key # 32-byte hex master key
│ └── vault.json # Encrypted key-value database
└── web/ # Distribution containing Vite client assets
Verify the system by running the full unit and integration test suites:
npm testCheck TypeScript type compiler compliance:
npx tsc --noEmitMIT - Licensed under the MIT License. Free to use, modify, and distribute.
Hermes speed + Paperclip obsession = custom-pi
One agent to configure them all. And in the terminal bind them.
