
Demo: aiosandbox + Hermes + AgentKeys on ESP32 hardware #103

@hanwencheng

Summary

End-to-end v0 demo for the AgentKeys hardware-vendor wedge. An ESP32 device, configured with one URL and one actor token, talks to a cloud-hosted agent-infra/sandbox running a Hermes agent runtime alongside agentkeys-daemon, which supplies a mock user-memory Markdown blob from S3 for injection at agent boot. As a result, the device sounds personalized from the first conversation.

This is the v0 buyer-pitch demo called for in the office-hours design doc §9.6 Storyboard, scoped to a single device, a single sandbox, and a single mock memory blob.

Full plan: docs/spec/plans/issue-102-aiosandbox-hermes-esp32-demo.md

Why now

The office-hours diagnostic surfaced that the next critical step is a working demo a vendor can SEE — not more architecture docs. Approach D (AgentKeys-native sandbox) was chosen specifically because vendor integration friction collapses from "embed SDK in firmware" (2 months) to "point your device at a URL" (1 day). This issue ships that 1-day vendor onboarding story end-to-end so we can take it to FoloToy / Ropet / BubblePal.

Scope

IN scope (v0):

  • One ESP32 board talking to one cloud-hosted sandbox
  • Mock memory injected from one S3 MD blob at agent boot
  • Single hardcoded actor (O_demo_001)
  • Text-mode interaction (voice mode deferred to follow-up)
  • Subsidized LLM (DashScope Qwen-Plus default; OpenRouter / OpenAI configurable)
  • Public demo URL https://demo.aiosandbox.litentry.org (or chosen subdomain)
  • One-command idempotent setup script per CLAUDE.md "Idempotent remote-setup rule"
  • Demo runbook for live walk-throughs

NOT in scope (deferred):

  • Voice STT/TTS pipeline
  • Real agentkeys-worker-memory integration (demo uses static S3 blob)
  • Cross-vendor memory portability
  • Multi-tenant sandbox orchestration
  • Pricing / billing / activation flow
  • Cap-token enforcement on the memory read path
  • Parent-control / consumer mobile app
  • Audit anchoring to Heima (off-chain audit only)
  • Real-time revocation UI

Architecture (summary; full diagram in plan)

ESP32 → HTTPS POST /v1/chat → nginx → hermes-runtime → LLM → response → ESP32
                                  ↑
                                  uses memory injected at agent boot via
                                  agentkeys-daemon GET /v1/memory/{actor}/profile.md
                                  → S3 (bots/<actor>/memory/profile.md)
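To make the device-side leg of the diagram concrete, here is a minimal sketch of the raw request an ESP32 might send for one chat turn. The `Authorization: Bearer` header is an assumption (auth is still an open question), the host and token values are placeholders, and the `query` field follows the curl check in implementation step 3. It is an illustration of the wire shape, not the firmware's actual code.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build an HTTP/1.1 POST /v1/chat request.
/// `host` and `token` are hypothetical inputs; the bearer header is an
/// assumed auth scheme, not something the issue has settled.
fn build_chat_request(host: &str, token: &str, query: &str) -> String {
    let body = format!("{{\"query\":\"{}\"}}", escape_json(query));
    format!(
        "POST /v1/chat HTTP/1.1\r\n\
         Host: {host}\r\n\
         Authorization: Bearer {token}\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {len}\r\n\
         Connection: close\r\n\
         \r\n\
         {body}",
        host = host,
        token = token,
        len = body.len(), // byte length, as Content-Length requires
        body = body,
    )
}
```

The request would still need to travel over TLS to reach the nginx front end; that layer is omitted here.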

Reuses canonical AgentKeys primitives from docs/arch.md:

  • agent-infra/sandbox (already arch.md's chosen agent runtime substrate)
  • HDKD actor model (single O_demo_001 for v0)
  • Memory bucket layout bots/<actor_omni_hex>/memory/<path>
  • supervisord lifecycle (per Round 13 runtime probe)
  • agentkeys-daemon extended with one new GET endpoint
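The memory bucket layout and the new daemon endpoint can be sketched as two small path helpers. The function names and the validation rules (ASCII alphanumerics plus underscore for the actor, no `.` or `..` segments in the path) are assumptions added for illustration; only the `bots/<actor>/memory/<path>` layout and the `/v1/memory/{actor}/profile.md` route come from this issue.

```rust
/// Map an actor id and a relative path to an object key using the
/// `bots/<actor>/memory/<path>` layout. The checks below are assumed
/// hardening so a crafted input cannot escape the actor's prefix.
fn memory_key(actor: &str, rel_path: &str) -> Result<String, String> {
    let actor_ok = !actor.is_empty()
        && actor.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if !actor_ok {
        return Err(format!("invalid actor id: {:?}", actor));
    }
    let path_ok = !rel_path.is_empty()
        && rel_path
            .split('/')
            .all(|seg| !seg.is_empty() && seg != "." && seg != "..");
    if !path_ok {
        return Err(format!("invalid memory path: {:?}", rel_path));
    }
    Ok(format!("bots/{}/memory/{}", actor, rel_path))
}

/// Daemon route for the profile document, as shown in the architecture summary.
fn profile_route(actor: &str) -> String {
    format!("/v1/memory/{}/profile.md", actor)
}
```

For the v0 actor, `memory_key("O_demo_001", "profile.md")` yields `bots/O_demo_001/memory/profile.md`, matching the prefix listed in step 6 of the implementation order.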

Implementation order (12 steps)

| # | Deliverable | Verify by |
|---|---|---|
| 1 | Mock memory MD fixture | File exists, markdown-lint passes |
| 2 | agentkeys-hermes-runtime crate scaffold + /v1/chat stub | cargo test -p agentkeys-hermes-runtime |
| 3 | Hook hermes-runtime to DashScope Qwen-Plus | curl localhost:8090/v1/chat -d '{"query":"hi"}' returns LLM response |
| 4 | Add /v1/memory/{actor}/profile.md to agentkeys-daemon (test fixture, no S3 yet) | curl returns fixture |
| 5 | Hermes-runtime fetches memory from daemon at startup; injects into system prompt | Chat response references profile facts (Kevin, Chengdu, spicy) |
| 6 | Provision S3 bucket + upload fixture via setup script | aws s3 ls s3://agentkeys-demo-memory/bots/O_demo_001/memory/ |
| 7 | agentkeys-daemon reads from S3 (not hardcoded) | Swap S3 file, restart, chat reflects new content |
| 8 | Build extended sandbox Dockerfile with supervisord configs | docker run agentkeys/aiosandbox-demo:latest boots clean |
| 9 | Deploy to demo host with TLS | curl https://demo.aiosandbox.litentry.org/v1/chat works |
| 10 | ESP32 firmware (text mode); flash to board | Button press → text query → response on serial |
| 11 | End-to-end ESP32 → sandbox → LLM → memory-aware response | Live demo |
| 12 | docs/demo-aiosandbox-runbook.md | Operator can re-run from doc alone |
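Step 5 above (inject the fetched memory into the system prompt) could look roughly like the following. The section heading text, the size cap, and the function names are all hypothetical; the issue does not specify a prompt format, so treat this as one plausible shape.

```rust
/// Hypothetical upper bound on injected memory, to keep the prompt small.
const MAX_MEMORY_BYTES: usize = 8 * 1024;

/// Cut `s` to at most `max_bytes` without splitting a UTF-8 character.
fn truncate_on_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Combine a base system prompt with the Markdown memory blob fetched at
/// startup. If the fetch failed or the blob is blank, fall back to the base
/// prompt so the agent still boots (an assumed policy, not a stated one).
fn build_system_prompt(base: &str, memory_md: Option<&str>) -> String {
    match memory_md.map(str::trim).filter(|m| !m.is_empty()) {
        None => base.to_string(),
        Some(memory) => {
            let memory = truncate_on_char_boundary(memory, MAX_MEMORY_BYTES);
            format!("{}\n\n## What you know about this user\n{}\n", base, memory)
        }
    }
}
```

Because the prompt is assembled once at startup in this sketch, a swapped S3 blob would only take effect after a restart, which is consistent with the verification note for step 7.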

Acceptance criteria

A reviewer takes the runbook, runs bash scripts/setup-demo-aiosandbox.sh on a fresh demo host, flashes the ESP32 firmware, and within 15 minutes can:

  • Send a text query from the ESP32
  • Receive a response that demonstrably reflects the mock memory content
  • Swap the S3 memory blob and see the next response reflect the new content
  • Read the demo runbook to understand every command they ran
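The "demonstrably reflects the mock memory content" criterion could be smoke-tested with a simple substring check like the one below. This is a deliberately crude sketch: the helper name is hypothetical, LLM output is non-deterministic, and a case-insensitive substring match will miss paraphrases, so it is at best a first-pass signal for the reviewer.

```rust
/// Return the expected facts that do NOT appear in the response
/// (case-insensitive substring match). An empty result means every
/// fact was mentioned at least once.
fn missing_facts<'a>(response: &str, facts: &[&'a str]) -> Vec<&'a str> {
    let haystack = response.to_lowercase();
    facts
        .iter()
        .copied()
        .filter(|f| !haystack.contains(&f.to_lowercase()))
        .collect()
}
```

With the fixture facts named in step 5 (Kevin, Chengdu, spicy), a reviewer could run the same check before and after swapping the S3 blob, using a second fact list for the replacement content.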

Open questions (resolve before step 3)

  1. "Hermes" naming — confirm vs. OSS conflict; potential rename to agentkeys-companion-runtime or agentkeys-shell
  2. LLM provider — DashScope (default, China-friendly) vs. OpenRouter vs. Claude/OpenAI
  3. Demo host — reuse Heima broker host or spin up separate VM (recommend separate)
  4. Voice mode — defer to follow-up issue (recommend yes; text demo validates the pitch)
  5. ESP32 board — WROOM-32 ($5-8) for v0; ESP32-S3 for voice follow-up
  6. Auth — simple static bearer token tied to actor_omni (recommended for v0 demo)
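If question 6 resolves in favor of a static bearer token tied to the actor, the server-side check might be sketched as follows. The token table, the function names, and the use of the `Authorization: Bearer` header are assumptions for illustration; nothing here is a decided design.

```rust
/// Compare two byte strings without an early exit on the first mismatch.
/// Note: the length check still reveals whether lengths differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Resolve an `Authorization` header value to an actor id.
/// `tokens` is a hypothetical static table of (token, actor) pairs;
/// for v0 it would hold a single entry.
fn authorize<'a>(header: Option<&str>, tokens: &[(&str, &'a str)]) -> Option<&'a str> {
    let presented = header?.strip_prefix("Bearer ")?.trim();
    if presented.is_empty() {
        return None;
    }
    tokens
        .iter()
        .find(|(tok, _)| constant_time_eq(tok.as_bytes(), presented.as_bytes()))
        .map(|(_, actor)| *actor)
}
```

Returning the actor id, instead of a bare yes/no, lets the caller pick the matching memory prefix without trusting any actor value supplied by the device.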

Effort estimate

  • ~3 weeks for v0 working demo
  • Steps 1-7 (Rust + S3 + memory injection): ~1.5 weeks
  • Steps 8-9 (Dockerfile + deploy): ~3 days
  • Steps 10-11 (ESP32 + end-to-end): ~1 week
  • Step 12 (runbook): ~2 days

Dependencies

  • agent-infra/sandbox (stock image, no fork)
  • AgentKeys Stage 7+ stack (extends one endpoint on agentkeys-daemon)
  • AWS S3 (existing agentkeys-admin profile, us-east-1)
  • DashScope or OpenRouter LLM credit (~$10/mo for demo usage)
  • ESP32 hardware (~$5-15 off-the-shelf)
  • Demo host (2 vCPU / 4GB VM)
  • Let's Encrypt TLS cert

Labels: area/firmware (ESP32 firmware, device-side code, MCU work); area/mcp (MCP server, MCP tool integration, MCP protocol work)
