Skip to content

fix(hermes): add Discord facade#3293

Merged
ericksoa merged 13 commits into
mainfrom
aerickson/pr-3238-main-discord-facade-replay
May 9, 2026
Merged

fix(hermes): add Discord facade#3293
ericksoa merged 13 commits into
mainfrom
aerickson/pr-3238-main-discord-facade-replay

Conversation

@ericksoa
Copy link
Copy Markdown
Contributor

@ericksoa ericksoa commented May 8, 2026

Summary

Commit shape

Local validation

  • python3 -m py_compile agents/hermes/discord-facade.py agents/hermes/decode-proxy.py
  • bash -n agents/hermes/start.sh test/e2e/test-hermes-discord-e2e.sh
  • npm run build:cli
  • npx vitest run test/hermes-discord-facade.test.ts test/generate-hermes-config.test.ts src/lib/agent/runtime.test.ts test/sandbox-init.test.ts test/sandbox-provisioning.test.ts test/policies.test.ts
  • git diff --check

Original PRs: #3153, #3238

Signed-off-by: Aaron Erickson aerickson@nvidia.com

Summary by CodeRabbit

  • New Features

    • Local Discord facade added to emulate Discord Gateway+REST in the Hermes sandbox; runtime now exposes Discord proxy/facade URLs and augments PYTHONPATH for preload routing.
    • Sandbox policies updated to allow facade preload files and required Discord network routes.
  • Tests

    • Extensive unit and E2E coverage added for gateway auth boundary, REST forwarding/polling, interaction signature handling/token localization, preload routing, startup/init, and sandbox policy validation.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 283b1c01-98ba-4bb1-9f79-aab21b313390

📥 Commits

Reviewing files that changed from the base of the PR and between f7550cb and e6e5ead.

📒 Files selected for processing (1)
  • test/e2e/test-hermes-discord-e2e.sh

📝 Walkthrough

Walkthrough

This PR adds a local Discord REST/Gateway facade service to Hermes, a preload that rewrites aiohttp Discord traffic to the facade, Dockerfile and sandbox policy updates, start.sh wiring to run the facade and export Discord envs/PYTHONPATH, runtime recovery updates, and tests covering facade behavior and policies.

Changes

Discord Facade for Hermes Gateway Emulation

Layer / File(s) Summary
Data Contracts & Configuration
agents/hermes/config/messaging-config.ts, agents/hermes/Dockerfile
Adds internal Discord env constants; buildMessagingEnvLines() emits DISCORD_PROXY, NEMOCLAW_DISCORD_FACADE_URL, and NEMOCLAW_DISCORD_GUILD_IDS when discord is enabled; Dockerfile docs updated.
Build & Security Policy
agents/hermes/Dockerfile, agents/hermes/policy-additions.yaml, agents/hermes/policy-permissive.yaml
Installs nemoclaw-discord-facade and discord-preload/ with permissions; policies allowlist preload path and add scoped Discord network routes (gateway, commands, messages, reactions, webhooks) with explicit tls attributes where specified.
Preload & Decode-Proxy Clarifications
agents/hermes/discord-preload/sitecustomize.py, agents/hermes/decode-proxy.py
sitecustomize.py monkey-patches aiohttp to rewrite Discord REST (/api*) and gateway WebSocket (gateway.discord.gg) URLs to the local facade; decode-proxy.py docstring clarifies it does not inspect/modify WebSocket frames.
Sandbox Startup
agents/hermes/start.sh
Adds start_discord_facade(), uses Hermes venv python for proxy/facade, exports NEMOCLAW_DISCORD_FACADE_URL, prepends preload to PYTHONPATH, injects DISCORD_PROXY/facade into proxy env, launches facade before Hermes gateway, and tracks facade PID.
Facade Core Implementation
agents/hermes/discord-facade.py
aiohttp server with /gateway WebSocket handling (IDENTIFY/RESUME enforcement), READY/RESUMED opcodes, session/sequence tracking, broadcast dispatch, and lifecycle management.
REST Routing & Interaction Handling
agents/hermes/discord-facade.py
Handles gateway status and @me, application command storage, interaction webhook signature verification and token localization, webhook/callback token rewriting, forwarding to discord.com with hop-by-hop header filtering and token-redacted logs.
Polling & Normalization
agents/hermes/discord-facade.py
Polling loop and guild/channel discovery, per-channel message and thread polling, message normalization/change detection, reaction-count delta computation, and synthesized gateway events (MESSAGE_/THREAD_ and reaction add/remove).
Optional Tunnel & Main Wiring
agents/hermes/discord-facade.py
Optional cloudflared tunnel support, public interactions listener, interactions endpoint registration, main() wiring, and graceful shutdown.
Runtime Recovery & CLI Wiring
src/lib/agent/runtime.ts, src/lib/agent/runtime.test.ts
hermesGatewayEnvPrefix() includes Discord proxy/facade/PYTHONPATH; recovery command uses venv python, readiness checks for proxy (3129) and facade (3130), log setup, and retry loops; tests updated to assert presence of facade-related env/commands.
Unit Tests: Facade & Preload
test/hermes-discord-facade.test.ts
Vitest tests exercise in-process facade/preload: gateway IDENTIFY validation, REST forwarding, polling→event synthesis, Ed25519 signature verification/token localization, callback token mapping, and aiohttp rewriting.
Unit Tests: Config, Policy & Provisioning
test/generate-hermes-config.test.ts, test/policies.test.ts, test/sandbox-provisioning.test.ts
Validates generated .env contains Discord proxy/facade/guild IDs; policy YAML tests enforce gateway L4 WebSocket shape and scoped REST mutation allowlist; preload path allowlisted in policies.
Integration & E2E Tests
test/e2e/test-hermes-discord-e2e.sh, test/e2e/test-messaging-providers.sh, test/sandbox-init.test.ts
E2E checks for facade /health, gateway protocol (HELLO/READY/heartbeat), token isolation, env/guild propagation, gateway-auth-boundary log confirmation, and start.sh assertions (venv python usage, privileged launch, logs).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

I nibble code beneath the moonlit log,
A rabbit patches paths and guards the proxy bog,
Facade hums local, tokens tucked away,
Polls turn to events and gateways softly play,
Sandbox hops happy — bravo, tiny cog! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(hermes): add Discord facade' directly describes the main change—adding Discord facade functionality to Hermes. It is concise, specific, and clearly summarizes the primary objective.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch aerickson/pr-3238-main-discord-facade-replay

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agents/hermes/discord-facade.py`:
- Around line 160-163: The public tunnel currently exposes the full facade
(routes registered via web.get("/gateway", self.handle_gateway), web.route("*",
"/api/{tail:.*}", self.handle_rest)) which allows _forward_rest() to accept
caller headers (including Authorization) and lets external users drive
bot-authenticated calls via _run_tunnel_command(); restrict the public tunnel to
only expose web.post("/interactions", self.handle_interaction) or instead create
a separate aiohttp/web listener for the public webhook and move
handle_gateway/handle_rest (and any _forward_rest logic) behind an auth gate
that strips/validates Authorization headers before proxying. Update the tunnel
registration logic in _run_tunnel_command and routing setup so only
handle_interaction is published, or implement a new private listener and attach
handle_gateway/handle_rest there with header sanitization.
- Around line 625-634: _handle_interaction_callback currently returns a local
204 for callbacks whose local_token is in self._interaction_tokens, which
discards the bot's real callback instead of forwarding it; update the logic so
that when local_token matches an entry in self._interaction_tokens you map it to
the original Discord token and forward the request upstream via _forward_rest
(using the mapped real token) instead of immediately returning 204, or
alternatively remove the eager ACK in handle_interaction so the bot performs the
initial {"type":5} itself; locate references to _interaction_tokens,
handle_interaction, _handle_interaction_callback, _forward_rest and _read_bytes
and implement the token mapping + forwarding path or the removal of the eager
ACK consistently (also apply the same fix to the duplicated block at lines
649-662).
- Around line 343-382: The MESSAGE_DELETE synthesis in _poll_channel_messages is
producing false deletes because it diffs the cached IDs against only the latest
25 messages; to mitigate, change _poll_channel_messages so it does not
synthesize deletes when the fetched page is saturated (len(messages) == 25) —
i.e., if the page is full, skip the loop that computes previous_ids -
current_ids and emitting MESSAGE_DELETE via dispatch_to_all; keep all other
behavior (normal CREATE/UPDATE/reaction handling and updating
_channel_message_ids and _message_cache) the same so the facade remains accurate
until a proper paginated/tombstone strategy is implemented.

In `@test/hermes-discord-facade.test.ts`:
- Around line 11-25: The test helpers leak parent DISCORD_* and TELEGRAM_*
environment variables into child Python processes; update both the
hasCryptography spawnSync call and the runPython function to use a sanitized
child environment: create sanitizedEnv by copying process.env but removing any
keys starting with "DISCORD_" or "TELEGRAM_", then when invoking spawnSync pass
env: { ...sanitizedEnv, ...env } (for hasCryptography pass sanitizedEnv as the
env arg, and for runPython use sanitizedEnv as the base before layering the
function's env parameter) so tests remain hermetic; reference hasCryptography
and runPython when making this change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6bb60e3e-cd76-4466-ac14-87440d193026

📥 Commits

Reviewing files that changed from the base of the PR and between 30545bb and 6e77d48.

📒 Files selected for processing (17)
  • agents/hermes/Dockerfile
  • agents/hermes/config/messaging-config.ts
  • agents/hermes/decode-proxy.py
  • agents/hermes/discord-facade.py
  • agents/hermes/discord-preload/sitecustomize.py
  • agents/hermes/policy-additions.yaml
  • agents/hermes/policy-permissive.yaml
  • agents/hermes/start.sh
  • src/lib/agent/runtime.test.ts
  • src/lib/agent/runtime.ts
  • test/e2e/test-hermes-discord-e2e.sh
  • test/e2e/test-messaging-providers.sh
  • test/generate-hermes-config.test.ts
  • test/hermes-discord-facade.test.ts
  • test/policies.test.ts
  • test/sandbox-init.test.ts
  • test/sandbox-provisioning.test.ts

Comment thread agents/hermes/discord-facade.py
Comment thread agents/hermes/discord-facade.py
Comment thread agents/hermes/discord-facade.py
Comment thread test/hermes-discord-facade.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
agents/hermes/discord-facade.py (1)

846-858: 💤 Low value

Store reference to background task for proper lifecycle management.

The task created by asyncio.create_task(_capture_url()) is not stored, which means:

  1. Exceptions raised in _capture_url will be silently ignored
  2. The task cannot be awaited or cancelled during shutdown
  3. Risk of garbage collection (though unlikely here due to stdout reference)
♻️ Suggested fix

Return the task from the function and manage it in main():

-    asyncio.create_task(_capture_url())
-    return proc
+    capture_task = asyncio.create_task(_capture_url())
+    return proc, capture_task

Then in main(), store and optionally cancel on shutdown.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/hermes/discord-facade.py` around lines 846 - 858, The background task
started with asyncio.create_task(_capture_url()) inside the function that
currently returns only proc should be preserved and surfaced so it can be
awaited/cancelled and so exceptions aren't lost; change the call site that
creates the task (the asyncio.create_task(_capture_url()) line) to capture the
Task object (e.g., task = asyncio.create_task(_capture_url())), ensure the
coroutine _capture_url has proper try/except logging or attach a done callback
to log exceptions, and return or attach that Task alongside the proc (or return
a tuple (proc, task)) so callers (e.g., main()) can store, await, and cancel the
task during shutdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agents/hermes/discord-facade.py`:
- Line 136: The _interaction_tokens dict in discord-facade.py is unbounded
(tokens are inserted in _localize_interaction_token and never removed); change
it to a bounded/TLL-aware structure and prune expired entries: replace or wrap
self._interaction_tokens with a TTL-aware cache (e.g., a small LRU or TTL cache
keyed by interaction id with 15-minute expiry) or store values as (token,
expiry_timestamp) and in _localize_interaction_token (and any lookup paths)
remove stale entries before inserting/returning; ensure cleanup runs on
insert/lookup to prevent growth and keep behavior identical otherwise.

---

Nitpick comments:
In `@agents/hermes/discord-facade.py`:
- Around line 846-858: The background task started with
asyncio.create_task(_capture_url()) inside the function that currently returns
only proc should be preserved and surfaced so it can be awaited/cancelled and so
exceptions aren't lost; change the call site that creates the task (the
asyncio.create_task(_capture_url()) line) to capture the Task object (e.g., task
= asyncio.create_task(_capture_url())), ensure the coroutine _capture_url has
proper try/except logging or attach a done callback to log exceptions, and
return or attach that Task alongside the proc (or return a tuple (proc, task))
so callers (e.g., main()) can store, await, and cancel the task during shutdown.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b02976f5-5219-44f0-9a11-5bc998d42e18

📥 Commits

Reviewing files that changed from the base of the PR and between 6e77d48 and be9b2b6.

📒 Files selected for processing (2)
  • agents/hermes/discord-facade.py
  • test/hermes-discord-facade.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/hermes-discord-facade.test.ts

Comment thread agents/hermes/discord-facade.py Outdated
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa self-assigned this May 8, 2026
@ericksoa ericksoa added v0.0.38 Release target Integration: Hermes bug Something isn't working Integration: Discord Use this label to identify Discord bot integration issues with NemoClaw. labels May 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Selective E2E Results — ❌ Some jobs failed

Run: 25583784468
Branch: aerickson/pr-3238-main-discord-facade-replay
Requested jobs: hermes-discord-e2e,hermes-e2e,messaging-providers-e2e,token-rotation-e2e,rebuild-hermes-e2e,rebuild-hermes-stale-base-e2e
Summary: 5 passed, 1 failed, 0 skipped

Job Result
hermes-discord-e2e ❌ failure
hermes-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
token-rotation-e2e ✅ success

Failed jobs: hermes-discord-e2e. Check run artifacts for logs.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 9, 2026

Selective E2E Results — ✅ All requested jobs passed

Run: 25587605524
Branch: aerickson/pr-3238-main-discord-facade-replay
Requested jobs: hermes-discord-e2e,hermes-e2e,messaging-providers-e2e,token-rotation-e2e,rebuild-hermes-e2e,rebuild-hermes-stale-base-e2e
Summary: 6 passed, 0 failed, 0 skipped

Job Result
hermes-discord-e2e ✅ success
hermes-e2e ✅ success
messaging-providers-e2e ✅ success
rebuild-hermes-e2e ✅ success
rebuild-hermes-stale-base-e2e ✅ success
token-rotation-e2e ✅ success

@ericksoa ericksoa merged commit 1bfc136 into main May 9, 2026
16 of 17 checks passed
@ericksoa
Copy link
Copy Markdown
Contributor Author

ericksoa commented May 9, 2026

Post-merge housekeeping: the superseded source/replay PRs have been closed and labeled now that this main-based replay landed.

Thanks again to Ben Barclay and the NousResearch team for the partnership on this Hermes Discord work. Their follow-up in #3238 helped get the facade path from plausible to actually wired through the Hermes sandbox runtime, and that work is represented in this merged replay alongside the maintainer conflict-resolution and validation commits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Integration: Discord Use this label to identify Discord bot integration issues with NemoClaw. Integration: Hermes v0.0.38 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants