fix(hermes): add Discord facade#3293
Conversation
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Restores the hardened recovery log and readiness checks while keeping the replayed Hermes venv interpreter fix. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hermes/discord-facade.py`:
- Around line 160-163: The public tunnel currently exposes the full facade
(routes registered via web.get("/gateway", self.handle_gateway), web.route("*",
"/api/{tail:.*}", self.handle_rest)) which allows _forward_rest() to accept
caller headers (including Authorization) and lets external users drive
bot-authenticated calls via _run_tunnel_command(); restrict the public tunnel to
only expose web.post("/interactions", self.handle_interaction) or instead create
a separate aiohttp/web listener for the public webhook and move
handle_gateway/handle_rest (and any _forward_rest logic) behind an auth gate
that strips/validates Authorization headers before proxying. Update the tunnel
registration logic in _run_tunnel_command and routing setup so only
handle_interaction is published, or implement a new private listener and attach
handle_gateway/handle_rest there with header sanitization.
- Around line 625-634: _handle_interaction_callback currently returns a local
204 for callbacks whose local_token is in self._interaction_tokens, which
discards the bot's real callback instead of forwarding it; update the logic so
that when local_token matches an entry in self._interaction_tokens you map it to
the original Discord token and forward the request upstream via _forward_rest
(using the mapped real token) instead of immediately returning 204, or
alternatively remove the eager ACK in handle_interaction so the bot performs the
initial {"type":5} itself; locate references to _interaction_tokens,
handle_interaction, _handle_interaction_callback, _forward_rest and _read_bytes
and implement the token mapping + forwarding path or the removal of the eager
ACK consistently (also apply the same fix to the duplicated block at lines
649-662).
- Around line 343-382: The MESSAGE_DELETE synthesis in _poll_channel_messages is
producing false deletes because it diffs the cached IDs against only the latest
25 messages; to mitigate, change _poll_channel_messages so it does not
synthesize deletes when the fetched page is saturated (len(messages) == 25) —
i.e., if the page is full, skip the loop that computes previous_ids -
current_ids and emitting MESSAGE_DELETE via dispatch_to_all; keep all other
behavior (normal CREATE/UPDATE/reaction handling and updating
_channel_message_ids and _message_cache) the same so the facade remains accurate
until a proper paginated/tombstone strategy is implemented.
In `@test/hermes-discord-facade.test.ts`:
- Around line 11-25: The test helpers leak parent DISCORD_* and TELEGRAM_*
environment variables into child Python processes; update both the
hasCryptography spawnSync call and the runPython function to use a sanitized
child environment: create sanitizedEnv by copying process.env but removing any
keys starting with "DISCORD_" or "TELEGRAM_", then when invoking spawnSync pass
env: { ...sanitizedEnv, ...env } (for hasCryptography pass sanitizedEnv as the
env arg, and for runPython use sanitizedEnv as the base before layering the
function's env parameter) so tests remain hermetic; reference hasCryptography
and runPython when making this change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6bb60e3e-cd76-4466-ac14-87440d193026
📒 Files selected for processing (17)
agents/hermes/Dockerfileagents/hermes/config/messaging-config.tsagents/hermes/decode-proxy.pyagents/hermes/discord-facade.pyagents/hermes/discord-preload/sitecustomize.pyagents/hermes/policy-additions.yamlagents/hermes/policy-permissive.yamlagents/hermes/start.shsrc/lib/agent/runtime.test.tssrc/lib/agent/runtime.tstest/e2e/test-hermes-discord-e2e.shtest/e2e/test-messaging-providers.shtest/generate-hermes-config.test.tstest/hermes-discord-facade.test.tstest/policies.test.tstest/sandbox-init.test.tstest/sandbox-provisioning.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
agents/hermes/discord-facade.py (1)
846-858: 💤 Low valueStore reference to background task for proper lifecycle management.
The task created by
asyncio.create_task(_capture_url())is not stored, which means:
- Exceptions raised in
_capture_urlwill be silently ignored- The task cannot be awaited or cancelled during shutdown
- Risk of garbage collection (though unlikely here due to stdout reference)
♻️ Suggested fix
Return the task from the function and manage it in
main():- asyncio.create_task(_capture_url()) - return proc + capture_task = asyncio.create_task(_capture_url()) + return proc, capture_taskThen in
main(), store and optionally cancel on shutdown.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/hermes/discord-facade.py` around lines 846 - 858, The background task started with asyncio.create_task(_capture_url()) inside the function that currently returns only proc should be preserved and surfaced so it can be awaited/cancelled and so exceptions aren't lost; change the call site that creates the task (the asyncio.create_task(_capture_url()) line) to capture the Task object (e.g., task = asyncio.create_task(_capture_url())), ensure the coroutine _capture_url has proper try/except logging or attach a done callback to log exceptions, and return or attach that Task alongside the proc (or return a tuple (proc, task)) so callers (e.g., main()) can store, await, and cancel the task during shutdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hermes/discord-facade.py`:
- Line 136: The _interaction_tokens dict in discord-facade.py is unbounded
(tokens are inserted in _localize_interaction_token and never removed); change
it to a bounded/TLL-aware structure and prune expired entries: replace or wrap
self._interaction_tokens with a TTL-aware cache (e.g., a small LRU or TTL cache
keyed by interaction id with 15-minute expiry) or store values as (token,
expiry_timestamp) and in _localize_interaction_token (and any lookup paths)
remove stale entries before inserting/returning; ensure cleanup runs on
insert/lookup to prevent growth and keep behavior identical otherwise.
---
Nitpick comments:
In `@agents/hermes/discord-facade.py`:
- Around line 846-858: The background task started with
asyncio.create_task(_capture_url()) inside the function that currently returns
only proc should be preserved and surfaced so it can be awaited/cancelled and so
exceptions aren't lost; change the call site that creates the task (the
asyncio.create_task(_capture_url()) line) to capture the Task object (e.g., task
= asyncio.create_task(_capture_url())), ensure the coroutine _capture_url has
proper try/except logging or attach a done callback to log exceptions, and
return or attach that Task alongside the proc (or return a tuple (proc, task))
so callers (e.g., main()) can store, await, and cancel the task during shutdown.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b02976f5-5219-44f0-9a11-5bc998d42e18
📒 Files selected for processing (2)
agents/hermes/discord-facade.pytest/hermes-discord-facade.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/hermes-discord-facade.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Selective E2E Results — ❌ Some jobs failedRun: 25583784468
|
Selective E2E Results — ✅ All requested jobs passedRun: 25587605524
|
|
Post-merge housekeeping: the superseded source/replay PRs have been closed and labeled now that this main-based replay landed. Thanks again to Ben Barclay and the NousResearch team for the partnership on this Hermes Discord work. Their follow-up in #3238 helped get the facade path from plausible to actually wired through the Hermes sandbox runtime, and that work is represented in this merged replay alongside the maintainer conflict-resolution and validation commits. |
Summary
Commit shape
Local validation
Original PRs: #3153, #3238
Signed-off-by: Aaron Erickson aerickson@nvidia.com
Summary by CodeRabbit
New Features
Tests