Proposal: optional end-to-end encrypted messaging (per-channel mode) #748

@willwashburn

Motivation

Today all relay message bodies are plaintext on the wire and at rest in Relaycast. For use cases where agents coordinate on sensitive material (customer data, credentials, regulated work), we should offer an opt-in mode where only endpoints can read message content — while keeping the current plaintext mode as the default so integrations, search, and webhooks keep working.

Goal: support two modes side by side, selected per channel: plaintext (today) and encrypted (new). Mode is set at channel creation and is immutable.

Architecture grounding

Relevant facts from the current codebase that shape the design:

  • Storage is cloud-side (Relaycast), not broker-side. The broker is a transient router with in-memory dedup + pending-delivery state. Per packages/sdk/src/.claude/rules/sdk.md: "There is NO storage package… Relaycast handles all message persistence." If endpoints encrypt before the SDK hands bytes to the broker, ciphertext is what Relaycast stores — no broker DB change needed.
  • Broker routing is envelope-driven, not body-driven. resolve_delivery_targets() in src/routing.rs:68-143 and DM participant resolution in src/main.rs:2587-2614 route on target, from, event_id, thread_id — never on body. E2E is largely additive.
  • No per-agent key material exists today. WorkspaceCredential in src/auth.rs:10-26 is a per-workspace Relaycast API key plus an optional JWT. Agents have no identity keypair. This is the main gap.
  • Channel membership is tracked via per-agent subscriptions (src/main.rs:1936-1967, src/relaycast_ws.rs:32-34). Subscribe/unsubscribe events fire on join/leave — useful hooks for key-rotation triggers.
  • Thread replies broadcast to all workspace workers (src/main.rs:2551-2567 synthesizes target="thread"). Recipients without the key simply fail to decrypt, which is acceptable.
  • Reactions are re-formatted server-side as text (src/message_bridge.rs:100-145). This one genuinely breaks in encrypted mode and needs a parallel encrypted-reaction payload.

Proposed design

Per-channel mode, not per-workspace

Encryption is selected at channel creation time and cannot be changed afterward. DMs get their own default (encrypted once agent keys exist). This keeps integrations, search, unfurling, and webhooks working in plaintext channels while sensitive channels go dark to the server.
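
A minimal sketch of what "set at creation and immutable" could look like at the SDK edge. Every name here (ChannelMode, createChannel, acceptsContentType) is hypothetical and not part of the existing SDK; the only values taken from this proposal are the two content types.

```typescript
// Hypothetical types: none of these names exist in the current SDK.
type ChannelMode = "plaintext" | "encrypted";
type ContentType = "text/plain" | "application/relay-encrypted+v1";

interface Channel {
  readonly name: string;
  readonly mode: ChannelMode; // fixed at creation, never reassigned
}

// Plaintext stays the default so existing callers keep today's behaviour.
function createChannel(name: string, mode: ChannelMode = "plaintext"): Channel {
  // Object.freeze enforces immutability at runtime, not only in the type system.
  return Object.freeze({ name, mode });
}

// Content type a message must carry in a channel of the given mode.
function requiredContentType(mode: ChannelMode): ContentType {
  return mode === "encrypted" ? "application/relay-encrypted+v1" : "text/plain";
}

// True when a message's content type matches the channel's mode.
function acceptsContentType(channel: Channel, contentType: ContentType): boolean {
  return requiredContentType(channel.mode) === contentType;
}
```

Under this assumption the check runs in the SDK before send and after receive, so a plaintext message can never land in an encrypted channel by accident and the broker does not need to know about modes.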

Envelope extension

Extend packages/sdk/src/protocol.ts and the Rust mirror in src/protocol.rs:

body: string                          // plaintext OR base64 ciphertext
content_type: "text/plain" | "application/relay-encrypted+v1"
enc?: { key_id, sender_key_gen, nonce }   // present only when encrypted

Routing fields (target, from, thread_id, priority, injection_mode) stay plaintext. The broker code path is unchanged; encrypt/decrypt happens at the SDK edges.
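
One way to type the extension is a discriminated union on content_type, so the compiler guarantees enc is present exactly when the body is ciphertext. This is a sketch and a slight variation on the optional enc? field above. The field types (string, number), the base64 nonce, and the readBody helper are assumptions, and the decrypt callback stands in for whatever the MLS layer ends up providing.

```typescript
// Field types are assumptions; the proposal only lists the field names.
interface EncryptionHeader {
  key_id: string;
  sender_key_gen: number;
  nonce: string; // assumed base64, like the body
}

// Routing fields stay plaintext in both modes.
interface RoutingFields {
  target: string;
  from: string;
  thread_id?: string;
}

interface PlaintextEnvelope extends RoutingFields {
  content_type: "text/plain";
  body: string;
}

interface EncryptedEnvelope extends RoutingFields {
  content_type: "application/relay-encrypted+v1";
  body: string; // base64 ciphertext
  enc: EncryptionHeader;
}

type Envelope = PlaintextEnvelope | EncryptedEnvelope;

// Supplied by the caller; the real decryption is out of scope for this sketch.
type Decrypt = (ciphertextB64: string, enc: EncryptionHeader) => string;

// Returns readable text for either mode, dispatching on content_type.
function readBody(envelope: Envelope, decrypt: Decrypt): string {
  if (envelope.content_type === "text/plain") {
    return envelope.body;
  }
  return decrypt(envelope.body, envelope.enc);
}
```

Nothing in readBody touches target, from, or thread_id, which matches the claim that routing is unaffected.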

Keys

  • Each agent generates an Ed25519 (signing) + X25519 (KEM) keypair on first boot. Private key lives in .agent-relay/keys/. Public key is published to Relaycast as a new agent_public_key field on the credential record.
  • Channels: MLS group per encrypted channel (RFC 9420). Join/leave maps cleanly onto the existing subscribe/unsubscribe hooks and gets us proper forward/post-compromise security.
  • DMs: MLS 2-party. Using one crypto stack (vs. Signal for DMs + MLS for groups) keeps the surface smaller.
  • Library: openmls (Rust) with a thin binding for the TS SDK.
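
For the first-boot step, a rough sketch using Node's built-in crypto module. The AgentPublicKeys shape, the function name, and the base64 SPKI/DER encoding are assumptions; the proposal only names the agent_public_key field. Writing private keys to .agent-relay/keys/ is left out, and in practice openmls may want to generate and own these keys itself.

```typescript
import { generateKeyPairSync } from "node:crypto";

// Hypothetical shape for what gets published as agent_public_key.
interface AgentPublicKeys {
  signing: string;   // base64 of the DER (SPKI) Ed25519 public key
  agreement: string; // base64 of the DER (SPKI) X25519 public key
}

function generateAgentKeys() {
  // Ed25519 for signatures, X25519 for key agreement.
  const signing = generateKeyPairSync("ed25519");
  const agreement = generateKeyPairSync("x25519");

  // Only the public halves are exported; private keys stay in the returned KeyObjects.
  const publicKeys: AgentPublicKeys = {
    signing: signing.publicKey.export({ type: "spki", format: "der" }).toString("base64"),
    agreement: agreement.publicKey.export({ type: "spki", format: "der" }).toString("base64"),
  };

  return { signing, agreement, publicKeys };
}
```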

What breaks in encrypted channels (explicitly)

  • Server-side full-text search — replace with client-side index.
  • Link unfurling / mention parsing — client-side only.
  • Webhooks & integrations that read message content — they must be explicit channel members with their own keypair (treated as first-class agents).
  • Server-rendered reactions — send as a separate encrypted "reaction" payload rather than server-generated text.
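
To give a feel for the client-side index mentioned in the first bullet, here is a minimal in-memory inverted index that a client could feed with message bodies after decrypting them. The class name, the message IDs, and the ASCII-only tokenizer are all assumptions; a real version would need persistence and better tokenization.

```typescript
// Lowercase, split on anything that is not an ASCII letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Hypothetical client-side index; nothing here is sent to the server.
class LocalSearchIndex {
  // token -> IDs of messages containing that token
  private readonly postings = new Map<string, Set<string>>();

  // Call with the plaintext after a message has been decrypted locally.
  add(messageId: string, plaintext: string): void {
    for (const token of tokenize(plaintext)) {
      let ids = this.postings.get(token);
      if (!ids) {
        ids = new Set<string>();
        this.postings.set(token, ids);
      }
      ids.add(messageId);
    }
  }

  // Returns IDs of messages that contain every query term (AND semantics).
  search(query: string): string[] {
    const terms = tokenize(query);
    if (terms.length === 0) return [];
    let result: string[] | undefined;
    for (const term of terms) {
      const ids = this.postings.get(term) ?? new Set<string>();
      result = result === undefined ? Array.from(ids) : result.filter((id) => ids.has(id));
    }
    return result ?? [];
  }
}
```

The index only covers messages this client has actually decrypted, so search results will differ between members who joined at different times.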

Open decisions

  1. Key-loss policy. If an agent loses its private key (reinstall, machine loss):
    • No recovery — strongest security; lost history is lost.
    • Workspace-owner escrow — recoverable; weaker because the owner is a trust anchor.
      I lean toward no-recovery with a documented backup flow, but this is a product call.
  2. MLS vs. Signal for DMs. Proposal above picks MLS for one-stack simplicity. Signal (X3DH + double ratchet) is more battle-tested for 1:1. Worth discussing.
  3. Metadata leakage. Even in encrypted mode, Relaycast still sees who talked to whom and when. Is that acceptable, or do we want to look at metadata-minimizing techniques later (sealed-sender-style)?

Rough build order

  1. Per-agent keypair generation + publish to Relaycast — plumbing only, no crypto on the wire yet.
  2. Envelope content_type field + SDK pass-through; broker stays oblivious.
  3. DM E2E via MLS 2-party — small blast radius, good test bed.
  4. Encrypted channels with MLS groups, wired to subscribe/unsubscribe as commit triggers.
  5. Encrypted reactions + client-side search.
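
For step 4, a sketch of how subscribe/unsubscribe events might be collapsed into the adds and removes for a single MLS commit. The event and action shapes and the planCommit name are hypothetical; they are neither the broker's event types nor openmls's API, and batching events into one commit is an assumption about how we would want to drive the group.

```typescript
// Hypothetical event shape, loosely modelled on the subscribe/unsubscribe hooks.
type MembershipEvent =
  | { kind: "subscribe"; channel: string; agent: string }
  | { kind: "unsubscribe"; channel: string; agent: string };

// Hypothetical stand-in for the Add/Remove proposals carried by a commit.
type GroupAction =
  | { kind: "add"; agent: string }
  | { kind: "remove"; agent: string };

// Reduce a batch of events for one channel to the net membership change.
function planCommit(
  channel: string,
  members: ReadonlySet<string>,
  events: MembershipEvent[],
): GroupAction[] {
  const desired = new Set<string>(members);
  for (const ev of events) {
    if (ev.channel !== channel) continue; // ignore other channels
    if (ev.kind === "subscribe") desired.add(ev.agent);
    else desired.delete(ev.agent);
  }

  const actions: GroupAction[] = [];
  desired.forEach((agent) => {
    if (!members.has(agent)) actions.push({ kind: "add", agent });
  });
  members.forEach((agent) => {
    if (!desired.has(agent)) actions.push({ kind: "remove", agent });
  });
  return actions;
}
```

An agent that joins and leaves within the same batch produces no action at all, which avoids a pointless epoch change.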

Non-goals (for this proposal)

  • Rewriting the existing plaintext path.
  • Changing broker storage or routing logic beyond the envelope extension.
  • Metadata-level privacy (who-talks-to-whom hiding) — can be a follow-up.
