
Architecture: markdown AST + per-channel renderer (replace four hand-rolled formatters) #22

@theognis1002

Problem Statement

Borg's native channels each implement a hand-rolled markdown→platform state machine: crates/gateway/src/{slack,discord,telegram,signal}/format.rs total ~1,100 lines of near-identical code. Helpers (find_double_star, extract_delimited, link extraction) recur verbatim. A markdown edge case fixed in Slack is still latent in Discord, Telegram, and Signal. The four format modules are shallow duplicates of one shape.

Solution

Parse agent markdown once into a small AST scoped to exactly the subset Borg emits, then define a MarkdownRenderer trait. Each channel becomes a small renderer impl that overrides only the primitives that differ for its platform.
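As a rough illustration of the "parse once" half, the sketch below handles only bold and inline code. The payload types on `MarkdownNode` and the `parse_inline` and `flush_text` names are assumptions made for this example, not code from the existing format modules.

```rust
/// Reduced AST for this sketch: only three of the proposed variants.
/// Payload types are assumptions; the issue does not specify them.
#[derive(Debug, Clone, PartialEq)]
pub enum MarkdownNode {
    Text(String),
    Bold(Vec<MarkdownNode>),
    InlineCode(String),
}

/// Hypothetical parser entry point: scans left to right, turning
/// `**...**` into Bold and backtick-delimited spans into InlineCode.
/// Unclosed delimiters are kept as literal text.
pub fn parse_inline(input: &str) -> Vec<MarkdownNode> {
    let mut nodes = Vec::new();
    let mut text = String::new();
    let mut rest = input;
    while !rest.is_empty() {
        if let Some(after) = rest.strip_prefix("**") {
            if let Some(end) = after.find("**") {
                flush_text(&mut text, &mut nodes);
                // Bold may contain further inline markup, so recurse.
                nodes.push(MarkdownNode::Bold(parse_inline(&after[..end])));
                rest = &after[end + 2..];
                continue;
            }
        } else if let Some(after) = rest.strip_prefix('`') {
            if let Some(end) = after.find('`') {
                flush_text(&mut text, &mut nodes);
                // Code spans are taken verbatim, with no nested parsing.
                nodes.push(MarkdownNode::InlineCode(after[..end].to_string()));
                rest = &after[end + 1..];
                continue;
            }
        }
        // No delimiter here, or it was never closed: keep one char literally.
        let ch = rest.chars().next().unwrap();
        text.push(ch);
        rest = &rest[ch.len_utf8()..];
    }
    flush_text(&mut text, &mut nodes);
    nodes
}

/// Emits any pending literal text as a Text node.
fn flush_text(text: &mut String, nodes: &mut Vec<MarkdownNode>) {
    if !text.is_empty() {
        nodes.push(MarkdownNode::Text(std::mem::take(text)));
    }
}
```

This sketch deliberately ignores cases such as a `**` sequence inside a code span nested in bold. Those are the kind of edge cases that would be fixed once, here, rather than in each channel.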

User Stories

  1. As a Borg contributor, I want to fix a markdown edge case (nested bold inside a link, fenced code containing backticks, etc.) once and have the fix apply to Slack, Discord, Telegram, and Signal, so that I don't ship four near-identical patches.
  2. As a Borg contributor, I want to add a new chat channel by implementing a small MarkdownRenderer, so that I don't reinvent a markdown state machine.
  3. As a Borg contributor, I want golden tests for each channel's renderer plus thorough fixture tests for the shared parser, so that parsing complexity is covered exactly once.
  4. As a Borg contributor, I want the markdown subset Borg supports to be documented as an enum, so that "what does the agent emit?" has a single answer.
  5. As a Borg user, I want existing message formatting in every channel to be byte-identical (or intentionally improved) after the refactor, so that nothing regresses.

Implementation Decisions

  • Add a markdown module (likely core::markdown or a small markdown module inside gateway) with MarkdownNode: Text, Bold, Italic, InlineCode, CodeBlock { lang, body }, Link { text, url }, LineBreak. Scope is exactly Borg's emitted subset — not full CommonMark.
  • Define trait MarkdownRenderer { fn render(&self, nodes: &[MarkdownNode]) -> String; } with default per-variant methods so channel impls override only what differs.
  • Replace each channel's format.rs state machine with a MarkdownRenderer impl. plain_text.rs becomes the trivial renderer.
  • The parser is the deep module (one place to fix edge cases); renderers are intentionally shallow adapters.
  • Output for each channel must match current behavior at the byte level on a representative fixture set, except where current behavior is provably wrong (document any deltas in the PR).
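The decisions above might be sketched roughly as follows. The variant payloads, the default body for `render`, the per-variant method names (`text`, `bold`, `link`, and so on), and the `PlainText` and `SlackMrkdwn` types are all assumptions for illustration; only the variant list and the `render` signature come from this issue.

```rust
/// Proposed AST; payload types are assumed, not specified by the issue.
#[derive(Debug, Clone, PartialEq)]
pub enum MarkdownNode {
    Text(String),
    Bold(Vec<MarkdownNode>),
    Italic(Vec<MarkdownNode>),
    InlineCode(String),
    CodeBlock { lang: Option<String>, body: String },
    Link { text: Vec<MarkdownNode>, url: String },
    LineBreak,
}

pub trait MarkdownRenderer {
    /// Walks the nodes and concatenates their rendered forms.
    fn render(&self, nodes: &[MarkdownNode]) -> String {
        nodes.iter().map(|n| self.node(n)).collect()
    }

    /// Dispatches one node to its per-variant hook (hypothetical names).
    fn node(&self, node: &MarkdownNode) -> String {
        match node {
            MarkdownNode::Text(t) => self.text(t),
            MarkdownNode::Bold(c) => self.bold(&self.render(c)),
            MarkdownNode::Italic(c) => self.italic(&self.render(c)),
            MarkdownNode::InlineCode(c) => self.inline_code(c),
            MarkdownNode::CodeBlock { lang, body } => self.code_block(lang.as_deref(), body),
            MarkdownNode::Link { text, url } => self.link(&self.render(text), url),
            MarkdownNode::LineBreak => "\n".to_string(),
        }
    }

    // Defaults drop the markup, so an empty impl is a plain-text renderer.
    fn text(&self, t: &str) -> String { t.to_string() }
    fn bold(&self, inner: &str) -> String { inner.to_string() }
    fn italic(&self, inner: &str) -> String { inner.to_string() }
    fn inline_code(&self, code: &str) -> String { code.to_string() }
    fn code_block(&self, _lang: Option<&str>, body: &str) -> String { body.to_string() }
    fn link(&self, text: &str, url: &str) -> String { format!("{text} ({url})") }
}

/// Trivial renderer: every default is kept.
pub struct PlainText;
impl MarkdownRenderer for PlainText {}

/// Slack-flavoured renderer: overrides only what differs.
/// Escaping of `&`, `<`, `>` in text is omitted from this sketch.
pub struct SlackMrkdwn;
impl MarkdownRenderer for SlackMrkdwn {
    fn bold(&self, inner: &str) -> String { format!("*{inner}*") }
    fn italic(&self, inner: &str) -> String { format!("_{inner}_") }
    fn inline_code(&self, code: &str) -> String { format!("`{code}`") }
    fn link(&self, text: &str, url: &str) -> String { format!("<{url}|{text}>") }
}
```

Under these assumptions, nesting is handled by the shared `node` dispatch, so a channel impl never needs to recurse itself. Whether `render` should have a default body at all is a design choice left open here.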

Testing Decisions

A good test exercises the parser or a renderer with realistic agent output and asserts on the produced string. No tautological tests.

  • Heavy fixture-driven parser tests: nested bold, links containing emphasis, fenced code with backticks, mixed inline + block, edge cases the existing per-channel tests already cover (consolidated).
  • Per-renderer golden tests covering platform-specific quirks: Slack mrkdwn, Telegram HTML escaping, Discord backtick fences, Signal plain.
  • A regression suite that pipes a corpus of real prior agent outputs through each renderer and asserts that the results match the outputs captured before the refactor lands.
  • Prior art: existing per-channel format tests in crates/gateway/src/{slack,discord,telegram,signal}/format.rs.
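A golden-test harness for these bullets could be as small as the sketch below. `Golden`, `run_goldens`, and `telegram_escape` are hypothetical names, and the renderer is modelled as a plain `Fn(&str) -> String` to keep the example self-contained. The escaping function stands in for one platform quirk (Telegram's HTML mode needs `&`, `<`, and `>` escaped in text) and is not a full renderer.

```rust
/// One golden case: input in, exact channel output expected.
pub struct Golden {
    pub name: &'static str,
    pub input: &'static str,
    pub expected: &'static str,
}

/// Runs every case through `render` and returns one message per mismatch.
/// An empty result means all goldens matched byte for byte.
pub fn run_goldens(render: impl Fn(&str) -> String, cases: &[Golden]) -> Vec<String> {
    cases
        .iter()
        .filter_map(|c| {
            let actual = render(c.input);
            (actual != c.expected)
                .then(|| format!("{}: expected {:?}, got {:?}", c.name, c.expected, actual))
        })
        .collect()
}

/// Stand-in for a platform quirk: escapes the three characters that
/// must not appear raw in Telegram HTML text.
pub fn telegram_escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(ch),
        }
    }
    out
}
```

The same harness shape could serve the regression corpus: each captured pre-refactor output becomes the `expected` field of one case.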

Out of Scope

  • Supporting markdown features Borg's agent does not currently emit (tables, blockquotes, images, headings beyond what's already used).
  • Adding a new channel or changing existing channel transports.
  • Mandating a switch to a third-party markdown crate. Evaluating one during implementation remains an option, but adopting it is not required.

Further Notes

  • Worth checking whether pulldown-cmark (a widely used Rust CommonMark parser) covers the parsing need cheaply; if it does, only the adaptation from its event stream to MarkdownNode is custom.
  • Capture a corpus of recent agent outputs from logs for regression fixtures before starting the refactor.
