feat: Add markdown table conversion pipeline with pulldown-cmark #180

Merged
thepagent merged 4 commits into openabdev:main from JARVIS-coding-Agent:feat/wrap-tables-in-codeblock on Apr 13, 2026

@JARVIS-coding-Agent
Contributor

Summary

Closes #178

Markdown tables in LLM responses render poorly on chat platforms like Discord. This PR introduces a proper markdown parsing pipeline to detect and convert tables before sending messages.

Changes

  • src/markdown.rs (new): Core pipeline using pulldown-cmark to parse markdown and detect tables via AST tokens (not regex). Supports three rendering modes:
    • code (default): Wraps tables in fenced code blocks with aligned columns
    • bullets: Converts each row into bullet points (• Header: Value)
    • off: Pass-through, no conversion
  • src/config.rs: Added MarkdownConfig with tables: TableMode field
  • src/discord.rs: Calls markdown::convert_tables() on final content before chunking and sending
  • src/main.rs: Registers markdown module and passes config to Discord handler
  • Cargo.toml: Added pulldown-cmark 0.13 dependency
  • config.toml.example: Added [markdown] config section

Config

[markdown]
tables = "code"  # options: code, bullets, off

Existing configs without [markdown] section will default to code mode (no breaking change).
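As a rough illustration of that fallback, here is a std-only sketch. The PR itself relies on serde (`#[serde(default)]`), so the `FromStr` impl and the `resolve_mode` helper below are hypothetical stand-ins, not code from this branch.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the PR's `TableMode`. The real type is
/// deserialized by serde; this sketch uses `FromStr` to stay std-only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum TableMode {
    /// Default when `[markdown]` or `tables` is absent.
    #[default]
    Code,
    Bullets,
    Off,
}

impl FromStr for TableMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "code" => Ok(TableMode::Code),
            "bullets" => Ok(TableMode::Bullets),
            "off" => Ok(TableMode::Off),
            other => Err(format!("unknown table mode: {other}")),
        }
    }
}

/// Hypothetical helper: resolve the mode from an optional `tables` value.
/// A missing value falls back to the default instead of failing.
pub fn resolve_mode(tables: Option<&str>) -> Result<TableMode, String> {
    match tables {
        Some(value) => value.parse(),
        None => Ok(TableMode::default()),
    }
}
```

With this shape, `resolve_mode(None)` yields `TableMode::Code`, which mirrors the "no breaking change" behavior described above.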

Design

The pipeline is channel-agnostic: markdown::convert_tables(text, mode) can be called from any future channel adapter with a different TableMode per channel.

- Introduce pulldown-cmark as markdown parser for accurate table detection
- Add TableMode config (code/bullets/off) via [markdown] section in config.toml
- Convert detected tables before sending final content to Discord
- Design as reusable pipeline for future multi-channel support

Closes #178
- Use unicode-width crate for column width calculation (fixes CJK/emoji alignment)
- Use saturating_sub for padding to prevent underflow
- Handle inline markup inside table cells (bold, italic, strikethrough, link)
- Convert SoftBreak/HardBreak to space inside cells
- Fix trailing blank line after last row in bullets mode
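
The width and padding fixes in this commit can be sketched roughly as follows. This is not the code from src/markdown.rs: `approx_width` is a crude stand-in for the `unicode-width` crate (it only treats a few CJK and fullwidth ranges as two columns), and `render_code_table` and the ` | ` / `-+-` layout are assumptions made for illustration.

```rust
const FENCE: &str = "```";

/// Rough display-width estimate: common CJK and fullwidth ranges count as
/// two columns, everything else as one. An approximation only; the PR uses
/// the `unicode-width` crate instead.
fn approx_width(s: &str) -> usize {
    s.chars()
        .map(|c| match c as u32 {
            0x1100..=0x115F
            | 0x2E80..=0xA4CF
            | 0xAC00..=0xD7A3
            | 0xF900..=0xFAFF
            | 0xFE30..=0xFE4F
            | 0xFF00..=0xFF60
            | 0xFFE0..=0xFFE6 => 2,
            _ => 1,
        })
        .sum()
}

/// Hypothetical renderer: rows (header first) become an aligned table
/// wrapped in a fenced code block.
pub fn render_code_table(rows: &[Vec<String>]) -> String {
    let cols = rows.iter().map(|r| r.len()).max().unwrap_or(0);
    let mut widths = vec![0usize; cols];
    for row in rows {
        for (i, cell) in row.iter().enumerate() {
            widths[i] = widths[i].max(approx_width(cell));
        }
    }

    let mut out = String::from(FENCE);
    out.push('\n');
    for (n, row) in rows.iter().enumerate() {
        let mut line = String::new();
        for (i, width) in widths.iter().enumerate() {
            let cell = row.get(i).map(String::as_str).unwrap_or("");
            // saturating_sub clamps at zero, so padding can never underflow.
            let pad = width.saturating_sub(approx_width(cell));
            if i > 0 {
                line.push_str(" | ");
            }
            line.push_str(cell);
            line.push_str(&" ".repeat(pad));
        }
        out.push_str(line.trim_end());
        out.push('\n');
        if n == 0 {
            // Separator line under the header row.
            let sep: Vec<String> = widths.iter().map(|w| "-".repeat(*w)).collect();
            out.push_str(&sep.join("-+-"));
            out.push('\n');
        }
    }
    out.push_str(FENCE);
    out
}
```

Measuring cells by display columns rather than by `len()` is what keeps a cell like `苹果` aligned with a four-letter ASCII header.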
JARVIS-coding-Agent force-pushed the feat/wrap-tables-in-codeblock branch from f883b23 to 37ffc63 on April 11, 2026 02:23
Collaborator

@chaodu-agent chaodu-agent left a comment

🔍 PR Review: #180 — Markdown Table Conversion Pipeline

✅ What's good

  • ✅ Uses pulldown-cmark AST parser for table detection — no regex hacks
  • ✅ Non-table content preserved verbatim via byte offsets — no accidental reformatting
  • ✅ unicode-width for correct CJK/emoji column alignment
  • ✅ Inline markup (bold/italic/link) properly stripped inside cells
  • ✅ #[serde(default)] on config — fully backward compatible, existing configs just work
  • ✅ Low integration footprint — only 6 lines changed in discord.rs

🟡 Should fix before merge

  • Backtick in code mode: Event::Code preserves backticks, but the entire table is already inside a code block, so they render as literal characters. Strip them in code mode.
  • Chunking order: convert_tables() runs before chunking. Code blocks make tables longer (fence markers + padding). Verify downstream chunking still respects Discord's 2000-char limit.

🟠 Suggested improvements (can follow up)

  • Empty cells in bullets mode: if cell.is_empty() { continue; } causes inconsistent bullet counts across rows. Consider showing • Header: — instead.
  • Per-channel config: TableMode is already passed as a parameter (architecture supports it), but config only has a global [markdown] section. No per-channel override yet.
  • Test coverage — Current tests only do contains() spot checks. Add snapshot tests for full output verification, especially CJK alignment and multi-table documents.
  • Tables inside code blocks — Add a test confirming tables already inside ``` fences are not double-processed.
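
To make the empty-cell suggestion concrete, here is a minimal sketch of a bullets renderer with an optional placeholder. `render_bullets`, its signature, and the blank line between rows are assumptions for illustration; the PR's actual function may be shaped differently.

```rust
/// Hypothetical bullets renderer: each data row becomes lines of the form
/// `• Header: Value`, with rows separated by a blank line.
///
/// `placeholder = None` skips empty cells (the behavior the review flags);
/// `Some(p)` prints `p` so every row has the same number of bullets.
pub fn render_bullets(
    headers: &[&str],
    rows: &[Vec<&str>],
    placeholder: Option<&str>,
) -> String {
    let mut blocks: Vec<String> = Vec::new();
    for row in rows {
        let mut lines: Vec<String> = Vec::new();
        for (i, header) in headers.iter().enumerate() {
            // A short row is treated as having empty trailing cells.
            let cell = row.get(i).copied().unwrap_or("").trim();
            let value = if cell.is_empty() {
                match placeholder {
                    Some(p) => p,
                    None => continue,
                }
            } else {
                cell
            };
            lines.push(format!("• {header}: {value}"));
        }
        if !lines.is_empty() {
            blocks.push(lines.join("\n"));
        }
    }
    // Joining blocks (rather than appending a separator after each row)
    // leaves no trailing blank line after the last row.
    blocks.join("\n\n")
}
```

Passing `Some("—")` gives every row one bullet per header, which is the consistency the suggestion is after.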

Verdict

🟢 Clean architecture, focused implementation, minimal integration surface. Fix the two 🟡 items and this is LGTM to merge.

OpenAB Agent added 2 commits April 13, 2026 07:00
Bring in upstream changes (STT support, image attachments, error display,
copilot variant) while preserving the markdown table conversion pipeline
introduced in this branch. Conflicts resolved in Cargo.toml, config.rs,
discord.rs, and main.rs by keeping both feature sets.
- parse_segments now takes a mode parameter: in Code mode, Event::Code
  cells omit the backtick wrapping since the table is already inside a
  fenced code block and backticks would render as literal characters.
  Bullets mode keeps backticks as they are valid inline markdown.

- split_message now tracks whether the cursor is inside a fenced code
  block (``` ... ```). When a chunk boundary falls mid-block, the current
  chunk is closed with ``` and the next chunk is reopened with ```, so
  each Discord message renders the code block correctly.

- Tests added for both fixes.
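
For readers who want a picture of the fence-aware splitting described in this commit, a simplified sketch follows. It is not the `split_message` from src/discord.rs: it splits only on line boundaries, reopens blocks with a bare fence (dropping any info string), and assumes every single line fits within the limit once fence overhead is accounted for.

```rust
const FENCE: &str = "```";

/// Length of a chunk in chars (not bytes) once its lines are joined by '\n'.
fn chunk_len(lines: &[String]) -> usize {
    let chars: usize = lines.iter().map(|l| l.chars().count()).sum();
    chars + lines.len().saturating_sub(1)
}

/// Simplified sketch: split `text` into chunks of at most `limit` chars,
/// closing and reopening fenced code blocks at chunk boundaries.
pub fn split_message(text: &str, limit: usize) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    let mut current: Vec<String> = Vec::new();
    let mut in_fence = false;

    for line in text.lines() {
        let is_fence = line.trim_start().starts_with(FENCE);
        let in_fence_after = in_fence != is_fence;

        // Size if this line is appended, plus room for a closing fence line
        // when the chunk would then end inside a code block.
        let mut needed = chunk_len(&current) + line.chars().count();
        if !current.is_empty() {
            needed += 1; // newline before the new line
        }
        if in_fence_after {
            needed += 1 + FENCE.len();
        }

        if needed > limit && !current.is_empty() {
            if in_fence {
                current.push(FENCE.to_string()); // close the block in this chunk
            }
            chunks.push(current.join("\n"));
            current.clear();
            if in_fence {
                current.push(FENCE.to_string()); // reopen it in the next chunk
            }
        }
        current.push(line.to_string());
        in_fence = in_fence_after;
    }
    if !current.is_empty() {
        chunks.push(current.join("\n"));
    }
    chunks
}
```

Reserving space for the closing fence before appending a line is what keeps a chunk within the limit even when it has to be closed mid-block.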
Collaborator

@chaodu-agent chaodu-agent left a comment

🟢 Both merge blockers from the previous review have been addressed:

  1. Backtick in code mode: parse_segments now takes mode; backticks are stripped in Code mode. Tests added.
  2. Code-fence-aware chunking: split_message uses chars().count() (Unicode-safe) and auto-closes/reopens fences across chunk boundaries. Streaming truncates instead of splitting mid-stream. Tests added.

LGTM ✅

@thepagent thepagent merged commit 920ae7e into openabdev:main Apr 13, 2026
thepagent added a commit that referenced this pull request Apr 13, 2026
Restores release-pr.yml, tag-on-merge.yml, and ci.yml which were
accidentally deleted by PR #180 during rebase.
thepagent added a commit that referenced this pull request Apr 13, 2026
thepagent added a commit that referenced this pull request Apr 13, 2026
Revert "feat: Add markdown table conversion pipeline with pulldown-cmark (#180)"
Linked issue: feat: Wrap markdown tables in code blocks for better rendering on chat platforms

3 participants