Add to_markdown to Action Text, mirroring to_plain_text #56858
Merged
dhh merged 1 commit into rails:main on Feb 24, 2026
Introduce markdown conversion across the Action Text stack:

- `Content#to_markdown` renders attachments then converts the fragment
- `Fragment#to_markdown` delegates to the new `MarkdownConversion` module
- `RichText#to_markdown` delegates through `Content`
- `Attachment#to_markdown` delegates to `attachable_markdown_representation`

All attachable types implement `attachable_markdown_representation`:

- `RemoteImage` renders ``
- `ContentAttachment` converts its embedded HTML to markdown
- `ActiveStorage::Blob` renders `[caption || filename]`
- `MissingAttachable` renders `☒` (see related rails#56854)

`MarkdownConversion` is a bottom-up tree reducer (like `PlainTextConversion`) that converts HTML nodes to Markdown. It handles inline formatting (bold, italic, strikethrough, code), block elements (paragraphs, headings, blockquotes, code blocks, horizontal rules), lists (ordered, unordered, nested), links, tables, and details/summary. This implementation was manually tested against the markup generated by both Trix and Lexxy.

This commit also promotes the `BottomUpReducer` tree-walking class from `ActionText::PlainTextConversion` into its own file as `ActionText::BottomUpReducer` for use by both conversion modules.

Security note: Markdown links are checked against Loofah's allowed URI protocols to prevent unsafe schemes like `javascript:` from appearing in the output.
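The link check described in the security note can be pictured roughly as follows. This is only a sketch under stated assumptions: `ALLOWED_SCHEMES` and `markdown_link` are hypothetical names, the short allowlist stands in for Loofah's real list, and falling back to the bare link text for a rejected URL is a guess at reasonable behavior rather than what the PR necessarily does.

```ruby
# Hypothetical allowlist for illustration only; the PR defers to Loofah's list.
ALLOWED_SCHEMES = %w[http https mailto].freeze

# Hypothetical helper: emit a Markdown link only when the href is relative or
# uses an allowed scheme; otherwise keep just the link text.
def markdown_link(text, href)
  # Drop whitespace and control characters, which browsers ignore when
  # resolving a scheme (e.g. "java\tscript:").
  cleaned = href.gsub(/[[:space:][:cntrl:]]/, "")
  scheme = cleaned[/\A([a-z][a-z0-9+.\-]*):/i, 1]

  if scheme.nil? || ALLOWED_SCHEMES.include?(scheme.downcase)
    "[#{text}](#{href})"
  else
    text
  end
end

puts markdown_link("Rails", "https://rubyonrails.org")
puts markdown_link("click me", "javascript:alert(1)")
```

The first call keeps the link, while the second drops the `javascript:` target and prints only the text, so the unsafe scheme never reaches the Markdown output.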
Performance note: This implementation benchmarks ~35% faster than `to_plain_text` on my development machine for a ~42k HTML document generated by cmark-gfm from a real-world markdown file:

```
#!/usr/bin/env ruby

require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

require_relative "../config/environment"

html = `cmark-gfm --to html #{File.expand_path("~/code/github.com/basecamp/activerecord-tenanted/GUIDE.md")}`
puts "HTML: #{html.bytesize} bytes, #{html.lines.count} lines"

html_doc = Nokogiri::HTML5.fragment(html)

Benchmark.ips do |x|
  x.report("to_markdown") { ActionText::MarkdownConversion.node_to_markdown(html_doc) }
  x.report("to_plain_text") { ActionText::PlainTextConversion.node_to_plain_text(html_doc) }
  x.compare!
end
```

```
HTML: 42282 bytes, 932 lines
ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [x86_64-linux]
Warming up --------------------------------------
         to_markdown     7.000 i/100ms
       to_plain_text     4.000 i/100ms
Calculating -------------------------------------
         to_markdown     70.931 (± 7.0%) i/s   (14.10 ms/i) -    357.000 in   5.066019s
       to_plain_text     51.811 (± 5.8%) i/s   (19.30 ms/i) -    260.000 in   5.033790s

Comparison:
         to_markdown:       70.9 i/s
       to_plain_text:       51.8 i/s - 1.37x  slower
```
a852bfc to c3dd09b
This was referenced Feb 24, 2026
had the same idea :) be83ff0
Motivation / Background
Today, Rails supports two formats for "exporting" rich text content: HTML and plain text.
HTML is very verbose, and not always handled well (or cheaply) by LLM agents. Plain text format is
pretty good for both human and agent consumption, but loses some of the nuance and formatting of the
original (for example: headers, strong, italic, strikethrough).
This pull request was created to allow rich text to be exported as Markdown, keeping as much of the
formatting intact as possible.
Detail
Introduce markdown conversion across the Action Text stack:

- `Content#to_markdown` renders attachments then converts the fragment
- `Fragment#to_markdown` delegates to the new `MarkdownConversion` module
- `RichText#to_markdown` delegates through `Content`
- `Attachment#to_markdown` delegates to `attachable_markdown_representation`

All attachable types implement `attachable_markdown_representation`:

- `RemoteImage` renders ``
- `ContentAttachment` converts its embedded HTML to markdown
- `ActiveStorage::Blob` renders `[caption || filename]`
- `MissingAttachable` renders `☒` (see related Render MissingAttachable as "☒" in plain text #56854)

`MarkdownConversion` is a bottom-up tree reducer (like `PlainTextConversion`) that converts HTML nodes to Markdown. It handles inline formatting (bold, italic, strikethrough, code), block elements (paragraphs, headings, blockquotes, code blocks, horizontal rules), lists (ordered, unordered, nested), links, tables, and details/summary. This implementation was manually tested against the markup generated by both Trix and Lexxy.

This commit also promotes the `BottomUpReducer` tree-walking class from `ActionText::PlainTextConversion` into its own file as `ActionText::BottomUpReducer` for use by both conversion modules.

Additional information

Security note: Markdown links are checked against Loofah's allowed URI protocols to prevent unsafe schemes like `javascript:` from appearing in the output.

Performance note: This implementation benchmarks ~35% faster than `to_plain_text` on my development machine for a ~42k HTML document generated by cmark-gfm from a real-world markdown file (but I'll note it's likely that a little effort can bring plain text conversion in line with the markdown performance, should someone choose to work on it).

Checklist
Before submitting the PR make sure the following are checked:
[Fix #issue-number]
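To illustrate what the Detail section means by a bottom-up tree reducer, here is a minimal sketch of the idea. It is not the code from this PR: `Node`, `text_node`, `element`, `reduce_bottom_up`, and this `to_markdown` are hypothetical names, the tree is a plain Struct rather than Nokogiri nodes, the walk is recursive for brevity, and only a handful of tags are handled.

```ruby
# Hypothetical stand-in for a parsed HTML node (the real code walks Nokogiri nodes).
Node = Struct.new(:name, :text, :children)

def text_node(value)
  Node.new("#text", value, [])
end

def element(name, *children)
  Node.new(name, nil, children)
end

# Post-order walk: every child is reduced to a string before its parent is
# visited, so each handler only combines already-converted pieces.
def reduce_bottom_up(node, &block)
  child_values = node.children.map { |child| reduce_bottom_up(child, &block) }
  block.call(node, child_values)
end

# Illustrative subset of tag handlers; unknown elements pass children through.
def to_markdown(root)
  markdown = reduce_bottom_up(root) do |node, parts|
    inner = parts.join
    case node.name
    when "#text"       then node.text
    when "strong", "b" then "**#{inner}**"
    when "em", "i"     then "*#{inner}*"
    when "code"        then "`#{inner}`"
    when "h1"          then "# #{inner}\n\n"
    when "p"           then "#{inner}\n\n"
    else inner
    end
  end
  markdown.strip
end

doc = element("div",
  element("h1", text_node("Hello")),
  element("p",
    text_node("Some "), element("strong", text_node("bold")),
    text_node(" and "), element("em", text_node("italic")),
    text_node(" text.")))

puts to_markdown(doc)
```

Because children are converted first, a handler such as the one for `strong` never needs to inspect its descendants; it just wraps the string it receives, which is what makes the same walker reusable for both plain text and Markdown output.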