[Repo Assist] perf: avoid Seq allocations in hot markdown-parsing paths#1171

Closed
github-actions[bot] wants to merge 2 commits into main from
repo-assist/perf-avoid-seq-alloc-2026-04-16-5b6ca2b86611649b

🤖 This is an automated pull request from Repo Assist.

Summary

Three small but impactful performance improvements in parsing hot paths, eliminating unnecessary Seq enumerator allocations and object boxing.

1. removeSpaces — count leading whitespace without Seq

File: src/Common/StringParsing.fs

removeSpaces is called for every XML doc comment and every literate code block. The old implementation counted leading whitespace with:

line |> Seq.takeWhile Char.IsWhiteSpace |> Seq.length

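This allocates two IEnumerator objects per non-empty line and boxes each char. The replacement uses line.Length - line.TrimStart().Length, which is a single O(n) pass with no enumerators; at most it allocates one temporary trimmed string, and only when the line actually has leading whitespace to remove.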
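As a minimal sketch of the change described above, the two counting strategies can be put side by side. The function names and sample inputs are hypothetical and are not taken from StringParsing.fs; the point is only that both forms are expected to return the same count on current .NET runtimes, where the parameterless TrimStart() removes leading white-space characters.

```fsharp
open System

// Old approach: walks the string through Seq enumerators.
let leadingWhitespaceSeq (line: string) =
    line |> Seq.takeWhile Char.IsWhiteSpace |> Seq.length

// New approach: compares the length before and after trimming.
// TrimStart() with no arguments removes leading white-space characters.
let leadingWhitespaceTrim (line: string) =
    line.Length - line.TrimStart().Length

// Hypothetical inputs: empty, unindented, spaces, mixed tab/spaces, all blank.
let samples = [ ""; "abc"; "  abc"; "\t  abc  "; "    " ]

for s in samples do
    printfn "%A -> %d / %d" s (leadingWhitespaceSeq s) (leadingWhitespaceTrim s)
```

Trailing whitespace does not affect either count, because only the leading run is measured.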

2. StartsWithNTimesTrimIgnoreStartWhitespace — count fence chars without Seq.windowed

File: src/Common/StringParsing.fs

This active pattern is called for every line during markdown block parsing to detect fenced code blocks (``` or ~~~). The old implementation:

Seq.windowed start.Length startAndRest
|> Seq.map (fun chars -> System.String(chars))
|> Seq.takeWhile ((=) start)
|> Seq.length

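...slid a window over the line, allocating a char array and a new System.String for every window it inspected, and counted the matches with Seq.length. Because the Seq pipeline is lazy, Seq.takeWhile stops at the first window that does not match, so the cost scales with the length of the fence rather than the length of the line; even so, each line pays for several enumerators plus a short-lived array and string per window, just to produce a small integer.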

Replaced with a direct index loop:

let mutable count = 0
let mutable pos = 0
let startLen = start.Length
while pos + startLen <= startAndRest.Length
      && startAndRest.Substring(pos, startLen) = start do
    count <- count + 1
    pos <- pos + startLen
count
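The loop above still calls Substring once per iteration, which allocates a short string each time. A possible follow-up, sketched here under assumptions and not part of this PR, is to compare in place with String.CompareOrdinal. The helper name countLeadingRepeats is hypothetical, and the guard for an empty marker is an addition of this sketch.

```fsharp
open System

// Hypothetical helper: counts how many times `start` repeats back-to-back
// at the beginning of `text`, without creating intermediate strings.
let countLeadingRepeats (start: string) (text: string) =
    let startLen = start.Length

    // True when `start` occurs in `text` at index `pos` (ordinal comparison,
    // matching the semantics of F# string equality).
    let matchesAt pos =
        pos + startLen <= text.Length
        && String.CompareOrdinal(text, pos, start, 0, startLen) = 0

    if startLen = 0 then
        0 // guard added in this sketch: an empty marker would never advance
    else
        let mutable count = 0
        let mutable pos = 0

        while matchesAt pos do
            count <- count + 1
            pos <- pos + startLen

        count

printfn "%d" (countLeadingRepeats "~" "~~~~ fsharp")
```

For a single-character marker, stepping by start.Length gives the same count as the old window that slid one character at a time. For a longer marker the two can differ: with the marker "ab" and the text "abab", the old pipeline stops at the second window ("ba") and returns 1, while the index loop returns 2.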

3. readXmlElementAsSingleSummary — count leading spaces without Seq

File: src/FSharp.Formatting.ApiDocs/XmlDocReader.fs

Same Seq.takeWhile + Seq.length pattern, called once per line per XML doc comment to detect indentation uniformity. Replaced with line.Length - line.TrimStart([|' '|]).Length.
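A minimal sketch of this variant follows. Passing an explicit char array to TrimStart restricts trimming to spaces, so tabs are not counted. The hasUniformIndent function is a hypothetical illustration of an indentation-uniformity check and is not the logic in XmlDocReader.fs.

```fsharp
// Counts only leading space characters; tabs and other whitespace stop the count.
let leadingSpaces (line: string) =
    line.Length - line.TrimStart([| ' ' |]).Length

// Hypothetical check: do all non-blank lines start with the same number of spaces?
let hasUniformIndent (lines: string list) =
    let indents =
        lines
        |> List.filter (fun l -> l.Trim().Length > 0) // ignore blank lines
        |> List.map leadingSpaces
        |> List.distinct

    List.length indents <= 1

printfn "%b" (hasUniformIndent [ " summary line"; " second line"; "" ])
```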

Test Status

  • dotnet build FSharp.Formatting.sln --configuration Release — 0 warnings, 0 errors
  • ✅ 317 Markdown tests pass
  • ✅ 143 Literate tests pass
  • ✅ 30 CodeFormat tests pass
  • ✅ 30 ApiDocs tests pass
  • ✅ All 520 tests pass — no regressions

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@97143ac59cb3a13ef2a77581f929f06719c7402a


- removeSpaces: replace Seq.takeWhile+Seq.length per line with
  String.TrimStart().Length — avoids boxing each char and allocating
  two enumerators per non-empty line.
- StartsWithNTimesTrimIgnoreStartWhitespace: replace Seq.windowed +
  Seq.map String + Seq.takeWhile + Seq.length with a direct index loop
  — avoids O(n) sliding-window allocations just to count consecutive
  fence characters (e.g. backticks or tildes) at the start of a line.
  Called for every line during markdown block parsing.
- XmlDocReader.readXmlElementAsSingleSummary: same Seq.takeWhile fix
  when checking indentation columns of XML doc comment lines.

All 520 tests pass (317 Markdown, 143 Literate, 30 CodeFormat, 30 ApiDocs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>