Skip to content

[Repo Assist] perf: use HashSet(char) and String.exists for adorner detection in TextConversions#1725

Merged
dsyme merged 2 commits intomainfrom
repo-assist/improve-text-conversions-adorner-perf-2026-04-03-e7b09abe75b8053b
Apr 3, 2026
Merged

[Repo Assist] perf: use HashSet(char) and String.exists for adorner detection in TextConversions#1725
dsyme merged 2 commits intomainfrom
repo-assist/improve-text-conversions-adorner-perf-2026-04-03-e7b09abe75b8053b

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot commented Apr 3, 2026

🤖 This is an automated pull request from Repo Assist, an AI assistant for this repository.

What and Why

Two small improvements to TextConversions.RemoveAdorners, which is called on every cell value during CSV/JSON/XML parsing for AsInteger, AsDecimal, and AsInteger64.

1. HashSet<char> instead of F# Set<char> for the private adorner set

-static member val private DefaultRemovableAdornerCharacters =
-    Set.union TextConversions.DefaultNonCurrencyAdorners TextConversions.DefaultCurrencyAdorners
+static member val private DefaultRemovableAdornerCharacters =
+    // HashSet<char> for O(1) membership testing (vs O(log n) for F# Set)
+    HashSet<char>(
+        Set.union TextConversions.DefaultNonCurrencyAdorners TextConversions.DefaultCurrencyAdorners
+    )

The set has ~26 characters. F# Set.Contains is O(log n) (balanced BST). HashSet<char>.Contains is O(1). The public members DefaultNonCurrencyAdorners and DefaultCurrencyAdorners remain Set<char> — no API change.

2. String.exists with early exit instead of a manual loop

-let mutable hasAdorners = false
-for i = 0 to value.Length - 1 do
-    if TextConversions.DefaultRemovableAdornerCharacters.Contains(value.[i]) then
-        hasAdorners <- true
+// Fast path: short-circuit on first adorner found (String.exists stops early)
+let hasAdorners =
+    value |> String.exists (fun c -> TextConversions.DefaultRemovableAdornerCharacters.Contains(c))

The previous loop always iterated to the end of the string even after finding an adorner. String.exists short-circuits on the first match. For values like "$1,234", the adorner $ is at index 0, so the remaining 5 characters are unnecessary work.

Test Status

  • dotnet test tests/FSharp.Data.Core.Tests/2896 passed, 0 failed

Generated by Repo Assist

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@4ea8c81959909f40373e2a5c2b7fdb54ea19e0a5

…xtConversions

Two small improvements to TextConversions.RemoveAdorners:

1. Switch DefaultRemovableAdornerCharacters from F# Set<char> (O(log n) lookup)
   to System.Collections.Generic.HashSet<char> (O(1) lookup). The set has ~26
   characters; while individual operations are fast, this function is called on
   every cell when parsing CSV/JSON/XML for AsInteger, AsDecimal, AsInteger64.

2. Replace the manual mutable-flag loop with String.exists, which short-circuits
   on the first adorner match. The previous loop always iterated to the end even
   after finding an adorner. For values like ',234' the adorner is at index 0,
   so early exit skips 5 unnecessary iterations.

No API change — DefaultRemovableAdornerCharacters is private. The public
DefaultNonCurrencyAdorners and DefaultCurrencyAdorners members remain Set<char>.
All 2896 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dsyme dsyme marked this pull request as ready for review April 3, 2026 14:44
@dsyme dsyme merged commit 5f11283 into main Apr 3, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/improve-text-conversions-adorner-perf-2026-04-03-e7b09abe75b8053b branch April 3, 2026 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant