
perf: optimize string literal equality simprocs for kernel efficiency #12887

Merged
nomeata merged 3 commits into master from joachim/string-neq-proc3
Mar 14, 2026

@nomeata
Collaborator

@nomeata nomeata commented Mar 11, 2026

This PR optimizes the String.reduceEq, String.reduceNe, and Sym.Simp string equality simprocs to produce kernel-efficient proofs. Previously, these used String.decEq which forced the kernel to run UTF-8 encoding/decoding and byte array comparison, causing 86+ kernel unfoldings on short strings.

The new approach reduces string inequality to List Char via String.ofList_injective, then uses two strategies depending on the difference:

  • Different characters at position i: Projects to Nat via congrArg (fun l => (List.get!Internal l i).toNat), then uses Nat.ne_of_beq_eq_false rfl. This avoids Decidable instances entirely — the kernel only evaluates Nat.beq on two concrete natural numbers.

  • One string is a prefix of the other: Uses congrArg (List.drop n ·) with List.cons_ne_nil, which is a definitional proof requiring no decide step at all.
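The two strategies above can be sketched by hand at the `List Char` level. This is only an illustrative sketch, not the term that `mkStringLitNeProof` generates: it skips the `String.ofList_injective` step, and it assumes that `List.get!Internal`, `Nat.ne_of_beq_eq_false`, and `List.cons_ne_nil` have the signatures they have in current core Lean. The numerals 104 and 102 are the code points of 'h' and 'f'.

```lean
-- Strategy 1: the lists differ at index 0.
-- Project both sides to the code point of the character at that index,
-- then refute the resulting equation on concrete naturals with `Nat.beq`.
example : ['h', 'e', 'l', 'l', 'o'] ≠ ['f', 'o', 'o'] := fun h =>
  have hn : (104 : Nat) = 102 :=
    congrArg (fun (l : List Char) => (List.get!Internal l 0).toNat) h
  -- `Nat.beq 104 102` reduces to `false`, so `rfl` proves the hypothesis.
  absurd hn (Nat.ne_of_beq_eq_false rfl)

-- Strategy 2: one list is a proper prefix of the other.
-- Dropping the length of the shorter list leaves `['d'] = []`,
-- which `List.cons_ne_nil` refutes without any `decide` step.
example : ['f', 'o', 'o', 'd'] ≠ ['f', 'o', 'o'] := fun h =>
  have hd : ['d'] = ([] : List Char) := congrArg (List.drop 3) h
  absurd hd (List.cons_ne_nil _ _)
```

In both cases the only computation left for the kernel is unfolding `List.get!Internal` or `List.drop` up to the differing position, plus one `Nat.beq` in the first case.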

For equal strings, eq_true rfl avoids kernel evaluation entirely.
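As a small illustration of the equal case (a hand-written example, assuming the core lemma `eq_true : p → p = True`), the proof is closed by `rfl` on syntactically identical literals, so nothing has to be unfolded:

```lean
-- Both sides are the same literal, so `rfl` checks without computation.
example : ("foo" = "foo") = True := eq_true rfl
```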

The shared proof construction is in Lean.Meta.mkStringLitNeProof (Lean/Meta/StringLitProof.lean), used by both the standard simprocs and the Sym.Simp ground evaluator.

Kernel max unfolds for "hello" ≠ "foo": 86+ → 6.
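From the user's side nothing changes in how the simprocs are invoked. A hedged usage sketch, assuming `String.reduceEq` and `String.reduceNe` are part of the default simproc set as before:

```lean
-- `String.reduceNe` rewrites the goal to `True`.
example : "hello" ≠ "foo" := by simp

-- `String.reduceEq` rewrites the equality to `False`, and `simp` finishes.
example : ¬ ("hello" = "foo") := by simp
```

Only the proof term checked by the kernel differs; the goals that `simp` closes are the same.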

@nomeata nomeata requested a review from leodemoura as a code owner March 11, 2026 15:55
@nomeata nomeata added the changelog-tactics User facing tactics label Mar 11, 2026
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Mar 11, 2026
@leanprover-bot
Collaborator

leanprover-bot commented Mar 11, 2026

Reference manual CI status:

  • ❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-03-09 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-03-11 17:03:40)
  • ✅ Reference manual branch lean-pr-testing-12887 has successfully built against this PR. (2026-03-14 15:12:27) View Log
  • 🟡 Reference manual branch lean-pr-testing-12887 build against this PR didn't complete normally. (2026-03-14 15:12:42) View Log

mathlib-nightly-testing bot pushed a commit to leanprover-community/batteries that referenced this pull request Mar 11, 2026
@github-actions github-actions bot added the mathlib4-nightly-available A branch for this PR exists at leanprover-community/mathlib4-nightly-testing:lean-pr-testing-NNNN label Mar 11, 2026
mathlib-nightly-testing bot pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Mar 11, 2026
@mathlib-lean-pr-testing mathlib-lean-pr-testing bot added the builds-mathlib CI has verified that Mathlib builds against this PR label Mar 11, 2026
@mathlib-lean-pr-testing

mathlib-lean-pr-testing bot commented Mar 11, 2026

Mathlib CI status (docs):

  • ✅ Mathlib branch lean-pr-testing-12887 has successfully built against this PR. (2026-03-11 17:59:55) View Log
  • ❗ Mathlib CI can not be attempted yet, as the nightly-testing-2026-03-14 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. You can force Mathlib CI using the force-mathlib-ci label. (2026-03-14 15:06:40)

@nomeata nomeata force-pushed the joachim/string-neq-proc3 branch from c0bde30 to b251f89 Compare March 12, 2026 15:28
@nomeata
Collaborator Author

nomeata commented Mar 12, 2026

!radar

@leanprover-radar

leanprover-radar commented Mar 12, 2026

Benchmark results for b251f89 against e804829 are in. There are no significant changes. @nomeata

  • 🟥 build//instructions: +2.5G (+0.02%)

Small changes (3🟥)

  • 🟥 build/module/Lean.Meta.Sym.Simp.EvalGround//instructions: +257.8M (+1.13%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.String//instructions: +265.6M (+11.21%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.Util//instructions: +254.3M (+34.75%) (reduced significance based on *//lines)

mathlib-nightly-testing bot pushed a commit to leanprover-community/batteries that referenced this pull request Mar 12, 2026
mathlib-nightly-testing bot pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Mar 12, 2026
nomeata and others added 3 commits March 14, 2026 09:18

This PR optimizes the `String.reduceEq`, `String.reduceNe`, and `Sym.Simp`
string equality simprocs to produce kernel-efficient proofs. Previously, these
used `String.decEq` which forced the kernel to run UTF-8 encoding/decoding and
byte array comparison, causing 86+ kernel unfoldings on short strings.

The new approach uses `String.ofList_injective` combined with
`congrArg (List.get?Internal · i)` at the first differing character position,
reducing kernel work to O(first_diff_pos) list indexing plus a single character
comparison. For equal strings, `eq_true rfl` avoids kernel evaluation entirely.

The shared proof construction is in `Lean.Meta.mkStringLitNeProof`, used by both
the standard simprocs and the `Sym.Simp` ground evaluator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This PR further optimizes the kernel cost of string literal inequality proofs:

- When strings differ at a character position, use `List.get!Internal` instead
  of `List.get?Internal` to compare `Char` directly without the `Option` wrapper.

- When one string is a prefix of the other, use `List.drop` with
  `List.cons_ne_nil` instead of indexing past the end. This avoids `decide`
  entirely since `cons_ne_nil` is a definitional proof.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This PR replaces the `decide` step in string inequality character comparison
with `Char.toNat` projection followed by `Nat.ne_of_beq_eq_false rfl`. The
kernel now evaluates `Nat.beq` on two concrete natural numbers instead of
going through the `Decidable` instance for `Char` equality, reducing the
max kernel unfolds from 12 to 6 for typical string inequality proofs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nomeata nomeata force-pushed the joachim/string-neq-proc3 branch from b251f89 to 005e81b Compare March 14, 2026 09:18
@nomeata
Collaborator Author

nomeata commented Mar 14, 2026

!radar

@leanprover-radar

leanprover-radar commented Mar 14, 2026

Benchmark results for 005e81b against 47b3be0 are in. There are no significant changes. @nomeata

  • 🟥 build//instructions: +1.8G (+0.01%)

Small changes (6🟥)

  • 🟥 build/module/Lean.Meta.Sym.Simp.EvalGround//instructions: +232.5M (+1.06%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.String//instructions: +209.4M (+9.69%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.Util//instructions: +238.6M (+33.31%) (reduced significance based on *//lines)
  • 🟥 elab/big_omega//task-clock: +77ms (+4.49%)
  • 🟥 elab/big_omega//wall-clock: +77ms (+4.66%)
  • 🟥 elab/simp_congr//task-clock: +47ms (+6.13%)

@nomeata
Collaborator Author

nomeata commented Mar 14, 2026

!radar

@leanprover-radar

leanprover-radar commented Mar 14, 2026

Benchmark results for aaaf1dc against 47b3be0 are in. There are no significant changes. @nomeata

  • 🟥 build//instructions: +2.0G (+0.02%)

Small changes (3🟥)

  • 🟥 build/module/Lean.Meta.Sym.Simp.EvalGround//instructions: +230.7M (+1.05%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.String//instructions: +212.2M (+9.82%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.Util//instructions: +238.3M (+33.27%) (reduced significance based on *//lines)

@nomeata
Collaborator Author

nomeata commented Mar 14, 2026

!radar

@leanprover-radar

leanprover-radar commented Mar 14, 2026

Benchmark results for 3064709 against 47b3be0 are in. There are no significant changes. @nomeata

  • 🟥 build//instructions: +2.2G (+0.02%)

Small changes (3🟥)

  • 🟥 build/module/Lean.Meta.Sym.Simp.EvalGround//instructions: +240.8M (+1.10%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.String//instructions: +210.8M (+9.75%) (reduced significance based on *//lines)
  • 🟥 build/module/Lean.Meta.Tactic.Simp.BuiltinSimprocs.Util//instructions: +236.6M (+33.03%) (reduced significance based on *//lines)

@nomeata nomeata added this pull request to the merge queue Mar 14, 2026
Merged via the queue into master with commit c2d4079 Mar 14, 2026
15 checks passed
leanprover-bot added a commit to leanprover/reference-manual that referenced this pull request Mar 14, 2026
@leanprover-bot leanprover-bot added the builds-manual CI has verified that the Lean Language Reference builds against this PR label Mar 14, 2026
