Skip to content

feat: Array.mergeSort#12385

Merged
datokrat merged 13 commits intomasterfrom
paul/array-mergeSort
Mar 6, 2026
Merged

feat: Array.mergeSort#12385
datokrat merged 13 commits intomasterfrom
paul/array-mergeSort

Conversation

@datokrat
Copy link
Copy Markdown
Contributor

@datokrat datokrat commented Feb 9, 2026

This PR implements a merge sort algorithm on arrays. It has been measured to be about twice as fast as List.mergeSort for large arrays with random elements, but for small or almost sorted ones, the list implementation is faster. Compared to Array.qsort, it is stable and has O(n log n) worst-case cost. Note: There is still a lot of potential for optimization. The current implementation allocates O(n log n) arrays, one per recursive call.

@datokrat datokrat added the changelog-library Library label Feb 9, 2026
@datokrat
Copy link
Copy Markdown
Contributor Author

datokrat commented Feb 9, 2026

!radar

@leanprover-radar
Copy link
Copy Markdown

leanprover-radar commented Feb 9, 2026

Benchmark results for b8e10de against 39c26fc are in! @datokrat

  • 🟥 build//instructions: +9.0G (+0.07%)

Medium changes (1🟥)

  • 🟥 size/init/.olean.private//bytes: +1MiB (+0.56%)

Small changes (1✅, 4🟥)

  • 🟥 binarytrees.st//instructions: +8.9M (+0.01%)
  • 🟥 build/module/Init.Data.Array//instructions: +17.1M (+3.81%)
  • build/module/Lean.Elab.Tactic.Grind.LintExceptions//instructions: -22.9M (-1.73%)
  • 🟥 iterators (compiled)//instructions: +627.5k (+0.13%)
  • 🟥 leanchecker --fresh Init//instructions: +1.5G (+0.46%)

@datokrat datokrat force-pushed the paul/array-mergeSort branch from b8e10de to 98f06d3 Compare February 9, 2026 14:41
@datokrat datokrat changed the base branch from master to paul/base/array-mergeSort February 9, 2026 14:42
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Feb 9, 2026
@leanprover-bot
Copy link
Copy Markdown
Collaborator

leanprover-bot commented Feb 9, 2026

Reference manual CI status:

  • ❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-02-08 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-02-09 15:57:40)
  • ❗ Reference manual CI will not be attempted unless your PR branches off the nightly-with-manual branch. Try git rebase df26bea7c1dc3a930dc67917f06a5dc814ddfaae --onto 75d7f7eb227bc54dc6ea3d8ead090ee4180debaf. You can force reference manual CI using the force-manual-ci label. (2026-02-10 14:08:06)
  • ❗ Reference manual CI will not be attempted unless your PR branches off the nightly-with-manual branch. Try git rebase 94deb8955649f4b3dba1e7aab60ae8a49cb7a31a --onto 03dc334f73259e1bae0b8f2b80a39a82e1de3df6. You can force reference manual CI using the force-manual-ci label. (2026-02-11 14:15:18)
  • ❗ Reference manual CI will not be attempted unless your PR branches off the nightly-with-manual branch. Try git rebase 94deb8955649f4b3dba1e7aab60ae8a49cb7a31a --onto bda15f6c25557f2b9e4062e640562fc9370e3069. You can force reference manual CI using the force-manual-ci label. (2026-02-16 14:36:28)
  • ❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-02-18 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-02-19 20:07:44)
  • ❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-03-02 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-03-02 10:52:12)

mathlib-nightly-testing bot pushed a commit to leanprover-community/batteries that referenced this pull request Feb 9, 2026
@github-actions github-actions bot added the mathlib4-nightly-available A branch for this PR exists at leanprover-community/mathlib4-nightly-testing:lean-pr-testing-NNNN label Feb 9, 2026
mathlib-nightly-testing bot pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Feb 9, 2026
@mathlib-lean-pr-testing mathlib-lean-pr-testing bot added the builds-mathlib CI has verified that Mathlib builds against this PR label Feb 9, 2026
@mathlib-lean-pr-testing
Copy link
Copy Markdown

mathlib-lean-pr-testing bot commented Feb 9, 2026

Mathlib CI status (docs):

  • ✅ Mathlib branch lean-pr-testing-12385 has successfully built against this PR. (2026-02-09 16:53:56) View Log
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase df26bea7c1dc3a930dc67917f06a5dc814ddfaae --onto 4cdc199f772977113bd6c1bf87a6c02b78d5c4d3. You can force Mathlib CI using the force-mathlib-ci label. (2026-02-10 14:08:03)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 94deb8955649f4b3dba1e7aab60ae8a49cb7a31a --onto 4cdc199f772977113bd6c1bf87a6c02b78d5c4d3. You can force Mathlib CI using the force-mathlib-ci label. (2026-02-11 14:15:16)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 94deb8955649f4b3dba1e7aab60ae8a49cb7a31a --onto 5c7a508e21c5ebc710b0dfe65648d5beba4e34e0. You can force Mathlib CI using the force-mathlib-ci label. (2026-02-16 14:36:27)
  • ❗ Mathlib CI can not be attempted yet, as the nightly-testing-2026-02-18 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. You can force Mathlib CI using the force-mathlib-ci label. (2026-02-19 20:07:42)
  • ❗ Mathlib CI can not be attempted yet, as the nightly-testing-2026-03-02 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. You can force Mathlib CI using the force-mathlib-ci label. (2026-03-02 10:52:10)

@datokrat datokrat force-pushed the paul/array-mergeSort branch from 98f06d3 to a483adf Compare February 10, 2026 13:01
@datokrat datokrat force-pushed the paul/base/array-mergeSort branch from 8e874ca to df26bea Compare February 10, 2026 13:01
@datokrat datokrat force-pushed the paul/array-mergeSort branch from a483adf to 71abf58 Compare February 11, 2026 13:09
@datokrat datokrat changed the base branch from paul/base/array-mergeSort to master February 14, 2026 21:36
@datokrat datokrat force-pushed the paul/array-mergeSort branch from 2442a3a to e9d2e41 Compare February 19, 2026 17:51
@datokrat datokrat marked this pull request as ready for review February 25, 2026 09:27
@datokrat datokrat requested a review from kim-em as a code owner February 25, 2026 09:27
datokrat and others added 11 commits March 2, 2026 10:57
Break down benchmark results by data pattern (reversed, sorted,
random, partially sorted) instead of reporting only aggregate time.

This reveals that the relative performance of Array.mergeSort vs
List.mergeSort depends heavily on data pattern, not element type:

| Pattern          | List winner? | Ratio       |
|------------------|--------------|-------------|
| Random           | no           | Array ~2.5x |
| Reversed         | yes          | List ~1.7x  |
| Sorted           | yes          | List ~1.8x  |
| All same         | yes          | List ~1.6x  |
| Partially sorted | yes          | List ~1.8x  |

The old aggregate-only benchmark was dominated by the fast
low-entropy cases, making List appear faster overall.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update bench.py and bench2.py to parse the new per-pattern output
from the Lean benchmark. bench2.py now generates per-pattern
visualizations (2x2 grid + ratio plot). bench.py updated to parse
the new aggregate format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@datokrat datokrat force-pushed the paul/array-mergeSort branch from 86be355 to 3be1241 Compare March 2, 2026 10:03
@datokrat datokrat added this pull request to the merge queue Mar 6, 2026
Merged via the queue into master with commit 68ea28c Mar 6, 2026
17 checks passed
sgraf812 pushed a commit that referenced this pull request Mar 9, 2026
This PR implements a merge sort algorithm on arrays. It has been
measured to be about twice as fast as `List.mergeSort` for large arrays
with random elements, but for small or almost sorted ones, the list
implementation is faster. Compared to `Array.qsort`, it is stable and
has O(n log n) worst-case cost. Note: There is still a lot of potential
for optimization. The current implementation allocates O(n log n)
arrays, one per recursive call.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

builds-mathlib CI has verified that Mathlib builds against this PR changelog-library Library mathlib4-nightly-available A branch for this PR exists at leanprover-community/mathlib4-nightly-testing:lean-pr-testing-NNNN toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants