perf: primitive double sort for numeric arrays in std.sort #766
Merged
stephenamar-db merged 1 commit into databricks:master on Apr 12, 2026
Replace Comparator-based java.util.Arrays.sort with primitive double sort (DualPivotQuicksort) for numeric arrays. Extracts doubles into primitive array, sorts with intrinsic DualPivotQuicksort, reconstructs Val.Num array via cachedNum. Eliminates Comparator virtual dispatch and Double boxing per comparison. Upstream: jit branch commit b1f64df
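A minimal sketch of the extract / sort / reconstruct shape described above. The `Value`, `Num`, and `Str` types and the `sortIfAllNumeric` name are hypothetical stand-ins for illustration, not the actual `Val` hierarchy or the `primitiveDoubleSort` implementation; the only real API used is `java.util.Arrays.sort(double[])`.

```scala
import java.util.Arrays

object PrimitiveSortSketch {
  // Hypothetical stand-ins for the interpreter's value types (not the real Val classes).
  sealed trait Value
  final case class Num(value: Double) extends Value
  final case class Str(value: String) extends Value

  /** Returns Some(sorted copy) when every element is a Num, otherwise None
    * so the caller can fall back to its generic comparator-based sort. */
  def sortIfAllNumeric(values: Array[Value]): Option[Array[Value]] = {
    val n = values.length
    val doubles = new Array[Double](n)
    var i = 0
    var allNumeric = true
    // Single O(n) pass: detect non-numeric elements while extracting the doubles.
    while (allNumeric && i < n) {
      values(i) match {
        case Num(d) =>
          doubles(i) = d
          i += 1
        case _ =>
          allNumeric = false
      }
    }
    if (!allNumeric) None
    else {
      // Primitive overload: sorts the double[] in place, no Comparator involved.
      Arrays.sort(doubles)
      // Reconstruct wrapper objects only once, after sorting.
      val out = new Array[Value](n)
      var j = 0
      while (j < n) {
        out(j) = Num(doubles(j))
        j += 1
      }
      Some(out)
    }
  }
}
```

With these stand-in types, an all-numeric input yields `Some` of a sorted copy, while an input containing a `Str` yields `None`, which is where a caller would take the existing generic path.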
He-Pin
commented
Apr 12, 2026
while (di < n) {
  strict(di) = Val.cachedNum(pos, doubles(di)); di += 1
}
} else if (keyType == classOf[Val.Arr]) {
Contributor
Author
Minor: Val.cachedNum(pos, doubles(di)) reconstructs Val.Num objects. This is the same total number of allocations as before, just deferred until after the sort. Consider whether Val.Num caching (e.g., for common values) could further reduce allocation here.
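For illustration only, one way a small-value cache along these lines could look. This is a hypothetical sketch: the `Num` class, the `cached` method, and the cache size of 256 are assumptions, the behavior of the existing `Val.cachedNum` is not shown in this PR, and the sketch ignores the position argument that the real call takes.

```scala
object NumCacheSketch {
  // Hypothetical boxed-number type; the real Val.Num is constructed with a position as well.
  final class Num(val value: Double)

  // Assumed cache size, chosen arbitrarily for the sketch.
  private val CacheSize = 256

  // Preallocated instances for the integers 0 until CacheSize.
  private val cache: Array[Num] =
    Array.tabulate(CacheSize)(i => new Num(i.toDouble))

  /** Returns a shared instance for small non-negative integers, otherwise allocates. */
  def cached(d: Double): Num = {
    val i = d.toInt
    // Compare raw bits so that -0.0 and NaN never hit the cached 0.0 entry.
    val exact =
      java.lang.Double.doubleToRawLongBits(i.toDouble) ==
        java.lang.Double.doubleToRawLongBits(d)
    if (i >= 0 && i < CacheSize && exact) cache(i)
    else new Num(d)
  }
}
```

Whether sharing instances like this is compatible with per-value position tracking is a separate question that the sketch does not address.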
stephenamar-db
approved these changes
Apr 12, 2026
Motivation
std.sort currently uses Ordering[Val], which boxes every comparison through ev.compare(Val, Val). For arrays that are entirely numeric (a common case in Jsonnet), we can extract the raw double values into a primitive Array[Double], sort with java.util.Arrays.sort, and reconstruct, avoiding boxing overhead during comparisons.
Key Design Decision
Add a fast path that detects all-numeric arrays at the start of std.sort. When every element is Val.Num, extract to double[], sort natively, and wrap back. This is O(n) detection + O(n log n) primitive sort vs O(n log n) boxed sort, so the detection cost is dominated by the sort itself.
Modification
SetModule.scala: Added primitiveDoubleSort method that:
- checks that every element is Val.Num
- extracts the values into an Array[Double] with index tracking
- sorts with java.util.Arrays.sort (dual-pivot quicksort on primitives)
Integrated into std.sort and std.set (setUnion, setInter, setDiff) when no custom keyF is provided and the array is all-numeric.
Benchmark Results
JMH (JVM, single iteration, lower is better)
Hyperfine (Scala Native vs jrsonnet, Apple Silicon)
Analysis
The biggest impact is on comparison2 under Scala Native, where the primitive sort avoids boxing overhead that the JVM JIT can optimize but Scala Native cannot. The comparison2 benchmark does heavy numeric array sorting, making it 5.73x faster than jrsonnet.
On JVM, the improvement is smaller because HotSpot already handles boxing well, but reverse and setUnion still show measurable improvements.
References
Ported from jit branch commit b1f64df (primitive double sort for numeric arrays).
Result
All 420 tests pass across JVM/JS/WASM/Native × Scala 3.3.7/2.13.18/2.12.21. Massive improvement on numeric sort workloads under Scala Native.