
perf: avoid Tuple2 allocation in OP_%, visitLookup, and OP_in dispatch#692

Merged: stephenamar-db merged 1 commit into databricks:master from He-Pin:perf/tuple2-avoidance-plus-lookup on Apr 9, 2026

He-Pin commented on Apr 5, 2026

Motivation

Scala's tuple pattern matching ((visitExpr(a), visitExpr(b)) match { case (X, Y) => ... }) allocates a Tuple2 object on every dispatch. In hot evaluator paths like visitLookup, OP_%, OP_in, OP_==, and OP_!=, this can create millions of short-lived allocations per evaluation. On the JVM, the JIT often eliminates these via escape analysis, but on Scala Native the allocation is real.

Key Design Decision

Replace tuple-based pattern matching with nested match statements:

// Before: allocates Tuple2
(visitExpr(a), visitExpr(b)) match {
  case (X, Y) => ...
}

// After: zero allocation
val l = visitExpr(a)
val r = visitExpr(b)
l match {
  case X => r match { case Y => ... }
  ...
}
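As a minimal, self-contained sketch of the same idea (the Val, Num, and Str types and the add operation below are hypothetical stand-ins for illustration, not sjsonnet's actual classes), the two dispatch styles can be compared side by side:

```scala
object DispatchSketch {
  // Hypothetical miniature value hierarchy; the real evaluator's types are richer.
  sealed trait Val
  final case class Num(value: Double) extends Val
  final case class Str(value: String) extends Val

  // Tuple-based dispatch: a Tuple2 is built before matching,
  // unless the compiler or JIT manages to remove it.
  def addTuple(l: Val, r: Val): Val = (l, r) match {
    case (Num(a), Num(b)) => Num(a + b)
    case (Str(a), Str(b)) => Str(a + b)
    case _                => throw new IllegalArgumentException("unsupported operands")
  }

  // Nested dispatch: match on the left operand, then on the right one.
  // No intermediate Tuple2 is constructed.
  def addNested(l: Val, r: Val): Val = l match {
    case Num(a) =>
      r match {
        case Num(b) => Num(a + b)
        case _      => throw new IllegalArgumentException("unsupported operands")
      }
    case Str(a) =>
      r match {
        case Str(b) => Str(a + b)
        case _      => throw new IllegalArgumentException("unsupported operands")
      }
  }
}
```

Both functions take the same branch for the same inputs; the nested form repeats the fallback case per operand type, which is the cost of avoiding the tuple.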

Modification

sjsonnet/src/sjsonnet/Evaluator.scala:

  • visitLookup: Replaced tuple match on (value, index) with nested match. Checks lhs type first (Arr/Str/Obj), then rhs type (Num/Str).
  • OP_%: Replaced (l, r) match with nested match. Checks Num first (most common), then Str for format strings.
  • OP_in: Replaced (l, r) match with nested match.
  • OP_== / OP_!=: Replaced tuple match with nested match.

All error messages preserved exactly.
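The visitLookup change described above (left-hand type first, then right-hand type, with a fallback error per type) might look roughly like the following sketch. All names, types, and error strings here are hypothetical and simplified; they are not copied from Evaluator.scala.

```scala
object LookupSketch {
  // Hypothetical, simplified value types for illustration only.
  sealed trait Val
  final case class Num(value: Double) extends Val
  final case class Str(value: String) extends Val
  final case class Arr(items: Vector[Val]) extends Val
  final case class Obj(fields: Map[String, Val]) extends Val

  final class EvalError(msg: String) extends RuntimeException(msg)

  // Outer match: type of the indexed value. Inner match: type of the index.
  def lookup(value: Val, index: Val): Val = value match {
    case Arr(items) =>
      index match {
        case Num(i) =>
          val idx = i.toInt
          if (idx < 0 || idx >= items.length)
            throw new EvalError(s"array index $idx out of bounds")
          items(idx)
        case _ => throw new EvalError("array index must be a number")
      }
    case Str(s) =>
      index match {
        case Num(i) =>
          val idx = i.toInt
          if (idx < 0 || idx >= s.length)
            throw new EvalError(s"string index $idx out of bounds")
          // Simplification: indexes by UTF-16 code unit, not by code point.
          Str(s.charAt(idx).toString)
        case _ => throw new EvalError("string index must be a number")
      }
    case Obj(fields) =>
      index match {
        case Str(key) =>
          fields.getOrElse(key, throw new EvalError(s"field does not exist: $key"))
        case _ => throw new EvalError("object index must be a string")
      }
    case _ => throw new EvalError("value cannot be indexed")
  }
}
```

Because each outer case has its own inner fallback, every invalid combination can keep a type-specific error message without needing a tuple to describe the pair.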

Benchmark Results

JMH: Full Suite (35 benchmarks, @Fork(1) @Warmup(1) @Measurement(1))

| Benchmark | Baseline (ms) | This PR (ms) | Change |
| --- | --- | --- | --- |
| bench.02 | 38.008 | 35.650 | -6.2% |
| realistic2 | 76.840 | 70.214 | -8.6% |
| comparison | 20.185 | 20.009 | -0.9% |
| comparison2 | 35.769 | 36.579 | +2.3% (noise) |
| bench.04 | 0.474 | 0.474 | ±0% |
| foldl | 0.271 | 0.270 | ±0% |

No regressions beyond measurement noise across all 35 benchmarks.
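For reference, the Change column is consistent with the usual relative-difference formula, (this PR - baseline) / baseline. A small sketch that recomputes it from the reported timings (the Row class and helper names are made up for this example):

```scala
object ChangeSketch {
  // One benchmark row: name plus baseline and PR timings in milliseconds.
  final case class Row(name: String, baselineMs: Double, prMs: Double)

  // Relative change in percent; negative means the PR is faster than the baseline.
  def changePct(baselineMs: Double, prMs: Double): Double =
    (prMs - baselineMs) / baselineMs * 100.0

  // Timings copied from the benchmark results above.
  val rows: Seq[Row] = Seq(
    Row("bench.02", 38.008, 35.650),
    Row("realistic2", 76.840, 70.214),
    Row("comparison", 20.185, 20.009),
    Row("comparison2", 35.769, 36.579)
  )

  def main(args: Array[String]): Unit =
    rows.foreach { r =>
      // Prints the name left-aligned and the signed change with one decimal.
      println(f"${r.name}%-12s ${changePct(r.baselineMs, r.prMs)}%+.1f%%")
    }
}
```

With a single fork, warmup, and measurement iteration, differences of a few percent in either direction are hard to separate from run-to-run variance.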

Analysis

  • Functional equivalence: Nested match dispatches to exactly the same code as tuple match. Error messages preserved.
  • Allocation savings: Each avoided Tuple2 saves about 24 bytes on a 64-bit HotSpot JVM with compressed oops (12-byte object header + two 4-byte reference fields, padded to 8-byte alignment).
  • JVM vs Native: On the JVM, HotSpot's escape analysis often eliminates the Tuple2 allocation. On Scala Native, every Tuple2 is heap-allocated.
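The per-object size can be checked with simple arithmetic. The sketch below assumes typical 64-bit HotSpot layouts (a 12-byte header with 4-byte references when compressed oops are on, a 16-byte header with 8-byte references when they are off, and 8-byte object alignment); it says nothing about Scala Native's object layout, and the helper names are invented for this example.

```scala
object Tuple2SizeSketch {
  // Round a size up to the next multiple of the alignment.
  def alignUp(size: Int, alignment: Int): Int =
    (size + alignment - 1) / alignment * alignment

  // Shallow size of an object holding exactly two reference fields.
  def tuple2Bytes(headerBytes: Int, refBytes: Int, alignment: Int = 8): Int =
    alignUp(headerBytes + 2 * refBytes, alignment)

  // Assumed layout with compressed oops: 12 + 2 * 4 = 20, padded to 24.
  val withCompressedOops: Int = tuple2Bytes(headerBytes = 12, refBytes = 4)

  // Assumed layout without compressed oops: 16 + 2 * 8 = 32, already aligned.
  val withoutCompressedOops: Int = tuple2Bytes(headerBytes = 16, refBytes = 8)
}
```

These figures cover only the tuple itself; the operands it points to exist either way.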

References

  • Scala Tuple2 allocation overhead: well-documented in Scala performance guides
  • Pattern consistent with jrsonnet's enum-based dispatch (no allocations)

Result

Zero-allocation operator dispatch via nested match. No regressions. Eliminates Tuple2 allocation in hot evaluator paths.


// Nested match avoids Tuple2 allocation from (visitExpr(...), visitExpr(...)) match

He-Pin commented on Apr 5, 2026


The tuple match is fine on Scala 3.8.x but not on older Scala versions, so the nested match is needed.

He-Pin marked this pull request as ready for review on April 5, 2026, 12:30
He-Pin force-pushed the perf/tuple2-avoidance-plus-lookup branch 2 times, most recently from e3121ae to 5e9c296, on April 9, 2026, 00:56
Rewrite OP_%, visitLookup, and OP_in from tuple pattern match to nested
instanceof match, eliminating Tuple2 allocation per binary operation.
This follows the same pattern used by comparison operators.

OP_+ is intentionally left unchanged as controlled benchmarks showed
its more complex nested dispatch caused JIT regression on object-heavy
workloads.

Changes:
- visitLookup: nested match with per-type fallback error messages
- OP_%: nested match using case class extraction for raw doubles
- OP_in: nested match for (Str, Obj) dispatch

Upstream: jit branch commits 2530390, 671454c
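
The OP_% and OP_in changes listed in the commit message could be sketched along these lines. This is an illustration under stated assumptions: the value types, the error strings, and the naive "%s" substitution standing in for real format-string handling are all hypothetical and much simpler than the actual evaluator.

```scala
object OperatorSketch {
  // Hypothetical, simplified value types for illustration only.
  sealed trait Val
  final case class Num(value: Double) extends Val
  final case class Str(value: String) extends Val
  final case class Obj(fields: Map[String, Val]) extends Val

  // Stand-in renderer used by the toy formatter below.
  private def render(v: Val): String = v match {
    case Num(d) => d.toString
    case Str(s) => s
    case Obj(_) => "{...}"
  }

  // OP_%: check Num first (numeric modulo), then Str (format string).
  // Case class extraction binds the raw doubles directly.
  def opMod(l: Val, r: Val): Val = l match {
    case Num(a) =>
      r match {
        case Num(b) => Num(a % b)
        case _      => throw new IllegalArgumentException("right operand of % must be a number")
      }
    case Str(fmt) =>
      // Toy formatter: replaces every "%s" with the rendered right operand.
      Str(fmt.replace("%s", render(r)))
    case _ => throw new IllegalArgumentException("left operand of % must be a number or string")
  }

  // OP_in: only the (Str, Obj) combination is valid.
  // Returns a plain Boolean here to keep the sketch short.
  def opIn(l: Val, r: Val): Boolean = l match {
    case Str(key) =>
      r match {
        case Obj(fields) => fields.contains(key)
        case _           => throw new IllegalArgumentException("right operand of in must be an object")
      }
    case _ => throw new IllegalArgumentException("left operand of in must be a string")
  }
}
```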
stephenamar-db merged commit a89b373 into databricks:master on Apr 9, 2026