perf: fast-path strict json imports #840

Open
He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/strict-json-import-fast-path

He-Pin commented May 11, 2026

Motivation:
Real-world workloads such as kube-prometheus import large strict JSON files. Parsing those files through the full Jsonnet parser creates avoidable parser, AST, and manifestation work.

Key Design Decision:
The fast path only accepts strict .json input and falls back to normal Jsonnet parsing for malformed JSON, duplicate keys, non-finite numbers, parser-depth overflow, and defensive numeric parse failures. Imported JSON objects use a race-free inline-array Val.Obj layout: field caching is disabled for parse-cache-shared literals, and single-field objects avoid lazy value0 mutation. JSON value positions reuse fileScope.noOffsetPos because strict JSON imports do not execute code and per-element positions are not consumed for stack traces.
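The routing rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not sjsonnet's code: the `Json` model, `isStrict`, and `route` are hypothetical names, and only the duplicate-key, non-finite-number, and malformed-input cases are modelled (depth overflow and numeric parse failures are omitted).

```scala
object StrictJsonFastPath {
  // Hypothetical, simplified JSON model; sjsonnet's real value types differ.
  sealed trait Json
  final case class Num(value: Double) extends Json
  final case class Str(value: String) extends Json
  final case class Arr(items: Vector[Json]) extends Json
  final case class Obj(fields: Vector[(String, Json)]) extends Json

  /** True only if the tree has no duplicate keys and no non-finite numbers. */
  def isStrict(j: Json): Boolean = j match {
    case Num(v)      => java.lang.Double.isFinite(v)
    case Str(_)      => true
    case Arr(items)  => items.forall(isStrict)
    case Obj(fields) =>
      fields.map(_._1).distinct.size == fields.size &&
        fields.forall(f => isStrict(f._2))
  }

  sealed trait Route
  case object FastPath extends Route
  case object FullParser extends Route

  /** `parsed` is None when the strict parser rejected the text as malformed. */
  def route(path: String, parsed: Option[Json]): Route = parsed match {
    case Some(j) if path.endsWith(".json") && isStrict(j) => FastPath
    case _                                                => FullParser
  }
}
```

Any input the strict check rejects is handed to the normal parser, so the fast path can only change how an import is parsed, never whether it is accepted.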

Modification:
Add a shared strict JSON import visitor in CachedResolver, wire it into Preloader, trim duplicate-key/string visitor work, use inline object layout for imported JSON objects, and add shared-cache/concurrency regression tests.
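To illustrate why an inline-array layout without field caching is safe to share, here is a hedged sketch of an immutable object built from parallel key and value arrays. `InlineObj` is a hypothetical stand-in and does not reflect the actual `Val.Obj` implementation; the point is only that lookups never write to shared state.

```scala
object InlineObjectSketch {
  /** Hypothetical immutable object: parallel arrays, no lazy or cached fields. */
  final class InlineObj private (keys: Array[String], values: Array[Any]) {
    /** Linear scan over the key array; reads only, so concurrent use is safe. */
    def get(key: String): Option[Any] = {
      val i = keys.indexOf(key)
      if (i >= 0) Some(values(i)) else None
    }

    def size: Int = keys.length
  }

  object InlineObj {
    /** Copies the fields into private arrays that are never mutated afterwards. */
    def apply(fields: (String, Any)*): InlineObj =
      new InlineObj(fields.map(_._1).toArray, fields.map(_._2).toArray)
  }
}
```

Because nothing is memoized on first access, two threads that receive the same cached literal cannot observe a half-initialized field.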

Benchmark Results:

| Benchmark | master | PR #840 | Delta | Notes |
| --- | --- | --- | --- | --- |
| Scala Native hyperfine kube-prometheus (entry-kube-prometheus.jsonnet -J vendor) | 223.351 +/- 12.742 ms | 139.041 +/- 2.847 ms | -37.75% | Output matched master. |
| Scala Native hyperfine stacked exploration after JSON refinements | ~235 ms starting stack | 135.9 +/- 1.2 ms | ~ -42% cumulative | Source-built jrsonnet comparison: 88.087 +/- 1.737 ms, remaining gap 1.54x. |
| Focused JMH guards | | manifestJsonEx 0.053, realistic2 39.485, large_string_template 1.102, gen_big_object 0.801 | stable | From final position-reuse exploration run. |
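For readers checking the arithmetic, the reported delta and remaining gap can be reproduced from the mean timings quoted in the benchmark results. The helper names below are illustrative only; the inputs are the means from this PR description.

```scala
object BenchDelta {
  /** Relative change of `after` versus `before`, in percent (negative means faster). */
  def deltaPercent(before: Double, after: Double): Double =
    (after - before) / before * 100.0

  /** How many times slower `ours` is than `reference`. */
  def gap(ours: Double, reference: Double): Double = ours / reference

  def main(args: Array[String]): Unit = {
    // master vs PR #840 on kube-prometheus (hyperfine means, in ms)
    println(deltaPercent(223.351, 139.041))
    // stacked exploration vs source-built jrsonnet (hyperfine means, in ms)
    println(gap(135.9, 88.087))
  }
}
```

Rounded to two decimals, these give the -37.75% and 1.54x figures quoted in the results.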

Analysis:
The PR keeps Jsonnet compatibility by treating the JSON parser as an optional strict fast path, not a replacement parser. The added JVM-only concurrency tests exercise shared parse-cache reuse; cross-platform JSON import tests cover the platform-neutral behavior.
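A shared parse-cache concurrency check of the kind mentioned above might look roughly like this on the JVM. It is a sketch only: `ParseCache` and `demo` are hypothetical, and the cached "parse result" is a plain immutable `Vector` rather than an sjsonnet value.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object SharedParseCacheSketch {
  /** Hypothetical cache: each path is parsed at most once and then shared. */
  final class ParseCache[V](parse: String => V) {
    private val cache = new ConcurrentHashMap[String, V]()
    val parses = new AtomicInteger(0)

    def get(path: String): V =
      cache.computeIfAbsent(path, new java.util.function.Function[String, V] {
        def apply(p: String): V = { parses.incrementAndGet(); parse(p) }
      })
  }

  /** Reads one cached entry from `threads` tasks; returns (parse count, sum of reads). */
  def demo(threads: Int): (Int, Int) = {
    val cache = new ParseCache[Vector[Int]](_ => Vector(1, 2, 3))
    val pool  = Executors.newFixedThreadPool(threads)
    val sums  = new AtomicInteger(0)
    (1 to threads).foreach { _ =>
      pool.execute(new Runnable {
        def run(): Unit = { sums.addAndGet(cache.get("data.json").sum); () }
      })
    }
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    (cache.parses.get(), sums.get())
  }
}
```

With an immutable cached value, every reader sees the same contents and the parse count stays at one regardless of how many threads request the entry.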

References:
Source exploration commits: He-Pin/sjsonnet 995b0822, 114bd69f, 52e22696, ca6f24d5.

Result:
Local ./mill --no-server -j 1 __.reformat and ./mill --no-server -j 1 __.test passed on this split branch (2066/2066).

Motivation:
Kube-prometheus imports large strict JSON files; parsing them through the full Jsonnet parser creates avoidable AST and materialization work.

Modification:
Add a strict .json import fast path shared by CachedResolver and Preloader, trim visitor work, build race-free inline-array objects for imported JSON, and reuse the no-offset position for imported literals.

Result:
Strict JSON imports keep Jsonnet fallback semantics for non-strict inputs while reducing parse and manifestation overhead for large imported JSON data.
He-Pin marked this pull request as ready for review May 11, 2026 09:36
Motivation:
Preloaded importstr and importbin entries for the same path were kept separate in Preloader, but CachedImporter still keyed reads by Path alone. Interpreter evaluation could then reuse a file resolved as text for a binary read, or the reverse.

Modification:
Key CachedImporter entries by both Path and binaryData, update the top-level source cache insertion to use the text key, and add interpreter-level regression coverage for both import orders.

Result:
Preloaded text and binary imports for the same path remain independent through normal Interpreter evaluation.
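The keying change can be pictured with a small sketch. The types below (`CacheKey`, `Resolved`, `Cache`) are hypothetical simplifications and not the actual CachedImporter API; they only show why the read mode needs to be part of the key.

```scala
import scala.collection.mutable

object ImportCacheSketch {
  /** Composite key: the same path can be cached once as text and once as binary. */
  final case class CacheKey(path: String, binaryData: Boolean)

  sealed trait Resolved
  final case class Text(content: String) extends Resolved
  final case class Bytes(content: Vector[Byte]) extends Resolved

  /** Hypothetical cache wrapping a reader function (path, binaryData) => Resolved. */
  final class Cache(read: (String, Boolean) => Resolved) {
    private val entries = mutable.HashMap.empty[CacheKey, Resolved]

    def get(path: String, binaryData: Boolean): Resolved =
      entries.getOrElseUpdate(CacheKey(path, binaryData), read(path, binaryData))
  }
}
```

If the key were the path alone, whichever read happened first would be returned for both modes, which is the mix-up this commit describes.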

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>