perf: parse strict JSON imports from bytes #854

Open
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/strict-json-byte-imports

@He-Pin He-Pin commented May 13, 2026

Motivation:

PR #840 added a strict JSON import fast path, but cached file imports still decoded every small file from UTF-8 bytes into a String before strict .json parsing. Real-world Jsonnet projects such as kube-prometheus import many JSON files, so decoding bytes to text only to parse that text as JSON adds avoidable allocation and CPU cost.

Key Design Decision:

Use ujson.ByteArrayParser for strict .json imports and keep small cached resolved files byte-backed until text is explicitly needed by importstr or Fastparse. This keeps the JSON path byte-native while preserving existing text behavior for Jsonnet source parsing.
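As a rough illustration of the difference between the two paths, the sketch below parses the same UTF-8 payload once through a decoded String and once directly from the byte array. It assumes ujson is on the classpath and uses ujson.Value as a stand-in visitor; the PR passes sjsonnet's own visitor instead, and the object and method names here are hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical helper, not part of sjsonnet: compares the text and byte parse paths.
object StrictJsonParseSketch {
  // Text path: allocate a String from the bytes, then parse the String.
  def parseViaString(bytes: Array[Byte]): ujson.Value =
    ujson.StringParser.transform(new String(bytes, UTF_8), ujson.Value)

  // Byte path: parse the UTF-8 bytes directly, with no intermediate String.
  def parseViaBytes(bytes: Array[Byte]): ujson.Value =
    ujson.ByteArrayParser.transform(bytes, ujson.Value)

  def main(args: Array[String]): Unit = {
    val bytes = """{"name":"kube-prometheus","replicas":3}""".getBytes(UTF_8)
    // Both paths should build the same JSON tree.
    assert(parseViaString(bytes) == parseViaBytes(bytes))
    println(parseViaBytes(bytes)("name").str)
  }
}
```

Both calls feed the same visitor, so the only intended difference is whether a String is materialized before parsing.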

Modification:

  • CachedResolvedFile now caches small resolved files as Array[Byte], lazily materializes text for getParserInput / readString, and returns cached bytes directly for readRawBytes.
  • CachedResolver.parseJsonImport now parses strict JSON imports with ujson.ByteArrayParser.transform(content.readRawBytes(), visitor).
  • PreloaderTests now assert strict JSON preload does not fall back to Fastparse or decode via readString.
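The byte-backed caching described in the list above could look roughly like the following. This is a minimal sketch, not the actual CachedResolvedFile: the class name ByteBackedFile and the decodeCount counter are hypothetical and exist only to show that text is decoded at most once, and only when asked for.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical stand-in for a cached resolved file that keeps bytes as the primary form.
final class ByteBackedFile(bytes: Array[Byte]) {
  // Demo-only instrumentation: counts how many times the bytes were decoded to text.
  var decodeCount: Int = 0

  // Decoded on first access and then reused (Scala lazy vals initialize once).
  private lazy val text: String = {
    decodeCount += 1
    new String(bytes, UTF_8)
  }

  // JSON path: hand back the cached bytes without copying or decoding.
  def readRawBytes(): Array[Byte] = bytes

  // importstr / source-parser path: materialize text only when needed.
  def readString(): String = text
}

object ByteBackedFileDemo {
  def main(args: Array[String]): Unit = {
    val file = new ByteBackedFile("""{"k":"v"}""".getBytes(UTF_8))
    file.readRawBytes()
    assert(file.decodeCount == 0) // bytes alone never trigger a decode
    file.readString()
    file.readString()
    assert(file.decodeCount == 1) // text is decoded once, then cached
  }
}
```

A test in the spirit of the PreloaderTests change would assert the first condition: reading a strict JSON import should leave the decode count at zero.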

Result:

Output equality was preserved against upstream sjsonnet and source-built jrsonnet:

  • kube-prometheus realworld output: 7,506,029 bytes, byte-identical.
  • large_string_template guard output: 549,674 bytes, byte-identical.

Benchmark Results:

Native whole-process kube-prometheus, command run from jrsonnet/tests/realworld with -J vendor, hyperfine --shell=none --warmup 5 --runs 50:

| order | upstream/master | this PR | jrsonnet |
| --- | --- | --- | --- |
| forward | 143.351 +/- 2.556 ms | 135.806 +/- 3.388 ms | 91.752 +/- 7.863 ms |
| reverse | 145.082 +/- 2.988 ms | 136.392 +/- 2.034 ms | 91.365 +/- 1.419 ms |

Native guard, bench/resources/cpp_suite/large_string_template.jsonnet:

| upstream/master | this PR | jrsonnet |
| --- | --- | --- |
| 11.036 +/- 0.807 ms | 10.762 +/- 0.650 ms | 5.204 +/- 0.511 ms |

JMH guard, bench.runRegressions:

| benchmark | upstream/master | this PR |
| --- | --- | --- |
| cpp_suite/large_string_template.jsonnet | 0.757 ms/op | 0.691 ms/op |
| go_suite/manifestJsonEx.jsonnet | 0.055 ms/op | 0.054 ms/op |

Analysis:

The target workload improves by about 5-6% in both command orders. The remaining gap to jrsonnet is not eliminated, but this change removes one clear extra decode from strict JSON import handling without regressing the non-target renderer guard or JMH guards.
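For reference, the 5-6% figure can be reproduced from the hyperfine means reported for kube-prometheus. The helper below is a small illustrative calculation with hypothetical names; it compares means only and ignores the reported standard deviations.

```scala
// Hypothetical helper: relative reduction in mean wall time, in percent.
object Speedup {
  def reductionPct(baselineMs: Double, candidateMs: Double): Double =
    (baselineMs - candidateMs) / baselineMs * 100.0

  def main(args: Array[String]): Unit = {
    // Means taken from the native kube-prometheus measurements in this PR.
    println(f"forward: ${reductionPct(143.351, 135.806)}%.2f%%")
    println(f"reverse: ${reductionPct(145.082, 136.392)}%.2f%%")
  }
}
```

With those inputs the forward order works out to roughly 5.3% and the reverse order to roughly 6.0%.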

Validation:

  • ./mill --no-server --ticker false --color false -j 1 __.checkFormat
  • ./mill --no-server --ticker false --color false -j 1 __.test (Tests: 440, Passed: 440, Failed: 0)

References:

Follow-up to #840

Motivation:
PR databricks#840 introduced a strict JSON fast path for .json imports but still
forces a full UTF-8 string decode for every cached file before handing
the text to ujson.StringParser. Real-world workloads (e.g. kube-prometheus)
import many .json files; decoding each one twice (once into String for
parsing, again as cache content) is pure overhead.

Key Design Decision:
ujson 4.4.3 ships ByteArrayParser, which parses UTF-8 JSON directly from
a byte array without an intermediate String. Cache small resolved files
as raw bytes (already what we read from disk) and lazily decode text
only when the importstr/parser-input path actually needs it. Preserve
parse-cache content identity by hashing the cached bytes with SHA-256
(length + hex digest) so external ParseCache implementations keep the
same collision resistance as the old full-string key.
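
A content key of the kind described above (length plus SHA-256 hex digest) might be
computed as in this sketch. The object name and the ":" separator are assumptions for
illustration; the commit message does not specify the exact key format.

```scala
import java.security.MessageDigest

// Hypothetical helper, not the actual sjsonnet code.
object ContentHashSketch {
  // Builds "<byte length>:<lowercase SHA-256 hex>" from the cached bytes.
  // The ":" separator is an assumption; only "length + hex digest" is stated.
  def contentHash(bytes: Array[Byte]): String = {
    val digest = MessageDigest.getInstance("SHA-256").digest(bytes)
    // Mask each byte to 0..255 so negative bytes format as two hex digits.
    val hex = digest.map(b => f"${b & 0xff}%02x").mkString
    s"${bytes.length}:$hex"
  }
}
```

Hashing the bytes avoids decoding the file just to build a cache key, which keeps the
JSON path free of String materialization.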

Modification:
* Importer.scala: CachedResolver.parseJsonImport now calls
  ujson.ByteArrayParser.transform(content.readRawBytes(), visitor)
  instead of decoding the whole file to String first.
* CachedResolvedFile.scala (JVM/Native): small files are cached as
  Array[Byte]; getParserInput / readString materialize the String
  lazily; readRawBytes returns the cached bytes directly; contentHash
  is length + SHA-256 over the cached bytes; binary imports still use
  StaticBinaryResolvedFile.
* PreloaderTests.scala: tighten the strict-JSON fast-path coverage so
  it fails if the fast path ever falls back to readString().

Result:
* Output equality vs upstream sjsonnet and jrsonnet preserved on
  kube-prometheus and large_string_template.
* Native kube-prometheus hyperfine A/B (forward & reverse):
  forward: clean 139.4 +/- 2.8 ms -> candidate 132.7 +/- 1.9 ms
  reverse: clean 140.3 +/- 2.6 ms -> candidate 132.1 +/- 1.9 ms
* Full ./mill __.test green.

References:
Follow-up to databricks#840
@He-Pin He-Pin marked this pull request as ready for review May 13, 2026 10:23