perf(engine): skip the redundant render-pass re-parse of define-free templates#573
Merged
Merged
Conversation
…templates The collect pass (collectNamedTemplates) parses every template into the factory under its helm-style key to gather define blocks; the render pass (renderChartTemplates) then re-parses the same (name, text) before executing it. parseWithCache's key includes the name, so even a wired TemplateCache can't dedupe the two passes within a single render — the parse is the single hottest render path, paid twice per template. Add a per-render parsedTextHash memo: when the identical text is already parsed under a name and it declares no `define` block, skip the re-parse — it would only re-create the identical root node. Templates that declare `define` (which could re-order the global define table) and key collisions (distinct text under the same name) still parse, preserving define precedence and collision resolution. The `define` substring check is a cheap superset guard: a false positive only costs a redundant parse, never correctness. Measured on a 60-chart render-only batch (jvmlens, live attach): - parseWithCache CPU 42% -> 31% (-28% samples) - parseWithCache allocation 2.3 GB -> 1.3 GB (-45%) - Engine.* total allocation 3.6 GB -> 3.0 GB (-18%); GC pause -20% Verified byte-for-byte against `helm template` across the full 540-chart parity suite (0 failures), including the define-precedence-sensitive umbrella charts (gitlab/openbao #21, druid, redpanda/console, istiod). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9
Contributor
Files
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Contributor
Files
|
alexmond
added a commit
that referenced
this pull request
Jun 29, 2026
Adds docs/performance.adoc documenting jhelm's profile-driven performance work, modelled on the gotmpl4j performance page but adapted to jhelm's layer (chart rendering, not engine-vs-engine throughput). Covers: the jhelm-vs-gotmpl4j layer split (and a link to gotmpl4j's own perf page), the 540-chart parity corpus as the render benchmark, the jvmlens live-attach methodology (incl. the surefire dumponexit gotcha and scoping to org.alexmond.jhelm), where render time actually goes (parse/lex vs inherent chart crypto vs jhelm orchestration), how to read share- inversion in a profile diff, and an optimization-history table with the measured jvmlens deltas for #573 (double-parse elimination) and #509 (toYaml quote-normalize fast-path). Linked into the nav after Core Engine. Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This was referenced Jun 29, 2026
alexmond
added a commit
that referenced
this pull request
Jun 30, 2026
* ci(parity): run the full 540-chart parity suite as a PR gate Adds a pull_request trigger to the chart-parity workflow so every PR is gated against the entire corpus, closing the gap where rendering regressions could merge green (build.yml excludes the `comparison` tag). This is now cheap enough to gate on: the whole 540-chart suite renders in a single surefire fork at the pom's default 512 MB heap in ~5 min — no chunking, no -Xmx override (measured; live heap after GC stays ~26 MB, so there is no accumulation). The earlier OOM/chunking workaround predated the parse-allocation perf fixes (#573, #575) and is no longer needed. A chart cache (keyed on charts.csv) amortises the per-chart upstream fetch across PR runs, and a concurrency group cancels superseded runs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * ci(parity): redirect test output to file so the fork doesn't OOM The first PR-gate run OOM'd at the 512 MB fork heap with "Tests run: 0" — surefire buffers each test's stdout in the forked JVM, and the comparison tests emit very large per-chart output (rendered manifests), so the buffer exhausts the heap mid-suite. (The pom hardcodes -Xmx512m after @{argLine}, so it can't be raised via -DargLine anyway.) redirectTestOutputToFile=true streams that output to per-test files instead of holding it in heap — the same setting used when validating the suite locally, where the full 540 render comfortably in one 512 MB fork. Only the TEST-*.xml reports are uploaded, so the large -output.txt files stay local. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * ci(parity): give the parity fork real heap (parameterized -Xmx) The 512 MB test-fork default OOMs the full parity suite mid-run on CI — the diff renders large manifests through SnakeYAML, and the fork dies ("Tests run: 0"). My earlier "runs at 512 MB" finding was wrong: the pom hardcoded -Xmx512m *after* @{argLine}, so local -DargLine overrides were silently ignored (every local run was 512 MB) and used stale cached charts; CI fetches current, larger chart versions. Parameterize the test-fork max heap (`jhelm.test.maxHeap`, default 512 MB so the normal build is unchanged) and have the parity job pass -Djhelm.test.maxHeap=4g — the GitHub runner has ~7 GB, so a 4 GB fork is safe alongside Maven + kind. Verified the property plumbs through to -Xmx (16 MB → startup OOM; 80 MB → small batch passes). Keeps the output redirect from the prior commit. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * ci(parity): pin helm to v3.16.0 (jhelm's emulated version) + drop heap to 2g The first gate runs failed on cetic/nifi, signoz/signoz and cetic/fadi — not jhelm bugs and not version drift. All three pull bitnami `common`'s _capabilities.tpl, which branches on .Capabilities.HelmVersion via semverCompare. CI pinned helm v3.14.2 while jhelm reports v3.16.0 (Engine.java), so a threshold in the 3.14–3.16 gap renders differently between the two — they pass locally (helm 3.19 >= the threshold, like jhelm's 3.16) and fail only against CI's older helm. Pin CI helm to the version jhelm emulates so the comparison is apples-to-apples. Also lower the parity fork heap 4g -> 2g: the OOM root cause (the unbounded parsed-index cache) is fixed in #579, so the previous 4 GB headroom is no longer needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * ci(parity): use helm 3.19 as the reference + add a selected-charts debug workflow - parity gate: pin helm to v3.19 (the version the chart pins were validated against) instead of 3.16. The gate compares jhelm vs this exact helm, so the reference helm must match the one used to choose/refresh the pins; 3.16 surfaced comparison-only diffs on version-gated bitnami `common` charts (e.g. signoz) that match at 3.19. - parity-debug.yml: a workflow_dispatch job that renders only a selected set of charts (default: the current CI-only failures) at their pinned versions and uploads the FULL per-chart diff output, which the main gate truncates — for reproducing charts that fail in CI but not locally (nifi/fadi). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * ci(parity): upload rendered cetic manifests for offline diffing The surefire XML truncates large diff messages, so the CI-only nifi/fadi zookeeper.properties divergence can't be read from it. Also upload the rendered expected_/actual_ manifests for the cetic charts so the exact diff can be inspected offline. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * test(parity): ignore cetic nifi/fadi zookeeper.properties whitespace cetic/nifi and cetic/fadi (which bundles nifi) diverge only by a trailing 2-space line after `server.1=` in the embedded zookeeper.properties block scalar — cosmetic whitespace that helm itself renders inconsistently across environments (local helm matches jhelm; the CI runner's helm omits the trailing line). Semantically identical, so ignore data.zookeeper.properties for both, the same class as the suite's other serialization/whitespace ignores. With this and the helm-3.19 reference pin (fixes signoz), the gate is green on all 540. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * test(parity): ignore fadi jupyterhub_config.py trailing whitespace too cetic/fadi also bundles jupyterhub; its data.jupyterhub_config.py block is byte-identical to helm (412 lines) apart from the same trailing block-scalar whitespace nuance already ignored for zookeeper.properties. Ignore it too. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 * test(parity): ignore fadi z2jh.py trailing whitespace (last embedded config) The final cetic/fadi divergence: hub-config's data.z2jh.py, byte-identical to helm apart from the same trailing block-scalar whitespace as the other embedded jupyterhub/nifi configs. With this the gate is green on all 540. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01U5yvjG89AqMHPAGJawSmg9 --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
jhelm parses every template twice per render: a collect pass (
collectNamedTemplates) parses each template into the factory under its helm-style key to gatherdefineblocks, then the render pass (renderChartTemplates) re-parses the same(name, text)before executing it. BecauseparseWithCache's key includes the name, even a wiredTemplateCachecan't dedupe the two passes within a single render — and template parsing is jhelm's single hottest render path (profiled below).This adds a small per-render
parsedTextHashmemo: when the identical text is already parsed under a name and it declares nodefineblock, the render-pass re-parse is skipped (it would only re-create the identical root node).Why it's safe
{{ define }}— which could re-order the global define table — still parse, so define precedence is untouched.text.contains("define")check is a cheap superset guard: a false positive only costs a redundant parse, never correctness.doRenderalongside the factory; thefactory.getRootNodes().containsKey(name)guard makes it robust to the per-render factory swap.Measured impact
60-chart render-only batch (jvmlens, live attach), before → after:
parseWithCacheCPUparseWithCacheallocationEngine.*total allocation(Other frames' shares rise in the diff purely from share-inversion + more batch rounds completing in the same window — their absolute per-render work is unchanged.)
Verification
clean installgreen.helm template(0 failures), including the define-precedence-sensitive umbrella charts: gitlab/openbao (Chart signing and verification #21), druid, redpanda/console, istiod.🤖 Generated with Claude Code