perf(es): lazy-load ES schemas per endpoint to cut startup heap (#171)#218
Conversation
✅MegaLinter analysis: Success
See detailed reports in MegaLinter artifacts MegaLinter is graciously provided by OX Security |
Binary size impactFlagging this up-front: the memory win comes with a real package-size regression that deserves review attention. Measurements (clean builds,
|
| Metric | Main | PR | Multiplier |
|---|---|---|---|
dist/ total |
23 MB | 152 MB | 6.6× |
| npm tarball (compressed) | 2.4 MB | 21.9 MB | 9.1× |
| npm package unpacked | 25.7 MB | 196.1 MB | 7.6× |
Where the bytes live in dist/ (PR branch):
| File type | Main | PR |
|---|---|---|
.js |
3 MB | 35 MB |
.d.ts |
15 MB | 72 MB |
.js.map |
2 MB | 26 MB |
.d.ts.map |
1 MB | 17 MB |
Nearly all of it (146 MB of the 152 MB dist/) is src/es/apis/schemas/. 588 per-endpoint files, each with its transitive type closure inlined and compiled once per file. Source maps and declaration maps balloon disproportionately because Zod's generic types expand into a lot of detail.
Mitigation options, sized
| Strategy | Unpacked | Complexity |
|---|---|---|
Current (ships src/ + full dist/) |
196 MB | baseline |
Drop src/ from files[] in package.json |
152 MB | one line |
Also drop *.map files from the package |
108 MB | npmignore / files glob |
Also drop *.d.ts* under dist/es/apis/schemas/ |
40 MB | postbuild rm. No public-API loss - schemas carry @ts-nocheck so their external types are already any |
Drop all .d.ts* from dist/ |
36 MB | loses public-API types |
The 40 MB target is the natural stopping point: public-API types preserved, internal schema artefacts stripped. It extends the "schemas are codegen, consumers don't introspect their types" stance that @ts-nocheck already assumes.
Proposed handling
Not on this PR, unless reviewers disagree. #171 is framed as a memory issue; the size regression is a side effect that deserves its own review cycle and its own commit history. Plan:
- Merge this PR as-is. The memory fix stands on its own.
- Open a follow-up with:
- Strip
src/fromfiles[]. - Add a postbuild step (
rm dist/es/apis/schemas/*.d.ts dist/es/apis/schemas/*.d.ts.map dist/es/apis/schemas/*.js.map). - Result: 196 MB → 40 MB unpacked, 21.9 MB → ~5 MB tarball. Still ~2× over today, but the rest is genuinely new content (294 inlined closures) that only Option A (ajv-standalone from elastic/elastic-client-generator-js#164) would compress further.
- Strip
Happy to fold the mitigation into this PR as a third commit if reviewers want the size win in the same merge. It would delay merge slightly because changing files[] needs checking against supply-chain scanners, npm view tooling, and GitHub's Dependency Graph, which I haven't audited.
763feb0 to
45caf4f
Compare
Let's just open an issue for package size mitigation. It's a real concern, but you've already mapped a clear path to solving it and reviewing separately will be much easier. |
Thanks for the input Josh! I'll get this tidied up and ready to merge first thing tomorrow |
The `src/es/apis.ts` barrel statically imported every namespace file, which transitively pulled in all ~294 Zod request schemas on any `elastic stack es ...` invocation (even `--help`). Issue #171 measured this as a 373 MB eager heap floor and `elastic stack es --help` at 599 MB RSS. Short-lived CLI processes tolerated it, but `docs chat`, agent-loop invocations, and any future long-running mode cannot. Switch to per-endpoint lazy dispatch. The regenerated code this commit relies on (per-endpoint Zod schemas from elastic/elastic-client-generator-js#164, per-endpoint api files, the manifest) ships in the following commit so this one stays review-able as an architecture change. Wiring changes: - `src/es/apis.ts` is rewritten as a lazy loader. It re-exports the cheap `apiManifest` (metadata only, no Zod refs) and adds `loadEsApi(meta)` / `loadEsApisInFile(stem)` / `loadAllEsApis()` for dynamic-import on demand. - `src/es/register.ts` splits into: - `registerEsCommands(definitions)` - synchronous, takes an explicit list of `EsApiDefinition`. Used by tests (argument stays required, no default-to-`allApis` any more). - `registerEsCommandsLazy()` - async, the production path. Sniffs `process.argv` to identify the invoked leaf, pre-loads only that one endpoint's Zod schema so Commander can register its flags before parsing, and registers every other leaf as a stub. Stubs also lazy-load on demand if the sniff misses (defence in depth). - `src/cli.ts` awaits `registerEsCommandsLazy()`. - `test/es/register.test.ts` updates the "all built-in API schemas are JSON-Schema-serializable" test to use `loadAllEsApis()` explicitly, since `registerEsCommands` no longer defaults. Three integration scripts in `scripts/` post-process generator output until elastic/elastic-client-generator-js#166 folds them upstream: - `ts-nocheck-schemas.mjs` prepends `// @ts-nocheck` to each per- endpoint schema file. Without this, `tsc` OOMs at 8 GB trying to type-check 46 MB of heavily-generic Zod types across 588 files. The schemas are auto-generated and consumers introspect them via `z.infer`, not the internal type structure, so this is safe. - `split-api-files.mjs` splits each namespace-grouped `apis/*.ts` (e.g. `indices.ts` with ~60 APIs) into one file per endpoint (`indices_create.ts`, `indices_delete.ts`, ...) so each file imports exactly one schema. Without this, invoking `elastic stack es indices create` would load all 60 indices schemas (~1 GB), defeating per-endpoint lazy dispatch. - `build-api-manifest.mjs` scans the now-split api files and emits `src/es/api-manifest.ts` (294 entries, metadata only, no Zod refs). Measured impact (after applying this commit + the regenerated code): | Scenario | Before | After | | --------------------------------- | ------- | ---------- | | apis.ts import heap delta | 373 MB | 0.27 MB | | elastic stack es --help RSS | 599 MB | 102 MB | | elastic stack es bulk --help RSS | ~599 MB | 165 MB | | elastic stack es search --help | ~599 MB | 166 MB | | indices create --help RSS | ~599 MB | 216 MB | Per-leaf deltas match the spot-check table in elastic/elastic-client-generator-js#164 exactly (bulk 34 MB, search 36 MB, ml.get-job-stats 0.9 MB, cat.indices 10 MB). Closes #171.
58a5403 to
e513853
Compare
|
Opened #250 to track the package size mitigation. |
On Windows, os.homedir() reads USERPROFILE rather than HOME. Without USERPROFILE: dir, discoverConfigFile() probed the CI runner's actual home (potentially a slow network path), causing the subprocess to exceed Bun's 5-second timeout. That single timeout left Bun's node:test runner in a broken state, cascading into ~20 describe() inside another test() failures.
- Add `codegen:kibana` target invoking `npx tsx cli/kibana/index.ts` and copying `output/kibana/apis/*.ts` into `src/kb/apis/`. The hand-written lazy loader at `src/kb/apis.ts` is intentionally not overwritten. - Run `scripts/build-api-manifest.mjs` after ES codegen and `scripts/build-kb-manifest.mts` after Kibana codegen so the per-endpoint manifests stay in sync with the generated namespace files. - Stop copying `output/es/index.ts` over `src/es/apis.ts` (it became a hand-written lazy loader in #218). - Fix `build-kb-manifest.mts` to load definitions directly from each `src/kb/apis/*.ts` file (the previous `allKbApis` import was dead since the lazy-loader refactor in #266). Manifest regen also picks up the one endpoint that had been missing as a result. - Workflow: append the kibana step, expand diff-detection and `add-paths` to include `src/kb`, refresh the PR body to list the kibana outputs. - CONTRIBUTING: document the new target. Refs: #79
- Add `codegen:kibana` target invoking `npx tsx cli/kibana/index.ts` and copying `output/kibana/apis/*.ts` into `src/kb/apis/`. The hand-written lazy loader at `src/kb/apis.ts` is intentionally not overwritten. - Run `scripts/build-api-manifest.mjs` after ES codegen and `scripts/build-kb-manifest.mts` after Kibana codegen so the per-endpoint manifests stay in sync with the generated namespace files. - Stop copying `output/es/index.ts` over `src/es/apis.ts` (it became a hand-written lazy loader in #218). - Fix `build-kb-manifest.mts` to load definitions directly from each `src/kb/apis/*.ts` file (the previous `allKbApis` import was dead since the lazy-loader refactor in #266). Manifest regen also picks up the one endpoint that had been missing as a result. - Workflow: append the kibana step, expand diff-detection and `add-paths` to include `src/kb`, refresh the PR body to list the kibana outputs. - CONTRIBUTING: document the new target. Refs: #79
Summary
Fixes #171 by switching
elastic stack es ...to per-endpoint lazy schema loading, using the per-endpoint Zod layout from elastic/elastic-client-generator-js#164 (now merged).apis.tsimport heap deltaelastic stack es --helpRSSelastic stack es bulk --helpRSSelastic stack es search --helpRSSelastic stack es indices createRSSPer-leaf deltas match the spot-check table in elastic/elastic-client-generator-js#164 (bulk 34 MB, search 36 MB, ml.get-job-stats 0.9 MB, cat.indices 10 MB).
Review-ability
Split into two commits:
perf(es): lazy-load ES schemas per endpoint to cut startup heap (#171)- the architecture change. 7 files, +462/-148. This is what reviewers should read.chore(es): regenerate API surface with per-endpoint Zod layout- the regenerated output (842 new files, 64 deleted). Review it by running the pipeline documented in the commit body.How it works
src/es/apis.tsis no longer an eager barrel. It re-exports the metadata-onlyapiManifest(294 entries, no Zod refs) and addsloadEsApi()/loadEsApisInFile()for dynamic-import on demand.src/es/register.tssplits into a syncregisterEsCommands(defs)(explicit definitions, used by tests) and an asyncregisterEsCommandsLazy()(production). The lazy path sniffsprocess.argvto identify the invoked leaf, pre-loads only that endpoint's schema so Commander can register its flags before parsing, and registers every other leaf as a stub.scripts/*.mjspost-process generator output: prepend@ts-nocheckto 588 schema files (fixes atscOOM at 8 GB), split namespace-grouped api files into per-endpoint, build the manifest. They live in this repo for now; elastic/elastic-client-generator-js#166 tracks folding them into the generator.Behavioural notes for reviewers
registerEsCommands()no longer has a default argument. Callers must pass an explicitEsApiDefinition[]or switch toregisterEsCommandsLazy(). Only one test was affected; it now usesloadAllEsApis().elastic stack es --helpwith the same names, namespaces, and descriptions. The only user-visible change is that top-level help is now fast and light.--help --jsonoutput are unchanged for every leaf.Test plan
tsc --noEmitclean (8 GB heap; would OOM without@ts-nocheckon schemas)npm testpasses 867/867elastic stack es --help,elastic stack es bulk --help,elastic stack es indices --help,elastic stack es indices create --help,elastic stack es search --helpall render correct flags and descriptions/usr/bin/time -lpfor every case in the table abovenpm run test:functional:es) against a live ES. I did not run them.--help).Follow-ups
--layout per-endpointgets ts-nocheck, per-endpoint api files, and a manifest natively.docs chator a future long-running mode is still heap-constrained.Closes #171.