Cut real disk per kson checkout #394
Open
holodorum wants to merge 6 commits into
Both test entry points called downloadAndUnzipVSCode('stable') with no
cachePath, so @vscode/test-electron materialized a full ~100-150M VS Code
into a per-checkout .vscode-test/ for every checkout and CI run. Switch to
the options-object form with a cachePath resolved from VSCODE_TEST_CACHE,
falling back to an OS-appropriate user cache dir, so multiple checkouts and
CI reuse a single shared download.
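The cachePath resolution described above could look roughly like the following. This is a minimal sketch, not the PR's actual code. The function name `resolveVSCodeTestCache` is hypothetical, the Linux and macOS defaults are taken from the cache directories named in the CI commit message, and the Windows fallback under LOCALAPPDATA is an assumption.

```typescript
import * as os from 'os';
import * as path from 'path';

/**
 * Resolve a shared download directory for VS Code test builds.
 * Hypothetical helper: the name and the Windows branch are assumptions.
 */
export function resolveVSCodeTestCache(
    env: Record<string, string | undefined> = process.env,
    platform: string = process.platform,
    home: string = os.homedir(),
): string {
    // An explicit override always wins.
    const override = env.VSCODE_TEST_CACHE;
    if (override !== undefined && override.trim() !== '') {
        return path.resolve(override);
    }
    if (platform === 'darwin') {
        return path.join(home, 'Library', 'Caches', 'vscode-test');
    }
    if (platform === 'win32') {
        // Assumption: the source does not say what the Windows default is.
        const localAppData = env.LOCALAPPDATA ?? path.join(home, 'AppData', 'Local');
        return path.join(localAppData, 'vscode-test');
    }
    // Linux and other Unix-likes.
    return path.join(home, '.cache', 'vscode-test');
}
```

The result would then be passed through the options-object form, for example `downloadAndUnzipVSCode({ version: 'stable', cachePath: resolveVSCodeTestCache() })`, so that every checkout pointing at the same directory reuses one download.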
The ~257M JSONTestSuite / JSON-Schema-Test-Suite repos were cloned on
essentially every build because CleanGitCheckout clones in its constructor
and GenerateJsonTestSuiteTask constructed the checkouts in its init {} (so
the clone happened at configuration time), while a universal withType<Task>
dependsOn made every task depend on the generator. Since the 84 generated
test files are git-tracked, the clone is only ever needed to regenerate
them. Defer constructing the checkouts and generator into the @TaskAction
so the clone happens only when the task runs, compute the @OutputDirectory
from the source root + package alone (no checkout), and wire the generator
solely onto the test-compile tasks (compileTestKotlinJvm/Js, hence
jvmTest/jsTest/allTests/check) instead of every task. Downstream consumers
building assemble / -x test no longer pay the clone; local and CI test runs
still get a fresh, verified suite.
Switch the standalone Node/TS projects under tooling/ (language-server-protocol and the lsp-clients workspace + monaco demos) from npm to pnpm so node_modules is hardlinked from pnpm's shared global store instead of being a full per-checkout copy. A second checkout's lsp-clients install now costs ~10M of real disk (df free-space delta) against a ~374M apparent node_modules, versus ~364M cold.

These projects are viable for pnpm precisely because they are NOT Kotlin-managed: they carry only the `base` Gradle plugin and run installs through plain PixiExecTask wrappers, so the package manager is a free choice. The Kotlin-compiled JS (build/js, kotlin-js-store/yarn.lock) is untouched and out of scope.

node-linker=hoisted keeps node_modules flat, in the same layout npm produces, so esbuild/vite/vitest/vsce/@vscode/test-electron all see what they expect while still hardlinking from the store. Under hoisted, file: directory deps are PACKED per the files/exports allowlist (not symlinked to source), which actually improves the demos' "third-party consumer" fidelity over npm's symlink: a demo now sees only what a real downstream `npm install` would ship. The demos are kept OUT of the pnpm workspace (installed with --ignore-workspace) so that packed consumption is preserved; listing them would relink @kson/monaco-editor to source.

Packing surfaced a latent gap: the iframe demo serves @kson/monaco-editor/dist-iframe, but dist-iframe was missing from monaco's files allowlist, so it never actually shipped (npm's symlink hid this). Added dist-iframe to files so the iframe assets ship for real downstream consumers too.

pnpm is added to the lsp-clients and language-server-protocol pixi envs (not kson-lib's, which those builds never use) so `pixiw run pnpm` resolves reproducibly. Per the JSON-nit preference, the only new non-JSON files are the pnpm-mandated pnpm-workspace.yaml and pnpm-lock.yaml; build-script approvals live in package.json via pnpm.onlyBuiltDependencies.
Gradle task names are unchanged. Also corrects a stale command in language-server-protocol/README.md (npm run build -> pnpm run compile): the referenced `build` script never existed, so the old line would have failed.
Adds a pnpm global-store cache (keyed on both pnpm-lock.yaml files) so CI reuses the store across runs, and fixes the existing VS Code cache. The VS Code cache is repointed to the electron download directories that vscodeTestCache.ts actually uses (the OS defaults under ~/.cache/vscode-test and ~/Library/Caches/vscode-test, plus the web variant), dropping the now-empty .vscode-test path left behind when the download moved to a shared OS cache dir. Its key is bumped to v2 because CircleCI caches are immutable per key, so the changed paths would otherwise never be saved. config.yml is generated from config.kson by transpileCircleCiConfigTask; both are committed together.
The desktop test (runNodeTests) left VS Code's --user-data-dir at its default in-repo .vscode-test/user-data, whose IPC socket path overflows macOS's 103-char AF_UNIX limit in deeply nested checkouts (e.g. git worktrees), failing the run with `listen EINVAL`. Point it at a short os.tmpdir() path, mirroring runExtension.ts, so the test runs regardless of checkout depth. CI is unaffected (paths are short there already).
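One way to picture the fix is a helper that creates a short user-data directory under `os.tmpdir()` and turns it into a launch argument. This is only a sketch under assumptions: the helper names, the `vsc-ud-` prefix, and the `main.sock` placeholder are hypothetical and are not taken from runNodeTests or runExtension.ts. VS Code's real socket file name may be longer.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// The macOS AF_UNIX path limit quoted in the commit message above.
const MAX_SOCKET_PATH = 103;

/** Create a fresh, short user-data directory under the OS temp dir. */
export function makeShortUserDataDir(prefix: string = 'vsc-ud-'): string {
    return fs.mkdtempSync(path.join(os.tmpdir(), prefix));
}

/**
 * Check that a socket file placed directly inside `dir` stays within the limit.
 * `socketName` is a hypothetical placeholder, not VS Code's real socket name.
 */
export function socketPathFits(dir: string, socketName: string = 'main.sock'): boolean {
    return path.join(dir, socketName).length <= MAX_SOCKET_PATH;
}

/** Build the launch argument that overrides VS Code's default user-data dir. */
export function userDataDirArgs(dir: string): string[] {
    return [`--user-data-dir=${dir}`];
}
```

With @vscode/test-electron, the resulting array could be supplied as `launchArgs` to `runTests`. Because the directory lives under the temp dir rather than the repository, its length no longer depends on how deeply the checkout is nested.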
The full clone pulled ~268M per checkout (mostly JSONTestSuite's .git history and unused parsers/ binaries) only to use ~5M of it: test_parsing from JSONTestSuite and tests from JSON-Schema-Test-Suite. A sparse + treeless (--filter=blob:none) + shallow (--depth 1) git fetch now pulls only those subdirectories, taking each checkout from ~268M to ~5M (JSONTestSuite 251M->2.0M, JSON-Schema-Test-Suite 5.8M->3.4M) and leaving the generated tests byte-identical.

Implemented as an optional sparse mode on CleanGitCheckout: passing sparsePaths switches it from the JGit full clone to a system-git-CLI sparse fetch, while the default (empty) keeps the existing JGit behavior untouched. The two suite classes pass their subdir and keep the same constructor signatures and checkoutDir, so the generator and its wiring are unchanged. The sparse paths are anchored (/path/) since this git is non-cone, so a nested dir of the same name upstream could never be pulled in.

Like the JGit path, the sparse mode honors the clean-checkout contract: it verifies the working tree with `git status` and throws DirtyRepoException on a modified checkout rather than silently reusing or nuking it; only a clean checkout that is missing, broken, or at the wrong SHA is refetched.
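For readers unfamiliar with this style of fetch, the command sequence might look like the plan built below. This is an illustrative sketch only: the PR implements the sparse mode in Kotlin inside CleanGitCheckout, and the exact git invocations it uses are not shown in the commit message. The function names, the choice of `git sparse-checkout set --no-cone`, and the use of `git status --porcelain` output for the dirty check are all assumptions.

```typescript
/** Anchor a repo-relative directory so a non-cone pattern matches only at the repo root. */
export function anchorSparsePath(dir: string): string {
    const trimmed = dir.replace(/^\/+|\/+$/g, '');
    return `/${trimmed}/`;
}

/**
 * Build (but do not run) an assumed git command sequence for a
 * sparse + treeless + shallow fetch of one commit.
 */
export function sparseFetchPlan(
    url: string,
    sha: string,
    checkoutDir: string,
    sparsePaths: string[],
): string[][] {
    const git = ['git', '-C', checkoutDir];
    return [
        ['git', 'init', checkoutDir],
        [...git, 'remote', 'add', 'origin', url],
        // Non-cone mode takes gitignore-style patterns, hence the anchoring.
        [...git, 'sparse-checkout', 'set', '--no-cone', ...sparsePaths.map(anchorSparsePath)],
        // Shallow (one commit) and treeless of blobs until checkout needs them.
        [...git, 'fetch', '--depth', '1', '--filter=blob:none', 'origin', sha],
        [...git, 'checkout', '--detach', 'FETCH_HEAD'],
    ];
}

/** Interpret `git status --porcelain` output: any non-blank output means a modified tree. */
export function isDirty(porcelainOutput: string): boolean {
    return porcelainOutput.trim().length > 0;
}
```

Keeping the plan as data makes the anchoring easy to check: `anchorSparsePath('test_parsing')` yields `/test_parsing/`, which matches only the top-level directory and not a nested directory of the same name.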
This fixes the three biggest disk hogs at the source. Numbers are real free-space deltas (`df`), not `du`: on APFS, `du` overcounts reflinked files.

Changes

- `language-server-protocol` and the `lsp-clients` workspace move to pnpm with `node-linker=hoisted` (flat, npm-identical layout; hardlinked from the shared global store). A second checkout's `node_modules` costs ~10M instead of ~374M.
- […] `-x test` no longer fetch it) and shrunk via a sparse/treeless `git` fetch of only the two dirs we read: 268M → ~5M for test runs.
- […] `VSCODE_TEST_CACHE` (OS-default cache dir) instead of a per-checkout `.vscode-test/`, so all checkouts/CI reuse one download (~150M once).

Supporting

- […] `--user-data-dir` for the VS Code desktop test (fits macOS's 103-char socket limit in deep checkouts).
- `dist-iframe` was missing from monaco's `files`, so the iframe assets never shipped.

Testing

`./gradlew check` / `allTests` green (JVM + JS); the built vsix is byte-identical to the npm baseline; gating verified. The Kotlin-compiled JS (yarn-bound) is untouched.