[pull] main from CodebuffAI:main #103
Merged
The tree-sitter wasm regression that crashed freebuff 0.0.62 only
manifested on real Windows. CI was Linux-only, macOS dev machines
behaved fine, and the Windows binary was only built+smoked at release
time (cli-release-build.yml). So the bug shipped twice before being
caught by user reports.

Add a windows-latest job to freebuff-e2e.yml that builds the freebuff
binary natively on Windows and runs the long smoke test against it.
The full tmux-based e2e matrix can't follow: Windows runners don't ship
tmux, and porting tmuxStart/tmuxSend would be substantial. But
smoke-binary.ts catches the failure mode that bit us: it spawns the
binary, waits long enough for the late renderer-cleanup rejection
handler to fire, and asserts both that no fatal markers appeared and
that the boot screen actually rendered.

Mirrors the Windows-specific bits from cli-release-build.yml's
build-windows-binary job: explicit `bun install --cwd cli` and the
@OpenTui workspace symlink fix, both needed because bun workspace
linking doesn't work reliably on Windows runners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
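The two assertions the smoke test makes (no fatal markers, boot screen rendered) can be sketched as a pure function over the captured output. This is a minimal sketch, not the real smoke-binary.ts: the marker strings and the `evaluateSmokeOutput` name are assumptions for illustration.

```typescript
// Hypothetical marker strings; the real smoke-binary.ts may use different ones.
const FATAL_MARKERS = ["Aborted(", "Unhandled rejection", "panic:"];
const BOOT_MARKER = "boot ok";

interface SmokeResult {
  ok: boolean;
  reason: string;
}

// Decide pass/fail from the output captured during the smoke window.
function evaluateSmokeOutput(output: string): SmokeResult {
  const fatal = FATAL_MARKERS.find((marker) => output.includes(marker));
  if (fatal !== undefined) {
    return { ok: false, reason: `fatal marker found: ${fatal}` };
  }
  if (!output.includes(BOOT_MARKER)) {
    return { ok: false, reason: "boot screen never rendered" };
  }
  return { ok: true, reason: "boot screen rendered, no fatal markers" };
}
```

Checking for fatal markers first means a late crash fails the run even when the boot screen rendered before it.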
Freebuff 0.0.64 still crashed for users with the same wasm error even
though it was built from a commit that contained the base64 embed. The
runtime stack trace pointed at the path-resolution fallback in
init-node.ts:76, meaning the embed didn't reach the SDK bundle's
globalThis check at runtime — the binary fell through to fs.existsSync
which never works on Windows bunfs paths.
Two hardening passes so this can't ship silently again:
- cli/src/pre-init/tree-sitter-wasm.ts: hidden `--smoke-tree-sitter`
flag, handled in the very first import. Calls Parser.init({ wasmBinary
}) directly with the embedded base64 and exits 0/1. Lives here (not
commander) on purpose — it tests *the embed*, not the broader init
path that has a path-resolution fallback that would mask a broken
embed by passing in dev mode.
- cli/scripts/build-binary.ts: post-bun-compile, scan the output binary
for the wasm's base64 prefix. Build fails if the bytes didn't actually
make it through bundling (e.g. bun dropping a huge string literal,
bundle cache reading a stale empty stub). Always-on log of which path
the wasm was resolved from so CI logs make the embed step diagnosable.
More resilient resolve: search workspace root, cli/node_modules, and
sdk/node_modules before falling back to createRequire — Windows CI's
`bun install --cwd cli` lays out web-tree-sitter differently than
a hoisted root install.
- packages/code-map/src/init-node.ts: accept bunfs paths
(`/~BUN/root/...`) without an fs.existsSync check. fs.existsSync
inconsistently returns false for bun --compile asset paths on Windows
even though the runtime can read them, so the existing path-resolution
fallback was permanently broken on Windows. Belt-and-braces: this
makes the fallback work even if the embed step regresses.
- cli/scripts/smoke-binary.ts: run --smoke-tree-sitter as a deterministic
pre-check before the long-window boot smoke. A broken embed fails fast
with a clear "exit code 1, no boot ok marker" error instead of a 10s
timeout that depends on render-loop timing.
Verified locally: build embeds 205KB wasm as 274KB base64, post-build
verification finds the prefix in the compiled binary, --smoke-tree-sitter
exits 0 with "tree-sitter smoke ok", full smoke passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
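The size figures in this commit follow from base64 arithmetic: every 3 input bytes become 4 output characters. A small sketch of that calculation (the `base64Length` helper is illustrative, not part of the build script):

```typescript
// Length of padded base64 text for `byteLength` input bytes:
// every 3 bytes become 4 characters, with the last group padded.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

const wasmBytes = 205488; // size reported in the commit messages
console.log(base64Length(wasmBytes)); // 273984
```

205488 is divisible by 3, so the encoded text needs no `=` padding and comes out at exactly 273984 characters, the "274KB" quoted above.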
…le' }`
The base64-in-source approach didn't survive `bun --compile` on Windows.
The CI build's `verifyTreeSitterWasmEmbedded` step caught it:
    Embedded tree-sitter.wasm from D:\a\...\tree-sitter.wasm (205488 bytes
    → 273984 chars base64)
    [343ms] minify -16.58 MB
    Embedded tree-sitter wasm prefix not found in D:\a\...\codebuff.exe.
So the embed step wrote the bytes to disk and bun read them, but the
274KB string literal didn't end up in the compiled output — likely
tree-shaken or transformed by the minifier on Windows. The same code
worked on macOS and Linux locally and in CI.
Switch to Bun's documented asset-embed mechanism: import the wasm with
`with { type: 'file' }`. Bun handles this through the bundler's asset
pipeline rather than as a generic string literal, and the resulting
binary contains the wasm bytes verbatim at a bunfs path.
- cli/src/pre-init/tree-sitter-wasm.ts: import the wasm path, set the
env var (for the locateFile fallback), and try a synchronous read so
Parser.init can take the wasmBinary fast path. If the read throws
(some Windows configurations have done this), log loudly so user
reports include the diagnostic, then fall through to the locateFile
flow — which init-node.ts now accepts bunfs paths through, even when
fs.existsSync misreports them.
- The --smoke-tree-sitter handler is now a top-level `await` instead
of a fire-and-forget IIFE. Without that, commander.parse() ran
synchronously in main() and failed on the unknown flag before the
smoke handler could exit cleanly.
- cli/scripts/build-binary.ts: drop the base64 stub-overwrite step
entirely. New verifyTreeSitterWasmEmbedded reads a 64-byte chunk
from the *middle* of the source wasm and asserts it appears in the
compiled binary — that proves *this specific* tree-sitter.wasm
shipped, not just any wasm (OpenTUI also embeds tree-sitter language
wasms, so a magic-bytes-only scan would false-pass).
- Delete cli/src/pre-init/tree-sitter-wasm-bytes.ts: no longer used.
Verified locally: build embeds tree-sitter.wasm via the file-attribute
import, post-build verification finds the source bytes at offset
77319353 of the compiled binary, --smoke-tree-sitter exits 0 with
"tree-sitter smoke ok (wasmBinary, 205488 bytes)".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
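The middle-of-file verification described above can be sketched as follows. This is an assumption-laden illustration of the idea, not the real `verifyTreeSitterWasmEmbedded`: the helper names are hypothetical and the search is a naive byte scan.

```typescript
// Return the index of `needle` inside `haystack`, or -1 (naive scan).
function indexOfBytes(haystack: Uint8Array, needle: Uint8Array): number {
  if (needle.length === 0) return 0;
  outer: for (let i = 0; i + needle.length <= haystack.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Take `size` bytes from the middle of the source asset.
function middleSample(source: Uint8Array, size = 64): Uint8Array {
  const start = Math.max(0, Math.floor((source.length - size) / 2));
  return source.subarray(start, start + size);
}

// Throw unless the sample appears in the compiled binary; return its offset.
function verifyEmbedded(source: Uint8Array, binary: Uint8Array): number {
  const offset = indexOfBytes(binary, middleSample(source));
  if (offset === -1) {
    throw new Error("sample bytes not found in compiled binary");
  }
  return offset;
}
```

Sampling from the middle rather than the start avoids matching only the shared wasm magic bytes, which any other embedded wasm would also contain.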
Last attempt put the handler at top-level in the pre-init module behind
a top-level await, on the theory that ESM would pause subsequent module
evaluation until it resolved. That worked on macOS locally but not on
Windows in CI:
    smoke-binary: spawning ./codebuff.exe for 10s…
    error: tree-sitter smoke failed with exit code 1
    error: unknown option '--smoke-tree-sitter'
So commander.parse() ran before our handler exited, which means
top-level await is not actually blocking parent-module evaluation in
the bun --compile output on Windows (or it's getting transformed away
by `--production` minification).
Move the handler to the top of main() in cli/src/index.tsx, before
parseArgs(). At that point commander hasn't run yet, so we can short-
circuit cleanly. The pre-init module's only job is now to publish the
embedded wasm bytes (globalThis) and path (env var); the handler reads
those out of the same channels the production runtime uses.
Verified locally: ./codebuff --smoke-tree-sitter prints
"tree-sitter smoke ok (wasmBinary, 205488 bytes)" and exits 0; full
smoke-binary.ts run passes both the tree-sitter pre-check and the
boot-screen window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
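The ordering fix can be sketched like this: check for the hidden flag at the top of `main()` and return before the strict argument parser ever sees it. The parser and smoke check below are hypothetical stand-ins; commander's real API is not used.

```typescript
// Stand-in for a strict CLI parser that rejects unknown options.
function parseArgs(argv: string[]): { prompt: string } {
  const unknown = argv.find((arg) => arg.startsWith("--"));
  if (unknown !== undefined) throw new Error(`unknown option '${unknown}'`);
  return { prompt: argv.join(" ") };
}

// Hypothetical smoke check; returns a process exit code.
function runTreeSitterSmoke(wasmBinary: Uint8Array | undefined): number {
  return wasmBinary !== undefined && wasmBinary.length > 0 ? 0 : 1;
}

function main(argv: string[], wasmBinary?: Uint8Array): number {
  // Short-circuit before the parser sees the hidden flag.
  if (argv.includes("--smoke-tree-sitter")) {
    return runTreeSitterSmoke(wasmBinary);
  }
  parseArgs(argv);
  return 0;
}
```

Because the check is plain synchronous code inside `main()`, it does not depend on how the bundler orders module evaluation or treats top-level await.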
…e' }`
On Windows, bun --compile bundles the wasm bytes (build verification
finds them at a known offset) but the JS-level binding from a
node_modules subpath import returns falsy at runtime:
    import wasmPath from 'web-tree-sitter/tree-sitter.wasm'
      with { type: 'file' }
    // wasmPath is undefined on Windows even though the bytes are in
    // the binary
Smoke check on the failed release confirmed it directly:
    tree-sitter smoke FAIL: pre-init published neither globalThis bytes
    nor an env path. The `with { type: 'file' }` import returned falsy.
OpenTUI's own tree-sitter assets work because they're imported via
*relative* paths from inside the package. Mirror that: copy the wasm
into cli/src/pre-init/ before `bun build --compile`, import it
relatively, remove the copy after the build.
- cli/scripts/build-binary.ts: stagePreInitWasm() copies the source
wasm to cli/src/pre-init/tree-sitter.wasm; cleanup runs after the
compile and is also wired to process.on('exit') so a build-script
crash doesn't leave a ~200 KB untracked file in the working tree.
The findWebTreeSitterWasm() lookup is shared with the post-build
verification.
- cli/src/pre-init/tree-sitter-wasm.ts: import is now `./tree-sitter.wasm`
(relative). The file is .gitignored so dev-mode runs see no wasm here
and fall through to init-node.ts's path-based resolution, which
works locally because node_modules has the file.
- cli/.gitignore: ignore the staged copy.
Verified locally: build stages then cleans up the wasm,
post-build verification finds the bytes, --smoke-tree-sitter exits 0
with "tree-sitter smoke ok (wasmBinary, 205488 bytes)".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
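The stage-then-clean-up pattern can be sketched as below. This is a minimal illustration under assumed names (`stageFile` and the temp-directory paths are placeholders, not the real `stagePreInitWasm()` or repository layout).

```typescript
import { copyFileSync, existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Copy `source` to `staged` and return an idempotent cleanup function.
function stageFile(source: string, staged: string): () => void {
  copyFileSync(source, staged);
  let cleaned = false;
  const cleanup = (): void => {
    if (cleaned) return;
    cleaned = true;
    rmSync(staged, { force: true });
  };
  // Safety net: also runs if the script exits before the explicit cleanup.
  process.on("exit", cleanup);
  return cleanup;
}

// Demo in a temp directory; the paths are placeholders.
const dir = mkdtempSync(join(tmpdir(), "stage-demo-"));
const source = join(dir, "source.wasm");
const staged = join(dir, "staged.wasm");
writeFileSync(source, new Uint8Array([0x00, 0x61, 0x73, 0x6d]));

const cleanup = stageFile(source, staged);
const existedDuringBuild = existsSync(staged); // the compile step would run here
cleanup();
const existsAfterCleanup = existsSync(staged);
```

The `cleaned` guard makes the exit handler harmless after the explicit cleanup has already run.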
Three previous approaches all failed on Windows in subtly different ways:
1. Single 274KB base64 string literal: bun's Windows minifier dropped
or transformed it (build verified the prefix wasn't in the binary
even though the embed step wrote the file).
2. `with { type: 'file' }` from a node_modules subpath: bytes ended up
in the binary but the import variable was bound to undefined at
runtime — bun on Windows mishandles the JS-level binding for that
attribute.
3. `with { type: 'file' }` from a relative path (wasm copied into
pre-init/): same as #2 — confirms it's not subpath-vs-relative,
it's a bun/Windows bug with the import-attribute binding.
Round 4: write the base64 as ~268 small chunks (1024 chars each) in an
exported array, joined and decoded at runtime in the pre-init. Each
chunk is referenced unconditionally at runtime via .join(''), so DCE
can't eliminate it; each is small enough that no minifier heuristic
would treat it as a special "huge string literal" worth dropping.
- cli/scripts/build-binary.ts: embedTreeSitterWasmAsChunks() writes the
full array, returns sample chunks (start/middle/end) for the post-
build verification scan to look for in the compiled binary. Restores
the empty stub eagerly + via process.on('exit').
- cli/src/pre-init/tree-sitter-wasm-bytes.ts: re-introduced as a stub
exporting an empty readonly string[]. Dev-mode and unit tests see
the empty stub; production builds get the real chunks written in by
build-binary.ts.
- cli/src/pre-init/tree-sitter-wasm.ts: import the chunks, .join(''),
Buffer.from(_, 'base64'), publish on globalThis. The if() guard
remains because dev mode legitimately has zero chunks.
Verified locally: build embeds 268 chunks, post-build verifies 3 sample
chunks at distinct offsets in the compiled binary, --smoke-tree-sitter
exits 0 with "tree-sitter smoke ok (wasmBinary, 205488 bytes)", full
smoke passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
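The chunked encoding can be sketched as a round trip: split the base64 text into 1024-character pieces at build time, then join and decode at runtime. The helper names and the fake payload are assumptions; only the sizes come from the commit message.

```typescript
import { Buffer } from "node:buffer";

const CHUNK_SIZE = 1024; // characters per chunk, as in the commit message

// Build side: split base64 text into fixed-size string chunks.
function toChunks(base64: string, size = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < base64.length; i += size) {
    chunks.push(base64.slice(i, i + size));
  }
  return chunks;
}

// Runtime side: join the chunks and decode back to bytes.
function fromChunks(chunks: readonly string[]): Uint8Array {
  return Buffer.from(chunks.join(""), "base64");
}

// Stand-in payload with the same size as the real wasm (contents are fake).
const payload = Buffer.alloc(205488, 7);
const chunks = toChunks(payload.toString("base64"));
const decoded = fromChunks(chunks);
console.log(chunks.length, decoded.length); // 268 205488
```

273984 base64 characters split at 1024 per chunk gives 267 full chunks plus one partial chunk, which matches the 268 reported in the build log.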
Round 4 (chunked array literals) still failed on Windows: the build's
own verification step caught the first chunk missing from the compiled
binary. So either:
- Bun's bundler reads tree-sitter-wasm-bytes.ts at static-analysis
  time, sees `export const X = []` (the committed stub), inlines `X`
  into pre-init's call sites, then DCEs the conditional branch that
  would have referenced the chunks. Whatever my embed script wrote
  later is treated as unused and dropped.
- OR the file write doesn't propagate to disk before bun reads it on
  Windows.

Switch the export from `const` to a function. Function return values
aren't statically inlinable, so the bundler can't substitute a literal
empty array at the call site. The chunks live inside the function
body, only materialized when the pre-init calls
`getTreeSitterWasmChunks()`.

Add a sanity re-read after writing the embed file: if NTFS buffers the
write and bun reads the stale stub, the embed step itself fails
*during the build*, with a clear "wrote N chunks but re-read does not
contain chunk[0]" message, instead of letting the build silently
produce a broken artifact.

Verified locally: build embeds 268 chunks, post-build verifies 3
chunks in the compiled binary, --smoke-tree-sitter exits 0, boot smoke
passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
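The two changes in this round (function export, re-read sanity check) might look roughly like this. It is a sketch only: `renderChunksModule` and `assertEmbedWritten` are hypothetical names, and the file read-back is represented by a plain string argument.

```typescript
// Render the generated module source. A function export keeps the chunk
// literals inside a function body instead of a top-level constant.
function renderChunksModule(chunks: readonly string[]): string {
  const body = chunks.map((chunk) => `    ${JSON.stringify(chunk)},`).join("\n");
  return [
    "export function getTreeSitterWasmChunks(): readonly string[] {",
    "  return [",
    body,
    "  ]",
    "}",
    "",
  ].join("\n");
}

// Sanity check on the text read back from disk after the write.
function assertEmbedWritten(reRead: string, chunks: readonly string[]): void {
  const first = chunks[0];
  if (first === undefined || !reRead.includes(first)) {
    throw new Error(
      `wrote ${chunks.length} chunks but re-read does not contain chunk[0]`,
    );
  }
}
```

In a build script, `reRead` would be the result of reading the generated file immediately after writing it, so a stale stub fails the build at the embed step.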
Five attempts to embed the wasm into the bun --compile binary all
failed on Windows in different ways. In each case either the bytes
never reached the compiled output or the JS-level binding that should
have retrieved them was undefined by the time the runtime ran:
1. `with { type: 'file' }` of `web-tree-sitter/tree-sitter.wasm`
subpath — bytes embedded, import variable bound to undefined.
2. `with { type: 'file' }` of a copied-in relative .wasm — same as #1.
3. Single 274KB base64 string literal — got dropped by the minifier.
4. ~268 chunked base64 string literals — same fate.
5. Function-export wrapping the chunked array, with eager file write
verification on disk — chunks confirmed on disk after embed,
still not present in the compiled output.
The bun-compile-on-Windows code path is doing something destructive
to JS-source-level wasm asset references that we cannot reliably
work around from the source. So bypass the bundler entirely: ship
tree-sitter.wasm as a *sibling file* next to the binary.
- cli/scripts/build-binary.ts: copies the wasm from node_modules to
cli/bin/tree-sitter.wasm after `bun build --compile`, alongside the
binary. Drops all the embed/verify machinery from previous rounds.
- cli/src/pre-init/tree-sitter-wasm.ts: at runtime, looks for
`dirname(process.execPath)/tree-sitter.wasm`, sets the env var that
init-node.ts reads, and (best-effort) reads the bytes synchronously
to publish on globalThis for the wasmBinary fast path. Both
channels feed the same SDK init.
- cli/src/pre-init/tree-sitter-wasm-bytes.ts: deleted. No more
generated module.
- .github/workflows/cli-release-build.yml: tarball includes
`tree-sitter.wasm` next to the binary (both matrix and Windows-
specific job).
- cli/release/index.js + freebuff/cli/release/index.js: the npm
postinstall downloader now also moves tree-sitter.wasm out of the
temp extraction dir to live next to the installed binary.
Verified locally: build copies the wasm into bin/, --smoke-tree-sitter
exits 0 with "tree-sitter smoke ok (wasmBinary, 205488 bytes)", full
boot smoke passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
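The sibling-file lookup with its two channels (path and bytes) can be sketched as below. This is a simplified illustration, not the real pre-init module: `preInit`, `PreInitState`, and the injected `read` function are assumptions made so the sketch runs without a real file system.

```typescript
import { posix, win32 } from "node:path";

type PathApi = typeof posix;

// Path of tree-sitter.wasm next to the executable.
function siblingWasmPath(execPath: string, pathApi: PathApi): string {
  return pathApi.join(pathApi.dirname(execPath), "tree-sitter.wasm");
}

interface PreInitState {
  wasmPath?: string; // would be published through the env var
  wasmBinary?: Uint8Array; // would be published on globalThis
}

// `read` is injected so the sketch does not touch the real file system.
function preInit(
  execPath: string,
  pathApi: PathApi,
  read: (path: string) => Uint8Array,
): PreInitState {
  const wasmPath = siblingWasmPath(execPath, pathApi);
  try {
    return { wasmPath, wasmBinary: read(wasmPath) };
  } catch {
    // Best effort: keep the path channel even if the eager read fails.
    return { wasmPath };
  }
}

console.log(siblingWasmPath("/opt/cli/bin/codebuff", posix));
console.log(siblingWasmPath("D:\\cli\\bin\\codebuff.exe", win32));
```

Passing `posix` or `win32` explicitly lets the same function be exercised for both path styles on any host OS.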
Round 6 (sibling-file approach) still failed on Windows. The smoke
handler reports the same pre-init-state-empty error even though the
build script copied tree-sitter.wasm next to the binary just before
the smoke step ran.

Add a diagnostic dump that prints process.execPath, dirname, the
computed siblingPath, existsSync result, the dir listing, env var, and
globalThis state. Whatever the next CI Windows run shows here is what
we need to fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dows
Round 7's diagnostic dump on Windows revealed why
existsSync(siblingPath) was returning false even though the wasm
file was right next to the binary:
    [smoke diag] execPath=D:\a\codebuff\codebuff\cli\bin\codebuff.exe
    [smoke diag] siblingExists=true (in main())
    [smoke diag] globalThis wasmBinary bytes=0 (set by pre-init)
    Aborted(Error: ENOENT: no such file or directory, open
    'B:\~BUN\root\tree-sitter.wasm')
Pre-init runs at module load. main() runs later. The diag is in
main(), which sees execPath as the disk path. But the ENOENT line
shows what pre-init actually saw: `B:\~BUN\root\tree-sitter.wasm`
— the *bunfs internal* path. So inside a bun --compile binary on
Windows, `process.execPath` returns the bunfs path during early
module evaluation and only switches to the disk path later. Pre-init
silently bailed because that bunfs sibling doesn't exist.
Switch pre-init to use process.argv[0] instead. argv[0] is the path
the binary was *invoked with* — always a real disk path, not a bunfs
internal one. Try execPath as a fallback for environments where
argv[0] is somehow exotic. Whichever yields an existing sibling wins.
Verified locally on macOS where execPath was already the disk path:
build copies wasm to bin/, pre-init finds and reads it,
--smoke-tree-sitter exits 0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…back
Round 8 (argv[0] in pre-init) failed on Windows for the same reason
round 6 (execPath in pre-init) did:
    [pre-init diag] argv[0]=bun                        # not a path!
    [pre-init diag] execPath=B:\~BUN\root\<binary>.exe # bunfs
Pre-init runs at module evaluation time. Inside a bun --compile binary
on Windows during that phase, both `process.argv[0]` and
`process.execPath` lie:
- argv[0] is `"bun"` (the runtime name), not a real path
- execPath is the *bunfs internal* path (`B:\~BUN\root\...`),
not the disk path of the .exe
Both stabilize to real paths by the time main() runs (round 7's main()
diag confirmed that), but the SDK's eager Parser.init has already
fired by then with bad path data.
The fix: do the sibling-file lookup *inside the locateFile callback*
in code-map's init-node.ts. emscripten calls that callback during
Parser.init's async work, after process.execPath has stabilized to
the disk path. By then, `dirname(process.execPath) +
'tree-sitter.wasm'` resolves correctly.
- packages/code-map/src/init-node.ts: add a sibling-of-execPath
check between the existing scriptDir fallback and the require.resolve
fallback. Improves the thrown-error message to include the
attempted execPath dir so future failures are easier to diagnose.
- cli/src/pre-init/tree-sitter-wasm.ts: keep the eager lookup as a
best-effort fast path (it works on macOS/Linux where execPath is
the disk path from module-load); on Windows it silently no-ops and
the locateFile callback handles things lazily. Diagnostic dump
remains gated on --smoke-tree-sitter so we can see what each phase
thinks the paths are.
The SDK dist also needs rebuilding so the bundled init-node.ts copy
picks up this change — included in the diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
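The difference between the eager pre-init lookup and the lazy lookup inside the callback can be sketched with a mutable stand-in for `process`. The values mimic the logs above; `fakeProcess`, `codebuff.exe` as the bunfs binary name, and the zero-argument `locateFile` are simplifications, not the real emscripten hook signature.

```typescript
import { win32 } from "node:path";

// Mutable stand-in for `process`; starts out reporting the bunfs path.
const fakeProcess = { execPath: "B:\\~BUN\\root\\codebuff.exe" };

function siblingOf(execPath: string): string {
  return win32.join(win32.dirname(execPath), "tree-sitter.wasm");
}

// Eager: computed once, at "module load", while execPath is still bunfs.
const eagerPath = siblingOf(fakeProcess.execPath);

// Lazy: computed each time the callback runs, like a locateFile hook.
const locateFile = (): string => siblingOf(fakeProcess.execPath);

// Later, execPath has switched to the real disk path.
fakeProcess.execPath = "D:\\a\\codebuff\\codebuff\\cli\\bin\\codebuff.exe";
const lazyPath = locateFile();

console.log(eagerPath); // B:\~BUN\root\tree-sitter.wasm
console.log(lazyPath); // D:\a\codebuff\codebuff\cli\bin\tree-sitter.wasm
```

The eager value is frozen at whatever `execPath` reported first, which is why deferring the lookup into the callback changes the outcome.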
…nyway
Round 9 logs showed our locateFile fallback was returning the bunfs
path (`B:\~BUN\root\tree-sitter.wasm`), and emscripten then ENOENT'd
on it. The sibling-of-execPath fallback I added in the previous
commit never ran because the scriptDir branch above it took the
`isBunEmbeddedPath` shortcut and returned early.
The shortcut was based on a wrong assumption: that emscripten could
read bunfs paths. It can't — emscripten's `readAsync` calls
`fs.readFile` under the hood, and `fs.readFile('B:\~BUN\root\...')`
fails the same way `fs.existsSync` does on those paths.
Remove the shortcut. Now resolveTreeSitterWasm only returns paths
that `fs.existsSync` confirms — which on Windows means we skip the
bunfs scriptDir fallback and fall through to the
`dirname(process.execPath)` sibling, where the build script copied
tree-sitter.wasm next to the binary.
Verified locally: build copies wasm to bin/, --smoke-tree-sitter
exits 0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
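The "only return paths that existsSync confirms" rule can be sketched as an ordered fallback chain. This is an illustration, not the real resolveTreeSitterWasm: `resolveWasm`, the `Candidate` type, and the injected `exists` function are assumptions.

```typescript
// Ordered candidate producers; each returns a path or undefined.
type Candidate = () => string | undefined;

// Return the first candidate that `exists` confirms; never a guess.
function resolveWasm(
  candidates: readonly Candidate[],
  exists: (path: string) => boolean,
): string {
  const tried: string[] = [];
  for (const candidate of candidates) {
    const path = candidate();
    if (path === undefined) continue;
    if (exists(path)) return path;
    tried.push(path);
  }
  throw new Error(`tree-sitter.wasm not found; tried: ${tried.join(", ")}`);
}

// Example: the bunfs candidate is rejected, the disk sibling wins.
const diskSibling = "D:\\bin\\tree-sitter.wasm";
const resolved = resolveWasm(
  [() => "B:\\~BUN\\root\\tree-sitter.wasm", () => diskSibling],
  (path) => path === diskSibling,
);
console.log(resolved); // D:\bin\tree-sitter.wasm
```

With no early-return shortcut, a candidate that cannot be read simply falls through to the next one, and the error lists every path that was attempted.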
Round 10 still failed on Windows because the smoke handler in main()
doesn't go through init-node's locateFile callback at all: it calls
Parser.init directly, so my init-node sibling fallback (rounds 9-10)
never runs during the smoke step.

Diagnostic confirmed: at main() time, process.execPath is the disk
path on Windows AND the sibling tree-sitter.wasm exists right next to
it. Pre-init couldn't reach the file (execPath was bunfs at that
phase), so wasmBinary and wasmPath were both empty when smoke ran.

Add the sibling lookup directly to the smoke handler, gated on those
being empty. By main() time the disk path is reliable, so
fs.existsSync(dirname(execPath) + 'tree-sitter.wasm') resolves
correctly and we have something to feed Parser.init.

Real users (no --smoke-tree-sitter flag) still go through the
init-node sibling fallback in the SDK's eager Parser.init; that's
unaffected by this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)