[pull] main from CodebuffAI:main#103

Merged
pull[bot] merged 36 commits into axistore80-coder:main from CodebuffAI:main
May 4, 2026

@pull pull Bot commented May 4, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

jahooma and others added 30 commits May 4, 2026 00:49
The tree-sitter wasm regression that crashed freebuff 0.0.62 only
manifested on real Windows. CI was Linux-only, macOS dev machines
behaved fine, and the Windows binary was only built+smoked at release
time (cli-release-build.yml). So the bug shipped twice before being
caught by user reports.

Add a windows-latest job to freebuff-e2e.yml that builds the freebuff
binary natively on Windows and runs the long smoke test against it.
The full tmux-based e2e matrix can't follow — Windows runners don't
ship tmux, and porting tmuxStart/tmuxSend would be substantial — but
smoke-binary.ts catches the failure mode that bit us: it spawns the
binary, waits long enough for the late renderer-cleanup rejection
handler to fire, and asserts both that no fatal markers appeared and
that the boot screen actually rendered.
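
The two assertions described above can be sketched as a pure check over the captured output. This is a minimal illustration only: `evaluateSmokeOutput`, the marker strings, and the verdict shape are hypothetical names, not the contents of smoke-binary.ts.

```typescript
// Hypothetical sketch: names and marker strings are assumptions, not the
// real smoke-binary.ts implementation.
interface SmokeVerdict {
  ok: boolean
  reason: string
}

function evaluateSmokeOutput(
  output: string,
  fatalMarkers: readonly string[],
  bootMarker: string,
): SmokeVerdict {
  // A fatal marker anywhere in the output fails the run, even if the boot
  // screen rendered before the late rejection handler fired.
  const hit = fatalMarkers.find((marker) => output.includes(marker))
  if (hit !== undefined) {
    return { ok: false, reason: `fatal marker found: ${hit}` }
  }
  // Absence of a crash is not enough: the boot screen must have rendered.
  if (!output.includes(bootMarker)) {
    return { ok: false, reason: 'boot screen never rendered' }
  }
  return { ok: true, reason: 'boot ok' }
}
```

Keeping the check separate from process spawning means the same function can be exercised on any platform with canned output.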

Mirrors the Windows-specific bits from cli-release-build.yml's
build-windows-binary job: explicit `bun install --cwd cli` and the
@opentui workspace symlink fix, both needed because bun workspace
linking doesn't work reliably on Windows runners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Freebuff 0.0.64 still crashed for users with the same wasm error even
though it was built from a commit that contained the base64 embed. The
runtime stack trace pointed at the path-resolution fallback in
init-node.ts:76, meaning the embed didn't reach the SDK bundle's
globalThis check at runtime — the binary fell through to fs.existsSync
which never works on Windows bunfs paths.

Four hardening changes so this can't ship silently again:

- cli/src/pre-init/tree-sitter-wasm.ts: hidden `--smoke-tree-sitter`
  flag, handled in the very first import. Calls Parser.init({ wasmBinary
  }) directly with the embedded base64 and exits 0/1. Lives here (not
  commander) on purpose — it tests *the embed*, not the broader init
  path that has a path-resolution fallback that would mask a broken
  embed by passing in dev mode.
- cli/scripts/build-binary.ts: post-bun-compile, scan the output binary
  for the wasm's base64 prefix. Build fails if the bytes didn't actually
  make it through bundling (e.g. bun dropping a huge string literal,
  bundle cache reading a stale empty stub). Always-on log of which path
  the wasm was resolved from so CI logs make the embed step diagnosable.
  More resilient resolve: search workspace root, cli/node_modules, and
  sdk/node_modules before falling back to createRequire — Windows CI's
  `bun install --cwd cli` lays out web-tree-sitter differently than
  a hoisted root install.
- packages/code-map/src/init-node.ts: accept bunfs paths
  (`/~BUN/root/...`) without an fs.existsSync check. fs.existsSync
  inconsistently returns false for bun --compile asset paths on Windows
  even though the runtime can read them, so the existing path-resolution
  fallback was permanently broken on Windows. Belt-and-braces: this
  makes the fallback work even if the embed step regresses.
- cli/scripts/smoke-binary.ts: run --smoke-tree-sitter as a deterministic
  pre-check before the long-window boot smoke. A broken embed fails fast
  with a clear "exit code 1, no boot ok marker" error instead of a 10s
  timeout that depends on render-loop timing.
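
A rough sketch of the post-compile prefix scan, assuming the base64 literal lands in the binary as plain ASCII. `base64PrefixEmbedded` and the 64-character prefix length are hypothetical; `base64Length` just reproduces the 205488 bytes to 273984 chars arithmetic quoted in these messages.

```typescript
import { Buffer } from 'node:buffer'

// Base64 emits 4 characters per 3 input bytes, padded to a multiple of 4.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3)
}

// Hypothetical helper: the 64-character prefix length is an assumption.
function base64PrefixEmbedded(
  wasm: Uint8Array,
  compiled: Uint8Array,
  prefixChars = 64,
): boolean {
  const prefix = Buffer.from(wasm).toString('base64').slice(0, prefixChars)
  // If the string literal survived bundling, its ASCII bytes appear verbatim.
  return Buffer.from(compiled).includes(Buffer.from(prefix, 'ascii'))
}
```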

Verified locally: build embeds 205KB wasm as 274KB base64, post-build
verification finds the prefix in the compiled binary, --smoke-tree-sitter
exits 0 with "tree-sitter smoke ok", full smoke passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…le' }`

The base64-in-source approach didn't survive `bun --compile` on Windows.
The CI build's `verifyTreeSitterWasmEmbedded` step caught it:

    Embedded tree-sitter.wasm from D:\a\...\tree-sitter.wasm (205488 bytes
      → 273984 chars base64)
    [343ms]  minify  -16.58 MB
    Embedded tree-sitter wasm prefix not found in D:\a\...\codebuff.exe.

So the embed step wrote the bytes to disk and bun read them, but the
274KB string literal didn't end up in the compiled output — likely
tree-shaken or transformed by the minifier on Windows. The same code
worked on macOS and Linux locally and in CI.

Switch to Bun's documented asset-embed mechanism: import the wasm with
`with { type: 'file' }`. Bun handles this through the bundler's asset
pipeline rather than as a generic string literal, and the resulting
binary contains the wasm bytes verbatim at a bunfs path.

- cli/src/pre-init/tree-sitter-wasm.ts: import the wasm path, set the
  env var (for the locateFile fallback), and try a synchronous read so
  Parser.init can take the wasmBinary fast path. If the read throws
  (some Windows configurations have done this), log loudly so user
  reports include the diagnostic, then fall through to the locateFile
  flow. init-node.ts now accepts bunfs paths in that flow, even when
  fs.existsSync misreports them.
- The --smoke-tree-sitter handler is now a top-level `await` instead
  of a fire-and-forget IIFE. Without that, commander.parse() ran
  synchronously in main() and failed on the unknown flag before the
  smoke handler could exit cleanly.
- cli/scripts/build-binary.ts: drop the base64 stub-overwrite step
  entirely. New verifyTreeSitterWasmEmbedded reads a 64-byte chunk
  from the *middle* of the source wasm and asserts it appears in the
  compiled binary — that proves *this specific* tree-sitter.wasm
  shipped, not just any wasm (OpenTUI also embeds tree-sitter language
  wasms, so a magic-bytes-only scan would false-pass).
- Delete cli/src/pre-init/tree-sitter-wasm-bytes.ts: no longer used.
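
The middle-chunk scan can be approximated as below. This is a sketch under assumptions: `middleChunk` and `findEmbeddedOffset` are illustrative names, and the real verifyTreeSitterWasmEmbedded may pick its sample differently.

```typescript
import { Buffer } from 'node:buffer'

// Take `size` bytes centred on the midpoint of the source file. Bytes from
// the middle identify this specific wasm, unlike the shared magic header.
function middleChunk(source: Uint8Array, size = 64): Buffer {
  const start = Math.max(0, Math.floor(source.length / 2) - Math.floor(size / 2))
  return Buffer.from(source.subarray(start, start + size))
}

// Returns the byte offset of the sample in the compiled binary, or -1 if
// the sample is absent (the build should fail in that case).
function findEmbeddedOffset(source: Uint8Array, compiled: Uint8Array): number {
  return Buffer.from(compiled).indexOf(middleChunk(source))
}
```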

Verified locally: build embeds tree-sitter.wasm via the file-attribute
import, post-build verification finds the source bytes at offset
77319353 of the compiled binary, --smoke-tree-sitter exits 0 with
"tree-sitter smoke ok (wasmBinary, 205488 bytes)".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Last attempt put the handler at top-level in the pre-init module behind
a top-level await, on the theory that ESM would pause subsequent module
evaluation until it resolved. That worked on macOS locally but not on
Windows in CI:

    smoke-binary: spawning ./codebuff.exe for 10s…
    error: tree-sitter smoke failed with exit code 1
    error: unknown option '--smoke-tree-sitter'

So commander.parse() ran before our handler exited, which means
top-level await is not actually blocking parent-module evaluation in
the bun --compile output on Windows (or it's getting transformed away
by `--production` minification).

Move the handler to the top of main() in cli/src/index.tsx, before
parseArgs(). At that point commander hasn't run yet, so we can short-
circuit cleanly. The pre-init module's only job is now to publish the
embedded wasm bytes (globalThis) and path (env var); the handler reads
those out of the same channels the production runtime uses.
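
The ordering fix amounts to routing on the hidden flag before the argument parser ever sees argv. A minimal sketch, with `routeBeforeParse` and the injected `runSmoke`/`parseArgs` callbacks as hypothetical stand-ins for the real code in cli/src/index.tsx:

```typescript
type Route = { kind: 'smoke' } | { kind: 'cli'; args: string[] }

// Decide what to do with argv before any option parser runs.
function routeBeforeParse(argv: readonly string[]): Route {
  if (argv.includes('--smoke-tree-sitter')) {
    return { kind: 'smoke' }
  }
  return { kind: 'cli', args: [...argv] }
}

async function main(
  argv: readonly string[],
  runSmoke: () => Promise<number>,
  parseArgs: (args: string[]) => void,
): Promise<number> {
  const route = routeBeforeParse(argv)
  if (route.kind === 'smoke') {
    // Short-circuit: the parser never gets a chance to reject the flag.
    return runSmoke()
  }
  parseArgs(route.args)
  return 0
}
```

Because the check lives inside main() rather than relying on module evaluation order, it does not depend on how top-level await behaves in the compiled output.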

Verified locally: ./codebuff --smoke-tree-sitter prints
"tree-sitter smoke ok (wasmBinary, 205488 bytes)" and exits 0; full
smoke-binary.ts run passes both the tree-sitter pre-check and the
boot-screen window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e' }`

On Windows, bun --compile bundles the wasm bytes (build verification
finds them at a known offset) but the JS-level binding from a
node_modules subpath import returns falsy at runtime:

    import wasmPath from 'web-tree-sitter/tree-sitter.wasm'
      with { type: 'file' }
    // wasmPath is undefined on Windows even though the bytes are in
    // the binary

Smoke check on the failed release confirmed it directly:

    tree-sitter smoke FAIL: pre-init published neither globalThis bytes
    nor an env path. The `with { type: 'file' }` import returned falsy.

OpenTUI's own tree-sitter assets work because they're imported via
*relative* paths from inside the package. Mirror that: copy the wasm
into cli/src/pre-init/ before `bun build --compile`, import it
relatively, remove the copy after the build.

- cli/scripts/build-binary.ts: stagePreInitWasm() copies the source
  wasm to cli/src/pre-init/tree-sitter.wasm; cleanup runs after the
  compile and is also wired to process.on('exit') so a build-script
  crash doesn't leave a ~200KB untracked file in the working tree.
  The findWebTreeSitterWasm() lookup is shared with the post-build
  verification.
- cli/src/pre-init/tree-sitter-wasm.ts: import is now `./tree-sitter.wasm`
  (relative). The file is .gitignored so dev-mode runs see no wasm here
  and fall through to init-node.ts's path-based resolution, which
  works locally because node_modules has the file.
- cli/.gitignore: ignore the staged copy.
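
The stage-then-clean-up pattern might look roughly like this. `stageFile` and `demoStage` are hypothetical; the real stagePreInitWasm() is not shown in this PR text, so treat the details as assumptions.

```typescript
import { copyFileSync, existsSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Copy `source` to `staged` and return an idempotent cleanup function that
// is also registered on process exit, so a crash does not leave the copy.
function stageFile(source: string, staged: string): () => void {
  copyFileSync(source, staged)
  let cleaned = false
  const cleanup = (): void => {
    if (cleaned) return
    cleaned = true
    rmSync(staged, { force: true }) // synchronous, as 'exit' handlers require
    process.off('exit', cleanup)
  }
  process.on('exit', cleanup)
  return cleanup
}

// Usage against a throwaway directory.
function demoStage(): { staged: boolean; cleaned: boolean } {
  const dir = mkdtempSync(join(tmpdir(), 'stage-demo-'))
  const source = join(dir, 'source.wasm')
  const staged = join(dir, 'staged.wasm')
  writeFileSync(source, new Uint8Array([0, 97, 115, 109]))
  const cleanup = stageFile(source, staged)
  const wasStaged = existsSync(staged)
  cleanup()
  cleanup() // second call is a no-op
  const result = { staged: wasStaged, cleaned: !existsSync(staged) }
  rmSync(dir, { recursive: true, force: true })
  return result
}
```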

Verified locally: build stages then cleans up the wasm,
post-build verification finds the bytes, --smoke-tree-sitter exits 0
with "tree-sitter smoke ok (wasmBinary, 205488 bytes)".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three previous approaches all failed on Windows in subtly different ways:

 1. Single 274KB base64 string literal: bun's Windows minifier dropped
    or transformed it (build verified the prefix wasn't in the binary
    even though the embed step wrote the file).
 2. `with { type: 'file' }` from a node_modules subpath: bytes ended up
    in the binary but the import variable was bound to undefined at
    runtime — bun on Windows mishandles the JS-level binding for that
    attribute.
 3. `with { type: 'file' }` from a relative path (wasm copied into
    pre-init/): same as #2 — confirms it's not subpath-vs-relative,
    it's a bun/Windows bug with the import-attribute binding.

Round 4: write the base64 as ~268 small chunks (1024 chars each) in an
exported array, joined and decoded at runtime in the pre-init. Each
chunk is referenced unconditionally at runtime via .join(''), so DCE
can't eliminate it; each is small enough that no minifier heuristic
would treat it as a special "huge string literal" worth dropping.

- cli/scripts/build-binary.ts: embedTreeSitterWasmAsChunks() writes the
  full array, returns sample chunks (start/middle/end) for the post-
  build verification scan to look for in the compiled binary. Restores
  the empty stub eagerly + via process.on('exit').
- cli/src/pre-init/tree-sitter-wasm-bytes.ts: re-introduced as a stub
  exporting an empty readonly string[]. Dev-mode and unit tests see
  the empty stub; production builds get the real chunks written in by
  build-binary.ts.
- cli/src/pre-init/tree-sitter-wasm.ts: import the chunks, .join(''),
  Buffer.from(_, 'base64'), publish on globalThis. The if() guard
  remains because dev mode legitimately has zero chunks.
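
For reference, the chunk-and-rejoin round trip can be sketched as follows. `toBase64Chunks` and `fromBase64Chunks` are hypothetical names; only the 1024-character chunk size and the empty-stub guard come from the description above.

```typescript
import { Buffer } from 'node:buffer'

const CHUNK_CHARS = 1024

// Build side: split the base64 text into fixed-size string chunks.
function toBase64Chunks(bytes: Uint8Array, chunkChars = CHUNK_CHARS): string[] {
  const base64 = Buffer.from(bytes).toString('base64')
  const chunks: string[] = []
  for (let i = 0; i < base64.length; i += chunkChars) {
    chunks.push(base64.slice(i, i + chunkChars))
  }
  return chunks
}

// Runtime side: join and decode. An empty array is the dev-mode stub, so
// there is nothing to publish in that case.
function fromBase64Chunks(chunks: readonly string[]): Uint8Array | undefined {
  if (chunks.length === 0) return undefined
  return new Uint8Array(Buffer.from(chunks.join(''), 'base64'))
}
```

With a 205488-byte input this yields 273984 base64 characters, which is where the figure of 268 chunks comes from (267 full chunks plus one partial).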

Verified locally: build embeds 268 chunks, post-build verifies 3 sample
chunks at distinct offsets in the compiled binary, --smoke-tree-sitter
exits 0 with "tree-sitter smoke ok (wasmBinary, 205488 bytes)", full
smoke passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 4 (chunked array literals) still failed on Windows: the build's
own verification step caught the first chunk missing from the compiled
binary. So either:

 - Bun's bundler reads tree-sitter-wasm-bytes.ts at static-analysis
   time, sees `export const X = []` (the committed stub), inlines `X`
   into pre-init's call sites, then DCEs the conditional branch that
   would have referenced the chunks. Whatever my embed script wrote
   later is treated as unused and dropped.
 - OR the file write doesn't propagate to disk before bun reads it on
   Windows.

Switch the export from `const` to a function. Function return values
aren't statically inlinable — the bundler can't substitute a literal
empty array at the call site. The chunks live inside the function
body, only materialized when the pre-init calls
`getTreeSitterWasmChunks()`.

Add a sanity re-read after writing the embed file: if NTFS buffers
the write and bun reads the stale stub, the embed step itself fails
*during the build*, with a clear "wrote N chunks but re-read does not
contain chunk[0]" message — instead of letting the build silently
produce a broken artifact.
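
A sketch of the generated module plus the sanity re-read, under assumptions: only the `getTreeSitterWasmChunks` name and the error wording come from this message; `renderChunkModule`, `writeChunkModule`, and the file layout are hypothetical.

```typescript
import { mkdtempSync, readFileSync, rmSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Emit the chunks inside a function body rather than an exported const.
function renderChunkModule(chunks: readonly string[]): string {
  const body = chunks.map((chunk) => `    ${JSON.stringify(chunk)},`).join('\n')
  return [
    '// Generated file (hypothetical layout).',
    'export function getTreeSitterWasmChunks(): readonly string[] {',
    '  return [',
    body,
    '  ]',
    '}',
    '',
  ].join('\n')
}

// Write the module, then read it back and fail the build step immediately
// if the first chunk is not in what the filesystem returns.
function writeChunkModule(file: string, chunks: readonly string[]): void {
  writeFileSync(file, renderChunkModule(chunks))
  const reread = readFileSync(file, 'utf8')
  const first = chunks[0]
  if (first !== undefined && !reread.includes(JSON.stringify(first))) {
    throw new Error(
      `wrote ${chunks.length} chunks but re-read does not contain chunk[0]`,
    )
  }
}

// Usage against a throwaway directory; returns the generated source.
function demoWrite(): string {
  const dir = mkdtempSync(join(tmpdir(), 'embed-demo-'))
  try {
    const file = join(dir, 'tree-sitter-wasm-bytes.ts')
    writeChunkModule(file, ['AGFzbQEAAAA=', 'AQID'])
    return readFileSync(file, 'utf8')
  } finally {
    rmSync(dir, { recursive: true, force: true })
  }
}
```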

Verified locally: build embeds 268 chunks, post-build verifies 3
chunks in the compiled binary, --smoke-tree-sitter exits 0,
boot smoke passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five attempts to embed the wasm into the bun --compile binary all
failed on Windows in different ways. Either the bytes reached the
binary but the JS-level binding came back undefined at runtime, or
the literals carrying them never survived into the compiled output:

  1. `with { type: 'file' }` of `web-tree-sitter/tree-sitter.wasm`
     subpath — bytes embedded, import variable bound to undefined.
  2. `with { type: 'file' }` of a copied-in relative .wasm — same as #1.
  3. Single 274KB base64 string literal — got dropped by the minifier.
  4. ~268 chunked base64 string literals — same fate.
  5. Function-export wrapping the chunked array, with eager file write
     verification on disk — chunks confirmed on disk after embed,
     still not present in the compiled output.

The bun-compile-on-Windows code path is doing something destructive
to JS-source-level wasm asset references that we cannot reliably
work around from the source. So bypass the bundler entirely: ship
tree-sitter.wasm as a *sibling file* next to the binary.

- cli/scripts/build-binary.ts: copies the wasm from node_modules to
  cli/bin/tree-sitter.wasm after `bun build --compile`, alongside the
  binary. Drops all the embed/verify machinery from previous rounds.
- cli/src/pre-init/tree-sitter-wasm.ts: at runtime, looks for
  `dirname(process.execPath)/tree-sitter.wasm`, sets the env var that
  init-node.ts reads, and (best-effort) reads the bytes synchronously
  to publish on globalThis for the wasmBinary fast path. Both
  channels feed the same SDK init.
- cli/src/pre-init/tree-sitter-wasm-bytes.ts: deleted. No more
  generated module.
- .github/workflows/cli-release-build.yml: tarball includes
  `tree-sitter.wasm` next to the binary (both matrix and Windows-
  specific job).
- cli/release/index.js + freebuff/cli/release/index.js: the npm
  postinstall downloader now also moves tree-sitter.wasm out of the
  temp extraction dir to live next to the installed binary.
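
The runtime half of the sibling-file approach could be sketched like this. The env var name `EXAMPLE_TREE_SITTER_WASM_PATH` is a placeholder (the real name is not given here), and `siblingWasmPath`/`publishSiblingWasm` are hypothetical helpers.

```typescript
import { existsSync, readFileSync } from 'node:fs'
import * as path from 'node:path'

// Placeholder name: the real env var read by init-node.ts is not shown here.
const WASM_PATH_ENV = 'EXAMPLE_TREE_SITTER_WASM_PATH'

interface PathApi {
  dirname(p: string): string
  join(...parts: string[]): string
}

interface PublishedWasm {
  path?: string
  bytes?: Uint8Array
}

// The wasm is expected in the same directory as the executable.
function siblingWasmPath(execPath: string, api: PathApi = path): string {
  return api.join(api.dirname(execPath), 'tree-sitter.wasm')
}

function publishSiblingWasm(
  execPath: string,
  env: Record<string, string | undefined>,
): PublishedWasm {
  const candidate = siblingWasmPath(execPath)
  if (!existsSync(candidate)) return {}
  env[WASM_PATH_ENV] = candidate
  try {
    // Best effort: bytes enable the wasmBinary fast path.
    return { path: candidate, bytes: new Uint8Array(readFileSync(candidate)) }
  } catch {
    // The path alone is still usable by the locateFile fallback.
    return { path: candidate }
  }
}
```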

Verified locally: build copies the wasm into bin/, --smoke-tree-sitter
exits 0 with "tree-sitter smoke ok (wasmBinary, 205488 bytes)", full
boot smoke passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 6 (sibling-file approach) still failed on Windows. The smoke
handler reports the same pre-init-state-empty error even though the
build script copied tree-sitter.wasm next to the binary just before
the smoke step ran.

Add a diagnostic dump that prints process.execPath, dirname, the
computed siblingPath, existsSync result, the dir listing, env var,
and globalThis state. Whatever the next CI Windows run shows here is
what we need to fix.
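
A dump along these lines would cover the fields listed above. It is only a sketch: `collectSmokeDiagnostics` and the `SmokeState` shape are hypothetical, and apart from the `execPath`, `siblingExists`, and `globalThis wasmBinary bytes` labels that appear in the CI logs, the label text is assumed.

```typescript
import { existsSync, readdirSync } from 'node:fs'
import * as path from 'node:path'

interface SmokeState {
  execPath: string
  envPath: string | undefined
  wasmBinaryBytes: number
}

function collectSmokeDiagnostics(state: SmokeState): string[] {
  const dir = path.dirname(state.execPath)
  const siblingPath = path.join(dir, 'tree-sitter.wasm')
  let listing: string
  try {
    listing = readdirSync(dir).join(', ')
  } catch (error) {
    // A directory that cannot be listed is itself a useful data point.
    listing = `<unreadable: ${error instanceof Error ? error.message : String(error)}>`
  }
  return [
    `[smoke diag] execPath=${state.execPath}`,
    `[smoke diag] dirname=${dir}`,
    `[smoke diag] siblingPath=${siblingPath}`,
    `[smoke diag] siblingExists=${existsSync(siblingPath)}`,
    `[smoke diag] dirListing=${listing}`,
    `[smoke diag] envPath=${state.envPath ?? '<unset>'}`,
    `[smoke diag] globalThis wasmBinary bytes=${state.wasmBinaryBytes}`,
  ]
}
```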

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dows

Round 7's diagnostic dump on Windows revealed why
existsSync(siblingPath) was returning false even though the wasm
file was right next to the binary:

    [smoke diag] execPath=D:\a\codebuff\codebuff\cli\bin\codebuff.exe
    [smoke diag] siblingExists=true   (in main())
    [smoke diag] globalThis wasmBinary bytes=0   (set by pre-init)

    Aborted(Error: ENOENT: no such file or directory, open
      'B:\~BUN\root\tree-sitter.wasm')

Pre-init runs at module load. main() runs later. The diag is in
main(), which sees execPath as the disk path. But the ENOENT line
shows what pre-init actually saw: `B:\~BUN\root\tree-sitter.wasm`
— the *bunfs internal* path. So inside a bun --compile binary on
Windows, `process.execPath` returns the bunfs path during early
module evaluation and only switches to the disk path later. Pre-init
silently bailed because that bunfs sibling doesn't exist.

Switch pre-init to use process.argv[0] instead. argv[0] is the path
the binary was *invoked with* — always a real disk path, not a bunfs
internal one. Try execPath as a fallback for environments where
argv[0] is somehow exotic. Whichever yields an existing sibling wins.

Verified locally on macOS where execPath was already the disk path:
build copies wasm to bin/, pre-init finds and reads it,
--smoke-tree-sitter exits 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…back

Round 8 (argv[0] in pre-init) failed on Windows for the same reason
round 6 (execPath in pre-init) did:

    [pre-init diag] argv[0]=bun                              # not a path!
    [pre-init diag] execPath=B:\~BUN\root\<binary>.exe       # bunfs

Pre-init runs at module evaluation time. Inside a bun --compile binary
on Windows during that phase, both `process.argv[0]` and
`process.execPath` lie:

 - argv[0] is `"bun"` (the runtime name), not a real path
 - execPath is the *bunfs internal* path (`B:\~BUN\root\...`),
   not the disk path of the .exe

Both stabilize to real paths by the time main() runs (round 7's main()
diag confirmed that), but the SDK's eager Parser.init has already
fired by then with bad path data.

The fix: do the sibling-file lookup *inside the locateFile callback*
in code-map's init-node.ts. emscripten calls that callback during
Parser.init's async work, after process.execPath has stabilized to
the disk path. By then, `path.join(dirname(process.execPath),
'tree-sitter.wasm')` resolves correctly.

- packages/code-map/src/init-node.ts: add a sibling-of-execPath
  check between the existing scriptDir fallback and the require.resolve
  fallback. Improves the thrown-error message to include the
  attempted execPath dir so future failures are easier to diagnose.
- cli/src/pre-init/tree-sitter-wasm.ts: keep the eager lookup as a
  best-effort fast path (it works on macOS/Linux where execPath is
  the disk path from module-load); on Windows it silently no-ops and
  the locateFile callback handles things lazily. Diagnostic dump
  remains gated on --smoke-tree-sitter so we can see what each phase
  thinks the paths are.
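
The key property is that execPath is read when the callback runs, not when the module loads. A minimal sketch: the `(fileName, scriptDirectory)` shape follows emscripten's locateFile convention, while `makeLocateFile`, the injected `getExecPath`/`exists` functions, and the error text are hypothetical.

```typescript
import * as path from 'node:path'

type LocateFile = (fileName: string, scriptDirectory: string) => string

interface PathApi {
  dirname(p: string): string
  join(...parts: string[]): string
}

function makeLocateFile(
  getExecPath: () => string,
  exists: (candidate: string) => boolean,
  api: PathApi = path,
): LocateFile {
  return (fileName, scriptDirectory) => {
    const fromScriptDir = api.join(scriptDirectory, fileName)
    if (exists(fromScriptDir)) return fromScriptDir
    // Evaluated lazily: by the time emscripten calls back, execPath should
    // have settled to the on-disk location of the executable.
    const execDir = api.dirname(getExecPath())
    const sibling = api.join(execDir, fileName)
    if (exists(sibling)) return sibling
    throw new Error(
      `could not locate ${fileName}; tried ${fromScriptDir} and execPath dir ${execDir}`,
    )
  }
}
```

In real use the first argument would be something like `() => process.execPath` and the second `fs.existsSync`; injecting them keeps the sketch testable without a compiled binary.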

The SDK dist also needs rebuilding so the bundled init-node.ts copy
picks up this change — included in the diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jahooma and others added 6 commits May 4, 2026 03:09
…nyway

Round 9 logs showed our locateFile fallback was returning the bunfs
path (`B:\~BUN\root\tree-sitter.wasm`), and emscripten then ENOENT'd
on it. The sibling-of-execPath fallback I added in the previous
commit never ran because the scriptDir branch above it took the
`isBunEmbeddedPath` shortcut and returned early.

The shortcut was based on a wrong assumption: that emscripten could
read bunfs paths. It can't — emscripten's `readAsync` calls
`fs.readFile` under the hood, and `fs.readFile('B:\~BUN\root\...')`
fails the same way `fs.existsSync` does on those paths.

Remove the shortcut. Now resolveTreeSitterWasm only returns paths
that `fs.existsSync` confirms — which on Windows means we skip the
bunfs scriptDir fallback and fall through to the
`dirname(process.execPath)` sibling, where the build script copied
tree-sitter.wasm next to the binary.
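
The difference between the removed shortcut and the new behavior can be sketched with a toggle. The name `isBunEmbeddedPath` is borrowed from the message above, but this body, the regex, `resolveWasmCandidate`, and the `trustBunfs` option are all assumptions for illustration.

```typescript
// Matches the POSIX form (/~BUN/root/...) and the Windows form
// (B:\~BUN\root\...). The pattern is a guess at what counts as bunfs.
function isBunEmbeddedPath(candidate: string): boolean {
  return /(^|[\\\/])~BUN[\\\/]/.test(candidate)
}

interface ResolveOptions {
  // true reproduces the removed shortcut: trust bunfs paths unchecked.
  trustBunfs: boolean
}

function resolveWasmCandidate(
  candidates: readonly string[],
  exists: (candidate: string) => boolean,
  options: ResolveOptions,
): string {
  for (const candidate of candidates) {
    if (options.trustBunfs && isBunEmbeddedPath(candidate)) return candidate
    // Only return what the filesystem confirms, so later fallbacks get a turn.
    if (exists(candidate)) return candidate
  }
  throw new Error(`tree-sitter.wasm not found; tried: ${candidates.join(', ')}`)
}
```

With `trustBunfs: true` the first bunfs candidate wins and the sibling path is never reached, which matches the failure described here.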

Verified locally: build copies wasm to bin/, --smoke-tree-sitter
exits 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 10 still failed on Windows because the smoke handler in main()
doesn't go through init-node's locateFile callback at all — it
calls Parser.init directly, so my init-node sibling fallback
(rounds 9-10) never runs during the smoke step.

Diagnostic confirmed: at main() time, process.execPath is the disk
path on Windows AND the sibling tree-sitter.wasm exists right next
to it. Pre-init couldn't reach the file (execPath was bunfs at that
phase), so wasmBinary and wasmPath were both empty when smoke ran.

Add the sibling lookup directly to the smoke handler, gated on
those being empty. By main() time the disk path is reliable, so
fs.existsSync(path.join(dirname(execPath), 'tree-sitter.wasm'))
resolves correctly and we have something to feed Parser.init.
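
The gating can be sketched as a precedence rule: published bytes first, then a published path, and only then the sibling lookup. `pickSmokeInput`, `PreInitState`, and `SmokeInput` are hypothetical names, not the handler's real types.

```typescript
import * as path from 'node:path'

interface PathApi {
  dirname(p: string): string
  join(...parts: string[]): string
}

interface PreInitState {
  wasmBinary?: Uint8Array
  wasmPath?: string
}

type SmokeInput =
  | { kind: 'wasmBinary'; bytes: Uint8Array }
  | { kind: 'path'; path: string }
  | { kind: 'none' }

function pickSmokeInput(
  state: PreInitState,
  execPath: string,
  exists: (candidate: string) => boolean,
  api: PathApi = path,
): SmokeInput {
  if (state.wasmBinary !== undefined && state.wasmBinary.length > 0) {
    return { kind: 'wasmBinary', bytes: state.wasmBinary }
  }
  if (state.wasmPath) {
    return { kind: 'path', path: state.wasmPath }
  }
  // Pre-init published nothing: fall back to the file next to the binary.
  const sibling = api.join(api.dirname(execPath), 'tree-sitter.wasm')
  return exists(sibling) ? { kind: 'path', path: sibling } : { kind: 'none' }
}
```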

Real users (no --smoke-tree-sitter flag) still go through the
init-node sibling fallback in the SDK's eager Parser.init — that's
unaffected by this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pull pull Bot locked and limited conversation to collaborators May 4, 2026
@pull pull Bot added the ⤵️ pull label May 4, 2026
@pull pull Bot merged commit 86ebd09 into axistore80-coder:main May 4, 2026
4 of 7 checks passed