fix(gemini-web): prefer latest non-empty stream chunk in parser#154

Closed
manhtruong03 wants to merge 1 commit into steipete:main from manhtruong03:fix/gemini-streaming-parse-empty-text

Conversation

manhtruong03 commented May 1, 2026

Summary

Fixes the (no text output) symptom for gemini-3-pro / gemini-3.1-pro browser-engine runs by making parseGeminiStreamGenerateResponse walk every wrb.fr chunk and prefer the latest one with non-empty text, instead of break-ing at the first chunk with a candidate (which is usually an empty placeholder under streaming).

Fixes #153.

What changed

  • src/gemini-web/client.ts — two adjustments inside parseGeminiStreamGenerateResponse:
    • drop the early break in the chunk-selection loop; iterate to the end and pick the latest chunk whose text is non-empty. The first chunk with a candidate is kept as the anchor so coalesced 1-chunk responses still parse.
    • switch the candidate-text read from getNestedValue<string>(...) to getNestedValue<unknown>(...) and narrow with typeof === "string". The helper does not validate the runtime shape; this matches the pattern already used elsewhere in the file (lines 256, 269) and prevents a silent lock-in if the schema drifts.
  • tests/gemini-web/parse.test.ts — 3 new unit tests covering multi-chunk streaming, single-chunk regression guard, and a defensive empty-after-non-empty case. Two small synthetic-body helpers added to keep the new cases readable.
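
The two adjustments above can be sketched roughly as follows. This is a minimal illustration, not the code in src/gemini-web/client.ts: the chunk shape, the `getNested` helper, the `CANDIDATE_PATH` / `TEXT_PATH` constants, and the `selectBody` name are all assumptions made for the example. The real parser uses getNestedValue against the actual wrb.fr payload layout, which is not reproduced here.

```typescript
// Hypothetical stand-in for a parsed wrb.fr chunk body. The real payload is a
// deeply nested array; the layout used here is an assumption for this sketch.
type ChunkBody = unknown;

// Simplified helper in the spirit of getNestedValue: it walks a path and
// returns whatever it finds, without validating the runtime type.
function getNested(value: unknown, path: number[]): unknown {
  let current: unknown = value;
  for (const key of path) {
    if (!Array.isArray(current)) return undefined;
    current = current[key];
  }
  return current;
}

// Assumed paths (illustration only): body[0] is the candidate, body[0][0] its text.
const CANDIDATE_PATH = [0];
const TEXT_PATH = [0, 0];

interface Selection {
  bodyIndex: number; // -1 when no chunk carries a candidate
  text: string;
}

function selectBody(chunks: ChunkBody[]): Selection {
  let selected: Selection = { bodyIndex: -1, text: "" };
  for (let index = 0; index < chunks.length; index++) {
    const body = chunks[index];
    if (getNested(body, CANDIDATE_PATH) === undefined) continue;

    // Read as unknown and narrow, so a non-string value is treated as empty
    // text instead of being trusted as a string.
    const raw = getNested(body, TEXT_PATH);
    const text = typeof raw === "string" ? raw : "";

    if (selected.bodyIndex === -1) {
      // First chunk with a candidate is the anchor (covers coalesced 1-chunk responses).
      selected = { bodyIndex: index, text };
    } else if (text !== "") {
      // No early break: a later chunk with non-empty text replaces the selection.
      selected = { bodyIndex: index, text };
    }
  }
  return selected;
}
```

With a placeholder-first stream such as `[[[""]], [["Hel"]], [["Hello"]], [[""]]]`, this sketch selects index 2 with text "Hello"; a trailing empty chunk does not displace the earlier non-empty one.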

Diff size: +56 / -7 across 2 files.

Why this approach

  • Smallest possible change that resolves the symptom. Buffer-and-concat would risk double-printing because each chunk is already cumulative on the wire.
  • Preserves existing single-chunk behavior (the route that masks this bug on macOS) — that path was working and stays unchanged.
  • Image-generation branch — not exercised in this PR. The image extraction loop (src/gemini-web/client.ts:297-328) iterates from bodyIndex forward looking for chunks with image data. With the new predicate, bodyIndex is the latest chunk with non-empty text instead of the first chunk with a candidate. In every captured response the image-bearing chunks arrive at-or-after the final text chunk, so the loop still finds them — but I do not have an automated test for this and a maintainer with --generate-image access may want to spot-check before merging.
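
To illustrate the double-printing concern with buffer-and-concat, here is a small sketch using made-up cumulative chunk texts. The sample strings and both function names are hypothetical; only the observation that each chunk is cumulative on the wire comes from the captured responses.

```typescript
// Hypothetical cumulative chunk texts: each chunk repeats everything sent so far.
const cumulativeChunks: string[] = ["", "Hel", "Hello, wor", "Hello, world"];

// Buffer-and-concat: wrong for cumulative chunks, because earlier prefixes repeat.
function concatAll(texts: string[]): string {
  return texts.join("");
}

// Latest non-empty: keeps only the most recent chunk that carries text.
function latestNonEmpty(texts: string[]): string {
  let latest = "";
  for (const text of texts) {
    if (text !== "") latest = text;
  }
  return latest;
}

const concatenated = concatAll(cumulativeChunks); // "HelHello, worHello, world"
const latest = latestNonEmpty(cumulativeChunks); // "Hello, world"
```

Concatenation would only be correct if the chunks were deltas rather than cumulative snapshots.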

How to test

pnpm install
pnpm typecheck
pnpm vitest run tests/gemini-web        # 18 / 18 pass (15 existing + 3 new)
pnpm vitest run                         # 615 pass / 0 fail / 49 skip

# End-to-end (needs a Gemini account already signed in via --browser-manual-login)
pnpm start --engine browser --browser-manual-login \
  --model gemini-3-pro -p "ping" --file README.md --force

Expected end-to-end: a non-empty Answer: block and ↓<n> tokens > 0 (was ↓0 before the fix).

Test plan

  • pnpm typecheck passes.
  • pnpm vitest run tests/gemini-web passes (18/18).
  • Full suite pnpm vitest run passes (615/0/49).
  • Manual end-to-end repro on Windows 11 / Node 25, browser engine, gemini-3-pro — answer text renders, output tokens > 0 (was ↓0 before the fix).
  • Cross-platform validation deferred to CI; I do not have macOS or Linux machines to run locally. If CI does not exercise the browser engine path, I am happy to add coverage for it or to debug any failures the maintainer surfaces.
  • --generate-image branch not exercised; see "Why this approach" above and "Risks and non-goals" below.

Risks and non-goals

  • No public API change. Parser shape is internal; only text extraction logic moves.
  • bodyIndex semantic shift. Before: first chunk with a candidate. After: the chunk eventually selected as body (typically the last with non-empty text). Only one in-function consumer (hasGenerated) reads bodyIndex; behavior verified for the text path, not actively tested for the image path. Documented as a known limitation in the issue comment.
  • Trailing-error-chunk edge case. The new predicate prefers the last chunk with non-empty text. If the server ever ships a final chunk whose text is a server-side error string, it would silently override the real answer. Not observed in captured payloads; will harden if it surfaces.
  • Out of scope: adding a log?.warn(...) when res.ok but text === "" — separate follow-up PR. "Longest text" predicate also out of scope (no data justifying it yet).
  • No new dependencies, no config changes, no migrations.
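
One possible way to sidestep the bodyIndex shift, sketched under assumptions, is to track two indices: an anchor at the first candidate chunk for the image scan, and a separate index for the selected text. This PR does not do that. The `Chunk` shape, `pickIndices`, and `collectImages` below are hypothetical names for illustration, and the sample ordering (an image chunk arriving before the final text chunk) has not been observed in captured payloads.

```typescript
// Hypothetical, flattened view of a chunk; the real payload is a nested array.
interface Chunk {
  hasCandidate: boolean;
  text: string;
  imageUrls: string[];
}

interface Indices {
  anchorIndex: number; // first chunk with a candidate, or -1
  textIndex: number; // latest chunk with non-empty text, falling back to the anchor
}

function pickIndices(chunks: Chunk[]): Indices {
  let anchorIndex = -1;
  let textIndex = -1;
  for (let index = 0; index < chunks.length; index++) {
    const chunk = chunks[index];
    if (!chunk.hasCandidate) continue;
    if (anchorIndex === -1) anchorIndex = index;
    if (chunk.text !== "") textIndex = index;
  }
  if (textIndex === -1) textIndex = anchorIndex;
  return { anchorIndex, textIndex };
}

// Scans forward from a starting index, as the image extraction loop is described to do.
function collectImages(chunks: Chunk[], fromIndex: number): string[] {
  const urls: string[] = [];
  if (fromIndex < 0) return urls;
  for (const chunk of chunks.slice(fromIndex)) {
    urls.push(...chunk.imageUrls);
  }
  return urls;
}

// Hypothetical ordering where the image arrives before the final text chunk.
const sampleChunks: Chunk[] = [
  { hasCandidate: true, text: "", imageUrls: [] },
  { hasCandidate: true, text: "", imageUrls: ["img-a"] },
  { hasCandidate: true, text: "Here is your image", imageUrls: [] },
];
const indices = pickIndices(sampleChunks);
const fromAnchor = collectImages(sampleChunks, indices.anchorIndex); // finds "img-a"
const fromText = collectImages(sampleChunks, indices.textIndex); // finds nothing
```

In this made-up ordering, scanning from the text index would miss the image while scanning from the anchor would not, which is the case the image-path caveat is about.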

Notes for the reviewer

  • The typeof guard reads slightly defensive for a 4-line patch; rationale is in the inline comment and in the issue thread (the helper does not validate runtime types).
  • A 48 KB raw response payload from a real gemini-3-pro run was used to derive the chunk sequence. It is not committed because (a) the synthetic minimal bodies in the new tests cover the same shapes more readably, and (b) the raw response embeds an account-bound conversation URL that I would need to scrub before sharing. I can attach a scrubbed copy to the issue if a reviewer wants it.

Branch: fix/gemini-streaming-parse-empty-text (single commit: fix(gemini-web): prefer latest non-empty stream chunk in parser).

Gemini streams BardChatUi responses across multiple wrb.fr chunks; the
first chunk with a candidate often holds an empty placeholder. The
parser broke at that placeholder and returned text="", surfacing as
'(no text output)' for gemini-3-pro / gemini-3.1-pro browser runs. The
bug was hard to spot because some routes (notably macOS) coalesce the
stream into a single chunk and never hit the placeholder.

Loop through all chunks instead, preferring the latest one whose text
is non-empty. Add a typeof guard since getNestedValue does not
validate the runtime type - without it, a future schema drift would
silently lock the parser onto a wrong chunk.

Adds 3 unit tests in tests/gemini-web/parse.test.ts: multi-chunk
streaming, single-chunk regression guard, and empty-after-non-empty
reset.
@manhtruong03 manhtruong03 force-pushed the fix/gemini-streaming-parse-empty-text branch from a23e8e4 to d92baef Compare May 1, 2026 16:36
steipete (Owner) commented May 3, 2026

Superseded by #157, which landed the same Gemini streaming parser fix with regression tests and kept the generated-image scan anchored at the first candidate chunk. Thanks again @manhtruong03; credited in CHANGELOG.md.

Development

Successfully merging this pull request may close these issues.

Bug: Gemini browser mode returns (no text output) for gemini-3-pro / gemini-3.1-pro
