Fix mid-word completion inserting a stray space on accept#622
Merged
Conversation
insertionChunk synthesized a separator whenever a completion's first word met the user's partial word, so "after" + model "noon" committed as "after noon" even though the ghost text showed "afternoon". The space was injected only at accept time, so it never appeared in the overlay: a non-WYSIWYG surprise on every mid-word accept. The base-model prompt ends at a clean boundary, so the model's first token already encodes intent: a leading space means new word, none means continue the current word. Honor it by removing the boundary synthesis (Rule 2) and typing the chunk verbatim. The leading-space dedup (Rule 1) stays, because the prompt is trimmed before generation so the model always emits a leading space for a new word that would otherwise double against whitespace the field already provides.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
On Tab, a completion that continued the user's partial word (caret at the end of
after, model suggestsnoon) committed asafter nooninstead ofafternoon. The accept path synthesized a word boundary the ghost text never showed, so the inserted text disagreed with the overlay. This removes that synthesis and trusts the model's own leading-space signal, so what Tab inserts matches what the ghost showed.The base-model prompt ends at a clean boundary (
BaseCompletionPromptRenderertrims trailing whitespace), so the model's first token already encodes intent: a leading space means "new word", none means "continue the current word".insertionChunknow types the chunk verbatim and lets that decide the boundary, instead of injecting a separator whenever a word character met a word character.Validation
Added a regression test pinning the reported case:
insertionChunk(forAcceptedChunk: "noon", precedingText: "after") == "noon"(was" noon"). Manually confirmed in-app that mid-word completions now accept without a stray space, and that genuine new-word completions still arrive with their space.Linked issues
Master tracking issue: #623
Fixes #621
Fixes #620
Fixes #617
Fixes #615
Fixes #614
Fixes #559
Fixes #557
Fixes #553
Fixes #548
Fixes #547
Fixes #525
Fixes #491
Fixes #479
Fixes #395
Fixes #549
Fixes #543
Fixes #507
Risk / rollout notes
insertionChunkno longer synthesizes a separator when a word-character chunk meets a word-character prefix; it types the chunk verbatim. The model's leading space (already carried into the first chunk bynextAcceptanceChunk) is what now marks a new word.Hello+World), the words now glue intoHelloWorldrather than being separated. This is WYSIWYG, the ghost text showed the glue, and it is strictly less confusing than the previous silent injection that producedafter noonfrom a ghost readingafternoon.afternoonglued. The data to tune it lives inllm-io.jsonl.Greptile Summary
Fixes a bug where mid-word Tab completions committed with a stray synthesized space —
after+noonbecameafter nooninstead ofafternoon. The fix removes the old word-boundary synthesis ininsertionChunk, leaving only the double-space dedup, so the accepted text now matches exactly what the ghost overlay showed (WYSIWYG).SuggestionSessionReconciler.insertionChunk: Old Rule 2 (inject a space whenever chunk-starts-with-word-char meets prefix-ends-in-word-char) is dropped entirely. The function now either strips a redundant leading space (when the live field already ends in whitespace) or types the chunk verbatim, trusting the model's own leading-space signal.test_insertionChunk_continuesPartialWordWhenModelOmitsLeadingSpaceregression case locks in the exactafter/noon→afternoonscenario from issue [Bug] Auto-Complete Bug #621.Confidence Score: 4/5
Safe to merge; change is confined to the accept path and makes committed text match what the ghost overlay showed.
The removal of word-boundary synthesis is clearly correct and matches the documented WYSIWYG intent. The doc comment for
insertionChunkgrounds its reasoning inBaseCompletionPromptRenderer's prompt-trimming behaviour, but the function is also exercised by the Foundation Model backend whose prompt renderer does not trim trailing whitespace — the contract still holds, but the stated rationale only covers one of the two code paths.The doc comment in
SuggestionSessionReconciler.insertionChunk(lines 415–420) deserves a second look to confirm it accurately describes both the llama and Foundation Model backends.Important Files Changed
insertionChunk, leaving only the double-space dedup (old Rule 1); logic is clean and well-commented, with a minor doc-comment gap for the Foundation Model pathFlowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A["insertionChunk(chunk, precedingText)"] --> B{precedingText ends\nin horizontal whitespace?} B -- Yes --> C["Drop leading horizontal\nwhitespace from chunk"] C --> D["Return stripped chunk\n(prevents double-space)"] B -- No --> E["Return chunk verbatim\n(trust model's leading-space signal)"] E --> F{Did model include\nleading space?} F -- "Yes, e.g. ' World'" --> G["'Hello' + ' World'\n→ 'Hello World' ✓"] F -- "No, e.g. 'noon'" --> H["'after' + 'noon'\n→ 'afternoon' ✓ (was 'after noon')"]Reviews (1): Last reviewed commit: "Trust the model's word boundary on sugge..." | Re-trigger Greptile