Tune local model prompt and output cleanup#205
Open
Jam-Cai wants to merge 3 commits into
Open
Conversation
fdf51a2 to
f5c7288
Compare
- Update suggestedPredictionTokenBudget doc-comment to ~2x (values are 14/7, 24/12, 40/20 = exactly 2.0x), not the stale ~1.5x. - Only strip <s>/</s> and [INST]/[/INST] at the start/end of raw output. These are valid in user content (HTML strikethrough, prompt-template docs), so global stripping could silently mangle a correct mid-completion. Unambiguous <|...|> markers still strip globally.
Owner
|
was this for support mlx?? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Validation
swiftlint lint --quiet tabby/Models/SuggestionModels.swift tabby/Support/LlamaPromptRenderer.swift tabby/Support/SuggestionTextNormalizer.swift tabbyTests/LlamaPromptRendererTests.swift tabbyTests/ModelAndPresentationValueTests.swift tabbyTests/SuggestionRequestFactoryTests.swift tabbyTests/SuggestionTextNormalizerTests.swiftxcodebuild -project tabby.xcodeproj -scheme tabby -destination 'platform=macOS' CODE_SIGNING_ALLOWED=NO buildxcodebuild -project tabby.xcodeproj -scheme tabby -destination 'platform=macOS' CODE_SIGNING_ALLOWED=NO build-for-testingGreptile Summary
This PR tunes Cotabby's local-model completion pipeline: token budgets are raised from ~1.5× to ~2× the upper word bound, the context window is widened, two new prompt instructions are added for tone/code fidelity, and backend-specific control-token cleanup is consolidated into a new
stripKnownControlTokenshelper that handles five additional template markers. All affected tests are updated to match the new contracts.SuggestionModels.swift):maxPredictionTokensgoes from 8 → 16,maxPrefixCharactersfrom 1000 → 2000, and each preset'ssuggestedPredictionTokenBudgetis roughly doubled to accommodate modern subword tokenizers.LlamaPromptRenderer.swift): Two new bullets instruct the model to match the user's language/tone/casing and to preserve code symbols, with corresponding test assertions.SuggestionTextNormalizer.swift): Replaces the two hardcodedreplacingOccurrencescalls with a unified helper that strips five globally-safe tokens everywhere and four ambiguous tokens (<s>,</s>,[INST],[/INST]) only at response boundaries.Confidence Score: 5/5
Safe to merge — all changes are confined to pure helper logic and their tests, with no side-effectful service boundaries touched.
Every changed file is either a pure value type, a stateless helper, or a test. The token-budget and context-window increases are well-commented and consistently reflected in tests. The new stripKnownControlTokens helper correctly handles the most common template-leak patterns, and the boundary-only treatment of
/</s> directly addresses the HTML-strikethrough concern from the previous review thread.No files require special attention; the only nuance is in SuggestionTextNormalizer.swift's boundary-strip loop ordering, which is a latent edge case rather than an active defect.
Important Files Changed
stripKnownControlTokens; expands the strip-everywhere list with five new tokens and adds boundary-only handling for<s>,</s>,[INST],[/INST]. Single-pass loop ordering can miss a<s>exposed after stripping a leading</s>.maxPredictionTokensdoubled to 16 andmaxPrefixCharactersdoubled to 2000. Well-commented and consistently tested.maxPredictionTokensassertion to 40 and extracted a long inline string into a named variable for readability.Reviews (3): Last reviewed commit: "Address review: fix token-budget doc and..." | Re-trigger Greptile