
Tune local model prompt and output cleanup #205

Open
Jam-Cai wants to merge 3 commits into main from feature/llm-engineering-improvements

Jam-Cai (Collaborator) commented May 25, 2026

Summary

  • Increases local completion token budgets and prompt context window for better short-form continuation quality.
  • Tightens local-model prompt instructions around tone, casing, indentation, punctuation, and code continuation.
  • Expands backend-neutral output cleanup for MLX/HuggingFace chat-template control tokens.
  • Updates prompt, normalizer, request factory, and value-model tests for the new contracts.

Validation

  • swiftlint lint --quiet tabby/Models/SuggestionModels.swift tabby/Support/LlamaPromptRenderer.swift tabby/Support/SuggestionTextNormalizer.swift tabbyTests/LlamaPromptRendererTests.swift tabbyTests/ModelAndPresentationValueTests.swift tabbyTests/SuggestionRequestFactoryTests.swift tabbyTests/SuggestionTextNormalizerTests.swift
  • xcodebuild -project tabby.xcodeproj -scheme tabby -destination 'platform=macOS' CODE_SIGNING_ALLOWED=NO build
  • xcodebuild -project tabby.xcodeproj -scheme tabby -destination 'platform=macOS' CODE_SIGNING_ALLOWED=NO build-for-testing

Greptile Summary

This PR tunes Cotabby's local-model completion pipeline: token budgets are raised from ~1.5× to ~2× the upper word bound, the context window is widened, two new prompt instructions are added for tone/code fidelity, and backend-specific control-token cleanup is consolidated into a new stripKnownControlTokens helper that handles five additional template markers. All affected tests are updated to match the new contracts.

  • Token budget and context window increases (SuggestionModels.swift): maxPredictionTokens goes from 8 → 16, maxPrefixCharacters from 1000 → 2000, and each preset's suggestedPredictionTokenBudget is roughly doubled to accommodate modern subword tokenizers.
  • Prompt instruction additions (LlamaPromptRenderer.swift): Two new bullets instruct the model to match the user's language/tone/casing and to preserve code symbols, with corresponding test assertions.
  • Control-token cleanup expansion (SuggestionTextNormalizer.swift): Replaces the two hardcoded replacingOccurrences calls with a unified helper that strips five globally-safe tokens everywhere and four ambiguous tokens (<s>, </s>, [INST], [/INST]) only at response boundaries.
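
The split between strip-everywhere and boundary-only tokens can be sketched roughly as below. This is a minimal illustration, not the PR's implementation: the function name `stripControlTokensSketch` and the five `<|...|>` markers are assumptions (the thread does not list the actual tokens), and the boundary pass is deliberately written as a loop to a fixed point so that a marker exposed by an earlier strip is still removed.

```swift
import Foundation

// Hypothetical token lists. The PR's real lists are not shown in this thread;
// these are common chat-template markers used only for illustration.
let globalTokens = ["<|im_start|>", "<|im_end|>", "<|endoftext|>", "<|eot_id|>", "<|end|>"]
let boundaryTokens = ["<s>", "</s>", "[INST]", "[/INST]"]

func stripControlTokensSketch(_ raw: String) -> String {
    var text = raw

    // Unambiguous <|...|> markers are removed wherever they occur.
    for token in globalTokens {
        text = text.replacingOccurrences(of: token, with: "")
    }

    // Ambiguous markers are removed only at the very start or end of the output.
    // Repeat until nothing changes, so "</s><s>..." loses both markers.
    var changed = true
    while changed {
        changed = false
        for token in boundaryTokens {
            if text.hasPrefix(token) {
                text.removeFirst(token.count)
                changed = true
            }
            if text.hasSuffix(token) {
                text.removeLast(token.count)
                changed = true
            }
        }
    }
    return text
}
```

Under these assumptions, `stripControlTokensSketch("</s><s>Hello<|im_end|>")` reduces to `"Hello"`, while a mid-string `<s>` such as in `"use <s>old</s> for strikethrough"` is left untouched. The sketch does not trim whitespace between stacked boundary markers; whether the real helper should is a separate design choice.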

Confidence Score: 5/5

Safe to merge — all changes are confined to pure helper logic and their tests, with no side-effectful service boundaries touched.

Every changed file is either a pure value type, a stateless helper, or a test. The token-budget and context-window increases are well-commented and consistently reflected in tests. The new stripKnownControlTokens helper correctly handles the most common template-leak patterns, and the boundary-only treatment of <s>/</s> directly addresses the HTML-strikethrough concern from the previous review thread.

No files require special attention; the only nuance is in SuggestionTextNormalizer.swift's boundary-strip loop ordering, which is a latent edge case rather than an active defect.

Important Files Changed

| Filename | Overview |
| --- | --- |
| Cotabby/Support/SuggestionTextNormalizer.swift | Extracted control-token stripping into stripKnownControlTokens; expands the strip-everywhere list with five new tokens and adds boundary-only handling for <s>, </s>, [INST], [/INST]. Single-pass loop ordering can miss a <s> exposed after stripping a leading </s>. |
| CotabbyTests/SuggestionTextNormalizerTests.swift | Adds a combined MLX/HuggingFace token test; existing tests all pass. Four of the five newly added tokens have no isolated test cases. |
| Cotabby/Models/SuggestionModels.swift | Token budgets raised from ~1.5× to ~2× per preset; maxPredictionTokens doubled to 16 and maxPrefixCharacters doubled to 2000. Well-commented and consistently tested. |
| Cotabby/Support/LlamaPromptRenderer.swift | Two new instruction bullets added for tone/casing/indentation matching and code continuation guidance. Tests updated to assert on the new strings. |
| CotabbyTests/LlamaPromptRendererTests.swift | Two new assertions verify the added prompt instruction lines are present in the rendered output. |
| CotabbyTests/ModelAndPresentationValueTests.swift | Token budget assertions updated to match the new ~2× multiplier values. Straightforward numeric updates. |
| CotabbyTests/SuggestionRequestFactoryTests.swift | Updated the maxPredictionTokens assertion to 40 and extracted a long inline string into a named variable for readability. |
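
The ~2× budget rule described for SuggestionModels.swift can be illustrated with a small sketch. It assumes the figures quoted in the commit message (14/7, 24/12, 40/20) are token budget over upper word bound; the type `LengthPresetSketch` and the preset names are hypothetical and are not taken from the PR.

```swift
// Hypothetical preset type for illustration only; not the PR's actual model.
struct LengthPresetSketch {
    let name: String
    let upperWordBound: Int

    // Assumed rule: about two tokens per word, leaving headroom for
    // subword tokenizers that split one word into several tokens.
    var suggestedPredictionTokenBudget: Int {
        upperWordBound * 2
    }
}

// Upper word bounds inferred from the commit message's 14/7, 24/12, 40/20.
let presetSketches = [
    LengthPresetSketch(name: "short", upperWordBound: 7),
    LengthPresetSketch(name: "medium", upperWordBound: 12),
    LengthPresetSketch(name: "long", upperWordBound: 20),
]

let budgetSketches = presetSketches.map { $0.suggestedPredictionTokenBudget }
```

With these inputs the computed budgets are 14, 24, and 40, matching the values cited in the commit message. The real code may store the budgets as literals rather than derive them.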

Reviews (3): Last reviewed commit: "Address review: fix token-budget doc and..."

Comment thread Cotabby/Support/SuggestionTextNormalizer.swift Outdated
@FuJacob FuJacob force-pushed the feature/llm-engineering-improvements branch from fdf51a2 to f5c7288 Compare May 25, 2026 04:29
FuJacob added 2 commits May 25, 2026 04:03
- Update suggestedPredictionTokenBudget doc-comment to ~2x (values are 14/7,
  24/12, 40/20 = exactly 2.0x), not the stale ~1.5x.
- Only strip <s>/</s> and [INST]/[/INST] at the start/end of raw output. These
  are valid in user content (HTML strikethrough, prompt-template docs), so
  global stripping could silently mangle a correct mid-completion. Unambiguous
  <|...|> markers still strip globally.
FuJacob (Owner) commented May 25, 2026

Was this to support MLX?

2 participants