Skip to content

Add base-model completion pipeline behind a default-off flag#495

Merged
FuJacob merged 1 commit into
mainfrom
experimental/base-completion
Jun 1, 2026
Merged

Add base-model completion pipeline behind a default-off flag#495
FuJacob merged 1 commit into
mainfrom
experimental/base-completion

Conversation

@FuJacob
Copy link
Copy Markdown
Owner

@FuJacob FuJacob commented Jun 1, 2026

Summary

First step of transitioning the Open Source (llama) path off instruction-tuned models onto base-model continuation. Instruct checkpoints fed Cotabby's instruction-blob prompt tend to leak assistant-style replies and echo scaffolding; a base model continues the user's text natively. This lands the base-model prompt machinery behind a default-off flag so it can be validated against a base model before the default is flipped.

When useBaseCompletionPipeline is on and the Open Source engine is selected, SuggestionRequestFactory renders via the new BaseCompletionPromptRenderer: no task preamble, no standalone Label: lines, persona/style/language/context folded into a short conditioning preface (a base model conditions on description, it does not obey instructions), and the caret prefix emitted last with trailing whitespace trimmed so generation starts at a clean word boundary. The path reuses the already-merged generation-time constraints (single-line masking, mid-word continuation).

Validation

xcodebuild build-for-testing -project Cotabby.xcodeproj -scheme Cotabby \
  -destination 'platform=macOS' -derivedDataPath build/DerivedData
# ** BUILD SUCCEEDED **  (full Cotabby app + CotabbyTests target compile)

swiftlint lint --config .swiftlint.yml --quiet   # touched files, exit 0

New BaseCompletionPromptRendererTests adds 7 pure-function cases (prefix-last invariant, no-preamble/no-labels, trailing-trim with mid-word preserved, persona conditioning, context-only-when-supplied) and compiles under the test target.

Local test execution is blocked by a Team ID / code-signing mismatch loading the app-hosted CotabbyTests.xctest bundle into the signed host (both CODE_SIGNING_ALLOWED=NO and normal signing hit it). Per the repo's testing note, reporting it and relying on build-for-testing; CI runs the suite with proper signing.

Linked issues

None — experimental, flag-gated.

Risk / rollout notes

  • Dark by default. useBaseCompletionPipeline defaults false. With it off, SuggestionRequestFactory uses the existing LlamaPromptRenderer and behavior is byte-for-byte unchanged. The new renderer and its tests are inert until the flag is set (defaults write <bundle id> cotabbyBaseCompletionPipelineEnabled -bool YES).
  • pbxproj migration: two new source files added via xcodegen generate (additive 8-line diff, no unrelated churn).
  • No cross-repo engine change: the base path reuses the already-merged generation-time constraints, so cotabbyinference needs nothing new for this.
  • Follow-ups (need an on-device base model + validation, cannot run in CI):
    • Provision / auto-select a base GGUF when the flag is on (today a user imports a base GGUF; recommended sizes mirror the field, e.g. a ~1.5B base).
    • A/B the base path vs current instruct output over real llm-io.jsonl prefixes, then flip the default.
    • Visual-context hygiene (drop the LLM OCR summarizer, add corruption filters) and per-app prose/code policy.

🤖 Generated with Claude Code

Greptile Summary

This PR introduces a BaseCompletionPromptRenderer behind a default-off useBaseCompletionPipeline flag, routing the Open Source (llama) engine through a bare-continuation prompt instead of the existing instruction-blob format when the flag is enabled.

  • New renderer (BaseCompletionPromptRenderer) omits all instruction scaffolding, folds persona/style/language into a conditioning preface, and guarantees prefixText is the final bytes of the prompt with trailing whitespace trimmed.
  • Settings plumbing threads the new flag through SuggestionSettingsModelSuggestionSettingsSnapshotSuggestionRequestFactory, and wires it into the CombineLatest4 reactive chain so snapshot emissions reflect live flag state.
  • Existing behaviour is byte-for-byte unchanged when the flag is off (the default); the new code path is entirely inert until the user explicitly sets the cotabbyBaseCompletionPipelineEnabled default.

Confidence Score: 4/5

Safe to merge — the new renderer is completely inert by default and the existing Llama path is untouched.

The change is well-scoped behind a default-off flag and the existing path is provably unchanged. The only rough edges are in the new renderer itself: the language-only framing sentence is weak conditioning, a test case for that path is missing, and the persistence helper breaks the established pattern in the settings model. None of these affect users until the flag is deliberately enabled.

BaseCompletionPromptRenderer.swift — the authorFraming language-only case; SuggestionSettingsModel.swift — inline persistence vs dedicated helper pattern.

Important Files Changed

Filename Overview
Cotabby/Support/BaseCompletionPromptRenderer.swift New renderer for the base-model pipeline; conditioning-preface logic is sound, but the language-only framing produces a weak sentence and the helper is exposed as internal when it could be private.
Cotabby/Support/SuggestionRequestFactory.swift Flag-guarded branch routes to the new renderer only when both the flag and the Open Source engine are active; existing Llama path is byte-for-byte unchanged.
Cotabby/Models/SuggestionSettingsModel.swift Adds the new published property and threads it through the CombineLatest4 chain correctly; persistence is written inline rather than through a dedicated helper, inconsistent with the rest of the file.
Cotabby/Models/SuggestionEngineModels.swift Adds useBaseCompletionPipeline field to the snapshot struct; straightforward and correct.
CotabbyTests/BaseCompletionPromptRendererTests.swift Seven pure-function tests cover the key invariants; missing a test for the language-only framing path (no name, no rules, just languageInstruction).
CotabbyTests/CotabbyTestFixtures.swift Adds useBaseCompletionPipeline default parameter to the fixture factory; correctly defaults to false to preserve existing test behavior.
Cotabby.xcodeproj/project.pbxproj Additive 8-line diff registers the two new source files in the correct target groups; no unrelated churn.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[SuggestionRequestFactory.buildRequest] --> B{useBaseCompletionPipeline\nAND selectedEngine == .llamaOpenSource?}
    B -- Yes --> C[BaseCompletionPromptRenderer.prompt]
    B -- No --> D[LlamaPromptRenderer.prompt]
    C --> E1[authorFraming\npersona / style / language]
    C --> E2[extendedContext\nnotes preface]
    C --> E3[visualContextSummary\nnearby on screen]
    C --> E4[clipboardContext\non the clipboard]
    E1 --> F[preface joined with newlines]
    E2 --> F
    E3 --> F
    E4 --> F
    F --> G[preface + blank line + trimmedPrefix]
    G --> H[SuggestionRequest]
    D --> H
Loading

Fix All in Codex Fix All in Claude Code

Reviews (1): Last reviewed commit: "Add base-model completion pipeline behin..." | Re-trigger Greptile

Greptile also left 3 inline comments on this PR.

First step of moving the Open Source (llama) path off instruction-tuned models onto base-model continuation.

Adds BaseCompletionPromptRenderer: no task preamble, no standalone labels, persona/style/context folded into a conditioning preface, and the caret prefix emitted last with trailing whitespace trimmed. SuggestionRequestFactory routes to it when the new useBaseCompletionPipeline flag is on and the Open Source engine is selected; otherwise the existing LlamaPromptRenderer is used unchanged.

The flag defaults off (read from the cotabbyBaseCompletionPipelineEnabled default, no UI), so default behavior is byte-for-byte unchanged. The base path reuses the already-merged generation-time constraints (single-line masking, mid-word continuation). Regenerated Cotabby.xcodeproj for the two new files.
@FuJacob FuJacob merged commit 10a7bd6 into main Jun 1, 2026
4 checks passed
Comment on lines +89 to +101
var sentence = "The following is text"
if !name.isEmpty {
sentence += " written by \(name)"
}
if !rules.isEmpty {
sentence += " in a \(rules.joined(separator: ", ")) style"
}
sentence += "."
if !language.isEmpty {
// `languageInstruction` is already a soft directive sentence; append it verbatim.
sentence += " \(language)"
}
return sentence
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Language-only framing produces a vacuous sentence. When userName and customRules are both empty but languageInstruction is supplied, the sentence becomes "The following is text. Write in English." — the first clause carries no conditioning signal and the imperative "Write in English." looks out of place in what is meant to read as a document description. Either skip the wrapper clause entirely for the language-only case, or return language on its own.

Suggested change
var sentence = "The following is text"
if !name.isEmpty {
sentence += " written by \(name)"
}
if !rules.isEmpty {
sentence += " in a \(rules.joined(separator: ", ")) style"
}
sentence += "."
if !language.isEmpty {
// `languageInstruction` is already a soft directive sentence; append it verbatim.
sentence += " \(language)"
}
return sentence
// If we only have a language directive and no other framing, return it directly — wrapping
// it in "The following is text." adds nothing and makes the preface read oddly.
if name.isEmpty && rules.isEmpty {
return language.isEmpty ? nil : language
}
var sentence = "The following is text"
if !name.isEmpty {
sentence += " written by \(name)"
}
if !rules.isEmpty {
sentence += " in a \(rules.joined(separator: ", ")) style"
}
sentence += "."
if !language.isEmpty {
// `languageInstruction` is already a soft directive sentence; append it verbatim.
sentence += " \(language)"
}
return sentence

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Codex Fix in Claude Code

Comment on lines +59 to +68
func test_trailingWhitespaceTrimmedButMidWordPreserved() {
XCTAssertEqual(
BaseCompletionPromptRenderer.prompt(prefixText: "doing my aft", applicationName: "X", userName: nil),
"doing my aft"
)
XCTAssertEqual(
BaseCompletionPromptRenderer.prompt(prefixText: "see you \n", applicationName: "X", userName: nil),
"see you"
)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing test for the language-only framing path. The seven existing tests cover userName, customRules, and combinations, but there is no case where languageInstruction is non-nil/non-empty while both userName is nil and customRules is empty. That path hits authorFraming with a non-empty language but empty name and rules, producing "The following is text. Write in English." — a distinct output shape that would catch the fix suggested in the renderer.

Fix in Codex Fix in Claude Code

persistSelectedWordCountPreset(resolvedWordCountPreset)
persistClipboardContextEnabled(resolvedClipboardContextEnabled)
persistFastModeEnabled(resolvedFastModeEnabled)
userDefaults.set(resolvedBaseCompletionPipelineEnabled, forKey: Self.baseCompletionPipelineEnabledDefaultsKey)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Persistence for the new flag is written inline in persistSettings() while every other setting in this function delegates to a dedicated persist…() helper (e.g. persistFastModeEnabled). Keeping the same pattern makes the function easier to scan and the flag easier to move to a dedicated resetToDefaults-style call later.

Suggested change
userDefaults.set(resolvedBaseCompletionPipelineEnabled, forKey: Self.baseCompletionPipelineEnabledDefaultsKey)
persistBaseCompletionPipelineEnabled(resolvedBaseCompletionPipelineEnabled)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Codex Fix in Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant