
Add required-prefix admissibility to the constrained beam decoder #526

Merged
FuJacob merged 1 commit into main from feat/required-prefix-constraint
Jun 2, 2026

Conversation

@FuJacob
Owner

@FuJacob FuJacob commented Jun 2, 2026

Summary

Adds the lowest-parity decode-core capability: a byte-exact required-prefix admissibility rule that steers the constrained decoder onto a known continuation without ever letting it emit bytes that diverge. Given a prefix the completion must still produce, a token is admissible only if it completes the prefix (token starts with prefix) or is a step toward it (prefix starts with token); the consumed bytes advance and the remainder carries forward. This is the foundation for constraining a completion to a required string — for example, finishing a specific partially-formed word.

  • RequiredPrefixConstraint.step (new, pure): the two-way prefix rule, in raw bytes so it stays correct when a token splits a multi-byte UTF-8 scalar. Deliberately trie-free — a vocabulary trie would only speed the per-step lookup, never change the result, so it can be layered in later.
  • ConstrainedBeamSearch now tracks a per-branch remainingPrefix: a branch may complete (EOG, single-line newline, sentence boundary, budget) only once it is empty, and while it is non-empty the search scans the full admissible vocabulary rather than the logit-capped top-K, because the required token can rank far below the model's local preference.
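The two-way rule described in the first bullet can be sketched roughly as follows. This is a minimal illustration rather than the code merged in this PR: the three case names and the `step(remainingPrefix:tokenBytes:)` signature follow the PR text, while `PrefixStep` and `PrefixRuleSketch` are placeholder names chosen for the example.

```swift
// Hypothetical sketch of the two-way prefix rule; not the merged implementation.
enum PrefixStep: Equatable {
    case satisfied                     // token covers every remaining prefix byte
    case advanced(remaining: [UInt8])  // token consumed a leading slice of the prefix
    case rejected                      // token diverges from the prefix
}

enum PrefixRuleSketch {
    static func step(remainingPrefix: [UInt8], tokenBytes: [UInt8]) -> PrefixStep {
        // Nothing left to enforce: every token is acceptable.
        guard !remainingPrefix.isEmpty else { return .satisfied }
        // Token starts with the prefix (exact match or overshoot).
        if tokenBytes.starts(with: remainingPrefix) { return .satisfied }
        // Prefix starts with the token: consume those bytes and carry the rest.
        if remainingPrefix.starts(with: tokenBytes) {
            return .advanced(remaining: Array(remainingPrefix.dropFirst(tokenBytes.count)))
        }
        return .rejected
    }
}

// "é" is the two-byte UTF-8 sequence C3 A9; a tokenizer may split it across tokens.
let requiredScalar = Array("é".utf8)
let firstHalf = PrefixRuleSketch.step(remainingPrefix: requiredScalar, tokenBytes: [0xC3])
let secondHalf = PrefixRuleSketch.step(remainingPrefix: [0xA9], tokenBytes: [0xA9, 0x20])
let diverging = PrefixRuleSketch.step(remainingPrefix: Array("cat".utf8), tokenBytes: Array("dog".utf8))
```

Comparing raw `[UInt8]` values rather than `String` or `Character` values is what lets the first half of a split scalar count as progress, since a lone `0xC3` byte is not valid UTF-8 on its own.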

Validation

xcodebuild ... test ... CODE_SIGNING_ALLOWED=NO CODE_SIGNING_REQUIRED=NO \
  -only-testing:CotabbyTests/RequiredPrefixConstraintTests \
  -only-testing:CotabbyTests/ConstrainedBeamSearchTests
# ** TEST SUCCEEDED **
#   10 pure rule tests (complete / overshoot / advance / single-byte / divergence both ways /
#     multi-byte scalar split across tokens / empty prefix / admits predicate)
#   3 new beam tests: steers onto a low-logit required token against the model's preference;
#     refuses to stop (EOG) before the prefix is satisfied; spans tokens of differing lengths
#   9 existing beam tests unchanged and green (empty prefix = identical behavior)

swiftlint lint --strict --quiet <changed files>   # exit 0
xcodegen generate                                  # registered the two new files (drift guard passes)

Linked issues

None. Decode-core parity: the required-prefix / ACPF constraint foundation.

Risk / rollout notes

  • No behavior change to the shipping decoder. requiredPrefix defaults to empty everywhere, which makes every token immediately satisfied and every code path identical to before. The existing beam tests pass untouched.
  • No runtime or engine change in this PR — this lands the pure, tested capability. The per-request trigger that supplies a required prefix needs on-device design (Cotabby's append-only ghost text differs from a word-replacement healing model), so it is intentionally a separate step rather than guessed here.
  • project.pbxproj regenerated by XcodeGen for the two new files.

Greptile Summary

This PR adds required-prefix admissibility to the constrained beam decoder, letting the search steer generation onto a specific byte-exact continuation without ever emitting diverging bytes. The RequiredPrefixConstraint.step pure function implements the two-way prefix rule; ConstrainedBeamSearch carries a per-branch remainingPrefix and expands to the full vocabulary (instead of the logit-capped top-K) while any prefix bytes remain unmet.

  • RequiredPrefixConstraint (new): pure, trie-free, byte-level admissibility rule; three cases (satisfied, advanced(remaining:), and rejected) handled correctly for all documented scenarios including multi-byte UTF-8 splits.
  • ConstrainedBeamSearch: remainingPrefix added to BeamCandidate with a default of [] so every existing call site is unaffected; all five completion gates (EOG, newline, sentence boundary, stalled branch, budget exhaustion) guard on an empty remaining prefix.
  • Tests: 10 unit tests for the constraint rule and 3 new beam-level integration tests covering low-logit steering, early-stop prevention, and multi-token spanning.
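To make the top-K versus full-vocabulary distinction concrete, here is a small sketch of candidate selection. It is an assumption-laden illustration: `candidateTokenIDs` and its parameters are hypothetical names, and the real `rankedAdmissibleTokens` may rank, cap, and filter differently.

```swift
// Hypothetical helper: rank token IDs by logit, and cap at topK only when
// no required-prefix bytes remain. All names here are illustrative.
func candidateTokenIDs(
    logits: [Float],
    vocabulary: [[UInt8]],
    remainingPrefix: [UInt8],
    topK: Int
) -> [Int] {
    let ranked = logits.indices.sorted { logits[$0] > logits[$1] }
    // Empty prefix: the usual logit-capped top-K.
    guard !remainingPrefix.isEmpty else { return Array(ranked.prefix(topK)) }
    // Unmet prefix: scan the whole vocabulary and keep only tokens that
    // complete the prefix or step toward it.
    return ranked.filter { id in
        let token = vocabulary[id]
        guard !token.isEmpty else { return false }
        return token.starts(with: remainingPrefix) || remainingPrefix.starts(with: token)
    }
}

let sampleVocabulary: [[UInt8]] = ["the", " cat", "ing", "in", "x"].map { Array($0.utf8) }
let sampleLogits: [Float] = [9.0, 8.0, 0.5, 0.2, 7.0]

let unconstrained = candidateTokenIDs(
    logits: sampleLogits, vocabulary: sampleVocabulary, remainingPrefix: [], topK: 2)
let constrained = candidateTokenIDs(
    logits: sampleLogits, vocabulary: sampleVocabulary, remainingPrefix: Array("ing".utf8), topK: 2)
```

With a top-K of 2 the unconstrained call only ever sees "the" and " cat"; the required tokens "ing" and "in" rank fourth and fifth, which is why a capped scan would miss them.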

Confidence Score: 4/5

Safe to merge — the change is additive, the empty-prefix default preserves all existing behaviour, and every completion gate was updated consistently.

The core constraint logic and its beam integration are correct and well-tested. The empty tokenBytes edge case in step is admitted with no forward progress, harmless today but a real rough edge for external callers of admits. The undocumented interaction between isMidWord and a non-empty requiredPrefix can silently return empty results when the two constraints are incompatible.

Files needing attention: RequiredPrefixConstraint.swift (empty-token edge case in step/admits) and ConstrainedBeamSearch.swift (undocumented isMidWord + requiredPrefix interaction).

Important Files Changed

| Filename | Overview |
| --- | --- |
| Cotabby/Support/RequiredPrefixConstraint.swift | New pure constraint module implementing byte-exact prefix admissibility; logic is correct for all documented cases but the empty-tokenBytes edge case can silently admit a token that makes zero progress toward the prefix |
| Cotabby/Support/ConstrainedBeamSearch.swift | Required-prefix integration is correct; adds per-branch remainingPrefix, expands to full vocabulary while prefix is unmet, and guards every completion gate; minor redundant filter in the final return and an undocumented isMidWord + requiredPrefix interaction |
| CotabbyTests/RequiredPrefixConstraintTests.swift | 10 focused unit tests cover the key cases: empty prefix, exact match, overshoot, advance, single-byte step, both divergence directions, multi-byte UTF-8 split, and the admits predicate |
| CotabbyTests/ConstrainedBeamSearchTests.swift | 3 new beam tests cover low-logit steering, EOG suppression, and multi-token spanning; all 9 existing tests unaffected due to the default empty prefix |
| Cotabby.xcodeproj/project.pbxproj | XcodeGen-regenerated project file correctly registers RequiredPrefixConstraint.swift in Sources and RequiredPrefixConstraintTests.swift in the test target |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[ConstrainedBeamSearch.search<br/>requiredPrefix: UInt8] --> B[Engine.run<br/>frontier = BeamCandidate<br/>remainingPrefix = requiredPrefix]
    B --> C{frontier empty<br/>or budget exhausted?}
    C -- No --> D[expand each branch]
    D --> E{remainingPrefix<br/>empty?}
    E -- Yes --> F[effectiveTopK = config.topK]
    E -- No --> G[effectiveTopK = logits.count<br/>full vocab scan]
    F & G --> H[rankedAdmissibleTokens]
    H --> I{Token type?}
    I -- EOG or Newline --> J{remainingPrefix<br/>empty?}
    J -- Yes --> K[completed.append branch]
    J -- No --> L[skip token]
    I -- Normal token --> M[RequiredPrefixConstraint.step]
    M --> N{Step result?}
    N -- rejected --> O[continue / skip]
    N -- satisfied --> P[remainingAfterToken = empty]
    N -- advanced --> Q[remainingAfterToken = tail]
    P & Q --> R{remainingAfterToken<br/>empty AND<br/>sentence complete?}
    R -- Yes --> K
    R -- No --> S[live.append child<br/>with new remainingPrefix]
    S --> C
    C -- Yes --> T[completed += frontier<br/>where remainingPrefix empty]
    T --> U[return sorted by meanLogprob]
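The flow above can be reduced to a greedy (beam width 1) toy to show how the remaining prefix gates both token choice and completion. This is a sketch under stated assumptions, not the merged `ConstrainedBeamSearch`: `ToyToken`, `greedyDecode`, and the per-step logit table are all hypothetical, and sentence-boundary and newline handling are left out.

```swift
// Hypothetical greedy reduction of the flow above; names are illustrative.
struct ToyToken {
    let bytes: [UInt8]
    let isEndOfGeneration: Bool
}

func greedyDecode(
    vocabulary: [ToyToken],
    logitsPerStep: [[Float]],
    requiredPrefix: [UInt8]
) -> [Int]? {
    var remaining = requiredPrefix
    var chosen: [Int] = []
    for logits in logitsPerStep {
        let ranked = logits.indices.sorted { logits[$0] > logits[$1] }
        var picked: Int?
        for id in ranked {
            let token = vocabulary[id]
            if token.isEndOfGeneration {
                // Completion gate: stopping is allowed only once the prefix is met.
                if remaining.isEmpty { return chosen.isEmpty ? nil : chosen }
                continue
            }
            if remaining.isEmpty || token.bytes.starts(with: remaining) {
                remaining = []  // satisfied
            } else if !token.bytes.isEmpty, remaining.starts(with: token.bytes) {
                remaining = Array(remaining.dropFirst(token.bytes.count))  // advanced
            } else {
                continue  // rejected: diverges from the required bytes
            }
            picked = id
            break
        }
        guard let id = picked else { return nil }  // stalled: no admissible token
        chosen.append(id)
    }
    // Budget exhausted: only a result with an empty remainder counts.
    return remaining.isEmpty && !chosen.isEmpty ? chosen : nil
}

let toyVocabulary = [
    ToyToken(bytes: [], isEndOfGeneration: true),
    ToyToken(bytes: Array("he".utf8), isEndOfGeneration: false),
    ToyToken(bytes: Array("llo".utf8), isEndOfGeneration: false),
    ToyToken(bytes: Array("x".utf8), isEndOfGeneration: false),
]
// The model prefers EOG, then "x", at every step; the required tokens rank low.
let toyLogits: [[Float]] = [[5, 1, 0.5, 4], [5, 0.2, 1, 4], [5, 1, 0.5, 4]]

let steered = greedyDecode(
    vocabulary: toyVocabulary, logitsPerStep: toyLogits, requiredPrefix: Array("hello".utf8))
let impossible = greedyDecode(
    vocabulary: toyVocabulary, logitsPerStep: toyLogits, requiredPrefix: Array("zzz".utf8))
```

In the first call the decoder skips EOG and "x" twice, picks "he" and then "llo", and only then accepts EOG; in the second no token is admissible, so the toy returns nothing.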

Comments Outside Diff (1)

  1. Cotabby/Support/ConstrainedBeamSearch.swift, line 162-163 (link)

    P2 Silent empty result when isMidWord and requiredPrefix are both active

    isMidWord filters the first-step candidates to tokens where profile.continuesWordMidStream is true, and this filter runs before the prefix-admissibility check. If the first byte(s) of requiredPrefix can only be satisfied by a token that doesn't pass continuesWordMidStream (e.g., the required continuation starts with a space or punctuation), every candidate is rejected and the search returns nothing — with no indication of why. Neither the parameter documentation nor the call sites warn about this incompatibility. At minimum a precondition or a doc-comment on search should note that callers are responsible for ensuring the two constraints are compatible.
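    One way a caller could detect this incompatibility up front is a pre-flight check over the vocabulary. The sketch below is purely illustrative: `firstStepIsSatisfiable` is a hypothetical helper, and the `continuesWordMidStream` closure stands in for the profile lookup named above, whose real semantics are not shown in this PR.

```swift
// Hypothetical pre-flight check: is there at least one token that passes both
// the mid-word filter and the required-prefix rule on the first step?
func firstStepIsSatisfiable(
    vocabulary: [[UInt8]],
    requiredPrefix: [UInt8],
    isMidWord: Bool,
    continuesWordMidStream: (Int) -> Bool
) -> Bool {
    guard !requiredPrefix.isEmpty else { return true }
    return vocabulary.indices.contains { id in
        let token = vocabulary[id]
        guard !token.isEmpty else { return false }
        let passesMidWord = !isMidWord || continuesWordMidStream(id)
        let passesPrefix = token.starts(with: requiredPrefix) || requiredPrefix.starts(with: token)
        return passesMidWord && passesPrefix
    }
}

let midWordVocabulary: [[UInt8]] = [" world", "ing", "wor"].map { Array($0.utf8) }
// Assumption for this example only: a token that begins with a space
// does not continue a word.
let continuesWord: (Int) -> Bool = { midWordVocabulary[$0].first != UInt8(ascii: " ") }

let spaceLedPrefix = firstStepIsSatisfiable(
    vocabulary: midWordVocabulary, requiredPrefix: Array(" wor".utf8),
    isMidWord: true, continuesWordMidStream: continuesWord)
let wordLedPrefix = firstStepIsSatisfiable(
    vocabulary: midWordVocabulary, requiredPrefix: Array("wor".utf8),
    isMidWord: true, continuesWordMidStream: continuesWord)
```

    A full-vocabulary scan like this has a cost, so whether it belongs in a precondition, a debug-only assertion, or just a doc-comment is a design decision left to the maintainers.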


Reviews (1): Last reviewed commit: "Add required-prefix admissibility to the..."

Greptile also left 2 inline comments on this PR.

This is the missing piece for steering generation onto a known continuation: a
byte-exact rule that, given a prefix the completion must still produce, admits
only tokens that complete that prefix or step toward it, and rejects anything
that diverges. It is the foundation for constraining a completion to a required
string (for example, finishing a specific partially-formed word) without ever
emitting bytes that wander off it.

RequiredPrefixConstraint.step is the pure rule (token starts-with prefix =
satisfied; prefix starts-with token = advance by the consumed bytes; else
reject), working in raw bytes so it stays correct when a token splits a
multi-byte UTF-8 scalar. ConstrainedBeamSearch tracks a per-branch
remainingPrefix: a branch may finish only once it is empty, and while it is
non-empty the search scans the full admissible vocabulary instead of the
logit-capped top-K, because the required token can rank far below the model's
preference.

Behavior is unchanged for the existing decoder: requiredPrefix defaults to
empty, which makes every token immediately satisfied and every code path
identical to before (the prior beam tests pass untouched). No runtime or engine
change; the per-request trigger that supplies a required prefix is a separate
step that needs on-device design for Cotabby's append-only ghost text.
@FuJacob FuJacob merged commit aafe9f3 into main Jun 2, 2026
4 checks passed
@FuJacob FuJacob deleted the feat/required-prefix-constraint branch June 2, 2026 02:54
Comment on lines +36 to +39
    static func step(remainingPrefix: [UInt8], tokenBytes: [UInt8]) -> Step {
        guard !remainingPrefix.isEmpty else {
            return .satisfied
        }
Contributor

P2 Empty tokenBytes is silently admitted and makes no forward progress. When tokenBytes is [], remainingPrefix.starts(with: []) is always true, so step returns .advanced(remaining: remainingPrefix) — the token is considered admissible but the remaining prefix is unchanged. Within the beam search this is harmless because rankedAdmissibleTokens already filters control/empty tokens, but the docstring explicitly advertises admits for external greedy-mask callers who might not share that invariant. An empty token in their vocabulary would be wrongly admitted. Adding an early guard removes the ambiguity.

Suggested change
      static func step(remainingPrefix: [UInt8], tokenBytes: [UInt8]) -> Step {
          guard !remainingPrefix.isEmpty else {
              return .satisfied
          }
    +     guard !tokenBytes.isEmpty else {
    +         return .rejected
    +     }
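
The effect of the suggested guard can be checked with a small stand-alone sketch. It assumes the rule has the shape described in this PR (token starts with prefix, else prefix starts with token, else reject); `EdgeCaseStep`, `edgeCaseStep`, and the `rejectEmptyTokens` flag are hypothetical names used only to compare the two behaviors side by side.

```swift
// Hypothetical comparison of the rule with and without the empty-token guard.
enum EdgeCaseStep: Equatable {
    case satisfied
    case advanced(remaining: [UInt8])
    case rejected
}

func edgeCaseStep(
    remainingPrefix: [UInt8],
    tokenBytes: [UInt8],
    rejectEmptyTokens: Bool
) -> EdgeCaseStep {
    guard !remainingPrefix.isEmpty else { return .satisfied }
    // The suggested early guard, toggled by a flag for comparison.
    if rejectEmptyTokens && tokenBytes.isEmpty { return .rejected }
    if tokenBytes.starts(with: remainingPrefix) { return .satisfied }
    // For an empty token this is always true: every sequence starts with [].
    if remainingPrefix.starts(with: tokenBytes) {
        return .advanced(remaining: Array(remainingPrefix.dropFirst(tokenBytes.count)))
    }
    return .rejected
}

let pendingBytes = Array("ing".utf8)
let withoutGuard = edgeCaseStep(remainingPrefix: pendingBytes, tokenBytes: [], rejectEmptyTokens: false)
let withGuard = edgeCaseStep(remainingPrefix: pendingBytes, tokenBytes: [], rejectEmptyTokens: true)
```

Without the guard the empty token is admitted with the remainder unchanged, so a caller looping on the result would make no progress; with the guard it is rejected, while an empty remaining prefix still returns satisfied for any token.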


Comment on lines 130 to 132
      return completed
    -     .filter { !$0.tokenIDs.isEmpty }
    +     .filter { !$0.tokenIDs.isEmpty && $0.remainingPrefix.isEmpty }
          .sorted { $0.meanLogprob > $1.meanLogprob }
Contributor

P2 The .remainingPrefix.isEmpty predicate in the return filter is redundant. Every code path that adds a candidate to completed already enforces an empty remaining prefix — via explicit if branch.remainingPrefix.isEmpty guards on EOG/newline/stalled branches, via the if remainingAfterToken.isEmpty guard on sentence boundaries, and via the .filter { $0.remainingPrefix.isEmpty } applied to the frontier at budget exhaustion. The second filter in the return statement can never catch anything the earlier guards missed, so removing it reduces noise without changing behaviour.

Suggested change
      return completed
    -     .filter { !$0.tokenIDs.isEmpty && $0.remainingPrefix.isEmpty }
    +     .filter { !$0.tokenIDs.isEmpty }
          .sorted { $0.meanLogprob > $1.meanLogprob }
