Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 70 additions & 15 deletions Cotabby/App/Coordinators/SuggestionCoordinator+Input.swift
Original file line number Diff line number Diff line change
Expand Up @@ -173,22 +173,77 @@ extension SuggestionCoordinator {
return false
}

/// Schedules a fresh prediction after a short delay so Chromium-based editors (Chrome, Edge,
/// Slack, Discord, Notion, every Electron app) have time to publish the new contenteditable
/// text to AX. Without the delay, `schedulePrediction()` fires synchronously from inside the
/// CGEvent tap — which runs *before* the host app processes the keystroke — so the AX read
/// inside generation still sees pre-keystroke text. The result feels like Cotabby swallowed
/// the key: the active suggestion vanishes, a new one appears that does not reflect the
/// character the user just typed, and the user concludes their keystroke was eaten.
/// 150ms is chosen empirically: long enough for Chromium's contenteditable AX to settle on a
/// busy page, short enough that the rescheduled suggestion still feels responsive after the
/// user has just diverged from the previous one. `schedulePrediction()` internally
/// `replaceDebouncedWork`s, so back-to-back keystrokes still collapse cleanly.
/// Maximum wall time we'll wait for the host app to publish post-keystroke AX before giving
/// up and generating against whatever's there. Chosen empirically: long enough to cover
/// Chrome's slower contenteditable publish on a busy page, short enough that the user can
/// always type ahead without the rescheduled suggestion feeling stuck.
private static let hostPublishWaitCeilingMs = 400

/// Interval between AX polls while waiting for the host publish. Same order of magnitude as
/// the focus poll itself (default 80ms) but tighter so we catch the publish promptly without
/// burning CPU on AX queries that are themselves 5–15ms each.
private static let hostPublishPollIntervalMs = 30

/// Schedules a fresh prediction once the host app has actually published the new
/// contenteditable text to AX. The previous fix waited a fixed 150ms — see PR #376 — but the
/// logs in #381 showed Chromium's publish lag can exceed that ceiling on a busy page, so the
/// rescheduled generation still read pre-keystroke text and produced a suggestion that looked
/// like Cotabby swallowed the character.
///
/// We now snapshot the AX state at keystroke time (focused element identity, preceding text,
/// selection) and poll `focusModel` until the snapshot actually moves on. The poll is capped
/// at `hostPublishWaitCeilingMs` so a silent host can't hang the pipeline — once the cap is
/// reached we generate against whatever's there, matching the old fixed-delay behavior.
/// `schedulePrediction()` internally `replaceDebouncedWork`s, so back-to-back keystrokes
/// still collapse cleanly.
private func schedulePredictionAfterHostPublishDelay() {
DispatchQueue.main.asyncAfter(deadline: .now() + .milliseconds(150)) { [weak self] in
guard let self else { return }
self.focusModel.refreshNow()
self.schedulePrediction()
let baseline = focusModel.snapshot.context
pollForHostPublish(
baselineText: baseline?.precedingText,
baselineElementID: baseline?.elementIdentifier,
baselineSelectionLocation: baseline?.selection.location,
elapsedMs: 0
)
}
Comment on lines 199 to +207
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Concurrent poll loops accumulate for rapid typing

Each call to schedulePredictionAfterHostPublishDelay (invoked on every shouldSchedulePrediction keystroke across all three call sites) spawns an independent 400ms polling chain. Rapid typing — say 8 keys in 200ms — leaves up to 8 simultaneous loops each calling focusModel.refreshNow() every 30ms. That is ~8× the budgeted AX read rate during a burst, and refreshNow() resets the tracker's idle backoff on every call, so the normal back-off never gets a chance to engage. The PR's "up to ~13 polls" budget analysis only covers the single-keystroke case.

Consider tracking whether a poll loop is already in flight (e.g., a Bool flag set in schedulePredictionAfterHostPublishDelay and cleared when schedulePrediction() fires) and returning early when one is already running. The replaceDebouncedWork inside schedulePrediction() already collapses generations, so only the most recent baseline matters anyway.

Fix in Codex Fix in Claude Code


/// Drives the snapshot-changed gate. Reads the live focus snapshot, fires `schedulePrediction`
/// as soon as any of the captured baseline fields move on, and otherwise tail-calls itself
/// after `hostPublishPollIntervalMs` until the ceiling is hit.
private func pollForHostPublish(
baselineText: String?,
baselineElementID: String?,
baselineSelectionLocation: Int?,
elapsedMs: Int
) {
focusModel.refreshNow()
Comment on lines +212 to +218
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 First poll fires synchronously, blocking the main actor

pollForHostPublish is called synchronously inside handleInputEvent (both are @MainActor), so focusModel.refreshNow() — a full AX tree walk of 5–15ms — runs before the main run loop returns from keyboard event handling. Since the CGEvent tap fires before Chrome processes the keystroke, this first refresh will almost always see unchanged text, making the read wasted. The old asyncAfter(150ms) design never touched AX during event delivery. With rapid typing this also means one synchronous AX read is added to every single keystroke's event-handler cost.

A simple fix is to skip the synchronous call entirely and start with a 30ms asyncAfter (same as every subsequent poll), since there is no information to be gained at t=0 before Chrome has processed the key.

Fix in Codex Fix in Claude Code

let currentContext = focusModel.snapshot.context

// No focus context at all means the user moved away from any editable field — let
// `schedulePrediction` and its downstream guards handle the disabled / unsupported state.
let textChanged = currentContext?.precedingText != baselineText
let elementChanged = currentContext?.elementIdentifier != baselineElementID
let selectionChanged = currentContext?.selection.location != baselineSelectionLocation
if textChanged || elementChanged || selectionChanged {
schedulePrediction()
return
}

// Ceiling: stop polling and generate anyway. Without this fallback a host that genuinely
// produces no AX change (rare but possible — e.g. dead-key composition) would never get
// its next prediction.
let nextElapsed = elapsedMs + Self.hostPublishPollIntervalMs
guard nextElapsed < Self.hostPublishWaitCeilingMs else {
schedulePrediction()
return
}
Comment on lines +234 to +238
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 elapsedMs tracks logical intervals, not wall time

elapsedMs increments by 30ms per iteration regardless of how long each AX read takes. Since refreshNow() resolves in 5–15ms, the real wall-clock duration of the ceiling path is up to ≈390ms + (13 polls × 15ms) ≈ 585ms — substantially more than the 400ms stated in comments and the PR description. The discrepancy matters when callers reason about "how long can a stale prediction be generated" in the dead-key / non-publishing host path.

Fix in Codex Fix in Claude Code


DispatchQueue.main.asyncAfter(deadline: .now() + .milliseconds(Self.hostPublishPollIntervalMs)) { [weak self] in
self?.pollForHostPublish(
baselineText: baselineText,
baselineElementID: baselineElementID,
baselineSelectionLocation: baselineSelectionLocation,
elapsedMs: nextElapsed
)
}
}

Expand Down