feat: hold-to-talk hotkey mode (1.0.01 / A1003) — closes #1#2
OpenLessHotkey now emits explicit .pressed / .released edge events on the modifier-key trigger; the coordinator interprets them per the new HotkeyMode preference (toggle stays the default, hold = press-and-hold). Hold mode rules: .pressed on idle starts the session; .released stops it from .listening; releasing during .starting cancels (no useful audio was captured yet). Esc-cancel sets a one-shot suppress flag so the trailing .released after a cancel does not re-trigger end. Settings hub gets a "录音方式" ("Recording mode") segmented picker and a one-line hint explaining the tradeoff. Bump version to 1.1.0; update README roadmap and USAGE walkthrough to document the new mode.
Reviewer's Guide
Implements a configurable hotkey mode system by introducing a new HotkeyMode (toggle/hold), changing hotkey events to explicit pressed/released edges, updating DictationCoordinator and settings to interpret those edges per user preference, and documenting/bumping build metadata for the new hold-to-talk feature.
Sequence diagram for hotkey pressed/released handling in toggle vs hold modes
sequenceDiagram
actor User
participant HotkeyMonitor
participant DictationCoordinator
participant VolcengineStreamingASR as ASR
rect rgb(240,240,240)
Note over User,DictationCoordinator: Toggle mode (default)
User->>HotkeyMonitor: press trigger
HotkeyMonitor->>DictationCoordinator: HotkeyEvent.pressed
DictationCoordinator->>DictationCoordinator: handlePressed()
DictationCoordinator->>DictationCoordinator: handleToggle() begin
DictationCoordinator->>ASR: beginSession()
User-->>HotkeyMonitor: release trigger
HotkeyMonitor-->>DictationCoordinator: HotkeyEvent.released
DictationCoordinator-->>DictationCoordinator: handleReleased() (ignored in toggle)
User->>HotkeyMonitor: press trigger again
HotkeyMonitor->>DictationCoordinator: HotkeyEvent.pressed
DictationCoordinator->>DictationCoordinator: handlePressed()
DictationCoordinator->>DictationCoordinator: handleToggle() end
DictationCoordinator->>ASR: endSession()
end
rect rgb(235,245,255)
Note over User,DictationCoordinator: Hold mode
User->>HotkeyMonitor: press trigger
HotkeyMonitor->>DictationCoordinator: HotkeyEvent.pressed
DictationCoordinator->>DictationCoordinator: handlePressed()
DictationCoordinator->>DictationCoordinator: handleHoldStart()
DictationCoordinator->>ASR: beginSession()
User-->>HotkeyMonitor: release trigger
HotkeyMonitor-->>DictationCoordinator: HotkeyEvent.released
DictationCoordinator-->>DictationCoordinator: handleReleased()
alt sessionPhase == listening
DictationCoordinator->>ASR: endSession()
else sessionPhase == starting
DictationCoordinator->>DictationCoordinator: handleCancel()
end
end
rect rgb(255,240,240)
Note over User,DictationCoordinator: Hold mode Esc cancel while holding
User->>DictationCoordinator: Esc key
DictationCoordinator->>DictationCoordinator: handleCancel()
DictationCoordinator->>DictationCoordinator: suppressNextRelease = true
User-->>HotkeyMonitor: release trigger
HotkeyMonitor-->>DictationCoordinator: HotkeyEvent.released
DictationCoordinator-->>DictationCoordinator: handleReleased()
DictationCoordinator-->>DictationCoordinator: suppressNextRelease consumed, no endSession
end
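The edge handling in the diagram can be modelled as a small state machine. This is a minimal sketch, not the code in `DictationCoordinator`: `CoordinatorSketch`, `handle(_:)` and `asrConnected()` are hypothetical names, sessions are reduced to a bare phase value, and the toggle branch assumes a press during `.starting` is ignored, which the PR does not state.

```swift
enum HotkeyMode { case toggle, hold }
enum HotkeyEvent { case pressed, released }
enum SessionPhase { case idle, starting, listening, processing }

// Hypothetical stand-in for the coordinator: only the phase is tracked.
struct CoordinatorSketch {
    var mode: HotkeyMode
    var phase: SessionPhase = .idle

    // Called when the ASR connection is up (assumed hook, not from the PR).
    mutating func asrConnected() {
        if phase == .starting { phase = .listening }
    }

    mutating func handle(_ event: HotkeyEvent) {
        switch (mode, event) {
        case (.toggle, .pressed):
            // Toggle: one press begins, the next press ends.
            if phase == .idle {
                phase = .starting
            } else if phase == .listening {
                phase = .processing
            }
        case (.toggle, .released):
            // Key-up carries no meaning in toggle mode.
            break
        case (.hold, .pressed):
            if phase == .idle { phase = .starting }
        case (.hold, .released):
            switch phase {
            case .listening:
                phase = .processing   // normal end of a held session
            case .starting:
                phase = .idle         // released before ASR connected: cancel
            case .idle, .processing:
                break
            }
        }
    }
}
```

Keeping the mode check inside one `switch` over `(mode, event)` makes the compiler enforce that every combination is handled.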
Updated class diagram for hotkey mode and event handling
classDiagram
class HotkeyMode {
<<enum>>
toggle
hold
String rawValue
String displayName()
String hint()
}
class HotkeyEvent {
<<enum>>
pressed
released
cancelled
}
class UserPreferences {
-UserDefaults defaults
+static UserPreferences shared
+HotkeyMode hotkeyMode
+HotkeyBinding.Trigger hotkeyTrigger
+PolishMode polishMode
+Bool hasCompletedOnboarding
}
class HotkeyMonitor {
-CFMachPort eventTap
-CFRunLoopSource runLoopSource
-Bool triggerHeld
-AsyncStream~HotkeyEvent~.Continuation continuation
+AsyncStream~HotkeyEvent~ events
+start()
+stop()
-handleEvent(eventType, flags)
}
class DictationCoordinator {
-VolcengineStreamingASR asr
-BufferingAudioConsumer audioConsumer
-Date sessionStartedAt
-SessionPhase sessionPhase
-HotkeyServiceProtocol hotkey
-Bool suppressNextRelease
+init(hotkeyService)
+start()
-handlePressed()
-handleReleased()
-handleHoldStart()
-handleToggle()
-handleCancel()
-beginSession()
-endSession()
}
class SettingsHubTab {
<<SwiftUIView>>
-HotkeyBinding.Trigger trigger
-HotkeyMode hotkeyMode
-PolishMode mode
+body()
}
class VolcengineStreamingASR {
+startStreaming()
+stopStreaming()
}
UserPreferences --> HotkeyMode : stores
UserPreferences --> HotkeyBinding.Trigger : stores
DictationCoordinator --> HotkeyMonitor : subscribes_events
DictationCoordinator --> UserPreferences : reads_hotkeyMode
DictationCoordinator --> VolcengineStreamingASR : controls
DictationCoordinator --> HotkeyEvent : consumes
HotkeyMonitor --> HotkeyEvent : produces
SettingsHubTab --> UserPreferences : reads_writes_hotkeyMode
SettingsHubTab --> HotkeyMode : displays_picker_options
SettingsHubTab --> NotificationCenter : posts_openLessHotkeyChanged
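The `UserPreferences --> HotkeyMode : stores` edge could be backed by a raw-value enum persisted in `UserDefaults`. The following is a sketch under assumptions: the class name `PreferencesSketch` and the `"hotkeyMode"` key are hypothetical, and the real `UserPreferences` may store the value differently. The fallback to `.toggle` mirrors the stated default.

```swift
import Foundation

// Raw values give a stable string to persist.
enum HotkeyMode: String, CaseIterable {
    case toggle
    case hold
}

// Hypothetical stand-in for UserPreferences; the key name is an assumption.
final class PreferencesSketch {
    private let defaults: UserDefaults
    private let key = "hotkeyMode"

    init(defaults: UserDefaults = .standard) {
        self.defaults = defaults
    }

    var hotkeyMode: HotkeyMode {
        get {
            // Missing or unrecognized values fall back to the default mode.
            defaults.string(forKey: key).flatMap(HotkeyMode.init(rawValue:)) ?? .toggle
        }
        set {
            defaults.set(newValue.rawValue, forKey: key)
        }
    }
}
```

Injecting the `UserDefaults` instance keeps the getter testable against a throwaway suite instead of the app's real preferences.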
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="Sources/OpenLessApp/DictationCoordinator.swift" line_range="175-178" />
<code_context>
+ case .listening:
+ sessionPhase = .processing
+ Task { await endSession() }
+ case .starting:
+ // The user released before the ASR connection was established: treat it as a cancel and send none of the captured audio.
+ Log.write("[session] hold: starting 阶段松手,取消")
+ handleCancel()
+ case .idle, .processing:
+ return
</code_context>
<issue_to_address>
**issue (bug_risk):** Cancelling from `.starting` in hold mode sets `suppressNextRelease`, which will suppress the *next* legitimate release and prevent the next session from ending.
In this path, `handleReleased` calls `handleCancel()` while in `.starting` hold mode. By then the key is already up and the corresponding `.released` has been handled, but `handleCancel()` still sets `suppressNextRelease = true`. That flag then incorrectly applies to the *next* press/release cycle, causing the next `.released` to be ignored and the subsequent hold session to never exit `.listening`/`.processing`. To avoid this, the `.starting` early-release path should cancel without setting `suppressNextRelease`, or `handleCancel` should only set this flag when invoked while the key is actually still down (e.g., Esc-based cancel).
</issue_to_address>
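One way to apply the second suggestion in the comment above is to pass the cancel reason into the cancel path, so the flag is armed only while the key is still physically down. This is a sketch, not the fix that landed: `HoldSession`, `CancelReason` and `asrConnected()` are hypothetical names, and only hold mode is modelled.

```swift
enum SessionPhase { case idle, starting, listening, processing }

// Hypothetical: why the session is being cancelled.
enum CancelReason {
    case escapeWhileHeld   // Esc pressed, trigger key still down
    case earlyRelease      // key-up arrived during .starting
}

struct HoldSession {
    var phase: SessionPhase = .idle
    var suppressNextRelease = false

    mutating func pressed() {
        // A fresh press starts a new cycle; a stale flag must not leak into it.
        suppressNextRelease = false
        guard phase == .idle else { return }
        phase = .starting
    }

    // Assumed hook for "ASR connection established".
    mutating func asrConnected() {
        if phase == .starting { phase = .listening }
    }

    mutating func cancel(_ reason: CancelReason) {
        guard phase == .starting || phase == .listening else { return }
        phase = .idle
        // Arm the one-shot flag only when a trailing key-up is still to come.
        suppressNextRelease = (reason == .escapeWhileHeld)
    }

    mutating func released() {
        if suppressNextRelease {
            suppressNextRelease = false   // consume the trailing key-up
            return
        }
        switch phase {
        case .listening:
            phase = .processing
        case .starting:
            cancel(.earlyRelease)         // key is already up: no flag
        case .idle, .processing:
            break
        }
    }
}
```

With this shape, an early release leaves `suppressNextRelease` false, so the next hold cycle ends normally on key-up.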
…e on secret fields
Settings window had isMovableByWindowBackground = true, so any drag inside the content view dragged the whole window, making it impossible to select words inside a TextField. Disable it; the native title-bar strip (with the traffic lights) still drags the window. PasteableCredentialField now exposes an eye / eye.slash toggle when secure: true, so users can verify pasted API keys before saving.
When the user removes the Ark API key (or the Ark call errors out), DictationCoordinator was still showing the green "已插入" ("Inserted") capsule and recording the session in history under the originally-selected polish mode. From the user's side it looked like polish had succeeded, even though only the raw transcript reached the cursor. Track a PolishOutcome (ok / skippedNoCredentials / failed) through insertText. Capsule now uses a new CapsuleState.warning (orange) state to surface honest messages, "已插入原文 · 未润色" ("Inserted raw text · not polished") / "润色失败 · 已用原文" ("Polish failed · raw text used"), and history saves the session as PolishMode.raw whenever polish didn't actually run. The error capsule + 1.5s artificial sleep that preceded a fake "success" insert is gone; warning state stays for the normal 2.5s.
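The outcome tracking described above might look roughly like this. It is a sketch under assumptions: the `capsule(for:)` and `historyMode(selected:outcome:)` helpers are hypothetical, `CapsuleState` is reduced to two cases, and the pairing of each warning string with an outcome is inferred from the wording rather than stated in the commit.

```swift
enum PolishMode: Equatable { case raw, structured }
enum PolishOutcome { case ok, skippedNoCredentials, failed }

// Reduced to the two states this commit talks about.
enum CapsuleState: Equatable {
    case success(String)   // green
    case warning(String)   // orange
}

// Hypothetical helper: pick the capsule from what actually happened.
func capsule(for outcome: PolishOutcome) -> CapsuleState {
    switch outcome {
    case .ok:
        return .success("已插入")
    case .skippedNoCredentials:
        return .warning("已插入原文 · 未润色")
    case .failed:
        return .warning("润色失败 · 已用原文")
    }
}

// Hypothetical helper: history records .raw whenever polish did not run.
func historyMode(selected: PolishMode, outcome: PolishOutcome) -> PolishMode {
    return outcome == .ok ? selected : .raw
}
```

Deriving both the capsule and the history entry from the same `PolishOutcome` value keeps the UI and the stored record from disagreeing.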
Some weaker chat models would emit a single flat list when the user asked for Structured polish, even when the dictation clearly covered multiple topics. The mode is meant to produce something pasteable as an AI prompt or a working doc, not a wall of bullets. Rewrite the .structured system prompt to be prescriptive: when the input contains ≥2 themes, emit three explicit levels (top "1./2./3." sections, "1)/2)/3)" sub-points indented 3 spaces, and "a./b./c." detail items indented another 3), and fall back to a plain paragraph when the input is short / single-topic. Add a concrete format example so weaker models have a pattern to copy.
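To make the target layout concrete, here is a small renderer that produces the three levels with the stated indentation. It only illustrates the output shape the prompt asks the model for; it is not part of the commit, and `Section`, `SubPoint` and `renderOutline` are hypothetical names.

```swift
struct SubPoint {
    var text: String
    var details: [String] = []
}

struct Section {
    var title: String
    var points: [SubPoint] = []
}

// Renders "1." sections, "1)" sub-points (3 spaces), "a." details (6 spaces).
func renderOutline(_ sections: [Section]) -> String {
    let letters = Array("abcdefghijklmnopqrstuvwxyz")
    var lines: [String] = []
    for (i, section) in sections.enumerated() {
        lines.append("\(i + 1). \(section.title)")
        for (j, point) in section.points.enumerated() {
            lines.append("   \(j + 1)) \(point.text)")
            for (k, detail) in point.details.enumerated() {
                // Wrap around after "z" rather than crash on long lists.
                lines.append("      \(letters[k % letters.count]). \(detail)")
            }
        }
    }
    return lines.joined(separator: "\n")
}
```

For one section "Goals" with a sub-point "Ship hold mode" and a detail "Default stays toggle", the rendered lines are `1. Goals`, then `1) Ship hold mode` indented by three spaces, then `a. Default stays toggle` indented by six.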
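#79) closes Codex audit blockers on PR #78

## HIGH #1: cancel overwritten during the awaits inside begin_session

Background: volcengine open_session().await and Recorder::start are both asynchronous. If the user presses Esc in the meantime, cancel_session sets phase back to Idle and cancelled=true. The original code then unconditionally ran `state.phase = SessionPhase::Listening`, flipping Idle back to Listening → the user's cancel was swallowed and the session kept running.

Fix:
- Add a `cancel_raced_during_starting(inner)` helper that, while holding the lock, checks cancelled or phase != Starting.
- Call it after the ASR open_session().await: if the race happened, asr.cancel() and return to Idle.
- In the Recorder::start Ok branch, decide BeginOutcome { Started / PendingStop / CancelRaced } atomically under the same lock; on CancelRaced, clean up the recorder and asr resources and do not enter Listening.

## HIGH #2: the window between the cancelled check and inserter.insert

Background: end_session released the lock between checking cancelled and calling inserter.insert. An Esc at that point triggers cancel_session and sets cancelled=true too late: Cmd+V is about to be sent and cannot be undone. Yet cancel_session still emitted "已取消" ("Cancelled") → the UI contradicted what actually happened (text inserted, but shown as cancelled).

Fix:
- Add SessionPhase::Inserting, representing the window "past the last cancel check, about to call or currently calling inserter.insert".
- end_session: make the post-polish cancel check and the phase transition atomic under the same lock: cancelled → Idle + return; otherwise set phase=Inserting, release the lock, then insert.
- cancel_session: if phase==Inserting, return immediately (do not set cancelled, do not emit the cancel event). Rationale: Cmd+V physically cannot be undone, and pretending "cancelled" would only make the UI lie.

cargo check passes; all 13 warnings are pre-existing.

Co-authored-by: baiqing <lbx12309@icloud.com>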
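The second fix boils down to "check and commit under one lock, then refuse late cancels". The commit itself is Rust; the model below is a Swift sketch of the same idea, with hypothetical names (`SessionState`, `commitToInsert`, `finishInsert`) and an assumed starting phase of `.processing`.

```swift
import Foundation

enum Phase { case idle, starting, listening, processing, inserting }

final class SessionState {
    private let lock = NSLock()
    private var phase: Phase = .processing   // assumed: polish has just finished
    private var cancelled = false

    // Final cancel check and phase transition happen under the same lock.
    // Returns true when the caller may go on to insert.
    func commitToInsert() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if cancelled {
            phase = .idle
            return false
        }
        phase = .inserting
        return true
    }

    // Returns true only when the cancel actually took effect,
    // so the UI can report "cancelled" honestly.
    func cancel() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if phase == .inserting || phase == .idle {
            return false   // too late to undo the paste, or nothing to cancel
        }
        cancelled = true
        phase = .idle
        return true
    }

    func finishInsert() {
        lock.lock()
        defer { lock.unlock() }
        phase = .idle
    }
}
```

Because `cancel()` reports whether it took effect, the caller emits the "cancelled" event only on `true`, which closes the gap where the UI claimed a cancel after the paste had already been committed.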
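…attr

Fixes the 2 HIGH findings from the Codex audit of main HEAD a9c81e6, and closes off the macOS auto-update "Gatekeeper blocks the app after restart" problem for good.

## HIGH #2: version number mismatch

PR #84 did not bump the version, so main still reports 1.2.2. Tagging v1.2.3-tauri directly next time would make the updater manifest report 1.2.2, and installed 1.2.2 clients would never see "new version available".

Fix: set package.json, tauri.conf.json, and Cargo.toml all to 1.2.3.

## HIGH #1: updater pubkey ownership

The pubkey used in PR #84 was generated locally by an external contributor; appergb does not hold the matching private key. Anyone holding that private key could sign updates -> clients would install a malicious version over OTA.

Fix:
- Generate appergb's own keypair with npx @tauri-apps/cli signer generate --ci
- New pubkey: F0FCDE68E08E6D4E (written to tauri.conf.json plugins.updater.pubkey)
- The private key has been stored as a GitHub repo secret via gh secret set TAURI_SIGNING_PRIVATE_KEY
- The local copy of the private key lives only in /tmp, is not committed to git, and will be deleted after the commit

## Strip xattr after macOS auto-update

restart_app runs /usr/bin/xattr -cr on the .app bundle before app.restart(). With the Tauri auto-updater plus a non-notarized app, this is the only approach that keeps auto-update frictionless for users. Otherwise Gatekeeper blocks the relaunch with an "OpenLess is damaged" message and users have to run xattr in a terminal before they can keep using the app, which defeats the purpose of auto-update.

Future release logic must keep this step. The previous PR (#83) already strips once on the CI side in release-tauri.yml; this change strips again on the client side at restart, a double safeguard covering the whole "download -> extract -> install -> restart" chain.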
Summary
- `OpenLessHotkey` now emits explicit `.pressed` / `.released` edge events; `DictationCoordinator` interprets them per `UserPreferences.hotkeyMode`.
- Toggle remains the default: no behavior change for existing users.
- Version / build: `1.0.01` / `A1003`.
Behavior matrix
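| Event | Toggle (default) | Hold |
|---|---|---|
| `.pressed` while idle | starts the session | starts the session |
| `.pressed` while listening | ends the session | … |
| `.released` | ignored | ends the session from `.listening`; cancels during `starting`; ignored otherwise |
| `Esc` while listening | cancels | cancels and sets `suppressNextRelease` so the trailing key-up does not re-fire end |
Files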
- `Sources/OpenLessCore/HotkeyMode.swift` (new): `enum HotkeyMode { toggle, hold }`
- `Sources/OpenLessHotkey/HotkeyEvent.swift`: rename `.toggled` → `.pressed`, add `.released`
- `Sources/OpenLessHotkey/HotkeyMonitor.swift`: yield `.released` on the trigger-up edge
- `Sources/OpenLessPersistence/UserPreferences.swift`: `hotkeyMode` getter/setter
- `Sources/OpenLessApp/DictationCoordinator.swift`: `handlePressed` / `handleReleased` / `handleHoldStart`; `suppressNextRelease` for cancel
- `Sources/OpenLessApp/Settings/SettingsView.swift`: segmented picker + hint line
- `Tests/OpenLessHotkeyTests/HotkeyEventTests.swift`: updated for the event-set rename
- `scripts/build-app.sh`: `APP_VERSION=1.0.01`, `BUILD_NUMBER=A1003`
- `README.md` / `README.zh.md`: drop hold-to-talk from roadmap
- `USAGE.md`: note the new mode
Codex review
Codex caught one stale `.toggled` reference in `Tests/OpenLessHotkeyTests/HotkeyEventTests.swift` (the target is not registered in `Package.swift` so it didn't break `swift build`, but kept consistent for future wiring). Fixed in 9031f30.
Test plan
- `swift build` (release, ad-hoc) passes locally; bundle assembled at `build/OpenLess.app`.
- Release during `starting` (slow ASR connect): cleanly cancels.
- `Esc` while still holding the key: cancels; trailing key-up is ignored.
- `swift test` not run: local CommandLineTools env lacks XCTest (documented in CLAUDE.md).
Closes
Closes #1.
Summary by Sourcery
Add a configurable hold-to-talk hotkey mode alongside the existing toggle mode and wire it through the hotkey pipeline, dictation coordinator, settings UI, and user preferences.