feat: hold-to-talk hotkey mode (1.0.01 / A1003) - closes #1 #2

Merged: appergb merged 6 commits into main from feature/hold-to-talk on Apr 27, 2026

appergb (Collaborator) commented Apr 27, 2026

Summary

  • Adds a hold-to-talk hotkey mode alongside the existing toggle mode (closes issue #1, 建议增加新功能 ("Suggestion: add a new feature"), requested by @YD-233).
  • OpenLessHotkey now emits explicit .pressed / .released edge events; DictationCoordinator interprets them per UserPreferences.hotkeyMode. Toggle remains the default — no behavior change for existing users.
  • Settings hub → 「输入与输出」 ("Input & Output") gets a segmented "录音方式" ("Recording mode") picker with a one-line tradeoff hint.
  • Build: 1.0.01 / A1003.

Behavior matrix

| State / Event | toggle (default) | hold |
| --- | --- | --- |
| .pressed while idle | begin session | begin session |
| .pressed while listening | end session | (ignored) |
| .released | (ignored) | end if listening; cancel if still in starting; ignored otherwise |
| Esc while listening | cancel | cancel; sets suppressNextRelease so the trailing key-up does not re-fire end |
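The matrix can be read as a small pure function from (event, mode, phase) to an action. The sketch below is only an illustration of that reading, not the code in DictationCoordinator.swift: the `Action` type, the `HotkeyStateMachine` struct, and the assumption that Esc arrives as `.cancelled` are hypothetical, and cells the matrix does not list (for example a press while starting) are treated as ignored.

```swift
// Illustrative only. HotkeyMode, .pressed/.released and suppressNextRelease
// come from the PR description; everything else is a hypothetical stand-in.
enum HotkeyMode { case toggle, hold }
enum HotkeyEvent { case pressed, released, cancelled } // assumption: Esc maps to .cancelled
enum SessionPhase { case idle, starting, listening, processing }
enum Action: Equatable { case begin, end, cancel, ignore }

struct HotkeyStateMachine {
    var mode: HotkeyMode = .toggle
    var suppressNextRelease = false

    /// Maps one event to an action, following the behavior matrix row by row.
    mutating func action(for event: HotkeyEvent, in phase: SessionPhase) -> Action {
        switch (event, mode, phase) {
        case (.pressed, _, .idle):
            return .begin
        case (.pressed, .toggle, .listening):
            return .end
        case (.pressed, _, _):
            return .ignore // hold mode, or a phase the matrix does not list
        case (.released, .toggle, _):
            return .ignore
        case (.released, .hold, _):
            if suppressNextRelease {
                // One-shot: swallow the key-up that trails an Esc cancel.
                suppressNextRelease = false
                return .ignore
            }
            switch phase {
            case .listening: return .end
            case .starting: return .cancel
            case .idle, .processing: return .ignore
            }
        case (.cancelled, _, .listening):
            if mode == .hold { suppressNextRelease = true }
            return .cancel
        case (.cancelled, _, _):
            return .ignore // assumption: the matrix only covers Esc while listening
        }
    }
}
```

Keeping the decision separate from the side effects (starting ASR, inserting text) would make each matrix cell checkable without audio or network dependencies.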

Files

  • Sources/OpenLessCore/HotkeyMode.swift (new): enum HotkeyMode { toggle, hold }
  • Sources/OpenLessHotkey/HotkeyEvent.swift: rename .toggled → .pressed, add .released
  • Sources/OpenLessHotkey/HotkeyMonitor.swift: yield .released on the trigger-up edge
  • Sources/OpenLessPersistence/UserPreferences.swift: hotkeyMode getter/setter
  • Sources/OpenLessApp/DictationCoordinator.swift: handlePressed / handleReleased / handleHoldStart; suppressNextRelease for cancel
  • Sources/OpenLessApp/Settings/SettingsView.swift: segmented picker + hint line
  • Tests/OpenLessHotkeyTests/HotkeyEventTests.swift: updated for the event-set rename
  • scripts/build-app.sh: APP_VERSION=1.0.01, BUILD_NUMBER=A1003
  • README.md / README.zh.md: drop hold-to-talk from roadmap
  • USAGE.md: note the new mode

Codex review

Codex caught one stale .toggled reference in Tests/OpenLessHotkeyTests/HotkeyEventTests.swift (the target is not registered in Package.swift so it didn't break swift build, but kept consistent for future wiring). Fixed in 9031f30.

Test plan

  • swift build (release, ad-hoc) passes locally — bundle assembled at build/OpenLess.app.
  • Toggle mode (default after first install) — start/stop unchanged.
  • Hold mode — press-and-hold → release ends and inserts.
  • Hold mode short tap (<300 ms) — no crash, no stuck capsule.
  • Hold mode release during starting (slow ASR connect) — cleanly cancels.
  • Hold mode Esc while still holding the key — cancels; trailing key-up is ignored.
  • Switching modes from settings while idle — picker persists, no restart needed.
  • Menu-bar recording menu item (录音菜单项) in either mode: uses explicit toggle, behaves the same.

swift test not run — local CommandLineTools env lacks XCTest (documented in CLAUDE.md).

Closes

Closes #1.

Summary by Sourcery

Add a configurable hold-to-talk hotkey mode alongside the existing toggle mode and wire it through the hotkey pipeline, dictation coordinator, settings UI, and user preferences.

New Features:

  • Introduce a HotkeyMode enum supporting toggle and hold hotkey behaviors.
  • Add a user-selectable recording mode (toggle vs hold-to-talk) in the settings UI with descriptive hints.
  • Emit distinct pressed and released hotkey events from the hotkey monitor to support edge-based handling.

Enhancements:

  • Update DictationCoordinator state handling to interpret pressed/released events according to the selected hotkey mode, including Esc-based cancellation behavior.
  • Persist the selected hotkey mode in user preferences so it survives app restarts.

Build:

  • Bump the app version to 1.0.01 with build number A1003 in the build script.

Documentation:

  • Document the new hold-to-talk mode in usage docs and remove it from the roadmap in the READMEs.

Tests:

  • Adjust hotkey event tests to cover the new pressed and released events instead of the old toggled event.

baiqing added 3 commits April 27, 2026 16:56
OpenLessHotkey now emits explicit .pressed / .released edge events on
the modifier-key trigger; the coordinator interprets them per the new
HotkeyMode preference (toggle stays the default, hold = press-and-hold).

Hold mode rules: .pressed on idle starts the session; .released stops
it from .listening; releasing during .starting cancels (no useful
audio was captured yet). Esc-cancel sets a one-shot suppress flag so
the trailing .released after a cancel does not re-trigger end.

Settings hub gets a "录音方式" ("Recording mode") segmented picker and a
one-line hint explaining the tradeoff. Bump version to 1.1.0; update
README roadmap and USAGE walkthrough to document the new mode.

sourcery-ai Bot commented Apr 27, 2026

Reviewer's Guide

Implements a configurable hotkey mode system by introducing a new HotkeyMode (toggle/hold), changing hotkey events to explicit pressed/released edges, updating DictationCoordinator and settings to interpret those edges per user preference, and documenting/bumping build metadata for the new hold-to-talk feature.

Sequence diagram for hotkey pressed/released handling in toggle vs hold modes

sequenceDiagram
    actor User
    participant HotkeyMonitor
    participant DictationCoordinator
    participant VolcengineStreamingASR as ASR

    rect rgb(240,240,240)
    Note over User,DictationCoordinator: Toggle mode (default)
    User->>HotkeyMonitor: press trigger
    HotkeyMonitor->>DictationCoordinator: HotkeyEvent.pressed
    DictationCoordinator->>DictationCoordinator: handlePressed()
    DictationCoordinator->>DictationCoordinator: handleToggle() begin
    DictationCoordinator->>ASR: beginSession()

    User-->>HotkeyMonitor: release trigger
    HotkeyMonitor-->>DictationCoordinator: HotkeyEvent.released
    DictationCoordinator-->>DictationCoordinator: handleReleased() (ignored in toggle)

    User->>HotkeyMonitor: press trigger again
    HotkeyMonitor->>DictationCoordinator: HotkeyEvent.pressed
    DictationCoordinator->>DictationCoordinator: handlePressed()
    DictationCoordinator->>DictationCoordinator: handleToggle() end
    DictationCoordinator->>ASR: endSession()
    end

    rect rgb(235,245,255)
    Note over User,DictationCoordinator: Hold mode
    User->>HotkeyMonitor: press trigger
    HotkeyMonitor->>DictationCoordinator: HotkeyEvent.pressed
    DictationCoordinator->>DictationCoordinator: handlePressed()
    DictationCoordinator->>DictationCoordinator: handleHoldStart()
    DictationCoordinator->>ASR: beginSession()

    User-->>HotkeyMonitor: release trigger
    HotkeyMonitor-->>DictationCoordinator: HotkeyEvent.released
    DictationCoordinator-->>DictationCoordinator: handleReleased()
    alt sessionPhase == listening
        DictationCoordinator->>ASR: endSession()
    else sessionPhase == starting
        DictationCoordinator->>DictationCoordinator: handleCancel()
    end
    end

    rect rgb(255,240,240)
    Note over User,DictationCoordinator: Hold mode Esc cancel while holding
    User->>DictationCoordinator: Esc key
    DictationCoordinator->>DictationCoordinator: handleCancel()
    DictationCoordinator->>DictationCoordinator: suppressNextRelease = true

    User-->>HotkeyMonitor: release trigger
    HotkeyMonitor-->>DictationCoordinator: HotkeyEvent.released
    DictationCoordinator-->>DictationCoordinator: handleReleased()
    DictationCoordinator-->>DictationCoordinator: suppressNextRelease consumed, no endSession
    end

Updated class diagram for hotkey mode and event handling

classDiagram
    class HotkeyMode {
        <<enum>>
        toggle
        hold
        String rawValue
        String displayName()
        String hint()
    }

    class HotkeyEvent {
        <<enum>>
        pressed
        released
        cancelled
    }

    class UserPreferences {
        -UserDefaults defaults
        +static UserPreferences shared
        +HotkeyMode hotkeyMode
        +HotkeyBinding.Trigger hotkeyTrigger
        +PolishMode polishMode
        +Bool hasCompletedOnboarding
    }

    class HotkeyMonitor {
        -CFMachPort eventTap
        -CFRunLoopSource runLoopSource
        -Bool triggerHeld
        -AsyncStream~HotkeyEvent~.Continuation continuation
        +AsyncStream~HotkeyEvent~ events
        +start()
        +stop()
        -handleEvent(eventType, flags)
    }

    class DictationCoordinator {
        -VolcengineStreamingASR asr
        -BufferingAudioConsumer audioConsumer
        -Date sessionStartedAt
        -SessionPhase sessionPhase
        -HotkeyServiceProtocol hotkey
        -Bool suppressNextRelease
        +init(hotkeyService)
        +start()
        -handlePressed()
        -handleReleased()
        -handleHoldStart()
        -handleToggle()
        -handleCancel()
        -beginSession()
        -endSession()
    }

    class SettingsHubTab {
        <<SwiftUIView>>
        -HotkeyBinding.Trigger trigger
        -HotkeyMode hotkeyMode
        -PolishMode mode
        +body()
    }

    class VolcengineStreamingASR {
        +startStreaming()
        +stopStreaming()
    }

    UserPreferences --> HotkeyMode : stores
    UserPreferences --> HotkeyBinding.Trigger : stores

    DictationCoordinator --> HotkeyMonitor : subscribes_events
    DictationCoordinator --> UserPreferences : reads_hotkeyMode
    DictationCoordinator --> VolcengineStreamingASR : controls
    DictationCoordinator --> HotkeyEvent : consumes

    HotkeyMonitor --> HotkeyEvent : produces

    SettingsHubTab --> UserPreferences : reads_writes_hotkeyMode
    SettingsHubTab --> HotkeyMode : displays_picker_options
    SettingsHubTab --> NotificationCenter : posts_openLessHotkeyChanged
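The triggerHeld flag on HotkeyMonitor in the class diagram suggests simple edge detection: an event is emitted only when the key state actually changes. A minimal sketch of that idea follows, assuming the event tap can be reduced to "is the trigger down right now?" samples. `TriggerEdgeDetector` and `update(isDown:)` are hypothetical names, and the real monitor works on a CGEvent tap rather than a Bool array.

```swift
// Illustrative only; the real HotkeyMonitor is not shown in this PR page.
enum HotkeyEvent: Equatable { case pressed, released }

/// Turns "trigger is down / up" samples into pressed / released edges.
struct TriggerEdgeDetector {
    private(set) var triggerHeld = false

    mutating func update(isDown: Bool) -> HotkeyEvent? {
        // No state change means no edge (key repeat, duplicate flag events).
        guard isDown != triggerHeld else { return nil }
        triggerHeld = isDown
        return isDown ? .pressed : .released
    }
}

var detector = TriggerEdgeDetector()
let samples = [true, true, false, false, true]
let events = samples.compactMap { detector.update(isDown: $0) }
// events == [.pressed, .released, .pressed]
```

Filtering repeats at the monitor keeps the coordinator's state machine free of duplicate .pressed events while the key is held.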

File-Level Changes

Change: Introduce HotkeyMode enum and persist user preference for toggle vs hold hotkey behavior.
Details:
  • Add HotkeyMode enum with toggle/hold cases plus displayName and hint helpers.
  • Add hotkeyMode property to UserPreferences backed by UserDefaults with sane default and migration-safe fallback.
  • Wire hotkeyMode state into SettingsHubTab using a segmented Picker that updates UserPreferences and notifies listeners.
Files:
  • Sources/OpenLessCore/HotkeyMode.swift
  • Sources/OpenLessPersistence/UserPreferences.swift
  • Sources/OpenLessApp/Settings/SettingsView.swift

Change: Refactor hotkey handling to use explicit pressed/released events and implement hold-to-talk state machine in DictationCoordinator.
Details:
  • Change HotkeyEvent from a single toggled case to pressed/released edge events plus cancelled, updating docs/comments accordingly.
  • Update HotkeyMonitor to emit pressed on key-down and released on key-up while tracking triggerHeld state.
  • Refactor DictationCoordinator to route pressed/released through handlePressed/handleReleased/handleHoldStart based on UserPreferences.hotkeyMode, preserving existing toggle behavior as default.
  • Add suppressNextRelease flag so Esc-cancel in hold mode ignores the trailing key-up release event.
  • Adjust sessionPhase transitions so hold mode press starts (.starting), release ends (.processing) or cancels if still starting, matching the behavior matrix.
Files:
  • Sources/OpenLessHotkey/HotkeyEvent.swift
  • Sources/OpenLessHotkey/HotkeyMonitor.swift
  • Sources/OpenLessApp/DictationCoordinator.swift

Change: Update tests and documentation plus bump build metadata for the new behavior.
Details:
  • Update HotkeyEventTests to validate distinct pressed/released/cancelled events instead of the old toggled case.
  • Document the new hold-to-talk mode and settings in USAGE.md and remove it from the future roadmap in both READMEs.
  • Bump APP_VERSION to 1.0.01 and BUILD_NUMBER to A1003 in build-app.sh to reflect the new feature build.
Files:
  • Tests/OpenLessHotkeyTests/HotkeyEventTests.swift
  • USAGE.md
  • README.md
  • README.zh.md
  • scripts/build-app.sh
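The "sane default and migration-safe fallback" for hotkeyMode could look roughly like the sketch below. It assumes the mode is stored as its raw string; the key name "hotkeyMode", the injectable initializer, and the class body are hypothetical and stand in for the real UserPreferences, which is not shown here.

```swift
import Foundation

// Illustrative only: a string-backed enum read through UserDefaults.
enum HotkeyMode: String, CaseIterable {
    case toggle
    case hold
}

final class UserPreferences {
    private let defaults: UserDefaults
    private let hotkeyModeKey = "hotkeyMode" // hypothetical key name

    init(defaults: UserDefaults = .standard) {
        self.defaults = defaults
    }

    var hotkeyMode: HotkeyMode {
        get {
            // A missing key (existing installs) or an unrecognised value
            // falls back to toggle, so current users see no change.
            defaults.string(forKey: hotkeyModeKey)
                .flatMap(HotkeyMode.init(rawValue:)) ?? .toggle
        }
        set { defaults.set(newValue.rawValue, forKey: hotkeyModeKey) }
    }
}
```

Reading through `init(rawValue:)` rather than force-unwrapping means a value written by a future build with extra modes degrades to toggle instead of crashing.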

Assessment against linked issues

| Issue | Objective | Addressed | Explanation |
| --- | --- | --- | --- |
| #1 | Add a hold-to-talk (press-and-hold hotkey) recording mode alongside the existing toggle mode, including appropriate handling in the hotkey pipeline and dictation state machine. | | |

sourcery-ai Bot left a comment

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="Sources/OpenLessApp/DictationCoordinator.swift" line_range="175-178" />
<code_context>
+        case .listening:
+            sessionPhase = .processing
+            Task { await endSession() }
+        case .starting:
+            // User let go before the ASR connected: treat it as a cancel and send none of the captured audio.
+            Log.write("[session] hold: starting 阶段松手,取消")
+            handleCancel()
+        case .idle, .processing:
+            return
</code_context>
<issue_to_address>
**issue (bug_risk):** Cancelling from `.starting` in hold mode sets `suppressNextRelease`, which will suppress the *next* legitimate release and prevent the next session from ending.

In this path, `handleReleased` calls `handleCancel()` while in `.starting` hold mode. By then the key is already up and the corresponding `.released` has been handled, but `handleCancel()` still sets `suppressNextRelease = true`. That flag then incorrectly applies to the *next* press/release cycle, causing the next `.released` to be ignored and the subsequent hold session to never exit `.listening`/`.processing`. To avoid this, the `.starting` early-release path should cancel without setting `suppressNextRelease`, or `handleCancel` should only set this flag when invoked while the key is actually still down (e.g., Esc-based cancel).
</issue_to_address>
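One way to act on the second suggestion (set the flag only when the key is still down) is to pass the cancel reason explicitly. This is a sketch under assumptions, not the fix that landed: `HoldCoordinator`, `CancelReason`, `handleEscape`, and `asrDidConnect` are hypothetical names, and the ASR and audio side effects are omitted.

```swift
// Illustrative only: a reduced hold-mode coordinator with no audio or ASR.
enum SessionPhase { case idle, starting, listening, processing }
enum CancelReason { case escapeWhileHeld, earlyRelease }

struct HoldCoordinator {
    private(set) var phase: SessionPhase = .idle
    private(set) var suppressNextRelease = false

    mutating func handlePressed() {
        guard phase == .idle else { return }
        phase = .starting
    }

    /// Stand-in for the ASR connection completing.
    mutating func asrDidConnect() {
        if phase == .starting { phase = .listening }
    }

    mutating func handleEscape() {
        guard phase == .starting || phase == .listening else { return }
        cancel(reason: .escapeWhileHeld)
    }

    mutating func handleReleased() {
        if suppressNextRelease {
            suppressNextRelease = false // consume the trailing key-up once
            return
        }
        switch phase {
        case .listening: phase = .processing
        case .starting: cancel(reason: .earlyRelease)
        case .idle, .processing: return
        }
    }

    private mutating func cancel(reason: CancelReason) {
        phase = .idle
        // Only an Esc cancel leaves a key-up pending; an early release has
        // already delivered its .released, so nothing is left to suppress.
        suppressNextRelease = (reason == .escapeWhileHeld)
    }
}
```

With this shape, a short tap during .starting followed by a normal hold ends the second session on release, which is the sequence the review comment says would otherwise get stuck.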

baiqing added 3 commits April 27, 2026 18:12
…e on secret fields

Settings window had isMovableByWindowBackground = true, so any drag
inside the content view dragged the whole window — making it
impossible to select words inside a TextField. Disable it; the native
title-bar strip (with the traffic lights) still drags the window.

PasteableCredentialField now exposes an eye / eye.slash toggle when
secure: true, so users can verify pasted API keys before saving.

When the user removes the Ark API key (or the Ark call errors out),
DictationCoordinator was still showing the green "已插入" ("Inserted")
capsule and recording the session in history under the
originally-selected polish mode. From the user's side it looked like
polish had succeeded, even though only the raw transcript reached the
cursor.

Track a PolishOutcome (ok / skippedNoCredentials / failed) through
insertText. Capsule now uses a new CapsuleState.warning (orange) state
to surface honest messages, "已插入原文 · 未润色" ("Inserted raw text ·
not polished") / "润色失败 · 已用原文" ("Polish failed · raw text used"),
and history saves the session as PolishMode.raw whenever polish
didn't actually run. The error capsule + 1.5s artificial sleep that
preceded a fake "success" insert is gone; warning state stays for the
normal 2.5s.
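
The outcome tracking described in this commit message can be pictured as a single mapping from PolishOutcome to what the capsule shows and what history records. The sketch below assumes that reading. `InsertReport`, `report(for:selectedMode:)`, the `.success` capsule case, and the two-case PolishMode are hypothetical; only PolishOutcome's three cases, CapsuleState.warning, PolishMode.raw, and the capsule strings come from the text above.

```swift
// Illustrative only; the real types live in the app and are not shown here.
enum PolishMode: Equatable { case raw, structured } // hypothetical subset
enum PolishOutcome { case ok, skippedNoCredentials, failed }
enum CapsuleState: Equatable {
    case success(String) // assumption: the existing green capsule
    case warning(String) // the new orange state
}

struct InsertReport: Equatable {
    let capsule: CapsuleState
    let historyMode: PolishMode
}

/// Decides the capsule message and the mode saved to history.
func report(for outcome: PolishOutcome, selectedMode: PolishMode) -> InsertReport {
    switch outcome {
    case .ok:
        return InsertReport(capsule: .success("已插入"), historyMode: selectedMode)
    case .skippedNoCredentials:
        // Polish never ran, so history must not claim the selected mode.
        return InsertReport(capsule: .warning("已插入原文 · 未润色"), historyMode: .raw)
    case .failed:
        return InsertReport(capsule: .warning("润色失败 · 已用原文"), historyMode: .raw)
    }
}
```

Deriving both the capsule and the history entry from one value makes it hard for the two to disagree, which is the mismatch the commit describes.
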
Some weaker chat models would emit a single flat list when the user
asked for Structured polish, even when the dictation clearly covered
multiple topics. The mode is meant to produce something pasteable as
an AI prompt or a working doc, not a wall of bullets.

Rewrite the .structured system prompt to be prescriptive: when the
input contains ≥2 themes, emit three explicit levels — top "1./2./3."
sections, "1)/2)/3)" sub-points indented 3 spaces, and "a./b./c."
detail items indented another 3 — and to fall back to a plain
paragraph when the input is short / single-topic. Add a concrete
format example so weaker models have a pattern to copy.
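The actual prompt text is not shown on this page, so the following only illustrates the three-level layout the commit message describes: "1." sections, "1)" sub-points indented 3 spaces, and "a." details indented another 3. `OutlineSection`, `SubPoint`, and `render` are hypothetical names, and the short-input fallback to a plain paragraph is not modeled.

```swift
// Illustrative only: renders the target layout, not the system prompt itself.
struct SubPoint {
    let text: String
    var details: [String] = []
}

struct OutlineSection {
    let title: String
    var points: [SubPoint] = []
}

func render(_ sections: [OutlineSection]) -> String {
    let letters = Array("abcdefghijklmnopqrstuvwxyz")
    var lines: [String] = []
    for (i, section) in sections.enumerated() {
        lines.append("\(i + 1). \(section.title)")
        for (j, point) in section.points.enumerated() {
            lines.append("   \(j + 1)) \(point.text)") // 3-space indent
            for (k, detail) in point.details.enumerated() {
                // 6-space indent; letters wrap around after "z".
                lines.append("      \(letters[k % letters.count]). \(detail)")
            }
        }
    }
    return lines.joined(separator: "\n")
}

let outline = [
    OutlineSection(title: "Release plan", points: [
        SubPoint(text: "Build", details: ["bump version", "run checks"]),
        SubPoint(text: "Docs"),
    ]),
    OutlineSection(title: "Follow-ups"),
]
print(render(outline))
// 1. Release plan
//    1) Build
//       a. bump version
//       b. run checks
//    2) Docs
// 2. Follow-ups
```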
@appergb appergb merged commit 56c3e09 into main Apr 27, 2026
1 check passed
@appergb appergb deleted the feature/hold-to-talk branch April 30, 2026 03:19
appergb added a commit that referenced this pull request Apr 30, 2026
#79)

closes Codex audit blockers on PR #78

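## HIGH #1: cancel is overwritten during an await inside begin_session
Background: volcengine open_session().await and Recorder::start are both
async. If the user presses Esc during that window, cancel_session sets
phase back to Idle and cancelled=true. The original code then ran
`state.phase = SessionPhase::Listening` unconditionally, flipping Idle
back to Listening, so the user's cancel was swallowed and the session
kept running.

Fix:
- Add a `cancel_raced_during_starting(inner)` helper: while holding the
  lock, check cancelled or phase != Starting
- Call it after ASR open_session().await: if the race happened,
  asr.cancel() and return to Idle
- The Recorder::start Ok branch uses BeginOutcome { Started / PendingStop /
  CancelRaced }, decided atomically inside the same lock; on CancelRaced,
  clean up the recorder + asr resources and do not enter Listening

## HIGH #2: window between the cancelled check and inserter.insert
Background: end_session releases the lock between checking cancelled and
calling inserter.insert. An Esc in that window makes cancel_session set
cancelled=true too late: Cmd+V is about to be sent and cannot be undone.
cancel_session still emits "已取消" ("Cancelled"), so the UI contradicts
what actually happened (text inserted, but shown as cancelled).

Fix:
- Add SessionPhase::Inserting: the window where the last cancel check has
  passed and inserter.insert is about to be called or is running
- end_session: make the post-polish cancel check + phase transition atomic
  inside the same lock: cancelled → Idle + return; otherwise →
  phase=Inserting, then release the lock and run insert
- cancel_session: phase==Inserting → return immediately (do not set
  cancelled, do not emit a cancel). Rationale: Cmd+V physically cannot be
  undone, and pretending "cancelled" would only make the UI lie

cargo check passes; all 13 warnings are pre-existing.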

Co-authored-by: baiqing <lbx12309@icloud.com>
appergb pushed a commit that referenced this pull request Apr 30, 2026
…attr

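Fixes the 2 HIGH findings from the Codex audit of main HEAD a9c81e6 and
closes off the macOS auto-update "Gatekeeper blocks the app after restart"
problem.

## HIGH #2: version mismatch
PR #84 did not bump the version, so main still reports 1.2.2. Tagging
v1.2.3-tauri next would make the updater manifest report 1.2.2, and
installed 1.2.2 clients would never see "new version available".

Fix: set package.json + tauri.conf.json + Cargo.toml to 1.2.3.

## HIGH #1: updater pubkey ownership
The pubkey used by PR #84 was generated locally by an external contributor;
appergb does not hold the matching private key. Anyone holding that private
key could sign an update -> clients would install a malicious version over
OTA.

Fix:
- Generate appergb's own keypair with npx @tauri-apps/cli signer generate --ci
- New pubkey: F0FCDE68E08E6D4E (written to tauri.conf.json plugins.updater.pubkey)
- The private key is configured as a GitHub repo secret via gh secret set TAURI_SIGNING_PRIVATE_KEY
- The local copy of the private key lives only in /tmp, is not in git, and will be wiped after the commit

## macOS: strip xattr after auto-update
restart_app runs /usr/bin/xattr -cr on the .app bundle before app.restart().
With the Tauri auto-updater plus a non-notarized app, this is the only way to
make auto-update frictionless for users. Otherwise Gatekeeper blocks the
restart with "OpenLess is damaged", and the user has to open a terminal and
run xattr to keep using the app, which defeats the purpose of auto-update.

Future release logic must keep this step. The previous PR (#83) already
strips once on the CI side in release-tauri.yml; this change strips again on
the client side at restart, as a double safeguard covering the whole
"download -> unpack -> install -> restart" chain.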