Releases · AlexFlanker/whisper-input-next-mac-kit

02 Jun 08:02

AlexFlanker

v0.2.1

55339ee

v0.2.1 — freeze resilience (hold-to-restart + diagnostics) Latest


Occasional hangs? Now there's an escape hatch and a black box.

Added

🆘 Hold-to-restart — if the service ever freezes, press & hold the on-screen indicator ~1.8s: a red ring fills, then it force-restarts (launchd brings it back in ~1s). It also snapshots all thread stacks right before exiting, so the freeze is captured even if you restart immediately.
🔬 Freeze watchdog — when something stays stuck in a busy state too long (transcription 45s / recording 180s, configurable), it dumps every thread's stack to logs/freeze_diagnostics.log for diagnosis. Diagnostics-only — never auto-restarts.
⏱️ Real whisper-cli timeout (WHISPER_SUBPROCESS_TIMEOUT, default 90s) — kills a hung child instead of leaking it for 3 minutes behind the old thread-based wrapper.
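
The two diagnostics ideas above (dumping every thread's stack, and a real timeout on the child process) can be sketched with the Python standard library. This is a minimal illustration under assumptions, not the kit's code: `dump_all_stacks` and `run_with_timeout` are hypothetical names, and using `faulthandler` and `subprocess.run` is this sketch's choice.

```python
import faulthandler
import os
import subprocess
from pathlib import Path

# Default taken from the release note (WHISPER_SUBPROCESS_TIMEOUT, 90s).
TIMEOUT_S = float(os.environ.get("WHISPER_SUBPROCESS_TIMEOUT", "90"))


def dump_all_stacks(log_path):
    """Append every thread's Python stack to log_path (hypothetical helper)."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as fh:
        fh.write("--- freeze snapshot ---\n")
        fh.flush()  # faulthandler writes straight to the file descriptor
        faulthandler.dump_traceback(file=fh, all_threads=True)


def run_with_timeout(cmd, timeout_s=TIMEOUT_S):
    """Run cmd; return its stdout, or None if it exceeded timeout_s."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before raising,
        # so a hung process is not left running in the background.
        return None
    return done.stdout
```

A watchdog thread could call `dump_all_stacks("logs/freeze_diagnostics.log")` once a busy state outlives its limit, and the hold-to-restart path could call it right before exiting.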

Fixed

The on-screen indicator now receives clicks. The app's status_bar.py ran AppHelper.runConsoleEventLoop(), which pumps the run loop (animations, callAfter, and display all work) but never dispatches NSApp mouse events to windows, so the hold-to-restart overlay received nothing. Switched to AppHelper.runEventLoop() (activation policy unchanged). Diagnosed empirically: under the console loop, clicks fired neither acceptsFirstMouse nor mouseDown under either the Prohibited or the Accessory activation policy; runEventLoop fixed it.

How it ships

Two kit-owned files (src/utils/freeze_watchdog.py, src/ui/hold_restart.py, MIT) copied in by install.sh; the patcher wires 6 thin hooks into main.py / status_bar.py / local_whisper.py. Verified byte-identical to a known-good install (30 edits + 5 shipped files) and idempotent.

macOS only. Not affiliated with or endorsed by upstream. See CREDITS.md.


02 Jun 06:52

AlexFlanker

v0.2.0

91bb011

v0.2.0 — local LLM polish (your dictation, cleaned up on-device)

The first release on the core-quality axis: v0.1.x made dictation pleasant to use and manage; v0.2.0 makes it read better — fully on-device.

Added

🧠 Local LLM polish — after transcription, an optional local Ollama model tidies the text before it's pasted:
- light — faithful: fixes punctuation, typos, obvious stumbles, keeps your wording.
- concise — trims filler / repetition / 口头禅 for a much shorter result.
- Default model glm4 (GLM-4-9B); 100% offline; opt-in (POLISH_ENABLED, off by default); a hard timeout falls back to the raw transcript, so a slow or absent Ollama never blocks dictation.
🤖 MCP set_polish (mode / model / enabled) — flip light↔concise, swap models, or toggle it, right from Claude Desktop. POLISH_* added to the writable allowlist; polish state shows in status().
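
The "hard timeout falls back to the raw transcript" behaviour can be sketched as follows. This is an illustration under assumptions, not the contents of ollama_polish.py: the `polish` function, its prompts, and the 8-second timeout are hypothetical. The endpoint and the `model` / `prompt` / `stream` / `response` fields are Ollama's standard local generate API.

```python
import json
import os
import urllib.request

# Ollama's default local endpoint. The prompts are illustrative, not the kit's.
OLLAMA_URL = "http://localhost:11434/api/generate"
PROMPTS = {
    "light": "Fix punctuation and typos. Keep the wording. Return only the text:\n",
    "concise": "Remove filler and repetition. Return only the text:\n",
}


def polish(raw, mode="light", model="glm4", timeout_s=8.0, url=OLLAMA_URL):
    """Return the polished transcript, or raw unchanged on any failure."""
    if os.environ.get("POLISH_ENABLED", "false").lower() != "true":
        return raw  # opt-in: off by default
    body = json.dumps({"model": model, "prompt": PROMPTS[mode] + raw, "stream": False})
    req = urllib.request.Request(
        url, data=body.encode("utf-8"), headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            text = json.loads(resp.read())["response"].strip()
    except Exception:
        # Timeout, refused connection, or malformed reply: never block dictation.
        return raw
    return text or raw
```

Catching broadly is deliberate here: any failure in the optional step degrades to the unpolished text instead of an error.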

Enable it

brew install ollama && ollama pull glm4 (GLM-4-9B, ~5.5 GB; or qwen2.5:3b for lighter/faster).
POLISH_ENABLED=true in .env and restart — or ask Claude Desktop "turn on polish, concise mode".

How it ships

Kit-owned helper src/transcription/ollama_polish.py (MIT) copied in by install.sh; the patcher only wires 4 thin main.py hooks. Verified byte-identical to a known-good install (24 edits + 3 shipped files) and idempotent. No upstream code redistributed.

macOS only. Not affiliated with or endorsed by upstream. See CREDITS.md.


01 Jun 01:05

AlexFlanker

v0.1.4

6329115

v0.1.4 — capsule indicator style + MCP style switching

Pick your on-screen dictation indicator — and switch it right from Claude Desktop.

Added

🟣 Second indicator style: capsule. A pill appears with three pulsing dots while you record, retracts into a small circle + spinner while transcribing, then turns green and fades when your text lands. (The original glowing ring is still the default.)
🎛️ INDICATOR_STYLE (ring | capsule) picks the style; the app selects it via a tiny factory. New kit-owned UI file src/ui/capsule_indicator.py (MIT), copied in by install.sh.
🤖 Switch styles from Claude Desktop: the MCP server gains set_indicator_style (ring / capsule / off), which updates the config and restarts in one call. INDICATOR_STYLE is now writable via MCP, and status reports the current style.
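
A "tiny factory" keyed on INDICATOR_STYLE might look roughly like the sketch below. The class names, `make_indicator`, and the fallback to the ring for unknown values are assumptions for illustration; the real indicator classes are AppKit overlays.

```python
import os


class RingIndicator:
    """Stand-in for the original glowing-ring overlay."""


class CapsuleIndicator:
    """Stand-in for the capsule overlay."""


STYLES = {"ring": RingIndicator, "capsule": CapsuleIndicator}


def make_indicator(style=None):
    """Build the indicator for style, or return None when it is 'off'."""
    style = (style or os.environ.get("INDICATOR_STYLE", "ring")).strip().lower()
    if style == "off":
        return None
    # Assumption: unknown values fall back to the default ring.
    return STYLES.get(style, RingIndicator)()
```

Keeping the mapping in one dictionary means a third style only needs a new entry there.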

How it ships

The patcher only wires main.py hooks (now 5 indicator hooks: flags, factory, instantiation, state-change). Verified byte-identical to a known-good install (20 edits + 2 shipped UI files) and idempotent. No upstream code redistributed.

macOS only. Not affiliated with or endorsed by upstream. See CREDITS.md.


01 Jun 00:43

AlexFlanker

v0.1.3

b4f809b

v0.1.3 — on-screen listening indicator

A small, native on-screen indicator so you can see your dictation state at a glance.

Added

🟢 Listening indicator — a bottom-center, click-through overlay that floats over any app:
- a breathing ring while recording,
- a spinner while transcribing,
- a green expand-and-fade burst when your text is pasted.
  Fully transparent (no panel chrome) with a soft glow, so it blends into whatever's behind it.
Toggle with SHOW_INDICATOR (default on); also settable from the MCP server.

How it ships

The UI component (src/ui/listening_indicator.py) is kit-owned (MIT) and copied into your checkout by install.sh. The idempotent patcher only wires four main.py hooks. No upstream code is redistributed.
Verified: the patcher applies to the pinned upstream commit byte-identical to a known-good install (19 edits + 1 shipped file) and is idempotent.

macOS only. Not affiliated with or endorsed by upstream. See CREDITS.md.


01 Jun 00:08

AlexFlanker

v0.1.2

24eb225

v0.1.2 — manage from Claude Desktop (MCP), env-configurable sounds, auto-cleanup

Frictionless upgrades to the local-whisper dictation kit.

Added

🤖 MCP server (mcp/server.py) — monitor & configure the dictation service from Claude Desktop (or any MCP client) by just asking: status, logs, get_config/set_config, restart, recent_transcriptions, and model management. No UI to build.
🛠️ install-mcp.sh — installs the mcp SDK into the app venv and merges a whisper-input entry into Claude Desktop's config (existing servers untouched, backed up, idempotent). Refuses to run while Claude Desktop is open (it rewrites its own config).
🧹 Audio-archive auto-cleanup — deletes recordings older than AUDIO_ARCHIVE_RETENTION_HOURS (default 24) at startup and periodically; prunes cache.json.
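
The retention rule for the audio archive can be sketched as a single pass over a directory. This is an illustration only: `cleanup_archive`, the `.wav` suffix filter, and the use of file modification time are assumptions, and the cache.json pruning step is not shown.

```python
import os
import time
from pathlib import Path


def cleanup_archive(archive_dir, retention_hours=None, suffix=".wav", now=None):
    """Delete recordings older than the retention window; return their names."""
    if retention_hours is None:
        # Default of 24 hours comes from the release note.
        retention_hours = float(os.environ.get("AUDIO_ARCHIVE_RETENTION_HOURS", "24"))
    cutoff = (now if now is not None else time.time()) - retention_hours * 3600
    removed = []
    for path in Path(archive_dir).iterdir():
        # Assumption: recordings are identified by suffix; other files are left alone.
        if path.is_file() and path.suffix == suffix and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Running it once at startup and then on a timer would match the "at startup and periodically" schedule described above.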

Changed

🔊 Sound cues are now env-configurable — SOUND_START / SOUND_STOP / SOUND_DONE / SOUND_ERROR / SOUND_WARNING (defaults unchanged: Submarine/Submarine/Glass/Basso/Funk).
uninstall.sh now also removes the Claude Desktop MCP entry.
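
As a rough sketch of how the sound variables and their defaults could be resolved (the `resolve_sounds` helper is hypothetical, and treating an empty value as "use the default" is an assumption):

```python
import os

# Variable names and defaults are the ones listed in the release note.
DEFAULT_SOUNDS = {
    "SOUND_START": "Submarine",
    "SOUND_STOP": "Submarine",
    "SOUND_DONE": "Glass",
    "SOUND_ERROR": "Basso",
    "SOUND_WARNING": "Funk",
}


def resolve_sounds(env=None):
    """Map each cue to the configured sound name, falling back to the default."""
    env = os.environ if env is None else env
    return {key: env.get(key) or default for key, default in DEFAULT_SOUNDS.items()}
```

For example, `resolve_sounds({"SOUND_DONE": "Ping"})` changes only the done cue and leaves the other four at their defaults.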

Quality

Patcher verified: applies to the pinned upstream commit byte-identical to a known-good install, and is idempotent on re-run.
Went through an adversarial multi-agent review (13 findings, 0 false positives); all confirmed items fixed — atomic config writes, write-once backup, hf-mirror.com model fallback behind the GFW, and more.
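
Two of the fixes named above, atomic config writes and a write-once backup, follow a common pattern that can be sketched like this. The function name and the `.bak` suffix are assumptions for illustration, not the kit's actual implementation.

```python
import os
import shutil
import tempfile


def write_config_atomically(path, text):
    """Replace path with text in one step, keeping the first-ever backup."""
    path = os.fspath(path)
    directory = os.path.dirname(os.path.abspath(path))
    backup = path + ".bak"  # hypothetical backup name
    # Write-once backup: keep the earliest copy and never overwrite it.
    if os.path.exists(path) and not os.path.exists(backup):
        shutil.copy2(path, backup)
    # Temp file in the same directory, so the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A crash mid-write then leaves the previous config intact instead of a half-written file.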

Not affiliated with or endorsed by upstream. macOS only. See CREDITS.md for the upstream license note.


31 May 22:49

AlexFlanker

v0.1.1

698d326

v0.1.1 — Local-mode punctuation

✍️ What's new — better Chinese punctuation (local mode)

Local transcripts now come out punctuated:

Prompt-guided punctuation — a default WHISPER_PROMPT makes whisper.cpp emit punctuation (run-on Chinese speech was coming out with none).
Full-width normalization — half-width punctuation next to Chinese is converted to full-width 「，。！？」 via WHISPER_FULLWIDTH_PUNCT (on by default).

Both are configurable in .env — set WHISPER_PROMPT= (empty) or WHISPER_FULLWIDTH_PUNCT=false to opt out.
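
The full-width normalization step can be illustrated with a small regex pass. This is a sketch under assumptions: `to_fullwidth` is a hypothetical name, "next to Chinese" is read as "immediately before or after a character in the basic CJK block", and only the four marks shown in the note are mapped.

```python
import re

FULLWIDTH = {",": "，", ".": "。", "!": "！", "?": "？"}
CJK = "\u4e00-\u9fff"  # basic CJK Unified Ideographs block
# Half-width mark directly after, or directly before, a CJK character.
PATTERN = re.compile(rf"(?<=[{CJK}])[,.!?]|[,.!?](?=[{CJK}])")


def to_fullwidth(text):
    """Convert half-width punctuation adjacent to Chinese into full-width."""
    return PATTERN.sub(lambda m: FULLWIDTH[m.group()], text)
```

Because the match requires a neighbouring CJK character, plain English sentences and decimals such as 3.14 pass through unchanged.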

Upgrade an existing install

cd whisper-input-next-mac-kit && git pull && ./install.sh

The patcher is idempotent and leaves your .env untouched (code defaults provide punctuation).

Full details in CHANGELOG.md.


31 May 22:25

AlexFlanker

v0.1.0

64694fd

v0.1.0 — Initial release

A one-command macOS installer that turns Whisper-Input-Next into an always-on, fully local voice keyboard — offline, free, and private.

✨ Highlights

🎙️ Tap Right-⌘ to start/stop — single-tap toggle, no chord, no app conflicts
🔊 Audio cues — Submarine on start/stop, Glass when text is pasted
🚀 launchd auto-start service — starts at login, restarts on crash, no terminal window
🧠 Ctrl+F also routes to local whisper.cpp
⚙️ Turn-key: uv venv, deps, whisper-cpp, model, .env, LaunchAgent — all wired automatically

🚀 Install (macOS)

git clone https://github.com/AlexFlanker/whisper-input-next-mac-kit.git
cd whisper-input-next-mac-kit && ./install.sh

📌 Notes

macOS only. Default model large-v3-turbo (override with WIN_MODEL=large-v3).
Does not redistribute upstream source — the installer clones it from the official repo on your machine. All credit to the upstream authors — see CREDITS.md.
Targets upstream commit 5edec44.

MIT licensed (covers this kit's own code).

