Release v0.3.0 — Parakeet on Hexagon NPU · trsdn/openwritr-windows

📥 Which file do I download?

| Your machine | Download this |
| --- | --- |
| Snapdragon / ARM (e.g. Surface Pro 11, Surface Laptop 7) | `openwritr-windows-arm64-v0.3.0-setup.exe` |
| Intel / AMD (most other laptops & desktops) | `openwritr-windows-x64-v0.3.0-setup.exe` |

Not sure which you have? Open Settings → System → About and check "System type": if it says "ARM-based processor", take arm64; otherwise take x64.
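If you would rather check from a script, the same decision can be sketched in Python. This is only an illustration: `installer_name` is a hypothetical helper, not part of OpenWritr, and `platform.machine()` may report `AMD64` when an x64 Python runs under emulation on an ARM device, so the Settings page remains the more reliable check.

```python
import platform


def installer_name(machine: str, version: str = "v0.3.0") -> str:
    """Map a machine string (e.g. 'ARM64', 'AMD64') to the installer file name."""
    # Assumption: anything that is not reported as ARM64 gets the x64 build.
    is_arm = machine.strip().lower() in {"arm64", "aarch64"}
    arch = "arm64" if is_arm else "x64"
    return f"openwritr-windows-{arch}-{version}-setup.exe"


# Print the suggested installer for the machine running this script.
print(installer_name(platform.machine()))
```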

Install

1. Run the setup .exe. Windows SmartScreen will warn ("Windows protected your PC") because the binaries are not code-signed yet; click More info → Run anyway. To verify your download, run `Get-FileHash .\openwritr-windows-<arch>-v0.3.0-setup.exe` in PowerShell and compare the result with the matching entry in SHA256SUMS.txt.
2. The installer creates everything itself (Start Menu entry, optional autostart, uninstaller under Settings → Apps). No folders to prepare.
3. First launch downloads the speech model (~0.6–1.2 GB). This is a one-time download and takes a couple of minutes. A microphone icon appears in the system tray when ready.
4. Hold Ctrl + Win, speak, release. The text is pasted at your cursor.
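The checksum comparison from the first step can also be automated. The sketch below is a minimal Python example, assuming SHA256SUMS.txt uses the common `sha256sum` layout of one `<hex digest>  <file name>` pair per line; that layout is an assumption, and the function names are hypothetical. The comparison is case-insensitive because `Get-FileHash` prints uppercase hex while other tools print lowercase.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase SHA-256 hex digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def expected_hash(sums_text: str, filename: str) -> Optional[str]:
    """Look up the digest listed for `filename`, or None if it is not listed."""
    # Assumed line layout: "<hex digest>  <file name>" (sha256sum style).
    for line in sums_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[-1].lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify(installer: Path, sums_file: Path) -> bool:
    """True only if the installer is listed and its digest matches."""
    listed = expected_hash(sums_file.read_text(encoding="utf-8"), installer.name)
    return listed is not None and listed == sha256_of(installer)
```

Calling `verify(Path("openwritr-windows-x64-v0.3.0-setup.exe"), Path("SHA256SUMS.txt"))` from the download folder returns `True` only when the file is listed and unmodified.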

Manual install (portable zip, no installer)

Download the .zip for your architecture instead, unzip it into any folder you like (e.g. C:\Tools\OpenWritr\), and run openwritr.exe from there. Same binaries — just no Start Menu entry, no autostart, no uninstaller. User data (settings, models, logs) goes to %LOCALAPPDATA%\OpenWritr\ automatically either way.

First release with Parakeet TDT v3 running on the Snapdragon X Elite Hexagon NPU (arm64) — plus a CPU-only build for Intel/AMD (x64).

Push-to-talk transcription on the NPU typically takes 200–400 ms of total decode time for a 5-second utterance; the encoder itself runs at ~67 ms steady-state per 8-second window. Long-form audio is chunked transparently (8 s window, 1 s overlap; the decoder runs once over the stitched feature stream).
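To make the chunking arithmetic concrete, here is a small Python sketch that counts the windows for a given audio length. It assumes a fixed stride of window minus overlap (7 s); how the final window is padded and how the seams are stitched is not described here, so the sketch only computes window start times. `chunk_starts` is a hypothetical name, not a function from the codebase.

```python
import math

WINDOW_S = 8.0   # static NPU window length, from the release notes
OVERLAP_S = 1.0  # overlap between consecutive windows, from the release notes


def chunk_starts(duration_s: float,
                 window_s: float = WINDOW_S,
                 overlap_s: float = OVERLAP_S) -> list:
    """Return the start time (in seconds) of each window covering the audio."""
    if duration_s <= window_s:
        return [0.0]  # short audio fits in a single window
    stride = window_s - overlap_s  # assumed: each window advances by 7 s
    count = math.ceil((duration_s - window_s) / stride) + 1
    return [i * stride for i in range(count)]


for seconds in (3.0, 5.8, 16.4, 23.0):
    print(seconds, len(chunk_starts(seconds)))
```

Under this assumption the four audio lengths yield 1, 1, 3, and 4 windows.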

Performance

Measured on Snapdragon X Elite (X1E80100):

| Audio length | Decode (preproc + NPU encode + TDT) | × Realtime | Chunks |
| --- | --- | --- | --- |
| 3 s | 128 ms | 23× | 1 |
| 5.8 s | 221 ms | 26× | 1 |
| 16.4 s | 375 ms | 44× | 3 |
| 23.0 s | 626 ms | 37× | 4 |
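The "× Realtime" column is the audio length divided by the decode time. A short Python sketch that reproduces it from the measured values in the table (`realtime_factor` is a hypothetical helper name):

```python
def realtime_factor(audio_s: float, decode_ms: float) -> float:
    """Seconds of audio processed per second of decode time."""
    return audio_s / (decode_ms / 1000.0)


# (audio length in seconds, decode time in milliseconds) from the table
ROWS = [(3.0, 128), (5.8, 221), (16.4, 375), (23.0, 626)]

for audio_s, decode_ms in ROWS:
    factor = realtime_factor(audio_s, decode_ms)
    print(f"{audio_s:>5.1f} s  {decode_ms:>4d} ms  {factor:.0f}x realtime")
```

Rounded to whole numbers this gives 23, 26, 44, and 37.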

The x64 build runs the same pipeline on the CPU (no Hexagon NPU on Intel/AMD) at roughly 25× realtime on a modern CPU.

Switch engines (arm64 only)

Right-click the tray icon → Settings → Transcription engine. NPU is used when available, with automatic CPU fallback.

Companion model

The NPU encoder is hosted at trsdn/parakeet-tdt-0.6b-v3-htp-int8-8s (632 MB QAIRT context binary). It is licensed CC-BY-4.0, with attribution to NVIDIA Parakeet, and is downloaded automatically on first NPU launch.

What landed

NPU pipeline: Direct ort_sys FFI (src/asr/qnn_ffi.rs) loading an AI-Hub-compiled QNN context binary; chunked long-audio with seam stitching.
Focus-robust hotkey: global low-level keyboard hook — recording survives focus steals (popups, UAC, shortcuts).
Audio idle fix: capture stream rebuilt per recording — no more dead mic after long idle.
App icon, settings links, included-Copilot-model markers.
x64 build for Intel/AMD (CPU INT8, 9 MB zip).
CI: every release is built reproducibly on GitHub Actions (windows-11-arm).

Known limits

arm64 NPU model is device-gated to Snapdragon X Elite; other Snapdragons fall back to CPU.
Static 8 s NPU window; longer audio is chunked transparently.
Binaries unsigned (SmartScreen warning) — Store/signing in progress.

🤖 Generated with Claude Code
