Releases: programow/ada
v0.1.6 — monitor-framed demo, cleaner download cards
What's Changed
- feat(landing): frame demo as a monitor screen; drop version from download cards by @GuilhermeVozniak in #38
- chore: bump version to 0.1.6 by @GuilhermeVozniak in #39
Full Changelog: v0.1.5...v0.1.6
v0.1.5 — readable warnings, OS-aware landing, interactive demo, version sync
What's Changed
- feat(landing): docs page + restructured download + demo fix + windows test unflake by @GuilhermeVozniak in #32
- fix(desktop): readable warning banner + refresh history after Clear all by @GuilhermeVozniak in #33
- feat(release): single-source-of-truth version bumper + CI consistency gate by @GuilhermeVozniak in #35
- feat(landing): adapt copy to visitor OS + interactive looping demo by @GuilhermeVozniak in #36
- chore: bump version to 0.1.5 by @GuilhermeVozniak in #37
Full Changelog: v0.1.2...v0.1.5
v0.1.2 — CI hardening + macOS icon bundling
What's Changed
- feat(landing): real provider logos in providers grid by @LeydenJar in #29
- feat(desktop): realtime STT via ElevenLabs Scribe v2 by @LeydenJar in #31
- Add modifier-only chord and double-tap modifier shortcuts on macOS by @GuilhermeVozniak in #30
Full Changelog: v0.1.4...v0.1.2
v0.1.4
v0.1.3
First release with the in-app auto-updater wired end-to-end:
- Minisign-signed updater bundles produced by CI.
- Tauri-shaped `update.json` manifest published as a release asset.
- In-app "Install & restart" banner + manual "Check now" from Settings → Updates.
- Settings tab now shows the running binary's version at the bottom.
After installing this build, future releases will be offered in-app and can be installed with a click.
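For readers unfamiliar with the updater manifest, here is a minimal sketch of what a "Tauri-shaped" `update.json` might look like as a TypeScript type. It assumes the static JSON layout described in the Tauri updater documentation (`version`, optional `notes` and `pub_date`, and a `platforms` map of `signature` + `url`). The actual file published by this project is not shown in these notes, and `isUpdaterManifest` is a hypothetical helper, not project code.

```typescript
// Assumed shape, modelled on the static JSON layout in the Tauri updater docs.
interface PlatformEntry {
  signature: string; // contents of the signature file produced at build time
  url: string; // download URL of the updater bundle
}

interface UpdaterManifest {
  version: string;
  notes?: string;
  pub_date?: string;
  platforms: Record<string, PlatformEntry>;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical validator: checks only the fields the sketch above declares as required.
function isUpdaterManifest(value: unknown): value is UpdaterManifest {
  if (!isRecord(value)) return false;
  const version = value["version"];
  if (typeof version !== "string" || version.length === 0) return false;
  const platforms = value["platforms"];
  if (!isRecord(platforms)) return false;
  return Object.keys(platforms).every((target) => {
    const entry = platforms[target];
    return (
      isRecord(entry) &&
      typeof entry["signature"] === "string" &&
      typeof entry["url"] === "string"
    );
  });
}
```

A check like this could run in CI before the manifest is attached to a release, so a malformed file never reaches installed clients.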
v0.1.2-test — first signed macOS build
First end-to-end test of the macOS code signing + notarization + stapling pipeline (#23). Bundles for all platforms; the macOS DMG will be attached automatically once CI finishes (≈6–10 min).
What's new
- macOS DMG is now signed with Developer ID Application: Programow LTDA, notarized by Apple, and stapled — opens cleanly on any Mac without Gatekeeper warnings, online or offline.
- New `/build-release` slash command for reproducing the release ritual locally.
Note
This is tagged with the -test suffix because it's the first release through the signed pipeline. If everything looks good, future releases will drop the suffix.
🤖 Generated with Claude Code
v0.1.1 — Linux + Windows bundles
First release with the full three-platform matrix. v0.1.0 shipped macOS only because Linux + Windows were gated out of the release pipeline — those gates are now removed, the per-platform build issues are fixed, and this release ships bundles for all three.
What's new since v0.1.0
Release pipeline
- Linux + Windows entries re-enabled in the release matrix (#17).
- `libxdo-dev` added to the Linux apt block so the `enigo` synthetic-keystroke crate links.
- `icon.ico` + `.icns` + sized PNGs wired into `tauri.conf.json`'s bundle config (#19); Windows bundling was failing with `Couldn't find a .ico icon`.
- Rust toolchain pinned per-platform (#21): stable for macOS + Windows; `1.88.0` for Linux to dodge a `glib v0.18.5` trait-solver breakage on current stable while satisfying `darling`'s 1.88 MSRV and the `edition2024` stabilisation floor.
Downloads
- macOS: `bluemacaw_0.1.0_universal.dmg`
- Linux: `.AppImage` / `.deb` / `.rpm`
- Windows: `.msi` / `-setup.exe`
(Bundle filenames still carry the 0.1.0 version string — the build identifier in tauri.conf.json wasn't bumped before this tag was cut. Functionality and content are the v0.1.1 release.)
Known issue
The macOS bundle is ad-hoc signed (Apple Developer ID + notarization are queued for Plan D). On macOS Sonoma / Sequoia this can cause the microphone permission prompt to re-appear across launches. Workaround: drag the app from the DMG into /Applications before launching, and once granted, the entry sticks for the lifetime of that installed copy.
v0.1.0 — bluemacaw
First release under the bluemacaw name.
Highlights since v0.0.1
Rebrand
- Full rebrand from "Vox Era" / "Ada" to bluemacaw: new name, identifier `com.vhtechnology.bluemacaw`, all landing references repointed to `programow/ada`.
- Soft, rounded brand identity rolled across the desktop UI: Dashboard, Settings, Recording status pill, overlay pill, theme variables.
Desktop
- New first-launch permission onboarding screen with per-platform rows (macOS: Microphone + Accessibility + Input Monitoring; Windows / Linux: Microphone; Wayland info banner where applicable).
- Macros: redesigned onboarding aligned with the brand palette and a new behaviour — clicking Grant for Accessibility / Input Monitoring on macOS now surfaces the Restart bluemacaw banner immediately, since TCC doesn't propagate those grants into a running process.
- Light / Dark / System theme support across the main window and overlay.
- Cancel an in-progress recording via Cmd+Esc or the overlay X button.
- Overlay fixes: stale state transitions dropped, audio meter recalibrated for built-in mics, rescue from unreachable coordinates.
Landing
- Theme parity with the desktop app.
- Downloads now resolve from the release `latest.json` manifest, so the homepage cards always point at the newest assets.
- GitHub Pages deploy wired up.
- `/changelog` fixed (was 404ing because of the pre-rename repo slug).
CI / dev workflow
- Cross-platform CI matrix made green: Ubuntu time-jitter in the db purge boundary test, Windows path bug in the migrations test harness, Ubuntu's missing libxdo for enigo, and a mic-select hydration race.
Downloads
The macOS `.dmg` is attached below. Linux + Windows release builds are still gated in release.yml; once the matrix is re-enabled they will appear on the next release.
v0.0.3-test
feat(landing): resolve download URLs from release manifest
Replaces the hard-coded GitHub releases page link with a runtime
fetch of latest.json (published by the release workflow as an asset
on each GH release). The Download component is now a client
component that:
- shows a fallback link to /releases/latest before the manifest
resolves, keeping the section usable if JS is slow
- renders a real download link per platform when the manifest URL
is non-null
- renders a non-clickable "Coming soon" card when the URL is null
(windows + linux until those bundles ship)
- displays the resolved version beneath the macOS card
Adds packages/landing/src/lib/manifest.ts mirroring the shape of
lib/github.ts and tests following the same vi.stubGlobal('fetch')
pattern used in github.test.ts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
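The card logic described in this commit can be sketched as a pure function. This is an illustration under stated assumptions: the `LatestManifest` shape, the `Card` union, the `resolveCard` name, and the fallback URL are all hypothetical; the notes only say that each platform URL may be null and that the fallback points at `/releases/latest`.

```typescript
type Platform = "macos" | "windows" | "linux";

// Assumed manifest shape; the real latest.json layout is not shown in the notes.
interface LatestManifest {
  version: string;
  urls: Record<Platform, string | null>;
}

type Card =
  | { kind: "fallback"; href: string }
  | { kind: "download"; href: string; version?: string }
  | { kind: "coming-soon" };

// Assumed fallback target, built from the repo slug mentioned in these notes.
const FALLBACK_HREF = "https://github.com/programow/ada/releases/latest";

function resolveCard(platform: Platform, manifest: LatestManifest | null): Card {
  // Manifest not resolved yet (or fetch failed): keep the section usable.
  if (manifest === null) return { kind: "fallback", href: FALLBACK_HREF };

  const href = manifest.urls[platform];
  // A null URL means no bundle has shipped for this platform.
  if (href === null) return { kind: "coming-soon" };

  // Only the macOS card displays the resolved version beneath it.
  return platform === "macos"
    ? { kind: "download", href, version: manifest.version }
    : { kind: "download", href };
}
```

Keeping the decision pure leaves the client component responsible only for fetching the manifest and rendering whichever card variant comes back.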
v0.0.1
Plan A: monorepo bootstrap (#3)
* docs: add brainstormed spec for Ada documentation and AI workflow
Approved design covering (1) a docs/ folder with seven topic-focused docs
explaining architecture, build/release, permissions, Whisper integration,
troubleshooting, and the AI development workflow, and (2) four project-local
slash commands (/build-clean, /reset-perms, /dev, /diagnose-mic) that
automate the rituals contributors run by hand today.
Co-Authored-By: WOZCODE <contact@withwoz.com>
* docs: add implementation plan for Ada documentation and AI workflow
15 bite-sized tasks covering the seven docs in docs/, the four
.claude/commands/ slash commands, the README.md and CLAUDE.md
trims, manual acceptance testing of the commands, and the
merge/PR step.
Co-Authored-By: WOZCODE <contact@withwoz.com>
* docs: add docs/ index
* docs: explain architecture, IPC contract, and end-to-end flow
* docs: document the clean build & install ritual with rationale
* docs: explain microphone and accessibility permission requirements
* docs: document Whisper API integration and config.json schema
* docs: add symptom-keyed troubleshooting punch list
* docs: explain when to use each slash command
* feat(claude): add /reset-perms slash command
* feat(claude): add /dev slash command with config.json sanity check
* feat(claude): add /diagnose-mic read-only diagnostic command
* feat(claude): add /build-clean slash command for the full ritual
* docs: trim README and link to docs/ for deeper guides
* docs: replace inline build ritual in CLAUDE.md with link to /build-clean
* chore: stop tracking AI workflow specs/plans
These are scaffolding artifacts from the brainstorming workflow rather than long-lived project docs. Keep them locally for ongoing use but exclude from the repo.
* feat: add documentation and plans
* feat: add new skills for Tauri app development and release management, update .gitignore
* chore: update .gitignore to exclude .DS_Store files
* docs: update plans to reflect branch changes from `tech-stack` to `execution` for the Vox Era project
* chore: archive legacy Electron app under legacy/electron/
* feat: root package.json with Bun workspaces and dev tooling
* feat: add empty package skeletons for desktop, landing, and infra
* feat: add shared strict TypeScript base config
* feat: add Biome lint+format config
* feat: add commitlint config enforcing conventional commits
* feat: add lefthook hooks for pre-commit, commit-msg, pre-push
* chore: update .gitignore for monorepo build outputs and secrets
* docs: add Apache 2.0 license
* docs: rewrite root README for Vox Era monorepo
* docs: rewrite CLAUDE.md for Vox Era monorepo
* docs: add CONTRIBUTING.md with branching, commits, and doc-update rules
* docs: refresh docs index for Vox Era plan structure
* ci: add base CI workflow with lint and typecheck
* chore: add PR template with doc-update checklist
* feat(claude): add /sync-docs slash command for doc-update audits
* chore: confirm hook chain on first real commit
* chore: update .gitignore to include docs/sessions/ directory
* feat(desktop): scaffold Tauri 2 app with React+Vite+Vitest
Deviations from plan B Task 1 forced by ecosystem drift:
- objc2 family bumped to 0.6 (foundation/av-foundation 0.3) since 0.2/0.5 are no longer published.
- macOS infoPlist moved to a separate Info.plist file: Tauri 2 expects a path string, not an inline object.
- Removed opener:default permission left over from create-tauri-app's default capabilities (we don't depend on tauri-plugin-opener).
* test(desktop): configure Vitest with happy-dom and add smoke test
* feat(desktop): add AudioSource trait with permission states and capture types
* feat(desktop): implement cpal-backed MicrophoneSource with WAV encoding
* test(desktop): add MockMicrophoneSource backed by canned WAV bytes
* feat(desktop): per-platform mic + accessibility permission detection
Replaces the NotDetermined stubs in audio/permissions/mod.rs with real
checks per OS:
- macOS: AVCaptureDevice.authorizationStatusForMediaType: + the matching
async request method via objc2-av-foundation 0.3 (with the
AVCaptureDevice / AVMediaFormat / block2 features now enabled).
Accessibility goes through AXIsProcessTrustedWithOptions, linked
directly from ApplicationServices.framework because core-graphics 0.24
only re-exports the screen-capture access pair.
- Windows: reads HKCU\Software\Microsoft\Windows\CurrentVersion\
CapabilityAccessManager\ConsentStore\microphone\Value.
- Linux: probes cpal for a default input device (no consent system
exists at OS level) and treats settings deep links as unsupported.
Also exposes a shared SettingsPanel enum + open_settings_*_panel
helpers for the upcoming UI flows.
* feat(desktop): add Vault trait with KeyringVault and InMemoryVault
* feat(desktop): add SecretKey wrapper that redacts values in Debug output
* feat(desktop): add Settings types with platform-aware default hotkey
* feat(desktop): register SQLite migration 0001 for transcriptions table
* feat(desktop): add transcription repository (insert, list, soft_delete, purge)
* feat(desktop): add stats aggregations (totals, WPM, time saved, top provider/model, streak)
* feat(desktop): add rolling-window retention purge with soft-delete grace period
* feat(desktop): add ShortcutManager trait with standard and mock implementations
* feat(desktop): add macOS Fn-key shortcut via CGEventTap (requires Accessibility)
* feat(desktop): add Clipboard trait with InMemoryClipboard for tests
* feat(desktop): add Paster trait with enigo-based real impl and recording mock
* feat(desktop): add system tray with Open Vox Era and Quit menu items
* feat(desktop): add Tauri capabilities granting required plugin permissions
* feat(desktop): expose Tauri command surface for permissions, recording, secrets, paste
Adapted the plan's commands.rs to module surfaces as built in Sections 2-8:
- Dropped clipboard from AppState; Paster.paste_text already combines
clipboard write + keystroke per spec §6.10.
- EnigoPaster is generic over Clipboard, so the run() wires an Arc<InMemoryClipboard>
through the paster constructor.
- Drives all 12 commands plus tray::build() in setup; AppState managed via
Tauri's State extractor.
* feat(desktop): add typed invoke wrapper and React entry point
* feat(desktop): add provider registry types, 9 stub adapters, and transcribe orchestration
* feat(desktop): add OpenAI provider adapter with listModels and pricing
* feat(desktop): add Azure OpenAI provider adapter contract tests
* feat(desktop): add Groq provider adapter with listModels and pricing
* feat(desktop): add Deepgram provider adapter with pricing and default models
* feat(desktop): add AssemblyAI provider adapter with pricing and default models
* feat(desktop): add ElevenLabs provider adapter with listModels and pricing
* feat(desktop): add Fal provider adapter with pricing and default models
* feat(desktop): add Gladia provider adapter contract tests
* feat(desktop): add Rev.ai provider adapter with pricing and default models
* feat(desktop): set up Tailwind v3 + shadcn helpers with neobrutalism palette
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add main window shell with four-tab navigation
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add Dashboard tab with stats summary cards
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add History tab with provider filter, search, and pagination
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add Settings Providers section with key, model picker, and active toggle
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add Settings Recording section with hotkey, mic picker, and test button
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add Overlay, History, Theme, and Updates settings subsections
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): add overlay window with state-based recording pill
Co-Authored-By: WOZCODE <contact@withwoz.com>
* test(desktop): add audio fixtures with regeneration script
Adds five WAV fixtures used by upcoming functional tests, plus a
regeneration script and synth helper. The script prefers OPENAI_API_KEY
from the environment and falls back to the local Ada config.json; with
VOX_ERA_ALLOW_SYNTH_FALLBACK=1 it synthesizes tone-based placeholders
when the API is unavailable. The currently committed WAVs were generated
in synth-fallback mode because the OpenAI account hit insufficient_quota.
* test(desktop): functional test of full transcribe flow with MSW + audio fixture
Drives the transcribe orchestrator end-to-end with WAV fixtures loaded
from disk, mocking the @tauri-apps/api invoke surface for settings and
secret retrieval and using MSW v2 to intercept the OpenAI HTTP call.
Covers happy-path, long-form audio, non-retriable HTTP errors, and the
missing-API-key branch. Runs under the node environment to avoid the
happy-dom/MSW stream conflict.
* feat(claude): add desktop-relevant slash commands for Vox Era
Adds: dev-desktop, test, test-fast, typecheck, lint, coverage,
add-provider, diagnose. Updates build-clean and reset-perms for
Tauri + com.programow.voxera. Removes Electron-era dev.md and
diagnose-mic.md (superseded by dev-desktop and diagnose).
* docs(desktop): add architecture, permissions, secrets, providers, testing docs
Replaces the Electron-era architecture.md and permissions.md with
their Tauri-aware counterparts. Adds secrets.md (vault threat model
+ zeroize/redaction), providers.md (data-driven contract for the 9
STT adapters + listModels strategies + pricing maintenance),
testing.md (4-layer pyramid + mocking boundaries + audio fixtures),
and packages/desktop/README.md (Rust/TS boundary + plugin list).
Updates development-workflow.md to reflect the new slash commands.
* ci: add desktop test matrix on macos/windows/linux
Expands the existing lint-typecheck job with two new matrix jobs:
test-desktop runs Vitest + cargo test --lib on all three OSes;
build-desktop runs cargo check --release --locked. Linux builds
install the apt deps declared in tauri.conf.json's deb.depends.
Adds packages/desktop/src-tauri/gen/** to biome ignore so locally
generated capability schemas don't fail lint.
* feat(landing): scaffold Next.js 15 App Router with static export
Replace Plan A placeholder with a real Next.js 15 + React 18 scaffold
under packages/landing/. Configure static export (output: 'export') so
bun run build emits out/index.html for Plan D's S3 deploy. Extend the
shared tsconfig.base.json and align scripts (dev/build/typecheck/
test/test:unit) with the workspace conventions used by packages/desktop.
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(landing): configure Tailwind with neobrutalism palette
Add tailwind.config.ts with the same neobrutalism palette used by
packages/desktop (warm yellow main hsl(50 100% 65%), cream bg
hsl(60 100% 95%), black border, 3/5px borders, 4px/6px offset shadows
via boxShadow.neo / neo-lg). Wire the matching CSS variables into
globals.css so all subsequent landing components inherit the same
look as the desktop app.
Co-Authored-By: WOZCODE <contact@withwoz.com>
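As a rough picture of the palette this commit describes, the fragment below sketches the theme extension as a plain object. The HSL values and the 3/5px and 4px/6px sizes come from the commit message; the key names (`main`, `bg`, `border`) and the exact shadow strings are assumptions, and a real `tailwind.config.ts` would export the object as its default export.

```typescript
// Hypothetical theme fragment; key names and shadow strings are assumptions.
const neobrutalismTheme = {
  colors: {
    main: "hsl(50 100% 65%)", // warm yellow
    bg: "hsl(60 100% 95%)", // cream background
    border: "#000000", // black border
  },
  borderWidth: {
    3: "3px",
    5: "5px",
  },
  boxShadow: {
    neo: "4px 4px 0 0 #000000", // hard offset shadow, no blur
    "neo-lg": "6px 6px 0 0 #000000",
  },
};

// In a real config this object would sit under `theme.extend`.
const tailwindConfigSketch = {
  content: ["./src/**/*.{ts,tsx}"],
  theme: { extend: neobrutalismTheme },
};
```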
* feat(landing): add hand-written shadcn UI primitives
Add Button (cva variants + Radix Slot for asChild), Card +
sub-components, Badge (default/outline/muted), and Separator (Radix)
under src/components/ui/, plus src/lib/utils.ts (cn helper) and a
components.json descriptor. Skip the interactive shadcn CLI; instead
mirror the neobrutalism styling already shipped by packages/desktop —
border-3, shadow-neo, bold/uppercase typography — so the landing
visual language stays in sync with the app.
Co-Authored-By: WOZCODE <contact@withwoz.com>
* test(landing): configure Vitest with happy-dom
Wire up Vitest with happy-dom + globals, the @testing-library/jest-dom
matchers via a setup file, and an @ -> src alias mirroring the desktop
package's vitest.config.ts. Add a smoke sanity test that asserts DOM
APIs work, locking in the harness before component tests are added in
later sections.
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(landing): add Footer component with GitHub, privacy, and changelog links
* feat(landing): add Hero with headline and download CTA
* feat(landing): add Demo, Features, ProvidersGrid, PrivacyTeaser, Download sections
* feat(landing): add /privacy page with full security and threat-model explanation
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(landing): add GitHub releases fetcher with empty-state fallback
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(landing): add /changelog page with build-time GitHub fetch and empty state
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(landing): assemble home page from all sections
Co-Authored-By: WOZCODE <contact@withwoz.com>
* test(landing): add Playwright E2E for home, privacy, changelog
* ci: add landing PR preview workflow (active once Plan D provisions AWS)
* ci: add test-landing job (Vitest + Playwright)
* feat(claude): add /dev-landing slash command
* docs(landing): add package README
* chore: ignore Playwright + Next.js build artifacts in Biome lint
* feat(claude): switch project skill to pulumi-cloud-iac
* chore: rename macOS bundle id to com.vhtechnology.voxera
Vendor namespace switched from programow → vhtechnology. Affects:
- packages/desktop/src-tauri/tauri.conf.json (the source of truth)
- .claude/commands/build-clean.md, diagnose.md, reset-perms.md
- docs/development-workflow.md, docs/permissions.md
- packages/desktop/README.md
cargo check, bun run typecheck, bun run lint all clean. cargo test --lib
still 51 passed. The bundle id is only consumed by macOS at runtime (TCC
key, app data dir lookup) and by the slash commands; no Rust source
references it directly.
* docs: pivot Pulumi backend to Pulumi Cloud + change vendor bundle id
- Spec, Plan A, Plan B, Plan D: drop self-hosted S3 state bucket + AWS KMS
secrets bootstrap; switch to Pulumi Cloud under personal account
guilherme-vozniak-a-gmail-com (stack: guilherme-vozniak-a-gmail-com/vox-era-infra/prod).
Pulumi Cloud handles state and secrets server-side.
- Plan D Task 4 reduced from "AWS bootstrap for Pulumi state" (S3 bucket +
KMS key) to "Pulumi Cloud login + Cloudflare API token" (two simple manual
steps).
- Plan D Task 5 Steps 3-5: drop backend.url block from Pulumi.yaml, drop
--secrets-provider flag from stack init, drop AWS_PROFILE prefix from
pulumi config commands.
- GitHub Secrets list gains PULUMI_ACCESS_TOKEN; nothing dropped.
- Bundle id changed from com.programow.voxera to com.vhtechnology.voxera
in spec, Plan A, and Plan B (3 references in Plan B's scaffold + tauri.conf
template + reset-perms note).
- Risk row "Pulumi state bucket loss/corruption" replaced with "Pulumi Cloud
account loss / takeover".
* feat(desktop): land first usable recording loop
Make Vox Era functional end-to-end for the first time: configure an
API key, pick a model, press the hotkey, and have the transcription
pasted at the cursor.
Three coordinated pieces:
- API Keys + Model Configs: replaces the static one-card-per-provider
layout with two persisted concepts. SQLite migration 0002 adds
api_keys, model_configs, and an app_state key/value table. Keychain
account is now an opaque UUID per credential, so multiple keys per
provider are supported. New Add/Delete dialogs (Radix Dialog), and
a click-to-activate model-config card. transcribe() now resolves
the active config + secret instead of the previously-broken
invoke('get_setting', ...) path.
- Coming-soon scaffolding: dimmed cards + yellow badge for every
Settings panel that isn't wired yet (Recording, Overlay, History,
Theme, Updates), plus dim-only treatment for the Dashboard and
History tabs. Tabs primitive: hide inactive panels (data-state)
and drop the wrapper box around triggers.
- Global hotkey + paste pipeline: register Cmd+Shift+Space (macOS) /
Ctrl+Shift+Space (others) via tauri-plugin-global-shortcut in
setup; emit a Tauri event that drives a press-to-toggle state
machine in JS (idle -> recording -> transcribing -> idle, with
errors). Recording state pill in the main window header. The
paste pipeline now uses the real TauriClipboard (was the in-memory
test stub), verifies the clipboard read-back after writing, and
sleeps 80ms before the keystroke so the macOS pboard daemon has
time to propagate to other processes.
Out of scope, flagged in code:
- Hotkey customisation UI (badged "Coming soon").
- Fn-alone on macOS via CGEventTap.
- Overlay window state animation.
- Dashboard stats + History list reading from SQLite.
- Reliable paste into terminal-style apps such as Claude Code itself
(manual Cmd+V works; synthesised Cmd+V occasionally races on
modifier-hold timing).
50 Rust tests, 150 desktop Vitest tests, full typecheck and Biome
clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
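The press-to-toggle state machine mentioned above (idle -> recording -> transcribing -> idle, with errors) might be sketched as a small reducer. The event names and the decision to ignore hotkey presses while transcribing are assumptions made for illustration; the commit does not spell out those details.

```typescript
type RecordingState =
  | { kind: "idle" }
  | { kind: "recording" }
  | { kind: "transcribing" }
  | { kind: "error"; message: string };

// Hypothetical event names; the real hook reacts to a Tauri event.
type RecordingEvent =
  | { type: "hotkey" }
  | { type: "transcribed" }
  | { type: "failed"; message: string };

function nextState(state: RecordingState, event: RecordingEvent): RecordingState {
  if (event.type === "hotkey") {
    // Second press while recording stops capture and starts transcription.
    if (state.kind === "recording") return { kind: "transcribing" };
    // Assumption: presses are ignored while a transcription is in flight.
    if (state.kind === "transcribing") return state;
    // From idle or error, a press starts a fresh recording.
    return { kind: "recording" };
  }
  if (event.type === "transcribed") {
    return state.kind === "transcribing" ? { kind: "idle" } : state;
  }
  // "failed": surface the error unless nothing was in progress.
  return state.kind === "idle" ? state : { kind: "error", message: event.message };
}
```

Modelling the transitions as a pure function makes them easy to unit-test without a running Tauri window.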
* refactor(desktop/paste): drop clipboard read-back verify
The verify loop was added during debugging and always succeeded on
the first attempt across many runs — confirming the local NSPasteboard
write is synchronous. The remaining race was inter-process pboard
propagation, fixed by the 80 ms settle delay; the verify itself was
load-bearing only as a logging tool. Drop it.
Errors still log on the failure paths (clipboard write, enigo
keystroke) so a real regression won't go silent.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
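The ordering this commit settles on (write the clipboard, wait 80 ms, send the keystroke, log on either failure) can be illustrated as follows. The real pipeline is Rust; this TypeScript sketch uses hypothetical injected dependencies (`PasteDeps`) with synchronous stand-ins purely to show the sequence.

```typescript
// Hypothetical dependency surface; names are illustrative only.
interface PasteDeps {
  writeClipboard(text: string): void; // may throw
  sleep(ms: number): void;
  sendPasteKeystroke(): void; // may throw
  logError(message: string): void;
}

// Settle delay from the commit message: lets the pasteboard propagate to other processes.
const SETTLE_DELAY_MS = 80;

function pasteText(text: string, deps: PasteDeps): boolean {
  try {
    deps.writeClipboard(text);
  } catch (err) {
    deps.logError(`clipboard write failed: ${String(err)}`);
    return false;
  }

  // No read-back verification: the local write is treated as synchronous.
  deps.sleep(SETTLE_DELAY_MS);

  try {
    deps.sendPasteKeystroke();
  } catch (err) {
    deps.logError(`paste keystroke failed: ${String(err)}`);
    return false;
  }
  return true;
}
```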
* feat(desktop): close-to-tray with dynamic Dock icon + visible tray
Replicate the legacy Electron app's tray-resident behaviour on macOS:
- Intercept `WindowEvent::CloseRequested` on the main window in
`lib.rs` setup. Prevent the close, hide the window, and switch
the activation policy to `Accessory` so the Dock icon
disappears. The app stays alive — global hotkey, microphone
session, recording loop all continue.
- The tray "Open Vox Era" menu item now switches the activation
policy back to `Regular` *before* showing/focusing the window,
so the Dock icon reappears in lockstep with the window.
- Set an actual icon on `TrayIconBuilder` via
`app.default_window_icon().clone()`. Without this the tray
entry was registered but rendered as nothing in the menu bar —
invisible since the start, only noticed now that close-to-tray
makes it the only re-entry point.
Cmd+Q still works while the window is visible (default macOS
menu bar). When the window is hidden there is no Vox Era menu
bar — same as the legacy app — so quit goes through the tray.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(desktop/tray): remove unwired Switch provider menu item
The "Switch provider →" menu item was a placeholder marker for a
submenu that was never wired. Clicking it does nothing visible, and
the submenu itself depends on settings + history aggregation that
isn't built yet. Drop the item, the SwitchProvider variant, the
constant, and the corresponding dispatch test until the actual
submenu lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(desktop/overlay): wire transparent recording indicator pill
Make the floating overlay window actually do its job: when the
recording state machine in the main window enters recording or
transcribing, a small dark translucent pill appears at the
bottom-center of the primary monitor. When recording is idle or
errored, the overlay hides.
The bridge is JS-to-JS via a Tauri event:
- New `lib/overlay-bridge.ts` exposes `publishRecordingState(s)` —
emits a `vox-era://recording-state` event and toggles the
overlay window's visibility based on the state kind. Side-
effect-only, never throws so an overlay quirk can't break the
recording flow.
- `useHotkeyRecording` accepts an injectable `publish` dep and
calls it after every setState transition. Default uses the real
bridge.
- New `windows/overlay/OverlayApp.tsx` listens for the event,
positions the window once at bottom-center on mount via
`getCurrentWindow().setPosition()`, and renders OverlayWindow
with the mapped state.
- `App.tsx` overlay branch now mounts <OverlayApp /> instead of
the previously hardcoded idle-state OverlayWindow.
Visual redesign of OverlayWindow itself:
- Drop the heavy neo-brutal red box + thick border + drop shadow.
- Drop the broken Stop / Record / Paste buttons (and the props,
the idle and resultPreview states that nothing maps to).
- Replace with a small dark translucent pill (bg-black/55 +
backdrop-blur-md), ring-1 for edge definition, no drop shadow.
- Recording: red pulsing dot + 4-bar staggered waveform + label.
- Transcribing: 3 bouncing dots + label.
`main.tsx` clears `<html>` and `<body>` background colors when
the URL is `?window=overlay` — the global Vox Era cream theme
otherwise paints a solid rectangle behind the floating pill,
defeating the macOS-native transparent-window effect.
`capabilities/default.json` gains `core:window:allow-set-position`
so the overlay can self-position on mount.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
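The bridge behaviour described in this commit (emit the state event, toggle overlay visibility by state kind, never throw) could look roughly like the sketch below. The real helper calls asynchronous Tauri APIs; here they are replaced by a hypothetical synchronous `OverlayDeps` interface so the control flow stays visible. Only the event name is taken from the commit message.

```typescript
type RecordingStateKind = "idle" | "recording" | "transcribing" | "error";

// Hypothetical stand-ins for the Tauri event and window calls.
interface OverlayDeps {
  emit(event: string, payload: { kind: RecordingStateKind }): void;
  showOverlay(): void;
  hideOverlay(): void;
}

const RECORDING_STATE_EVENT = "vox-era://recording-state";

// Runs a side effect and swallows any error, so an overlay quirk
// cannot break the recording flow.
function safely(effect: () => void): void {
  try {
    effect();
  } catch {
    // Intentionally ignored: the bridge is side-effect-only.
  }
}

function publishRecordingState(kind: RecordingStateKind, deps: OverlayDeps): void {
  safely(() => deps.emit(RECORDING_STATE_EVENT, { kind }));
  const visible = kind === "recording" || kind === "transcribing";
  safely(() => (visible ? deps.showOverlay() : deps.hideOverlay()));
}
```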
* feat(desktop/overlay): persist enable toggle, drop position option
Settings → Overlay was a dimmed Coming-Soon card with two unwired
useState controls. Now the enable switch actually does something:
- New `getOverlayEnabled` / `setOverlayEnabled` in `lib/db.ts`
back the toggle against the existing `app_state` table under key
`overlay_enabled`. Default `true` when no row is present.
- `lib/overlay-bridge.ts` reads the flag inside
`publishRecordingState` before each `overlay.show()`. When
disabled the bridge hides the overlay window instead, but still
emits the `vox-era://recording-state` event so the main-window
status pill keeps working. A new `hideOverlayWindow` helper lets
the settings toggle hide the overlay immediately when the user
flips it off mid-session. If the DB read itself fails the bridge
defaults to showing — never silently strand the user without
feedback.
- `windows/main/SettingsOverlay.tsx` is rewritten as a real
control: loads the persisted value on mount, disables the switch
briefly until the load completes, persists on every change, and
hides the overlay window on toggle-off. The Position select is
removed — overlay always renders bottom-center via OverlayApp's
setPosition().
- `MainWindow.test.tsx` mocks gain `getOverlayEnabled` /
`setOverlayEnabled` and the bridge module so the SettingsOverlay
child can mount inertly under the existing 5 main-window tests.
Toggling on mid-recording does not retroactively show the overlay
— the bridge only re-evaluates on state transitions. Acceptable
wart; user can hit the hotkey again if they want it back. Recorded
in the plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
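The defaulting rules in this commit (enabled when no row exists, and fall back to showing if the DB read fails) can be sketched as two small functions. The stored string format (`"true"` / `"false"`) and the function names are assumptions; the commit only names the `overlay_enabled` key.

```typescript
type RecordingStateKind = "idle" | "recording" | "transcribing" | "error";

// Assumption: the app_state row stores the flag as the string "true" or "false".
function parseOverlayEnabled(row: string | null): boolean {
  // No row yet means the user never touched the toggle: default to enabled.
  return row === null ? true : row === "true";
}

// Hypothetical decision helper; `readEnabledRow` stands in for the DB read.
function shouldShowOverlay(
  kind: RecordingStateKind,
  readEnabledRow: () => string | null,
): boolean {
  const active = kind === "recording" || kind === "transcribing";
  if (!active) return false;
  try {
    return parseOverlayEnabled(readEnabledRow());
  } catch {
    // DB read failed: prefer showing the overlay over leaving the user without feedback.
    return true;
  }
}
```

Note that the state event is still emitted regardless of this result, so the main-window status pill is unaffected by the toggle.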
* feat(desktop/overlay): drag to reposition + persist across launches
Let the user drag the overlay pill anywhere on screen and have the
new position remembered on the next show.
- `lib/db.ts` gains `getOverlayPosition` / `setOverlayPosition`
(`OverlayPosition = {x, y}`) backed by two new `app_state` rows
`overlay_x` and `overlay_y`. `null` returned when either row is
missing or the stored value isn't a finite integer; values are
rounded before persisting.
- `windows/overlay/OverlayWindow.tsx` makes the pill draggable.
`data-tauri-drag-region` is set, but on macOS transparent
decorations:false windows it's not always honoured — so we also
attach an `onMouseDown` that calls `getCurrentWindow().startDragging()`
for reliable behaviour. The grab/grabbing cursor classes provide
the affordance. Decorative children (dot, waveform, dots, label)
use `pointer-events-none` so the pill div reliably receives the
click.
- `windows/overlay/OverlayApp.tsx` refactors `useOverlayInitialPosition`
to read the saved position first and only fall back to bottom-center
when none is stored. New `useOverlayPositionPersistence` hook
subscribes to `getCurrentWindow().onMoved`, debounces 400 ms after
the last move event, and writes via `setOverlayPosition`.
- `capabilities/default.json` gains `core:window:allow-start-dragging`
so the explicit `startDragging()` call is permitted.
Tests +12: 6 in `db.test.ts` (null/partial/full/non-numeric/upsert/
round) and 6 in the overlay tests (drag-region attribute on each
state, restore-vs-default mount, save-on-move, debounce coalesce,
unmount cancels pending save). Total 189 desktop tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
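Two pieces of this commit lend themselves to a short sketch: parsing the stored `overlay_x` / `overlay_y` rows into a position (or `null`), and debouncing saves for 400 ms after the last move. The code below is an illustration, not the project's implementation; `Scheduler`, `parseOverlayPosition`, and `createDebouncedSaver` are hypothetical names, and the timer is injected so the example does not depend on a particular runtime.

```typescript
interface OverlayPosition {
  x: number;
  y: number;
}

function parseCoordinate(raw: string | null): number | null {
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  // Rejects NaN, Infinity, and fractional values in one check.
  return Number.isInteger(value) ? value : null;
}

// Returns null when either row is missing or is not a finite integer.
function parseOverlayPosition(xRow: string | null, yRow: string | null): OverlayPosition | null {
  const x = parseCoordinate(xRow);
  const y = parseCoordinate(yRow);
  return x === null || y === null ? null : { x, y };
}

// Injected timer so the sketch is testable; in the app this would wrap setTimeout / clearTimeout.
interface Scheduler {
  set(callback: () => void, delayMs: number): number;
  clear(handle: number): void;
}

function createDebouncedSaver(
  save: (position: OverlayPosition) => void,
  scheduler: Scheduler,
  delayMs = 400,
) {
  let handle: number | null = null;
  return {
    // Call on every move event; only the last position within the window is saved.
    moved(position: OverlayPosition): void {
      if (handle !== null) scheduler.clear(handle);
      handle = scheduler.set(() => {
        handle = null;
        // Values are rounded before persisting.
        save({ x: Math.round(position.x), y: Math.round(position.y) });
      }, delayMs);
    },
    // Call on unmount so a pending save does not fire afterwards.
    cancel(): void {
      if (handle !== null) scheduler.clear(handle);
      handle = null;
    },
  };
}
```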
* feat(desktop/overlay): position-setup mode, reset, toast, drag handle
Lets the user position the overlay without recording, reset it back
to default, and get visual confirmation when either lands. Also
restricts the OS drag region to a small handle on the left of the
pill so accidental clicks elsewhere don't move the window.
UI in `windows/main/SettingsOverlay.tsx`:
- New "Reset position" button — calls `resetOverlayPosition()` and
toasts "Overlay position reset".
- New "Position overlay" toggle button — flips to "Use this
position" while active. Entering calls `enterOverlayPositionSetup`
to show the overlay; the toggle is reverted by an external
setup-off event so the auto-exit path can drive the same UI.
Bridge in `lib/overlay-bridge.ts`:
- Three new event constants and three new helpers:
`enterOverlayPositionSetup`, `exitOverlayPositionSetup({hide,
reason})`, `resetOverlayPosition`. The setup-off event carries a
payload `{ reason: 'manual' | 'idle' | 'recording-wins' }` so the
main window can decide whether to surface a toast.
Overlay state machine in `windows/overlay/OverlayApp.tsx`:
- New `useOverlayPositionSetup` hook tracks `setupActive` from the
setup events. Renders the new `'positioning'` pill while active.
- 3s idle-exit timer arms on the first `onMoved` after entering
setup and resets on every subsequent move; on expiry it emits
setup-off with `reason: 'idle'`.
- Recording wins: a recording / transcribing transition during
setup emits setup-off with `reason: 'recording-wins'` and keeps
the window visible (no hide).
- New `useOverlayResetHandler` listens for the reset event and
re-applies the bottom-center default via `setPosition`.
Drag handle in `windows/overlay/OverlayWindow.tsx`:
- New `DragHandle` carries `data-tauri-drag-region`, the explicit
`startDragging()` `onMouseDown`, the grab/grabbing cursor, and a
`⠿` glyph. It's rendered as the first child of every visible
pill. The outer pill has no drag attributes — clicking the icon
/ waveform / label is no longer a drag handle.
- New `'positioning'` `OverlayState` variant with a small dark pill
containing the handle + "Drag to position" label.
Toast in `components/ui/toast.tsx` (new):
- Small portal-rendered status pill: top-right, `bg-green-500/90`,
white text, auto-dismisses after 2.5s. Uses an `<output>`
element with `aria-live=polite`. Wired into SettingsOverlay's
setup-off listener (toasts on `manual` and `idle`, silent on
`recording-wins`) and into the Reset button.
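A minimal sketch of how the main window might map the setup-off payload to a toast. The payload shape comes from the bridge description above; `toastForSetupOff` and the toast wording are assumptions made for illustration.

```typescript
// Payload carried by the setup-off event, as described for the bridge.
type SetupOffReason = 'manual' | 'idle' | 'recording-wins';

interface SetupOffPayload {
  reason: SetupOffReason;
}

// Hypothetical helper: returns the toast text to show, or null when the
// exit should stay silent.
function toastForSetupOff(payload: SetupOffPayload): string | null {
  switch (payload.reason) {
    case 'manual':
    case 'idle':
      return 'Overlay position saved'; // assumed wording, not from the source
    case 'recording-wins':
      return null; // a recording took over, so no toast is surfaced
  }
}
```

Keeping the decision in one pure function means the manual toggle and the auto-exit path can share the same listener.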
`MainWindow.test.tsx` mock list extended for the new bridge
exports so the existing tab-switch tests keep mounting cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(desktop/overlay): add a Stop button to the recording pill
Lets the user stop a recording with the mouse instead of the
hotkey. The button piggy-backs on the existing recording flow:
- New `requestRecordingToggle()` helper in `lib/overlay-bridge.ts`
emits `vox-era://shortcut-toggle` — the same event the OS-level
hotkey handler fires. The main window's `useHotkeyRecording`
listener picks it up and runs the controller, which from the
`recording` state stops, transcribes, pastes, and returns to
idle. No new state machine, no new listener.
- New `StopButton` rendered on the right of the recording pill
only. Small dark circle with a square glyph. Uses
`aria-label="Stop recording"` for tests + a11y. Hidden on
transcribing and positioning.
- `OverlayApp` wires `onStop={() => requestRecordingToggle()}`
through to `OverlayWindow`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(desktop/overlay): convert overlay to non-activating NSPanel
Clicking the overlay's Stop button or drag handle previously made
Vox Era frontmost, which yanked focus from the app the user was
dictating into and broke the paste-into-active-app flow. Match
what every other macOS dictation tool does and re-class the
overlay's NSWindow as a non-activating NSPanel.
- New `objc2-app-kit` macOS dependency.
- New `src/overlay_panel.rs` (macOS only). `make_overlay_nonactivating`
grabs the underlying NSWindow via `webviewWindow.ns_window()`,
re-classes it to NSPanel via `AnyObject::set_class` (the
object_setClass runtime call wrapped by objc2 — `setClass:` is
not a public NSObject selector and panics with unrecognized-
selector). It then sets:
* styleMask |= NSWindowStyleMaskNonactivatingPanel
* level = NSFloatingWindowLevel
* hidesOnDeactivate = false
* collectionBehavior =
CanJoinAllSpaces | Stationary | FullScreenAuxiliary
* becomesKeyOnlyIfNeeded = true
- `lib.rs` setup calls the conversion on the "overlay" webview
window after `tray::build`, gated to macOS.
After this, clicks land in the overlay without the owning app
becoming frontmost. Vox Era stays in the background; whatever app
the user was using stays focused; `paste_text` injects Cmd+V into
the right place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: stub Electron-era build and troubleshooting docs pending Plan D rewrite
Both files were Electron-era content (electron-builder, paste-helper.swift,
com.programow.ada). Replaced with brief deprecation stubs pointing at the
current slash commands and the docs that are accurate today (architecture,
permissions). Plan D rewrites both with full Tauri-era content as part of
its release-pipeline section.
Also removes orphan create-tauri-app scaffold files no code references:
- packages/desktop/src/App.css
- packages/desktop/src/assets/react.svg
bun run lint, bun run typecheck, bun run test (93/93) all clean.
* fix(desktop): close recording loop end-to-end (history + auto-activate)
Three real bugs in the recording loop after PR #4:
1. addModelConfig didn't auto-activate the first config when no active
one existed, so Settings → Model Configs left transcription stuck
on "No model selected" until the user discovered the click-row-to-
activate UX. Now the first config you add becomes active automatically;
subsequent adds leave the existing active selection alone.
2. The recording controller never wrote to the transcriptions table.
Plan B Section 6 specced history persistence, but cce2228's loop
stopped at pasteText. Added saveTranscription + listTranscriptions
to lib/db.ts and wired the controller to save after paste succeeds.
Failure to save is logged, not propagated, so the user still sees
the pasted text.
3. MainWindow.tsx hardcoded <History entries={[]} />. Now loads from
listTranscriptions in a useEffect that re-runs whenever the recording
state returns to idle, so completed transcriptions show up immediately.
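The auto-activate rule from item 1 can be sketched as a pure state update. This is an in-memory illustration only: the real `addModelConfig` persists through `lib/db.ts`, and the `ModelConfig` / `ModelConfigState` shapes are assumptions.

```typescript
// Hypothetical shapes; the real rows live in the database.
interface ModelConfig {
  id: string;
  providerId: string;
  modelId: string;
}

interface ModelConfigState {
  configs: ModelConfig[];
  activeId: string | null;
}

// Adds a config. It becomes active only when nothing is active yet, so
// later adds leave the user's existing selection alone.
function addModelConfig(state: ModelConfigState, config: ModelConfig): ModelConfigState {
  return {
    configs: [...state.configs, config],
    activeId: state.activeId ?? config.id,
  };
}
```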
Tests: 223 desktop passing (added two coverage tests for addModelConfig's
auto-activate behavior).
* docs: spec for recording settings + transcription history features
* docs: implementation plan for recording settings + transcription history
* feat(desktop): add hotkey combo string parser/formatter
Introduce shortcut::parse with parse_combo / format_combo that convert
between human-readable combo strings (e.g. "Cmd+Shift+Space") and
tauri_plugin_global_shortcut::Shortcut. Cmd/Command/Meta/Super/Win all
normalize to Modifiers::SUPER, matching global-hotkey's canonical
cross-platform representation of the Command/Windows/Super key.
Update default_record_shortcut in lib.rs to use Modifiers::SUPER on
macOS so the registered default hotkey matches what the parser produces.
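The alias normalization can be illustrated in TypeScript as follows. This is a sketch of the idea, not the Rust `shortcut::parse` module: the `Combo` shape, the Ctrl/Alt/Option aliases, and the display labels are assumptions, and the real code produces a `tauri_plugin_global_shortcut::Shortcut`.

```typescript
type Modifier = 'SUPER' | 'CONTROL' | 'ALT' | 'SHIFT';

interface Combo {
  modifiers: Modifier[];
  key: string;
}

// Cmd/Command/Meta/Super/Win all collapse to SUPER, as described above.
// The remaining aliases are assumed for the sake of the example.
const MODIFIER_ALIASES: Record<string, Modifier> = {
  cmd: 'SUPER', command: 'SUPER', meta: 'SUPER', super: 'SUPER', win: 'SUPER',
  ctrl: 'CONTROL', control: 'CONTROL',
  alt: 'ALT', option: 'ALT',
  shift: 'SHIFT',
};

// Assumed macOS-style display labels.
const MODIFIER_LABELS: Record<Modifier, string> = {
  SUPER: 'Cmd', CONTROL: 'Ctrl', ALT: 'Alt', SHIFT: 'Shift',
};

function parseCombo(input: string): Combo {
  const tokens = input.split('+').map((token) => token.trim());
  const key = tokens.pop(); // the last token is the non-modifier key
  if (key === undefined || key === '' || tokens.some((token) => token === '')) {
    throw new Error(`invalid combo: "${input}"`);
  }
  const modifiers: Modifier[] = [];
  for (const token of tokens) {
    const modifier = MODIFIER_ALIASES[token.toLowerCase()];
    if (modifier === undefined) throw new Error(`unknown modifier: "${token}"`);
    if (modifiers.indexOf(modifier) === -1) modifiers.push(modifier);
  }
  return { modifiers, key };
}

function formatCombo(combo: Combo): string {
  return [...combo.modifiers.map((m) => MODIFIER_LABELS[m]), combo.key].join('+');
}
```

Under these assumptions, `formatCombo(parseCombo('Win+Shift+Space'))` yields `Cmd+Shift+Space`, which is why the registered default has to use the same SUPER modifier the parser produces.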
* feat(desktop): add list_audio_input_devices Tauri command
* feat(desktop): start_recording accepts optional device_id
* feat(desktop): dynamic hotkey registration via register_hotkey command
* feat(desktop): typed wrappers for new audio + hotkey commands
* feat(desktop): persist mic device + hotkey combo in app_state
* feat(desktop): add HotkeyInput capture component
* feat(desktop): wire SettingsRecording with mic picker, hotkey capture, test playback
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): register persisted hotkey on app startup
* fix(desktop): satisfy noUncheckedIndexedAccess in SettingsRecording test
* feat(desktop): soft + hard + restore + clear-all transcription deletes
* feat(desktop): retention picker storage + purgeOlderThan sweep helper
* feat(desktop): getHistoryStats aggregations
* feat(desktop): export helpers for .txt + .md (per-row + bulk)
* feat(desktop): per-row Copy/Export/Delete + bulk export on History tab
* feat(desktop): wire SettingsHistory with retention picker + Clear all
Co-Authored-By: WOZCODE <contact@withwoz.com>
* feat(desktop): dashboard renders 5 stats from getHistoryStats
* feat(desktop): daily retention sweep on app mount
* feat(desktop): wire main window soft-delete-with-undo + dashboard refresh
* chore(desktop): remove dead Rust history modules (now in lib/db.ts)
* fix(desktop): downmix multichannel mic input to mono (slow-motion playback)
* fix(desktop): release global shortcut during HotkeyInput capture
* feat(desktop): wire macOS Fn-key shortcut via CGEventTap with Use Fn button
* chore(desktop): add Fn-tap diagnostic logging + info-level default in dev
* feat(desktop): auto-manage AppleFnUsageType when user picks Fn shortcut
* fix(desktop): handle AssemblyAI speech_models migration + new model ids
* fix(desktop): drop retired distil-whisper-large-v3-en + add legacy alias
* ci: daily smoke test against every configured STT provider
Catches silent server-side API breakages (param renames, retired model
ids) before users hit them. Iterates PROVIDERS, hits each one with one
transcription of the hello-world.wav fixture, and exits 1 if any provider
errors. Providers whose secret isn't set are skipped, so users can opt in
per provider via repo secrets (SMOKE_OPENAI_KEY, SMOKE_GROQ_KEY, etc.).
Azure OpenAI is excluded — it needs a deployment name, not just a key.
Runs daily at 07:00 UTC on cron, plus workflow_dispatch for manual
re-runs, plus on push to main when provider configs or this workflow
itself change. Concurrency-gated so consecutive triggers cancel.
The hello-world.wav fixture is currently synthetic tones (OpenAI TTS
quota was exhausted when fixtures were regenerated) — that's fine for
catching API-contract breakages, which is the whole point. If a provider
rejects non-speech audio that's something to handle in the smoke script,
not a real bug.
* fix(desktop): surface actionable error when paste needs Accessibility
Synthetic Cmd+V via enigo silently no-ops when Vox Era lacks the
Accessibility permission on macOS — the transcribed text ends up on
the clipboard but never lands in the focused app, with no signal to
the user. Probe the permission before issuing the keystroke and
return a structured `accessibility-required:` error so the renderer
can render a clear, actionable message. The recording controller now
catches paste failures, still saves the transcription to history,
and tells the user to grant Accessibility (or just hit Cmd+V — the
text is already on the clipboard).
* feat(desktop): live audio level meter in recording overlay
The recording pill previously showed a static blinking dot plus four
hardcoded animated bars — visually busy but disconnected from what
the mic is actually capturing, so users couldn't tell if their voice
was being picked up.
Now: cpal callbacks compute the peak amplitude of each downmixed mono
chunk and merge it into a per-session `peak_level: Arc<Mutex<f32>>`
on the active session. A new `get_recording_level` Tauri command
returns the peak and atomically resets it to 0.0, so polling at ~12 Hz
yields a meter that naturally tracks recent loudness instead of
saturating. The overlay app polls every 80 ms while in the recording
state, the bars' heights scale with the level (thresholds 0.1/0.3/0.5
/0.7), and the dot scales subtly with it for cohesion.
`AudioSource::peak_level` defaults to `None` so the mock impl keeps
working unchanged.
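The read-and-reset behaviour is what keeps the meter from saturating. A TypeScript sketch of the same idea follows; the real implementation is Rust with an `Arc<Mutex<f32>>`, `PeakLevel` and `barsLit` are hypothetical names, and treating each threshold as "bar is lit when level >= threshold" is an assumption about how the thresholds are applied.

```typescript
// Tracks the loudest sample seen since the last poll.
class PeakLevel {
  private peak = 0;

  // Called once per audio chunk: keep the largest absolute sample value.
  merge(samples: readonly number[]): void {
    for (const sample of samples) {
      const magnitude = Math.abs(sample);
      if (magnitude > this.peak) this.peak = magnitude;
    }
  }

  // Called by the poller: return the peak and reset it to 0, so each
  // reading reflects only the audio captured since the previous poll.
  take(): number {
    const value = this.peak;
    this.peak = 0;
    return value;
  }
}

// Thresholds from the commit message; the >= comparison is assumed.
const BAR_THRESHOLDS = [0.1, 0.3, 0.5, 0.7];

function barsLit(level: number): number {
  return BAR_THRESHOLDS.filter((threshold) => level >= threshold).length;
}
```

Without the reset in `take()`, a single loud chunk would pin the meter at its maximum for the rest of the session.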
* feat(desktop): show per-model pricing in model config UI
When picking or browsing model configs, users had no way to see what
each model costs without leaving the app. Surface the per-minute price
right in the dropdown ("Whisper-1 · $0.0060/min") and in each saved
row's subline, formatted to drop trailing zeros for sub-cent prices.
Pricing comes from each provider's existing `pricing` map, so no new
data is needed — this just makes it visible at the decision point.
* feat(desktop): wire Input Monitoring + Apple Events permissions on macOS
The Fn-key CGEventTap gates on the Input Monitoring TCC bucket
(kTCCServiceListenEvent), not Accessibility — without
NSInputMonitoringUsageDescription in Info.plist a signed production
build wouldn't trigger the OS prompt at all, leaving the Fn shortcut
silently broken. CGEventTap creation failure was also being reported
as "accessibility required", which sent users to the wrong Settings
pane.
Changes:
- Info.plist: add NSInputMonitoringUsageDescription + NSAppleEventsUsageDescription
- Add SettingsPanel::InputMonitoring with the Privacy_ListenEvent deep link
- Link CGRequestListenEventAccess + CGPreflightListenEventAccess from
CoreGraphics; expose check/request commands mirroring the AX pair
- New prompting variant of check_accessibility_permission so the explicit
"Grant Accessibility" button triggers the native OS dialog
- ShortcutError::InputMonitoringRequired (distinct from AccessibilityRequired)
- Fn-tap failure now surfaces "input-monitoring-required: ... quit and
reopen" since TCC grants don't propagate to a running process
- TypeScript wrappers + tests for the three new commands
Stubs on Windows/Linux return Granted since neither platform has an
equivalent gate.
* feat(desktop): wayland paste fallback + frontend mic-permission gate
Two related production-readiness fixes:
1. Wayland blocks synthetic keystrokes from third-party apps, so
enigo's Ctrl+V silently fails on GNOME/KDE Wayland sessions on
Linux. Detect via XDG_SESSION_TYPE / WAYLAND_DISPLAY and return
a structured wayland-paste-unsupported: error; the React layer
surfaces "text is on your clipboard — press Ctrl+V to paste it"
instead of looking broken. X11 path is unchanged.
2. cpal's implicit microphone prompt is unreliable on signed macOS
builds (tauri#9928). The frontend recording-controller now calls
checkMicrophonePermission → requestMicrophonePermission before
startRecording, with a structured mic-denied: error when the user
denies access. Strengthens the existing not-determined branch and
adds a denied branch that tells the user to open Settings (the OS
won't re-prompt after a hard deny).
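The Wayland check from item 1 can be sketched as below. The detection itself lives in Rust; this TypeScript version only illustrates the logic. How the two environment variables are combined is an assumption, as are the helper names, and the environment is passed in as a plain object so the sketch has no runtime dependencies.

```typescript
// Error prefix named in the commit message.
const WAYLAND_PASTE_UNSUPPORTED = 'wayland-paste-unsupported:';

type Env = Record<string, string | undefined>;

// Assumed rule: the session counts as Wayland when XDG_SESSION_TYPE says so
// or when WAYLAND_DISPLAY is set to a non-empty value.
function isWaylandSession(env: Env): boolean {
  const sessionType = (env['XDG_SESSION_TYPE'] ?? '').toLowerCase();
  const display = env['WAYLAND_DISPLAY'] ?? '';
  return sessionType === 'wayland' || display !== '';
}

// Hypothetical renderer-side mapping from the structured error to the
// user-facing hint; returns null for unrelated errors.
function pasteErrorHint(error: string): string | null {
  if (error.startsWith(WAYLAND_PASTE_UNSUPPORTED)) {
    return 'Text is on your clipboard. Press Ctrl+V to paste it.';
  }
  return null;
}
```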
* chore(skills): expand TCC reset to cover Input Monitoring + Apple Events
/reset-perms and /build-clean now also reset the ListenEvent and
AppleEvents TCC buckets that the Fn-key tap and (future) Apple
Events paste flow rely on. Adds a dev-binary-stale-entry caveat
to /reset-perms — tccutil reset <service> com.vhtechnology.voxera
only clears the prod-bundle row; the dev binary's path-keyed row
is unaffected.
* feat(desktop): first-launch permission onboarding screen
Adds a routed-within-main-window onboarding flow that only appears
when the platform's required permissions are actually missing. On
Windows and Linux, where everything usually just works, the gate
silently marks completion and the user goes straight to the main UI.
Per-platform rows:
- macos: Microphone, Accessibility, Input Monitoring (all three need
explicit grants in TCC)
- windows: Microphone (only shown when Denied/NotDetermined)
- linux: Microphone, plus an informational banner on wayland sessions
explaining that paste falls back to clipboard-only
Each row has a Grant button that calls the appropriate Tauri command
to trigger the native OS dialog (AVCaptureDevice.requestAccess for
mic; AXIsProcessTrustedWithOptions prompt:true for Accessibility;
CGRequestListenEventAccess for Input Monitoring), then falls back to
the System Settings deep link if the OS doesn't re-prompt.
After granting Accessibility or Input Monitoring, a restart banner
appears with a button calling AppHandle::restart — TCC grants don't
propagate to a running process, so a relaunch is mandatory and we
spell that out.
Completion is persisted via tauri-plugin-store under the versioned
key onboarding_v1_completed so we can re-show after future major
changes. A "Skip for now" option exists behind a confirm dialog for
users who want to grant later.
New Tauri commands: get_platform_info, restart_app. The wayland
detection helper from paste/mod.rs is hoisted to a small platform
module since onboarding also needs it.
* refactor(desktop): fold history into the dashboard tab
The dashboard and history tabs were adding cognitive overhead:
users opened the app, saw stats, then had to click another tab to
see what they actually transcribed. Merge them into a single
dashboard view: stats grid on top, "Recent transcriptions" section
below with all the existing search / provider filter / pagination /
export / soft-delete + undo behaviour intact.
Implementation is a pure composition change in MainWindow:
- Dashboard.tsx and History.tsx are unchanged (their unit tests are
unchanged too — 5 + 8 tests still pass)
- All shared state (historyEntries, refreshKey, undoToast) stays in
MainWindow exactly as before
- Only the tabs structure shifts: history tab + panel removed; the
History component now renders inside the dashboard panel under a
small "Recent transcriptions" heading
No regressions to the delete/undo flow, the refresh-on-idle effect,
or any of the row-level actions.
* refactor(desktop): drop About tab and add Statistics heading
Two small UI tweaks:
- Remove the About tab — it was a placeholder card with no content
and added noise to the tab strip
- Add a "Statistics" section heading above the stat-card grid so it
mirrors the "Recent transcriptions" heading below it, giving the
Dashboard a consistent two-section structure
The MainWindow now ships with exactly two tabs (Dashboard, Settings)
and tests are updated accordingly: dropped the about-panel click
test, asserted both section headings render under the Dashboard
panel, and confirmed the about tab no longer appears.
* refactor(desktop): centralize rust to ts string contracts behind shared constants
Five magic strings were duplicated literal-by-literal between the
rust backend and the ts frontend: four error-marker prefixes
(accessibility-required, mic-denied, wayland-paste-unsupported,
input-monitoring-required) plus the shortcut-toggle event name. A
rename on either side compiled cleanly while silently breaking UX —
the user would lose the permission deep-link CTAs or the Stop button
would silently stop working.
Centralize all five in markers.rs (rust) + markers.ts (ts) with a
vitest contract test that parses markers.rs and asserts every rust
constant has a matching ts constant of the same name and value.
Verified to fail loudly when either side drifts: mutating a single
const fires two clear assertion failures with the offending name.
The Wayland fallback message is now constructed via a small helper
that consumes the shared const, so the prefix can't drift. The
useHotkeyRecording + overlay-bridge had two independent definitions
of the same event string — both now re-export from markers.ts.
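A rough sketch of what such a contract check could look like. The regex, the helper names, and the Rust constant names used in the usage note are assumptions; only the idea (parse the Rust constants, compare names and values against the TS side) comes from the commit.

```typescript
// Extracts `pub const NAME: &str = "value";` declarations from Rust source.
// Assumes one simple string literal per constant, without escapes.
function parseRustStrConsts(source: string): Record<string, string> {
  const pattern = /pub\s+const\s+([A-Z0-9_]+)\s*:\s*&(?:'static\s+)?str\s*=\s*"([^"]*)"\s*;/g;
  const found: Record<string, string> = {};
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const [, name, value] = match;
    if (name !== undefined && value !== undefined) found[name] = value;
  }
  return found;
}

// Lists every Rust constant that is missing or different on the TS side.
function findDrift(rust: Record<string, string>, ts: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(rust)) {
    const tsValue = ts[name];
    if (tsValue === undefined) {
      problems.push(`${name}: missing in ts`);
    } else if (tsValue !== rust[name]) {
      problems.push(`${name}: "${rust[name]}" (rust) != "${tsValue}" (ts)`);
    }
  }
  return problems;
}
```

A test would read `markers.rs` from disk, pass the exported TS constants as the second argument, and assert that `findDrift` returns an empty list. For example, a Rust line such as `pub const SHORTCUT_TOGGLE_EVENT: &str = "vox-era://shortcut-toggle";` (constant name hypothetical) would have to be matched by a TS constant with the same name and value.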
* fix(desktop): shrink mutex hold time in cpal audio callbacks
The per-session samples mutex was held for the entire downmix loop
inside both cpal callbacks (f32 and i16 branches). On stereo inputs
at high sample rates that's tens to hundreds of microseconds with
the lock taken, directly contending with the 12 Hz get_recording_level
polling from the overlay meter — measurable stutter.
Restructure so the lock is held only for an extend_from_slice on a
preallocated local buffer. The downmix + peak computation happens
on the audio thread's stack first, then samples lock → extend → drop,
then peak_level lock → merge → drop. The two locks are no longer
acquired simultaneously, so they can no longer block each other.
Also switch the audio-callback locks from .unwrap() to a poisoning-
tolerant lock_or_recover helper. Before: a panic on any consumer
thread would poison the mutex and the next cpal callback would
panic the audio thread itself — taking the whole app's recording
down via a real-time-thread crash, which on macOS can corrupt
CoreAudio state. After: the recovery helper returns into_inner on
a poisoned guard, so the audio thread survives.
Byte-for-byte output preserved: the local buffer contains the exact
same i16 values in the exact same order. Three new tests cover lock
contention, poisoned-mutex recovery, and empty-sample WAV encoding.
Vanity microphone_source_can_be_constructed test removed.
* refactor(desktop): replace navigator.platform with authoritative PlatformInfo
navigator.platform is deprecated and can return an empty string on
newer Chromium versions — fragile detection for what's actually a
constant property of the host OS. The Rust get_platform_info command
already returns the authoritative answer; just use it everywhere.
Two production sites were doing their own /Mac/i.test(navigator.platform):
HotkeyInput.tsx (for Cmd vs Ctrl labels) and db.ts::defaultHotkeyCombo
(for the install-time default hotkey). Both now read from a single
process-cached usePlatform() / getPlatform() helper that invokes the
Rust command once and memoizes the result for the app lifetime.
The hotkey-default literal pair (Cmd+Shift+Space / Ctrl+Shift+Space)
moves to lib/defaults.ts so the two callers share one source of truth.
defaultHotkeyCombo becomes async — its only caller (getHotkeyCombo)
already awaits, so the surface change is internal.
Final grep confirms zero navigator.platform references in production
code (test files included).
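The process-cached lookup might be sketched like this. `makeGetPlatform` and the `PlatformInfo` shape are assumptions; the fetcher is injected so the sketch does not depend on Tauri's `invoke`.

```typescript
// Hypothetical shape of the value returned by get_platform_info.
interface PlatformInfo {
  os: 'macos' | 'windows' | 'linux';
}

// Wraps a fetcher so the underlying command runs at most once. Caching the
// promise (not the resolved value) also shares one in-flight call between
// callers that arrive before it settles.
function makeGetPlatform(
  fetchPlatform: () => Promise<PlatformInfo>,
): () => Promise<PlatformInfo> {
  let cached: Promise<PlatformInfo> | null = null;
  return () => {
    if (cached === null) cached = fetchPlatform();
    return cached;
  };
}

// The install-time default then becomes async, as the commit notes.
async function defaultHotkeyCombo(
  getPlatform: () => Promise<PlatformInfo>,
): Promise<string> {
  const { os } = await getPlatform();
  return os === 'macos' ? 'Cmd+Shift+Space' : 'Ctrl+Shift+Space';
}
```

One trade-off of this sketch: a rejected promise is cached too, so a real helper may want to clear the cache on failure.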
* chore(desktop): delete unused settings module
The src-tauri/src/settings/ module shipped Rust types (Settings,
Theme, HistorySettings, OverlaySettings, OverlayPosition,
default_hotkey) that no production code referenced — JS owns
settings via tauri-plugin-store. The only callers were the module's
own tests. Grep across the workspace confirms zero non-test usages.
Delete the module + its lib.rs registration. -3 Rust tests (the
module's self-tests), -1 file. No behavior change.
* chore(desktop): gate or delete test-only rust types
Five Rust types existed as test fixtures but were not gated to
cfg(test), so they shipped in the release binary:
Deleted (used only by their own tests — circular coverage):
- audio::mock::MockMicrophoneSource
- shortcut::standard::StandardShortcut (real registration happens
inline in commands::register_hotkey; the shim was a no-op)
- shortcut::mock::MockShortcutManager (no production seam to plug
into; ShortcutManager trait is used by macos_fn::MacOsFnTap only)
Gated with #[cfg(test)] (legitimate fixtures used by sibling tests):
- secrets::mock::InMemoryVault
- paste::RecordingPaster
cargo check --release succeeds, proving the gated types are now
excluded from the release build (had any release-mode code referenced
the gated items, it would have failed to compile).
Net: 13 Rust tests deleted (all of which tested the mocks
themselves), no production behavior change.
* refactor(desktop): extract duplicated provider helpers to providers/util.ts
providerName, formatPricePerMin, and modelPriceLabel were copy-pasted
across four React components (SettingsApiKeys, SettingsModelConfigs,
AddModelConfigDialog, DeleteApiKeyDialog). A pricing-format change
needed two edits; a rename needed four.
Consolidate into providers/util.ts with the canonical signatures:
- providerName(id: string): string — verbatim fallback if unknown
- formatPricePerMin(perMinuteUSD: number): string
- modelPriceLabel(providerId: string, modelId: string): string | null
AddModelConfigDialog previously passed a ProviderConfig | undefined
to modelPriceLabel; that single callsite migrates to the id-string
contract used by the other call sites. Behavior preserved: an
unknown id falls through to null both ways.
9 new tests in util.test.ts cover the helpers against the real
PROVIDERS registry — no mocking the data under test.
* test(desktop): rewrite db tests against real :memory: SQLite
The 32 existing tests in db.test.ts mocked tauri-plugin-sql's
Database.load and asserted SQL substrings against regexes. They
caught nothing: a typo'd column name, broken join, missing
migration, or wrong CONFLICT clause would all pass.
Replace with 66 real-SQL integration tests running against
better-sqlite3 in :memory: mode. The test harness reads the same
migration SQL files that Rust's history module loads via
include_str! (migrations/0001_init.sql, 0002_provider_configs.sql)
— so a schema typo would now fail both the Rust runtime AND the TS
tests. Single source of truth.
Closes the regression gaps the audit called out:
- streakDays counts the consecutive run ending today, not just
the count of distinct days (gap-day test)
- avgWPM returns null (not NaN/Infinity) when every durationMs=0
- purgeOlderThan uses strict < so the cutoff row is kept
- soft-delete + restore preserves every column byte-for-byte
- clearAllTranscriptions wipes soft-deleted rows AND zeros stats
- deleteApiKey cascades to model_configs via FK ON DELETE CASCADE
- originalFnUsageType gained 4 tests where there were 0 before
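Two of the gaps above are easy to get subtly wrong, so here is a sketch of the intended semantics in plain TypeScript. The real functions run SQL against the database; the signatures, the ISO date strings, and the `createdAt` field are assumptions made for illustration, as is the choice that a streak is 0 when nothing was transcribed today.

```typescript
const MS_PER_DAY = 86400000;

// Converts 'YYYY-MM-DD' to a whole-day index in UTC.
function dayIndex(isoDate: string): number {
  return Math.floor(Date.parse(`${isoDate}T00:00:00Z`) / MS_PER_DAY);
}

// Length of the consecutive run of active days that ends on `today`.
// A gap day breaks the run, so this is not the count of distinct days.
function streakDays(activeDates: readonly string[], today: string): number {
  const active = new Set(activeDates.map(dayIndex));
  let streak = 0;
  for (let day = dayIndex(today); active.has(day); day -= 1) {
    streak += 1;
  }
  return streak;
}

// Keeps every row that is not strictly older than the cutoff, so a row
// stamped exactly at the cutoff survives the sweep.
function purgeOlderThan<T extends { createdAt: number }>(
  rows: readonly T[],
  cutoff: number,
): T[] {
  return rows.filter((row) => !(row.createdAt < cutoff));
}
```

With activity on May 1, 3, 4 and 5 and `today` set to May 5, the streak is 3 even though there are four distinct days.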
Adds better-sqlite3 + @types/better-sqlite3 as devDeps (native
build worked first try on macOS arm64 / Node 24 / Bun 1.3).
* test(desktop): add rust to ts command contract test
invoke.ts wraps every Tauri command as vox.foo() → invoke('snake_case').
Today, renaming a command on either side compiles cleanly but breaks
the runtime IPC silently — no test catches it.
Add a vitest contract test that parses three sources:
- commands.rs (extracts every #[tauri::command] fn + arg names)
- lib.rs (extracts the generate_handler! list)
- invoke.ts (extracts every invoke('name', {payload}) callsite)
And asserts three properties via separate it() blocks:
1. Every command in generate_handler! has a matching JS callsite
2. Every JS invoke('name') has a matching registered Rust command
3. Arg names match after snake_case to camelCase normalization
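The third property relies on Tauri's default mapping of snake_case Rust argument names to camelCase keys on the JS side. A minimal sketch of that normalization and comparison follows; `snakeToCamel` and `argNameDiff` are hypothetical helper names, not the ones in the actual test.

```typescript
// Converts a Rust-style snake_case name to the camelCase key sent from JS.
function snakeToCamel(name: string): string {
  return name.replace(/_([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase());
}

// Compares the Rust arg names of one command with the payload keys of the
// matching invoke() callsite and reports what differs on each side.
function argNameDiff(rustArgs: readonly string[], jsKeys: readonly string[]) {
  const expected = rustArgs.map(snakeToCamel);
  return {
    missingInJs: expected.filter((name) => jsKeys.indexOf(name) === -1),
    unknownInJs: jsKeys.filter((name) => expected.indexOf(name) === -1),
  };
}
```

For a command declared with `device_id`, a callsite sending `{ deviceId }` produces two empty lists, while `{ deviceID }` shows up on both sides of the diff.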
Verified to fire on real drift: renaming a Rust command produced
"Names in generate_handler! with no #[tauri::command] in commands.rs:
restart_app"; renaming a JS payload key produced a clear arg-name
diff between the two sides.
No drift exists on the current codebase (22 commands all checked).
The parser handles cfg-gated commands, async commands, and Tauri
framework-injected args (State, AppHandle, Window). No allowlist
escape hatches needed.
* test(desktop): replace provider shape tests with msw http assertions
Six STT provider adapters (assemblyai, azure-openai, deepgram, fal,
gladia, revai) had tests that only asserted shape: id === 'foo',
defaultModels.length > 0, makeModel(...) returns something defined.
None would catch a swapped factory, wrong base URL, missing auth
header, or model id typo.
Replace with MSW v2 HTTP assertions mirroring the openai / groq /
elevenlabs pattern. Each provider now has:
- 1 happy-path HTTP assertion: URL host + path + auth header +
request payload all checked against documented endpoint
- 3 negative-path assertions: 401, 429, and 500-with-HTML-body all
propagate as structured errors (verified via maxRetries: 0 so the
AI SDK doesn't silently retry)
- existing structural tests (id, defaultModels, pricing) kept
AssemblyAI got an extra speech_models rewrite test that exercises
the rewritingFetch shim (verifies outgoing body has speech_models
not speech_model).
30 shape tests → 55 tests across the six adapters. No production
provider bugs revealed; all URLs and headers match documentation.
* fix(desktop): tighten react effect deps to stop double-fetches
Two effect issues from the audit:
1. MainWindow's history-load effect fired on every render where
recordingState.kind === 'idle'. On mount that meant one fetch,
then on every transcribing → idle transition it both reloaded
history AND bumped refreshKey (which retriggers Dashboard's
stats fetch). Replace with two narrower effects: one fires once
on mount (deps [loadHistory]); the other fires only on the real
transition edge (prevKindRef tracks the prior kind, only refetches
when the previous was NOT idle and the current IS idle).
2. use-onboarding-status listed `loading` in its polling effect's
deps, so the first setLoading(false) re-armed a fresh interval
while the previous tick was still pending — visible as one
spurious extra poll on the first render after permissions
resolved. Replace with a loadingRef so the flag-flip is a side
effect not a dep.
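The transition-edge check from item 1 can be sketched without React. In the component the previous kind is held in `prevKindRef`; here a closure plays that role. `makeIdleEdgeDetector` is a hypothetical name and the `RecordingKind` union is an assumption based on the states mentioned above.

```typescript
type RecordingKind = 'idle' | 'recording' | 'transcribing';

// Returns a function that reports true only on the edge where the previous
// kind was not idle and the current kind is idle.
function makeIdleEdgeDetector(): (current: RecordingKind) => boolean {
  let previous: RecordingKind | null = null; // null until the first observation
  return (current) => {
    const returnedToIdle =
      previous !== null && previous !== 'idle' && current === 'idle';
    previous = current;
    return returnedToIdle;
  };
}
```

The initial `idle` observed on mount is not an edge, so the mount-time fetch stays the job of the separate one-shot effect.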
No user-visible behavior change. All 366 tests still pass.
* refactor(desktop): typed enum dispatch + async paste_text
Two related Rust improvements:
1. open_settings_panel and the Fn-key marker accepted arbitrary
strings via match s.as_str(). Replace with serde-typed dispatch:
- SettingsPanelKind enum (kebab-case) for the panel arg; serde
rejects "unknown" at deserialize time
- parse_combo_input(&str) returns the existing typed HotkeyCombo
enum (Fn vs Standard) instead of a runtime is_fn_combo string
check. The Fn marker spellings "Fn" / "fn" / " FN " all still
route to the macOS CGEventTap backend.
JS contract unchanged (still sends strings; serde converts them).
2. paste_text was a sync Tauri command that called std::thread::sleep
(PBOARD_SETTLE_DELAY = 80ms) on the worker thread before sending
the keystroke. Under burst load that serialized pastes behind each
other and tied up Tauri's command thread pool.
Make the command async and run the (still-sync) Paster::paste_text
inside tauri::async_runtime::spawn_blocking. AppState.paster moves
from Box<dyn Paster> to Arc<dyn Paster> so the closure can capture
it Send-safely. Trait and mock impls unchanged.
10 new Rust tests cover the SettingsPanelKind serde dispatch + the
parse_combo_input variants. Wave 3.2's contract test still passes —
its parser already handles async commands.
* fix(desktop): revert async paste_text — was crashing on macos 14+
The Wave 4.2 change made paste_text async and ran the enigo keystroke
inside tauri::async_runtime::spawn_blocking. That moved the work onto
Tokio's blocking pool, which doesn't have the dispatch-queue context
that HIToolbox's TSMGetInputSourceProperty requires on macOS 14+, so
the OS aborted the process with dispatch_assert_queue_fail every time
the user finished a transcription.
Stack:
dispatch_assert_queue_fail
TSMGetInputSourceProperty (HIToolbox)
enigo::macos_impl::create_string_for_key
EnigoPaster::send_paste_keystroke
paste_text (spawn_blocking task)
Revert paste_text to a sync Tauri command. Tauri's sync command runner
dispatches on a thread context that AppKit/HIToolbox accept; the 80ms
PBOARD_SETTLE_DELAY is short enough that holding a command worker
thread for that duration is fine in practice. The motivation for the
async change was a theoretical worker-pool optimization — there was no
user-visible problem before. Keep AppState.paster as Arc<dyn Paster>
since that's harmless and avoids rippling unrelated changes.
Add a docstring on paste_text explaining why it MUST remain sync on
macOS so this regression doesn't get reintroduced.
* Visual identity, overlay improvements and cancel transcription functionality (#5)
* feat(desktop): swap in new logo and rename main header to "Joe the bird"
Replaces the placeholder Tauri icon set with logo-v5 (transparent
background) across all desktop bundle sizes plus the landing favicon,
and shows the logo next to the renamed h1 in the main window. Source
SVG is committed alongside the generated icons so future regenerations
can rerun `tauri icon` against it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(desktop): start visual identity shift toward soft, rounded palette
Layer 1 (tokens) and an opportunistic sweep of primitives + screens to
get the app onto the new visual identity quickly:
- Tokens: switch font to Nunito Variable (self-hosted via
@fontsource-variable/nunito so we don't break the CSP), introduce
brand color palette pulled from the logo (blue/navy/yellow/coral/
mint/pink/cream), replace hard offset neo shadows with soft
card/card-lg/pop shadows, drop border-3/border-5, add 16/20/28/pill
border-radius scale, and rewrite globals.css to a white page bg /
navy fg with surface/muted helpers.
- Primitives: Button, Switch, Tabs, Card, Input, Dialog,
ComingSoonBadge — all rounded (`pill` or `2xl`/`3xl`), soft shadow,
no more uppercase walls on Button.
- Screens: drop the duplicate outer card wrappers in MainWindow
(each settings sub-component already self-wraps in Card), sweep
hardcoded brutalist input/select strings in History,
SettingsRecording, SettingsHistory, AddApiKeyDialog,
AddModelConfigDialog, SettingsApiKeys to the new rounded style,
swap the saturated blue selected-row in SettingsModelConfigs for
a soft brand-blue tint, and switch RecordingStatusPill to a rounded
pill in brand-coral / brand-yellow/30.
- Logo: redraw with a white face polygon to fill the previously
transparent face area; regenerate the full Tauri icon set and the
landing favicon from the new SVG.
- MainWindow header: place the logo next to the h1 and rename to
"Joe the bird".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(desktop/overlay): rescue overlay parked at unreachable coords
Two ways the saved overlay position could send the pill into the
void:
1. A bad onMoved event (we saw 668672, 806912) writes physical-pixel
coords that fall outside every monitor. setPosition succeeds and the
OS places the window where requested, so nothing is visible to the
user. Reset position only emitted an event and never rewrote the DB
row, so the bad value survived a click.
2. The user unplugs an external monitor (or rearranges them in System
Settings) after positioning the pill there. The saved coords stay
in the DB but point to a region no current monitor covers.
Two fixes:
- `resetOverlayPosition` now computes the default bottom-center for
the current monitor and writes it to the DB itself before emitting
the reset event. The OverlayApp's onMoved-debounce persistence path
is no longer the only writer, so a click on Reset position can't
silently no-op against a stuck value.
- `useOverlayInitialPosition` validates the saved position against
`availableMonitors()` on boot: if the pill rect doesn't fit entirely
inside at least one currently connected monitor, the saved value is
discarded and we fall back to `defaultBottomCenter()`. Persistence
still works for the happy path; we only second-guess on restore. If
`availableMonitors()` itself fails, we trust the saved value rather
than relocating the pill every boot on a transient API error.
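A minimal sketch of the restore-time check described above, written as a pure function so it can be read in isolation. The `Rect` shape, the `contains` helper, and `resolveOverlayPosition` are hypothetical names for illustration, not the hook's real code; a `null` monitor list stands in for a failed `availableMonitors()` call.

```typescript
// Hypothetical shape: one rectangle in physical pixels.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// True when `inner` lies entirely within `outer`.
function contains(outer: Rect, inner: Rect): boolean {
  return (
    inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height
  );
}

// Decide which position to use on boot.
// `monitors === null` models a failed monitor query (assumption):
// in that case the saved value is trusted rather than discarded.
function resolveOverlayPosition(
  saved: Rect | null,
  monitors: Rect[] | null,
  fallback: Rect,
): Rect {
  if (saved === null) return fallback;
  if (monitors === null) return saved;
  return monitors.some((m) => contains(m, saved)) ? saved : fallback;
}
```

With a single 1920x1080 monitor, the stray coordinates quoted above (668672, 806912) fail the containment test and the fallback is returned.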
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(desktop/overlay): recalibrate audio meter for built-in mic levels
The previous thresholds (0.1/0.3/0.5/0.7 and saturation at 1.0) were
calibrated against studio-grade signal levels — natural speech into a
laptop's built-in mic peaks well below that range, so the meter was
stuck at 1–2 bars for normal voice and only filled when shouting.
New calibration (0.008/0.025/0.05/0.10, saturation at 0.18) maps to
the real peak distribution we see from cpal callbacks on built-in
hardware: a whisper lights bar 1, conversational speech reaches
bar 3, slightly raised voice fills all four, and each bar's height
grows visibly across the normal speaking range instead of barely
crossing its threshold.
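To make the new calibration concrete, here is a small sketch of how a peak value could map onto the four bars. The thresholds and the saturation point are the ones quoted above; the linear fill rule and the names `litBars` and `barFill` are assumptions for illustration, since the commit message does not spell out how bar height is computed.

```typescript
// Values quoted in the commit message.
const THRESHOLDS = [0.008, 0.025, 0.05, 0.1] as const;
const SATURATION = 0.18;

// Number of bars whose threshold the peak has reached (0 to 4).
function litBars(peak: number): number {
  return THRESHOLDS.filter((t) => peak >= t).length;
}

// Assumed fill rule: each bar ramps linearly from its own threshold
// up to the saturation point, clamped to the range [0, 1].
function barFill(peak: number, bar: 0 | 1 | 2 | 3): number {
  const t = THRESHOLDS[bar];
  const fill = (peak - t) / (SATURATION - t);
  return Math.min(1, Math.max(0, fill));
}
```

Under this rule a peak of 0.06 lights three bars, and every bar is full once the peak reaches 0.18.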
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(desktop): cancel an in-progress recording (Cmd+Esc + overlay X)
Until now the only way to stop a recording was the toggle hotkey or
the overlay's Stop button, both of which send the captured audio off
to the STT provider. If the user caught themselves mid-utterance
saying something they didn't want transcribed, there was no way out.
This adds a true cancel path that aborts capture, drops the buffered
audio, and returns to idle without ever issuing an STT request. Two
surfaces:
- A configurable global hotkey, defaulted to `Cmd+Esc`. The combo
parser still requires at least one modifier, so we don't risk
stealing bare Esc from every app system-wide. The cancel hotkey is
independent of the toggle hotkey and registers/unregisters via its
own pair of Tauri commands (`register_cancel_hotkey` /
`unregister_cancel_hotkey`) and its own `current_cancel_hotkey`
slot in `AppState`.
- An X button next to the Stop button on the overlay pill, visible
only in the `recording` state. Click emits the same event the
global cancel hotkey fires so both paths converge.
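The "at least one modifier" rule can be illustrated with a small parser sketch. This is not the app's combo parser: the modifier names, the `Combo` shape, and `parseCombo` are hypothetical, and the real parser may accept more forms than this one does.

```typescript
// Hypothetical modifier vocabulary; the real parser's token names may differ.
const MODIFIERS = new Set(["Cmd", "Ctrl", "Alt", "Shift"]);

interface Combo {
  modifiers: string[];
  key: string;
}

// Returns null unless the combo has at least one modifier and exactly one
// non-modifier key, so a bare "Esc" can never be registered globally.
function parseCombo(input: string): Combo | null {
  const parts = input
    .split("+")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const modifiers = parts.filter((p) => MODIFIERS.has(p));
  const keys = parts.filter((p) => !MODIFIERS.has(p));
  if (modifiers.length === 0 || keys.length !== 1) return null;
  return { modifiers, key: keys[0] };
}
```

Here `parseCombo("Cmd+Esc")` yields a combo, while `parseCombo("Esc")` is rejected.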
Out of scope for this pass: cancelling once the STT request is in
flight (the `transcribing` state). The overlay's X button is hidden
during transcribing, and the hotkey is a no-op there too.
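The scope rule above amounts to a one-line state transition. A minimal sketch, assuming the controller has exactly the three states named in this message; `CancelResult` and the free-standing `cancel` function are hypothetical, not the controller's real API.

```typescript
type RecordingState = "idle" | "recording" | "transcribing";

interface CancelResult {
  next: RecordingState;
  // True when buffered audio is dropped without an STT request.
  discardAudio: boolean;
}

// Cancel acts only while recording; in every other state it is a no-op.
function cancel(state: RecordingState): CancelResult {
  if (state === "recording") {
    return { next: "idle", discardAudio: true };
  }
  return { next: state, discardAudio: false };
}
```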
Implementation:
- Rust: `EVT_SHORTCUT_CANCEL` constant + a `cancel_capture` trait
method on `AudioSource`. `MicrophoneSource` shares a `take_session`
helper between `stop_capture` and `cancel_capture` so the lookup
logic can't drift; the cancel variant signals the cpal worker to
drop its stream but never touches the sample buffer (the session
is removed and the buffer is freed along with it). New unit tests
cover the discard path and the unknown-session error.
- TS: A `cancel()` action on the recording controller, parallel to
`toggle()`. The hook subscribes to both shortcut events. The
`HotkeyInput` primitive gains an `allowFn` prop so the cancel
input can hide the Fn-key button (the Fn-tap path is mutually
exclusive with the toggle, so we don't offer it for cancel). A
second `HotkeyInput` row in Settings → Recording, persisted via
a new `cancel_hotkey_combo` DB key (default `Cmd+Esc`). On boot,
`App.tsx` registers both the toggle and the cancel hotkey; either
registration failing is logged but never blocks the other.
- Tests: cancel transitions from every state (only `recording`
acts), the cancel-event hook listener, the new bridge function,
the overlay X button visibility, and the cancel-hotkey capture
flow in Settings. The mechanical Rust↔TS contract test
auto-validates the three new commands and the new event constant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* style(desktop/overlay): pull pill into the brand palette
The recording pill was using generic dark-glass (`bg-black/55`) with
white accents and a red dot — looked like a stock Apple notification
rather than something that belonged to Joe the bird.
Restyled the pill against the same palette the rest of the app uses
while keeping it dark (it floats over arbitrary wallpapers and apps;
a light pill would disappear into white surfaces and fight for
attention on busy ones):
- Background: `bg-brand-navy/85` instead of `bg-black/55` — same
dark-glass feel, but using the logo's actual navy (#1A2051).
- Recording dot: `bg-brand-coral` instead of `bg-red-500`, matching
what RecordingStatusPill already uses in the main window.
- Waveform bars (and the transcribing dots): `bg-brand-yellow` (the
parrot's chest stripe) at full opacity when active, 40% when not.
The bars now transition `background-color` along with `height`, so
the meter "lights up" instead of just growing.
- Cancel button hover: `bg-brand-coral/40` — the destructive action
telegraphs itself instead of going through the neutral white tint
the Stop button uses.
- Ring removed, `shadow-card-lg` added — matches the rest of the
app's soft-shadow language instead of the hairline outline that
felt glassy/Apple-default.
- Label weight `font-medium` → `font-semibold` to match the
friendly-rounded tone Nunito brings to the rest of the UI.
Drag handle and Stop button kept on white tint — they're neutral
chrome controls, no semantic color needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: luan <luan@siena.cx>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: WOZCODE <contact@withwoz.com>
Co-authored-by: Luan Henning <46806315+LeydenJar@users.noreply.github.com>