v2.0.29 — Voice commands + per-device voice training

K0rb3nD4ll4S released this 13 May 16:16
· 16 commits to main since this release

Feature: Voice commands + per-device voice training

Hands-free Wemo control from two surfaces — the Windows / macOS / Linux desktop app and the Docker / Synology web UI (accessible from any phone / tablet / laptop browser on your LAN). One shared library, two thin wrappers, zero new native dependencies.

Uses the browser-native webkitSpeechRecognition / SpeechRecognition API — already shipped in Chromium, Edge, and Safari. Free, no API key, no licence.
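As a rough illustration of how a page can pick up whichever constructor the browser exposes, here is a minimal sketch. The helper names (getSpeechRecognitionCtor, createRecognizer) and the choice of continuous mode are assumptions for illustration, not the app's actual code; only the SpeechRecognition / webkitSpeechRecognition globals and their lang, continuous, and interimResults properties come from the Web Speech API.

```javascript
// Sketch only: helper names are hypothetical, not taken from the Dibby Wemo source.
// `globalObj` is passed in (normally `window`) so the logic can be exercised
// outside a browser as well.
function getSpeechRecognitionCtor(globalObj) {
  // Prefer the unprefixed constructor, fall back to the webkit-prefixed one.
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Returns a configured recogniser, or null when the browser has no support
// (for example Firefox with default settings).
function createRecognizer(globalObj, lang = 'en-US') {
  const Ctor = getSpeechRecognitionCtor(globalObj);
  if (!Ctor) return null;
  const recognizer = new Ctor();
  recognizer.lang = lang;
  recognizer.continuous = true;      // assumption: keep listening between phrases
  recognizer.interimResults = true;  // needed to show interim transcript text
  return recognizer;
}
```

In a browser this would be called as createRecognizer(window); a null result is the cue to render the mic button in its disabled state.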


Command grammar (wake-word "dibby" required by default)

Spoken                                         Action
dibby turn on bonus room pot light             Set device on
dibby turn off deck master                     Set device off
dibby toggle dining room pot lights            Flip current state
dibby bonus room on / dibby deck master off    Terse form
dibby turn everything on / dibby all off       Bulk command across all known devices
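The grammar above is small enough to parse with a few word-level checks. The following is a sketch under stated assumptions: parseCommand, its return shape, and the requireWakeWord option are hypothetical names chosen for illustration, and the shipped parser may handle more phrasings than the five rows listed.

```javascript
// Sketch only: function name, option name and result shape are assumptions.
const WAKE_WORD = 'dibby';

function parseCommand(transcript, { requireWakeWord = true } = {}) {
  // Lower-case, drop common punctuation, split into words.
  let words = transcript.toLowerCase().replace(/[.,!?]/g, ' ').split(/\s+/).filter(Boolean);

  if (words[0] === WAKE_WORD) words = words.slice(1);
  else if (requireWakeWord) return null; // no wake-word, ignore the utterance
  if (words.length < 2) return null;

  let action;
  let target;
  if (words[0] === 'turn' && (words[1] === 'on' || words[1] === 'off')) {
    // "turn on <device>" / "turn off <device>"
    action = words[1];
    target = words.slice(2);
  } else if (words[0] === 'toggle') {
    // "toggle <device>"
    action = 'toggle';
    target = words.slice(1);
  } else {
    // Terse form "<device> on" and bulk form "turn everything on":
    // the action is the last word.
    const last = words[words.length - 1];
    if (last !== 'on' && last !== 'off') return null;
    action = last;
    target = words.slice(words[0] === 'turn' ? 1 : 0, -1);
  }

  if (target.length === 0) return null;
  const name = target.join(' ');
  if (name === 'everything' || name === 'all') return { action, scope: 'all' };
  return { action, scope: 'device', name };
}
```

The returned name is still the raw spoken text; resolving it to an actual device is a separate matching step.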

Device names are fuzzy-matched against friendlyName, so "deck mister" still hits "Deck Master". A match requires a normalised Levenshtein distance of 0.4 or less (that is, at least 60% similar). When no device scores within that threshold, the app surfaces a "didn't recognise that device" toast instead of firing the wrong Wemo.
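To make the threshold concrete, here is a minimal sketch of distance-based matching. It assumes the score is the Levenshtein edit distance divided by the length of the longer string, which is one common normalisation; the release notes do not spell out the exact formula, and the function names are hypothetical.

```javascript
// Sketch only: the normalisation and function names are assumptions.

// Classic Levenshtein edit distance using a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // value of the cell up-left of the current one
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(
        above + 1,                              // deletion
        row[j - 1] + 1,                         // insertion
        diag + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution (or match)
      );
      diag = above;
    }
  }
  return row[b.length];
}

// 0 means identical, 1 means nothing in common.
function normalizedDistance(a, b) {
  const x = a.toLowerCase().trim();
  const y = b.toLowerCase().trim();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 0 : levenshtein(x, y) / longest;
}

// Returns the closest device, or null when even the best score exceeds the threshold.
function findDevice(spoken, devices, threshold = 0.4) {
  let best = null;
  let bestScore = Infinity;
  for (const device of devices) {
    const score = normalizedDistance(spoken, device.friendlyName);
    if (score < bestScore) {
      best = device;
      bestScore = score;
    }
  }
  return bestScore <= threshold ? best : null;
}
```

Under this normalisation "deck mister" and "Deck Master" differ by one substitution across 11 characters, a score of about 0.09, comfortably inside the 0.4 cutoff. A null result corresponds to the "didn't recognise that device" case.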


Per-device voice training (the accent-friendly part)

Fuzzy matching against friendlyName is great for typos but breaks down on accents, nicknames, and language mismatches. Each device now carries an optional voiceAliases: string[] field that competes with friendlyName on equal footing during matching, populated by recording phrases:

  1. Open the device's Info panel (desktop) or expand the device card (web UI).
  2. Click 🎤 add voice name.
  3. Say the phrase the way you actually say it — "deck light", "outside switch", "garage", whatever feels natural.
  4. The speech engine transcribes your recording and shows it back: "Heard: deck light — save?".
  5. On save the transcript is appended to that device's voiceAliases list. Multiple aliases per device are supported — Deck Master Switch can answer to "deck", "deck light", and "outside light" simultaneously.

Why this works for accents: the alias is stored as whatever the user's own STT engine actually returned when they spoke the phrase. If Chrome consistently transcribes a user's "deck light" as "tek light" because of accent, the stored alias becomes "tek light" — and that's exactly what comes back at command time too. The alias and the live command go through the same transcription pipeline, so they match cleanly even when the literal English doesn't.
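A small sketch of the idea, assuming a device record with friendlyName and an optional voiceAliases array as described above. The helper names are hypothetical, and this version returns an updated copy of the record for simplicity rather than mutating it in place.

```javascript
// Sketch only: helper names are assumptions; the record fields come from the notes above.
function normalize(text) {
  return text.toLowerCase().trim().replace(/\s+/g, ' ');
}

// Append the transcript exactly as the speech engine returned it
// (after light normalisation), skipping empty strings and duplicates.
function addVoiceAlias(device, transcript) {
  const alias = normalize(transcript);
  const existing = device.voiceAliases || [];
  if (!alias || existing.includes(alias)) return device;
  return { ...device, voiceAliases: [...existing, alias] };
}

// friendlyName and every alias are all candidates for matching.
function candidateNames(device) {
  return [device.friendlyName, ...(device.voiceAliases || [])].map(normalize);
}

// True when the live transcript equals any candidate name.
function matchesExactly(device, liveTranscript) {
  return candidateNames(device).includes(normalize(liveTranscript));
}
```

If the engine hears "tek light" during training, addVoiceAlias stores "tek light"; when the same engine hears the same speaker at command time, matchesExactly succeeds even though "tek light" is nowhere near "Deck Master Switch". A real matcher would presumably feed candidateNames into the fuzzy scorer rather than require exact equality.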

Aliases survive every plugin / app upgrade because they live in the same dibby-wemo.json / devices.json file as the rest of the device record.


Privacy disclosure (shown once per browser)

Voice commands use your browser's built-in speech recognition. Chrome and Edge stream audio to Google/Microsoft to transcribe it; Safari uses on-device recognition. Dibby Wemo never records, stores, or transmits audio itself.

Dismissal is persisted in localStorage so the modal shows once per browser, not once per session.
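A show-once flag of this kind can be sketched in a few lines. The storage key below is a hypothetical name, not the one the app uses, and the storage object is passed in so that window.localStorage (or any object with getItem / setItem / removeItem) can be supplied.

```javascript
// Sketch only: the key name is an assumption for illustration.
const DISCLOSURE_KEY = 'voiceDisclosureDismissed';

// True until the user has dismissed the privacy modal in this browser.
function shouldShowDisclosure(storage) {
  return storage.getItem(DISCLOSURE_KEY) !== '1';
}

// Called when the user dismisses the modal.
function dismissDisclosure(storage) {
  storage.setItem(DISCLOSURE_KEY, '1');
}

// Clearing the key makes the modal appear again on next use.
function resetDisclosure(storage) {
  storage.removeItem(DISCLOSURE_KEY);
}
```

In the browser the calls would take window.localStorage, which persists across sessions, unlike sessionStorage.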

Firefox doesn't ship SpeechRecognition by default — the 🎤 button shows a disabled state with a tooltip recommending Chrome / Edge / Safari. No broken UI.


What's new in the UI

  • Web UI (apps/desktop/resources/web/index.html):
    • 🎤 toggle button in the Devices toolbar next to ⟳ Scan
    • Live transcript bubble while listening (interim text in grey, finals in white with a ✓)
    • Per-card voice-alias chips with × delete + 🎤 add voice name link
    • Pulsing red glow on the toolbar mic button while the engine is active
  • Electron desktop renderer:
    • VoiceCommandButton — Sidebar mic button with the same pulse animation
    • VoiceAliasManager — embedded in the Info tab of every device; lists chips, records new aliases, deletes existing ones
  • Help doc — full Voice Commands section: enabling voice, the command grammar, training aliases, privacy story, how to reset the disclosure

Backend additions

  • docker/server.js: GET/POST/DELETE /api/devices/{host}/{port}/voice-aliases[/{index}], plus static handlers for /voice-commands.js and /voice-trainer.js with the correct Content-Type so the browser parses them as scripts (the fallback index.html route would otherwise mis-serve them as HTML).
  • apps/desktop/src/main/ipc/devices.ipc.js: new IPC channels get-voice-aliases, add-voice-alias, remove-voice-alias. They mutate the device record in place via the existing DwmStore.saveDevices, so HomeKit bridge sync / scheduler / etc. keep seeing the same device identity.
  • apps/desktop/src/preload/index.js: bridge exposes window.wemoAPI.{getVoiceAliases, addVoiceAlias, removeVoiceAlias}.
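For anyone scripting against the web UI, the REST route listed above could be called along these lines. This is a sketch, not documented client code: the base URL, the JSON body shape ({ alias }), and the helper names are assumptions; only the path template and the HTTP methods come from the list above.

```javascript
// Sketch only: body shape, base URL and helper names are assumptions.

// Builds /api/devices/{host}/{port}/voice-aliases[/{index}] on the given server.
function voiceAliasUrl(baseUrl, host, port, index) {
  const path = `/api/devices/${encodeURIComponent(host)}/${encodeURIComponent(port)}/voice-aliases`;
  const suffix = index === undefined ? '' : `/${index}`;
  return new URL(path + suffix, baseUrl).toString();
}

// POST a new alias. `fetchImpl` defaults to the global fetch
// (browsers, Node 18+) and can be swapped out in tests.
async function addAlias(baseUrl, host, port, alias, fetchImpl = fetch) {
  const res = await fetchImpl(voiceAliasUrl(baseUrl, host, port), {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ alias }), // assumed request body
  });
  if (!res.ok) throw new Error(`POST failed with status ${res.status}`);
  return res;
}

// DELETE the alias at a given position in the device's list.
async function removeAlias(baseUrl, host, port, index, fetchImpl = fetch) {
  const res = await fetchImpl(voiceAliasUrl(baseUrl, host, port, index), { method: 'DELETE' });
  if (!res.ok) throw new Error(`DELETE failed with status ${res.status}`);
  return res;
}
```

On the desktop side the equivalent operations go through the preload bridge (window.wemoAPI.getVoiceAliases and friends) rather than HTTP; their argument shapes are not given here, so they are not sketched.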

Not in this release (deferred)

  • Offline STT (whisper.cpp / vosk) — would add 100+ MB to every install. Re-evaluate when users specifically ask for offline mode.
  • Voice authoring of DWM rules ("dibby schedule deck master on at sunset") — future release.
  • Hardware-style always-on wake-word detection (picovoice / porcupine) — needs a paid licence or a 30 MB tflite model. Current soft wake-word ("dibby ..." prefix) is a reasonable compromise.
  • Voice in the Homebridge plugin UI — separate iframe sandbox + different mic-permission model. Add later if requested.

Affected packages — unified version bump

  • Desktop apps (Windows installer, macOS .dmg, Linux AppImage / deb / rpm) — functional change: voice button in sidebar, alias manager in device info
  • Docker image ghcr.io/k0rb3nd4ll4s/dibby-wemo-manager:2.0.29 - functional change: web UI gains voice toolbar + per-card training + REST endpoints
  • Synology .spk (apollolake / geminilake / denverton / broadwell / rtd1296) — functional change: same as Docker
  • homebridge-dibby-wemo@2.0.29 — version bump only, no functional change in this release
  • node-red-contrib-dibby-wemo@2.0.29 — version bump only
  • Home Assistant (HACS) 2.0.29 — version bump only

Upgrade

  • Desktop: download the installer for your platform from this release's Assets after CI finishes attaching artifacts.
  • Docker / Synology Container Manager: Stop → Build → Start (pulls :latest).
  • Synology .spk: download the new .spk for your arch → Package Center → Manual Install (preserves data).
  • Homebridge: npm install -g homebridge-dibby-wemo@2.0.29.
  • HACS: ⋮ → Reload data → Dibby Wemo → ⋮ → Redownload → 2.0.29 → restart HA.