Releases: RyuuMeow/MaidWhisper

v1.0.0-beta.1

05 Jun 11:49

v1.0.0-beta.1 Pre-release

[1.0.0-beta.1] - 2026-06-05

This is the first public beta release of MaidWhisper. It focuses on making
GPT-SoVITS select-to-speak usable as a daily desktop tool: simple setup,
system-wide hotkey capture, a compact floating control panel, character
switching, multilingual reading, and optional LLM-assisted translation or
character voice rewriting.

Added

  • Windows installer packaging with optional GPT-SoVITS setup after install.
  • Managed GPT-SoVITS launch, wake, idle release, terminate-on-exit, and tray controls.
  • System-wide select-to-speak flow with global hotkey capture and deferred playback mode.
  • Floating control panel with character, language, tone, segmentation, playback, retry, skip, and clickable lyrics controls.
  • Character management for GPT-SoVITS model files, reference audio, prompt text, per-language generation speed, temperature, and tone settings.
  • Optional LLM translation and AI character voice rewriting with provider presets and model discovery.
  • Interface language support for English, Traditional Chinese, Simplified Chinese, and Japanese.
  • Cache management, startup update checking, manual update checking, GitHub link, and beta release documentation.
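
The startup and manual update checks need some way to decide whether a published tag is newer than the installed build. These notes do not say how MaidWhisper does this, so the following is only a minimal sketch. It assumes tags follow Semantic Versioning precedence (a pre-release such as 1.0.0-beta.1 ranks below 1.0.0) and ignores build metadata. The function names are hypothetical and are not taken from the project.

```python
import re


def parse_version(text):
    """Split a tag such as 'v1.0.0-beta.1' into ((1, 0, 0), ['beta', '1'])."""
    match = re.fullmatch(
        r"v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?", text.strip()
    )
    if match is None:
        raise ValueError(f"not a recognised version: {text!r}")
    core = tuple(int(part) for part in match.group(1, 2, 3))
    pre = match.group(4).split(".") if match.group(4) else []
    return core, pre


def _pre_key(identifier):
    # Numeric identifiers compare as integers and rank below alphanumeric ones.
    if identifier.isdigit():
        return (0, int(identifier), "")
    return (1, 0, identifier)


def sort_key(text):
    core, pre = parse_version(text)
    # A version without a pre-release part outranks the same core with one.
    return (core, 0 if pre else 1, [_pre_key(p) for p in pre])


def is_newer(candidate, installed):
    """Return True when `candidate` has higher precedence than `installed`."""
    return sort_key(candidate) > sort_key(installed)
```

Under these assumptions, `is_newer("v1.0.0", "1.0.0-beta.1")` is true, and `is_newer("1.0.0-beta.10", "1.0.0-beta.2")` is also true because numeric pre-release identifiers are compared as numbers rather than as strings.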

Changed

  • Product defaults are sourced from client/resources/settings/default_settings.json.
  • First-run interface language defaults to English.
  • First-run global hotkey defaults to Alt+Shift+M.
  • Release versioning is centralized through MAIDWHISPER_VERSION in client/CMakeLists.txt.
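
Sourcing product defaults from a single JSON file usually means user settings are layered on top of those defaults at load time. The sketch below illustrates that pattern in Python for brevity. It is not the client's actual loader, and the key names `interface_language` and `global_hotkey` are hypothetical; only the default values (English, Alt+Shift+M) come from these notes.

```python
import json


def merge_settings(defaults, overrides):
    """Return defaults with overrides applied, recursing into nested objects."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


# Stand-in for the contents of default_settings.json (key names are assumed).
DEFAULTS_JSON = '{"interface_language": "en", "global_hotkey": "Alt+Shift+M"}'

defaults = json.loads(DEFAULTS_JSON)
user_settings = {"global_hotkey": "Ctrl+Alt+R"}  # hypothetical user override
effective = merge_settings(defaults, user_settings)
```

With this layering, a first run with no user file simply yields the defaults, and a later change to the defaults file reaches every user who has not overridden that key.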

Notes

  • NVIDIA GPU acceleration is strongly recommended for GPT-SoVITS. CPU mode can work, but synthesis may be slow.
  • The bundled setup path is intended for users who want MaidWhisper to manage the GPT-SoVITS API server. Existing GPT-SoVITS users can still select their own launcher in Settings.
  • Generated AI voice content depends on user-provided models, prompts, and input text. Users are responsible for using models and generated audio legally and ethically.

Known Issues

  • This is a beta release. Packaging, GPT-SoVITS setup compatibility, and model behavior may still be adjusted based on user feedback.
  • Some GPT-SoVITS models or runtimes may require a user-provided launcher if the managed setup package does not match the local GPU/runtime combination.
  • The quality of LLM translation and character voice rewriting depends on the selected provider, model, prompt, and source text length.