v1.0.0 — First-class Stream mode, restructured menus, language flags

@perrette perrette released this 22 May 11:02

Major milestone. Stream mode is now a first-class recording mode, the
parameter surface is split and renamed (Stream / Clip vocabulary
throughout), the tray menu is restructured for clarity, and output
sinks are pluggable with live mid-recording switching.

Headline changes since v0.18.0:

  • Mode toggle Stream / Clip at the top level. CLI gets --stream /
    --clip (--realtime and --pseudo-streaming are kept as hidden
    aliases).

  • Streaming-params refactor — --silence-duration / --duration split
    into purpose-specific flags:
    --stream-chunk-min / --stream-chunk-max / --stream-chunk-silence-break
    --stream-context-reset-silence (multiplier of silence-break)
    --stream-context-length (rolling-context char cap, 0 = OFF)
    --realtime-commit-silence (OpenAI realtime backend)
    --stream-timeout / --clip-timeout (mode-specific auto-stop)
    Old flags stay as hidden back-compat aliases.

  • Auto + Max chunk-cut strategies. silence-break=0 → Auto (longest
    silence in window, respecting chunk-min); silence-break=None → Max
    (force-cut only, no silence-based cuts).

  • Output / Keyboard / File restructure. Output is now a radio with
    four mutually-exclusive destinations (Keyboard / Clipboard /
    Terminal / File). A Keyboard advanced submenu (visible only when
    Output=Keyboard) groups the typer Backend and a new Input mode
    radio (keystroke vs paste). The File destination has a
    Choose path… picker (tkinter).

  • Output class refactor (scribe/output.py) — pluggable sinks
    (KeyboardOutput / ClipboardOutput / TerminalOutput / FileOutput).
    start_recording rebuilds the sink at every chunk if the user
    changed Output / Typer / Input mode mid-recording, so menu changes
    take effect live. FileOutput drops the trailing newline that made
    the realtime backend produce one-word-per-line files.

  • Vosk as a leaf in the Model menu. Model resolves from the active
    language via autoselect_language(); Auto on Vosk displays as
    'Auto (🇬🇧 en)' without mutating o.language.

  • Language menu with flags — ISO 639-1 short codes (en / fr / de /
    it) + origin-country flags via desktop-ai-core v0.3.0's new
    default_country() / flag_for() helpers.

  • 🏠/☁️ prefix on model labels distinguishes on-device from cloud;
    vendor·model uses a middle-dot separator.

  • About submenu with app identity, version, license, GitHub link.

  • Backend × mode smoke-test matrix with --dry-run plumbing in every
    backend so the recording pipeline is exercised without API keys or
    local models on disk. Adds a regression guard for the same class of
    AttributeError that hit silence_duration after the rename.

  • pyproject discoverability: 23 classifiers (Topic, Audience,
    Environment, Natural Language, License, Dev Status) and a tighter
    23-keyword list. project.urls gains Source, Issues, Changelog, and
    Funding.

  • Menu / docs alignment audit, plus a terminal-frontend readability
    fix: every picker parent Item now carries a static help= so TUI
    users see 'Chunk min' instead of 'min'.
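
The Auto and Max chunk-cut strategies described above can be sketched
roughly as follows. This is an illustration only, not the code shipped
in scribe: the function name choose_cut, the (start, end) silence
tuples, the behaviour for a positive silence-break (cut at the first
long-enough silence), and cutting at the midpoint of a silence are all
assumptions made for the example.

```python
from typing import List, Optional, Tuple

# (start, end) of a detected silence, in seconds from the chunk start.
Silence = Tuple[float, float]


def choose_cut(
    silences: List[Silence],
    chunk_min: float,
    chunk_max: float,
    silence_break: Optional[float],
) -> float:
    """Return the time (seconds) at which to cut the current chunk.

    Hypothetical helper: names and exact semantics are assumptions.
    """
    if silence_break is None:
        # Max: no silence-based cuts, only the forced cut at chunk_max.
        return chunk_max

    # Only silences that start after chunk_min and end before chunk_max
    # are eligible, so chunk_min is always respected.
    window = [(s, e) for s, e in silences if s >= chunk_min and e <= chunk_max]

    if silence_break == 0:
        # Auto: pick the longest silence found in the window.
        if not window:
            return chunk_max
        start, end = max(window, key=lambda se: se[1] - se[0])
        return (start + end) / 2

    # Fixed threshold (assumed): first silence at least silence_break long.
    for start, end in window:
        if end - start >= silence_break:
            return (start + end) / 2
    return chunk_max
```

With silences at (3.0, 3.5) and (5.0, 6.0), chunk_min=2 and
chunk_max=10, this sketch cuts at 5.5 in Auto mode and at 10 in Max
mode.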

Cross-repo dep: requires desktop-ai-core ≥ v0.3.0.
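
For readers curious how pluggable sinks with live mid-recording
switching can work, here is a minimal sketch of the idea. It does not
reproduce scribe/output.py: the Sink protocol, StreamSink, ListSink,
Options, and run are hypothetical names, and the real KeyboardOutput /
ClipboardOutput / TerminalOutput / FileOutput classes may differ.

```python
import io
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Protocol, TextIO


class Sink(Protocol):
    """Anything that can receive a transcribed chunk of text."""

    def write(self, text: str) -> None: ...


class StreamSink:
    """Hypothetical sink writing to a text stream (stands in for a terminal)."""

    def __init__(self, stream: TextIO) -> None:
        self.stream = stream

    def write(self, text: str) -> None:
        # No newline is appended, so chunks join into running text.
        self.stream.write(text)


class ListSink:
    """Hypothetical sink collecting chunks in memory (stands in for a clipboard)."""

    def __init__(self, items: List[str]) -> None:
        self.items = items

    def write(self, text: str) -> None:
        self.items.append(text)


@dataclass
class Options:
    output: str = "terminal"  # may be changed from a menu while recording


def run(
    chunks: Iterable[str],
    options: Options,
    factories: Dict[str, Callable[[], Sink]],
) -> None:
    """Write each chunk to the sink selected at the time the chunk arrives."""
    active: Optional[str] = None
    sink: Optional[Sink] = None
    for text in chunks:
        if options.output != active:
            # The selection changed since the last chunk: rebuild the sink.
            active = options.output
            sink = factories[active]()
        assert sink is not None
        sink.write(text)
```

Checking the selection once per chunk keeps the recording loop simple
while still letting a menu change take effect on the next chunk.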