1vank1n/HandyFiles

HandyFiles

Cross-platform desktop app for transcribing audio and video files using local AI models. No cloud, no API keys — everything runs on your machine.

Inspired by Handy — a great open-source speech-to-text app. HandyFiles borrows its model management approach, uses the same transcribe-rs engine, and shares the GigaAM model distribution. While Handy focuses on live microphone recording, HandyFiles is designed for file-based transcription with drag & drop.

Features

  • Drag & drop any audio or video file
  • Local transcription using Whisper and GigaAM models
  • Model management — download, switch, and compare models from the app
  • Native audio decoding — no FFmpeg required (Symphonia + Rubato)
  • Re-transcribe with a different model to compare results
  • Supports: MP3, WAV, FLAC, OGG, AAC, M4A, MP4, MKV, WebM
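A dropped file can be screened against the list of supported formats before any decoding is attempted. The following is a minimal sketch, not the app's actual code: the names `SUPPORTED_EXTENSIONS` and `has_supported_extension` are hypothetical, and it assumes that checking the file extension is enough, whereas a real decoder may probe the container contents instead.

```rust
use std::path::Path;

/// Extensions taken from the "Supports" list above, in lowercase.
const SUPPORTED_EXTENSIONS: [&str; 9] = [
    "mp3", "wav", "flac", "ogg", "aac", "m4a", "mp4", "mkv", "webm",
];

/// Returns true when the path's extension matches a supported format.
/// The comparison is case-insensitive; paths without an extension are rejected.
/// (Hypothetical helper for illustration only.)
fn has_supported_extension(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| {
            SUPPORTED_EXTENSIONS
                .iter()
                .any(|supported| supported.eq_ignore_ascii_case(ext))
        })
        .unwrap_or(false)
}
```

For example, `has_supported_extension(Path::new("interview.MP3"))` evaluates to `true`, while a path such as `notes.txt` or an extensionless `README` is rejected.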

Supported Models

| Model | Size | Languages | Notes |
|---|---|---|---|
| Whisper Tiny | 75 MB | 99 languages | Fastest |
| Whisper Base | 142 MB | 99 languages | Good balance |
| Whisper Small | 466 MB | 99 languages | Recommended |
| Whisper Medium | 1.5 GB | 99 languages | High quality |
| Whisper Large v3 | 3 GB | 99 languages | Best quality |
| Whisper Large v3 Turbo | 1.6 GB | 99 languages | Large quality, medium speed |
| GigaAM v3 | 151 MB | Russian | Specialized for Russian |

Install

Download the latest release for your platform from Releases.

  • macOS: .dmg (Apple Silicon)
  • Windows: .exe (NSIS installer)
  • Linux: .deb or .AppImage

macOS: "app is damaged" warning

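The app is not signed with an Apple Developer certificate, so macOS Gatekeeper blocks it and reports that the app is damaged. To fix this, clear the app's extended attributes (including the quarantine flag) by running the following in Terminal after installing: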

xattr -cr /Applications/HandyFiles.app

Build from Source

Prerequisites

Linux only (Debian/Ubuntu package names):

sudo apt install libwebkit2gtk-4.1-dev libgtk-3-dev libayatana-appindicator3-dev librsvg2-dev

Build

pnpm install
pnpm tauri build

Development

pnpm tauri dev

Tech Stack

  • Tauri v2 — Rust backend + webview frontend
  • React 18 + TypeScript + Tailwind CSS 4
  • transcribe-rs — Whisper (whisper-cpp) + GigaAM (ONNX) transcription
  • Symphonia — pure Rust audio/video decoding
  • Rubato — audio resampling to 16 kHz
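To illustrate the resampling step in the list above, here is a rough sketch of turning decoded audio into 16 kHz mono samples. It is not how Rubato works and it is not the app's code: it uses plain linear interpolation with no anti-aliasing filter, so it only shows the shape of the operation. The function names `downmix_to_mono` and `resample_linear` are hypothetical, and the input is assumed to be interleaved `f32` samples.

```rust
/// Averages interleaved channels into a single mono channel.
/// Trailing samples that do not form a complete frame are ignored.
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Resamples mono audio from `from_hz` to `to_hz` by linear interpolation.
/// Illustration only: with no low-pass filter, downsampling can alias.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    assert!(from_hz > 0 && to_hz > 0, "sample rates must be non-zero");
    if input.is_empty() || from_hz == to_hz {
        return input.to_vec();
    }
    // Output length scales with the ratio of the two sample rates.
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // Distance, in input samples, between consecutive output samples.
    let step = from_hz as f64 / to_hz as f64;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            // Clamp the right-hand neighbour at the end of the buffer.
            let b = input[(idx + 1).min(input.len() - 1)];
            a + (b - a) * frac
        })
        .collect()
}
```

A typical call chain would be `resample_linear(&downmix_to_mono(&samples, 2), 48_000, 16_000)`. For real audio, a dedicated resampler such as Rubato is the better choice, since naive interpolation degrades quality when downsampling.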

License

MIT
