EZ

A desktop app for audio recording, transcription, and annotation. Record voice, capture screenshots and clipboard content simultaneously, get word-level transcripts with timestamps, and annotate them with rich metadata — all processed locally, no cloud APIs required.

Built with Tauri 2 (Rust backend) + Svelte 5 (TypeScript frontend).

Features

Recording & Capture

Select any audio input device with real-time amplitude visualization
Automatically capture screenshots from a watched folder during recording
Automatically capture clipboard images every 1.5 seconds during recording
All captures are timestamped and associated with the recording session

Transcription

Offline transcription via Whisper (large-v3-turbo) with Metal GPU acceleration on macOS
Word-level timestamps using Dynamic Time Warping (DTW)
Voice Activity Detection (VAD) for improved accuracy on silent segments
Model downloaded automatically on first use (~874 MB, stored locally)

Annotation

Annotate any word range in the transcript with:
- Footnotes
- Hyperlinks
- Code snippets (with language tag)
- Images
- Audio clips
Conflict detection prevents overlapping annotations
Soft delete with undo (3-second toast window)

Session Management

Sessions persist across app restarts; last opened session is restored automatically
Rename and delete sessions
Export any session as a structured directory: session.json, transcript.json, annotations.json, plus all asset files

Tech Stack

Layer	Technology
Desktop framework	Tauri 2
Frontend	Svelte 5 + SvelteKit 2
Language	TypeScript (frontend), Rust (backend)
Transcription	whisper-rs with Metal acceleration
Audio I/O	cpal + hound
Audio resampling	rubato
Database	SQLite via sqlx + tauri-plugin-sql
File watching	notify
Build tool	Vite 8

Prerequisites

Node.js 20+
Rust 1.77+
Tauri CLI prerequisites for your OS
- macOS: Xcode Command Line Tools
- Linux: libwebkit2gtk, libgtk-3, libappindicator3, etc.
- Windows: Microsoft Visual C++ Build Tools

Development

# Install frontend dependencies
npm install

# Start dev mode (launches Vite dev server + Tauri window)
npm run tauri dev

The first build will compile all Rust dependencies, which takes a few minutes. Subsequent builds are incremental.

Building

# Build frontend
npm run build

# Bundle desktop app (outputs to src-tauri/target/release/bundle/)
npm run tauri build

Output formats by platform:

macOS: .dmg and .app
Windows: .msi and .exe
Linux: .AppImage and .deb

Project Structure

ez/
├── src/                    # Svelte frontend
│   ├── lib/
│   │   ├── components/     # UI components (AudioPlayer, TranscriptView, etc.)
│   │   ├── db.ts           # SQLite schema and initialization
│   │   ├── audio.ts        # Recording logic + device enumeration
│   │   ├── transcript.ts   # Transcript data access
│   │   ├── annotations.ts  # Annotation CRUD
│   │   ├── sessions.ts     # Session management
│   │   ├── captures.ts     # Clipboard + watcher capture handling
│   │   └── export.ts       # Session export
│   └── routes/
│       ├── +page.svelte    # Home (session list + recording controls)
│       └── session/[id]/   # Session detail view (transcript + annotations)
├── src-tauri/
│   ├── src/
│   │   ├── lib.rs          # Tauri app setup, command registration, URI schemes
│   │   ├── audio.rs        # Audio recording, device management
│   │   ├── transcribe.rs   # Whisper inference
│   │   ├── model.rs        # Model download and validation
│   │   ├── sessions.rs     # Session backend commands
│   │   ├── annotations.rs  # Annotation backend commands
│   │   ├── watcher.rs      # File system watcher
│   │   ├── resample.rs     # Audio resampling
│   │   └── export.rs       # Export command
│   ├── capabilities/
│   │   └── default.json    # Tauri permission grants
│   ├── Cargo.toml
│   └── tauri.conf.json
├── package.json
└── svelte.config.js

Data Storage

All app data is stored locally in the platform-specific app data directory:

Platform	Path
macOS	`~/Library/Application Support/com.ez.app/`
Windows	`%APPDATA%\ez\`
Linux	`~/.config/ez/`

Contents:

ez.db — SQLite database (sessions, transcripts, words, annotations)
recordings/ — WAV audio files
captures/ — Clipboard/screenshot captures
models/ — Downloaded Whisper and VAD model weights

Database Schema

Table	Purpose
`sessions`	Recording sessions with metadata (duration, device, sample rate)
`transcripts`	One per session; stores aggregated word count and detected language
`words`	Individual words with `start_ms` / `end_ms` timestamps
`annotations`	Rich annotations tied to word index ranges; supports soft delete
`pending_captures`	Staging area for clipboard/watcher assets before user annotation
`settings`	Key-value store (watch folder path, last open session)

Security

Audio and asset files are served via custom URI schemes (audio://, asset://) with path canonicalization to prevent directory traversal
No data is sent to external services; transcription and all processing runs locally
Model downloads use HuggingFace CDN with minimum-size validation to guard against truncated files

macOS Permissions

On first launch, macOS will request:

Microphone — required for audio recording
Files and Folders — required for watch folder access and asset storage

These are granted via standard macOS permission dialogs and can be revoked in System Settings > Privacy & Security.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
src-tauri		src-tauri
src		src
static		static
.gitignore		.gitignore
.npmrc		.npmrc
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
svelte.config.js		svelte.config.js
tsconfig.json		tsconfig.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EZ

Features

Recording & Capture

Transcription

Annotation

Session Management

Tech Stack

Prerequisites

Development

Building

Project Structure

Data Storage

Database Schema

Security

macOS Permissions

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

EZ

Features

Recording & Capture

Transcription

Annotation

Session Management

Tech Stack

Prerequisites

Development

Building

Project Structure

Data Storage

Database Schema

Security

macOS Permissions

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages