assemblyai-cli

Rust CLI to transcribe audio/video files via AssemblyAI.

This repo also ships an Agent Skill (via skills.sh) that works with Codex, Claude (Claude Code), Pi, clawd.bot, openclaw, and other Agent Skills-compatible tools.

Install

Quick install from GitHub Releases (macOS/Linux):

/bin/bash -c "$(curl -fsSL -H 'Cache-Control: no-cache' https://raw.githubusercontent.com/diskd-ai/assemblyai-cli/main/scripts/install.sh)"

Quick install from GitHub Releases (Windows PowerShell):

powershell -NoProfile -ExecutionPolicy Bypass -Command "iwr -useb -Headers @{ 'Cache-Control'='no-cache' } https://raw.githubusercontent.com/diskd-ai/assemblyai-cli/main/scripts/install.ps1 | iex"

After installing, run assemblyai-cli init (recommended) or set ASSEMBLYAI_API_KEY to configure authentication.

Homebrew (recommended):

brew tap diskd-ai/assemblyai-cli https://github.com/diskd-ai/assemblyai-cli
brew install diskd-ai/assemblyai-cli/assemblyai-cli

Upgrade:

brew update
brew upgrade diskd-ai/assemblyai-cli/assemblyai-cli

From source:

git clone https://github.com/diskd-ai/assemblyai-cli.git
cd assemblyai-cli
cargo install --path .

Skill (skills.sh)

This repo ships a skill in skill/assemblyai-cli.

Supported agents/tools (examples): Codex, Claude (Claude Code), Pi, clawd.bot, openclaw.

List available skills in this repo:

npx skills add diskd-ai/assemblyai-cli --list

Install the assemblyai-cli skill globally:

npx skills add diskd-ai/assemblyai-cli --skill assemblyai-cli --global --yes

Example invocation (Agent Skills, e.g. Claude Code):

/assemblyai-cli install via homebrew, then transcribe ./file.mp3

Quickstart

Install assemblyai-cli
Initialize config (recommended):
- assemblyai-cli init (or cargo run --quiet -- init when running from source)
Or export API key:
- export ASSEMBLYAI_API_KEY="..."
Transcribe:
- assemblyai-cli transcribe ./file.mp3 --output transcript.txt (or cargo run --quiet -- transcribe ./file.mp3 --output transcript.txt)

Commands

assemblyai-cli transcribe <INPUT>
assemblyai-cli init

<INPUT>:

Local file path (audio/video), or
HTTP(S) URL.

YouTube and other streaming URLs are usually not direct media URLs. Download audio with yt-dlp first, then transcribe the local file.

Example (YouTube -> mp3 -> transcript):

yt-dlp -x --audio-format mp3 -o "/tmp/assemblyai-cli-input.%(ext)s" "https://www.youtube.com/watch?v=VIDEO_ID"
assemblyai-cli transcribe "/tmp/assemblyai-cli-input.mp3" --output transcript.txt

Supported formats:

--format text (default)
--format srt
--format vtt

Speaker diarization:

--speaker-labels (when used with --format text, prints Speaker X: ...)
When used with --format srt|vtt, it prefers utterance-based subtitles with Speaker X: ... when available.

Configuration

The CLI looks for a JSON config at:

~/.assemblyai-cli/config.json (preferred), or
~/.assemblyai-cli (legacy single-file config).

To create or update the config interactively:

assemblyai-cli init (prompts for API key; if apiKey already exists it asks before overwriting; use --yes to skip the prompt)

API key resolution order:

Config apiKey
ASSEMBLYAI_API_KEY
ASSEMBLY_AI_KEY (base64 encoded; decoded automatically if it looks like base64)

CLI flags override config values.

Config file schema (JSON, camelCase)

Example ~/.assemblyai-cli/config.json:

{
  "apiKey": "YOUR_ASSEMBLYAI_API_KEY",
  "baseUrl": "https://api.assemblyai.com",

  "format": "text",
  "output": "transcript.txt",

  "speechModel": "best",
  "languageDetection": false,
  "language": "fr",

  "punctuate": true,
  "formatText": true,
  "disfluencies": false,
  "filterProfanity": false,

  "speakerLabels": false,
  "multichannel": true,
  "speechThreshold": 0.1,

  "charsPerCaption": 128,
  "wordBoost": ["MyProject"],
  "customSpelling": [{ "from": "MyProject", "to": "MyProject" }],

  "pollIntervalSeconds": 3,
  "timeoutSeconds": 3600
}

Notes:

output is optional; when omitted, transcript prints to stdout.
customSpelling is a list of { "from": "...", "to": "..." } objects.

Video inputs

For video files (.mp4, .avi, .mov, .mkv, .webm), the CLI extracts audio using ffmpeg (must be available on PATH).

Homebrew troubleshooting

If you previously used a debug tap (for example alexeus/assemblyai-cli), Homebrew may warn about missing origin remotes or “Formulae found in multiple taps”. Fix by removing the old tap and reinstalling from the diskd-ai tap:

brew untap alexeus/assemblyai-cli || true
brew tap diskd-ai/assemblyai-cli https://github.com/diskd-ai/assemblyai-cli
brew install diskd-ai/assemblyai-cli/assemblyai-cli

Tests

Tests cover CLI/config behavior without requiring bundled media files.

Optional end-to-end transcription testing requires a valid API key and your own media file(s).

RUST_TEST_THREADS=1 cargo test

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.github/workflows		.github/workflows
Formula		Formula
scripts		scripts
skill/assemblyai-cli		skill/assemblyai-cli
src		src
tests		tests
.gitignore		.gitignore
Cargo.toml		Cargo.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

assemblyai-cli

Install

Skill (skills.sh)

Quickstart

Commands

Configuration

Config file schema (JSON, camelCase)

Video inputs

Homebrew troubleshooting

Tests

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

assemblyai-cli

Install

Skill (skills.sh)

Quickstart

Commands

Configuration

Config file schema (JSON, camelCase)

Video inputs

Homebrew troubleshooting

Tests

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages