murmur-type

Murmur and it types.

Voice-to-text and instant voice translation for Linux/Wayland. Press a hotkey, speak, press it again, and your words appear in the focused window. Powered by Whisper via Groq (free API key, fast responses). Single Python file with no Python packages beyond the standard library.

Works with niri, sway, Hyprland, and other Wayland compositors that support the virtual-keyboard protocol wtype relies on.

License: MIT · Python 3.10+ · Wayland · Zero Dependencies

Why murmur-type?

Most Linux voice-to-text tools require heavy setups (local models, Python packages, system services). murmur-type is a single file with no Python dependencies beyond the standard library: it needs only a Groq API key and a few common Wayland command-line tools (see Requirements). Setup takes about 30 seconds.

Features

  • Voice → Type — speak in any language, text appears in the focused window (VSCode, terminal, browser, anywhere)
  • Voice Translate — say a word in one language, see the translation + 3 context examples in a rofi popup with the word underlined in each sentence
  • Webhook Integration — press Enter in the popup to save translations to any REST API (flashcard apps, Notion, Anki, custom backends)
  • Multi-language — separate hotkeys per language, or auto-detect
  • Toggle design — one hotkey starts recording, same hotkey stops and processes. No daemon, no background service
  • Single file — one Python script, stdlib only. No pip, no venv, no Docker

How It Works

┌──────────┐    ┌─────────────┐    ┌──────────┐    ┌─────────┐
│ Mic      │───→│ pw-record   │───→│ Groq     │───→│ wtype   │
│ (hotkey) │    │ (PipeWire)  │    │ Whisper  │    │ (type)  │
└──────────┘    └─────────────┘    └──────────┘    └─────────┘

Translate mode:
┌──────────┐    ┌──────────┐    ┌─────────────┐    ┌───────┐    ┌──────────┐
│ Mic      │───→│ Whisper  │───→│ LLM         │───→│ rofi  │───→│ Webhook  │
│ (hotkey) │    │ (STT)    │    │ (translate) │    │ popup │    │ (save)   │
└──────────┘    └──────────┘    └─────────────┘    └───────┘    └──────────┘
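
As a rough illustration of the Whisper step in the diagrams above, the sketch below builds a transcription request with the standard library only. It assumes Groq's OpenAI-compatible endpoint and a multipart/form-data upload; the function names `build_multipart` and `transcription_request` are hypothetical and are not taken from murmur-type.py.

```python
import urllib.request
import uuid

# Groq's OpenAI-compatible speech-to-text endpoint.
GROQ_STT_URL = "https://api.groq.com/openai/v1/audio/transcriptions"


def build_multipart(fields: dict[str, str], filename: str, audio: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one WAV file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: audio/wav\r\n\r\n'.encode()
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcription_request(api_key: str, model: str, audio: bytes, language: str = "") -> urllib.request.Request:
    """Build (but do not send) the POST request for one recording."""
    fields = {"model": model}
    if language:  # an empty string leaves language detection to Whisper
        fields["language"] = language
    body, content_type = build_multipart(fields, "recording.wav", audio)
    return urllib.request.Request(
        GROQ_STT_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
```

Sending the request with `urllib.request.urlopen` and reading the `text` field of the JSON response would yield the transcript that is then handed to wtype.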

Screenshots

Voice Translate popup (rofi)

The translated word is highlighted with underline in each context sentence:

🇷🇺  выдержка

🇬🇧  endurance

1) His endurance was tested during the marathon.
    ↳ Его выдержка была проверена во время марафона.

2) The job requires both patience and endurance.
    ↳ Работа требует как терпения, так и выдержки.

3) She showed remarkable endurance under pressure.
    ↳ Она проявила замечательную выдержку под давлением.

⏎  Enter = save to vocabulary  |  Esc = dismiss
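
One way the underlining could be produced is with Pango markup, which rofi renders in dmenu mode when started with `-markup-rows`. The helper below is a minimal sketch under that assumption; `underline_word` is a hypothetical name, not a function from the script.

```python
import html
import re


def underline_word(sentence: str, word: str) -> str:
    """Return Pango markup with every whole-word match of `word` underlined."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    out, last = [], 0
    for match in pattern.finditer(sentence):
        # Escape the plain text so characters like & or < stay valid markup.
        out.append(html.escape(sentence[last:match.start()], quote=False))
        out.append(f"<u>{html.escape(match.group(0), quote=False)}</u>")
        last = match.end()
    out.append(html.escape(sentence[last:], quote=False))
    return "".join(out)
```

Exact matching like this misses inflected forms (for example выдержки or выдержку in the Russian lines above), so the real script presumably marks the word some other way, such as asking the LLM to tag it.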

Requirements

  • Linux with Wayland (tested on niri + Arch Linux)
  • Python 3.10+ (stdlib only, no pip packages)
  • PipeWire: pw-record for microphone capture
  • wtype — types text into the focused Wayland window
  • rofi — popup for translate mode
  • notify-send — desktop notifications
  • Groq API key (free) — for Whisper speech-to-text and LLM translation

Install dependencies (Arch Linux)

sudo pacman -S python pipewire wtype rofi libnotify

Get a Groq API key

  1. Go to https://console.groq.com/keys
  2. Sign up (free, Google/GitHub login)
  3. Create an API key

Installation

# Clone the repo
git clone https://github.com/Skippia/murmur-type.git
cd murmur-type

# Run the installer
./install.sh

# Edit config with your API key
nano config.json

The installer:

  • Creates a symlink ~/.local/bin/murmur-type → murmur-type.py
  • Copies config.example.json → config.json (if it does not exist yet)
  • Checks that all dependencies are installed

Manual installation

# 1. Clone anywhere
git clone https://github.com/Skippia/murmur-type.git ~/murmur-type
cd ~/murmur-type

# 2. Create config
cp config.example.json config.json
# Edit config.json and set at least "api_key"

# 3. Symlink into your PATH
mkdir -p ~/.local/bin
ln -s ~/murmur-type/murmur-type.py ~/.local/bin/murmur-type

# 4. Make sure ~/.local/bin is in your PATH

Configuration

Edit config.json:

{
  "provider": "groq",
  "api_key": "gsk_YOUR_KEY_HERE",
  "model": "whisper-large-v3",
  "language": "",
  "translate_model": "llama-3.3-70b-versatile",
  "app_url": "http://localhost:3009",
  "app_login": "",
  "app_password": "",
  "app_topic_id": ""
}

Required fields

| Field    | Description |
|----------|-------------|
| provider | "groq" (recommended) or "openrouter" |
| api_key  | Your Groq API key (gsk_...) |
| model    | Whisper model: "whisper-large-v3" (best accuracy) or "whisper-large-v3-turbo" (faster) |

Optional fields

| Field           | Description |
|-----------------|-------------|
| language        | Default language hint. Leave "" for auto-detect, or set "en", "ru", "uk", etc. |
| translate_model | LLM for translation. Default: "llama-3.3-70b-versatile" |
| webhook         | Webhook config for saving translations (optional, see Webhook Integration) |
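
To make the required/optional split concrete, here is a small sketch of how such a config could be loaded and validated with the standard library. The names `REQUIRED`, `DEFAULTS`, `validate_config` and `load_config` are illustrative assumptions; the actual script may handle this differently.

```python
import json
from pathlib import Path

# Field names follow the tables above; the defaults mirror the documented ones.
REQUIRED = ("provider", "api_key", "model")
DEFAULTS = {"language": "", "translate_model": "llama-3.3-70b-versatile", "webhook": None}


def validate_config(raw: dict) -> dict:
    """Check required fields and fill in defaults for the optional ones."""
    missing = [key for key in REQUIRED if not raw.get(key)]
    if missing:
        raise ValueError(f"config.json is missing required field(s): {', '.join(missing)}")
    return {**DEFAULTS, **raw}


def load_config(path: Path) -> dict:
    """Read config.json from disk and validate it."""
    return validate_config(json.loads(path.read_text(encoding="utf-8")))
```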

Keybindings

Add bindings like the following to your compositor config:

niri (~/.config/niri/config.kdl)

binds {
    Mod+Shift+E  hotkey-overlay-title="Voice-to-text (English)"   { spawn "murmur-type" "en"; }
    Mod+Shift+R  hotkey-overlay-title="Voice-to-text (Russian)"   { spawn "murmur-type" "ru"; }
    Mod+Shift+A  hotkey-overlay-title="Voice translate (RU → EN)"  { spawn "murmur-type" "translate"; }
}

sway (~/.config/sway/config)

bindsym $mod+Shift+e exec murmur-type en
bindsym $mod+Shift+r exec murmur-type ru
bindsym $mod+Shift+a exec murmur-type translate

Hyprland (~/.config/hypr/hyprland.conf)

bind = $mainMod SHIFT, E, exec, murmur-type en
bind = $mainMod SHIFT, R, exec, murmur-type ru
bind = $mainMod SHIFT, A, exec, murmur-type translate

Usage

Voice → Type (English)

  1. Press Mod+Shift+E — notification "Recording (EN)..."
  2. Speak in English
  3. Press Mod+Shift+E again — notification "Processing..."
  4. Transcribed text is typed into the focused window

Voice → Type (Russian)

Same as above but with Mod+Shift+R.

Voice Translate (RU → EN)

  1. Press Mod+Shift+A — notification "Recording (RU → EN)..."
  2. Say a Russian word or phrase
  3. Press Mod+Shift+A again
  4. A rofi popup appears with:
    • The Russian word you said
    • English translation (bold)
    • 3 example sentences with the word underlined
    • Russian translation for each example
  5. Enter — saves as a vocabulary card (if a webhook is configured)
  6. Escape — dismiss

VPN Split-Tunnel

If you use a VPN (e.g., Windscribe) that routes through datacenter IPs, Groq may reject your requests with a 403 error. The included groq-route.sh script adds direct routes for Groq's Cloudflare IPs so that this traffic bypasses the VPN tunnel.

# Install as a systemd service (persists across reboots)
sudo cp groq-route.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now groq-route.service

This only affects traffic to Groq's API IPs (172.64.149.20, 104.18.38.236). All other traffic continues through your VPN.

File Structure

murmur-type/
├── murmur-type.py          # Main script (single file, stdlib only)
├── config.json            # Your config (gitignored)
├── config.example.json    # Config template
├── install.sh             # Installer script
├── groq-route.sh          # VPN split-tunnel route script
├── groq-route.service     # Systemd unit for persistent routes
├── .run/                  # Runtime data (gitignored, auto-created)
│   ├── recording.pid      # PID of pw-record process
│   ├── recording.wav      # Temporary audio file
│   ├── mode               # Current recording mode
│   └── app_token          # Cached auth token (if webhook uses auth)
└── README.md
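
The recording.pid file is what allows a toggle without a daemon: each hotkey press is a fresh process that checks whether a recorder is already running. The sketch below shows one way this could work. The `toggle` and `is_running` helpers are hypothetical, and stopping the recorder with SIGINT is an assumption rather than something confirmed by the script.

```python
import os
import signal
import subprocess
from pathlib import Path


def is_running(pid_file: Path) -> bool:
    """True if pid_file names a process that still exists."""
    try:
        pid = int(pid_file.read_text().strip())
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
        return True
    except (FileNotFoundError, ValueError, ProcessLookupError):
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user


def toggle(run_dir: Path, record_cmd: list[str]) -> str:
    """Start the recorder if idle, otherwise stop it. Returns the action taken."""
    run_dir.mkdir(parents=True, exist_ok=True)
    pid_file = run_dir / "recording.pid"
    if is_running(pid_file):
        # Assumption: the recorder finalizes its output file on SIGINT.
        os.kill(int(pid_file.read_text().strip()), signal.SIGINT)
        pid_file.unlink()
        return "stopped"
    # Detach the recorder so it keeps running after this process exits.
    proc = subprocess.Popen(record_cmd, start_new_session=True)
    pid_file.write_text(str(proc.pid))
    return "started"
```

A first call such as `toggle(Path(".run"), ["pw-record", ".run/recording.wav"])` would start recording, and the second call with the same arguments would stop it.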

Webhook Integration

In translate mode (Mod+Shift+A), pressing Enter in the rofi popup can send the word and its translation to any HTTP endpoint. This lets you integrate with flashcard apps, Notion, Anki, Google Sheets, or any service with a REST API.

Set "webhook": null (or omit it) to disable — the rofi popup will still work, just without saving.

Basic webhook (no auth)

Send a POST request with the word and translation to any URL:

{
  "webhook": {
    "url": "https://your-app.com/api/words",
    "body": {
      "word": "{{word}}",
      "translation": "{{translation}}"
    }
  }
}

{{word}} and {{translation}} are placeholders — they get replaced with the actual values at runtime.

Webhook with static headers

If your API uses an API key or static token:

{
  "webhook": {
    "url": "https://your-app.com/api/words",
    "headers": {
      "X-Api-Key": "your-api-key",
      "Authorization": "Bearer your-static-token"
    },
    "body": {
      "word": "{{word}}",
      "translation": "{{translation}}"
    }
  }
}
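
For reference, a webhook entry like the one above maps onto a standard-library request roughly as follows. This is a sketch only; `build_webhook_request` is a hypothetical helper, and sending the body as JSON with a `Content-Type: application/json` header is an assumption about what the script does.

```python
import json
import urllib.request


def build_webhook_request(webhook: dict, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST described by a webhook config entry."""
    # Headers from the config are merged over the default content type.
    headers = {"Content-Type": "application/json", **webhook.get("headers", {})}
    return urllib.request.Request(
        webhook["url"],
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```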

Webhook with JWT login flow

If your API requires logging in first to get a JWT token:

{
  "webhook": {
    "url": "https://your-app.com/api/words",
    "body": {
      "topicId": "some-category-id",
      "word": "{{word}}",
      "translation": "{{translation}}"
    },
    "auth": {
      "url": "https://your-app.com/api/auth/login",
      "body": {
        "login": "your-username",
        "password": "your-password"
      },
      "token_path": "data.token"
    }
  }
}

How the auth flow works:

  1. On first request, murmur-type sends a POST to auth.url with auth.body
  2. Extracts the JWT token from the response using token_path (dot notation — e.g., "data.token" extracts response.data.token)
  3. Adds Authorization: Bearer <token> to the webhook request
  4. Caches the token in .run/app_token so subsequent calls skip the login
  5. If the webhook returns 401 (token expired), automatically re-authenticates and retries once
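
The five steps above can be sketched as follows. The network calls are passed in as callables so the control flow stays visible; `extract_token`, `send_with_auth` and the `Unauthorized` exception are hypothetical names used for illustration, not the script's actual API.

```python
from functools import reduce
from pathlib import Path


class Unauthorized(Exception):
    """Raised by `send` when the webhook answers with HTTP 401."""


def extract_token(response: dict, token_path: str) -> str:
    """Follow a dot-notation path such as "data.token" into a parsed JSON response."""
    return reduce(lambda node, key: node[key], token_path.split("."), response)


def send_with_auth(send, login, token_file: Path):
    """send(token) performs the webhook POST; login() returns a fresh token."""
    token = token_file.read_text().strip() if token_file.exists() else ""
    if not token:
        token = login()  # first request: no cached token yet
        token_file.write_text(token)
    try:
        return send(token)
    except Unauthorized:
        token = login()  # cached token expired: log in again
        token_file.write_text(token)
        return send(token)  # retry exactly once; a second 401 propagates
```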

Custom body fields

The body object can contain any structure your API expects. Only {{word}} and {{translation}} are replaced — everything else is sent as-is:

{
  "webhook": {
    "url": "https://api.notion.com/v1/pages",
    "headers": {
      "Notion-Version": "2022-06-28",
      "Authorization": "Bearer ntn_your_token"
    },
    "body": {
      "parent": { "database_id": "abc123" },
      "properties": {
        "Word": { "title": [{ "text": { "content": "{{word}}" } }] },
        "Translation": { "rich_text": [{ "text": { "content": "{{translation}}" } }] }
      }
    }
  }
}
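
A minimal sketch of the placeholder substitution, assuming it runs on the parsed body rather than on the raw JSON text (so quotes inside a word cannot break the payload). `fill_placeholders` is a hypothetical helper name.

```python
import re

# Only these two placeholders are recognized; everything else passes through.
_PLACEHOLDER = re.compile(r"\{\{(word|translation)\}\}")


def fill_placeholders(template, values: dict[str, str]):
    """Recursively replace {{word}} and {{translation}} in strings, dicts and lists."""
    if isinstance(template, str):
        return _PLACEHOLDER.sub(lambda m: values[m.group(1)], template)
    if isinstance(template, dict):
        return {key: fill_placeholders(item, values) for key, item in template.items()}
    if isinstance(template, list):
        return [fill_placeholders(item, values) for item in template]
    return template  # numbers, booleans and null are sent as-is
```

For example, calling it on the Notion body above with `{"word": "выдержка", "translation": "endurance"}` fills both nested `content` fields and leaves `database_id` untouched.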

Troubleshooting

403 error from Groq

Groq is most likely rejecting your IP address because it belongs to a datacenter or VPN range. See the VPN Split-Tunnel section.

"Recording too short"

You pressed the hotkey a second time too quickly. Record for at least 1 second before pressing it again.
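
A check like this can be done with the standard `wave` module. The sketch below is an assumption about how such a guard might look, not the script's actual implementation; `MIN_SECONDS`, `wav_duration` and `too_short` are hypothetical names.

```python
import wave

MIN_SECONDS = 1.0  # threshold taken from the advice above


def wav_duration(path) -> float:
    """Length of a WAV file in seconds, computed from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def too_short(path) -> bool:
    """True if the recording is shorter than MIN_SECONDS."""
    return wav_duration(path) < MIN_SECONDS
```

This relies on the recorder having written a complete WAV header before it exited.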

No sound captured

Check that PipeWire is running and your mic is the default source:

pw-record --list-targets

wtype doesn't work

Make sure the target window is a native Wayland client rather than one running under XWayland. Some apps (e.g., Electron with --disable-gpu) may not receive wtype input.

License

MIT
