A lightweight Neovim plugin, written in Lua, that converts text into spoken audio using the Google Cloud Text-to-Speech API or a local model that complies with the OpenAI speech API specification. You can synthesize the entire buffer or just a visual selection, and save the output as .wav or .mp3.
- Buffer or Selection: Synthesize an entire file or just the text highlighted in visual mode.
- Format Support: Automatically saves as .wav or converts to .mp3 using ffmpeg, based on your output file extension.
- Audio Processing: Includes built-in fade-ins, pitch shifting, and time stretching for smoother audio playback.
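For the local engine, "OpenAI-compatible" means the server accepts a JSON POST at a /v1/audio/speech path with model, input, and voice fields. The sketch below builds such a request with curl so a server can be checked outside Neovim. The helper names build_payload and request_speech are hypothetical, the URL, model, and voice are taken from the example configuration, and support for response_format on any given local server is an assumption.

```shell
#!/bin/sh
# Values mirror the example configuration; adjust them to your server.
LOCAL_URL="http://localhost:8080/v1/audio/speech"
LOCAL_MODEL="tts-1"
LOCAL_VOICE="alloy"

# build_payload TEXT: print an OpenAI-style speech request body.
# Hypothetical helper. TEXT must not contain double quotes or backslashes,
# because no JSON escaping is done here.
build_payload() {
    printf '{"model":"%s","input":"%s","voice":"%s","response_format":"wav"}' \
        "$LOCAL_MODEL" "$1" "$LOCAL_VOICE"
}

# request_speech TEXT OUTFILE: POST the body and save the returned audio.
# Hypothetical helper; --fail makes curl exit non-zero on HTTP errors.
request_speech() {
    curl --fail --silent --show-error \
        -H "Content-Type: application/json" \
        -d "$(build_payload "$1")" \
        -o "$2" \
        "$LOCAL_URL"
}
```

With a server running, `request_speech "Hello from Neovim" hello.wav` should write an audio file. If the server requires a key, an `Authorization: Bearer` header would need to be added in the same way as the Content-Type header.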
The plugin relies on a few common command-line tools under the hood: curl, ffmpeg (for MP3 conversion and filters), and rubberband (for audio time-stretching/pitch-shifting).
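A quick way to see which of these tools are already on your PATH is sketched below. The function name check_tools is a hypothetical helper and not part of the plugin.

```shell
#!/bin/sh
# check_tools NAME...: report which commands are available on PATH.
# Returns 0 if all are found, 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The three tools named in this README.
check_tools curl ffmpeg rubberband || echo "Install the missing tools before using the plugin."
```

Note that the Debian/Ubuntu package is named rubberband-cli, but the command it installs is rubberband, which is the name checked here.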
macOS (Homebrew):
brew install curl ffmpeg rubberband
Ubuntu/Debian:
sudo apt-get install curl ffmpeg rubberband-cli
The plugin authenticates using your local gcloud configuration.
- Install the Google Cloud CLI.
- Authenticate and set your project:
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
Install the plugin using your preferred package manager and configure the engine.
lazy.nvim:
{
  'your-username/speech.nvim', -- Or local path: dir = '~/path/to/speech_vim'
  opts = {
    -- Engine selection: 'google' (default) or 'local'
    engine = 'google',
    -- ==== GOOGLE ENGINE SETTINGS ====
    default_voice = 'en-GB-Wavenet-N',
    -- ==== LOCAL ENGINE SETTINGS ====
    -- Use any local OpenAI-compatible API server (e.g., Fish-Speech, F5-TTS)
    local_url = 'http://localhost:8080/v1/audio/speech',
    local_voice = 'alloy', -- Change to your cloned voice profile name
    local_api_key = '', -- Optional, if your local server requires it
    local_model = 'tts-1',
    -- ==== AUDIO PROCESSING SETTINGS ====
    -- Time-stretch factor: output duration relative to the original
    -- (1.0 = normal, 0.5 = half the duration, i.e. double speed)
    factor = 0.81,
    -- Pitch shift in semitones (0.0 = normal, negative = lower, positive = higher)
    pitch_shift = -0.5,
  }
}
packer.nvim:
use {
  'your-username/speech.nvim', -- Or local path: '~/path/to/speech_vim'
  config = function()
    require('speech_vim').setup({
      engine = 'local', -- Example: Switching to a local model
      local_url = 'http://localhost:8080/v1/audio/speech',
      local_voice = 'my_custom_voice',
      factor = 1.0, -- Example: Disable time-stretching
      pitch_shift = 0.0 -- Example: Disable pitch-shifting
    })
  end
}
To create speech from text, run the command:
:SpeechGen audio/file/path.wav
1. Synthesize the entire buffer (defaults to output.wav)
:SpeechGen
2. Synthesize to a specific file and format
If you provide an .mp3 extension, the plugin will use ffmpeg to convert the output.
:SpeechGen ~/Desktop/my_audio.mp3
3. Synthesize selected text
Select text in Visual mode, then type the command:
:'<,'>SpeechGen selection.wav
4. Change pitch or speed temporarily
To change the pitch (or speed) for the current session without restarting Neovim, run this directly in the Neovim command line:
:lua require('speech_vim').config.pitch_shift = 2.0
Then run :SpeechGen as usual.
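To make the factor and pitch_shift settings concrete: the rubberband CLI takes a time ratio via --time (output duration as a multiple of the original) and a pitch change in semitones via --pitch. The sketch below prints the manual command that would correspond to a given pair of settings. It assumes the plugin maps factor and pitch_shift onto these two options, which this README does not confirm, and rubberband_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# rubberband_cmd FACTOR PITCH IN OUT: print a rubberband command line.
# FACTOR is the duration ratio (--time), PITCH is in semitones (--pitch).
rubberband_cmd() {
    printf 'rubberband --time %s --pitch %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Values from the example configuration: factor = 0.81, pitch_shift = -0.5
rubberband_cmd 0.81 -0.5 input.wav output.wav
# prints: rubberband --time 0.81 --pitch -0.5 input.wav output.wav
```

Running the printed command on a .wav file by hand is one way to audition a factor and pitch combination before putting it in the plugin configuration.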
License: MIT