opencode-voice2text

This is a streaming voice input plugin for the OpenCode TUI with a provider-based speech recognition architecture. The current built-in provider is Volcengine ASR.

Press the shortcut once to start recognition. While you speak, audio is streamed continuously to Volcengine, and stable recognized text is appended to the current OpenCode input as it arrives. Press the shortcut again to stop recognition.
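One way to picture the incremental append is as a suffix computation: compare the text already written to the prompt with the latest stable transcript and append only the difference. The sketch below is illustrative only. The helper name pendingSuffix and the rule of skipping revised transcripts are assumptions, not the plugin's actual logic.

```typescript
// Hypothetical helper (not from the plugin source): given the text already
// appended to the prompt and the latest stable transcript, return only the
// part that still needs to be appended.
function pendingSuffix(appended: string, stable: string): string {
  // If the recognizer revised earlier text, append nothing rather than
  // duplicating or rewriting what the user already sees.
  if (stable.slice(0, appended.length) !== appended) return "";
  return stable.slice(appended.length);
}

// Simulated stable transcripts arriving while the user is still speaking.
const stableUpdates = ["open the", "open the file", "open the file and run tests"];
let promptText = "";
for (const stable of stableUpdates) {
  promptText += pendingSuffix(promptText, stable);
}
console.log(promptText);
```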

Features

Start and stop streaming recognition with a single shortcut
Stable recognition results are appended to the input before the session ends
Warning/error toast feedback for misconfiguration or failures
Works on macOS and Linux
Keeps credentials out of the plugin repo

Behavior

First Ctrl+S: start microphone capture and streaming recognition
While speaking: stable recognized text is appended continuously to the current prompt
Second Ctrl+S: stop capture, wait for the final ASR result, then append the remaining tail text
A persistent recording toast stays visible while recording and disappears automatically when recognition stops
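The toggle behavior above can be read as a small state machine. The following is a minimal sketch under assumed names (RecorderState, RecorderEvent, RecorderAction, step); the plugin's real implementation may be organized differently.

```typescript
// Hypothetical state machine for the toggle flow; all names are illustrative.
type RecorderState = "idle" | "recording" | "finalizing";
type RecorderEvent = "shortcut" | "final_result";
type RecorderAction = "start_capture" | "stop_capture" | "append_tail" | "none";

interface Transition {
  state: RecorderState;
  action: RecorderAction;
}

function step(state: RecorderState, event: RecorderEvent): Transition {
  // First Ctrl+S: start microphone capture and streaming recognition.
  if (state === "idle" && event === "shortcut") {
    return { state: "recording", action: "start_capture" };
  }
  // Second Ctrl+S: stop capture and wait for the final ASR result.
  if (state === "recording" && event === "shortcut") {
    return { state: "finalizing", action: "stop_capture" };
  }
  // Final result arrived: append the remaining tail text and go idle.
  if (state === "finalizing" && event === "final_result") {
    return { state: "idle", action: "append_tail" };
  }
  // Anything else (for example a shortcut press while finalizing) is ignored.
  return { state, action: "none" };
}
```

Treating a shortcut press during finalizing as a no-op is a design assumption of this sketch; it keeps a quick third press from starting a new session before the tail text lands.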

Why this is toggle-based

OpenCode's current TUI plugin API supports keybind matching, but it does not expose key release events yet. That means truly reliable "hold to record / release to stop" behavior is not possible in a plugin right now.

Requirements

OpenCode with TUI plugin support
Provider credentials for your selected ASR backend
SoX installed locally (rec on macOS/Linux, sox.exe on Windows)

macOS:

brew install sox

Ubuntu/Debian:

sudo apt install sox

Windows:

Download and install SoX from https://sourceforge.net/projects/sox/
Make sure sox.exe is available in PATH

Verify the install:

sox --version
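For orientation, this is roughly how a capture command for raw PCM audio could be assembled from the rate, bits, and channels settings. It is a sketch, not the plugin's code: the function name buildCaptureCommand and the exact argument order are assumptions. The flags themselves (-q, -d, -t raw, -e signed-integer, -b, -r, -c, and - for stdout) are standard SoX options.

```typescript
interface CaptureFormat {
  rate: number;     // sample rate in Hz
  bits: number;     // bits per sample
  channels: number; // number of audio channels
}

// ASSUMPTION: the plugin may build its SoX invocation differently.
function buildCaptureCommand(
  platform: string,
  format: CaptureFormat
): { command: string; args: string[] } {
  // Output format: headerless signed PCM written to stdout ("-").
  const outputArgs = [
    "-t", "raw",
    "-e", "signed-integer",
    "-b", String(format.bits),
    "-r", String(format.rate),
    "-c", String(format.channels),
    "-",
  ];
  if (platform === "win32") {
    // sox needs an explicit input; -d selects the default audio device.
    return { command: "sox", args: ["-q", "-d", ...outputArgs] };
  }
  // rec records from the default input device implicitly.
  return { command: "rec", args: ["-q", ...outputArgs] };
}

const capture = buildCaptureCommand("linux", { rate: 16000, bits: 16, channels: 1 });
console.log(capture.command, capture.args.join(" "));
```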

Install

Preferred install command:

opencode plugin opencode-voice2text@latest --global

This is the same install style used by opencode-dynamic-context-pruning. The OpenCode CLI installs the npm package and updates your OpenCode plugin config for you.

If you only want it in the current project instead of globally, omit --global:

opencode plugin opencode-voice2text@latest

TUI config

The installer writes a default TUI plugin entry for you with:

commandKeybind: "ctrl+s"

You still need to make sure terminal_suspend does not conflict with your chosen shortcut.

Recommended ~/.config/opencode/tui.json:

{
  "$schema": "https://opencode.ai/tui.json",
  "keybinds": {
    "terminal_suspend": "none"
  }
}

If you want a different shortcut, edit the generated plugin entry in tui.json after installation.

Ctrl+S is the default shortcut. If pressing it does nothing, your terminal is likely intercepting it for XON/XOFF flow control before OpenCode sees the key.

Current shell session fix:

stty -ixon

Persistent fix for zsh:

Add stty -ixon to ~/.zshrc, then restart the terminal.

Persistent fix for bash:

Add stty -ixon to ~/.bashrc or ~/.bash_profile, then restart the terminal.

If you still prefer not to change terminal flow control, override commandKeybind manually in tui.json.

Windows terminals do not use the same Ctrl+S XON/XOFF flow control behavior, so the stty -ixon fix is only relevant on macOS/Linux shells.

Restart OpenCode

If OpenCode is already running, restart it so the plugin and dependency tree are loaded again.

Credentials

Create a local config file on the target machine:

macOS/Linux:

~/.config/opencode/voice2text.local.json

Windows:

%APPDATA%\opencode\voice2text.local.json

{
  "provider": "volcengine",
  "providerConfig": {
    "appId": "your-volcengine-app-id",
    "accessToken": "your-volcengine-access-token",
    "resourceId": "volc.seedasr.sauc.duration",
    "endpoint": "wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async"
  },
  "language": "zh-CN",
  "chunkMs": 200,
  "endWindowSize": 800,
  "maxDurationSeconds": 180,
  "appendTrailingSpace": true,
  "rate": 16000,
  "bits": 16,
  "channels": 1
}

An example template also lives in examples/voice2text.local.example.json.
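To see how the audio settings relate, here is a small worked calculation of how many bytes of uncompressed PCM one chunk holds with the default values. The helper name bytesPerChunk is made up for this example, and it assumes chunkMs is the duration of each streamed audio chunk.

```typescript
interface AudioFormat {
  rate: number;     // samples per second
  bits: number;     // bits per sample
  channels: number; // number of channels
}

// Bytes of raw PCM covered by one chunk of chunkMs milliseconds.
function bytesPerChunk(format: AudioFormat, chunkMs: number): number {
  const bytesPerSecond = format.rate * (format.bits / 8) * format.channels;
  return Math.round((bytesPerSecond * chunkMs) / 1000);
}

// Defaults from the config above: 16 kHz, 16-bit, mono, 200 ms chunks.
const defaultChunkBytes = bytesPerChunk({ rate: 16000, bits: 16, channels: 1 }, 200);
console.log(defaultChunkBytes); // 6400

// With maxDurationSeconds = 180, a full session is at most 900 such chunks.
const maxChunks = (180 * 1000) / 200;
console.log(maxChunks); // 900
```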

Volcengine setup

For the built-in volcengine provider, get the following values from Volcengine before using the plugin (Volcengine ASR product page: https://www.volcengine.com/product/asr):

providerConfig.appId
providerConfig.accessToken
providerConfig.resourceId
providerConfig.endpoint

Typical setup flow:

Open the Volcengine ASR page and sign in to the Volcengine console, registering first if you do not already have an account. Then open the speech recognition (ASR) service page.
Create or select an application.
Get the credentials and resource settings for that application.
Fill the values into your local voice2text.local.json:
  Default path on macOS/Linux: ~/.config/opencode/voice2text.local.json
  Default path on Windows: %APPDATA%\opencode\voice2text.local.json
  resourceId: check the Big Model Streaming Speech Recognition API docs. The recommended value is volc.seedasr.sauc.duration.
  endpoint: use wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async

For this plugin's current Volcengine implementation:

providerConfig.endpoint is typically a websocket endpoint under wss://openspeech.bytedance.com/api/v3/sauc/...
providerConfig.resourceId should match the model/resource you enabled in Volcengine
providerConfig.appId and providerConfig.accessToken must belong to the same Volcengine application

Example:

{
  "provider": "volcengine",
  "providerConfig": {
    "appId": "your-app-id",
    "accessToken": "your-access-token",
    "resourceId": "volc.seedasr.sauc.duration",
    "endpoint": "wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async"
  }
}

If the plugin is triggered without valid Volcengine credentials, it will show a warning toast instead of failing silently.
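A check of this kind could look like the sketch below, which reports which providerConfig fields are missing so a warning can name them. The function name missingKeys and the assumption that all four fields are required are illustrative; the plugin's actual validation may differ.

```typescript
interface VolcengineProviderConfig {
  appId?: string;
  accessToken?: string;
  resourceId?: string;
  endpoint?: string;
}

// ASSUMPTION: all four values are treated as required in this sketch.
const REQUIRED_KEYS = ["appId", "accessToken", "resourceId", "endpoint"] as const;

// Returns the names of required fields that are absent or blank.
function missingKeys(config: VolcengineProviderConfig): string[] {
  return REQUIRED_KEYS.filter((key) => !config[key]?.trim());
}

const problems = missingKeys({ appId: "my-app", accessToken: "  " });
if (problems.length > 0) {
  // The real plugin would surface this as an OpenCode warning toast.
  console.log("Missing providerConfig values: " + problems.join(", "));
}
```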

You can override the config path with:

export OPENCODE_VOICE2TEXT_LOCAL_CONFIG=/path/to/voice2text.local.json
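The lookup order described in this README (explicit override first, then the platform default) can be sketched as a pure function. The name resolveConfigPath is hypothetical, and the use of HOME and APPDATA to expand the default locations is an assumption about how the plugin resolves them.

```typescript
type EnvVars = Record<string, string | undefined>;

// Hypothetical resolver: explicit override wins, otherwise platform default.
function resolveConfigPath(platform: string, env: EnvVars): string {
  const override = env["OPENCODE_VOICE2TEXT_LOCAL_CONFIG"];
  if (override) return override;
  if (platform === "win32") {
    // %APPDATA%\opencode\voice2text.local.json
    return (env["APPDATA"] ?? "") + "\\opencode\\voice2text.local.json";
  }
  // ~/.config/opencode/voice2text.local.json
  return (env["HOME"] ?? "") + "/.config/opencode/voice2text.local.json";
}

console.log(resolveConfigPath("linux", { HOME: "/home/me" }));
```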

Environment variables

These can override or replace values from the local config file:

export OPENCODE_VOICE2TEXT_PROVIDER=volcengine
export OPENCODE_VOICE2TEXT_LANGUAGE=zh-CN
export OPENCODE_VOICE2TEXT_CHUNK_MS=200
export OPENCODE_VOICE2TEXT_END_WINDOW_SIZE=800
export OPENCODE_VOICE2TEXT_MAX_DURATION_SECONDS=180
export OPENCODE_VOICE2TEXT_APPEND_TRAILING_SPACE=true
export OPENCODE_VOICE2TEXT_SAMPLE_RATE=16000
export OPENCODE_VOICE2TEXT_BITS=16
export OPENCODE_VOICE2TEXT_CHANNELS=1
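Because environment variables are always strings, numeric and boolean options have to be parsed before they can replace values from the config file. The sketch below shows one way to do that. The function names and the choice to fall back to the file value when parsing fails are assumptions, not the plugin's confirmed behavior.

```typescript
interface RuntimeOptions {
  provider: string;
  language: string;
  chunkMs: number;
  endWindowSize: number;
  maxDurationSeconds: number;
  appendTrailingSpace: boolean;
  rate: number;
  bits: number;
  channels: number;
}

type EnvMap = Record<string, string | undefined>;

// Parse a numeric variable; keep the fallback when unset or not a number.
function envNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return isFinite(parsed) ? parsed : fallback;
}

// Parse a boolean variable; only "true" (any case) counts as true.
function envBoolean(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw.trim() === "") return fallback;
  return raw.trim().toLowerCase() === "true";
}

// Environment values take precedence over values from the local config file.
function applyEnvOverrides(fileOptions: RuntimeOptions, env: EnvMap): RuntimeOptions {
  const p = "OPENCODE_VOICE2TEXT_";
  return {
    provider: env[p + "PROVIDER"] || fileOptions.provider,
    language: env[p + "LANGUAGE"] || fileOptions.language,
    chunkMs: envNumber(env[p + "CHUNK_MS"], fileOptions.chunkMs),
    endWindowSize: envNumber(env[p + "END_WINDOW_SIZE"], fileOptions.endWindowSize),
    maxDurationSeconds: envNumber(env[p + "MAX_DURATION_SECONDS"], fileOptions.maxDurationSeconds),
    appendTrailingSpace: envBoolean(env[p + "APPEND_TRAILING_SPACE"], fileOptions.appendTrailingSpace),
    // Note: the file key is "rate" but the variable is SAMPLE_RATE.
    rate: envNumber(env[p + "SAMPLE_RATE"], fileOptions.rate),
    bits: envNumber(env[p + "BITS"], fileOptions.bits),
    channels: envNumber(env[p + "CHANNELS"], fileOptions.channels),
  };
}
```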

Legacy flat environment variables are still supported for the built-in Volcengine provider:

export OPENCODE_VOICE2TEXT_APP_ID=...
export OPENCODE_VOICE2TEXT_ACCESS_TOKEN=...
export OPENCODE_VOICE2TEXT_RESOURCE_ID=volc.seedasr.sauc.duration
export OPENCODE_VOICE2TEXT_ENDPOINT=wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async
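Conceptually, the legacy flat variables map onto the nested providerConfig keys. The sketch below illustrates that mapping. The function name mergeLegacyEnv is hypothetical, and the precedence shown (a set environment variable replaces the file value) is an assumption based on the override behavior described above.

```typescript
// Legacy flat variable name -> providerConfig key.
const LEGACY_ENV_KEYS: Record<string, string> = {
  OPENCODE_VOICE2TEXT_APP_ID: "appId",
  OPENCODE_VOICE2TEXT_ACCESS_TOKEN: "accessToken",
  OPENCODE_VOICE2TEXT_RESOURCE_ID: "resourceId",
  OPENCODE_VOICE2TEXT_ENDPOINT: "endpoint",
};

// ASSUMPTION: a non-empty legacy variable replaces the value from the file.
function mergeLegacyEnv(
  providerConfig: Record<string, string>,
  env: Record<string, string | undefined>
): Record<string, string> {
  const merged: Record<string, string> = { ...providerConfig };
  for (const envName of Object.keys(LEGACY_ENV_KEYS)) {
    const value = env[envName];
    if (value) merged[LEGACY_ENV_KEYS[envName]] = value;
  }
  return merged;
}
```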

Plugin options

You can pass the same runtime options through tui.json:

commandKeybind
provider
providerConfig
language
chunkMs
endWindowSize
maxDurationSeconds
appendTrailingSpace
rate
bits
channels

In practice, credentials are best kept in the local config file or environment variables rather than in tui.json.

Provider design

The config is now provider-oriented so more ASR backends can be added later without changing the install shape.

current provider: volcengine
future providers can reuse the same plugin entry and TUI behavior
provider-specific secrets now live under providerConfig

To add a new provider in code:

add a new file under src/providers/
implement the VoiceProvider interface from src/providers/types.ts
register it in src/providers/index.ts
use provider + providerConfig in local config
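The registration steps above might look something like the following sketch. The real VoiceProvider interface in src/providers/types.ts is not shown in this README, so the interface here (VoiceProviderSketch), its members, and the registry helpers are all hypothetical stand-ins.

```typescript
// ASSUMPTION: a stand-in for the real VoiceProvider interface, which may
// have different members.
interface VoiceProviderSketch {
  readonly id: string;
  // Returns the names of required providerConfig keys that are missing.
  validate(providerConfig: Record<string, unknown>): string[];
}

// Hypothetical registry, playing the role of src/providers/index.ts.
const providerRegistry: Record<string, VoiceProviderSketch> = {};

function registerProvider(provider: VoiceProviderSketch): void {
  providerRegistry[provider.id] = provider;
}

function getProvider(id: string): VoiceProviderSketch {
  const provider = providerRegistry[id];
  if (!provider) throw new Error("Unknown provider: " + id);
  return provider;
}

// A made-up provider that only needs an apiKey.
registerProvider({
  id: "example",
  validate: (providerConfig) =>
    typeof providerConfig["apiKey"] === "string" ? [] : ["apiKey"],
});

// The "provider" value from local config selects the implementation.
console.log(getProvider("example").validate({}));
```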

If provider config is missing, pressing the shortcut shows a toast explaining which local config file to fill instead of failing silently.

Development

Install dependencies and build:

npm install
npm run build

Type-check only:

npm run typecheck

Publishing

Automatic publish from GitHub Actions

This repository now includes .github/workflows/publish.yml.

It is configured for npm trusted publishing with GitHub Actions OIDC, so you do not need to store a long-lived NPM_TOKEN in GitHub.

Behavior:

every push to master runs typecheck and build
the workflow checks whether package.json's current name@version already exists on npm
if that version is not published yet, it runs npm publish
if that version already exists, the workflow exits cleanly without failing

Required npm setup:

add this repository as a trusted publisher for the npm package

On npmjs.com, open the package settings for opencode-voice2text, then configure:

Trusted Publisher
provider: GitHub Actions
owner: chenxuan520
repository: opencode-voice2text
workflow filename: publish.yml

Important release rule:

before pushing to master, bump package.json version if you want a new npm release
if you push code without changing the version, CI will skip publishing because npm versions are immutable

Version bump examples:

npm version patch

or:

npm version minor

Manual publish

npm publish

prepublishOnly runs the build automatically.

For emergency manual publishing, use your local npm login or a short-lived bypass-2FA token locally. Do not store long-lived publish tokens in GitHub Actions when trusted publishing is enabled.

Notes

The built-in Volcengine provider uses Volcengine's websocket ASR protocol directly.
Success toasts are intentionally not shown; recording state uses a long-lived toast that disappears after stop.
Errors still surface as OpenCode toasts.
opencode plugin ... updates the plugin entry in tui.json, but does not replace unrelated TUI settings such as theme or keybinds.
