st

st is a Go CLI for speech-to-text (stt) and text-to-speech (tts) with pluggable provider integrations.

This repository currently includes:

OpenAI provider (official Go SDK)
Batch and streaming transcription
ffmpeg fallback conversion for unsupported input file extensions
Disk-backed TOML config at ~/.st/config.toml

Install

go build -o st ./cmd/st

Initialize config

./st config init

This creates ~/.st/config.toml.

Set your API key in one of two ways:

In config: openai.api_key = "..."
Via env var (default): OPENAI_API_KEY

Usage

Transcribe audio (batch)

./st stt ./audio.mp3
./st stt ./audio.wav -o transcript.txt

Transcribe audio (streaming)

./st stt ./audio.mp3 --stream

Synthesize speech from text file

./st tts ./script.txt -o speech.mp3

Synthesize speech from raw text

./st tts -t "hello? can you hear me?" > speech.mp3
./st tts -t "hello? can you hear me?" -o speech.mp3

Provider architecture

Providers implement internal/providers.Provider and register themselves via providers.Register(name, factory).

Add another provider by:

Implementing the interface in a new package under internal/providers/<name>
Registering it in init()
Adding provider-specific config section support
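
The registration steps above follow a common Go registry pattern, which might look something like the sketch below. Only the names Provider and Register(name, factory) come from this README. The interface method, the Factory signature, the New lookup function, and echoProvider are all assumptions made for illustration, and the real definitions in internal/providers may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Provider is a hypothetical stand-in for internal/providers.Provider.
// The real interface's methods are not shown in this README.
type Provider interface {
	Name() string
}

// Factory builds a Provider. The real factory signature may differ.
type Factory func() (Provider, error)

var (
	mu       sync.RWMutex
	registry = map[string]Factory{}
)

// Register makes a provider factory available under the given name.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	registry[name] = f
}

// New is an assumed lookup function: it finds the factory registered
// under name and uses it to build a provider.
func New(name string) (Provider, error) {
	mu.RLock()
	f, ok := registry[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", name)
	}
	return f()
}

// echoProvider is an illustrative provider, not part of st.
type echoProvider struct{}

func (echoProvider) Name() string { return "echo" }

// init registers the provider when the package is loaded, mirroring the
// "Registering it in init()" step.
func init() {
	Register("echo", func() (Provider, error) { return echoProvider{}, nil })
}

func main() {
	p, err := New("echo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using provider:", p.Name())
}
```

With this pattern, a provider package only becomes available once something imports it, usually through a blank import in the main package.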

Notes

The OpenAI upload limit is 25 MB per transcription request.
If an input file's extension is unsupported, st automatically attempts to convert the file to WAV with ffmpeg.
Without -o, commands write to stdout.
