Whisper transcription, translation, and TTS in a single C++ header.
- Transcribe audio files (mp3, wav, m4a, ogg, webm) via OpenAI Whisper
- Translate audio to English
- Text-to-speech with 6 voices and 4 formats
- Single-header, C++17, namespace `llm`
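The input formats in the feature list can be checked locally before a file is uploaded. The following is a minimal sketch of such a check; `has_supported_audio_extension` is a hypothetical helper written for this example and is not part of `llm_audio.hpp`. It only inspects the file name, so it says nothing about whether the file contents are valid audio.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper (not provided by llm_audio.hpp): returns true when the
// file name ends in one of the extensions named in the feature list.
inline bool has_supported_audio_extension(const std::string& filename) {
    static constexpr std::array<std::string_view, 5> kExtensions = {
        "mp3", "wav", "m4a", "ogg", "webm"};

    const auto dot = filename.find_last_of('.');
    if (dot == std::string::npos || dot + 1 == filename.size()) {
        return false;  // no extension, or the name ends with a dot
    }

    // Lower-case the extension so "AUDIO.MP3" is accepted as well.
    std::string ext = filename.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });

    return std::find(kExtensions.begin(), kExtensions.end(), ext) !=
           kExtensions.end();
}
```

A caller could run this check first and skip the network request entirely when it returns `false`.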
Quick start:

```cpp
#define LLM_AUDIO_IMPLEMENTATION
#include "llm_audio.hpp"

#include <iostream>

int main() {
    // Member assignment is used here because designated initializers
    // (`{ .api_key = ... }`) require C++20, while this library targets C++17.
    llm::TranscribeConfig cfg;
    cfg.api_key = "sk-...";

    auto result = llm::transcribe("audio.mp3", cfg);
    std::cout << result.text;
}
```

API:

```cpp
TranscribeResult transcribe(const std::string& filepath, const TranscribeConfig&);
TranscribeResult transcribe_bytes(const std::vector<uint8_t>&, const std::string& filename, const TranscribeConfig&);
void text_to_speech(const std::string& text, const std::string& output_path, const TTSConfig&);
std::vector<uint8_t> text_to_speech_bytes(const std::string& text, const TTSConfig&);
```

Build:

```
cmake -B build && cmake --build build
```

Requires libcurl (with vcpkg: `vcpkg install curl`).
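The `_bytes` variants of the API take and return `std::vector<uint8_t>` buffers, so the caller is responsible for getting audio into and out of memory. A minimal sketch of that file I/O follows; `read_file_bytes` and `write_file_bytes` are hypothetical helpers written for this example and are not part of `llm_audio.hpp`.

```cpp
#include <cstdint>
#include <fstream>
#include <ios>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read a whole file into a byte buffer of the kind
// that transcribe_bytes() accepts.
inline std::vector<std::uint8_t> read_file_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open for reading: " + path);
    }
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Hypothetical helper: write a byte buffer, such as the one returned by
// text_to_speech_bytes(), to disk, replacing any existing file.
inline void write_file_bytes(const std::string& path,
                             const std::vector<std::uint8_t>& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        throw std::runtime_error("cannot open for writing: " + path);
    }
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    if (!out) {
        throw std::runtime_error("write failed: " + path);
    }
}
```

With the header included, these would plug in as `llm::transcribe_bytes(read_file_bytes("audio.mp3"), "audio.mp3", cfg)` and `write_file_bytes(output_path, llm::text_to_speech_bytes("Hello", tts_cfg))`, where `cfg` and `tts_cfg` are already-configured `TranscribeConfig` and `TTSConfig` values and the output file extension matches whichever format the `TTSConfig` selects.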
MIT — Mattbusel, 2026
| Repo | Purpose |
|---|---|
| llm-stream | SSE streaming |
| llm-cache | Response caching |
| llm-cost | Token cost estimation |
| llm-retry | Retry + circuit breaker |
| llm-format | Markdown/code formatting |
| llm-embed | Embeddings + cosine similarity |
| llm-pool | Connection pooling |
| llm-log | Structured logging |
| llm-template | Prompt templates |
| llm-agent | Tool-use agent loop |
| llm-rag | Retrieval-augmented generation |
| llm-eval | Output evaluation |
| llm-chat | Multi-turn chat |
| llm-vision | Vision/image inputs |
| llm-mock | Mock LLM for testing |
| llm-router | Model routing |
| llm-guard | Content moderation |
| llm-compress | Prompt compression |
| llm-batch | Batch processing |
| llm-audio | Audio transcription/TTS |
| llm-finetune | Fine-tuning jobs |
| llm-rank | Passage reranking |
| llm-parse | HTML/markdown parsing |
| llm-trace | Distributed tracing |
| llm-ab | A/B testing |
| llm-json | JSON parsing/building |