
tg-agents-wrapper

License: MIT

Telegram bot that wraps Claude Code (Agent SDK) and OpenAI Codex into a conversational interface with voice I/O, message batching, session management, and context monitoring.

Built with Bun and grammY.

Features

  • Multi-engine -- switch between Claude and Codex mid-conversation with /engine
  • Voice I/O -- Whisper transcription + dual TTS pipeline (ElevenLabs cloud or Kokoro local)
  • Message batching -- collects rapid-fire Telegram messages into a single prompt (configurable delay)
  • Context monitoring -- track token usage, cache stats, and context window fill via /context
  • Wet proxy integration -- routes Claude API calls through wet for context compression; automatic healthcheck every 30 s with silent fallback to direct Anthropic if wet is down; /context shows compression stats (items compressed, tokens saved)
  • Session resilience -- stale session UUID recovery: the adapter buffers the new session ID until the first successful API response, then commits it; dead sessions are auto-cleared so the next query starts fresh without a manual /start
  • Session persistence -- sessions survive bot restarts (JSON file)
  • Photo and document support -- send images and files directly to the AI
  • Graceful shutdown -- saves sessions and aborts running queries on SIGINT/SIGTERM
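The stale-session recovery described above can be sketched as a small buffer-then-commit state machine. This is an illustrative sketch only; the class and method names are hypothetical, not the actual src/session API:

```typescript
// Sketch (hypothetical names): hold a newly reported session ID as "pending"
// until the first successful API response, so a failed resume never
// overwrites a known-good session ID.
class SessionIdBuffer {
  private committed: string | null = null;
  private pending: string | null = null;

  // Called when the engine reports a (possibly new) session ID.
  stage(id: string): void {
    this.pending = id;
  }

  // Called after the first successful API response: the staged ID is trusted.
  commitPending(): void {
    if (this.pending !== null) {
      this.committed = this.pending;
      this.pending = null;
    }
  }

  // Called on a stale-session error: drop everything so the next query
  // starts a fresh session without a manual /start.
  clear(): void {
    this.committed = null;
    this.pending = null;
  }

  get current(): string | null {
    return this.committed;
  }
}
```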

Quick Start

Prerequisites

  • Bun runtime
  • Telegram bot token from @BotFather
  • Anthropic API key (for Claude engine)
  • Claude Code CLI installed (the Claude adapter spawns it as a subprocess)
  • OpenAI API key (optional -- for Codex engine and Whisper voice transcription)

Installation & Setup

# Clone the repository
git clone https://github.com/buildoak/tg-agents-wrapper.git
cd tg-agents-wrapper

# Install dependencies
bun install

# Configure environment
cp .env.example .env
# Edit .env and fill in your API keys

# Start the bot
bun run start

Commands

| Command | Description |
|---|---|
| /start | Start a new session (resets the previous one) |
| /stop | Stop the current session |
| /interrupt | Abort the running query, keep the session |
| /engine [claude\|codex] | Show or switch the AI engine |
| /context | Show token usage and context window stats |
| /effort [level] | Set reasoning effort (minimal/low/medium/high/xhigh/max) |
| /mode | Switch between text and voice modes |
| /voice [id] | Show or change the TTS voice ID |
| /batch | Toggle batch delay (15s quick / 2m long) |
| /thinking | Toggle visibility of model thinking/reasoning |
| /status | Show session info (engine, effort, mode, idle time) |
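The message batching behind /batch is essentially a debounce: each incoming message resets a timer, and when the quiet period elapses the buffered messages are flushed as one prompt. A minimal sketch, with hypothetical names (the real logic lives in src/buffer/):

```typescript
// Sketch: collect rapid-fire messages and flush them as one prompt after a
// quiet period. Each push() resets the timer, so the buffer only flushes
// once the user has stopped sending for `delayMs`.
class MessageBuffer {
  private parts: string[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private delayMs: number,
    private onFlush: (prompt: string) => void,
  ) {}

  push(text: string): void {
    this.parts.push(text);
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  flush(): void {
    if (this.timer) clearTimeout(this.timer);
    this.timer = null;
    if (this.parts.length === 0) return;
    const prompt = this.parts.join("\n");
    this.parts = [];
    this.onFlush(prompt);
  }
}
```

Toggling /batch would then amount to swapping delayMs between the 15s and 2m presets.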

Architecture

Telegram --> grammy Bot --> Handlers (text/voice/photo/document)
                                |
                                v
                          BufferManager (configurable batch delay)
                                |
                                v
                          processQuery() --> EngineAdapter.query()
                                |                   |
                                |         +---------+----------+
                                |         v                    v
                                |  ClaudeAdapter          CodexAdapter
                                |  (Agent SDK)            (Codex SDK)
                                |         |                    |
                                v         v                    v
                         NormalizedEvent stream (unified interface)
                                |
                        +-------+----------------+
                        v       v                v
                   text.delta  tool.started   usage --> context tracking
                   text.done   tool.completed
                   done

Key modules:

  • src/engine/interface.ts -- EngineAdapter contract and NormalizedEvent types
  • src/engine/claude.ts -- Claude Agent SDK wrapper
  • src/engine/codex.ts -- Codex SDK wrapper (thread-based)
  • src/handlers/query.ts -- event loop consuming NormalizedEvent stream
  • src/session/ -- session store, context monitor, lifecycle management
  • src/buffer/ -- message batching and media group collection
  • src/voice/ -- transcription (Whisper) and TTS (ElevenLabs + Kokoro)
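The event names in the diagram suggest a discriminated union that both adapters emit. The shape below is an assumption based on the diagram, not the actual contract in src/engine/interface.ts:

```typescript
// Assumed unified event shape (field names are guesses beyond `type`).
type NormalizedEvent =
  | { type: "text.delta"; text: string }
  | { type: "text.done"; text: string }
  | { type: "tool.started"; name: string }
  | { type: "tool.completed"; name: string }
  | { type: "usage"; inputTokens: number; outputTokens: number }
  | { type: "done" };

// Accumulate streamed text the way a handler loop might.
function collectText(events: NormalizedEvent[]): string {
  let out = "";
  for (const ev of events) {
    switch (ev.type) {
      case "text.delta":
        out += ev.text;
        break;
      case "done":
        return out;
    }
  }
  return out;
}
```

The point of the union is that src/handlers/query.ts can consume either engine through one switch statement, regardless of which SDK produced the events.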

Voice I/O

Voice input uses OpenAI Whisper for transcription. Voice output supports two TTS backends:

  • ElevenLabs (cloud) -- requires ELEVENLABS_API_KEY in .env
  • Kokoro (local, optional) -- on-device TTS via MLX on Apple Silicon, no API key needed

Local TTS setup (Kokoro)

pip install mlx-audio    # MLX-based Kokoro model
brew install ffmpeg       # WAV → OGG/Opus conversion

The bot auto-detects Kokoro availability at startup and logs the result. If Kokoro deps are missing, it falls back to ElevenLabs. Without either, TTS is disabled but the bot works fine for text.
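The startup fallback order described above (local Kokoro, else ElevenLabs, else TTS disabled) reduces to a small selection function. A sketch with hypothetical names:

```typescript
// Sketch of the TTS backend fallback order: prefer local Kokoro when its
// deps are present, fall back to ElevenLabs when an API key is configured,
// otherwise disable TTS and run text-only.
type TtsBackend = "kokoro" | "elevenlabs" | "disabled";

function pickTtsBackend(
  kokoroAvailable: boolean,
  elevenLabsKey?: string,
): TtsBackend {
  if (kokoroAvailable) return "kokoro";
  if (elevenLabsKey) return "elevenlabs";
  return "disabled";
}
```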

Configuration

All configuration is via environment variables. See .env.example for the full list.

| Variable | Required | Description |
|---|---|---|
| TGBOT_API_KEY | Yes | Telegram Bot API token |
| TGBOT_ALLOWED_USERS | Yes | Comma-separated Telegram user IDs |
| ANTHROPIC_API_KEY | Yes | Anthropic API key for Claude |
| OPENAI_API_KEY | No | OpenAI API key (Codex engine + Whisper) |
| WORKING_DIR | No | Working directory for AI engines (default: ./) |
| DEFAULT_ENGINE | No | Default engine: claude or codex (default: claude) |
| CLAUDE_MODEL | No | Claude model to use (default: claude-opus-4-6) |
| CODEX_MODEL | No | Codex model to use (default: gpt-5.4) |
| DEFAULT_REASONING_EFFORT | No | Reasoning effort level: minimal/low/medium/high/xhigh/max (default: high) |
| BOT_NAME | No | Display name in bot messages (default: Bot) |
| ELEVENLABS_API_KEY | No | ElevenLabs API key for cloud TTS |
| ELEVENLABS_PUBLIC_OWNER_ID | No | ElevenLabs public owner ID for shared voices |
| DEFAULT_ELEVENLABS_VOICE_ID | No | ElevenLabs voice ID for TTS |
| ELEVENLABS_VOICE_NAME | No | ElevenLabs voice name for display |
| DEFAULT_KOKORO_VOICE | No | Kokoro voice ID (default: af_heart) |
| TG_SESSION_DIR | No | Session storage directory (default: /tmp/tg-agents-wrapper) |
| TG_FILES_DIR | No | File storage directory (default: /tmp/tg-agents-wrapper-files) |
| WET_PORT | No | Port for the wet context compression proxy; enables /context compression stats and automatic healthcheck with fallback to the direct API |
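A minimal .env covering just the required variables might look like this (all values below are placeholders, not real tokens):

```
TGBOT_API_KEY=123456:ABC-your-bot-token
TGBOT_ALLOWED_USERS=11111111,22222222
ANTHROPIC_API_KEY=your-anthropic-key
DEFAULT_ENGINE=claude
```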

Development

# Run in development mode with auto-reload
bun run dev

# Type check
bun run check

License

MIT

Author

Nick Oak (buildoak)
