I created this multi-modal AI Telegram bot for the Tech Syndicate competition. It's currently a demo/prototype that showcases how different AI services can be integrated into a single, cohesive chatbot experience.
This is my take on building a comprehensive AI assistant within Telegram. The bot combines several AI capabilities:
- Smart Text Conversations - Have natural conversations with an AI that remembers your chat history
- Voice Message Processing - Send voice messages that get transcribed and answered
- Voice Responses - Get AI responses as audio messages (text-to-speech)
- Image Generation - Create images from text descriptions (when enabled)
- OpenAI API Compatibility - Works with any AI service that supports the OpenAI API format
I built this using:
- Python 3.8+ with extensive use of async/await for handling multiple operations, Recommended
3.10.6 - python-telegram-bot library for Telegram integration
- Whisper for converting voice messages to text
- Edge-TTS for generating voice responses
- Stable Diffusion for image generation (via Automatic1111 API)
- Environment-based configuration for easy customization
The bot uses a .env file for configuration where you can:
- Enable or disable specific features (text chat, voice, images)
- Set up your AI API endpoints
- Manage authorized users
- Configure paths and settings
After starting the bot, you can:
- Just send text messages to chat
- Send voice messages for voice interactions
- Use
/gento create images from text - Use
/wipehistto clear your conversation history - Check
/statusto see which features are enabled
This was my submission for the Tech Syndicate competition and is currently in the proof-of-concept stage. It demonstrates how multiple AI services can work together, but there's still room for improvement and additional features.
I wanted to create something that was:
- Modular - Easy to enable/disable features
- Compatible - Works with various AI services through OpenAI API format
- User-friendly - Simple configuration and natural interactions
- Extensible - Easy to add new capabilities
This project is licensed under the GNU AGPL 3.0 license. Feel free to explore the code and build upon it, but please share your improvements under the same license.