| title | Voice AI Assistant |
|---|---|
| emoji | 🎙️ |
| colorFrom | blue |
| colorTo | purple |
| sdk | streamlit |
| sdk_version | 1.28.0 |
| app_file | app.py |
| pinned | false |
A voice-enabled AI chatbot built with Streamlit and Google Gemini API, featuring automatic speech-to-text transcription and multi-language support.
This project is deployed on multiple platforms:
- Hugging Face Spaces: https://huggingface.co/spaces/sam-codecub/voice-ai-assistant
- GitHub: https://github.com/CodeCubCA/voice-ai-assistant-samCodeCub
Try the live demo on Hugging Face Spaces!
- Voice Input: Click-to-record voice input with automatic transcription
- Auto-Send: Messages are automatically sent after voice transcription
- Multi-Language Support: Supports 12 languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Hindi, and Arabic
- Voice Commands: Control the app with voice commands like "clear chat" or "change personality"
- Custom Personalities: Currently features a Clash Royale themed AI personality
- Chat History: Full conversation history with message display
- Manual Text Input: Type messages manually with a text area and send button
- Streamlit: Web interface framework
- Google Gemini API: AI language model (gemini-2.5-flash)
- SpeechRecognition: Speech-to-text conversion using Google's speech recognition
- audio-recorder-streamlit: Browser-based audio recording component
- python-dotenv: Environment variable management
- Clone the repository:
git clone https://github.com/CodeCubCA/voice-ai-assistant-samCodeCub.git
cd voice-ai-assistant-samCodeCub- Install dependencies:
pip3 install -r requirements.txt- Set up your environment variables:
- Copy
.env.exampleto.env - Add your Google Gemini API key to
.env:
- Copy
GEMINI_API_KEY=your_api_key_here
- Run the application:
streamlit run app.py- Click the microphone button
- Speak clearly when recording
- Click again to stop recording
- Your message will be automatically transcribed and sent
- Type your message in the text area
- Click the "Send" button
Speak these commands to control the app:
- Clear Chat: "Clear chat", "Clear conversation", or "Delete history"
- Change Personality: "Change personality to Clash Royale"
Choose your preferred language for voice recognition from the sidebar. Supported languages:
- English (US/UK)
- Spanish
- French
- German
- Italian
- Portuguese
- Chinese (Mandarin)
- Japanese
- Korean
- Hindi
- Arabic
Select different AI personalities from the sidebar (currently Clash Royale themed).
- Speak in a quiet environment
- Be close to your microphone
- Speak clearly at normal pace
- Use short sentences
- Grant microphone permissions in your browser
voice-ai-assistant/
├── app.py # Main application file
├── requirements.txt # Python dependencies
├── .env # Environment variables (not in git)
├── .env.example # Environment variables template
├── .gitignore # Git ignore file
├── README.md # This file
└── test files/ # Testing utilities
├── test_mic.html
├── test_audio_simple.py
└── test_voice.py
- Python 3.7+
- Modern web browser with microphone support
- Internet connection for speech recognition and AI API
- Google Gemini API key
Voice input not working:
- Check browser microphone permissions
- Try the
test_mic.htmlfile to verify microphone access - Ensure no other app is using your microphone
Transcription errors:
- Speak more clearly and slowly
- Reduce background noise
- Try a different language setting
- Check your internet connection
API errors:
- Verify your Gemini API key in
.env - Check API quota limits
- Ensure internet connection is stable
This project is for educational purposes.
Built with Streamlit and Google Gemini API