This guide will help you set up the voice chatbot system that uses Whisper for speech recognition, Ollama Mistral for language processing, and CSM for speech synthesis.
- Python 3.8+ installed
- Ollama installed and running locally (https://ollama.com/)
- Git
- CUDA-compatible GPU recommended (for faster processing)
- Install Ollama from https://ollama.com/
- Pull the Mistral model:
ollama pull mistral
- Start the Ollama server:
ollama serve
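To confirm the server and model are reachable before running the chatbot, you can send a test request to Ollama's local HTTP API. This is just a quick sanity check, not part of the project itself; it assumes Ollama is serving on its default port 11434:

# Optional sanity check: ask the local Ollama server for a short reply.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])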
- Clone the CSM repository:
git clone https://github.com/ruapotato/csm-buddy
cd csm-buddy
git submodule update --init --recursive
- Create a virtual environment and install CSM dependencies:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
The script will automatically download the Whisper model the first time you run it, but you can pre-download it:
from transformers import WhisperProcessor, WhisperForConditionalGeneration
# Download the model (this will cache it for future use)
processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
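Once cached, you can test the model on an existing recording. The snippet below continues from the code above and is only a standalone sketch: the 16 kHz mono file name "test.wav" is an assumption, and the chatbot scripts themselves handle recording for you.

# Standalone transcription test (assumes a 16 kHz mono WAV file named "test.wav").
import soundfile as sf

audio, sample_rate = sf.read("test.wav")
inputs = processor(audio, sampling_rate=sample_rate, return_tensors="pt")
predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])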
- Make sure Ollama is running with the Mistral model available
- Make sure you have a working microphone
- Run the voice chatbot script:
python main.py
Press Enter when asked, speak, then press Enter again when done.
For streamed replies, run:
python main_streaming.py
- The system will greet you when it starts
- Press Enter to start recording your voice
- Speak your message
- Press Enter again to stop recording
- The system will (see the sketch after this list):
- Show a "LISTENING" debug message while recording
- Show a "TRANSCRIBING" debug message while processing your speech
- Show a "THINKING" debug message while querying Ollama
- Show a "SPEAKING" debug message while generating and playing the response
- To exit, say "goodbye" or press Ctrl+C
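For orientation, here is a rough sketch of the record → transcribe → think → speak loop described above. This is not the project's main.py: the recording helper, the 16 kHz sample rate, and printing the reply instead of CSM speech playback are assumptions for illustration only.

# A rough sketch of the LISTENING -> TRANSCRIBING -> THINKING -> SPEAKING loop.
# This is NOT the project's main.py: the recording helper, the 16 kHz sample
# rate, and printing the reply instead of CSM playback are assumptions.
import numpy as np
import requests
import sounddevice as sd
from transformers import WhisperProcessor, WhisperForConditionalGeneration

SAMPLE_RATE = 16000
processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

def record_until_enter():
    """Record microphone audio until the user presses Enter."""
    print("LISTENING (press Enter to stop)")
    chunks = []
    def callback(indata, frames, time, status):
        chunks.append(indata.copy())
    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1, callback=callback):
        input()
    return np.concatenate(chunks).flatten()

def transcribe(audio):
    """Run Whisper on the recorded audio and return the transcript."""
    print("TRANSCRIBING")
    features = processor(audio, sampling_rate=SAMPLE_RATE, return_tensors="pt").input_features
    return processor.batch_decode(model.generate(features), skip_special_tokens=True)[0]

def ask_mistral(prompt):
    """Send the transcript to the local Ollama server and return Mistral's reply."""
    print("THINKING")
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral", "prompt": prompt, "stream": False},
        timeout=120,
    )
    return resp.json()["response"]

while True:
    input("Press Enter to start recording...")
    text = transcribe(record_until_enter())
    print("You said:", text)
    if "goodbye" in text.lower():
        break
    print("SPEAKING")
    print(ask_mistral(text))  # the real scripts hand this reply to CSM for audio playback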
- Audio recording issues: Make sure your microphone is properly connected and configured as the default input device
- Ollama connection error: Ensure Ollama is running and the Mistral model is downloaded
- CUDA/GPU errors: If you encounter GPU-related errors, try modifying the script to use CPU only
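If GPU initialization is the source of the errors, running on the CPU is the simplest workaround. A minimal sketch, assuming the Whisper model is loaded with transformers as shown earlier (variable names in the actual scripts may differ):

# Keep the model on the CPU instead of CUDA.
from transformers import WhisperForConditionalGeneration

device = "cpu"  # use "cuda" only when a working GPU is available
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small").to(device)
# Any input tensors must be moved to the same device, e.g. input_features.to(device)

Setting the environment variable CUDA_VISIBLE_DEVICES="" before launching the script is another way to hide the GPU from PyTorch.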