Talk to advanced AI models with fluid, natural conversation—right from your terminal
Welcome to the OpenAI Realtime API Console, a command-line interface that brings OpenAI's Realtime API to your terminal. Inspired by the official OpenAI Realtime Console, this Python implementation offers a responsive way to experience conversational AI.
Imagine having a conversation with an advanced AI system with the natural flow and immediacy of talking to a person. This console makes that possible by enabling real-time voice interactions with language models through a simple terminal interface.
Whether you're a developer exploring conversational AI, a researcher studying human-AI interaction, or simply curious about real-time language models, this console gives you a hands-on way to try them.
- 🎙️ Intuitive Push-to-Talk: Press and hold the SPACE bar to speak, release to send your message. Experience conversation as it should be: fluid, natural, and effortless.
- 🔊 Crystal-Clear Audio Response: Hear the AI's voice with exceptional clarity through real-time audio playback, creating an immersive conversational experience.
- ⚡ Lightning-Fast Responsiveness: Built on an efficient asynchronous architecture that ensures smooth, non-blocking communication with minimal latency.
- 💾 Audio Recording & Playback: Save conversations as WAV files for future reference, analysis, or sharing with colleagues.
- ⌨️ Text & Voice Flexibility: Seamlessly switch between voice and text input based on your preference or environment.
- 🧹 Clean Interface Utilities: Maintain a distraction-free workspace with built-in terminal buffer clearing functionality.
- 🔧 Extensive Configurability: Fine-tune your AI interactions with granular control over:
  - Temperature (creativity level)
  - Voice selection (multiple options available)
  - System instructions
  - Response length
  - Turn detection
  - And much more
Follow these steps to set up the OpenAI Realtime API Console on your system. Start by cloning the repository:
git clone https://github.com/S4mpl3r/openai-realtime-console-py.git
cd openai-realtime-console-py
Create and activate a virtual environment. For Windows:
python -m venv .venv
.venv\Scripts\activate
For macOS/Linux:
python3 -m venv .venv
source .venv/bin/activate
Install the dependencies. For Windows:
python -m pip install -r requirements.txt
For macOS/Linux:
python3 -m pip install -r requirements.txt
Create a .env file in the project root directory and add your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
⚠️ Important: Your API key is sensitive information. Never commit it to version control or share it publicly.
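As a rough illustration of how a script could pick up the key at startup, the sketch below reads OPENAI_API_KEY from the process environment and falls back to a simple .env lookup. Both read_dotenv and load_api_key are hypothetical helpers written for this example, and the parser is deliberately simplified; console.py may load the file differently (for example through a dotenv package).

```python
import os


def read_dotenv(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file (hypothetical, simplified parser)."""
    values = {}
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                # Skip blank lines, comments, and anything that is not KEY=VALUE.
                if not line or line.startswith("#") or "=" not in line:
                    continue
                name, _, value = line.partition("=")
                values[name.strip()] = value.strip().strip('"').strip("'")
    except FileNotFoundError:
        pass  # A missing file simply yields no values.
    return values


def load_api_key(env=None, dotenv_path=".env"):
    """Return the API key, preferring the real environment over the .env file."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY") or read_dotenv(dotenv_path).get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it to your .env file or export it "
            "in your shell before starting the console."
        )
    return key
```

Failing early with a clear message avoids a less readable authentication error later, once the connection to the API is attempted.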
Start the console. For Windows:
python console.py
For macOS/Linux:
python3 console.py
- Press and hold SPACE: Record your voice input
- Release SPACE: Send your message to the AI
- Press Q: Exit the application
- Start a conversation by pressing and holding the SPACE bar
- Speak naturally while holding SPACE
- Release the SPACE bar when you've finished speaking
- Listen to the AI's response through your speakers
- Continue the dialogue by pressing SPACE again when ready
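To make the press, speak, release cycle more concrete, here is a minimal sketch of the messages a push-to-talk client might send when the key is released. The event types input_audio_buffer.append, input_audio_buffer.commit, and response.create are taken from the Realtime API's published client events as best understood here; the function name and the chunk size are assumptions for illustration, and this is not necessarily how console.py structures the step.

```python
import base64
import json


def build_push_to_talk_events(pcm16_audio: bytes, chunk_size: int = 4800) -> list[str]:
    """Turn one recorded utterance into JSON messages to send on key release.

    Hypothetical helper: the name and chunk_size are illustrative assumptions.
    """
    if not pcm16_audio:
        return []  # Nothing was recorded, so there is nothing to commit.

    events = []
    for start in range(0, len(pcm16_audio), chunk_size):
        chunk = pcm16_audio[start:start + chunk_size]
        events.append({
            "type": "input_audio_buffer.append",
            # Audio travels as base64 text inside the JSON message.
            "audio": base64.b64encode(chunk).decode("ascii"),
        })
    # Mark the end of the user's turn, then ask the model to respond.
    events.append({"type": "input_audio_buffer.commit"})
    events.append({"type": "response.create"})
    return [json.dumps(event) for event in events]
```

Each returned string would then be sent over the open WebSocket connection in order.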
The console provides extensive customization options through the SessionConfig class. You can modify parameters in the main() function of console.py:
Choose from a variety of expressive voices:
- alloy (default)
- echo
- fable
- onyx
- nova
- shimmer
Adjust the AI's creativity level (0.6-1.2):
- Lower values (e.g., 0.7) for more focused, deterministic responses
- Higher values (e.g., 1.1) for more varied, creative interactions
Customize the AI's behavior with specific instructions:
- Define personality traits
- Set knowledge boundaries
- Guide response style and format
- Establish voice characteristics
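The options above could be grouped and validated along the following lines. This is a sketch only: SessionConfigSketch is a hypothetical stand-in, not the project's actual SessionConfig class, whose fields are not shown here. The default temperature and instructions are illustrative, and the session.update payload shape is an assumption based on the Realtime API's documented client events.

```python
from dataclasses import dataclass

# Voice names as listed in this README.
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


@dataclass
class SessionConfigSketch:
    """Hypothetical configuration holder; not the real SessionConfig."""

    voice: str = "alloy"
    temperature: float = 0.8  # illustrative default within the 0.6-1.2 range
    instructions: str = "You are a helpful assistant."

    def __post_init__(self):
        # Reject values outside the options described above.
        if self.voice not in VOICES:
            raise ValueError(f"voice must be one of {VOICES}, got {self.voice!r}")
        if not 0.6 <= self.temperature <= 1.2:
            raise ValueError("temperature must be between 0.6 and 1.2")

    def to_session_update(self) -> dict:
        """Build a session.update message carrying these settings."""
        return {
            "type": "session.update",
            "session": {
                "voice": self.voice,
                "temperature": self.temperature,
                "instructions": self.instructions,
            },
        }
```

For example, SessionConfigSketch(voice="echo", temperature=0.7) would describe a more focused session, while an out-of-range temperature fails immediately instead of being rejected later by the server.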
The OpenAI Realtime API enables dynamic, bi-directional communication with AI models through WebSockets. Key aspects include:
- Streaming Audio & Text — Send and receive audio/text in real-time
- Turn Detection — Natural conversation management
- Event-Based Architecture — Efficient handling of information exchange
- Session Management — Maintain conversation context
This console implements a complete client for the API, handling all the complex details of WebSocket communication, audio processing, and event management.
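The event-based design can be pictured as a small dispatcher that routes each incoming JSON message by its type field. The sketch below is not taken from console.py. EventRouter is a hypothetical class, and the event names response.audio.delta and response.audio_transcript.delta are assumptions based on the Realtime API's documented server events, which may differ between API versions. The messages are simulated locally, so no network connection is involved.

```python
import base64
import json


class EventRouter:
    """Hypothetical dispatcher mapping event types to handler callbacks."""

    def __init__(self):
        self._handlers = {}

    def on(self, event_type, handler):
        """Register a handler for one event type."""
        self._handlers.setdefault(event_type, []).append(handler)

    def dispatch(self, raw_message: str) -> int:
        """Decode one message and call its handlers; return how many ran."""
        event = json.loads(raw_message)
        handlers = self._handlers.get(event.get("type"), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


audio_chunks = []       # decoded PCM bytes, ready for playback or saving
transcript_parts = []   # text fragments of the spoken response

router = EventRouter()
router.on("response.audio.delta",
          lambda e: audio_chunks.append(base64.b64decode(e["delta"])))
router.on("response.audio_transcript.delta",
          lambda e: transcript_parts.append(e["delta"]))

# Simulated server messages, standing in for frames read from the WebSocket.
router.dispatch(json.dumps({"type": "response.audio_transcript.delta", "delta": "Hello"}))
router.dispatch(json.dumps({
    "type": "response.audio.delta",
    "delta": base64.b64encode(b"\x00\x01").decode("ascii"),
}))
```

Unknown event types are simply ignored, which keeps a client like this tolerant of events it does not handle.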
- Audio Quality: 16-bit PCM mono at 24 kHz offers a good balance between quality and bandwidth
- Network Requirements: A stable internet connection with at least 1 Mbps upload and download is recommended
- Memory Usage: Typically consumes 50-100 MB of RAM during operation
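The audio format figures translate into a concrete data rate: 24,000 samples per second at 2 bytes each is 48,000 bytes per second, or 384 kbit/s of raw audio, and roughly 512 kbit/s once base64-encoded, which is consistent with the 1 Mbps recommendation. The sketch below reproduces the raw figure and shows how PCM audio in this format could be saved as a WAV file with Python's standard wave module. The function names are illustrative and not taken from console.py.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # 24 kHz
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
CHANNELS = 1              # mono


def raw_bitrate_kbps() -> float:
    """Bitrate of uncompressed audio in this format, in kbit/s."""
    return SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS * 8 / 1000


def write_wav(target, pcm16_audio: bytes) -> None:
    """Write raw PCM bytes to a path or a binary file-like object as WAV."""
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm16_audio)


# One second of silence, written to an in-memory buffer instead of a file.
one_second_of_silence = b"\x00\x00" * SAMPLE_RATE_HZ
buffer = io.BytesIO()
write_wav(buffer, one_second_of_silence)
```

Passing a file name such as "conversation.wav" instead of the buffer would write the recording to disk.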
Contributions are welcome! To contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Created with ❤️ by Suvom
Experience the future of conversation today