Suvom2024/Realtime-API-Console-OpenAI


🌟 OpenAI Realtime API Console 🌟

MIT License | Python 3.10+ | OpenAI API

🗣️ Experience the future of AI interaction through your command line 🎧

Talk to advanced AI models with fluid, natural conversation—right from your terminal


🚀 Introduction

Welcome to the OpenAI Realtime API Console—a powerful and elegant command-line interface that brings the revolutionary capabilities of OpenAI's Realtime API directly to your fingertips. Inspired by the official OpenAI Realtime Console, this Python implementation offers an immersive and responsive way to experience conversational AI.

The console lets you hold a spoken conversation with an OpenAI model with the flow and immediacy of talking to a person: you speak into your microphone, and the model's reply is played back as audio, all from a terminal interface.

Whether you're a developer exploring conversational AI, a researcher studying human-AI interaction, or simply curious about real-time language models, this console gives you a direct, hands-on way to try the Realtime API.

✨ Key Features

Core Capabilities

  • 🎙️ Intuitive Push-to-Talk — Press and hold the SPACE bar to speak, release to send your message. Experience conversation as it should be: fluid, natural, and effortless.

  • 🔊 Crystal-Clear Audio Response — Hear the AI's voice with exceptional clarity through real-time audio playback, creating an immersive conversational experience.

  • ⚡ Lightning-Fast Responsiveness — Built on an efficient asynchronous architecture that ensures smooth, non-blocking communication with minimal latency.

Advanced Features

  • 💾 Audio Recording & Playback — Save conversations as WAV files for future reference, analysis, or sharing with colleagues.

  • ⌨️ Text & Voice Flexibility — Seamlessly switch between voice and text input based on your preference or environment.

  • 🧹 Clean Interface Utilities — Maintain a distraction-free workspace with built-in terminal buffer clearing functionality.

  • 🔧 Extensive Configurability — Fine-tune your AI interactions with granular control over:

    • Temperature (creativity level)
    • Voice selection (multiple options available)
    • System instructions
    • Response length
    • Turn detection
    • And much more
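
As an illustration of the audio recording feature, here is a minimal sketch of how raw audio could be written to a WAV file with Python's standard wave module. It assumes the 16-bit PCM mono, 24 kHz format listed under Performance Considerations; the function name write_wav is hypothetical and is not taken from console.py.

```python
import io
import wave

SAMPLE_RATE = 24_000  # Hz (assumed from the Performance Considerations section)
SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit PCM
CHANNELS = 1          # mono


def write_wav(target, pcm_bytes: bytes) -> None:
    """Write raw 16-bit mono PCM to `target` (a file path or a binary file object)."""
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(pcm_bytes)


# Example: write one second of silence into an in-memory buffer.
buffer = io.BytesIO()
write_wav(buffer, b"\x00\x00" * SAMPLE_RATE)
```

Passing a path such as "conversation.wav" instead of the in-memory buffer would save the audio to disk.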

🛠️ Installation Guide

Follow these steps to set up the OpenAI Realtime API Console on your system:

1. Clone the Repository

git clone https://github.com/S4mpl3r/openai-realtime-console-py.git
cd openai-realtime-console-py

2. Set Up a Virtual Environment

For Windows:

python -m venv .venv
.venv\Scripts\activate

For macOS/Linux:

python3 -m venv .venv
source .venv/bin/activate

3. Install Dependencies

For Windows:

python -m pip install -r requirements.txt

For macOS/Linux:

python3 -m pip install -r requirements.txt

4. Configure API Access

Create a .env file in the project root directory and add your OpenAI API key:

OPENAI_API_KEY=your_api_key_here

⚠️ Important: Your API key is sensitive information. Never commit it to version control or share it publicly.
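
For reference, the key can be read at startup roughly as follows. This is a standard-library sketch, not the console's actual loading code (which may rely on a helper package instead); load_env_file and get_api_key are hypothetical names.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Adding .env to your .gitignore file is a simple way to keep the key out of version control.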

🎮 Usage Guide

Starting the Console

For Windows:

python console.py

For macOS/Linux:

python3 console.py

Basic Controls

  • Press & hold SPACE — Record your voice input
  • Release SPACE — Send your message to the AI
  • Press Q — Exit the application

Interacting with the AI

  1. Start a conversation by pressing and holding the SPACE bar
  2. Speak naturally while holding SPACE
  3. Release the SPACE bar when you've finished speaking
  4. Listen to the AI's response through your speakers
  5. Continue the dialogue by pressing SPACE again when ready
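
Behind these steps, a push-to-talk turn maps onto a short sequence of Realtime API client events. The sketch below shows one plausible mapping, assuming manual turn handling: audio captured while SPACE is held is appended to the input buffer, and releasing SPACE commits the buffer and requests a response. The event types are those documented for the Realtime API; the function build_turn_events is hypothetical, and the console's actual code may be organized differently.

```python
import base64
import json


def build_turn_events(pcm_chunks: list[bytes]) -> list[str]:
    """Build the JSON messages for one push-to-talk turn.

    `pcm_chunks` holds the raw PCM audio captured while SPACE was held.
    """
    events: list[dict] = [
        {
            "type": "input_audio_buffer.append",
            # Audio travels as base64 text inside the JSON event.
            "audio": base64.b64encode(chunk).decode("ascii"),
        }
        for chunk in pcm_chunks
        if chunk  # skip empty chunks
    ]
    # On release: close the user's turn, then ask the model to reply.
    events.append({"type": "input_audio_buffer.commit"})
    events.append({"type": "response.create"})
    return [json.dumps(event) for event in events]
```

Each returned string would be sent as one WebSocket message, in order.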

⚙️ Advanced Configuration

The console provides extensive customization options through the SessionConfig class. You can modify its parameters in the main() function of console.py. The main options are described below.

Voice Options

Choose from a variety of expressive voices:

  • alloy (default)
  • echo
  • fable
  • onyx
  • nova
  • shimmer

Temperature Settings

Adjust the temperature, which controls how varied the AI's responses are (valid range: 0.6 to 1.2):

  • Lower values (e.g., 0.7) for more focused, deterministic responses
  • Higher values (e.g., 1.1) for more varied, creative interactions

System Instructions

Customize the AI's behavior with specific instructions:

  • Define personality traits
  • Set knowledge boundaries
  • Guide response style and format
  • Establish voice characteristics
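
To make these options concrete, here is a hedged sketch of a configuration object covering voice, temperature, and instructions. The field names of the real SessionConfig class are not shown in this README, so ExampleSessionConfig and its fields are assumptions for illustration only. The session.update event type comes from the Realtime API; the voice list and temperature range are the ones given above.

```python
from dataclasses import dataclass
from typing import Optional

# Voices as listed in this README.
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


@dataclass
class ExampleSessionConfig:
    """Hypothetical stand-in for the console's SessionConfig class."""

    voice: str = "alloy"
    temperature: float = 0.8  # illustrative value inside the 0.6 to 1.2 range
    instructions: str = ""
    turn_detection: Optional[dict] = None  # None here means manual push-to-talk

    def __post_init__(self) -> None:
        if self.voice not in VOICES:
            raise ValueError(f"unknown voice: {self.voice!r}")
        if not 0.6 <= self.temperature <= 1.2:
            raise ValueError("temperature must be between 0.6 and 1.2")

    def to_session_update(self) -> dict:
        """Build a session.update client event from this configuration."""
        return {
            "type": "session.update",
            "session": {
                "voice": self.voice,
                "temperature": self.temperature,
                "instructions": self.instructions,
                "turn_detection": self.turn_detection,
            },
        }


config = ExampleSessionConfig(
    voice="echo",
    temperature=0.7,
    instructions="Answer briefly and in a friendly tone.",
)
event = config.to_session_update()
```

Validating values when the object is created surfaces a mistyped voice or an out-of-range temperature before anything is sent to the API.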

🌐 Realtime API Overview

The OpenAI Realtime API enables dynamic, bi-directional communication with AI models through WebSockets. Key aspects include:

  • Streaming Audio & Text — Send and receive audio/text in real-time
  • Turn Detection — Natural conversation management
  • Event-Based Architecture — Efficient handling of information exchange
  • Session Management — Maintain conversation context

This console implements a complete client for the API, handling all the complex details of WebSocket communication, audio processing, and event management.
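
As a rough illustration of the event-based design, the sketch below handles a few server events as they arrive over the WebSocket. The event names follow the beta Realtime API documentation and may differ in other API versions; handle_server_event is a hypothetical function, not part of console.py.

```python
import base64
import json


def handle_server_event(raw_message: str, audio_out: bytearray, transcript: list[str]) -> bool:
    """Process one server event; return True once the response is complete."""
    event = json.loads(raw_message)
    event_type = event.get("type", "")

    if event_type == "response.audio.delta":
        # Audio arrives as base64-encoded PCM fragments.
        audio_out.extend(base64.b64decode(event["delta"]))
    elif event_type == "response.audio_transcript.delta":
        # Text transcript of the spoken reply, streamed piece by piece.
        transcript.append(event["delta"])
    elif event_type == "error":
        message = event.get("error", {}).get("message", "unknown error")
        raise RuntimeError(f"Realtime API error: {message}")

    # Event types not listed here are ignored in this sketch.
    return event_type == "response.done"
```

A receive loop would call this function for each incoming message and stop waiting once it returns True.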

📊 Performance Considerations

  • Audio Quality: 16-bit PCM mono at 24 kHz provides a good balance between quality and bandwidth
  • Network Requirements: A stable internet connection with at least 1 Mbps upload and download is recommended
  • Memory Usage: Typically consumes 50-100 MB of RAM during operation
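
The bandwidth recommendation can be sanity-checked with a little arithmetic. The sketch below computes the raw bitrate of the audio format and the bitrate after base64 encoding, assuming audio is sent as base64 text inside JSON events (base64 expands data by a factor of 4/3). JSON framing and network overhead are not included.

```python
SAMPLE_RATE_HZ = 24_000
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # mono


def pcm_bitrate_kbps(sample_rate_hz: int, bytes_per_sample: int, channels: int) -> float:
    """Raw PCM bitrate in kilobits per second."""
    return sample_rate_hz * bytes_per_sample * channels * 8 / 1000


def base64_bitrate_kbps(raw_kbps: float) -> float:
    """Bitrate after base64 encoding, which emits 4 bytes for every 3 input bytes."""
    return raw_kbps * 4 / 3


raw = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS)
encoded = base64_bitrate_kbps(raw)
print(raw)      # 384.0
print(encoded)  # 512.0
```

At roughly 0.5 Mbps per direction for the audio alone, a 1 Mbps connection leaves some headroom for protocol overhead.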

📚 Resources & Documentation

🤝 Contributing

Contributions are welcome! To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📜 License

This project is licensed under the MIT License. See the LICENSE file for details.


Created with ❤️ by Suvom

Experience the future of conversation today

About

Unleash the power of conversational AI with Realtime-API-Console-Python, a command-line tool for interacting with the OpenAI Realtime API. This console provides push-to-talk audio input, real-time audio playback, and is built with an efficient asynchronous Python client.
