Currently testing some Live API features.
- Python 3.8 or higher
- Gemini API key (get one from Google AI Studio)
- Working microphone
- Webcam (for camera mode)
- Speakers or headphones
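The first two prerequisites can be checked from Python before installing anything. The following is a minimal sketch and not part of the repository: the function name `missing_prerequisites` is hypothetical, and the only thing it takes from the setup steps is the `GEMINI_API_KEY` variable name.

```python
import os
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def missing_prerequisites(version=None, environ=None):
    """Return a list of human-readable problems (empty if none were found).

    Both arguments are optional so the check can be exercised without
    touching the real interpreter version or environment.
    """
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    environ = os.environ if environ is None else environ

    problems = []
    if version < MIN_PYTHON:
        problems.append("Python %d.%d found, 3.8 or higher required" % version)
    # Treat an empty or whitespace-only value the same as an unset variable.
    if not environ.get("GEMINI_API_KEY", "").strip():
        problems.append("GEMINI_API_KEY is not set")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and environment. Hardware such as the microphone and webcam still has to be checked by hand.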
- Clone the repository:
git clone https://github.com/Toltally-suck-at-code/LiveAPI.git
cd LiveAPI
- Create and activate a virtual environment:
For Mac/Linux:
# Create virtual environment
python -m venv venv
# Activate virtual environment
source venv/bin/activate
For Windows (Command Prompt):
# Create virtual environment
python -m venv venv
# Activate virtual environment
venv\Scripts\activate.bat
For Windows (PowerShell):
# Create virtual environment
python -m venv venv
# Activate virtual environment
.\venv\Scripts\Activate.ps1
- Install system dependencies:
For macOS:
# Install PortAudio using Homebrew
brew install portaudio
For Ubuntu/Debian Linux:
sudo apt-get install python3-pyaudio portaudio19-dev
For Windows:
# No additional system dependencies required
- Install Python dependencies:
pip install -r requirements.txt
- Set up your Gemini API key:
For Mac/Linux:
export GEMINI_API_KEY='your-api-key-here'
For Windows (Command Prompt):
set GEMINI_API_KEY=your-api-key-here
For Windows (PowerShell):
$env:GEMINI_API_KEY='your-api-key-here'
- Run the application:
python app.py
Backup mode: you can also run the older script in three different modes:
- Camera mode (default):
python Old/LiveTest.py --mode camera
- Screen capture mode:
python Old/LiveTest.py --mode screen
- Audio only mode:
python Old/LiveTest.py --mode none
- Real-time audio interaction with Gemini AI
- Optional video streaming from camera or screen
- Push-to-talk functionality
- Voice responses from the AI
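The `--mode` flag shown in the commands above takes `camera`, `screen`, or `none`, with `camera` as the default. How `Old/LiveTest.py` actually parses it is not shown here, so the following is only a sketch of one common way to handle such a flag with the standard library's `argparse`. The function name `parse_mode` is hypothetical.

```python
import argparse


def parse_mode(argv=None):
    """Parse a --mode flag and return 'camera', 'screen', or 'none'.

    argv defaults to None, in which case argparse reads sys.argv[1:].
    """
    parser = argparse.ArgumentParser(description="Select the video source")
    parser.add_argument(
        "--mode",
        choices=["camera", "screen", "none"],
        default="camera",  # camera is documented as the default mode
        help="video source to stream alongside audio",
    )
    return parser.parse_args(argv).mode
```

Using `choices` means a mistyped mode is rejected with a usage message instead of silently falling back to the default.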
- If you encounter audio issues, check your microphone and speaker settings
- Make sure your webcam is working if using camera mode
- Verify that your Gemini API key is correctly set in the environment variables
- If you get dependency errors, try running pip install -r requirements.txt again
- If you get "command not found" errors, make sure your virtual environment is activated
- To deactivate the virtual environment when you're done, simply type deactivate in your terminal
- For macOS users: If you get PortAudio errors, make sure you've installed it using brew install portaudio
- For Linux users: If you get PortAudio errors, make sure you've installed the required packages using apt-get
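Two of the troubleshooting items above (an inactive virtual environment and missing audio packages) can be checked from Python itself. This is a hedged sketch, not code from the repository: the helper names are hypothetical, and it assumes the audio dependency is the PyAudio package, which is imported as `pyaudio`.

```python
import importlib.util
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv.

    Inside a venv created with `python -m venv`, sys.prefix points at the
    environment while sys.base_prefix points at the base installation.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None
```

For example, `module_available("pyaudio")` returning `False` inside an activated environment suggests rerunning the dependency install, under the assumption stated above that PyAudio is among the requirements.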
This app now includes a push-to-talk (PTT) feature to prevent the AI from hearing its own responses through your laptop speakers.
- Click and hold the microphone button in the UI to activate voice input
- Release the button to stop transmitting audio
- Press and hold Space bar to activate voice input
- Release Space bar to stop transmitting audio
- Microphone Button States:
  - Gray = Idle (not transmitting)
  - Green with pulse effect = Active (transmitting audio)
  - Faded/Disabled = No live session running
- Listening Indicator:
  - A red "Listening..." indicator appears in the bottom-right of the video feed when PTT is active
- The microphone will only transmit audio while PTT is active
- PTT is automatically disabled when:
  - No live session is running
  - The browser window loses focus
  - You navigate away from the page
- The feature works on both desktop and mobile devices
- Prevents audio feedback loops
- Gives you complete control over when to speak
- Reduces background noise transmission
- Saves bandwidth by only sending audio when needed
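The push-to-talk rules described above amount to a small gate: audio is forwarded only while the button or Space bar is held and a live session is running, and the gate closes when the session stops or the window loses focus. The sketch below illustrates that logic in Python. It is not the app's actual implementation (the UI side runs in the browser), and the class and method names are hypothetical.

```python
class PushToTalkGate:
    """Forward audio chunks only while push-to-talk is active."""

    def __init__(self):
        self.session_running = False
        self.pressed = False

    @property
    def active(self):
        # PTT counts as active only during a live session.
        return self.session_running and self.pressed

    def start_session(self):
        self.session_running = True

    def stop_session(self):
        # Ending the session also releases PTT.
        self.session_running = False
        self.pressed = False

    def press(self):
        # Pressing has no effect when no live session is running.
        if self.session_running:
            self.pressed = True

    def release(self):
        self.pressed = False

    def on_focus_lost(self):
        # Losing window focus or navigating away disables PTT.
        self.pressed = False

    def filter(self, chunk):
        """Return the chunk if it should be transmitted, otherwise None."""
        return chunk if self.active else None
```

Dropping chunks at the gate, rather than muting after capture, is what keeps the AI's own speaker output from being sent back and avoids transmitting audio when nobody is speaking.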