<a href="https://colab.research.google.com/github/TheOnlyOrigin/1108-cwalina-paul.github.io/blob/main/Conceptual_Gemini_Glasses_Bridge.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [6]:
# Conceptual Python script demonstrating the core logic for a
# "Gemini Bridge" application on a host device (e.g., your Fairphone).
#
# DISCLAIMER: This is for educational purposes. It will not work out-of-the-box
# with the glasses. It uses your computer's microphone and speakers to simulate
# the audio I/O that you would eventually handle over Bluetooth.
#
# Prerequisites on your computer to run this simulation:
# pip install google-generativeai speechrecognition pyttsx3 pyaudio

import google.generativeai as genai
import speech_recognition as sr
import pyttsx3
import time
from google.colab import userdata # Import userdata to access secrets

# --- 1. CONFIGURATION ---

# IMPORTANT: Securely manage your API key using Colab Secrets Manager.
try:
    # Access the secret named 'GEMINI_API_KEY'
    GEMINI_API_KEY = userdata.get('GEMINI_API_KEY')
except Exception as e:
    print(f"Error accessing GEMINI_API_KEY from Secrets Manager: {e}")
    print("Please ensure you have added GEMINI_API_KEY to Colab Secrets and enabled notebook access.")
    exit()


if GEMINI_API_KEY is None:
    print("CRITICAL: GEMINI_API_KEY not found in Colab Secrets Manager.")
    print("Please add your Gemini API key to Colab Secrets with the name 'GEMINI_API_KEY' and enable notebook access.")
    exit()

# Configure the Gemini client
genai.configure(api_key=GEMINI_API_KEY)
model = genai.GenerativeModel('gemini-1.5-flash')

# Initialize speech recognition and text-to-speech engines
recognizer = sr.Recognizer()
tts_engine = pyttsx3.init()

# --- 2. CORE LOGIC MODULES ---

def listen_for_audio_from_glasses():
    """
    CONCEPTUAL FUNCTION (SIMULATED)
    This represents the part of your app that listens for the audio
    stream from the glasses' microphones.

    REAL-WORLD IMPLEMENTATION:
    - This would be a persistent Bluetooth listener.
    - It would receive raw audio data (e.g., Opus or PCM packets).
    - You would need to decode and buffer this audio data.
    - For this simulation, we use the computer's microphone.
    """
    print("\n[GLASSES SIM] Listening for your voice...")
    with sr.Microphone() as source:
        # Adjust for ambient noise for better recognition
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        try:
            # Listen for the actual audio input
            audio_data = recognizer.listen(source, timeout=5, phrase_time_limit=10)
            return audio_data
        except sr.WaitTimeoutError:
            print("[GLASSES SIM] Listening timed out. No speech detected.")
            return None

def transcribe_audio_to_text(audio_data):
    """
    Transcribes the captured audio data into a text string.

    REAL-WORLD IMPLEMENTATION:
    - You could use an on-device transcription library for speed and privacy,
      or a cloud-based service like Google's Speech-to-Text API.
    """
    if audio_data is None:
        return None

    print("[PHONE] Transcribing audio...")
    try:
        text_query = recognizer.recognize_google(audio_data)
        print(f"[PHONE] You said: '{text_query}'")
        return text_query
    except sr.UnknownValueError:
        print("[PHONE] Transcription failed: Could not understand audio.")
        return None
    except sr.RequestError as e:
        print(f"[PHONE] Transcription service error: {e}")
        return "Error with the transcription service."

def query_gemini_model(text_query):
    """
    Sends the transcribed text to the Gemini API and gets the response.
    """
    if not text_query:
        return "I didn't receive a query."

    print(f"[CLOUD] Sending to Gemini: '{text_query}'")
    try:
        # The core call to the Gemini API
        response = model.generate_content(text_query)
        # Add a small delay to simulate network latency for realism
        time.sleep(1)
        print("[CLOUD] Received response from Gemini.")
        return response.text
    except Exception as e:
        print(f"[CLOUD] Gemini API Error: {e}")
        return "I'm having trouble connecting to my AI model right now."

def stream_response_to_glasses(text_response):
    """
    CONCEPTUAL FUNCTION (SIMULATED)
    Converts the text response from Gemini into speech and sends it
    to the glasses' speakers.

    REAL-WORLD IMPLEMENTATION:
    - Use an Android Text-to-Speech (TTS) engine.
    - As the TTS engine generates audio buffers, you would encode them
      (e.g., into Opus packets).
    - You would then send these packets to the glasses over the
      correct Bluetooth characteristic you discovered in your reverse-engineering.
    - For this simulation, we use the computer's speakers.
    """
    print(f"\n[GLASSES SIM] Speaking response: '{text_response}'")
    tts_engine.say(text_response)
    tts_engine.runAndWait()
    print("[GLASSES SIM] Finished speaking.")

# --- 3. MAIN APPLICATION LOOP ---

def main():
    """
    The main execution loop for the Gemini Bridge.
    This would be the core service running in your Android app.
    """
    print("--- Gemini Glasses Bridge Initialized (Simulation) ---")
    print("Say something into your microphone to begin.")

    while True:
        # Step 1: Listen for and capture audio from the "glasses"
        captured_audio = listen_for_audio_from_glasses()

        if captured_audio:
            # Step 2: Convert the captured audio into text
            user_query = transcribe_audio_to_text(captured_audio)

            if user_query:
                # Step 3: Send the text query to the Gemini model
                gemini_response = query_gemini_model(user_query)

                # Step 4: Convert the response to speech and "send" it to the glasses
                stream_response_to_glasses(gemini_response)

if __name__ == "__main__":
    main()

Error accessing GEMINI_API_KEY from Secrets Manager: Secret GEMINI_API_KEY does not exist.
Please ensure you have added GEMINI_API_KEY to Colab Secrets and enabled notebook access.


NameError: name 'GEMINI_API_KEY' is not defined

In [2]:
!pip install google-generativeai speechrecognition pyttsx3 pyaudio

Collecting speechrecognition
  Downloading speechrecognition-3.14.3-py3-none-any.whl.metadata (30 kB)
Collecting pyttsx3
  Downloading pyttsx3-2.98-py3-none-any.whl.metadata (3.8 kB)
Collecting pyaudio
  Downloading PyAudio-0.2.14.tar.gz (47 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.1/47.1 kB[0m [31m1.9 MB/s[0m eta [36m0:00:00[0m
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
Downloading speechrecognition-3.14.3-py3-none-any.whl (32.9 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m32.9/32.9 MB[0m [31m59.3 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading pyttsx3-2.98-py3-none-any.whl (34 kB)
Building wheels for collected packages: pyaudio
  [1;31merror[0m: [1msubprocess-exited-with-error[0m
  
  [31m×[0m [32mBuilding wheel for pyaudio [0m[1;32m([0m[32mpyproject.toml[0m[1;32m)[0m did not run s

In [3]:
!apt-get update
!apt-get install -y portaudio19-dev

Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:4 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Get:5 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:7 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [127 kB]
Get:8 https://r2u.stat.illinois.edu/ubuntu jammy/main all Packages [9,059 kB]
Hit:9 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Get:10 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [3,036 kB]
Hit:11 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Get:12 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages [1,561 kB]
Hit:13 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jam

In [4]:
!pip install google-generativeai speechrecognition pyttsx3 pyaudio

Collecting speechrecognition
  Using cached speechrecognition-3.14.3-py3-none-any.whl.metadata (30 kB)
Collecting pyttsx3
  Using cached pyttsx3-2.98-py3-none-any.whl.metadata (3.8 kB)
Collecting pyaudio
  Using cached PyAudio-0.2.14.tar.gz (47 kB)
  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
Using cached speechrecognition-3.14.3-py3-none-any.whl (32.9 MB)
Using cached pyttsx3-2.98-py3-none-any.whl (34 kB)
Building wheels for collected packages: pyaudio
  Building wheel for pyaudio (pyproject.toml) ... [?25l[?25hdone
  Created wheel for pyaudio: filename=pyaudio-0.2.14-cp311-cp311-linux_x86_64.whl size=67424 sha256=34fb6e7ab726797f606ab71c6c2c9c205cec0713248ec4fae10393efe8dd1512
  Stored in directory: /root/.cache/pip/wheels/80/b1/c1/67e4ef443de2665d86031d4760508094eab5de37d5d64d9c27
Successfully built pyaudio
Installing collected packages: pyttsx3, pyaudio, s

Now you can access the secret in your notebook using the `userdata` module:

In [None]:
from google.colab import userdata

# Access the secret named 'GEMINI_API_KEY'
GEMINI_API_KEY = userdata.get('GEMINI_API_KEY')

# Now you can use GEMINI_API_KEY in your script
# For example, you can configure the genai library with it
# import google.generativeai as genai
# genai.configure(api_key=GEMINI_API_KEY)

I will now modify the original script to use the key from the secrets manager.