# **Azure Voice Live API: Real-Time Voice Agents**

This notebook walks through an enterprise-grade implementation of real-time voice conversations using the Azure Voice Live API, integrated with Azure AI Agent Service.

### **Notebook Architecture Overview**
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────────┐
│   Audio I/O     │◄──►│  WebSocket Client │◄──►│ Azure AI Agent      │
│  (sounddevice)  │    │  (Threading)     │    │ Service + Voice API │
└─────────────────┘    └──────────────────┘    └─────────────────────┘
```

### **Implementation Progression**
1. **Core Infrastructure** - WebSocket connection & audio processing
2. **Authentication & Configuration** - Entra ID + API key fallback  
3. **Direct Model Integration** - Basic voice live API connection
4. **Azure AI Agent Service** - Production agent-based architecture
5. **Production Deployment** - Complete application with proper cleanup

### **Key Features**
- **Low-latency audio streaming** (24kHz, 20ms chunks)
- **Semantic VAD** with Azure deep noise suppression
- **Thread-safe audio processing** with graceful shutdown
- **Dual authentication** (Entra ID + API key)
- **Agent-based architecture** for enterprise scalability
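
The audio figures above translate into concrete buffer sizes. The following is a small sketch of that arithmetic, assuming 16-bit mono PCM (the `int16`, single-channel format used by the audio cells in this notebook); the helper name `chunk_sizes` is illustrative only.

```python
SAMPLE_RATE_HZ = 24_000   # matches AUDIO_SAMPLE_RATE in this notebook
CHUNK_MS = 20             # capture chunk duration used by the microphone loop
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM, mono

def chunk_sizes(sample_rate_hz: int, chunk_ms: int, bytes_per_sample: int = 2) -> tuple[int, int]:
    """Return (samples, bytes) for one mono chunk of the given duration."""
    samples = sample_rate_hz * chunk_ms // 1000
    return samples, samples * bytes_per_sample

samples, size_bytes = chunk_sizes(SAMPLE_RATE_HZ, CHUNK_MS, BYTES_PER_SAMPLE)
print(samples, size_bytes)  # 480 960
```

Each 20 ms chunk is therefore 480 samples, or 960 bytes before base64 encoding.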

## **1. Dependencies & Core Imports**

In [1]:
# Core Python libraries
import os
import uuid
import json
import time
import base64
import logging
import threading
import queue
import signal
import sys
from collections import deque
from datetime import datetime

# Audio processing
import numpy as np
import sounddevice as sd

# Azure SDK and authentication
from dotenv import load_dotenv
from azure.core.credentials import TokenCredential
from azure.identity import DefaultAzureCredential

# Type hints
from typing import Dict, Union, Literal, Set
from typing_extensions import Iterator, TypedDict, Required

# WebSocket communication
import websocket
from websocket import WebSocketApp

print("✅ All dependencies imported successfully!")

✅ All dependencies imported successfully!


## **2. Configuration & Environment**

In [2]:
# Load environment variables
load_dotenv("./.env", override=True)

# Global variables for thread coordination
stop_event = threading.Event()
connection_queue = queue.Queue()

# Audio configuration
AUDIO_SAMPLE_RATE = 24000

# Logger setup
logger = logging.getLogger(__name__)

# Voice Live API configuration
AZURE_VOICE_LIVE_ENDPOINT = os.environ.get("AZURE_VOICE_LIVE_ENDPOINT") or "https://aifoundry825233136833-resource.services.ai.azure.com"
AZURE_VOICE_LIVE_MODEL = os.environ.get("AZURE_VOICE_LIVE_MODEL") or "gpt-4o"
AZURE_VOICE_LIVE_API_VERSION = os.environ.get("AZURE_VOICE_LIVE_API_VERSION") or "2025-05-01-preview"
AZURE_VOICE_LIVE_API_KEY = os.environ.get("AZURE_VOICE_LIVE_API_KEY")

# Azure AI Agent Service configuration
AI_FOUNDRY_PROJECT_NAME = os.environ.get("AI_FOUNDRY_PROJECT_NAME")
AI_FOUNDRY_AGENT_ID = os.environ.get("AI_FOUNDRY_AGENT_ID")
AI_FOUNDRY_AGENT_CONNECTION_STRING = os.environ.get("AI_FOUNDRY_AGENT_CONNECTION_STRING")

print("📋 Voice Live API Configuration:")
print(f"  - Endpoint: {AZURE_VOICE_LIVE_ENDPOINT}")
print(f"  - Model: {AZURE_VOICE_LIVE_MODEL}")
print(f"  - API Version: {AZURE_VOICE_LIVE_API_VERSION}")
print(f"  - API Key: {'✅ Configured' if AZURE_VOICE_LIVE_API_KEY else '❌ Not set'}")

print("\n🤖 Azure AI Agent Service Configuration:")
print(f"  - Project Name: {AI_FOUNDRY_PROJECT_NAME or '❌ Not set'}")
print(f"  - Agent ID: {AI_FOUNDRY_AGENT_ID or '❌ Not set'}")
print(f"  - Connection String: {'✅ Configured' if AI_FOUNDRY_AGENT_CONNECTION_STRING else '❌ Not set (hub-based only)'}")

print(f"\n🎛️  Audio Sample Rate: {AUDIO_SAMPLE_RATE} Hz")

📋 Voice Live API Configuration:
  - Endpoint: https://poc-ai-agents-voice-resource.cognitiveservices.azure.com/
  - Model: gpt-4o
  - API Version: 2025-05-01-preview
  - API Key: ✅ Configured

🤖 Azure AI Agent Service Configuration:
  - Project Name: poc-ai-agents-voice
  - Agent ID: asst_Dd9U7mxFgfZxjwhEbSr76dyU
  - Connection String: ❌ Not set (hub-based only)

🎛️  Audio Sample Rate: 24000 Hz


## **3. WebSocket Connection Management**

Thread-safe WebSocket client with message queuing and connection state management.

In [3]:
class VoiceLiveConnection:
    """
    Manages WebSocket connection to Azure Voice Live API.
    
    Features:
    - Asynchronous message handling
    - Thread-safe message queue
    - Connection state management
    - Error handling and logging
    """
    
    def __init__(self, url: str, headers: dict) -> None:
        self._url = url
        self._headers = headers
        self._ws = None
        self._message_queue = queue.Queue()
        self._connected = False
        self._connection_id = str(uuid.uuid4())[:8]
        self._message_count = 0

    def connect(self) -> None:
        """Establish WebSocket connection with event handlers."""
        
        if self._connected:
            logger.warning("Connection already established")
            return
            
        print(f"🔌 Establishing WebSocket connection (ID: {self._connection_id})...")
        
        def on_message(ws, message):
            """Handle incoming messages by adding them to the queue."""
            self._message_count += 1
            self._message_queue.put(message)
            logger.debug(f"Connection {self._connection_id}: Message #{self._message_count} queued")
        
        def on_error(ws, error):
            """Handle WebSocket errors."""
            print(f"❌ WebSocket error (ID: {self._connection_id}): {error}")
            logger.error(f"Connection {self._connection_id}: WebSocket error: {error}")
        
        def on_close(ws, close_status_code, close_msg):
            """Handle connection closure."""
            print(f"🔌 WebSocket connection closed (ID: {self._connection_id})")
            print(f"   Status: {close_status_code}, Message: {close_msg}")
            logger.info(f"Connection {self._connection_id}: WebSocket closed - {close_status_code}: {close_msg}")
            self._connected = False
        
        def on_open(ws):
            """Handle successful connection."""
            print(f"✅ WebSocket connection opened (ID: {self._connection_id})")
            logger.info(f"Connection {self._connection_id}: WebSocket opened")
            self._connected = True

        # Create WebSocket app with event handlers
        self._ws = websocket.WebSocketApp(
            self._url,
            header=self._headers,
            on_message=on_message,
            on_error=on_error,
            on_close=on_close,
            on_open=on_open
        )
        
        # Start WebSocket in a separate thread
        self._ws_thread = threading.Thread(
            target=self._ws.run_forever, 
            name=f"WebSocket-{self._connection_id}"
        )
        self._ws_thread.daemon = True
        self._ws_thread.start()
        
        # Wait for connection to be established
        timeout = 10  # seconds
        start_time = time.time()
        while not self._connected and time.time() - start_time < timeout:
            time.sleep(0.1)
        
        if not self._connected:
            raise ConnectionError(f"Failed to establish WebSocket connection (ID: {self._connection_id}) within {timeout} seconds")
        
        print(f"✅ Connection established successfully (ID: {self._connection_id})")

    def recv(self) -> str | None:
        """Return the next queued message, waiting up to 1 second; return None on timeout."""
        try:
            message = self._message_queue.get(timeout=1)
            logger.debug(f"Connection {self._connection_id}: Message received from queue")
            return message
        except queue.Empty:
            return None

    def send(self, message: str) -> None:
        """Send a message through the WebSocket."""
        if self._ws and self._connected:
            self._ws.send(message)
            logger.debug(f"Connection {self._connection_id}: Message sent")
        else:
            logger.warning(f"Connection {self._connection_id}: Cannot send - not connected")
            print(f"⚠️  Cannot send message - connection not established (ID: {self._connection_id})")

    def close(self) -> None:
        """Close the WebSocket connection."""
        if self._ws:
            print(f"🔌 Closing WebSocket connection (ID: {self._connection_id})...")
            logger.info(f"Connection {self._connection_id}: Closing connection")
            self._ws.close()
            self._connected = False
            
            # Wait for thread to finish
            if hasattr(self, '_ws_thread') and self._ws_thread.is_alive():
                self._ws_thread.join(timeout=3)
                if self._ws_thread.is_alive():
                    print(f"⚠️  WebSocket thread did not stop gracefully (ID: {self._connection_id})")
                else:
                    print(f"✅ WebSocket thread stopped (ID: {self._connection_id})")
            
            print(f"✅ Connection closed (ID: {self._connection_id}, {self._message_count} messages processed)")
        else:
            print(f"ℹ️  No connection to close (ID: {self._connection_id})")

    @property
    def connection_id(self) -> str:
        """Get the connection ID for debugging."""
        return self._connection_id
    
    @property
    def is_connected(self) -> bool:
        """Check if the connection is active."""
        return self._connected
    
    @property
    def message_count(self) -> int:
        """Get the number of messages processed."""
        return self._message_count

print("✅ Enhanced VoiceLiveConnection class with connection state management")

✅ Enhanced VoiceLiveConnection class with connection state management
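
The hand-off between the WebSocket thread and the consumer can be tried in isolation. The sketch below reproduces only the queue pattern used by `on_message` and `recv`, with no network involved; `recv_with_timeout`, `fake_socket_callback`, and the `demo.event` message type are illustrative names, not part of the Voice Live API.

```python
import queue
import threading

def recv_with_timeout(q: queue.Queue, timeout: float = 1.0):
    """Return the next queued message, or None if nothing arrives in time."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None

messages: queue.Queue = queue.Queue()

def fake_socket_callback() -> None:
    # Stand-in for on_message: the socket thread only enqueues, never processes.
    for i in range(3):
        messages.put(f'{{"type": "demo.event", "seq": {i}}}')

producer = threading.Thread(target=fake_socket_callback, daemon=True)
producer.start()
producer.join()

received = []
while (msg := recv_with_timeout(messages, timeout=0.1)) is not None:
    received.append(msg)
print(len(received))  # 3
```

Returning `None` on timeout lets the consuming loop re-check its stop condition regularly instead of blocking forever on an idle socket.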


## **4. Azure Voice Live Client**

Main client supporting both direct model and agent connections.

In [4]:
class AzureVoiceLive:
    """
    Main client for Azure Voice Live API.
    
    Handles:
    - Authentication (API key or token-based)
    - Connection management
    - URL construction for WebSocket endpoint
    """
    
    def __init__(
        self,
        *,
        azure_endpoint: str | None = None,
        api_version: str | None = None,
        token: str | None = None,
        api_key: str | None = None,
    ) -> None:
        self._azure_endpoint = azure_endpoint
        self._api_version = api_version
        self._token = token
        self._api_key = api_key
        self._connection = None

    def connect(self, model: str) -> VoiceLiveConnection:
        """
        Create a connection to the Voice Live API.
        
        Args:
            model: The AI model to use (e.g., 'gpt-4o')
            
        Returns:
            VoiceLiveConnection: Ready-to-use connection object
        """
        # if self._connection is not None:
        #     raise ValueError("Already connected to the Voice Live API.")
        if not model:
            raise ValueError("Model name is required.")

        # Convert HTTPS endpoint to WSS for WebSocket
        azure_ws_endpoint = self._azure_endpoint.rstrip('/').replace("https://", "wss://")

        # Construct WebSocket URL
        url = f"{azure_ws_endpoint}/voice-live/realtime?api-version={self._api_version}&model={model}"

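        # Setup authentication headers (bearer token preferred, API key as fallback)
        if not self._token and not self._api_key:
            raise ValueError("Either a token or an API key is required.")
        auth_header = {"Authorization": f"Bearer {self._token}"} if self._token else {"api-key": self._api_key}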
        request_id = uuid.uuid4()
        headers = {"x-ms-client-request-id": str(request_id), **auth_header}

        # Create and connect
        self._connection = VoiceLiveConnection(url, headers)
        self._connection.connect()
        return self._connection

print("✅ AzureVoiceLive client class defined")

✅ AzureVoiceLive client class defined


## **5. Real-Time Audio Processing**

Low-latency audio player with deque-based buffering for real-time streaming.

In [5]:
class AudioPlayerAsync:
    """
    Asynchronous audio player with buffering for real-time playback.
    
    Features:
    - Thread-safe audio queue
    - Real-time streaming playback
    - Automatic start/stop management
    - Low-latency audio processing
    """
    
    def __init__(self):
        self.queue = deque()
        self.lock = threading.Lock()
        self.stream = sd.OutputStream(
            callback=self.callback,
            samplerate=AUDIO_SAMPLE_RATE,
            channels=1,
            dtype=np.int16,
            blocksize=2400,  # ~100ms at 24kHz
        )
        self.playing = False

    def callback(self, outdata, frames, time, status):
        """
        Audio callback function called by sounddevice.
        
        This function is called in real-time by the audio system
        and must be very efficient to avoid dropouts.
        """
        if status:
            logger.warning(f"Stream status: {status}")
            
        with self.lock:
            data = np.empty(0, dtype=np.int16)
            
            # Fill the output buffer from our queue
            while len(data) < frames and len(self.queue) > 0:
                item = self.queue.popleft()
                frames_needed = frames - len(data)
                data = np.concatenate((data, item[:frames_needed]))
                
                # If we have leftover data, put it back
                if len(item) > frames_needed:
                    self.queue.appendleft(item[frames_needed:])
            
            # Pad with silence if we don't have enough data
            if len(data) < frames:
                data = np.concatenate((data, np.zeros(frames - len(data), dtype=np.int16)))
                
        outdata[:] = data.reshape(-1, 1)

    def add_data(self, data: bytes):
        """Add audio data to the playback queue."""
        with self.lock:
            np_data = np.frombuffer(data, dtype=np.int16)
            self.queue.append(np_data)
            
            # Auto-start playback if we have data
            if not self.playing and len(self.queue) > 0:
                self.start()

    def start(self):
        """Start audio playback."""
        if not self.playing:
            self.playing = True
            self.stream.start()

    def stop(self):
        """Stop audio playback and clear buffer."""
        with self.lock:
            self.queue.clear()
        self.playing = False
        self.stream.stop()

    def terminate(self):
        """Terminate the audio player and release resources."""
        with self.lock:
            self.queue.clear()
        self.stream.stop()
        self.stream.close()

print("✅ AudioPlayerAsync class defined")

✅ AudioPlayerAsync class defined
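
The buffering logic in `callback` is easier to follow without the audio device attached. The sketch below mirrors it with plain Python lists in place of NumPy arrays (an intentional simplification); `fill_block` is an illustrative name.

```python
from collections import deque

def fill_block(buffer: deque, frames: int) -> list[int]:
    """Pop up to `frames` samples from queued chunks, zero-padding any shortfall."""
    out: list[int] = []
    while len(out) < frames and buffer:
        chunk = buffer.popleft()
        needed = frames - len(out)
        out.extend(chunk[:needed])
        if len(chunk) > needed:
            buffer.appendleft(chunk[needed:])  # keep the unread tail for the next call
    out.extend([0] * (frames - len(out)))      # pad with silence on underrun
    return out

pending = deque([[1, 2, 3], [4, 5]])
print(fill_block(pending, 4))  # [1, 2, 3, 4]
print(fill_block(pending, 4))  # [5, 0, 0, 0]
```

The second call shows the underrun case: one sample is left in the queue, so the remainder of the block is filled with zeros rather than stalling the audio callback.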


## **6. Audio Input Capture**

Microphone capture with base64 encoding for WebSocket transmission.

In [6]:
def listen_and_send_audio(connection: VoiceLiveConnection) -> None:
    """
    Capture audio from microphone and send to Voice Live API.
    
    This function runs in a separate thread and continuously:
    1. Reads audio from the microphone
    2. Encodes it as base64
    3. Sends it to the API via WebSocket
    
    Args:
        connection: Active VoiceLiveConnection instance
    """
    logger.info("Starting audio stream ...")

    # Create audio input stream
    stream = sd.InputStream(
        channels=1, 
        samplerate=AUDIO_SAMPLE_RATE, 
        dtype="int16"
    )
    
    try:
        stream.start()
        
        # Read audio in 20ms chunks (480 samples at 24kHz)
        read_size = int(AUDIO_SAMPLE_RATE * 0.02)
        
        while not stop_event.is_set():
            if stream.read_available >= read_size:
                # Read audio data
                data, _ = stream.read(read_size)
                
                # Encode as base64
                audio = base64.b64encode(data).decode("utf-8")
                
                # Create API message
                param = {
                    "type": "input_audio_buffer.append", 
                    "audio": audio, 
                    "event_id": ""
                }
                
                # Send to API
                data_json = json.dumps(param)
                connection.send(data_json)
            else:
                time.sleep(0.001)  # Small sleep to prevent busy waiting
                
    except Exception as e:
        logger.error(f"Audio stream interrupted. {e}")
    finally:
        stream.stop()
        stream.close()
        logger.info("Audio stream closed.")

print("✅ Audio input function defined")

✅ Audio input function defined
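
To see what one outgoing message looks like, the encoding step can be run on synthetic data. This sketch builds the same `input_audio_buffer.append` payload as the loop above, using 20 ms of silence instead of microphone input; `make_append_event` is an illustrative helper name.

```python
import base64
import json

def make_append_event(pcm16: bytes) -> str:
    """Wrap raw 16-bit PCM bytes in an input_audio_buffer.append JSON message."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm16).decode("utf-8"),
        "event_id": "",
    })

silence_20ms = bytes(960)  # 480 samples x 2 bytes at 24 kHz, all zeros
event = json.loads(make_append_event(silence_20ms))
print(event["type"], len(event["audio"]))  # input_audio_buffer.append 1280
```

Base64 expands the payload by a third (960 bytes become 1280 characters), which is worth keeping in mind when estimating upstream bandwidth.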


## **7. Audio Output & Event Processing**

WebSocket event handler with audio playback and conversation state management.

In [7]:
def receive_audio_and_playback(connection: VoiceLiveConnection) -> None:
    """
    Receive messages from Voice Live API and handle audio playback.
    
    This function runs in a separate thread and handles:
    1. Receiving WebSocket messages from the API
    2. Processing different event types
    3. Playing back audio responses
    4. Managing conversation state
    
    Args:
        connection: Active VoiceLiveConnection instance
    """
    last_audio_item_id = None
    audio_player = AudioPlayerAsync()
    event_count = 0
    session_id = None

    logger.info("Starting audio playback thread...")
    print("🎧 Audio playback thread started - monitoring events...")
    
    try:
        while not stop_event.is_set():
            # Receive message from API
            raw_event = connection.recv()
            if raw_event is None:
                continue
                
            try:
                event = json.loads(raw_event)
                event_type = event.get("type")
                event_count += 1
                
                # Log every event with timestamp
                timestamp = datetime.now().strftime("%H:%M:%S.%f")[:-3]
                print(f"📨 [{timestamp}] Event #{event_count}: {event_type}")
                logger.info(f"Event #{event_count}: {event_type} - {event}")

                # Handle different event types with detailed logging
                if event_type == "session.created":
                    session = event.get("session", {})
                    session_id = session.get('id', 'unknown')
                    print(f"✅ Session created: {session_id}")
                    print(f"   Instructions: {session.get('instructions', 'Not set')[:100]}...")
                    print(f"   Voice: {session.get('voice', {}).get('name', 'Not set')}")
                    logger.info(f"Session created: {session_id}")

                elif event_type == "session.updated":
                    session = event.get("session", {})
                    print(f"🔄 Session updated")
                    print(f"   Voice: {session.get('voice', {}).get('name', 'Not set')}")
                    print(f"   VAD: {session.get('turn_detection', {}).get('type', 'Not set')}")

                elif event_type == "input_audio_buffer.committed":
                    item_id = event.get("item_id", "unknown")
                    print(f"🎤 Audio buffer committed: {item_id}")

                elif event_type == "input_audio_buffer.speech_started":
                    # User started speaking - stop AI playback
                    print(f"🗣️  Speech started - stopping AI playback")
                    audio_player.stop()

                elif event_type == "input_audio_buffer.speech_stopped":
                    print(f"🤐 Speech stopped - processing...")

                elif event_type == "conversation.item.created":
                    item = event.get("item", {})
                    item_type = item.get("type", "unknown")
                    item_id = item.get("id", "unknown")
                    print(f"📝 Item created: {item_type} ({item_id})")
                    if item_type == "message":
                        role = item.get("role", "unknown")
                        print(f"   Role: {role}")

                elif event_type == "response.created":
                    response = event.get("response", {})
                    response_id = response.get("id", "unknown")
                    print(f"🤖 Response created: {response_id}")

                elif event_type == "response.output_item.added":
                    item = event.get("item", {})
                    item_type = item.get("type", "unknown")
                    item_id = item.get("id", "unknown")
                    print(f"➕ Output item added: {item_type} ({item_id})")

                elif event_type == "response.content_part.added":
                    part = event.get("part", {})
                    part_type = part.get("type", "unknown")
                    print(f"🧩 Content part added: {part_type}")

                elif event_type == "response.audio_transcript.delta":
                    delta = event.get("delta", "")
                    item_id = event.get("item_id", "unknown")
                    print(f"📝 Transcript delta: '{delta}' (item: {item_id})")

                elif event_type == "response.audio.delta":
                    # New audio data from AI response
                    item_id = event.get("item_id", "unknown")
                    
                    if item_id != last_audio_item_id:
                        last_audio_item_id = item_id
                        print(f"🎵 New audio item: {item_id}")

                    # Decode and play audio
                    bytes_data = base64.b64decode(event.get("delta", ""))
                    if bytes_data:
                        print(f"🔊 Playing audio chunk: {len(bytes_data)} bytes")
                        logger.debug(f"Audio data length: {len(bytes_data)} bytes")   
                        audio_player.add_data(bytes_data)

                elif event_type == "response.audio.done":
                    item_id = event.get("item_id", "unknown")
                    print(f"✅ Audio complete: {item_id}")

                elif event_type == "response.content_part.done":
                    part = event.get("part", {})
                    part_type = part.get("type", "unknown")
                    print(f"✅ Content part done: {part_type}")

                elif event_type == "response.output_item.done":
                    item = event.get("item", {})
                    item_type = item.get("type", "unknown")
                    item_id = item.get("id", "unknown")
                    print(f"✅ Output item done: {item_type} ({item_id})")

                elif event_type == "response.done":
                    response = event.get("response", {})
                    response_id = response.get("id", "unknown")
                    status = response.get("status", "unknown")
                    print(f"✅ Response complete: {response_id} (status: {status})")

                elif event_type == "rate_limits.updated":
                    limits = event.get("rate_limits", [])
                    print(f"📊 Rate limits updated: {len(limits)} limits")
                    for limit in limits:
                        name = limit.get("name", "unknown")
                        remaining = limit.get("remaining", 0)
                        print(f"   {name}: {remaining} remaining")

                elif event_type == "error":
                    # Handle API errors
                    error_details = event.get("error", {})
                    error_type = error_details.get("type", "Unknown")
                    error_code = error_details.get("code", "Unknown")
                    error_message = error_details.get("message", "No message provided")
                    print(f"❌ ERROR: {error_type} ({error_code})")
                    print(f"   Message: {error_message}")
                    logger.error(f"API Error: {error_type} - {error_code} - {error_message}")
                    raise ValueError(f"API Error: {error_type} ({error_code}) - {error_message}")
                
                else:
                    # Log any unknown event types
                    print(f"❓ Unknown event: {event_type}")
                    logger.warning(f"Unknown event type: {event_type} - {event}")
                    
            except json.JSONDecodeError as e:
                logger.error(f"Failed to parse JSON event: {e}")
                print(f"❌ JSON Parse Error: {e}")
                continue

    except Exception as e:
        logger.error(f"Error in audio playback: {e}")
        print(f"❌ Audio playback error: {e}")
        import traceback
        traceback.print_exc()
    finally:
        audio_player.terminate()
        print(f"✅ Audio playback thread completed ({event_count} events processed)")
        logger.info(f"Audio playback done. Processed {event_count} events.")

print("✅ Enhanced audio output function with comprehensive event logging")

✅ Enhanced audio output function with comprehensive event logging
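
As the number of handled event types grows, a long `if`/`elif` chain can become hard to maintain. One possible alternative, shown here only as a sketch and not as the notebook's approach, is a dispatch table keyed by event type. The handler names and the `played_chunks` list (a stand-in for the audio player's queue) are hypothetical.

```python
import base64
import json
from typing import Callable

played_chunks: list[bytes] = []  # hypothetical stand-in for the audio player's queue

def handle_audio_delta(event: dict) -> None:
    """Decode base64 PCM audio and queue it for playback."""
    chunk = base64.b64decode(event.get("delta", ""))
    if chunk:
        played_chunks.append(chunk)

def handle_speech_started(event: dict) -> None:
    """Barge-in: the user is talking, so drop any queued assistant audio."""
    played_chunks.clear()

HANDLERS: dict[str, Callable[[dict], None]] = {
    "response.audio.delta": handle_audio_delta,
    "input_audio_buffer.speech_started": handle_speech_started,
}

def dispatch(raw_event: str) -> bool:
    """Route one JSON event to its handler; return False for unhandled types."""
    event = json.loads(raw_event)
    handler = HANDLERS.get(event.get("type"))
    if handler is None:
        return False
    handler(event)
    return True

demo = json.dumps({"type": "response.audio.delta",
                   "delta": base64.b64encode(b"\x01\x02").decode()})
print(dispatch(demo), played_chunks)  # True [b'\x01\x02']
```

Each handler can then be unit-tested on its own, and unhandled event types fall through to a single logging path.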


## **8. Graceful Shutdown Handler**

In [8]:
def read_keyboard_and_quit() -> None:
    """
    Monitor keyboard input for quit command.
    
    This function runs in a separate thread and waits for the user
    to type 'q' to gracefully shutdown the application.
    """
    print("Press 'q' and Enter to quit the chat.")
    
    while not stop_event.is_set():
        try:
            user_input = input()
            if user_input.strip().lower() == 'q':
                print("Quitting the chat...")
                stop_event.set()
                break
        except EOFError:
            # Handle case where input is interrupted
            break

print("✅ User input function defined")

✅ User input function defined
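
The shutdown mechanism relies on every worker thread polling the shared `threading.Event`. The sketch below shows the pattern on its own, with a dummy worker in place of the audio threads; `demo_stop`, `ticks`, and `worker` are illustrative names.

```python
import threading
import time

demo_stop = threading.Event()
ticks: list[float] = []

def worker() -> None:
    """Loop until the shared event is set, as the audio threads do."""
    while not demo_stop.is_set():
        ticks.append(time.monotonic())
        demo_stop.wait(0.01)  # sleeps, but wakes immediately once the event is set

thread = threading.Thread(target=worker, daemon=True)
thread.start()
time.sleep(0.05)
demo_stop.set()          # equivalent to the user typing 'q'
thread.join(timeout=1)
print(thread.is_alive())  # False
```

Waiting on the event instead of calling `time.sleep` means a worker reacts to the stop request right away rather than after its current sleep interval.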


## **9. Session Configuration**

Production-grade session settings with semantic VAD and audio preprocessing.

In [9]:
# Session configuration for the Voice Live API
session_config = {
    "type": "session.update",
    "session": {
        # Basic AI behavior
        "instructions": "You are a helpful AI assistant responding in natural, engaging language.",
        
        # Voice Activity Detection (VAD) settings
        "turn_detection": {
            "type": "azure_semantic_vad",
            "threshold": 0.3,  # Sensitivity for detecting speech
            "prefix_padding_ms": 200,  # Keep audio before speech starts
            "silence_duration_ms": 200,  # How long to wait for silence
            "remove_filler_words": False,  # Keep "um", "uh", etc.
            "end_of_utterance_detection": {
                "model": "semantic_detection_v1",
                "threshold": 0.01,  # When to consider speech finished
                "timeout": 2,  # Max time to wait for continuation
            },
        },
        
        # Audio preprocessing
        "input_audio_noise_reduction": {
            "type": "azure_deep_noise_suppression"  # Remove background noise
        },
        "input_audio_echo_cancellation": {
            "type": "server_echo_cancellation"  # Prevent feedback loops
        },
        
        # AI voice settings
        "voice": {
            "name": "en-US-Ava:DragonHDLatestNeural",  # High-quality neural voice
            "type": "azure-standard",
            "temperature": 0.8,  # Sampling temperature for the HD voice (variability of the speech output)
        },
    },
    "event_id": ""
}

print("✅ Session configuration created")
print(f"   - Voice: {session_config['session']['voice']['name']}")
print(f"   - VAD Type: {session_config['session']['turn_detection']['type']}")
print(f"   - Noise Reduction: {session_config['session']['input_audio_noise_reduction']['type']}")
print(f"   - Echo Cancellation: {session_config['session']['input_audio_echo_cancellation']['type']}")

✅ Session configuration created
   - Voice: en-US-Ava:DragonHDLatestNeural
   - VAD Type: azure_semantic_vad
   - Noise Reduction: azure_deep_noise_suppression
   - Echo Cancellation: server_echo_cancellation
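
The cells shown so far define the configuration but do not send it. The sketch below assumes the configuration is delivered as a JSON text message through the connection's `send` method once the socket is open; `RecordingConnection` is a hypothetical stand-in for `VoiceLiveConnection` so that the example runs offline, and `send_session_update` is an illustrative helper.

```python
import json

class RecordingConnection:
    """Hypothetical stand-in exposing the same send(str) method as VoiceLiveConnection."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    def send(self, message: str) -> None:
        self.sent.append(message)

def send_session_update(connection, config: dict) -> None:
    """Serialize a session.update event and hand it to the connection."""
    if config.get("type") != "session.update":
        raise ValueError("Expected a session.update event")
    connection.send(json.dumps(config))

demo_config = {
    "type": "session.update",
    "session": {"turn_detection": {"type": "azure_semantic_vad"}},
    "event_id": "",
}
demo_connection = RecordingConnection()
send_session_update(demo_connection, demo_config)
print(len(demo_connection.sent))  # 1
```

With a real connection, the event handler in section 7 would be expected to report a `session.updated` event in response.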


## **10. Authentication & Client Setup**

Dual authentication strategy: Entra ID (preferred) with API key fallback.

In [10]:
# Setup logging for this session
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

# Create logs directory if it doesn't exist
os.makedirs('logs', exist_ok=True)

# Configure logging
logging.basicConfig(
    filename=f'logs/{timestamp}_voicelive_notebook.log',
    filemode="w",
    level=logging.DEBUG,
    format='%(asctime)s:%(name)s:%(levelname)s:%(message)s'
)

# Setup authentication - prefer token-based auth
credential = DefaultAzureCredential()
scopes = "https://ai.azure.com/.default"

try:
    token = credential.get_token(scopes)
    auth_method = "token"
    print("✅ Using token-based authentication")
except Exception as e:
    print(f"⚠️  Token auth failed: {e}")
    print("🔑 Falling back to API key authentication")
    auth_method = "api_key"

# Create the client
client_kwargs = {
    "azure_endpoint": AZURE_VOICE_LIVE_ENDPOINT,
    "api_version": AZURE_VOICE_LIVE_API_VERSION,
}

if auth_method == "token":
    client_kwargs["token"] = token.token
else:
    client_kwargs["api_key"] = AZURE_VOICE_LIVE_API_KEY

client = AzureVoiceLive(**client_kwargs)

print(f"✅ Azure Voice Live client created successfully")
print(f"   - Authentication: {auth_method}")
print(f"   - Endpoint: {AZURE_VOICE_LIVE_ENDPOINT}")
print(f"   - Model: {AZURE_VOICE_LIVE_MODEL}")

✅ Using token-based authentication
✅ Azure Voice Live client created successfully
   - Authentication: token
   - Endpoint: https://poc-ai-agents-voice-resource.cognitiveservices.azure.com/
   - Model: gpt-4o


## **11. Connection Validation**

In [12]:
# Test connection: establishes a connection and immediately closes it

try:
    print("🔌 Testing connection...")
    test_connection = client.connect(model=AZURE_VOICE_LIVE_MODEL)
    print("✅ Connection test successful!")
    test_connection.close()
    print("🔌 Test connection closed")
except Exception as e:
    print(f"❌ Connection test failed: {e}")


🔌 Testing connection...
🔌 Establishing WebSocket connection (ID: c82a9cab)...
✅ WebSocket connection opened (ID: c82a9cab)
✅ Connection established successfully (ID: c82a9cab)
✅ Connection test successful!
🔌 Closing WebSocket connection (ID: c82a9cab)...
🔌 WebSocket connection closed (ID: c82a9cab)
   Status: None, Message: None
✅ Connection closed (ID: c82a9cab, 1 messages processed)
🔌 Test connection closed


## **Development Utilities & Diagnostics**

In [11]:
# Development and testing utilities

def check_audio_devices():
    """Check available audio input and output devices."""
    print("🎤 Available Audio Input Devices:")
    for i, device in enumerate(sd.query_devices()):
        if device['max_input_channels'] > 0:
            print(f"  {i}: {device['name']} (inputs: {device['max_input_channels']})")
    
    print("\n🔊 Available Audio Output Devices:")
    for i, device in enumerate(sd.query_devices()):
        if device['max_output_channels'] > 0:
            print(f"  {i}: {device['name']} (outputs: {device['max_output_channels']})")
    
    print(f"\n🎛️  Default Input Device: {sd.query_devices(sd.default.device[0])['name']}")
    print(f"🎛️  Default Output Device: {sd.query_devices(sd.default.device[1])['name']}")

def test_microphone(duration=3):
    """Test microphone input for a few seconds."""
    print(f"🎤 Testing microphone for {duration} seconds...")
    print("💬 Please speak into your microphone...")
    
    def callback(indata, frames, time, status):
        volume_norm = np.linalg.norm(indata) * 10
        print(f"📊 Volume level: {'█' * int(volume_norm)}")
    
    with sd.InputStream(callback=callback, channels=1, samplerate=AUDIO_SAMPLE_RATE):
        sd.sleep(duration * 1000)
    
    print("✅ Microphone test complete")

def create_custom_session_config(
    instructions="You are a helpful AI assistant.",
    voice_name="en-US-Ava:DragonHDLatestNeural",
    temperature=0.8,
    vad_threshold=0.3
):
    """Create a custom session configuration."""
    return {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "turn_detection": {
                "type": "azure_semantic_vad",
                "threshold": vad_threshold,
                "prefix_padding_ms": 200,
                "silence_duration_ms": 200,
                "remove_filler_words": False,
                "end_of_utterance_detection": {
                    "model": "semantic_detection_v1",
                    "threshold": 0.01,
                    "timeout": 2,
                },
            },
            "input_audio_noise_reduction": {
                "type": "azure_deep_noise_suppression"
            },
            "input_audio_echo_cancellation": {
                "type": "server_echo_cancellation"
            },
            "voice": {
                "name": voice_name,
                "type": "azure-standard",
                "temperature": temperature,
            },
        },
        "event_id": ""
    }

def view_logs():
    """View the most recent log file."""
    import glob
    log_files = glob.glob("logs/*.log")
    if log_files:
        # Modification time is consistent across platforms; ctime is metadata-change time on Unix
        latest_log = max(log_files, key=os.path.getmtime)
        print(f"📝 Latest log file: {latest_log}")
        with open(latest_log, 'r') as f:
            lines = f.readlines()
            print("📄 Last 20 lines:")
            for line in lines[-20:]:
                print(line.strip())
    else:
        print("📝 No log files found")

print("✅ Development utilities loaded:")
print("  - check_audio_devices(): Check available audio devices")
print("  - test_microphone(duration=3): Test microphone input")
print("  - create_custom_session_config(): Create custom configurations")
print("  - view_logs(): View recent log entries")

✅ Development utilities loaded:
  - check_audio_devices(): Check available audio devices
  - test_microphone(duration=3): Test microphone input
  - create_custom_session_config(): Create custom configurations
  - view_logs(): View recent log entries


## **Troubleshooting & Best Practices**

### **Common Issues**
- **401/403**: Verify Entra ID permissions and endpoint URL
- **404/409**: Check agent ID and API version
- **Audio latency**: Optimize buffer sizes and sample rates
- **Connection hangs**: Implement proper timeout and cleanup
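
For the audio latency item, it helps to know how large each chunk actually is before tuning buffers. The following is a minimal sketch, assuming 16-bit mono PCM. The 24 kHz rate and 20 ms chunk duration come from this notebook's feature list; the sample width is an assumption, and `chunk_geometry` is a hypothetical helper rather than part of the notebook.

```python
import base64

SAMPLE_RATE_HZ = 24_000   # sample rate used in this notebook
CHUNK_MS = 20             # chunk duration used in this notebook
BYTES_PER_SAMPLE = 2      # assumption: 16-bit mono PCM


def chunk_geometry(sample_rate_hz=SAMPLE_RATE_HZ, chunk_ms=CHUNK_MS,
                   bytes_per_sample=BYTES_PER_SAMPLE):
    """Return the size of one audio chunk in samples, raw bytes, and base64 characters."""
    samples = sample_rate_hz * chunk_ms // 1000
    raw_bytes = samples * bytes_per_sample
    # Base64 emits 4 characters for every 3 input bytes, rounded up
    base64_chars = 4 * ((raw_bytes + 2) // 3)
    return {"samples": samples, "raw_bytes": raw_bytes, "base64_chars": base64_chars}


geometry = chunk_geometry()

# Cross-check the arithmetic by encoding one chunk of silence
silence = bytes(geometry["raw_bytes"])
encoded = base64.b64encode(silence).decode("ascii")

print(geometry, len(encoded))
```

Doubling the chunk duration halves the number of messages per second but adds the same amount of capture delay, so the chunk size is a direct latency trade-off.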

### **Production Considerations**
- **Authentication**: Use Entra ID in production, API keys for development
- **Audio Quality**: 24kHz sample rate with noise suppression enabled
- **Error Handling**: Implement circuit breakers and retry logic
- **Monitoring**: Add Application Insights telemetry
- **Scaling**: Use connection pooling for high-volume scenarios
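
The retry guidance above can be sketched as exponential backoff with jitter. This is an illustrative pattern, not the notebook's implementation: `retry_with_backoff` and its parameters are hypothetical, and which exceptions are worth retrying depends on the WebSocket client in use.

```python
import random
import time


def retry_with_backoff(operation, *, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    The delay doubles after each failure (capped at `max_delay`), plus up to
    25% random jitter so that many clients do not reconnect in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 4))
```

A call such as `retry_with_backoff(lambda: client.connect(model=AZURE_VOICE_LIVE_MODEL))` would wrap the connection step, provided `connect` raises one of the exception types listed in `retry_on`.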

## **Audio Diagnostics**

In [13]:
# First, let's check your audio devices
print("🔍 Checking audio device configuration...")
check_audio_devices()

print("\n" + "="*60)
print("🎤 Testing microphone for 3 seconds...")
print("Please speak into your microphone!")
test_microphone(duration=3)

🔍 Checking audio device configuration...
🎤 Available Audio Input Devices:
  0: Microsoft Sound Mapper - Input (inputs: 2)
  1: Surface Stereo Microphones (Sur (inputs: 2)
  2: Microphone (Lumina Camera - Raw (inputs: 2)
  6: Primary Sound Capture Driver (inputs: 2)
  7: Surface Stereo Microphones (Surface High Definition Audio) (inputs: 2)
  8: Microphone (Lumina Camera - Raw) (inputs: 2)
  14: Surface Stereo Microphones (Surface High Definition Audio) (inputs: 2)
  15: Microphone (Lumina Camera - Raw) (inputs: 2)
  17: Microphone (Microsoft Surface Thunderbolt(TM) 4 Dock Audio) (inputs: 1)
  20: Headset (@System32\drivers\bthhfenum.sys,#2;%1 Hands-Free%0
;(Shiva’s AirPods Pro #2)) (inputs: 1)
  22: Microphone (Dell USB Audio) (inputs: 2)
  24: Headset (@System32\drivers\bthhfenum.sys,#2;%1 Hands-Free%0
;(Shiva’s AirPods Pro #2 - Find My)) (inputs: 1)
  27: PC Speaker (Realtek HD Audio 2nd output with SST) (inputs: 2)
  30: PC Speaker (Realtek HD Audio output with SST) (inputs: 2)
  31

In [14]:
def run_voice_chat_debug():
    """
    Enhanced voice chat function with better debugging and error handling.
    """
    global stop_event
    connection = None  # Initialize connection variable
    
    try:
        # Reset the stop event for a fresh start
        stop_event.clear()
        
        print("🚀 Starting Voice Live chat application (DEBUG MODE)...")
        
        # 1. Connect to the API
        connection = client.connect(model=AZURE_VOICE_LIVE_MODEL)
        print("✅ Connected to Voice Live API")
        
        # 2. Send session configuration
        connection.send(json.dumps(session_config))
        print("✅ Session configuration sent")
        print(f"   Instructions: {session_config['session']['instructions'][:50]}...")
        print(f"   Voice: {session_config['session']['voice']['name']}")
        
        # 3. Wait a moment to receive session.created event
        print("⏳ Waiting for session creation...")
        time.sleep(2)
        
        # Check for any immediate messages
        for i in range(5):
            msg = connection.recv()
            if msg:
                try:
                    event = json.loads(msg)
                    print(f"📨 Received: {event.get('type', 'unknown')}")
                    if event.get('type') == 'session.created':
                        print(f"   Session ID: {event.get('session', {}).get('id', 'unknown')}")
                    elif event.get('type') == 'error':
                        print(f"❌ Error: {event.get('error', {})}")
                except (ValueError, TypeError):  # malformed JSON or a non-text frame
                    print(f"📨 Raw message: {msg[:100]}...")
            else:
                break
        
        # 4. Create and start threads with better error handling
        print("🧵 Starting threads...")
        
        def safe_listen_and_send_audio(connection):
            try:
                listen_and_send_audio(connection)
            except Exception as e:
                print(f"❌ Audio input error: {e}")
                stop_event.set()
        
        def safe_receive_audio_and_playback(connection):
            try:
                receive_audio_and_playback(connection)
            except Exception as e:
                print(f"❌ Audio output error: {e}")
                stop_event.set()
        
        send_thread = threading.Thread(
            target=safe_listen_and_send_audio, 
            args=(connection,),
            name="AudioInput"
        )
        receive_thread = threading.Thread(
            target=safe_receive_audio_and_playback, 
            args=(connection,),
            name="AudioOutput"
        )
        
        # Start audio threads
        send_thread.start()
        receive_thread.start()
        
        print("🎙️  Voice chat is now active!")
        print("💬 You can start speaking...")
        print("📊 Monitoring audio threads...")
        
        # Monitor threads instead of waiting for keyboard input
        start_time = time.time()
        max_runtime = 60  # Maximum runtime in seconds before auto-shutdown
        
        while not stop_event.is_set():
            # Check if threads are still alive
            if not send_thread.is_alive():
                print("⚠️  Audio input thread died")
                stop_event.set()
                break
            if not receive_thread.is_alive():
                print("⚠️  Audio output thread died")
                stop_event.set()
                break
            
            # Auto-shutdown after max runtime to prevent hanging
            if time.time() - start_time > max_runtime:
                print(f"⏰ Auto-shutdown after {max_runtime} seconds")
                stop_event.set()
                break
            
            # Print status every 10 seconds
            if (time.time() - start_time) % 10 < 1:
                elapsed = int(time.time() - start_time)
                remaining = max_runtime - elapsed
                print(f"📊 Running {elapsed}s... Threads alive. {remaining}s remaining. Speak into your microphone!")
            
            time.sleep(1)
        
    except Exception as e:
        print(f"❌ Error in voice chat: {e}")
        import traceback
        traceback.print_exc()
        stop_event.set()
    
    finally:
        # ALWAYS ensure cleanup happens
        print("🛑 Shutting down...")
        stop_event.set()
        
        # Wait for threads to finish (with timeout)
        if 'send_thread' in locals() and send_thread.is_alive():
            send_thread.join(timeout=3)
            if send_thread.is_alive():
                print("⚠️  Audio input thread did not stop gracefully")
        
        if 'receive_thread' in locals() and receive_thread.is_alive():
            receive_thread.join(timeout=3)
            if receive_thread.is_alive():
                print("⚠️  Audio output thread did not stop gracefully")
        
        # Close connection - this is the critical fix
        if connection is not None:
            try:
                connection.close()
                print("✅ WebSocket connection closed properly")
            except Exception as e:
                print(f"⚠️  Error closing connection: {e}")
        else:
            print("ℹ️  No connection to close")
        
        print("✅ Voice chat cleanup completed")

print("✅ Debug voice chat function created with proper connection cleanup")

✅ Debug voice chat function created with proper connection cleanup


In [15]:
# Run the debug version of voice chat
# This will show more detailed information about what's happening

# First check if we can see logs
print("📝 Checking recent logs:")
view_logs()

print("\n" + "="*60)
print("🚀 Starting DEBUG voice chat...")
print("This version will run for a while and show more status information")
print("Watch for error messages and connection status")

# Runs the debug version; comment out the call below to skip it.
run_voice_chat_debug()

📝 Checking recent logs:
📝 Latest log file: logs\2025-09-02_00-06-48_voicelive_notebook.log
📄 Last 20 lines:
File "c:\Users\pablosal\AppData\Local\anaconda3\envs\audioagent\Lib\site-packages\azure\identity\_credentials\shared_cache.py", line 152, in get_token
token_info = self._get_token_base(*scopes, options=options, base_method_name="get_token", **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\pablosal\AppData\Local\anaconda3\envs\audioagent\Lib\site-packages\azure\identity\_credentials\shared_cache.py", line 186, in _get_token_base
account = self._get_account(self._username, self._tenant_id, is_cae=is_cae)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\pablosal\AppData\Local\anaconda3\envs\audioagent\Lib\site-packages\azure\identity\_internal\decorators.py", line 67, in wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "c:\Users\pablosal\AppData\Local\anaconda3\envs\audioagent\L

KeyboardInterrupt: 

## **12. Azure AI Agent Service Integration**

Azure AI Agent Service provides enterprise-grade conversational AI with:
- **Centralized prompt management** in Azure AI Foundry portal
- **Version-controlled agent configurations** 
- **Multi-tenant agent deployment**
- **Business logic separation** from client code

### **Agent vs Direct Model Comparison**

| Aspect | Direct Model | Azure AI Agent Service |
|--------|--------------|-------------------------|
| **Instructions** | Client-side code | Azure portal managed |
| **Updates** | Code deployment | Portal configuration |
| **Scalability** | Manual management | Enterprise multi-tenant |
| **Governance** | Developer-controlled | Business-user accessible |

### **Environment Configuration**

**Standard AI Foundry Projects (Recommended):**
```env
# Required for AI Foundry Projects
AI_FOUNDRY_PROJECT_NAME=your-project-name
AI_FOUNDRY_AGENT_ID=asst_xxxxxxxxxxxxx

# Standard Voice Live API
AZURE_VOICE_LIVE_ENDPOINT=https://your-resource.cognitiveservices.azure.com
AZURE_VOICE_LIVE_API_VERSION=2025-05-01-preview
```

**Hub-based Projects (Legacy):**
```env
# Required for Hub-based Projects  
AI_FOUNDRY_AGENT_ID=asst_xxxxxxxxxxxxx
AI_FOUNDRY_AGENT_CONNECTION_STRING=your-connection-string

# Standard Voice Live API
AZURE_VOICE_LIVE_ENDPOINT=https://your-resource.cognitiveservices.azure.com
AZURE_VOICE_LIVE_API_VERSION=2025-05-01-preview
```

**Note:** Most users should use the standard AI Foundry project approach with `AI_FOUNDRY_PROJECT_NAME` + `AI_FOUNDRY_AGENT_ID`. `AI_FOUNDRY_AGENT_CONNECTION_STRING` is only needed for legacy hub-based deployments.
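
The selection between the two configurations can be checked before any connection is attempted. The sketch below is one possible way to do it, using the environment variable names listed above; `AgentDeployment` and `resolve_agent_deployment` are hypothetical names introduced for illustration.

```python
from typing import Mapping, NamedTuple, Optional


class AgentDeployment(NamedTuple):
    mode: str                          # "project" (recommended) or "hub" (legacy)
    agent_id: str
    project_name: Optional[str]
    connection_string: Optional[str]


def resolve_agent_deployment(env: Mapping[str, str]) -> AgentDeployment:
    """Pick the deployment mode from environment-style settings, preferring project-based."""
    agent_id = env.get("AI_FOUNDRY_AGENT_ID")
    if not agent_id:
        raise ValueError("AI_FOUNDRY_AGENT_ID is required")

    project_name = env.get("AI_FOUNDRY_PROJECT_NAME")
    if project_name:
        return AgentDeployment("project", agent_id, project_name, None)

    connection_string = env.get("AI_FOUNDRY_AGENT_CONNECTION_STRING")
    if connection_string:
        return AgentDeployment("hub", agent_id, None, connection_string)

    raise ValueError(
        "Set AI_FOUNDRY_PROJECT_NAME (recommended) or "
        "AI_FOUNDRY_AGENT_CONNECTION_STRING (legacy)"
    )
```

Passing `os.environ` after `load_dotenv()` gives the same precedence the notebook applies later: the project name wins when both values are set.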

In [None]:
class AzureVoiceLiveAgent:
    """
    Enterprise Azure Voice Live client with Agent Service integration.
    
    Supports both AI Foundry project-based (recommended) and hub-based deployments.
    """
    
    def __init__(self, azure_endpoint: str, api_version: str = "2025-05-01-preview"):
        self._azure_endpoint = azure_endpoint
        self._api_version = api_version
        self._credential = DefaultAzureCredential()
        self._connection = None
    
    def connect_agent(self, project_name: str = None, agent_id: str = None, 
                     connection_string: str = None) -> VoiceLiveConnection:
        """
        Connect to Azure AI Agent Service.
        
        Standard AI Foundry Projects (Recommended):
            project_name: AI Foundry project name
            agent_id: Agent ID from Azure portal
            
        Hub-based Projects (Legacy):
            agent_id: Agent ID from Azure portal  
            connection_string: Connection string from Azure portal
        """
        if not agent_id:
            raise ValueError("Agent ID is required")
            
        # Get authentication tokens
        voice_token = self._credential.get_token("https://ai.azure.com/.default")
        
        # Build WebSocket URL (strip any trailing slash so the path does not start with "//")
        ws_endpoint = self._azure_endpoint.rstrip("/").replace("https://", "wss://")
        
        if project_name:
            # Standard AI Foundry project-based deployment (recommended)
            agent_token = voice_token  # Use same token for project-based
            url = (f"{ws_endpoint}/voice-live/realtime?"
                   f"api-version={self._api_version}&"
                   f"agent-project-name={project_name}&"
                   f"agent-id={agent_id}&"
                   f"agent-access-token={agent_token.token}")
            print(f"✅ Using AI Foundry project: {project_name}")
            
        elif connection_string:
            # Hub-based deployment (legacy)
            agent_token = self._credential.get_token("https://ml.azure.com/.default")
            url = (f"{ws_endpoint}/voice-live/realtime?"
                   f"api-version={self._api_version}&"
                   f"agent-connection-string={connection_string}&"
                   f"agent-id={agent_id}&"
                   f"agent-access-token={agent_token.token}")
            print("✅ Using hub-based connection (legacy)")
            
        else:
            raise ValueError("Either project_name (recommended) or connection_string (legacy) is required")
        
        # Setup headers
        headers = {
            "Authorization": f"Bearer {voice_token.token}",
            "x-ms-client-request-id": str(uuid.uuid4())
        }
        
        # Create and connect
        self._connection = VoiceLiveConnection(url, headers)
        self._connection.connect()
        return self._connection
    
    def create_agent_session_config(self, voice_name: str = "en-US-Ava:DragonHDLatestNeural") -> dict:
        """Create optimized session configuration for agent deployment."""
        return {
            "type": "session.update",
            "session": {
                "turn_detection": {
                    "type": "azure_semantic_vad",
                    "threshold": 0.3,
                    "prefix_padding_ms": 200,
                    "silence_duration_ms": 200,
                    "remove_filler_words": False,
                    "end_of_utterance_detection": {
                        "model": "semantic_detection_v1",
                        "threshold": 0.01,
                        "timeout": 2,
                    },
                },
                "input_audio_noise_reduction": {"type": "azure_deep_noise_suppression"},
                "input_audio_echo_cancellation": {"type": "server_echo_cancellation"},
                "voice": {
                    "name": voice_name,
                    "type": "azure-standard", 
                    "temperature": 0.8,
                },
            },
            "event_id": ""
        }

# Initialize enterprise agent client
agent_client = AzureVoiceLiveAgent(
    azure_endpoint=AZURE_VOICE_LIVE_ENDPOINT,
    api_version=AZURE_VOICE_LIVE_API_VERSION
)

print("✅ Enterprise Azure Voice Live Agent client initialized")
print(f"   Endpoint: {AZURE_VOICE_LIVE_ENDPOINT}")
print(f"   API Version: {AZURE_VOICE_LIVE_API_VERSION}")
print("   Deployment: Standard AI Foundry project (recommended) or hub-based (legacy)")

✅ Enterprise Azure Voice Live Agent client initialized
   Endpoint: https://poc-ai-agents-voice-resource.cognitiveservices.azure.com/
   API Version: 2025-05-01-preview
   Deployment: Standard AI Foundry project (recommended) or hub-based (legacy)
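
The `connect_agent` method above interpolates the connection string and access token directly into the query string. Connection strings typically contain characters such as `;` and `=`, so percent-encoding the values is a safer default. The following is a sketch of that idea, not the notebook's code: `build_agent_ws_url` is a hypothetical helper, the query parameter names are copied from the class above, and it assumes the service accepts percent-encoded values.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_agent_ws_url(endpoint, api_version, agent_id, access_token,
                       project_name=None, connection_string=None):
    """Build the agent WebSocket URL with percent-encoded query parameters."""
    # Drop a trailing slash so the path does not become "//voice-live/realtime"
    base = endpoint.rstrip("/").replace("https://", "wss://", 1)

    params = {"api-version": api_version}
    if project_name:
        params["agent-project-name"] = project_name
    elif connection_string:
        params["agent-connection-string"] = connection_string
    else:
        raise ValueError("Either project_name or connection_string is required")
    params["agent-id"] = agent_id
    params["agent-access-token"] = access_token

    return f"{base}/voice-live/realtime?{urlencode(params)}"


# Placeholder values for illustration only
url = build_agent_ws_url(
    "https://your-resource.cognitiveservices.azure.com/",
    "2025-05-01-preview",
    agent_id="asst_xxxxxxxxxxxxx",
    access_token="token-placeholder",
    connection_string="host;subscription=abc;group=rg",
)
query = parse_qs(urlsplit(url).query)
print(urlsplit(url).path, query["agent-connection-string"])
```

Because the access token travels in the URL, it is worth keeping the full URL out of log output.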


In [15]:
def run_agent_voice_application():
    """
    Run Azure AI Agent Service voice application with comprehensive event logging.
    
    Enterprise-grade implementation with proper error handling,
    logging, and graceful shutdown.
    """
    global stop_event
    connection = None
    threads = []
    
    # Check if already running
    if not stop_event.is_set():
        print("⚠️  Stopping any previous instance...")
        stop_event.set()
        time.sleep(2)
    
    try:
        stop_event.clear()
        
        print("🚀 Starting Azure AI Agent Service voice application")
        print(f"   Timestamp: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
        
        # Load agent configuration from environment
        project_name = os.environ.get("AI_FOUNDRY_PROJECT_NAME")
        agent_id = os.environ.get("AI_FOUNDRY_AGENT_ID") 
        connection_string = os.environ.get("AI_FOUNDRY_AGENT_CONNECTION_STRING")
        
        print("📋 Configuration Check:")
        print(f"   Project Name: {project_name or '❌ Not set'}")
        print(f"   Agent ID: {agent_id or '❌ Not set'}")
        print(f"   Connection String: {'✅ Set' if connection_string else '❌ Not set (hub-based only)'}")
        
        if not agent_id:
            raise ValueError("AI_FOUNDRY_AGENT_ID environment variable required")
        
        # Determine deployment type and connect
        if project_name:
            deployment_type = "AI Foundry Project (Recommended)"
            print(f"✅ Using deployment type: {deployment_type}")
            print(f"   Agent ID: {agent_id}")
            print(f"   Project: {project_name}")
            
            # Connect using project name (recommended)
            print("🔌 Connecting to Azure AI Agent Service...")
            connection = agent_client.connect_agent(
                project_name=project_name,
                agent_id=agent_id
            )
            
        elif connection_string:
            deployment_type = "Hub-based (Legacy)"
            print(f"✅ Using deployment type: {deployment_type}")
            print(f"   Agent ID: {agent_id}")
            
            # Connect using connection string (legacy)
            print("🔌 Connecting to Azure AI Agent Service...")
            connection = agent_client.connect_agent(
                agent_id=agent_id,
                connection_string=connection_string
            )
            
        else:
            raise ValueError(
                "Either AI_FOUNDRY_PROJECT_NAME (recommended) or "
                "AI_FOUNDRY_AGENT_CONNECTION_STRING (legacy) is required"
            )
        
        print("✅ Connected to Azure AI Agent Service")
        
        # Configure session
        print("⚙️  Configuring agent session...")
        session_config = agent_client.create_agent_session_config()
        connection.send(json.dumps(session_config))
        print("✅ Agent session configuration sent")
        
        # Wait a moment for session to be established
        print("⏳ Waiting for session establishment...")
        time.sleep(2)
        
        # Start audio processing threads
        print("🧵 Starting audio processing threads...")
        
        def safe_listen_and_send_audio(connection):
            try:
                listen_and_send_audio(connection)
            except Exception as e:
                print(f"❌ Audio input thread error: {e}")
                logger.error(f"Audio input error: {e}")
                stop_event.set()
        
        def safe_receive_audio_and_playback(connection):
            try:
                receive_audio_and_playback(connection)
            except Exception as e:
                print(f"❌ Audio output thread error: {e}")
                logger.error(f"Audio output error: {e}")
                stop_event.set()
        
        def safe_read_keyboard_and_quit():
            try:
                read_keyboard_and_quit()
            except Exception as e:
                print(f"❌ Keyboard input thread error: {e}")
                logger.error(f"Keyboard input error: {e}")
                stop_event.set()
        
        audio_threads = [
            threading.Thread(target=safe_listen_and_send_audio, args=(connection,), name="AudioInput"),
            threading.Thread(target=safe_receive_audio_and_playback, args=(connection,), name="AudioOutput"),
            threading.Thread(target=safe_read_keyboard_and_quit, name="UserInput")
        ]
        
        for i, thread in enumerate(audio_threads):
            thread.start()
            threads.append(thread)
            print(f"   ✅ Thread {i+1} started: {thread.name}")
        
        print("=" * 60)
        print("🎙️  Agent voice application is now ACTIVE!")
        print("💬 Start speaking - your conversation is managed by the Azure AI agent")
        print("📊 All events will be logged below")
        print("⌨️  Press 'q' + Enter to quit")
        print("=" * 60)
        
        # Wait for user to quit or error
        threads[2].join()  # Wait for keyboard thread
        
    except Exception as e:
        print(f"❌ Application error: {e}")
        logger.error(f"Agent application error: {e}")
        import traceback
        traceback.print_exc()
        
    finally:
        print("\n" + "=" * 60)
        print("🛑 Shutting down agent application...")
        stop_event.set()
        
        # Wait for threads with timeout
        print("⏳ Waiting for threads to complete...")
        for i, thread in enumerate(threads):
            if thread.is_alive():
                print(f"   Stopping thread {i+1}: {thread.name}")
                thread.join(timeout=5)
                if thread.is_alive():
                    print(f"   ⚠️  Thread {i+1} ({thread.name}) did not stop gracefully")
                else:
                    print(f"   ✅ Thread {i+1} ({thread.name}) stopped")
        
        # Close connection
        if connection:
            try:
                print("🔌 Closing agent connection...")
                connection.close()
                print("✅ Agent connection closed successfully")
            except Exception as e:
                print(f"⚠️  Connection cleanup error: {e}")
                logger.error(f"Connection cleanup error: {e}")
        else:
            print("ℹ️  No connection to close")
        
        print("✅ Azure AI Agent Service application shutdown complete")
        print("=" * 60)

print("✅ Azure AI Agent Service voice application function ready")
print("   This provides comprehensive event logging for debugging and monitoring")

✅ Azure AI Agent Service voice application function ready
   This provides comprehensive event logging for debugging and monitoring
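
The shutdown sequence in the function above (set the stop event, join each thread with a timeout, report threads that did not stop) can be isolated from the audio code. The sketch below shows that pattern on its own, with one difference that is a design choice of this example: the main thread waits on the event itself, so a failure in any worker ends the run instead of waiting for keyboard input. `run_until_stopped` and the worker signature are hypothetical.

```python
import threading
from typing import Callable, Dict, List

Worker = Callable[[threading.Event], None]


def run_until_stopped(workers: Dict[str, Worker], max_runtime: float,
                      join_timeout: float = 1.0) -> List[str]:
    """Run workers until one exits or `max_runtime` elapses.

    Returns the names of threads that were still alive after `join_timeout`.
    """
    stop_event = threading.Event()

    def guard(name: str, fn: Worker) -> Callable[[], None]:
        def runner() -> None:
            try:
                fn(stop_event)
            except Exception as exc:
                print(f"{name} failed: {exc}")
            finally:
                stop_event.set()  # any worker exiting stops the others
        return runner

    threads = [threading.Thread(target=guard(name, fn), name=name, daemon=True)
               for name, fn in workers.items()]
    for thread in threads:
        thread.start()

    stop_event.wait(timeout=max_runtime)  # blocks without a polling loop
    stop_event.set()

    stuck = []
    for thread in threads:
        thread.join(timeout=join_timeout)
        if thread.is_alive():
            stuck.append(thread.name)
    return stuck
```

Workers are expected to check the event regularly, for example with `while not stop_event.wait(0.02): ...`, so that they exit within the join timeout.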


## **13. Run Azure AI Agent Service Application**

### **Production Voice Application**

Execute the enterprise-grade voice application with comprehensive event logging and Azure AI Agent Service integration.

**Prerequisites:**
- Azure AI Foundry resource with Voice Live API enabled
- Created agent in Azure AI Foundry portal  
- Environment variables configured (see below)
- Audio devices (headset recommended for production)

**Required Environment Variables:**
- `AI_FOUNDRY_AGENT_ID` - Your agent ID from Azure AI Foundry portal
- `AI_FOUNDRY_PROJECT_NAME` - Your project name (recommended for standard projects)

**Optional (Legacy Hub-based Projects):**
- `AI_FOUNDRY_AGENT_CONNECTION_STRING` - Only needed for hub-based deployments

In [None]:
run_agent_voice_application()

⚠️  Stopping any previous instance...
🚀 Starting Azure AI Agent Service voice application
   Timestamp: 2025-09-02 09:14:52
📋 Configuration Check:
   Project Name: poc-ai-agents-voice
   Agent ID: asst_Dd9U7mxFgfZxjwhEbSr76dyU
   Connection String: ❌ Not set (hub-based only)
✅ Using deployment type: AI Foundry Project (Recommended)
   Agent ID: asst_Dd9U7mxFgfZxjwhEbSr76dyU
   Project: poc-ai-agents-voice
🔌 Connecting to Azure AI Agent Service...
✅ Using AI Foundry project: poc-ai-agents-voice
🔌 Establishing WebSocket connection (ID: 018a7c61)...
✅ WebSocket connection opened (ID: 018a7c61)
✅ Connection established successfully (ID: 018a7c61)
✅ Connected to Azure AI Agent Service
⚙️  Configuring agent session...
✅ Agent session configuration sent
⏳ Waiting for session establishment...
🧵 Starting audio processing threads...
   ✅ Thread 1 started: AudioInput
   ✅ Thread 2 started: AudioOutput
Press 'q' and Enter to quit the chat.
   ✅ Thread 3 started: UserInput
🎙️  Agent voice applicati