# MCP-Powered Voice Agent with DeepSeek

## Overview

This notebook demonstrates the implementation of a voice-enabled AI agent that leverages:
- **MCP (Model Context Protocol)**: For structured AI model interactions
- **DeepSeek**: As the underlying language model for natural language processing
- **Speech Recognition**: For converting voice input to text
- **Text-to-Speech**: For converting AI responses back to voice

## Architecture Overview

```
Voice Input → Speech Recognition → MCP Request → DeepSeek Model → MCP Response → Text-to-Speech → Voice Output
```

## Key Components

1. **Audio Processing Pipeline**: Handles voice input/output
2. **Language Model Integration**: DeepSeek for intelligent responses
3. **MCP Integration**: Structured communication protocol
4. **Interactive Interface**: Jupyter widgets for user interaction

## Prerequisites

- Python 3.8+
- Microphone access for voice input
- Audio output capabilities
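- Internet connection for the DeepSeek API and Google Speech Recognition
- A DeepSeek API key, read from the `DEEPSEEK_API_KEY` environment variable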


## 1. Environment Setup and Dependencies

This section installs and configures all necessary dependencies for the voice agent.

### Widget Extensions
We start by setting up Jupyter widgets, which provide the interactive UI components for the voice agent interface.

In [None]:
# Install and enable Jupyter widgets for interactive UI components
# ipywidgets: Provides interactive HTML widgets for Jupyter notebooks
# This enables buttons, sliders, and other UI elements for our voice agent interface

!pip install --upgrade ipywidgets

# Enable the widget extension (typically only needed in older classic Jupyter Notebook setups)
# Note: Newer environments such as JupyterLab may report "not found" here; widgets will still work
!jupyter nbextension enable --py widgetsnbextension

print("✅ Jupyter widgets setup completed")
print("📝 Note: If you see 'jupyter-nbextension not found', this is normal in some environments")

### Machine Learning Dependencies

Installing the core AI/ML libraries:
- **Transformers**: Hugging Face library for pre-trained language models
- **PyTorch**: Deep learning framework used by many modern NLP models
- **DeepSeek Integration**: These libraries will enable us to work with DeepSeek models

In [None]:
# Install core machine learning and NLP dependencies

# transformers: Hugging Face library providing access to thousands of pre-trained models
# - Includes tokenizers, model architectures, and training utilities
# - Essential for working with modern language models like DeepSeek

# torch: PyTorch deep learning framework
# - Provides tensor operations and neural network building blocks
# - Required backend for most transformer models

!pip install transformers torch

print("✅ Machine Learning dependencies installed")
print("🤖 Ready to work with transformer models and DeepSeek")

### Audio Processing Dependencies

Installing libraries for voice input and output:
- **SpeechRecognition**: Converts speech to text using various engines (Google, Sphinx, etc.)
- **PyAudio**: Low-level audio I/O library for recording and playing audio
- **pyttsx3**: Text-to-speech conversion library with multiple engine support

In [None]:
# Install audio processing libraries for voice input/output

# SpeechRecognition: Library for performing speech recognition
# - Supports multiple engines: Google Speech Recognition, CMU Sphinx, etc.
# - Handles microphone input and audio file processing

# pyaudio: Cross-platform audio I/O library
# - Provides low-level access to audio hardware
# - Required for real-time audio recording and playback

# pyttsx3: Text-to-speech conversion library
# - Works offline with system TTS engines
# - Cross-platform support (Windows SAPI, macOS NSSpeechSynthesizer, Linux espeak)

!pip install SpeechRecognition pyaudio pyttsx3

print("✅ Audio processing dependencies installed")
print("🎤 Speech recognition ready")
print("🔊 Text-to-speech ready")
print("⚠️  Note: If PyAudio installation fails, you may need system audio development libraries")
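
PyAudio is a binding to the PortAudio C library, so a failed `pip install pyaudio` usually means the PortAudio development headers are missing. The sketch below picks an install hint by operating system. The package names are assumptions that vary by distribution and package manager, and `portaudio_hint` is a hypothetical helper, not part of PyAudio.

```python
import platform

# Assumed package names; adjust them for your distribution or package manager.
PORTAUDIO_HINTS = {
    "Linux": "sudo apt-get install portaudio19-dev  # Debian/Ubuntu",
    "Darwin": "brew install portaudio  # macOS with Homebrew",
}


def portaudio_hint(system_name=None):
    """Return a suggested command for installing the PortAudio headers."""
    system_name = system_name or platform.system()
    return PORTAUDIO_HINTS.get(
        system_name,
        "See the PyAudio documentation for this platform.",
    )


print(portaudio_hint())
```

After installing the system package, rerun `pip install pyaudio`.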

### Additional Dependencies for MCP and Enhanced Functionality

In [None]:
# Install additional dependencies for enhanced functionality

# requests: HTTP library for API calls (DeepSeek API integration)
# openai: Client library for OpenAI-compatible APIs (DeepSeek supports OpenAI format)
# aiohttp: Asynchronous HTTP client for non-blocking requests
# asyncio, json, threading: Part of the Python standard library, so no installation is needed

!pip install requests openai aiohttp

print("✅ Additional dependencies installed")
print("🌐 API integration ready")
print("⚡ Async operations supported")
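
Before moving on to the imports, it can help to confirm that every package actually landed in the active environment. The following is a minimal sketch using only the standard library. `REQUIRED_MODULES` and `find_missing` are illustrative names, and the list holds import names, which differ from pip names in some cases (`SpeechRecognition` is imported as `speech_recognition`).

```python
import importlib.util

# Import names (not pip names) that the later cells rely on.
REQUIRED_MODULES = [
    "ipywidgets", "transformers", "torch", "speech_recognition",
    "pyaudio", "pyttsx3", "requests", "openai",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
print("Missing modules:", missing or "none")
```

Any name reported as missing points at the install cell that needs to be rerun.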

## 2. Import Libraries and Initialize Components

This section imports all necessary libraries and sets up the basic components for our voice agent.

In [None]:
# Core Python libraries
import os
import json
import time
import threading
import asyncio
from typing import Dict, List, Optional, Any
import logging

# Audio processing libraries
import speech_recognition as sr  # Speech-to-text conversion
import pyttsx3                   # Text-to-speech conversion
import pyaudio                   # Audio I/O operations

# Machine learning and NLP libraries
import torch                     # PyTorch deep learning framework
from transformers import (
    AutoTokenizer,               # Automatic tokenizer selection
    AutoModelForCausalLM,        # Causal language model (GPT-style)
    pipeline                     # High-level model interface
)

# HTTP and API libraries
import requests                  # HTTP requests for API calls
from openai import OpenAI        # OpenAI-compatible client for DeepSeek

# Jupyter notebook libraries
import ipywidgets as widgets     # Interactive widgets
from IPython.display import display, clear_output, HTML, Audio

# Configure logging for debugging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

print("✅ All libraries imported successfully")
print("📦 Core components ready for initialization")

## 3. Configuration and Environment Variables

Setting up configuration parameters and environment variables for the voice agent.

In [None]:
# Configuration class for voice agent settings
class VoiceAgentConfig:
    """
    Configuration class containing all settings for the voice agent.
    
    This centralized configuration makes it easy to adjust parameters
    without modifying code throughout the notebook.
    """
    
    # API Configuration
    DEEPSEEK_API_KEY = os.getenv("DEEPSEEK_API_KEY", "your-deepseek-api-key-here")
    DEEPSEEK_BASE_URL = "https://api.deepseek.com/v1"  # DeepSeek API endpoint
    
    # Model Configuration
    MODEL_NAME = "deepseek-chat"  # DeepSeek model identifier
    MAX_TOKENS = 1000             # Maximum tokens in response
    TEMPERATURE = 0.7             # Sampling temperature (higher values give more varied responses)
    
    # Audio Configuration
    SAMPLE_RATE = 16000          # Audio sample rate (Hz)
    CHUNK_SIZE = 1024            # Audio chunk size for processing
    AUDIO_FORMAT = pyaudio.paInt16  # Audio format (16-bit)
    CHANNELS = 1                 # Mono audio
    
    # Speech Recognition Configuration
    RECOGNITION_TIMEOUT = 5      # Timeout for speech recognition (seconds)
    PHRASE_TIMEOUT = 1           # Timeout between phrases (seconds)
    ENERGY_THRESHOLD = 300       # Minimum audio energy for speech detection
    
    # Text-to-Speech Configuration
    TTS_RATE = 200              # Speech rate (words per minute)
    TTS_VOLUME = 0.9            # Speech volume (0.0-1.0)
    TTS_VOICE_INDEX = 0         # Voice selection index
    
    # MCP Configuration
    MCP_VERSION = "1.0"         # MCP protocol version
    MCP_TIMEOUT = 30            # MCP operation timeout (seconds)

# Initialize configuration
config = VoiceAgentConfig()

print("✅ Configuration initialized")
print(f"🔑 DeepSeek API configured: {'✓' if config.DEEPSEEK_API_KEY != 'your-deepseek-api-key-here' else '✗ (Please set DEEPSEEK_API_KEY)'}")
print(f"🎯 Model: {config.MODEL_NAME}")
print(f"🎤 Audio sample rate: {config.SAMPLE_RATE} Hz")
print(f"🔊 TTS rate: {config.TTS_RATE} WPM")
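
The configuration falls back to a placeholder string when `DEEPSEEK_API_KEY` is unset, so a quick sanity check can catch a missing key or an out-of-range value before any API call is made. This is a sketch under assumptions: `resolve_api_key` and `validate_settings` are hypothetical helpers, and the ranges checked are the ones stated in the configuration comments.

```python
import os

PLACEHOLDER = "your-deepseek-api-key-here"


def resolve_api_key(env=None):
    """Return the API key from the environment, or None if unset or a placeholder."""
    env = os.environ if env is None else env
    key = env.get("DEEPSEEK_API_KEY", "").strip()
    return key if key and key != PLACEHOLDER else None


def validate_settings(tts_volume, max_tokens, sample_rate):
    """Collect human-readable problems instead of raising on the first one."""
    problems = []
    if not 0.0 <= tts_volume <= 1.0:
        problems.append("TTS_VOLUME must be between 0.0 and 1.0")
    if max_tokens <= 0:
        problems.append("MAX_TOKENS must be positive")
    if sample_rate <= 0:
        problems.append("SAMPLE_RATE must be positive")
    return problems


print("API key set:", resolve_api_key() is not None)
print("Problems:", validate_settings(0.9, 1000, 16000) or "none")
```

Passing a plain dictionary as `env` makes the key lookup easy to exercise without touching the real environment.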

## 4. MCP (Model Context Protocol) Implementation

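The Model Context Protocol provides a structured way to interact with AI models, ensuring consistent communication and context management. The `MCPProtocol` class in this section is a lightweight, notebook-local wrapper that tracks session state and wraps each request and response in a consistent envelope; it does not implement the full protocol specification.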

In [None]:
class MCPProtocol:
    """
    Model Context Protocol implementation for structured AI interactions.
    
    MCP provides a standardized way to:
    - Manage conversation context
    - Handle model requests and responses
    - Maintain session state
    - Ensure consistent communication format
    """
    
    def __init__(self, config: VoiceAgentConfig):
        """
        Initialize MCP protocol handler.
        
        Args:
            config: Voice agent configuration object
        """
        self.config = config
        self.session_id = f"session_{int(time.time())}"
        self.context_history = []  # Conversation history
        self.metadata = {          # Session metadata
            "version": config.MCP_VERSION,
            "created_at": time.time(),
            "model": config.MODEL_NAME
        }
        
        logger.info(f"MCP Protocol initialized - Session: {self.session_id}")
    
    def create_request(self, user_input: str, context: Optional[Dict] = None) -> Dict[str, Any]:
        """
        Create a structured MCP request.
        
        Args:
            user_input: User's text input
            context: Additional context information
            
        Returns:
            Structured MCP request dictionary
        """
        request = {
            "mcp_version": self.config.MCP_VERSION,
            "session_id": self.session_id,
            "timestamp": time.time(),
            "request_id": f"req_{len(self.context_history)}",
            "user_input": user_input,
            "context": context or {},
            "history": self.context_history[-5:],  # Last 5 interactions
            "metadata": self.metadata
        }
        
        logger.info(f"MCP request created: {request['request_id']}")
        return request
    
    def process_response(self, response: str, request_id: str) -> Dict[str, Any]:
        """
        Process and structure an MCP response.
        
        Args:
            response: Model's response text
            request_id: ID of the original request
            
        Returns:
            Structured MCP response dictionary
        """
        response_data = {
            "mcp_version": self.config.MCP_VERSION,
            "session_id": self.session_id,
            "timestamp": time.time(),
            "request_id": request_id,
            "response": response,
            "status": "success",
            "metadata": {
                "model": self.config.MODEL_NAME,
                "tokens_used": len(response.split()),  # Approximate
                "processing_time": 0  # Will be calculated by caller
            }
        }
        
        # Add to context history
        self.context_history.append({
            "request_id": request_id,
            "timestamp": response_data["timestamp"],
            "response": response
        })
        
        logger.info(f"MCP response processed: {request_id}")
        return response_data
    
    def get_context_summary(self) -> str:
        """
        Get a summary of the current conversation context.
        
        Returns:
            String summary of conversation context
        """
        if not self.context_history:
            return "No previous conversation context."
        
        summary = f"Conversation with {len(self.context_history)} previous interactions:\n"
        for i, interaction in enumerate(self.context_history[-3:], 1):
            summary += f"{i}. {interaction['response'][:100]}...\n"
        
        return summary

# Initialize MCP protocol
mcp = MCPProtocol(config)

print("✅ MCP Protocol initialized")
print(f"🆔 Session ID: {mcp.session_id}")
print(f"📋 Context tracking: Active")
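
The protocol class keeps every interaction in a plain list and slices off the last five entries when it builds a request. One alternative is to bound the history at the point of storage. The sketch below uses `collections.deque` for that; `BoundedHistory` is a hypothetical class, and storing the user's input next to each response is an assumption that goes beyond what the class above records.

```python
from collections import deque


class BoundedHistory:
    """Keeps only the most recent `maxlen` conversation turns."""

    def __init__(self, maxlen=5):
        # A deque with maxlen silently discards the oldest entry when full.
        self._turns = deque(maxlen=maxlen)

    def add(self, user_input, response):
        self._turns.append({"user_input": user_input, "response": response})

    def as_messages(self):
        """Flatten the stored turns into chat-style role/content messages."""
        messages = []
        for turn in self._turns:
            messages.append({"role": "user", "content": turn["user_input"]})
            messages.append({"role": "assistant", "content": turn["response"]})
        return messages


history = BoundedHistory(maxlen=2)
history.add("Hi", "Hello!")
history.add("What time is it?", "I can't check the clock.")
history.add("Thanks", "You're welcome.")
print(len(history.as_messages()))  # 4: only the last two turns remain
```

Bounding the history this way keeps memory use and prompt size predictable over long sessions.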

## 5. DeepSeek Model Integration

This section implements the integration with DeepSeek's language model API, providing the core AI capabilities for our voice agent.

In [None]:
class DeepSeekClient:
    """
    DeepSeek API client for language model interactions.
    
    This client handles:
    - API authentication and requests
    - Response extraction and error handling
    - Replaying conversation context to the model

    Rate limiting and retry logic are not implemented in this class.
    """
    
    def __init__(self, config: VoiceAgentConfig):
        """
        Initialize DeepSeek client.
        
        Args:
            config: Voice agent configuration object
        """
        self.config = config
        
        # Initialize OpenAI-compatible client for DeepSeek
        self.client = OpenAI(
            api_key=config.DEEPSEEK_API_KEY,
            base_url=config.DEEPSEEK_BASE_URL
        )
        
        # System prompt for voice agent behavior
        self.system_prompt = """
        You are a helpful voice assistant powered by DeepSeek. You should:
        - Provide clear, concise responses suitable for speech
        - Be conversational and friendly
        - Ask clarifying questions when needed
        - Keep responses under 200 words for better voice experience
        - Use natural language without excessive technical jargon
        """.strip()
        
        logger.info("DeepSeek client initialized")
    
    async def generate_response(self, user_input: str, context: Optional[List[Dict]] = None) -> str:
        """
        Generate a response using the DeepSeek model.
        
        Args:
            user_input: User's text input
            context: Previous conversation context
            
        Returns:
            Generated response text. If the API call fails, the exception
            is logged and an apology message containing the error text is
            returned instead of raising.
        """
        try:
            # Prepare messages for the API
            messages = [{"role": "system", "content": self.system_prompt}]
            
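            # Add conversation context if available
            if context:
                for item in context:
                    # Replay the user's turn when the history entry recorded it,
                    # so the model sees both sides of the conversation
                    if item.get("user_input"):
                        messages.append({"role": "user", "content": item["user_input"]})
                    messages.append({"role": "assistant", "content": item["response"]})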
            
            # Add current user input
            messages.append({"role": "user", "content": user_input})
            
            logger.info(f"Sending request to DeepSeek: {len(user_input)} characters")
            
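            # Make the API call in a worker thread: the OpenAI client used here
            # is synchronous and would otherwise block the event loop
            loop = asyncio.get_running_loop()
            response = await loop.run_in_executor(
                None,
                lambda: self.client.chat.completions.create(
                    model=self.config.MODEL_NAME,
                    messages=messages,
                    max_tokens=self.config.MAX_TOKENS,
                    temperature=self.config.TEMPERATURE,
                    stream=False
                )
            )
            
            # Extract response text (content may be None for an empty reply)
            response_text = (response.choices[0].message.content or "").strip()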
            
            logger.info(f"DeepSeek response received: {len(response_text)} characters")
            return response_text
            
        except Exception as e:
            error_msg = f"DeepSeek API error: {str(e)}"
            logger.error(error_msg)
            return f"I'm sorry, I'm having trouble processing your request right now. Error: {str(e)}"
    
    def test_connection(self) -> bool:
        """
        Test the connection to DeepSeek API.
        
        Returns:
            True if connection successful, False otherwise
        """
        try:
            # Simple test request
            response = self.client.chat.completions.create(
                model=self.config.MODEL_NAME,
                messages=[{"role": "user", "content": "Hello"}],
                max_tokens=10
            )
            
            logger.info("DeepSeek API connection test successful")
            return True
            
        except Exception as e:
            logger.error(f"DeepSeek API connection test failed: {str(e)}")
            return False

# Initialize DeepSeek client
deepseek_client = DeepSeekClient(config)

print("✅ DeepSeek client initialized")
print(f"🌐 API endpoint: {config.DEEPSEEK_BASE_URL}")
print(f"🤖 Model: {config.MODEL_NAME}")

# Test API connection (optional)
if config.DEEPSEEK_API_KEY != "your-deepseek-api-key-here":
    print("🔍 Testing API connection...")
    connection_status = deepseek_client.test_connection()
    print(f"🔗 API Connection: {'✅ Success' if connection_status else '❌ Failed'}")
else:
    print("⚠️  API key not configured - set DEEPSEEK_API_KEY environment variable")
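
Hosted model APIs can fail transiently, for example on timeouts or rate limits, and a retry wrapper with exponential backoff is one common way to absorb such failures. The following is a minimal sketch, not part of the client above: `call_with_retries` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**n seconds and try again.

    Re-raises the last exception once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Demonstration with a function that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(call_with_retries(flaky, sleep=lambda seconds: None))  # ok
```

In practice the wrapper would surround the API request itself, and it would be worth retrying only on error types known to be transient.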

## 6. Audio Processing Components

This section implements the speech recognition and text-to-speech components that enable voice interaction.

In [None]:
class AudioProcessor:
    """
    Audio processing class handling speech recognition and text-to-speech.
    
    This class manages:
    - Microphone input and audio recording
    - Speech-to-text conversion using Google Speech Recognition
    - Text-to-speech output using system TTS engines
    - Audio device configuration and error handling
    """
    
    def __init__(self, config: VoiceAgentConfig):
        """
        Initialize audio processor.
        
        Args:
            config: Voice agent configuration object
        """
        self.config = config
        
        # Initialize speech recognition
        self.recognizer = sr.Recognizer()
        self.microphone = sr.Microphone()
        
        # Configure recognizer settings
        self.recognizer.energy_threshold = config.ENERGY_THRESHOLD
        self.recognizer.dynamic_energy_threshold = True
        self.recognizer.pause_threshold = config.PHRASE_TIMEOUT
        
        # Initialize text-to-speech
        self.tts_engine = pyttsx3.init()
        
        # Configure TTS settings
        self.tts_engine.setProperty('rate', config.TTS_RATE)
        self.tts_engine.setProperty('volume', config.TTS_VOLUME)
        
        # Set voice if available
        voices = self.tts_engine.getProperty('voices')
        if voices and len(voices) > config.TTS_VOICE_INDEX:
            self.tts_engine.setProperty('voice', voices[config.TTS_VOICE_INDEX].id)
        
        # Calibrate microphone for ambient noise
        self._calibrate_microphone()
        
        logger.info("Audio processor initialized")
    
    def _calibrate_microphone(self):
        """
        Calibrate microphone for ambient noise levels.
        
        This helps improve speech recognition accuracy by adjusting
        the energy threshold based on background noise.
        """
        try:
            with self.microphone as source:
                logger.info("Calibrating microphone for ambient noise...")
                self.recognizer.adjust_for_ambient_noise(source, duration=1)
                logger.info(f"Microphone calibrated - Energy threshold: {self.recognizer.energy_threshold}")
        except Exception as e:
            logger.warning(f"Microphone calibration failed: {str(e)}")
    
    def listen_for_speech(self, timeout: Optional[float] = None) -> Optional[str]:
        """
        Listen for speech input and convert to text.
        
        Args:
            timeout: Maximum time to wait for speech (seconds)
            
        Returns:
            Recognized speech text or None if recognition failed
        """
        try:
            logger.info("Listening for speech...")
            
            with self.microphone as source:
                # Listen for audio with timeout
                audio = self.recognizer.listen(
                    source, 
                    timeout=timeout or self.config.RECOGNITION_TIMEOUT,
                    phrase_time_limit=5
                )
            
            logger.info("Processing speech...")
            
            # Recognize speech using Google Speech Recognition
            text = self.recognizer.recognize_google(audio)
            
            logger.info(f"Speech recognized: '{text}'")
            return text
            
        except sr.WaitTimeoutError:
            logger.warning("Speech recognition timeout")
            return None
        except sr.UnknownValueError:
            logger.warning("Could not understand speech")
            return None
        except sr.RequestError as e:
            logger.error(f"Speech recognition service error: {str(e)}")
            return None
        except Exception as e:
            logger.error(f"Unexpected error in speech recognition: {str(e)}")
            return None
    
    def speak_text(self, text: str) -> bool:
        """
        Convert text to speech and play it.
        
        Args:
            text: Text to convert to speech
            
        Returns:
            True if speech synthesis successful, False otherwise
        """
        try:
            logger.info(f"Speaking text: '{text[:50]}...'")
            
            # Use text-to-speech engine
            self.tts_engine.say(text)
            self.tts_engine.runAndWait()
            
            logger.info("Text-to-speech completed")
            return True
            
        except Exception as e:
            logger.error(f"Text-to-speech error: {str(e)}")
            return False
    
    def test_audio_devices(self) -> Dict[str, Any]:
        """
        Test audio input and output devices.
        
        Returns:
            Dictionary with test results
        """
        results = {
            "microphone": False,
            "speakers": False,
            "details": {}
        }
        
        # Test microphone
        try:
            with self.microphone as source:
                self.recognizer.adjust_for_ambient_noise(source, duration=0.5)
            results["microphone"] = True
            results["details"]["microphone"] = "Working"
        except Exception as e:
            results["details"]["microphone"] = f"Error: {str(e)}"
        
        # Test speakers/TTS
        try:
            self.tts_engine.say("Audio test")
            self.tts_engine.runAndWait()
            results["speakers"] = True
            results["details"]["speakers"] = "Working"
        except Exception as e:
            results["details"]["speakers"] = f"Error: {str(e)}"
        
        return results

# Initialize audio processor
audio_processor = AudioProcessor(config)

print("✅ Audio processor initialized")
print(f"🎤 Energy threshold: {audio_processor.recognizer.energy_threshold}")
print(f"🔊 TTS rate: {config.TTS_RATE} WPM")

# Test audio devices
print("🔍 Testing audio devices...")
audio_test = audio_processor.test_audio_devices()
print(f"🎤 Microphone: {'✅' if audio_test['microphone'] else '❌'} {audio_test['details']['microphone']}")
print(f"🔊 Speakers: {'✅' if audio_test['speakers'] else '❌'} {audio_test['details']['speakers']}")
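
The energy threshold decides how loud a chunk of audio must be before it counts as speech. To make that concrete, the sketch below computes the root-mean-square amplitude of 16-bit mono PCM samples and compares it with a threshold of 300, the value used in the configuration. This is an illustration of the idea only; the exact computation inside the SpeechRecognition library may differ, and `rms_energy` and `is_speech` are hypothetical helpers.

```python
import array
import math


def rms_energy(pcm_bytes):
    """Root-mean-square amplitude of 16-bit signed mono PCM audio."""
    samples = array.array("h")  # "h" = signed 16-bit integers, native byte order
    samples.frombytes(pcm_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(pcm_bytes, energy_threshold=300):
    """Treat a chunk as speech when its energy exceeds the threshold."""
    return rms_energy(pcm_bytes) > energy_threshold


# Two synthetic 1024-sample chunks: background hum versus a loud signal.
quiet = array.array("h", [10, -10] * 512).tobytes()
loud = array.array("h", [2000, -2000] * 512).tobytes()
print(is_speech(quiet), is_speech(loud))  # False True
```

Ambient-noise calibration effectively moves this threshold up or down to sit just above the measured background level.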

## 7. Voice Agent Core Implementation

This section implements the main voice agent class that orchestrates all components to provide a complete voice interaction experience.

In [None]:
class VoiceAgent:
    """
    Main voice agent class that orchestrates all components.
    
    This class integrates:
    - Audio processing (speech recognition and TTS)
    - MCP protocol for structured interactions
    - DeepSeek model for AI responses
    - Session management and conversation flow
    """
    
    def __init__(self, config: VoiceAgentConfig, mcp: MCPProtocol, 
                 deepseek_client: DeepSeekClient, audio_processor: AudioProcessor):
        """
        Initialize the voice agent.
        
        Args:
            config: Configuration object
            mcp: MCP protocol handler
            deepseek_client: DeepSeek API client
            audio_processor: Audio processing handler
        """
        self.config = config
        self.mcp = mcp
        self.deepseek_client = deepseek_client
        self.audio_processor = audio_processor
        
        # Agent state
        self.is_listening = False
        self.conversation_active = False
        self.stats = {
            "interactions": 0,
            "successful_recognitions": 0,
            "failed_recognitions": 0,
            "responses_generated": 0,
            "start_time": time.time()
        }
        
        logger.info("Voice agent initialized")
    
    async def process_voice_interaction(self, timeout: Optional[float] = None) -> Dict[str, Any]:
        """
        Process a complete voice interaction cycle.
        
        This method handles the full workflow:
        1. Listen for speech input
        2. Convert speech to text
        3. Generate AI response using DeepSeek
        4. Convert response to speech
        5. Update conversation context
        
        Args:
            timeout: Maximum time to wait for speech input
            
        Returns:
            Dictionary containing interaction results
        """
        interaction_start = time.time()
        result = {
            "success": False,
            "user_input": None,
            "ai_response": None,
            "error": None,
            "processing_time": 0
        }
        
        try:
            logger.info("Starting voice interaction cycle")
            self.stats["interactions"] += 1
            
            # Step 1: Listen for speech input
            logger.info("🎤 Listening for speech...")
            user_input = self.audio_processor.listen_for_speech(timeout)
            
            if not user_input:
                result["error"] = "No speech detected or recognition failed"
                self.stats["failed_recognitions"] += 1
                return result
            
            result["user_input"] = user_input
            self.stats["successful_recognitions"] += 1
            logger.info(f"✅ Speech recognized: '{user_input}'")
            
            # Step 2: Create MCP request
            mcp_request = self.mcp.create_request(user_input)
            
            # Step 3: Generate AI response using DeepSeek
            logger.info("🤖 Generating AI response...")
            ai_response = await self.deepseek_client.generate_response(
                user_input, 
                self.mcp.context_history
            )
            
            result["ai_response"] = ai_response
            self.stats["responses_generated"] += 1
            logger.info(f"✅ AI response generated: '{ai_response[:100]}...'")
            
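            # Step 4: Process MCP response
            mcp_response = self.mcp.process_response(ai_response, mcp_request["request_id"])
            # Record the user's turn next to the response so that later
            # requests can replay both sides of the conversation
            self.mcp.context_history[-1]["user_input"] = user_input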
            
            # Step 5: Convert response to speech
            logger.info("🔊 Converting response to speech...")
            speech_success = self.audio_processor.speak_text(ai_response)
            
            if not speech_success:
                logger.warning("Text-to-speech failed, but interaction was successful")
            
            result["success"] = True
            result["processing_time"] = time.time() - interaction_start
            
            logger.info(f"✅ Voice interaction completed in {result['processing_time']:.2f}s")
            
        except Exception as e:
            result["error"] = str(e)
            result["processing_time"] = time.time() - interaction_start
            logger.error(f"Voice interaction failed: {str(e)}")
        
        return result
    
    async def start_conversation_mode(self, max_interactions: int = 10):
        """
        Start continuous conversation mode.
        
        Args:
            max_interactions: Maximum number of interactions before stopping
        """
        logger.info(f"Starting conversation mode (max {max_interactions} interactions)")
        self.conversation_active = True
        
        # Welcome message
        welcome_msg = "Hello! I'm your voice assistant. How can I help you today?"
        print(f"🤖 Assistant: {welcome_msg}")
        self.audio_processor.speak_text(welcome_msg)
        
        interaction_count = 0
        
        while self.conversation_active and interaction_count < max_interactions:
            try:
                print(f"\n--- Interaction {interaction_count + 1} ---")
                print("🎤 Listening... (Speak now or wait for timeout)")
                
                # Process voice interaction
                result = await self.process_voice_interaction(timeout=10)
                
                if result["success"]:
                    print(f"👤 You: {result['user_input']}")
                    print(f"🤖 Assistant: {result['ai_response']}")
                    print(f"⏱️  Processing time: {result['processing_time']:.2f}s")
                    
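                    # Check for exit commands (match whole words so that,
                    # for example, "quite" does not trigger "quit")
                    spoken_words = {w.strip(".,!?") for w in result["user_input"].lower().split()}
                    if spoken_words & {"goodbye", "bye", "exit", "quit", "stop"}: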
                        farewell = "Goodbye! It was nice talking with you."
                        print(f"🤖 Assistant: {farewell}")
                        self.audio_processor.speak_text(farewell)
                        break
                        
                else:
                    print(f"❌ Interaction failed: {result['error']}")
                    
                    # Give user feedback
                    if "No speech detected" in str(result['error']):
                        feedback = "I didn't hear anything. Try speaking a bit louder."
                    else:
                        feedback = "Sorry, I had trouble understanding. Could you try again?"
                    
                    print(f"🤖 Assistant: {feedback}")
                    self.audio_processor.speak_text(feedback)
                
                interaction_count += 1
                
                # Brief pause between interactions
                await asyncio.sleep(1)
                
            except KeyboardInterrupt:
                print("\n🛑 Conversation interrupted by user")
                break
            except Exception as e:
                logger.error(f"Unexpected error in conversation mode: {str(e)}")
                print(f"❌ Unexpected error: {str(e)}")
                break
        
        self.conversation_active = False
        logger.info("Conversation mode ended")
    
    def get_session_stats(self) -> Dict[str, Any]:
        """
        Get session statistics.
        
        Returns:
            Dictionary containing session statistics
        """
        runtime = time.time() - self.stats["start_time"]
        
        return {
            "session_id": self.mcp.session_id,
            "runtime_seconds": runtime,
            "total_interactions": self.stats["interactions"],
            "successful_recognitions": self.stats["successful_recognitions"],
            "failed_recognitions": self.stats["failed_recognitions"],
            "recognition_success_rate": (
                self.stats["successful_recognitions"] / max(1, self.stats["interactions"]) * 100
            ),
            "responses_generated": self.stats["responses_generated"],
            "context_history_size": len(self.mcp.context_history)
        }
    
    def stop_conversation(self):
        """Stop the current conversation."""
        self.conversation_active = False
        logger.info("Conversation stopped by user")

# Initialize voice agent
voice_agent = VoiceAgent(config, mcp, deepseek_client, audio_processor)

print("✅ Voice Agent initialized")
print(f"🆔 Session: {voice_agent.mcp.session_id}")
print("🎯 Ready for voice interactions")
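
The interaction cycle can be exercised without a microphone, speakers, or an API key by swapping each stage for a stub. The sketch below mirrors the listen, generate, and speak steps in simplified form. `run_cycle` and `fake_generate` are hypothetical names, and the MCP bookkeeping and statistics are left out for brevity.

```python
import asyncio
import time


async def run_cycle(listen, generate, speak):
    """Listen, generate a reply, speak it, and return a result dictionary."""
    started = time.time()
    result = {"success": False, "user_input": None, "ai_response": None, "error": None}

    user_input = listen()
    if not user_input:
        result["error"] = "No speech detected or recognition failed"
        return result

    result["user_input"] = user_input
    result["ai_response"] = await generate(user_input)
    speak(result["ai_response"])
    result["success"] = True
    result["processing_time"] = time.time() - started
    return result


spoken = []  # Stands in for the speakers: collects what would be said aloud


async def fake_generate(text):
    return f"You said: {text}"


# asyncio.run works in a plain script; inside a notebook cell use
# `result = await run_cycle(...)` because an event loop is already running.
result = asyncio.run(run_cycle(lambda: "hello", fake_generate, spoken.append))
print(result["ai_response"])  # You said: hello
```

The same substitution idea applies to the real agent: passing stand-in objects for the audio processor and the DeepSeek client lets the orchestration logic be checked in isolation.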

## 8. Interactive User Interface

This section creates an interactive Jupyter widget interface for controlling the voice agent.

In [None]:
class VoiceAgentUI:
    """
    Interactive user interface for the voice agent using Jupyter widgets.
    
    This class provides:
    - Control buttons for starting/stopping conversations
    - Real-time status updates
    - Session statistics display
    - Configuration controls
    """
    
    def __init__(self, voice_agent: VoiceAgent):
        """
        Initialize the UI components.
        
        Args:
            voice_agent: Voice agent instance to control
        """
        self.voice_agent = voice_agent
        self.setup_widgets()
        self.setup_event_handlers()
    
    def setup_widgets(self):
        """
        Create all UI widgets.
        """
        # Control buttons
        self.start_button = widgets.Button(
            description="🎤 Start Conversation",
            button_style="success",
            layout=widgets.Layout(width="200px", height="40px")
        )
        
        self.stop_button = widgets.Button(
            description="🛑 Stop Conversation",
            button_style="danger",
            layout=widgets.Layout(width="200px", height="40px"),
            disabled=True
        )
        
        self.single_interaction_button = widgets.Button(
            description="🎯 Single Interaction",
            button_style="info",
            layout=widgets.Layout(width="200px", height="40px")
        )
        
        # Status display
        self.status_output = widgets.Output()
        
        # Configuration controls
        self.max_interactions_slider = widgets.IntSlider(
            value=10,
            min=1,
            max=50,
            description="Max Interactions:",
            style={'description_width': 'initial'}
        )
        
        self.timeout_slider = widgets.FloatSlider(
            value=10.0,
            min=5.0,
            max=30.0,
            step=1.0,
            description="Listen Timeout (s):",
            style={'description_width': 'initial'}
        )
        
        # Statistics display
        self.stats_output = widgets.Output()
        
        # Layout containers
        self.control_box = widgets.HBox([
            self.start_button,
            self.stop_button,
            self.single_interaction_button
        ])
        
        self.config_box = widgets.VBox([
            self.max_interactions_slider,
            self.timeout_slider
        ])
        
        self.main_container = widgets.VBox([
            widgets.HTML("<h3>🎤 Voice Agent Control Panel</h3>"),
            self.control_box,
            widgets.HTML("<h4>⚙️ Configuration</h4>"),
            self.config_box,
            widgets.HTML("<h4>📊 Status</h4>"),
            self.status_output,
            widgets.HTML("<h4>📈 Statistics</h4>"),
            self.stats_output
        ])
    
    def setup_event_handlers(self):
        """
        Set up event handlers for widget interactions.
        """
        self.start_button.on_click(self.on_start_conversation)
        self.stop_button.on_click(self.on_stop_conversation)
        self.single_interaction_button.on_click(self.on_single_interaction)
    
    def on_start_conversation(self, button):
        """
        Handle start conversation button click.
        """
        self.start_button.disabled = True
        self.stop_button.disabled = False
        self.single_interaction_button.disabled = True
        
        with self.status_output:
            clear_output(wait=True)
            print("🚀 Starting conversation mode...")
            print("💡 Say 'goodbye', 'bye', 'exit', 'quit', or 'stop' to end the conversation")
        
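        # Start the conversation in the background; keep a reference so the
        # task is not garbage-collected before it finishes
        self._conversation_task = asyncio.create_task(self.run_conversation())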
    
    def on_stop_conversation(self, button):
        """
        Handle stop conversation button click.
        """
        self.voice_agent.stop_conversation()
        self.reset_buttons()
        
        with self.status_output:
            print("🛑 Conversation stopped by user")
    
    def on_single_interaction(self, button):
        """
        Handle single interaction button click.
        """
        self.single_interaction_button.disabled = True
        
        with self.status_output:
            clear_output(wait=True)
            print("🎯 Starting single interaction...")
            print("🎤 Speak now!")
        
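        # Start the single interaction in the background; keep a reference so
        # the task is not garbage-collected before it finishes
        self._interaction_task = asyncio.create_task(self.run_single_interaction())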
    
    async def run_conversation(self):
        """
        Run the conversation mode asynchronously.
        """
        try:
            await self.voice_agent.start_conversation_mode(
                max_interactions=self.max_interactions_slider.value
            )
        except Exception as e:
            with self.status_output:
                print(f"❌ Error in conversation mode: {str(e)}")
        finally:
            self.reset_buttons()
            self.update_stats()
    
    async def run_single_interaction(self):
        """
        Run a single interaction asynchronously.
        """
        try:
            result = await self.voice_agent.process_voice_interaction(
                timeout=self.timeout_slider.value
            )
            
            with self.status_output:
                if result["success"]:
                    print("✅ Interaction successful!")
                    print(f"👤 You said: {result['user_input']}")
                    print(f"🤖 Assistant responded: {result['ai_response']}")
                    print(f"⏱️  Processing time: {result['processing_time']:.2f}s")
                else:
                    print(f"❌ Interaction failed: {result['error']}")
                    
        except Exception as e:
            with self.status_output:
                print(f"❌ Error in single interaction: {str(e)}")
        finally:
            self.single_interaction_button.disabled = False
            self.update_stats()
    
    def reset_buttons(self):
        """
        Reset button states to default.
        """
        self.start_button.disabled = False
        self.stop_button.disabled = True
        self.single_interaction_button.disabled = False
    
    def update_stats(self):
        """
        Update the statistics display.
        """
        stats = self.voice_agent.get_session_stats()
        
        with self.stats_output:
            clear_output(wait=True)
            print("📊 Session Statistics:")
            print(f"  🆔 Session ID: {stats['session_id']}")
            print(f"  ⏰ Runtime: {stats['runtime_seconds']:.1f} seconds")
            print(f"  💬 Total Interactions: {stats['total_interactions']}")
            print(f"  ✅ Successful Recognitions: {stats['successful_recognitions']}")
            print(f"  ❌ Failed Recognitions: {stats['failed_recognitions']}")
            print(f"  📈 Recognition Success Rate: {stats['recognition_success_rate']:.1f}%")
            print(f"  🤖 Responses Generated: {stats['responses_generated']}")
            print(f"  📝 Context History Size: {stats['context_history_size']}")
    
    def display(self):
        """
        Display the UI.
        """
        # Initial status update
        with self.status_output:
            print("✅ Voice Agent ready!")
            print("💡 Click 'Start Conversation' for continuous mode or 'Single Interaction' for one-time use")
        
        # Initial stats update
        self.update_stats()
        
        # Display the main container
        display(self.main_container)

# Initialize and display UI
ui = VoiceAgentUI(voice_agent)
ui.display()

## 9. Testing and Troubleshooting

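This section provides testing utilities that check each component (configuration, audio, DeepSeek API, MCP) and the integration between them. Troubleshooting guidance for common issues follows in Section 10.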
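
The tests in the cell below are synchronous methods that need results from async calls such as `generate_response`. Calling `asyncio.run()` directly fails inside a notebook, because an event loop is already running and `asyncio.run()` raises `RuntimeError` in that situation. One way around this, sketched here under the assumption that the coroutine does not depend on objects bound to the notebook's loop, is to run it on a fresh loop in a worker thread. `run_coro_blocking` and `fake_generate_response` are hypothetical names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def run_coro_blocking(coro):
    """Run a coroutine to completion from synchronous code.

    Works whether or not the calling thread already has a running loop.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: asyncio.run() is safe
        return asyncio.run(coro)
    # A loop is running (e.g. in Jupyter): use a fresh loop in a worker thread
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def fake_generate_response(prompt: str) -> str:
    """Hypothetical stand-in for an async model call."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


# Called from plain synchronous code
plain_result = run_coro_blocking(fake_generate_response("hello"))


async def call_from_running_loop() -> str:
    # Called while a loop is running, as happens inside a notebook cell
    return run_coro_blocking(fake_generate_response("hello again"))


# Script-only simulation of the notebook situation; in a notebook, calling
# run_coro_blocking(...) directly already takes the worker-thread path
nested_result = asyncio.run(call_from_running_loop())
print(plain_result, "|", nested_result)
```

The helper blocks the calling thread until the coroutine finishes, which is acceptable for a test run but would freeze the widgets if used for long conversations.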

In [None]:
class VoiceAgentTester:
    """
    Comprehensive testing utilities for the voice agent.
    
    This class provides methods to test:
    - Individual components (audio, API, MCP)
    - Integration between components
    - Performance benchmarks
    - Error handling
    """
    
    def __init__(self, voice_agent: VoiceAgent):
        """
        Initialize the tester.
        
        Args:
            voice_agent: Voice agent instance to test
        """
        self.voice_agent = voice_agent
        self.test_results = {}
    
    def run_comprehensive_test(self) -> Dict[str, Any]:
        """
        Run all available tests and return comprehensive results.
        
        Returns:
            Dictionary containing all test results
        """
        print("🧪 Starting comprehensive voice agent tests...\n")
        
        # Test 1: Configuration
        print("📋 Testing Configuration...")
        self.test_results["configuration"] = self.test_configuration()
        
        # Test 2: Audio Components
        print("\n🎤 Testing Audio Components...")
        self.test_results["audio"] = self.test_audio_components()
        
        # Test 3: DeepSeek API
        print("\n🤖 Testing DeepSeek API...")
        self.test_results["deepseek_api"] = self.test_deepseek_api()
        
        # Test 4: MCP Protocol
        print("\n📡 Testing MCP Protocol...")
        self.test_results["mcp_protocol"] = self.test_mcp_protocol()
        
        # Test 5: Integration
        print("\n🔗 Testing Component Integration...")
        self.test_results["integration"] = self.test_integration()
        
        # Generate summary
        self.test_results["summary"] = self.generate_test_summary()
        
        print("\n✅ Comprehensive testing completed!")
        return self.test_results
    
    def test_configuration(self) -> Dict[str, Any]:
        """
        Test configuration validity.
        
        Returns:
            Configuration test results
        """
        results = {
            "api_key_set": False,
            "audio_config_valid": False,
            "model_config_valid": False,
            "issues": []
        }
        
        config = self.voice_agent.config
        
        # Check API key (reject both an empty value and the placeholder)
        if config.DEEPSEEK_API_KEY and config.DEEPSEEK_API_KEY != "your-deepseek-api-key-here":
            results["api_key_set"] = True
            print("  ✅ DeepSeek API key is configured")
        else:
            results["issues"].append("DeepSeek API key not set")
            print("  ❌ DeepSeek API key not configured")
        
        # Check audio configuration
        if config.SAMPLE_RATE > 0 and config.ENERGY_THRESHOLD > 0:
            results["audio_config_valid"] = True
            print("  ✅ Audio configuration valid")
        else:
            results["issues"].append("Invalid audio configuration")
            print("  ❌ Invalid audio configuration")
        
        # Check model configuration
        if config.MODEL_NAME and config.MAX_TOKENS > 0:
            results["model_config_valid"] = True
            print("  ✅ Model configuration valid")
        else:
            results["issues"].append("Invalid model configuration")
            print("  ❌ Invalid model configuration")
        
        return results
    
    def test_audio_components(self) -> Dict[str, Any]:
        """
        Test audio input/output components.
        
        Returns:
            Audio component test results
        """
        results = {
            "microphone_available": False,
            "tts_available": False,
            "calibration_successful": False,
            "issues": []
        }
        
        audio_test = self.voice_agent.audio_processor.test_audio_devices()
        
        # Check microphone
        if audio_test["microphone"]:
            results["microphone_available"] = True
            print("  ✅ Microphone available and working")
        else:
            results["issues"].append(f"Microphone issue: {audio_test['details']['microphone']}")
            print(f"  ❌ Microphone problem: {audio_test['details']['microphone']}")
        
        # Check text-to-speech
        if audio_test["speakers"]:
            results["tts_available"] = True
            print("  ✅ Text-to-speech available and working")
        else:
            results["issues"].append(f"TTS issue: {audio_test['details']['speakers']}")
            print(f"  ❌ Text-to-speech problem: {audio_test['details']['speakers']}")
        
        # Check calibration
        try:
            energy_threshold = self.voice_agent.audio_processor.recognizer.energy_threshold
            if energy_threshold > 0:
                results["calibration_successful"] = True
                print(f"  ✅ Microphone calibrated (threshold: {energy_threshold})")
            else:
                results["issues"].append("Microphone calibration failed")
                print("  ❌ Microphone calibration failed")
        except Exception as e:
            results["issues"].append(f"Calibration error: {str(e)}")
            print(f"  ❌ Calibration error: {str(e)}")
        
        return results
    
    def test_deepseek_api(self) -> Dict[str, Any]:
        """
        Test DeepSeek API connectivity and functionality.
        
        Returns:
            DeepSeek API test results
        """
        results = {
            "connection_successful": False,
            "response_generation": False,
            "response_quality": False,
            "response_time": 0,
            "issues": []
        }
        
        # Test basic connection
        connection_test = self.voice_agent.deepseek_client.test_connection()
        if connection_test:
            results["connection_successful"] = True
            print("  ✅ DeepSeek API connection successful")
        else:
            results["issues"].append("Cannot connect to DeepSeek API")
            print("  ❌ DeepSeek API connection failed")
            return results
        
        # Test response generation
        try:
            start_time = time.time()
            
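            # asyncio.run() raises RuntimeError inside an already-running
            # event loop (as in Jupyter), so run it on a worker thread
            from concurrent.futures import ThreadPoolExecutor
            with ThreadPoolExecutor(max_workers=1) as pool:
                test_response = pool.submit(
                    asyncio.run,
                    self.voice_agent.deepseek_client.generate_response(
                        "Hello, this is a test message. Please respond briefly."
                    )
                ).result()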
            
            results["response_time"] = time.time() - start_time
            
            if test_response and len(test_response) > 0:
                results["response_generation"] = True
                print(f"  ✅ Response generation successful ({results['response_time']:.2f}s)")
                
                # Basic quality check
                if len(test_response) > 10 and not test_response.startswith("I'm sorry"):
                    results["response_quality"] = True
                    print(f"  ✅ Response quality good: '{test_response[:50]}...'")
                else:
                    results["issues"].append("Response quality concerns")
                    print(f"  ⚠️  Response quality concerns: '{test_response[:50]}...'")
            else:
                results["issues"].append("Empty or invalid response")
                print("  ❌ Empty or invalid response")
                
        except Exception as e:
            results["issues"].append(f"Response generation error: {str(e)}")
            print(f"  ❌ Response generation error: {str(e)}")
        
        return results
    
    def test_mcp_protocol(self) -> Dict[str, Any]:
        """
        Test MCP protocol functionality.
        
        Returns:
            MCP protocol test results
        """
        results = {
            "request_creation": False,
            "response_processing": False,
            "context_management": False,
            "issues": []
        }
        
        try:
            # Test request creation
            test_request = self.voice_agent.mcp.create_request("Test message")
            if all(key in test_request for key in ["mcp_version", "session_id", "user_input"]):
                results["request_creation"] = True
                print("  ✅ MCP request creation successful")
            else:
                results["issues"].append("Invalid MCP request structure")
                print("  ❌ Invalid MCP request structure")
            
            # Test response processing
            test_response = self.voice_agent.mcp.process_response(
                "Test response", 
                test_request["request_id"]
            )
            if all(key in test_response for key in ["mcp_version", "session_id", "response"]):
                results["response_processing"] = True
                print("  ✅ MCP response processing successful")
            else:
                results["issues"].append("Invalid MCP response structure")
                print("  ❌ Invalid MCP response structure")
            
            # Test context management
            context_summary = self.voice_agent.mcp.get_context_summary()
            if isinstance(context_summary, str):
                results["context_management"] = True
                print("  ✅ MCP context management working")
            else:
                results["issues"].append("Context management error")
                print("  ❌ Context management error")
                
        except Exception as e:
            results["issues"].append(f"MCP protocol error: {str(e)}")
            print(f"  ❌ MCP protocol error: {str(e)}")
        
        return results
    
    def test_integration(self) -> Dict[str, Any]:
        """
        Test integration between all components.
        
        Returns:
            Integration test results
        """
        results = {
            "text_processing_flow": False,
            "session_management": False,
            "error_handling": False,
            "issues": []
        }
        
        try:
            # Test text processing flow (without actual speech)
            print("  🔄 Testing text processing flow...")
            
            # Create a mock MCP request
            mock_request = self.voice_agent.mcp.create_request("Hello, how are you?")
            
            # Generate response using DeepSeek (if API is available)
            if self.test_results.get("deepseek_api", {}).get("connection_successful"):
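                # Run on a worker thread: asyncio.run() raises RuntimeError
                # inside the notebook's already-running event loop
                from concurrent.futures import ThreadPoolExecutor
                with ThreadPoolExecutor(max_workers=1) as pool:
                    mock_response = pool.submit(
                        asyncio.run,
                        self.voice_agent.deepseek_client.generate_response(
                            "Hello, how are you?",
                            self.voice_agent.mcp.context_history
                        )
                    ).result()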
                
                # Process response through MCP
                mcp_response = self.voice_agent.mcp.process_response(
                    mock_response,
                    mock_request["request_id"]
                )
                
                if mcp_response and mock_response:
                    results["text_processing_flow"] = True
                    print("    ✅ Text processing flow successful")
                else:
                    results["issues"].append("Text processing flow failed")
                    print("    ❌ Text processing flow failed")
            else:
                results["issues"].append("Skipping text flow test - API unavailable")
                print("    ⚠️  Skipping text flow test - API unavailable")
            
            # Test session management
            session_stats = self.voice_agent.get_session_stats()
            if session_stats and "session_id" in session_stats:
                results["session_management"] = True
                print("    ✅ Session management working")
            else:
                results["issues"].append("Session management error")
                print("    ❌ Session management error")
            
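            # Test error handling: empty input should yield a fallback string,
            # not an unhandled exception
            try:
                # Worker thread avoids asyncio.run() failing inside the
                # notebook's already-running event loop
                from concurrent.futures import ThreadPoolExecutor
                with ThreadPoolExecutor(max_workers=1) as pool:
                    error_response = pool.submit(
                        asyncio.run,
                        self.voice_agent.deepseek_client.generate_response("")
                    ).result()
                if isinstance(error_response, str):
                    results["error_handling"] = True
                    print("    ✅ Error handling working")
                else:
                    results["issues"].append("Empty input returned a non-string response")
                    print("    ❌ Empty input returned a non-string response")
            except Exception as e:
                # Previously this branch also reported success, so the check could never fail
                results["issues"].append(f"Unhandled exception on empty input: {str(e)}")
                print(f"    ❌ Unhandled exception on empty input: {str(e)}")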
                
        except Exception as e:
            results["issues"].append(f"Integration test error: {str(e)}")
            print(f"    ❌ Integration test error: {str(e)}")
        
        return results
    
    def generate_test_summary(self) -> Dict[str, Any]:
        """
        Generate a summary of all test results.
        
        Returns:
            Test summary
        """
        summary = {
            "total_tests": 0,
            "passed_tests": 0,
            "failed_tests": 0,
            "warnings": 0,
            "overall_status": "unknown",
            "critical_issues": [],
            "recommendations": []
        }
        
        # Count tests and issues
        for category, results in self.test_results.items():
            if category == "summary":
                continue
                
            if isinstance(results, dict):
                for key, value in results.items():
                    if isinstance(value, bool):
                        summary["total_tests"] += 1
                        if value:
                            summary["passed_tests"] += 1
                        else:
                            summary["failed_tests"] += 1
                
                # Collect critical issues
                if "issues" in results:
                    summary["critical_issues"].extend(results["issues"])
        
        # Determine overall status
        if summary["failed_tests"] == 0:
            summary["overall_status"] = "excellent"
        elif summary["passed_tests"] > summary["failed_tests"]:
            summary["overall_status"] = "good"
        else:
            summary["overall_status"] = "needs_attention"
        
        # Generate recommendations
        if not self.test_results.get("configuration", {}).get("api_key_set"):
            summary["recommendations"].append("Set DEEPSEEK_API_KEY environment variable")
        
        if not self.test_results.get("audio", {}).get("microphone_available"):
            summary["recommendations"].append("Check microphone permissions and hardware")
        
        if not self.test_results.get("audio", {}).get("tts_available"):
            summary["recommendations"].append("Check audio output devices and drivers")
        
        return summary
    
    def print_summary(self):
        """
        Print a formatted test summary.
        """
        if "summary" not in self.test_results:
            print("❌ No test results available. Run tests first.")
            return
        
        summary = self.test_results["summary"]
        
        print("\n" + "="*50)
        print("🧪 VOICE AGENT TEST SUMMARY")
        print("="*50)
        
        # Overall status
        status_emoji = {
            "excellent": "🟢",
            "good": "🟡", 
            "needs_attention": "🔴",
            "unknown": "⚪"
        }
        
        print(f"Overall Status: {status_emoji[summary['overall_status']]} {summary['overall_status'].replace('_', ' ').title()}")
        print(f"Tests Passed: {summary['passed_tests']}/{summary['total_tests']}")
        
        # Critical issues
        if summary["critical_issues"]:
            print("\n🚨 Critical Issues:")
            for issue in summary["critical_issues"]:
                print(f"  • {issue}")
        
        # Recommendations
        if summary["recommendations"]:
            print("\n💡 Recommendations:")
            for rec in summary["recommendations"]:
                print(f"  • {rec}")
        
        print("\n" + "="*50)

# Initialize tester
tester = VoiceAgentTester(voice_agent)

print("✅ Voice Agent Tester initialized")
print("🧪 Use tester.run_comprehensive_test() to run all tests")
print("📊 Use tester.print_summary() to see test results")

### Run Comprehensive Tests

Execute this cell to run all tests and get a comprehensive health check of your voice agent.

In [None]:
# Run comprehensive tests
test_results = tester.run_comprehensive_test()

# Print formatted summary
tester.print_summary()
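
Each test method returns a dictionary of boolean checks plus an `issues` list, so the results can also be inspected programmatically after a run. The sketch below lists every failed check by name. It uses a hand-written sample in place of real output, and `failing_checks` and `sample_results` are hypothetical names.

```python
from typing import Any, Dict, List


def failing_checks(test_results: Dict[str, Any]) -> List[str]:
    """Return 'category.check' names for every boolean check that is False."""
    failures = []
    for category, results in test_results.items():
        if category == "summary" or not isinstance(results, dict):
            continue
        for name, value in results.items():
            # Only booleans are checks; numbers and lists are metadata
            if isinstance(value, bool) and not value:
                failures.append(f"{category}.{name}")
    return failures


# Hand-written sample shaped like the tester's output (not real results)
sample_results = {
    "configuration": {"api_key_set": False, "audio_config_valid": True, "issues": []},
    "deepseek_api": {"connection_successful": False, "response_time": 0, "issues": []},
    "summary": {"overall_status": "needs_attention"},
}

print(failing_checks(sample_results))
```

With real results, the call would be `failing_checks(test_results)`.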

## 10. Troubleshooting Guide

### Common Issues and Solutions

#### 🎤 Audio Issues

**Problem**: Microphone not working or poor recognition
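- **Solution**: Check that the operating system grants microphone access to the process running the Jupyter kernel. The kernel, not the browser, opens the microphone, so a kernel on a remote machine cannot capture local audio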
- **Solution**: Adjust energy threshold: `audio_processor.recognizer.energy_threshold = 500`
- **Solution**: Test with different microphones or audio devices

**Problem**: No sound output from text-to-speech
- **Solution**: Check system audio settings and volume
- **Solution**: Try different TTS voices: modify `config.TTS_VOICE_INDEX`
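- **Solution**: Recreate the audio processor and hand it to the agent, since rebinding the global name alone leaves the agent with the old instance: `voice_agent.audio_processor = AudioProcessor(config)`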

#### 🤖 API Issues

**Problem**: DeepSeek API connection failed
- **Solution**: Verify API key is correctly set in environment variables
- **Solution**: Check internet connection and firewall settings
- **Solution**: Verify API endpoint URL is correct

**Problem**: API rate limiting or quota exceeded
- **Solution**: Add delays between requests
- **Solution**: Implement request queuing and retry logic
- **Solution**: Check your API usage limits
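
The delay and retry suggestions above can be combined into exponential backoff. The following is a sketch, not the notebook's client code: `RateLimitError` is a hypothetical exception standing in for however the real client reports an HTTP 429, and `flaky_call` simulates an API that is rate-limited twice.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical error type standing in for an HTTP 429 response."""


async def call_with_backoff(make_call, max_attempts: int = 4,
                            base_delay: float = 0.5, max_delay: float = 8.0):
    """Await make_call(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle it
            # Delay doubles each attempt, capped, with jitter to avoid bursts
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))


# Demo with a fake call that is rate-limited twice, then succeeds
attempts = 0


async def flaky_call() -> str:
    global attempts
    attempts += 1
    if attempts < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


# In a notebook cell, use: result = await call_with_backoff(flaky_call, base_delay=0.01)
result = asyncio.run(call_with_backoff(flaky_call, base_delay=0.01))
print(result, "after", attempts, "attempts")
```

For a voice agent the cap matters more than usual: a spoken conversation tolerates a few seconds of silence, so a small `max_attempts` with a spoken fallback message is often preferable to long waits.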

#### 📡 MCP Issues

**Problem**: Context not maintained between interactions
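- **Solution**: Inspect `voice_agent.mcp.context_history` after each interaction to confirm that new entries are appended
- **Solution**: Verify request/response processing
- **Solution**: Restart the MCP protocol by replacing the instance the agent uses, since rebinding the global name alone has no effect on the agent: `voice_agent.mcp = MCPProtocol(config)`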
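
A frequent cause of lost context is that a turn is never appended to the history, or that the history is not passed along with the next model call. The sketch below shows a bounded history and how it might be turned into chat messages. `ContextHistory` and its fields are hypothetical and do not mirror `MCPProtocol`, whose internals are defined earlier in the notebook.

```python
from collections import deque
from typing import Deque, Dict, List


class ContextHistory:
    """Hypothetical rolling history that keeps the most recent entries."""

    def __init__(self, max_entries: int = 6):
        # deque(maxlen=...) silently drops the oldest entry when full
        self.entries: Deque[Dict[str, str]] = deque(maxlen=max_entries)

    def add(self, role: str, content: str) -> None:
        self.entries.append({"role": role, "content": content})

    def as_messages(self, new_user_input: str) -> List[Dict[str, str]]:
        """Prior turns followed by the new user message."""
        return list(self.entries) + [{"role": "user", "content": new_user_input}]


history = ContextHistory(max_entries=4)
history.add("user", "My name is Ada.")
history.add("assistant", "Nice to meet you, Ada.")

messages = history.as_messages("What is my name?")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Printing the message list just before the model call is a quick way to see whether earlier turns are actually being sent.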

#### 🔧 Performance Issues

**Problem**: Slow response times
- **Solution**: Reduce `config.MAX_TOKENS` for faster responses
- **Solution**: Shorten listening waits by lowering `config.RECOGNITION_TIMEOUT` and `config.PHRASE_TIMEOUT`
- **Solution**: Use faster TTS settings: increase `config.TTS_RATE`
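
Before changing any of these settings, it helps to measure which stage is actually slow. The sketch below times each stage with a small context manager. The `time.sleep()` calls are placeholders for the real recognition, model, and speech steps, and `timed` and `timings` are hypothetical names.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

timings: Dict[str, float] = {}


@contextmanager
def timed(stage: str) -> Iterator[None]:
    """Record the wall-clock duration of a block under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# time.sleep() calls stand in for the real pipeline stages
with timed("speech_recognition"):
    time.sleep(0.02)
with timed("model_response"):
    time.sleep(0.05)
with timed("text_to_speech"):
    time.sleep(0.01)

slowest = max(timings, key=timings.get)
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.3f}s")
print("Slowest stage:", slowest)
```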

### Environment-Specific Solutions

#### Linux/Ubuntu
```bash
# Install audio dependencies
sudo apt-get install portaudio19-dev python3-pyaudio
sudo apt-get install espeak espeak-data libespeak-dev
```

#### macOS
```bash
# Install audio dependencies
brew install portaudio
pip install pyaudio
```

#### Windows
```bash
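# Recent PyAudio releases publish Windows wheels on PyPI, so try pip first
pip install pyaudio

# Fallback for older Python/PyAudio combinations
pip install pipwin
pipwin install pyaudio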
```

### Debug Mode

Enable verbose logging for debugging:

In [None]:
# Enable debug logging
import logging

# Set logging level to DEBUG for detailed information
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    force=True  # Override existing configuration
)

print("🐛 Debug mode enabled")
print("📝 All operations will now show detailed logging information")
print("💡 To disable debug mode, restart the kernel or set level to INFO")

# Test debug logging
logger.debug("Debug mode test message")
logger.info("Info level message")
logger.warning("Warning level message")
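
Setting the root logger to DEBUG also enables debug output from every third-party library, which can bury the agent's own messages. A narrower alternative is sketched below. It assumes the notebook's logger is named `voice_agent`, which this section does not confirm, and the library logger names are examples only.

```python
import logging

# Keep the root logger at INFO so other libraries stay quiet
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    force=True,  # replace handlers installed by earlier basicConfig calls
)

# Turn on DEBUG only for the agent's logger (name is an assumption)
agent_logger = logging.getLogger("voice_agent")
agent_logger.setLevel(logging.DEBUG)

# Raise the level for loggers that are typically noisy (example names)
for noisy_name in ("urllib3", "asyncio"):
    logging.getLogger(noisy_name).setLevel(logging.WARNING)

agent_logger.debug("Shown: the agent logger is at DEBUG")
logging.getLogger("some_library").debug("Hidden: the root level is INFO")
```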

## 11. Usage Examples and Best Practices

### Example Conversation Flows

Here are some example interactions you can try with your voice agent:

#### 🎯 Simple Q&A
- **You**: "What's the weather like today?"
- **Agent**: "I don't have access to real-time weather data, but I can help you understand how to check the weather or discuss weather-related topics."

#### 💡 Creative Tasks
- **You**: "Write a short poem about technology"
- **Agent**: [Generates creative poem]

#### 🧮 Problem Solving
- **You**: "Explain how machine learning works"
- **Agent**: [Provides clear explanation suitable for voice]

#### 🎓 Learning Assistant
- **You**: "Help me understand Python functions"
- **Agent**: [Explains programming concepts conversationally]

### Best Practices for Voice Interaction

#### For Users:
1. **Speak clearly** and at a moderate pace
2. **Use a quiet environment** to reduce background noise
3. **Be specific** in your questions for better responses
4. **Wait for the agent** to finish speaking before responding
5. **Use keywords** like "goodbye" or "stop" to end conversations

#### For Developers:
1. **Calibrate microphone** regularly for optimal recognition
2. **Monitor API usage** to avoid rate limits
3. **Implement error handling** for robust operation
4. **Test in different environments** for reliability
5. **Keep responses concise** for better voice experience
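
Concise responses can be requested in the prompt and also enforced just before speech synthesis. The sketch below keeps the first few sentences of a reply. It assumes plain English prose, and the regular expression is a simple heuristic that will mis-split abbreviations such as "e.g."; `trim_for_speech` is a hypothetical helper.

```python
import re


def trim_for_speech(text: str, max_sentences: int = 2) -> str:
    """Keep at most max_sentences sentences, splitting on ., ! or ?"""
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


long_reply = (
    "Machine learning finds patterns in data. "
    "A model is trained on examples. "
    "It is then evaluated on unseen data. "
    "Many algorithms exist."
)
print(trim_for_speech(long_reply))
```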

### Advanced Configuration Examples


In [None]:
# Example: Custom configuration for different use cases

def create_presentation_config():
    """
    Configuration optimized for presentation mode.
    - Slower speech rate for clarity
    - Higher energy threshold for noisy environments
    - Shorter responses for engagement
    """
    config = VoiceAgentConfig()
    config.TTS_RATE = 150  # Slower speech
    config.ENERGY_THRESHOLD = 500  # Higher threshold
    config.MAX_TOKENS = 500  # Shorter responses
    config.TEMPERATURE = 0.5  # More focused responses
    return config

def create_casual_config():
    """
    Configuration optimized for casual conversation.
    - Natural speech rate
    - Creative responses
    - Longer interaction timeout
    """
    config = VoiceAgentConfig()
    config.TTS_RATE = 200  # Normal speech
    config.ENERGY_THRESHOLD = 300  # Standard threshold
    config.MAX_TOKENS = 800  # Longer responses
    config.TEMPERATURE = 0.8  # More creative
    config.RECOGNITION_TIMEOUT = 10  # Longer timeout
    return config

def create_accessibility_config():
    """
    Configuration optimized for accessibility.
    - Very clear speech
    - Patient interaction timing
    - Detailed responses
    """
    config = VoiceAgentConfig()
    config.TTS_RATE = 120  # Very slow speech
    config.TTS_VOLUME = 1.0  # Maximum volume
    config.RECOGNITION_TIMEOUT = 15  # Very patient
    config.PHRASE_TIMEOUT = 2  # Longer pause tolerance
    config.MAX_TOKENS = 600  # Detailed responses
    return config

# Example usage:
print("📋 Configuration Examples Created:")
print("  🎤 Presentation Mode: create_presentation_config()")
print("  💬 Casual Mode: create_casual_config()")
print("  ♿ Accessibility Mode: create_accessibility_config()")

# To use a different configuration:
# new_config = create_presentation_config()
# new_voice_agent = VoiceAgent(new_config, MCPProtocol(new_config), 
#                              DeepSeekClient(new_config), AudioProcessor(new_config))
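
The factory functions above mutate a fresh `VoiceAgentConfig` attribute by attribute. If the configuration were a dataclass, presets could instead be derived with `dataclasses.replace`, which leaves the base object untouched. The `SketchConfig` class below is a hypothetical stand-in with illustrative defaults; the real `VoiceAgentConfig` is defined earlier in the notebook and may differ.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical config; field names follow the examples above."""
    TTS_RATE: int = 200
    ENERGY_THRESHOLD: int = 300
    MAX_TOKENS: int = 800
    TEMPERATURE: float = 0.7
    RECOGNITION_TIMEOUT: float = 5.0


BASE = SketchConfig()

# Each preset overrides only the fields that differ from the base
PRESETS = {
    "presentation": replace(BASE, TTS_RATE=150, ENERGY_THRESHOLD=500,
                            MAX_TOKENS=500, TEMPERATURE=0.5),
    "casual": replace(BASE, TEMPERATURE=0.8, RECOGNITION_TIMEOUT=10),
    "accessibility": replace(BASE, TTS_RATE=120, RECOGNITION_TIMEOUT=15,
                             MAX_TOKENS=600),
}

chosen = PRESETS["presentation"]
print(chosen.TTS_RATE, chosen.MAX_TOKENS, BASE.TTS_RATE)
```

Freezing the dataclass makes accidental changes to a shared preset fail loudly instead of silently affecting every agent built from it.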

## 12. Conclusion and Next Steps

### 🎉 Congratulations!

You have successfully set up and configured a comprehensive MCP-powered voice agent with DeepSeek integration. This implementation includes:

✅ **Complete Voice Pipeline**: Speech recognition → AI processing → Text-to-speech  
✅ **MCP Protocol Integration**: Structured communication and context management  
✅ **DeepSeek AI Model**: Advanced language understanding and generation  
✅ **Interactive Interface**: User-friendly Jupyter widget controls  
✅ **Comprehensive Testing**: Automated health checks and diagnostics  
✅ **Detailed Documentation**: Extensive code comments and explanations  

### 🚀 Next Steps for Enhancement

#### Immediate Improvements:
1. **Add wake word detection** for hands-free activation
2. **Implement conversation memory** across sessions
3. **Add multi-language support** for global accessibility
4. **Create custom voice commands** for specific actions

#### Advanced Features:
1. **Integration with external APIs** (weather, news, calendar)
2. **Voice emotion detection** for more natural responses
3. **Streaming responses** for faster interaction
4. **Voice cloning** for personalized TTS

#### Production Deployment:
1. **Add authentication and security**
2. **Implement rate limiting and monitoring**
3. **Create web interface** for broader accessibility
4. **Add cloud deployment options**

### 📚 Learning Resources

- **MCP Documentation**: Learn more about Model Context Protocol
- **DeepSeek API Docs**: Explore advanced model capabilities
- **Speech Recognition**: Dive deeper into audio processing
- **Jupyter Widgets**: Create more interactive interfaces

### 🤝 Community and Support

- Share your voice agent implementations
- Contribute improvements and bug fixes
- Help others troubleshoot common issues
- Explore creative use cases and applications

### 💡 Final Tips

1. **Start with the testing suite** to ensure everything works
2. **Experiment with different configurations** for your use case
3. **Monitor API usage** to manage costs effectively
4. **Keep your environment updated** for security and performance
5. **Document your customizations** for future reference

---

**Happy voice agent building!** 🎤🤖✨

*Remember: This is a powerful tool - use it responsibly and respect user privacy and consent when processing voice data.*