# Extend model capabilities
## Introduction
* This notebook explores how to extend the capabilities of AI models by integrating external tools and APIs.
* Extension methods include:
  - Tool Use: Integrating external tools to enhance model functionality.
  - API Calls: Leveraging APIs to fetch real-time data or perform specific tasks.
  - MCP (Model Context Protocol): Using a structured protocol to manage model interactions and context.
  - Retrieval-Augmented Generation (RAG): Combining model outputs with retrieved information from external sources.
  - Fine-tuning: Customizing models on specific datasets to improve performance on targeted tasks.

## Technologies
* OpenAI: the OpenAI API is called to generate model responses
* Gradio: used to build the user interface for interacting with AI models

In [15]:
from openai import OpenAI
import gradio as gr
import os
import copy

In [16]:
from dotenv import load_dotenv
load_dotenv()

class EnvService():
    def get_open_ai_key(self):
        open_ai_key = os.getenv("OPEN_AI_KEY")
        if not open_ai_key:
            print("OPEN AI KEY IS NOT SET!!!")
            return 
        return open_ai_key
    def get_weather_api_key(self):
        weather_api_key = os.getenv("WEATHER_API_KEY")
        if not weather_api_key:
            print("WEATHER_API_KEY IS NOT SET!!!")
            return 
        return weather_api_key
    def get_gemini_api_key(self):
        gemini_ai_key = os.getenv("GEMINI_AI_KEY")
        if not gemini_ai_key:
            print("GEMINI_AI_KEY IS NOT SET!!!")
            return
        return gemini_ai_key
env_service = EnvService()

In [17]:
class AIService:
    model = "gpt-4.1"
    def __init__(self):
        self.init_client()
        
    def init_client(self):
        self.client = OpenAI(api_key=env_service.get_open_ai_key())

    def chat(self, messages):
        responses = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True
        )
        return responses

In [18]:
class ChatBot:
    def __init__(self):
        self.ai_service = AIService()
    def chat(self, messages, history):
        new_messages = copy.deepcopy(history)
        new_messages.append({"role": "user", "content": messages})
    
        responses = self.ai_service.chat(new_messages)
    
        partial = ""
        for chunk in responses:
            delta = chunk.choices[0].delta
            if delta.content is not None:
                partial += delta.content
                yield [
                    {"role": "assistant", "content": partial}
                ]

    def render_ui(self):
        chat_interface = gr.ChatInterface(fn=self.chat, type="messages")
        chat_interface.launch()
    def run(self):
        self.render_ui()

In [19]:
chat_bot = ChatBot()
chat_bot.run()

* Running on local URL:  http://127.0.0.1:7862
* To create a public link, set `share=True` in `launch()`.


# Chatbot with Tools

## Introduction
* Use tools so the chatbot can generate images (an audio tool is not implemented in this notebook)
* Use tools so the chatbot can report the current weather at a location
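
Because the notebook requests streaming responses, a tool call arrives as fragments spread over several chunks: the function name usually comes first, and the JSON arguments follow piece by piece. The sketch below shows one way to merge those fragments by their `index` before parsing the arguments. It uses `SimpleNamespace` objects as hypothetical stand-ins for the streamed `delta` objects (`fake_delta` and `accumulate_tool_calls` are illustrative names, not part of any library), so it runs without an API key.

```python
import json
from types import SimpleNamespace


def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls = {}
    for delta in deltas:
        # Chunks that carry only text have no tool_calls attribute value.
        for fragment in (delta.tool_calls or []):
            entry = calls.setdefault(fragment.index, {"name": None, "args": ""})
            function = fragment.function
            if function is None:
                continue
            if function.name:
                entry["name"] = function.name
            if function.arguments:
                entry["args"] += function.arguments
    return calls


def fake_delta(index, name=None, arguments=None):
    """Hypothetical stand-in for the `delta` of one streamed chunk."""
    function = SimpleNamespace(name=name, arguments=arguments)
    return SimpleNamespace(tool_calls=[SimpleNamespace(index=index, function=function)])


# Simulated stream: the arguments JSON is split across two fragments.
stream = [
    fake_delta(0, name="get_weather", arguments='{"loc'),
    fake_delta(0, arguments='ation": "London"}'),
    SimpleNamespace(tool_calls=None),  # a text-only chunk
]

calls = accumulate_tool_calls(stream)
parsed_args = json.loads(calls[0]["args"])
```

The arguments string is only valid JSON once the stream has finished, which is why parsing happens after the loop rather than per fragment.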

In [20]:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "Location name, e.g. London"},
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "generate_image",
            "description": "Generate an image based on a text prompt",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {"type": "string", "description": "What to generate"},
                },
                "required": ["prompt"],
            },
        },
    },
]


In [21]:
import requests

class WeatherService:
    def __init__(self):
        self.api_key = env_service.get_weather_api_key()
        self.api_url = "https://api.openweathermap.org/data/2.5/weather"

    def get_weather(self, location: str) -> str:
        params = {
            "q": location,
            "appid": self.api_key,
            "units": "metric"  # return °C instead of Kelvin
        }
        try:
            response = requests.get(self.api_url, params=params, timeout=10)  # avoid hanging forever on a slow API
            data = response.json()

            if response.status_code != 200:
                return f"Error: {data.get('message', 'Unable to fetch weather')}"

            # Parse relevant info
            city = data.get("name", location)
            country = data.get("sys", {}).get("country", "")
            weather_main = data["weather"][0]["main"]
            weather_desc = data["weather"][0]["description"]
            temp = data["main"]["temp"]
            feels_like = data["main"]["feels_like"]
            humidity = data["main"]["humidity"]
            wind_speed = data["wind"]["speed"]

            return (
                f"Weather in {city}, {country}:\n"
                f"- Condition: {weather_main} ({weather_desc})\n"
                f"- Temperature: {temp:.1f}°C (feels like {feels_like:.1f}°C)\n"
                f"- Humidity: {humidity}%\n"
                f"- Wind speed: {wind_speed} m/s"
            )
        except Exception as e:
            return f"Error fetching weather: {e}"


In [22]:
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

class AIServiceWithTools(AIService):
    model_gen_image = 'gemini-2.0-flash-preview-image-generation'

    def __init__(self):
        super().__init__()
        self.tools = tools
        self.client_gemini = genai.Client(
            api_key=env_service.get_gemini_api_key(),
        )
    def chat_with_tools(self, messages):
        responses = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            tools=self.tools,
            stream=True
        )
        return responses
    def generate_image(self, prompt, save_path):
        response = self.client_gemini.models.generate_content(
            model=self.model_gen_image,
            contents=prompt,
            config=types.GenerateContentConfig(
                response_modalities=['TEXT', 'IMAGE']
            )
        )

        for part in response.candidates[0].content.parts:
            if part.text is not None:
                print(part.text)
            elif part.inline_data is not None:
                image = Image.open(BytesIO((part.inline_data.data)))
                image.save(save_path)


In [23]:
import json
import copy
import gradio as gr

class ChatBotWithTools(ChatBot):
    def __init__(self):
        super().__init__()
        self.ai_service = AIServiceWithTools()
        self.weather_service = WeatherService()

    # --- Utility: clean history before sending to model ---
    def sanitize_history_for_model(self, history):
        cleaned = []
        for msg in history:
            role = msg.get("role")
            content = msg.get("content")

            # Skip image dicts from Gradio
            if isinstance(content, dict):
                img_desc = content.get("path", "")
                if img_desc:
                    cleaned.append({
                        "role": role,
                        "content": f"[Image shown to user: {img_desc}]"
                    })
                continue

            # Extract text from multimodal lists if any
            if isinstance(content, list):
                text_parts = []
                for c in content:
                    if isinstance(c, dict) and c.get("type") == "text":
                        text_parts.append(c.get("text", ""))
                    elif isinstance(c, str):
                        text_parts.append(c)
                content = "\n".join(text_parts)

            # Convert tuples to string
            if isinstance(content, tuple):
                content = " ".join(map(str, content))

            # Skip empty messages
            if not content or not isinstance(content, str):
                continue

            cleaned.append({"role": role, "content": content})
        return cleaned


    # --- Main chat handler ---
    def chat_with_tools(self, messages, history):
        # Step 1: sanitize previous conversation before sending to model
        safe_history = self.sanitize_history_for_model(history)

        # Step 2: append current user message (handle multimodal format from Gradio)
        user_content = messages
        if isinstance(messages, list):
            # Extract text from multimodal message format
            text_parts = [m.get("text", "") for m in messages if isinstance(m, dict) and m.get("type") == "text"]
            user_content = "\n".join(text_parts) if text_parts else messages
        elif isinstance(messages, dict):
            # Handle dict format (e.g., file uploads)
            user_content = str(messages)
        
        new_messages = copy.deepcopy(safe_history)
        new_messages.append({"role": "user", "content": user_content})

        # Step 3: stream model responses
        print("Sending to model...", new_messages)
        responses = self.ai_service.chat_with_tools(new_messages)

        partial = ""
        tool_call_data = {}

        for chunk in responses:
            delta = chunk.choices[0].delta

            # --- Stream normal text ---
            if delta.content is not None:
                partial += delta.content
                yield [{"role": "assistant", "content": partial}]

            # --- Collect streamed tool calls ---
            if delta.tool_calls:
                for tool_call in delta.tool_calls:
                    idx = tool_call.index
                    fn_name = tool_call.function.name
                    fn_args_part = tool_call.function.arguments

                    if idx not in tool_call_data:
                        tool_call_data[idx] = {"name": fn_name, "args": ""}

                    if fn_name:
                        tool_call_data[idx]["name"] = fn_name
                    if fn_args_part:
                        tool_call_data[idx]["args"] += fn_args_part

        # Step 4: Execute tools after streaming completes
        for idx, tool in tool_call_data.items():
            fn_name = tool["name"]
            args_str = tool["args"]

            try:
                args = json.loads(args_str)
            except Exception as e:
                print("Failed to parse tool args:", args_str, e)
                args = {}

            # --- Weather tool ---
            if fn_name == "get_weather":
                # Use .get so missing or unparsable arguments do not raise a TypeError
                tool_result = self.weather_service.get_weather(args.get("location", ""))
                yield [{"role": "assistant", "content": tool_result}]

            # --- Image generation tool ---
            elif fn_name == "generate_image":
                prompt = args.get("prompt", "")
                save_path = "generate_image.png"
                self.ai_service.generate_image(prompt, save_path)

                # Step 4a: Tell user what image was made
                yield [{"role": "assistant", "content": f"Here is your image for: {prompt}"}]

                # Step 4b: Show image in Gradio
                yield [{"role": "assistant", "content": {"path": save_path, "mime_type": "image/png"}}]

                # Step 4c: Add text reference so model can remember next time
                history.append({"role": "assistant", "content": f"[Generated image for: {prompt}]"})

            else:
                yield [{"role": "assistant", "content": f"Unknown function call: {fn_name}"}]

    # --- Gradio UI setup ---
    def render_ui(self):
        chat_interface = gr.ChatInterface(fn=self.chat_with_tools, type="messages")
        chat_interface.launch()

    def run(self):
        self.render_ui()


In [24]:
chat_bot_with_tools = ChatBotWithTools()
chat_bot_with_tools.run()

* Running on local URL:  http://127.0.0.1:7863
* To create a public link, set `share=True` in `launch()`.


# ChatBot with MCP

## Introduction
* Extend the base ChatBot class to support tool usage via MCP (Model Context Protocol)
* Example: a calendar for scheduling study time (list available subjects, set study time, view schedule)
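
An MCP server describes each tool with a name, an optional description, and a JSON Schema for its input, while the OpenAI chat API expects tools wrapped in a `{"type": "function", ...}` envelope. The sketch below illustrates that conversion in isolation. `FakeMCPTool` is a hypothetical stand-in for the tool objects an MCP session returns, and the tool names `set_study_time` and `list_subjects` are invented for illustration; the actual server's tool names are not shown in this notebook.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeMCPTool:
    """Hypothetical stand-in for a tool entry returned by an MCP server."""
    name: str
    description: Optional[str] = None
    inputSchema: Optional[dict] = None


def to_openai_tool(tool):
    """Wrap one MCP-style tool description in the OpenAI function-tool format."""
    # Copy the schema so the original tool object is not mutated.
    schema = dict(tool.inputSchema) if isinstance(tool.inputSchema, dict) else {}
    schema.setdefault("type", "object")
    schema.setdefault("properties", {})
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description or f"Execute {tool.name}",
            "parameters": schema,
        },
    }


# Illustrative tools only; names and schemas are assumptions.
set_study_time = FakeMCPTool(
    name="set_study_time",
    description="Schedule a study session",
    inputSchema={
        "type": "object",
        "properties": {"subject": {"type": "string"}},
        "required": ["subject"],
    },
)
list_subjects = FakeMCPTool(name="list_subjects")

converted = [to_openai_tool(t) for t in (set_study_time, list_subjects)]
```

A tool without a schema still gets a valid empty object schema, so the model can call it with no arguments.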

In [25]:
class AIServiceWithMCP(AIServiceWithTools):
    def chat_with_mcp(self, messages):
        responses = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            tools=tools,
            stream=True
        )
        return responses

In [26]:
import asyncio
from typing import Optional
from contextlib import AsyncExitStack
import nest_asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

# Allow nested event loops (required for Jupyter + Gradio + MCP)
nest_asyncio.apply()

class ChatBotWithMCP(ChatBotWithTools):
    def __init__(self):
        super().__init__()
        self.mcp_session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.mcp_tools = []
        self.ai_service_with_mcp = AIServiceWithMCP()
        self.event_loop = None  # Store the loop where MCP session was created
    
    async def start_mcp_session(self):
        """Connect to SSE-based MCP server and keep session alive"""
        server_url = "http://localhost:8000/sse"
        
        # Store the event loop where we create the session
        self.event_loop = asyncio.get_running_loop()
        
        # Keep the context alive by storing in exit_stack
        read, write = await self.exit_stack.enter_async_context(sse_client(server_url))
        self.mcp_session = await self.exit_stack.enter_async_context(ClientSession(read, write))
        
        await self.mcp_session.initialize()
        
        # List available tools
        response = await self.mcp_session.list_tools()
        print("\nConnected to MCP server with tools:", [tool.name for tool in response.tools])
        self.mcp_tools = response.tools
        
        return self.mcp_session

    def convert_mcp_tools_to_openai_format(self):
        """Convert MCP tools to OpenAI function calling format."""
        openai_tools = []
        for tool in self.mcp_tools:
            # Get input schema or create default
            input_schema = tool.inputSchema if hasattr(tool, 'inputSchema') and tool.inputSchema else {
                "type": "object",
                "properties": {},
                "required": []
            }
            
            # Ensure input_schema has required fields
            if not isinstance(input_schema, dict):
                input_schema = {"type": "object", "properties": {}, "required": []}
            if "type" not in input_schema:
                input_schema["type"] = "object"
            if "properties" not in input_schema:
                input_schema["properties"] = {}
            
            openai_tool = {
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description if hasattr(tool, 'description') and tool.description else f"Execute {tool.name}",
                    "parameters": input_schema
                }
            }
            openai_tools.append(openai_tool)
            
        print(f"Converted {len(openai_tools)} MCP tools to OpenAI format")
        # Merge with existing tools (weather, image generation)
        all_tools = tools + openai_tools  # Combine built-in and MCP tools
        print(f"Total tools available: {len(all_tools)} (Built-in: {len(tools)}, MCP: {len(openai_tools)})")
        return all_tools

    async def execute_mcp_tool(self, tool_name: str, arguments: dict) -> str:
        """Execute a tool on the MCP server."""
        if not self.mcp_session:
            return "Error: MCP session not initialized"
        
        try:
            print(f"Executing MCP tool: {tool_name} with args: {arguments}")
            result = await self.mcp_session.call_tool(tool_name, arguments=arguments)
            print(f"MCP tool result: {result}")
            
            # Extract text content from result
            if hasattr(result, 'content') and len(result.content) > 0:
                content_item = result.content[0]
                if hasattr(content_item, 'text'):
                    return content_item.text
            return str(result)
        except Exception as e:
            import traceback
            error_details = traceback.format_exc()
            print(f"Error executing MCP tool: {error_details}")
            return f"Error executing tool {tool_name}: {str(e)}"

    def chat_with_mcp_tool(self, messages, history):
        """Chat method that uses MCP server tools with OpenAI function calling."""
        # Sanitize history
        safe_history = self.sanitize_history_for_model(history)
        
        # Prepare user message
        user_content = messages
        if isinstance(messages, list):
            text_parts = [m.get("text", "") for m in messages if isinstance(m, dict) and m.get("type") == "text"]
            user_content = "\n".join(text_parts) if text_parts else messages
        elif isinstance(messages, dict):
            user_content = str(messages)
        
        new_messages = copy.deepcopy(safe_history)
        new_messages.append({"role": "user", "content": user_content})

        # Get all tools (MCP + built-in)
        all_tools = self.convert_mcp_tools_to_openai_format()
        
        # Call OpenAI with combined tools
        responses = self.ai_service.client.chat.completions.create(
            model=self.ai_service.model,
            messages=new_messages,
            tools=all_tools,
            stream=True
        )

        partial = ""
        tool_call_data = {}

        for chunk in responses:
            delta = chunk.choices[0].delta

            # Stream text content
            if delta.content is not None:
                partial += delta.content
                yield [{"role": "assistant", "content": partial}]

            # Collect tool calls
            if delta.tool_calls:
                for tool_call in delta.tool_calls:
                    idx = tool_call.index
                    fn_name = tool_call.function.name
                    fn_args_part = tool_call.function.arguments

                    if idx not in tool_call_data:
                        tool_call_data[idx] = {"name": fn_name, "args": ""}

                    if fn_name:
                        tool_call_data[idx]["name"] = fn_name
                    if fn_args_part:
                        tool_call_data[idx]["args"] += fn_args_part

        # Execute collected tool calls
        for idx, tool in tool_call_data.items():
            fn_name = tool["name"]
            args_str = tool["args"]

            try:
                args = json.loads(args_str)
            except Exception as e:
                print("Failed to parse tool args:", args_str, e)
                args = {}

            # Check if it's an MCP tool
            mcp_tool_names = [t.name for t in self.mcp_tools]
            if fn_name in mcp_tool_names:
                # Execute MCP tool using the same event loop where session was created
                try:
                    if self.event_loop and self.event_loop.is_running():
                        # Use the same loop that created the MCP session
                        future = asyncio.run_coroutine_threadsafe(
                            self.execute_mcp_tool(fn_name, args),
                            self.event_loop
                        )
                        tool_result = future.result(timeout=30)
                    else:
                        # Fallback if loop is not running
                        tool_result = asyncio.run(self.execute_mcp_tool(fn_name, args))
                    
                    # Ensure we got a valid result
                    if tool_result is None:
                        tool_result = "No response from MCP tool"
                        
                except Exception as e:
                    import traceback
                    tool_result = f"Error executing MCP tool: {str(e)}\n{traceback.format_exc()}"
                    
                yield [{"role": "assistant", "content": tool_result}]
            
            # Handle built-in tools
            elif fn_name == "get_weather":
                # Use .get so missing or unparsable arguments do not raise a TypeError
                tool_result = self.weather_service.get_weather(args.get("location", ""))
                yield [{"role": "assistant", "content": tool_result}]

            elif fn_name == "generate_image":
                prompt = args.get("prompt", "")
                save_path = "generate_image.png"
                self.ai_service.generate_image(prompt, save_path)
                yield [{"role": "assistant", "content": f"Here is your image for: {prompt}"}]
                yield [{"role": "assistant", "content": {"path": save_path, "mime_type": "image/png"}}]
                history.append({"role": "assistant", "content": f"[Generated image for: {prompt}]"})

            else:
                yield [{"role": "assistant", "content": f"Unknown function call: {fn_name}"}]

    # --- Gradio UI setup ---
    def render_ui_with_mcp(self):
        """Render UI that uses MCP tools."""
        chat_interface = gr.ChatInterface(fn=self.chat_with_mcp_tool, type="messages")
        chat_interface.launch()

    async def run_with_mcp_async(self):
        """Async wrapper to initialize MCP before launching UI"""
        await self.start_mcp_session()
        self.render_ui_with_mcp()

    def run_with_mcp(self):
        """Run chatbot with MCP tools."""
        # Initialize MCP session before starting UI
        loop = asyncio.get_event_loop()
        if loop.is_running():
            # Jupyter notebook case
            asyncio.ensure_future(self.run_with_mcp_async())
        else:
            loop.run_until_complete(self.run_with_mcp_async())
    
    async def cleanup(self):
        """Cleanup MCP session when done"""
        await self.exit_stack.aclose()

In [27]:
# Initialize and test MCP connection
chat_bot_with_mcp = ChatBotWithMCP()
chat_bot_with_mcp.run_with_mcp()

# ChatBot with RAG
## Introduction
* Using Retrieval-Augmented Generation (RAG) to enhance chatbot responses with relevant external information.
* Example: Building a chatbot that can answer questions about a specific document or dataset by retrieving relevant content from it.

In [28]:
import sys
from pathlib import Path

# Add the src directory to path for importing embedded module
current_dir = Path.cwd()
if current_dir.name == 'src':
    sys.path.insert(0, str(current_dir))
else:
    src_dir = current_dir / 'src'
    if src_dir.exists():
        sys.path.insert(0, str(src_dir))

from embedded import KnowledgeBaseVectorizer

class ChatBotWithRAG(ChatBotWithMCP):
    def __init__(self):
        super().__init__()
        # Initialize RAG-specific components with pgvector
        self.vectorizer = self._init_vectorizer()
    
    def _init_vectorizer(self):
        """Initialize the KnowledgeBaseVectorizer for RAG retrieval"""
        db_config = {
            'host': os.getenv('DB_HOST', 'localhost'),
            'port': os.getenv('DB_PORT', '5432'),
            'database': os.getenv('DB_NAME', 'ai_chatbot'),
            'user': os.getenv('DB_USER', 'postgres'),
            'password': os.getenv('DB_PASSWORD', 'postgres')
        }
        
        # Path to knowledge base - navigate from current directory
        current_path = Path.cwd()
        if current_path.name == 'src':
            knowledge_base_path = current_path.parent / 'knowledge-base'
        else:
            knowledge_base_path = current_path / 'knowledge-base'
        
        vectorizer = KnowledgeBaseVectorizer(str(knowledge_base_path), db_config)
        vectorizer.connect_db()
        print("✓ RAG vectorizer initialized and connected to database")
        
        return vectorizer
    
    def retrieve_relevant_docs(self, query: str, top_k: int = 3) -> list:
        """
        Retrieve relevant documents from pgvector based on the query
        
        Args:
            query: User's question/query
            top_k: Number of relevant documents to retrieve
            
        Returns:
            List of relevant document contents
        """
        try:
            # Use pgvector similarity search to find relevant content
            results = self.vectorizer.search_similar(
                query=query,
                top_k=top_k,
                expand_query=True,  # Use intelligent query expansion
                min_similarity=0.3   # Only return reasonably relevant results
            )
            
            # Extract content from results and clean metadata prefixes
            relevant_docs = []
            for result in results:
                content = result['content']
                # Remove metadata prefix for cleaner context
                if '\n\n' in content:
                    content = content.split('\n\n', 1)[-1]
                
                # Add source information
                source_info = f"[Source: {result['file_path']} - Similarity: {result['similarity']:.2f}]"
                relevant_docs.append(f"{source_info}\n{content}")
            
            return relevant_docs
            
        except Exception as e:
            print(f"Error retrieving documents: {e}")
            return []
    
    def chat_with_rag(self, messages, history):
        # Step 1: Sanitize history
        safe_history = self.sanitize_history_for_model(history)
        
        # Step 2: Prepare user message
        user_content = messages
        if isinstance(messages, list):
            text_parts = [m.get("text", "") for m in messages if isinstance(m, dict) and m.get("type") == "text"]
            user_content = "\n".join(text_parts) if text_parts else messages
        elif isinstance(messages, dict):
            user_content = str(messages)
        
        new_messages = copy.deepcopy(safe_history)
        
        # Step 3: Retrieve relevant documents from pgvector
        print(f"🔍 Retrieving relevant documents for: {user_content}")
        relevant_docs = self.retrieve_relevant_docs(user_content, top_k=3)
        
        # Step 4: Augment user message with retrieved docs if found
        if relevant_docs:
            docs_content = "\n\n---\n\n".join(relevant_docs)
            augmented_message = (
                f"User Question: {user_content}\n\n"
                f"Relevant Context from Knowledge Base:\n{docs_content}\n\n"
                f"Please answer the user's question using the provided context. "
                f"If the context is relevant, use it to provide accurate information. "
                f"If the context is not relevant, answer based on your general knowledge."
            )
            print(f"✓ Found {len(relevant_docs)} relevant documents")
        else:
            augmented_message = user_content
            print("⚠️  No relevant documents found, using general knowledge")
        
        new_messages.append({"role": "user", "content": augmented_message})
        
        # Step 5: Get all tools (MCP + built-in)
        all_tools = self.convert_mcp_tools_to_openai_format() if self.mcp_tools else tools
        
        # Step 6: Call chat with tools
        responses = self.ai_service.client.chat.completions.create(
            model=self.ai_service.model,
            messages=new_messages,
            tools=all_tools,
            stream=True
        )
        
        partial = ""
        tool_call_data = {}

        for chunk in responses:
            delta = chunk.choices[0].delta

            # Stream text content
            if delta.content is not None:
                partial += delta.content
                yield [{"role": "assistant", "content": partial}]

            # Collect tool calls
            if delta.tool_calls:
                for tool_call in delta.tool_calls:
                    idx = tool_call.index
                    fn_name = tool_call.function.name
                    fn_args_part = tool_call.function.arguments

                    if idx not in tool_call_data:
                        tool_call_data[idx] = {"name": fn_name, "args": ""}

                    if fn_name:
                        tool_call_data[idx]["name"] = fn_name
                    if fn_args_part:
                        tool_call_data[idx]["args"] += fn_args_part

        # Execute collected tool calls
        for idx, tool in tool_call_data.items():
            fn_name = tool["name"]
            args_str = tool["args"]

            try:
                args = json.loads(args_str)
            except Exception as e:
                print("Failed to parse tool args:", args_str, e)
                args = {}

            # Check if it's an MCP tool
            if self.mcp_tools:
                mcp_tool_names = [t.name for t in self.mcp_tools]
                if fn_name in mcp_tool_names:
                    try:
                        if self.event_loop and self.event_loop.is_running():
                            future = asyncio.run_coroutine_threadsafe(
                                self.execute_mcp_tool(fn_name, args),
                                self.event_loop
                            )
                            tool_result = future.result(timeout=30)
                        else:
                            tool_result = asyncio.run(self.execute_mcp_tool(fn_name, args))
                        
                        if tool_result is None:
                            tool_result = "No response from MCP tool"
                    except Exception as e:
                        import traceback
                        tool_result = f"Error executing MCP tool: {str(e)}\n{traceback.format_exc()}"
                    
                    yield [{"role": "assistant", "content": tool_result}]
                    continue
            
            # Handle built-in tools
            if fn_name == "get_weather":
                # Use .get so missing or unparsable arguments do not raise a TypeError
                tool_result = self.weather_service.get_weather(args.get("location", ""))
                yield [{"role": "assistant", "content": tool_result}]

            elif fn_name == "generate_image":
                prompt = args.get("prompt", "")
                save_path = "generate_image.png"
                self.ai_service.generate_image(prompt, save_path)
                yield [{"role": "assistant", "content": f"Here is your image for: {prompt}"}]
                yield [{"role": "assistant", "content": {"path": save_path, "mime_type": "image/png"}}]
                history.append({"role": "assistant", "content": f"[Generated image for: {prompt}]"})

            else:
                yield [{"role": "assistant", "content": f"Unknown function call: {fn_name}"}]
    
    def render_ui_with_rag(self):
        """Render UI that uses RAG with pgvector."""
        chat_interface = gr.ChatInterface(
            fn=self.chat_with_rag, 
            type="messages",
            title="AI Chatbot with RAG (pgvector)",
            description="Ask questions and I'll search the knowledge base for relevant information!"
        )
        chat_interface.launch()

    async def run_with_rag_async(self):
        """Async wrapper to initialize MCP (if available) before launching UI"""
        try:
            await self.start_mcp_session()
            print("✓ MCP session started")
        except Exception as e:
            print(f"⚠️  MCP not available: {e}")
            print("Continuing with RAG only...")
        
        self.render_ui_with_rag()

    def run_with_rag(self):
        """Run chatbot with RAG and optional MCP tools."""
        loop = asyncio.get_event_loop()
        if loop.is_running():
            # Jupyter notebook case
            asyncio.ensure_future(self.run_with_rag_async())
        else:
            loop.run_until_complete(self.run_with_rag_async())
    
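    async def cleanup(self):
        """Cleanup resources: close the vectorizer, then the MCP session."""
        if self.vectorizer:
            self.vectorizer.close()
        # The parent cleanup is a coroutine, so it must be awaited
        await super().cleanup()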


In [29]:
# Initialize and run RAG chatbot
chat_bot_with_rag = ChatBotWithRAG()
chat_bot_with_rag.run_with_rag()


✓ Connected to PostgreSQL database
✓ RAG vectorizer initialized and connected to database


⚠️  MCP not available: unhandled errors in a TaskGroup (1 sub-exception)
Continuing with RAG only...
* Running on local URL:  http://127.0.0.1:7864
* To create a public link, set `share=True` in `launch()`.


🔍 Retrieving relevant documents for: Hello do you know Oven company?
Total embeddings in database: 11
Expanded query: 'Company information and details: Hello do you know Oven company?'
Generating embedding for query: 'Company information and details: Hello do you know Oven company?'
✓ Query embedding generated (dimension: 1536)
✓ Found 3 similar results
  Top result similarity: 0.7200 (distance: 0.2800)
  Unique entities: 3
✓ Found 3 relevant documents
🔍 Retrieving relevant documents for: Have you know members in Oven and which roles of them?
Total embeddings in database: 11
Expanded query: 'Company information and details: Have you know members in Oven and which roles of them?'
Generating embedding for query: 'Company information and details: Have you know members in Oven and which roles of them?'
🔍 Retrieving rele

## How RAG Integration Works

### Features:
1. **pgvector Integration**: Uses the `KnowledgeBaseVectorizer` from `embedded.py` to search relevant documents
2. **Semantic Search**: Automatically finds the most relevant information from your knowledge base using embeddings
3. **Context Augmentation**: Enriches user queries with relevant documents before sending to the AI model
4. **Intelligent Query Expansion**: Automatically expands queries for better semantic matching
5. **Similarity Filtering**: Only includes documents that meet the `min_similarity=0.3` threshold to ensure relevance
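
To make semantic search and similarity filtering concrete, here is a minimal in-memory sketch of top-k retrieval with a similarity floor. It is not the `KnowledgeBaseVectorizer` implementation, which is not shown in this notebook: the real search runs inside PostgreSQL with pgvector, whereas this version scores tiny hand-written vectors in Python. The file paths and vectors are made up, and treating the threshold as inclusive (`>=`) is an assumption. The logged output earlier (similarity 0.7200, distance 0.2800) is consistent with similarity being computed as 1 minus cosine distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_similar(query_vec, documents, top_k=3, min_similarity=0.3):
    """Return up to top_k documents whose similarity meets the threshold, best first."""
    scored = [
        {"file_path": path, "similarity": cosine_similarity(query_vec, vec)}
        for path, vec in documents
    ]
    kept = [d for d in scored if d["similarity"] >= min_similarity]
    kept.sort(key=lambda d: d["similarity"], reverse=True)
    return kept[:top_k]


# Toy 3-dimensional "embeddings"; real embeddings here have 1536 dimensions.
documents = [
    ("company/about.md", [1.0, 0.0, 0.0]),
    ("employees/long.md", [0.6, 0.8, 0.0]),
    ("products/misc.md", [0.0, 0.0, 1.0]),
]

results = search_similar([1.0, 0.0, 0.0], documents)
```

The third document is orthogonal to the query, so it falls below the threshold and is dropped even though `top_k` would allow three results.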

### Example Queries:
- "Who is a Backend Engineer?" → Searches employee profiles and returns relevant matches
- "Tell me about Long" → Retrieves Long's profile information
- "What skills does a Frontend Engineer need?" → Finds relevant employee profiles with those skills
- "What does Oven do?" → Searches company information

### How It Works:
1. User asks a question
2. System searches pgvector for relevant documents (top 3 results)
3. Augments the user's question with retrieved context
4. Sends augmented query to AI model
5. AI responds using both the context and its general knowledge
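
Steps 3 and 4 can be sketched as a small pure function that mirrors the prompt template used in `chat_with_rag`. The function name `build_augmented_message` and the sample document text are illustrative placeholders, not content from the knowledge base.

```python
def build_augmented_message(user_question, relevant_docs):
    """Prepend retrieved context to the question; pass it through if nothing was found."""
    if not relevant_docs:
        return user_question
    docs_content = "\n\n---\n\n".join(relevant_docs)
    return (
        f"User Question: {user_question}\n\n"
        f"Relevant Context from Knowledge Base:\n{docs_content}\n\n"
        "Please answer the user's question using the provided context. "
        "If the context is relevant, use it to provide accurate information. "
        "If the context is not relevant, answer based on your general knowledge."
    )


# Placeholder chunk in the same "[Source: ...]" shape the retriever produces.
docs = [
    "[Source: company/about.md - Similarity: 0.72]\nPlaceholder text standing in for a retrieved chunk.",
]

augmented = build_augmented_message("What does Oven do?", docs)
unchanged = build_augmented_message("Hello", [])
```

Keeping the fallback path (no documents, message unchanged) means greetings and off-topic questions are not padded with irrelevant context.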

### Requirements:
- PostgreSQL with pgvector extension installed
- Embeddings already generated (run `python src/embedded.py` first)
- Environment variables set in `.env` file (DB credentials, OpenAI API key)
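
A quick pre-flight check can report which variables are missing before any service is started. This is a sketch only: the variable names are the ones read elsewhere in this notebook, but treating all of them as required is an assumption, since the database settings fall back to defaults in `_init_vectorizer`. The helper name `missing_env_vars` is hypothetical.

```python
import os

# Names read by this notebook; the DB_* values have defaults in the code,
# so listing them as required here is a stricter, assumed policy.
REQUIRED_VARS = ["OPEN_AI_KEY", "DB_HOST", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASSWORD"]


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names from `required` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# Example with an explicit mapping instead of the real process environment.
example_env = {"OPEN_AI_KEY": "sk-placeholder", "DB_HOST": "localhost", "DB_PORT": "5432"}
missing = missing_env_vars(environ=example_env)
```

Calling `missing_env_vars()` with no arguments after `load_dotenv()` checks the real environment instead of the example mapping.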
