# Extend model capabilities
## Introduction
* This notebook explores how to extend the capabilities of AI models by integrating external tools and APIs.
* Approaches to extending a model include:
  - Tool Use: Integrating external tools to enhance model functionality.
  - API Calls: Leveraging APIs to fetch real-time data or perform specific tasks.
  - Model Context Protocol (MCP): Using a standardized protocol to connect models to external tools and data sources.
  - Retrieval-Augmented Generation (RAG): Combining model outputs with retrieved information from external sources.
  - Fine-tuning: Customizing models on specific datasets to improve performance on targeted tasks.

## Technologies
* OpenAI: the OpenAI API is called to generate model responses.
* Gradio: used to build the user interface for interacting with AI models.

In [6]:
from openai import OpenAI
import gradio as gr
import os
import copy
import json

In [7]:
from dotenv import load_dotenv
load_dotenv()

class EnvService():
    def get_open_ai_key(self):
        open_ai_key = os.getenv("OPEN_AI_KEY")
        if not open_ai_key:
            print("OPEN AI KEY IS NOT SET!!!")
            return 
        return open_ai_key
    def get_weather_api_key(self):
        weather_api_key = os.getenv("WEATHER_API_KEY")
        if not weather_api_key:
            print("WEATHER_API_KEY IS NOT SET!!!")
            return 
        return weather_api_key
    def get_gemini_api_key(self):
        gemini_ai_key = os.getenv("GEMINI_AI_KEY")
        if not gemini_ai_key:
            print("GEMINI_AI_KEY IS NOT SET!!!")
            return
        return gemini_ai_key
env_service = EnvService()
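
The getters above print a warning and return `None` when a key is missing, so the failure only shows up later, when a client is used. As a minimal sketch of a fail-fast alternative, assuming a hypothetical helper `require_env` and exception `MissingEnvVarError` (neither is part of the notebook's `EnvService`):

```python
import os


class MissingEnvVarError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name: str) -> str:
    # Hypothetical helper: raise immediately instead of returning None,
    # so a missing key is reported where it is read.
    value = os.getenv(name)
    if not value:
        raise MissingEnvVarError(f"Environment variable {name} is not set")
    return value
```

With this variant, a call such as `require_env("OPEN_AI_KEY")` would stop the notebook at the configuration step rather than at the first API request.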

In [8]:
class AIService:
    model = "gpt-4.1"
    def __init__(self):
        self.init_client()
        
    def init_client(self):
        self.client = OpenAI(api_key=env_service.get_open_ai_key())

    def chat(self, messages):
        responses = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True
        )
        return responses
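
Because `chat` is called with `stream=True`, the caller receives an iterable of chunks rather than one finished message. The sketch below shows one way to consume such a stream without calling the API: `fake_chunk` is a made-up stand-in that models only the `choices[0].delta.content` field the notebook reads, and `collect_stream_text` is a hypothetical helper name.

```python
from types import SimpleNamespace


def collect_stream_text(chunks):
    """Yield the growing reply text as each streamed chunk arrives."""
    partial = ""
    for chunk in chunks:
        if not chunk.choices:
            continue  # guard against chunks that carry no choices
        delta = chunk.choices[0].delta
        if delta.content is not None:
            partial += delta.content
            yield partial


def fake_chunk(text):
    # Stand-in for a streamed chunk; only the fields used above are modelled.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo")]
snapshots = list(collect_stream_text(fake_stream))
print(snapshots)  # ['Hel', 'Hello']
```

Each yielded value is the full text so far, which is the shape a streaming chat UI typically needs to redraw the reply.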

In [9]:
class ChatBot:
    def __init__(self):
        self.ai_service = AIService()
    def chat(self, messages, history):
        new_messages = copy.deepcopy(history)
        new_messages.append({"role": "user", "content": messages})
    
        responses = self.ai_service.chat(new_messages)
    
        partial = ""
        for chunk in responses:
            delta = chunk.choices[0].delta
            if delta.content is not None:
                partial += delta.content
                yield [
                    {"role": "assistant", "content": partial}
                ]

    def render_ui(self):
        chat_interface = gr.ChatInterface(fn=self.chat, type="messages")
        chat_interface.launch()
    def run(self):
        self.render_ui()

In [10]:
chat_bot = ChatBot()
chat_bot.run()

* Running on local URL:  http://127.0.0.1:7861
* To create a public link, set `share=True` in `launch()`.


# Chatbot with Tools

## Introduction
* Tools let the chatbot create images and audio.
* Tools let the chatbot report the current weather at a given location.

In [11]:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "Location name, e.g. London"},
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "generate_image",
            "description": "Generate an image based on a text prompt",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {"type": "string", "description": "What to generate"},
                },
                "required": ["prompt"],
            },
        },
    },
]
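
The schema above only describes the tools to the model; the notebook still has to map each returned function name to Python code. A minimal sketch of that mapping as a dispatch table, assuming placeholder handlers (`fake_get_weather`, `fake_generate_image`) and a hypothetical `dispatch_tool` helper rather than the notebook's real services:

```python
import json


def fake_get_weather(location: str) -> str:
    # Placeholder; the notebook's real implementation calls a weather API.
    return f"(weather for {location})"


def fake_generate_image(prompt: str) -> str:
    # Placeholder; the notebook's real implementation calls an image model.
    return f"(image for {prompt})"


# Keys match the "name" fields declared in the tools schema.
TOOL_HANDLERS = {
    "get_weather": fake_get_weather,
    "generate_image": fake_generate_image,
}


def dispatch_tool(name: str, arguments_json: str) -> str:
    """Look up a handler by tool name and call it with the parsed arguments."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Unknown function call: {name}"
    try:
        args = json.loads(arguments_json or "{}")
    except json.JSONDecodeError as exc:
        return f"Invalid arguments for {name}: {exc}"
    if not isinstance(args, dict):
        return f"Invalid arguments for {name}: expected a JSON object"
    try:
        return handler(**args)
    except TypeError as exc:
        # Raised when required parameters are missing or unexpected ones appear.
        return f"Bad arguments for {name}: {exc}"


print(dispatch_tool("get_weather", '{"location": "London"}'))  # (weather for London)
```

A table like this keeps the tool names in one place, so adding a tool means adding one schema entry and one handler.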


In [12]:
import requests

class WeatherService:
    def __init__(self):
        self.api_key = env_service.get_weather_api_key()
        self.api_url = "https://api.openweathermap.org/data/2.5/weather"

    def get_weather(self, location: str) -> str:
        params = {
            "q": location,
            "appid": self.api_key,
            "units": "metric"  # return °C instead of Kelvin
        }
        try:
            # A timeout keeps the chat from hanging if the weather API does not respond
            response = requests.get(self.api_url, params=params, timeout=10)
            data = response.json()

            if response.status_code != 200:
                return f"Error: {data.get('message', 'Unable to fetch weather')}"

            # Parse relevant info
            city = data.get("name", location)
            country = data.get("sys", {}).get("country", "")
            weather_main = data["weather"][0]["main"]
            weather_desc = data["weather"][0]["description"]
            temp = data["main"]["temp"]
            feels_like = data["main"]["feels_like"]
            humidity = data["main"]["humidity"]
            wind_speed = data["wind"]["speed"]

            return (
                f"Weather in {city}, {country}:\n"
                f"- Condition: {weather_main} ({weather_desc})\n"
                f"- Temperature: {temp:.1f}°C (feels like {feels_like:.1f}°C)\n"
                f"- Humidity: {humidity}%\n"
                f"- Wind speed: {wind_speed} m/s"
            )
        except Exception as e:
            return f"Error fetching weather: {e}"
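
The parsing step in `get_weather` can be tried without a network call or API key. The sketch below pulls the formatting into a hypothetical function `format_weather` and runs it on a hand-written sample dictionary. The sample values are made up, and the dictionary contains only the fields the notebook reads, so it should not be taken as a complete API response.

```python
def format_weather(data: dict, fallback_city: str = "") -> str:
    """Format the fields WeatherService reads into a short text report."""
    conditions = data.get("weather") or [{}]
    main = data.get("main", {})
    wind = data.get("wind", {})
    city = data.get("name", fallback_city)
    country = data.get("sys", {}).get("country", "")
    return (
        f"Weather in {city}, {country}:\n"
        f"- Condition: {conditions[0].get('main', 'n/a')} "
        f"({conditions[0].get('description', 'n/a')})\n"
        f"- Temperature: {main.get('temp', 'n/a')}°C "
        f"(feels like {main.get('feels_like', 'n/a')}°C)\n"
        f"- Humidity: {main.get('humidity', 'n/a')}%\n"
        f"- Wind speed: {wind.get('speed', 'n/a')} m/s"
    )


# Made-up sample data for illustration only.
sample = {
    "name": "London",
    "sys": {"country": "GB"},
    "weather": [{"main": "Clouds", "description": "overcast clouds"}],
    "main": {"temp": 12.3, "feels_like": 11.0, "humidity": 80},
    "wind": {"speed": 4.1},
}
print(format_weather(sample))
```

Unlike the notebook's version, this sketch uses `.get` with an `n/a` default everywhere, so a response with missing fields produces a partial report instead of a `KeyError`.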


In [13]:
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

class AIServiceWithTools(AIService):
    model_gen_image = 'gemini-2.0-flash-preview-image-generation'

    def __init__(self):
        super().__init__()
        self.tools = tools
        self.client_gemini = genai.Client(
            api_key=env_service.get_gemini_api_key(),
        )
    def chat_with_tools(self, messages):
        responses = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            tools=self.tools,
            stream=True
        )
        return responses
    def generate_image(self, prompt, save_path):
        response = self.client_gemini.models.generate_content(
            model=self.model_gen_image,  # reuse the class attribute instead of repeating the name
            contents=prompt,
            config=types.GenerateContentConfig(
                response_modalities=['TEXT', 'IMAGE']
            )
        )

        for part in response.candidates[0].content.parts:
            if part.text is not None:
                print(part.text)
            elif part.inline_data is not None:
                image = Image.open(BytesIO((part.inline_data.data)))
                image.save(save_path)
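
`chat_with_tools` lets the model request a tool, but the notebook shows the tool output to the user directly and does not send it back to the model. If the model should phrase the final answer itself, the Chat Completions message format has a place for that: an assistant message carrying the tool call, followed by a `tool` message with the matching `tool_call_id`. The sketch below only builds that message list. `build_tool_followup` is a hypothetical helper, and the id `call_demo_1` is made up; real ids arrive with the model's tool call.

```python
import json


def build_tool_followup(messages, tool_call_id, name, arguments_json, result):
    """Return a new message list that includes the tool call and its result."""
    followup = list(messages)  # shallow copy; the caller's list is not modified
    followup.append({
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": tool_call_id,
            "type": "function",
            "function": {"name": name, "arguments": arguments_json},
        }],
    })
    followup.append({
        "role": "tool",
        "tool_call_id": tool_call_id,  # must match the id in the assistant message
        "content": result,
    })
    return followup


history = [{"role": "user", "content": "What is the weather in London?"}]
followup = build_tool_followup(
    history,
    tool_call_id="call_demo_1",  # made-up id for illustration
    name="get_weather",
    arguments_json=json.dumps({"location": "London"}),
    result="(weather report text)",
)
print([m["role"] for m in followup])  # ['user', 'assistant', 'tool']
```

The resulting list could then be passed to a second `chat.completions.create` call so the model can turn the raw tool output into a conversational reply.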


In [33]:
import json
import copy
import gradio as gr

class ChatBotWithTools(ChatBot):
    def __init__(self):
        super().__init__()
        self.ai_service = AIServiceWithTools()
        self.weather_service = WeatherService()

    # --- Utility: clean history before sending to model ---
    def sanitize_history_for_model(self, history):
        cleaned = []
        for msg in history:
            role = msg.get("role")
            content = msg.get("content")

            # Skip image dicts from Gradio
            if isinstance(content, dict):
                img_desc = content.get("path", "")
                if img_desc:
                    cleaned.append({
                        "role": role,
                        "content": f"[Image shown to user: {img_desc}]"
                    })
                continue

            # Extract text from multimodal lists if any
            if isinstance(content, list):
                text_parts = []
                for c in content:
                    if isinstance(c, dict) and c.get("type") == "text":
                        text_parts.append(c.get("text", ""))
                    elif isinstance(c, str):
                        text_parts.append(c)
                content = "\n".join(text_parts)

            # Convert tuples to string
            if isinstance(content, tuple):
                content = " ".join(map(str, content))

            # Skip empty messages
            if not content or not isinstance(content, str):
                continue

            cleaned.append({"role": role, "content": content})
        return cleaned


    # --- Main chat handler ---
    def chat_with_tools(self, messages, history):
        # Step 1: sanitize previous conversation before sending to model
        safe_history = self.sanitize_history_for_model(history)

        # Step 2: append current user message (handle multimodal format from Gradio)
        user_content = messages
        if isinstance(messages, list):
            # Extract text from multimodal message format
            text_parts = [m.get("text", "") for m in messages if isinstance(m, dict) and m.get("type") == "text"]
            # Fall back to a string so the message content is never a raw list
            user_content = "\n".join(text_parts) if text_parts else str(messages)
        elif isinstance(messages, dict):
            # Handle dict format (e.g., file uploads)
            user_content = str(messages)
        
        new_messages = copy.deepcopy(safe_history)
        new_messages.append({"role": "user", "content": user_content})

        # Step 3: stream model responses
        print("Sending to model...", new_messages)
        responses = self.ai_service.chat_with_tools(new_messages)

        partial = ""
        tool_call_data = {}

        for chunk in responses:
            delta = chunk.choices[0].delta

            # --- Stream normal text ---
            if delta.content is not None:
                partial += delta.content
                yield [{"role": "assistant", "content": partial}]

            # --- Collect streamed tool calls ---
            if delta.tool_calls:
                for tool_call in delta.tool_calls:
                    idx = tool_call.index
                    fn_name = tool_call.function.name
                    fn_args_part = tool_call.function.arguments

                    if idx not in tool_call_data:
                        tool_call_data[idx] = {"name": fn_name, "args": ""}

                    if fn_name:
                        tool_call_data[idx]["name"] = fn_name
                    if fn_args_part:
                        tool_call_data[idx]["args"] += fn_args_part

        # Step 4: Execute tools after streaming completes
        for idx, tool in tool_call_data.items():
            fn_name = tool["name"]
            args_str = tool["args"]

            try:
                args = json.loads(args_str)
            except Exception as e:
                print("Failed to parse tool args:", args_str, e)
                args = {}

            # --- Weather tool ---
            if fn_name == "get_weather":
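                # Use .get so an empty args dict (after a parse failure) does not raise TypeError
                tool_result = self.weather_service.get_weather(args.get("location", ""))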
                yield [{"role": "assistant", "content": tool_result}]

            # --- Image generation tool ---
            elif fn_name == "generate_image":
                prompt = args.get("prompt", "")
                save_path = "generate_image.png"
                self.ai_service.generate_image(prompt, save_path)

                # Step 4a/4b: Show the caption and the image together. Each yield
                # replaces the previous response, so two separate yields would
                # leave only the image visible.
                yield [
                    {"role": "assistant", "content": f"Here is your image for: {prompt}"},
                    {"role": "assistant", "content": {"path": save_path, "mime_type": "image/png"}},
                ]

                # Step 4c: Add text reference so model can remember next time
                history.append({"role": "assistant", "content": f"[Generated image for: {prompt}]"})

            else:
                yield [{"role": "assistant", "content": f"Unknown function call: {fn_name}"}]

    # --- Gradio UI setup ---
    def render_ui(self):
        chat_interface = gr.ChatInterface(fn=self.chat_with_tools, type="messages")
        chat_interface.launch()

    def run(self):
        self.render_ui()
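
When streaming, a tool call arrives in fragments: the function name usually comes first, and the JSON arguments are split across several chunks that share the same `index`. The loop in `chat_with_tools` stitches these together. The sketch below isolates that step so it can be run offline. `fake_tool_chunk` is a made-up stand-in that models only the fields being read, and `accumulate_tool_calls` is a hypothetical helper that also records the call `id`, which the notebook's loop does not collect.

```python
import json
from types import SimpleNamespace


def accumulate_tool_calls(chunks):
    """Merge streamed tool-call fragments into one entry per call index."""
    calls = {}
    for chunk in chunks:
        if not chunk.choices:
            continue
        for tc in chunk.choices[0].delta.tool_calls or []:
            entry = calls.setdefault(tc.index, {"id": None, "name": None, "args": ""})
            if tc.id:
                entry["id"] = tc.id
            if tc.function and tc.function.name:
                entry["name"] = tc.function.name
            if tc.function and tc.function.arguments:
                entry["args"] += tc.function.arguments  # arguments arrive as string pieces
    return calls


def fake_tool_chunk(index, call_id=None, name=None, arguments=None):
    # Stand-in for a streamed chunk; only the fields used above are modelled.
    function = SimpleNamespace(name=name, arguments=arguments)
    tool_call = SimpleNamespace(index=index, id=call_id, function=function)
    delta = SimpleNamespace(content=None, tool_calls=[tool_call])
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


stream = [
    fake_tool_chunk(0, call_id="call_demo_1", name="get_weather", arguments=""),
    fake_tool_chunk(0, arguments='{"loca'),
    fake_tool_chunk(0, arguments='tion": "London"}'),
]
calls = accumulate_tool_calls(stream)
print(calls[0]["name"], json.loads(calls[0]["args"]))  # get_weather {'location': 'London'}
```

The arguments string is only valid JSON once the stream has finished, which is why parsing is deferred until after the loop.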


In [32]:
chat_bot_with_tools = ChatBotWithTools()
chat_bot_with_tools.run()

* Running on local URL:  http://127.0.0.1:7870
* To create a public link, set `share=True` in `launch()`.


Sending to model... [{'role': 'user', 'content': 'create a picture beautiful cat'}]
I will generate an image of a stunning feline with a luxurious coat that catches the light, captivatingly luminous eyes that convey intelligence and curiosity, and positioned in an elegant and serene posture.

Cleaned message: user create a picture beautiful cat
Cleaned message: assistant /tmp/gradio/1c7ad51123a025caa10580b870d6347a779d8534d90ae00a7edd6d8e106aeb4b/generate_image.png
Sending to model... [{'role': 'user', 'content': 'create a picture beautiful cat'}, {'role': 'assistant', 'content': '/tmp/gradio/1c7ad51123a025caa10580b870d6347a779d8534d90ae00a7edd6d8e106aeb4b/generate_image.png'}, {'role': 'user', 'content': 'nice'}]
Cleaned message: user create a picture beautiful cat
Cleaned message: assistant /tmp/gradio/1c7ad51123a025caa10580b870d6347a779d8534d90ae00a7edd6d8e106aeb4b/generate_image.png
Cleaned message: user nice
Cleaned message: assistant I'm glad you liked it! If you'd like to see mo