# Lesson 3: Chatbot Example

In this lesson, you will familiarize yourself with the chatbot example you will work on during this course. The example includes the tool definitions and execution, as well as the chatbot code. Make sure to interact with the chatbot at the end of this notebook.

## Import Libraries

In [9]:
import arxiv
import json
import os
from typing import List
from dotenv import load_dotenv
import anthropic

## Tool Functions

In [10]:
PAPER_DIR = "papers"

The first tool searches arXiv for papers relevant to a topic and stores each paper's metadata (title, authors, summary, PDF URL, and publication date) in a JSON file. The JSON files are organized by topic in the `papers` directory. The tool does not download the papers themselves.

In [11]:
from typing import List

def search_papers(topic: str, max_results: int = 5) -> List[str]:
    """
    Search for papers on arXiv based on a topic and store their information.
    
    Args:
        topic: The topic to search for
        max_results: Maximum number of results to retrieve (default: 5)
        
    Returns:
        List of paper IDs found in the search
    """
    
    # Use arxiv to find the papers 
    client = arxiv.Client()

    # Search for the most relevant articles matching the queried topic
    search = arxiv.Search(
        query = topic,
        max_results = max_results,
        sort_by = arxiv.SortCriterion.Relevance
    )

    papers = client.results(search)
    
    # Create directory for this topic
    path = os.path.join(PAPER_DIR, topic.lower().replace(" ", "_"))
    os.makedirs(path, exist_ok=True)
    
    file_path = os.path.join(path, "papers_info.json")

    # Try to load existing papers info
    try:
        with open(file_path, "r") as json_file:
            papers_info = json.load(json_file)
    except (FileNotFoundError, json.JSONDecodeError):
        papers_info = {}

    # Process each paper and add to papers_info  
    paper_ids = []
    for paper in papers:
        paper_ids.append(paper.get_short_id())
        paper_info = {
            'title': paper.title,
            'authors': [author.name for author in paper.authors],
            'summary': paper.summary,
            'pdf_url': paper.pdf_url,
            'published': str(paper.published.date())
        }
        papers_info[paper.get_short_id()] = paper_info
    
    # Save updated papers_info to json file
    with open(file_path, "w") as json_file:
        json.dump(papers_info, json_file, indent=2)
    
    print(f"Results are saved in: {file_path}")
    
    return paper_ids

In [12]:
search_papers("RAG pipelines")

Results are saved in: papers\rag_pipelines\papers_info.json


['2506.09542v1',
 '2501.15228v1',
 '2504.13587v1',
 '2409.19019v1',
 '2506.08364v2']
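As a quick check of the storage layout described above, the sketch below rebuilds the file path for a topic the same way `search_papers` does and lists the stored titles. The helper names `topic_to_file` and `list_saved_titles` are hypothetical and not part of the lesson code, and the demo writes made-up sample data into a temporary directory so that it runs without calling arXiv.

```python
import json
import os
import tempfile


def topic_to_file(topic: str, paper_dir: str = "papers") -> str:
    """Return the papers_info.json path that search_papers uses for a topic."""
    return os.path.join(paper_dir, topic.lower().replace(" ", "_"), "papers_info.json")


def list_saved_titles(topic: str, paper_dir: str = "papers") -> dict:
    """Map each saved paper ID to its title; return an empty dict if nothing is saved."""
    try:
        with open(topic_to_file(topic, paper_dir), "r") as json_file:
            papers_info = json.load(json_file)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return {paper_id: info["title"] for paper_id, info in papers_info.items()}


# Demo with made-up sample data in a temporary directory (no network access).
demo_dir = tempfile.mkdtemp()
demo_file = topic_to_file("RAG pipelines", demo_dir)
os.makedirs(os.path.dirname(demo_file), exist_ok=True)
with open(demo_file, "w") as json_file:
    json.dump({"0000.00000v1": {"title": "Sample paper"}}, json_file)

print(list_saved_titles("RAG pipelines", demo_dir))
```

Pointing `paper_dir` at the real `papers` directory would list whatever `search_papers` has saved for that topic.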

The second tool looks for information about a specific paper across all topic directories inside the `papers` directory.

In [13]:
def extract_info(paper_id: str) -> str:
    """
    Search for information about a specific paper across all topic directories.
    
    Args:
        paper_id: The ID of the paper to look for
        
    Returns:
        JSON string with paper information if found, error message if not found
    """
 
    for item in os.listdir(PAPER_DIR):
        item_path = os.path.join(PAPER_DIR, item)
        if os.path.isdir(item_path):
            file_path = os.path.join(item_path, "papers_info.json")
            if os.path.isfile(file_path):
                try:
                    with open(file_path, "r") as json_file:
                        papers_info = json.load(json_file)
                        if paper_id in papers_info:
                            return json.dumps(papers_info[paper_id], indent=2)
                except (FileNotFoundError, json.JSONDecodeError) as e:
                    print(f"Error reading {file_path}: {str(e)}")
                    continue
    
    return f"There's no saved information related to paper {paper_id}."

In [14]:
extract_info('2506.09542v1')

'{\n  "title": "KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs",\n  "authors": [\n    "Dingjun Wu",\n    "Yukun Yan",\n    "Zhenghao Liu",\n    "Zhiyuan Liu",\n    "Maosong Sun"\n  ],\n  "summary": "Retrieval-Augmented Generation (RAG) improves factual accuracy by grounding\\nresponses in external knowledge. However, existing methods typically rely on a\\nsingle source, either unstructured text or structured knowledge. Moreover, they\\nlack cognitively inspired mechanisms for activating relevant knowledge. To\\naddress these issues, we propose KG-Infused RAG, a framework that integrates\\nKGs into RAG systems to implement spreading activation, a cognitive process\\nthat enables concept association and inference. KG-Infused RAG retrieves KG\\nfacts, expands the query accordingly, and enhances generation by combining\\ncorpus passages with structured facts, enabling interpretable, multi-source\\nretrieval grounded in semantic structure. We further improve KG-I

## Tool Schema

Here is the schema for each tool, which you will provide to the LLM.

In [16]:
import ollama
ollama.list().models

[Model(model='gemma3:4b', modified_at=datetime.datetime(2025, 6, 18, 0, 35, 9, 741285, tzinfo=TzInfo(+05:30)), digest='a2af6cc3eb7fa8be8504abaf9b04e88f17a119ec3f04a3addf55f92841195f5a', size=3338801804, details=ModelDetails(parent_model='', format='gguf', family='gemma3', families=['gemma3'], parameter_size='4.3B', quantization_level='Q4_K_M')),
 Model(model='qwen3:1.7b', modified_at=datetime.datetime(2025, 6, 13, 0, 0, 55, 230752, tzinfo=TzInfo(+05:30)), digest='8f68893c685c3ddff2aa3fffce2aa60a30bb2da65ca488b61fff134a4d1730e7', size=1359293444, details=ModelDetails(parent_model='', format='gguf', family='qwen3', families=['qwen3'], parameter_size='2.0B', quantization_level='Q4_K_M')),
 Model(model='gemma3:1b', modified_at=datetime.datetime(2025, 5, 10, 11, 10, 25, 297843, tzinfo=TzInfo(+05:30)), digest='8648f39daa8fbf5b18c7b4e6a8fb4990c692751d49917417b8842ca5758e7ffc', size=815319791, details=ModelDetails(parent_model='', format='gguf', family='gemma3', families=['gemma3'], parameter_

In [17]:
import ollama

response = ollama.chat(
    model='qwen3:1.7b',
    messages=[{'role': 'user', 'content': 'What is the weather in Toronto?'}],
    # Provide a weather-checking tool to the model
    tools=[{
        'type': 'function',
        'function': {
            'name': 'get_current_weather',
            'description': 'Get the current weather for a city',
            'parameters': {
                'type': 'object',
                'properties': {
                    'city': {
                        'type': 'string',
                        'description': 'The name of the city',
                    },
                },
                'required': ['city'],
            },
        },
    }],
)

print(response['message']['tool_calls'])

[ToolCall(function=Function(name='get_current_weather', arguments={'city': 'Toronto'}))]


In [18]:
tools = [
    {
        "name": "search_papers",
        "description": "This tool search for papers on arXiv based on a topic and store their information.",
        "input_schema": {
            "type": "object",
            "properties": {
                "topic": {
                    "type": "string",
                    "description": "The topic to search for"
                }, 
                "max_results": {
                    "type": "integer",
                    "description": "Maximum number of results to retrieve",
                    "default": 5
                }
            },
            "required": ["topic"]
        }
    },
    {
        "name": "extract_info",
        "description": "This tool search for information about a specific paper across all topic directories.",
        "input_schema": {
            "type": "object",
            "properties": {
                "paper_id": {
                    "type": "string",
                    "description": "The ID of the paper to look for"
                }
            },
            "required": ["paper_id"]
        }
    },
    {
        'type': 'function',
        'function': {
            'name': 'get_current_weather',
            'description': 'Get the current weather for a city',
            'parameters': {
                'type': 'object',
                'properties': {
                    'city': {
                        'type': 'string',
                        'description': 'The name of the city',
                    },
                },
                'required': ['city'],
            },
        },
    }
]
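The list above mixes two schema layouts: the first two entries put `name`, `description`, and `input_schema` at the top level (the layout described in the tool-use guide linked under Resources), while the weather entry wraps the same information in `type` / `function` / `parameters`, as in the Ollama example earlier in this notebook. The sketch below shows one way to rewrite the former into the latter. The helper name `to_function_style` is hypothetical, and the mapping between the two layouts is an assumption based only on the examples in this notebook.

```python
from typing import Any, Dict, List


def to_function_style(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite a name/input_schema tool entry into the type/function layout."""
    if tool.get("type") == "function":
        # Already in the wrapped layout; return it unchanged.
        return tool
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["input_schema"],
        },
    }


# Sample entry that mirrors the extract_info schema above.
sample_tools: List[Dict[str, Any]] = [
    {
        "name": "extract_info",
        "description": "Look up a saved paper by ID.",
        "input_schema": {
            "type": "object",
            "properties": {"paper_id": {"type": "string"}},
            "required": ["paper_id"],
        },
    }
]

converted = [to_function_style(tool) for tool in sample_tools]
print(converted[0]["function"]["name"])
```

Keeping one layout per client avoids passing a schema that the receiving API cannot interpret.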

## Tool Mapping

This code handles tool mapping and execution.

In [None]:
mapping_tool_function = {
    "search_papers": search_papers,
    "extract_info": extract_info
}

def execute_tool(tool_name, tool_args):
    """Run the tool registered under tool_name and return its result as a string."""
    result = mapping_tool_function[tool_name](**tool_args)

    if result is None:
        result = "The operation completed but didn't return any results."
        
    elif isinstance(result, list):
        result = ', '.join(result)
        
    elif isinstance(result, dict):
        # Convert dictionaries to formatted JSON strings
        result = json.dumps(result, indent=2)
    
    else:
        # For any other type, convert using str()
        result = str(result)
    return result
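Note that `execute_tool` raises a `KeyError` if the model requests a tool that is not in `mapping_tool_function`; for instance, `get_current_weather` appears in the `tools` list but has no entry in the mapping. The sketch below is one possible variant that returns an error string instead, so the message can be passed back to the model. The names `safe_execute_tool`, `fake_search`, and `demo_registry` are hypothetical and exist only for this illustration.

```python
import json
from typing import Any, Callable, Dict


def safe_execute_tool(registry: Dict[str, Callable[..., Any]],
                      tool_name: str,
                      tool_args: Dict[str, Any]) -> str:
    """Run a registered tool and always return a string, even on failure."""
    tool = registry.get(tool_name)
    if tool is None:
        return f"Unknown tool: {tool_name}"
    try:
        result = tool(**tool_args)
    except TypeError as exc:
        # Typically raised for missing or unexpected keyword arguments.
        return f"Invalid arguments for {tool_name}: {exc}"
    if result is None:
        return "The operation completed but didn't return any results."
    if isinstance(result, list):
        return ", ".join(str(item) for item in result)
    if isinstance(result, dict):
        return json.dumps(result, indent=2)
    return str(result)


def fake_search(topic: str, max_results: int = 5) -> list:
    """Stand-in for search_papers that makes no network calls."""
    return [f"{topic}-{i}" for i in range(max_results)]


demo_registry = {"search_papers": fake_search}
print(safe_execute_tool(demo_registry, "search_papers", {"topic": "rag", "max_results": 2}))
print(safe_execute_tool(demo_registry, "get_current_weather", {"city": "Toronto"}))
```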

In [None]:
tools

[{'name': 'search_papers',
  'description': 'This tool search for papers on arXiv based on a topic and store their information.',
  'input_schema': {'type': 'object',
   'properties': {'topic': {'type': 'string',
     'description': 'The topic to search for'},
    'max_results': {'type': 'integer',
     'description': 'Maximum number of results to retrieve',
     'default': 5}},
   'required': ['topic']}},
 {'name': 'extract_info',
  'description': 'This tool search for information about a specific paper across all topic directories.',
  'input_schema': {'type': 'object',
   'properties': {'paper_id': {'type': 'string',
     'description': 'The ID of the paper to look for'}},
   'required': ['paper_id']}},
 {'type': 'function',
  'function': {'name': 'get_current_weather',
   'description': 'Get the current weather for a city',
   'parameters': {'type': 'object',
    'properties': {'city': {'type': 'string',
      'description': 'The name of the city'}},
    'required': ['city']}}}]

## Chatbot Code

The chatbot handles the user's queries one by one, but it does not persist memory across the queries.
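To make "does not persist memory" concrete: each query starts from a fresh `messages` list, so the model never sees earlier turns. The sketch below contrasts that with a session that keeps the list between queries. `fake_model` is a stand-in for a real LLM call, and `stateless_query` and `ChatSession` are hypothetical names, not part of the lesson code.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def fake_model(messages: List[Message]) -> str:
    """Stand-in for an LLM call: reports how many messages it was given."""
    return f"I can see {len(messages)} message(s)."


def stateless_query(query: str, model: Callable[[List[Message]], str] = fake_model) -> str:
    """Build a new message list for every query, as the chatbot in this lesson does."""
    messages = [{"role": "user", "content": query}]
    return model(messages)


class ChatSession:
    """Keep the message list between queries so earlier turns stay visible."""

    def __init__(self, model: Callable[[List[Message]], str] = fake_model) -> None:
        self.model = model
        self.messages: List[Message] = []

    def ask(self, query: str) -> str:
        self.messages.append({"role": "user", "content": query})
        reply = self.model(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


print(stateless_query("Hi"))
print(stateless_query("What did I just say?"))

session = ChatSession()
print(session.ask("Hi"))
print(session.ask("What did I just say?"))
```

In the stateless case the stand-in model always receives a single message, while the session's second call receives three (two user turns and one assistant turn).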

In [8]:
import ollama 
from ollama import Client

client = Client()

response = ollama.chat(
    model='qwen3:1.7b',
    messages=[{'role': 'user', 'content': 'what is the weather in bangalore'}],
    # Provide a weather-checking tool to the model
    tools=tools,
)

In [None]:
print(response.message.content)

<think>
Okay, the user wants to search for a paper on mercury from Arxiv. Let me think about how to approach this.

First, I need to recall the available tools. The user provided two empty functions, but maybe there's a way to use an external service. Since the functions are empty, perhaps I should use a web search or a specific API.

Wait, the user mentioned Arxiv, so maybe the assistant can use a web search function. But the tools given are empty. Hmm. The assistant might not have the capability to perform web searches directly. Alternatively, maybe the assistant is supposed to use a predefined function for searching Arxiv.

Wait, the tools section shows two functions with empty names and descriptions. That's confusing. Maybe the assistant is supposed to generate a search query based on the user's input. Since the user is asking for papers on mercury from Arxiv, the assistant could generate a search term like "mercury" and use a web search function.

But the assistant's tools are emp

In [69]:
for content in response.message:
    print(content)

('role', 'assistant')
('content', '<think>\nOkay, the user wants to search for a paper on mercury from Arxiv. Let me think about how to approach this.\n\nFirst, I need to recall if there are any specific functions available for accessing Arxiv. The user provided some tools, but looking at the functions, they\'re both empty. Wait, maybe the functions are placeholders? The description says "search for paper on the topic mercury from arxiv" but the parameters are empty. \n\nSince the tools are empty, I can\'t use any functions. But the user is asking for a search. Maybe I should inform them that the current setup doesn\'t support that functionality. Alternatively, perhaps there\'s a way to simulate a search using the available tools, even if they\'re empty. But since the functions are empty, I can\'t perform any action. \n\nI should check if there\'s any other way. Maybe the assistant can directly provide information, but the tools are not set up for that. Since the user is asking for a s

In [43]:
for content in response:
    print(content)

('model', 'qwen3:1.7b')
('created_at', '2025-07-16T17:06:27.7576121Z')
('done', True)
('done_reason', 'stop')
('total_duration', 9947200900)
('load_duration', 118638900)
('prompt_eval_count', 154)
('prompt_eval_duration', 344793200)
('eval_count', 278)
('eval_duration', 9481054300)
('message', Message(role='assistant', content="<think>\nOkay, the user is asking for the latest weather in Bangalore. Let me think about how to approach this.\n\nFirst, I need to check if there's a function available to get weather data. Looking at the tools provided, there are two functions listed, but their names and descriptions are empty. That's a problem because without knowing the actual functions, I can't use them. \n\nWait, maybe the functions are supposed to be part of the system, but in the given tools, they're not detailed. The user might have intended to include weather-related functions, but they weren't properly formatted. \n\nSince I can't use the functions as they are, I need to inform the us

In [48]:
response.message

Message(role='assistant', content="<think>\nOkay, the user is asking for the latest weather in Bangalore. Let me think about how to approach this.\n\nFirst, I need to check if there's a function available to get weather data. Looking at the tools provided, there are two functions listed, but their names and descriptions are empty. That's a problem because without knowing the actual functions, I can't use them. \n\nWait, maybe the functions are supposed to be part of the system, but in the given tools, they're not detailed. The user might have intended to include weather-related functions, but they weren't properly formatted. \n\nSince I can't use the functions as they are, I need to inform the user that I can't retrieve the latest weather data because the necessary tools aren't available. I should apologize and explain the limitation. \n\nAlso, the user might be expecting an answer, but without the right tools, I can't provide that. So, the best response is to let them know the current

In [47]:
response.message.content

"<think>\nOkay, the user is asking for the latest weather in Bangalore. Let me think about how to approach this.\n\nFirst, I need to check if there's a function available to get weather data. Looking at the tools provided, there are two functions listed, but their names and descriptions are empty. That's a problem because without knowing the actual functions, I can't use them. \n\nWait, maybe the functions are supposed to be part of the system, but in the given tools, they're not detailed. The user might have intended to include weather-related functions, but they weren't properly formatted. \n\nSince I can't use the functions as they are, I need to inform the user that I can't retrieve the latest weather data because the necessary tools aren't available. I should apologize and explain the limitation. \n\nAlso, the user might be expecting an answer, but without the right tools, I can't provide that. So, the best response is to let them know the current situation and suggest checking a 

### Query Processing

In [None]:
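load_dotenv()  # Loads ANTHROPIC_API_KEY from a .env file, if present
anthropic_client = anthropic.Anthropic()

# Anthropic's custom-tool format expects "name" and "input_schema",
# so skip the Ollama-style entry in the tools list.
anthropic_tools = [tool for tool in tools if "input_schema" in tool]

def process_query(query):
    """Send one query to the model and run requested tools until it returns a final text answer."""
    messages = [{'role': 'user', 'content': query}]

    response = anthropic_client.messages.create(max_tokens = 2024,
                                  model = 'claude-3-7-sonnet-20250219',
                                  tools = anthropic_tools,
                                  messages = messages)

    processing = True
    while processing:
        assistant_content = []

        for content in response.content:
            if content.type == 'text':

                print(content.text)
                assistant_content.append(content)

                # A lone text block means the model has finished answering
                if len(response.content) == 1:
                    processing = False

            elif content.type == 'tool_use':

                # Record the assistant turn, including the tool request
                assistant_content.append(content)
                messages.append({'role': 'assistant', 'content': assistant_content})

                tool_id = content.id
                tool_args = content.input
                tool_name = content.name
                print(f"Calling tool {tool_name} with args {tool_args}")

                # Run the tool and send its result back as a user turn
                result = execute_tool(tool_name, tool_args)
                messages.append({"role": "user",
                                  "content": [
                                      {
                                          "type": "tool_result",
                                          "tool_use_id": tool_id,
                                          "content": result
                                      }
                                  ]
                                })
                response = anthropic_client.messages.create(max_tokens = 2024,
                                  model = 'claude-3-7-sonnet-20250219',
                                  tools = anthropic_tools,
                                  messages = messages)

                if len(response.content) == 1 and response.content[0].type == "text":
                    print(response.content[0].text)
                    processing = False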

### Chat Loop

In [13]:
def chat_loop():
    print("Type your queries or 'quit' to exit.")
    while True:
        try:
            query = input("\nQuery: ").strip()
            if query.lower() == 'quit':
                break
    
            process_query(query)
            print("\n")
        except Exception as e:
            print(f"\nError: {str(e)}")

Feel free to interact with the chatbot. Here's an example query: 

- Search for 2 papers on "LLM interpretability"

To access the `papers` folder: 1) click `File` in the notebook's top menu, 2) click `Open`, and 3) click `L3`.

In [None]:
chat_loop()

Type your queries or 'quit' to exit.



Query:  Hi


Hello! I'm here to help you search for and extract information about academic papers on arXiv. I can help you:

1. Search for papers on a specific topic
2. Get detailed information about a specific paper using its ID

What would you like to know about? Please let me know if you'd like to search for papers on a particular topic or learn more about a specific paper.





Query:  search for MCP related application in protein biology


I'll search for papers related to MCP applications in protein biology. MCP likely refers to "Methyl-accepting Chemotaxis Protein" or possibly "Multiple Chemical Programming" in this context, but I'll search broadly to capture relevant papers.
Calling tool search_papers with args {'topic': 'MCP protein biology application', 'max_results': 5}
Results are saved in: papers/mcp_protein_biology_application/papers_info.json
Let me retrieve detailed information about each of these papers to see how they relate to MCP and protein biology:
Calling tool extract_info with args {'paper_id': '2504.08999v1'}
Calling tool extract_info with args {'paper_id': '2504.12757v2'}
Calling tool extract_info with args {'paper_id': '1610.04116v1'}
Calling tool extract_info with args {'paper_id': '2105.03617v1'}
Calling tool extract_info with args {'paper_id': '1711.03897v2'}
Based on the search results, I can provide you with information about MCP applications in protein biology:

I notice that the search return

In the next lessons, you will take out the tool definitions to wrap them in an MCP server. Then you will create an MCP client inside the chatbot to make the chatbot MCP compatible.  

## Resources

[Guide on how to implement tool use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview#how-to-implement-tool-use)

In [None]:
# Replace the Anthropic-based process_query with an Ollama-based version

def process_query(query, model="llama3"):
    """
    Process a user query using a local Ollama LLM.
    Args:
        query (str): The user input prompt.
        model (str): The Ollama model to use (default: 'llama3').
    Returns:
        str: The LLM's response.
    """
    import ollama
    response = ollama.chat(model=model, messages=[{"role": "user", "content": query}])
    return response["message"]["content"]

# Example usage:
# print(process_query("What is MCP?"))
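The Ollama-based replacement above sends the query without any tools. The sketch below shows one way a tool-calling loop could look with Ollama-style responses. It takes the chat function as a parameter (`chat_fn`) so that it can be exercised with a fake; for real use you would pass `ollama.chat`. The response shape (`message`, `tool_calls`, `function.name`, `function.arguments`) follows the outputs shown in this notebook, while the `tool` role and the `tool_name` field on the result message are assumptions about Ollama's message format that may differ between versions. `run_with_tools`, `fake_chat`, and `demo_registry` are hypothetical names.

```python
from typing import Any, Callable, Dict, List


def run_with_tools(query: str,
                   chat_fn: Callable[..., Any],
                   tools: List[Any],
                   registry: Dict[str, Callable[..., Any]],
                   model: str = "qwen3:1.7b",
                   max_rounds: int = 5) -> str:
    """Call the model, run any tools it requests, and return its final text."""
    messages: List[Any] = [{"role": "user", "content": query}]
    for _ in range(max_rounds):
        response = chat_fn(model=model, messages=messages, tools=tools)
        message = response["message"]
        tool_calls = message.get("tool_calls") or []
        if not tool_calls:
            return message.get("content") or ""
        # Keep the assistant turn so the model can see its own tool request.
        messages.append(message)
        for call in tool_calls:
            name = call["function"]["name"]
            args = call["function"]["arguments"]
            func = registry.get(name)
            result = func(**args) if func else f"Unknown tool: {name}"
            # Assumed format for returning a tool result to the model.
            messages.append({"role": "tool", "tool_name": name, "content": str(result)})
    return "Stopped after reaching max_rounds without a final answer."


def fake_chat(model: str, messages: List[Any], tools: List[Any]) -> Dict[str, Any]:
    """Fake model: requests one tool call, then echoes the tool result."""
    if messages[-1]["role"] == "tool":
        return {"message": {"role": "assistant",
                            "content": f"Tool said: {messages[-1]['content']}"}}
    return {"message": {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"name": "extract_info",
                      "arguments": {"paper_id": "0000.00000v1"}}}]}}


demo_registry = {"extract_info": lambda paper_id: f"info for {paper_id}"}
answer = run_with_tools("Tell me about paper 0000.00000v1", fake_chat,
                        tools=[], registry=demo_registry)
print(answer)
```

The `max_rounds` cap is a design choice that keeps a misbehaving model from requesting tools indefinitely.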

## Ollama Function Calling

In [1]:
import ollama
import yfinance as yf
from typing import Any, Callable, Dict

In [2]:
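def get_stock_price(symbol: str) -> float:
    """Return the latest available price for a ticker symbol via yfinance."""
    ticker = yf.Ticker(symbol)
    info = ticker.info  # Fetch once and reuse
    price_attrs = ['regularMarketPrice', 'currentPrice', 'price']

    # Try the known price fields in order of preference
    for attr in price_attrs:
        if info.get(attr) is not None:
            return info[attr]

    # Fall back to fast_info when the info dict has no price
    fast_info = ticker.fast_info
    if hasattr(fast_info, 'last_price') and fast_info.last_price is not None:
        return fast_info.last_price
    raise ValueError(f"Could not find a price for symbol {symbol!r}")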

In [3]:
get_stock_price("GOOGL")

180.498

In [9]:
get_stock_price_tool = {
    'type': 'function',
    'function': {
        'name': 'get_stock_price',
        'description': 'Get the current stock price for any symbol',
        'parameters': {
            'type': 'object',
            'required': ['symbol'],
            'properties': {
                'symbol': {'type': 'string', 'description': 'The stock symbol (e.g., AAPL, GOOGL)'},
            },
        },
    },
}

In [5]:
prompt = 'What is the current stock price of Apple?'
print('Prompt:', prompt)

Prompt: What is the current stock price of Apple?


In [7]:
available_functions: Dict[str, Callable] = {
    'get_stock_price': get_stock_price
}

In [8]:
response = ollama.chat(
    'qwen3:1.7b',
    messages=[{'role':'user', 'content':prompt}],
    tools = [get_stock_price]
)
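The call above passes the Python function itself in `tools` rather than the `get_stock_price_tool` dictionary, and the response shown below still contains a well-formed tool call, which suggests the client derived a schema from the function. The sketch below illustrates how such a schema could be built from a signature with `inspect`. It is not the Ollama client's actual implementation; `function_to_tool` and `lookup_price` are hypothetical names, and the type mapping covers only a few basic annotations.

```python
import inspect
from typing import Any, Callable, Dict

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a function-style tool schema from a function's signature and docstring."""
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "required": required,
                "properties": properties,
            },
        },
    }


def lookup_price(symbol: str, currency: str = "USD") -> float:
    """Return a price for a stock symbol (placeholder body for the demo)."""
    return 0.0


print(function_to_tool(lookup_price))
```

A docstring matters in this approach because it is the only source for the tool description the model sees.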

In [None]:
response

ChatResponse(model='qwen3:1.7b', created_at='2025-07-14T16:43:20.1356194Z', done=True, done_reason='stop', total_duration=11528222200, load_duration=7940143600, prompt_eval_count=140, prompt_eval_duration=1089402100, eval_count=107, eval_duration=2481158100, message=Message(role='assistant', content='<think>\nOkay, the user is asking for the current stock price of Apple. Let me check the tools available. There\'s a function called get_stock_price which requires a symbol parameter. Apple\'s stock symbol is AAPL. So I need to call this function with the symbol "AAPL". I should make sure the parameters are correctly formatted as JSON within the tool_call tags. Let me structure the response properly.\n</think>\n\n', images=None, tool_calls=[ToolCall(function=Function(name='get_stock_price', arguments={'symbol': 'AAPL'}))]))

In [10]:
if response.message.tool_calls:
    for tool in response.message.tool_calls:
        if function_to_call := available_functions.get(tool.function.name):
            print('Calling function:', tool.function.name)
            print('Arguments:', tool.function.arguments)
            print('Function output:', function_to_call(**tool.function.arguments))
        else:
            print('Function', tool.function.name, 'not found')

Calling function: get_stock_price
Arguments: {'symbol': 'AAPL'}
Function output: 209.05


In [17]:
from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(
    model='qwen3:1.7b',
    messages=[
        {
            'role': 'user',
            'content': 'what is the stock price of Ford',
        },
    ],
    tools=[get_stock_price_tool],
)
# Access fields directly from the response object
print(response.message.content)

<think>
Okay, the user is asking for the stock price of Ford. Let me check the tools available. There's a function called get_stock_price that takes a symbol parameter. Ford's stock symbol is F. So I need to call this function with symbol "F". Let me make sure I have the correct symbol. Yes, Ford is listed as F on the stock market. Alright, I'll use that function to retrieve the current price.
</think>




In [18]:
response.message.tool_calls

[ToolCall(function=Function(name='get_stock_price', arguments={'symbol': 'F'}))]

In [38]:
# Iterating over the message yields (field_name, value) pairs
for field_name, value in response.message:
    print(value)
    if field_name == 'tool_calls':
        print((field_name, value))

assistant
<think>
Okay, the user is asking for the stock price of Ford. Let me check the tools available. There's a function called get_stock_price that takes a symbol parameter. Ford's stock symbol is F. So I need to call this function with symbol "F". Let me make sure I have the correct symbol. Yes, Ford is listed as F on the stock market. Alright, I'll use that function to retrieve the current price.
</think>


None
[ToolCall(function=Function(name='get_stock_price', arguments={'symbol': 'F'}))]
('tool_calls', [ToolCall(function=Function(name='get_stock_price', arguments={'symbol': 'F'}))])
