# Lesson 3: Chatbot Example

In this lesson, you will familiarize yourself with the chatbot example you will work on during this course. The example includes the tool definitions and execution, as well as the chatbot code. Make sure to interact with the chatbot at the end of this notebook.

## Import Libraries

In [1]:
import arxiv
import json
import os
from typing import List
from dotenv import load_dotenv
import anthropic

## Tool Functions

In [2]:
PAPER_DIR = "papers"

The first tool searches arXiv for papers relevant to a topic and stores each paper's information (title, authors, summary, PDF URL, and publication date) in a JSON file. The JSON files are organized by topic in the `papers` directory. The tool does not download the papers themselves.

In [3]:
def search_papers(topic: str, max_results: int = 5) -> List[str]:
    """
    Search for papers on arXiv based on a topic and store their information.
    
    Args:
        topic: The topic to search for
        max_results: Maximum number of results to retrieve (default: 5)
        
    Returns:
        List of paper IDs found in the search
    """
    
    # Use arxiv to find the papers 
    client = arxiv.Client()

    # Search for the most relevant articles matching the queried topic
    search = arxiv.Search(
        query = topic,
        max_results = max_results,
        sort_by = arxiv.SortCriterion.Relevance
    )

    papers = client.results(search)
    
    # Create directory for this topic
    path = os.path.join(PAPER_DIR, topic.lower().replace(" ", "_"))
    os.makedirs(path, exist_ok=True)
    
    file_path = os.path.join(path, "papers_info.json")

    # Try to load existing papers info
    try:
        with open(file_path, "r") as json_file:
            papers_info = json.load(json_file)
    except (FileNotFoundError, json.JSONDecodeError):
        papers_info = {}

    # Process each paper and add to papers_info  
    paper_ids = []
    for paper in papers:
        paper_ids.append(paper.get_short_id())
        paper_info = {
            'title': paper.title,
            'authors': [author.name for author in paper.authors],
            'summary': paper.summary,
            'pdf_url': paper.pdf_url,
            'published': str(paper.published.date())
        }
        papers_info[paper.get_short_id()] = paper_info
    
    # Save updated papers_info to json file
    with open(file_path, "w") as json_file:
        json.dump(papers_info, json_file, indent=2)
    
    print(f"Results are saved in: {file_path}")
    
    return paper_ids

In [5]:
search_papers("transformer architecture")

Results are saved in: papers/transformer_architecture/papers_info.json


['2308.11421v1',
 '2006.08231v1',
 '2311.12678v1',
 '2212.02789v1',
 '2506.19125v1']
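To inspect what `search_papers` saved, you can rebuild the file path from the topic the same way the tool does and load the JSON. This is a minimal sketch: `topic_to_path` and `load_papers_info` are hypothetical helper names that are not part of the lesson code.

```python
import json
import os

PAPER_DIR = "papers"  # same directory name the lesson uses


def topic_to_path(topic: str, paper_dir: str = PAPER_DIR) -> str:
    """Rebuild the JSON path for a topic (hypothetical helper)."""
    # Mirrors search_papers: lowercase the topic, replace spaces with underscores
    topic_dir = topic.lower().replace(" ", "_")
    return os.path.join(paper_dir, topic_dir, "papers_info.json")


def load_papers_info(file_path: str) -> dict:
    """Return the saved papers keyed by paper ID, or {} if the file is missing or invalid."""
    try:
        with open(file_path, "r") as json_file:
            return json.load(json_file)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


path = topic_to_path("transformer architecture")
for paper_id, info in load_papers_info(path).items():
    print(paper_id, "-", info["title"])
```

If no search has been run for that topic yet, the loop prints nothing because `load_papers_info` returns an empty dictionary.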

The second tool looks for information about a specific paper across all topic directories inside the `papers` directory.

In [6]:
def extract_info(paper_id: str) -> str:
    """
    Search for information about a specific paper across all topic directories.
    
    Args:
        paper_id: The ID of the paper to look for
        
    Returns:
        JSON string with paper information if found, error message if not found
    """
 
    for item in os.listdir(PAPER_DIR):
        item_path = os.path.join(PAPER_DIR, item)
        if os.path.isdir(item_path):
            file_path = os.path.join(item_path, "papers_info.json")
            if os.path.isfile(file_path):
                try:
                    with open(file_path, "r") as json_file:
                        papers_info = json.load(json_file)
                        if paper_id in papers_info:
                            return json.dumps(papers_info[paper_id], indent=2)
                except (FileNotFoundError, json.JSONDecodeError) as e:
                    print(f"Error reading {file_path}: {str(e)}")
                    continue
    
    return f"There's no saved information related to paper {paper_id}."

In [7]:
extract_info('2308.11421v1')

'{\n  "title": "TurboViT: Generating Fast Vision Transformers via Generative Architecture Search",\n  "authors": [\n    "Alexander Wong",\n    "Saad Abbasi",\n    "Saeejith Nair"\n  ],\n  "summary": "Vision transformers have shown unprecedented levels of performance in\\ntackling various visual perception tasks in recent years. However, the\\narchitectural and computational complexity of such network architectures have\\nmade them challenging to deploy in real-world applications with\\nhigh-throughput, low-memory requirements. As such, there has been significant\\nresearch recently on the design of efficient vision transformer architectures.\\nIn this study, we explore the generation of fast vision transformer\\narchitecture designs via generative architecture search (GAS) to achieve a\\nstrong balance between accuracy and architectural and computational efficiency.\\nThrough this generative architecture search process, we create TurboViT, a\\nhighly efficient hierarchical vision trans
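Because `extract_info` returns a string in both cases (a JSON document when the paper is found, a plain sentence when it is not), a caller that wants structured data has to parse the result. The sketch below assumes that behavior; `parse_paper_info` is a hypothetical helper and the sample strings are made up for illustration.

```python
import json
from typing import Optional


def parse_paper_info(result: str) -> Optional[dict]:
    """Turn the string returned by extract_info back into a dict (hypothetical helper)."""
    try:
        parsed = json.loads(result)
    except json.JSONDecodeError:
        # extract_info returns a plain sentence when the paper ID is unknown
        return None
    return parsed if isinstance(parsed, dict) else None


# Illustrative strings only; a real call would be parse_paper_info(extract_info(paper_id))
found = parse_paper_info('{"title": "Example title", "authors": ["A. Author"]}')
missing = parse_paper_info("There's no saved information related to paper 0000.00000v1.")

print(found["title"])
print(missing)
```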

## Tool Schema

Here is the schema of each tool, which you will provide to the LLM.

In [8]:
tools = [
    {
        "name": "search_papers",
        "description": "Search for papers on arXiv based on a topic and store their information.",
        "input_schema": {
            "type": "object",
            "properties": {
                "topic": {
                    "type": "string",
                    "description": "The topic to search for"
                }, 
                "max_results": {
                    "type": "integer",
                    "description": "Maximum number of results to retrieve",
                    "default": 5
                }
            },
            "required": ["topic"]
        }
    },
    {
        "name": "extract_info",
        "description": "Search for information about a specific paper across all topic directories.",
        "input_schema": {
            "type": "object",
            "properties": {
                "paper_id": {
                    "type": "string",
                    "description": "The ID of the paper to look for"
                }
            },
            "required": ["paper_id"]
        }
    }
]
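Each `input_schema` is a JSON Schema object: `required` lists the arguments the model must supply, and everything else under `properties` is optional. When the model omits `max_results`, the Python function's own default applies, because the arguments are passed straight through as keyword arguments. The following sketch checks a set of arguments against a trimmed copy of the `search_papers` schema; `missing_required` and `unknown_arguments` are hypothetical helpers, not part of the lesson or of any library.

```python
from typing import Any, Dict, List

# Trimmed copy of the search_papers input_schema from the lesson
search_papers_schema = {
    "type": "object",
    "properties": {
        "topic": {"type": "string"},
        "max_results": {"type": "integer", "default": 5},
    },
    "required": ["topic"],
}


def missing_required(schema: Dict[str, Any], tool_args: Dict[str, Any]) -> List[str]:
    """Names listed under 'required' that are absent from tool_args."""
    return [name for name in schema.get("required", []) if name not in tool_args]


def unknown_arguments(schema: Dict[str, Any], tool_args: Dict[str, Any]) -> List[str]:
    """Argument names that the schema does not declare under 'properties'."""
    declared = schema.get("properties", {})
    return [name for name in tool_args if name not in declared]


print(missing_required(search_papers_schema, {"max_results": 2}))
print(unknown_arguments(search_papers_schema, {"topic": "x", "limit": 3}))
```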

## Tool Mapping

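The dictionary below maps each tool name to its Python function, and `execute_tool` calls the requested tool and converts its result to a string.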

In [9]:
mapping_tool_function = {
    "search_papers": search_papers,
    "extract_info": extract_info
}

def execute_tool(tool_name, tool_args):
    
    result = mapping_tool_function[tool_name](**tool_args)

    if result is None:
        result = "The operation completed but didn't return any results."
        
    elif isinstance(result, list):
        result = ', '.join(result)
        
    elif isinstance(result, dict):
        # Convert dictionaries to formatted JSON strings
        result = json.dumps(result, indent=2)
    
    else:
        # For any other type, convert using str()
        result = str(result)
    return result
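Note that `execute_tool` raises a `KeyError` if the model asks for a name that is not in `mapping_tool_function`, and any exception inside a tool propagates to the caller. One possible variation, sketched here under the assumption that you would rather hand the error text back to the model, returns a message instead. `safe_execute_tool` and the stub tool are hypothetical and use made-up paper IDs.

```python
import json
from typing import Any, Callable, Dict


def safe_execute_tool(registry: Dict[str, Callable[..., Any]],
                      tool_name: str,
                      tool_args: Dict[str, Any]) -> str:
    """Run a tool from the registry and always return a string (hypothetical variant)."""
    tool = registry.get(tool_name)
    if tool is None:
        return f"Unknown tool: {tool_name}"
    try:
        result = tool(**tool_args)
    except Exception as e:
        # Report the failure as text instead of raising
        return f"Error while running {tool_name}: {e}"

    # Same result normalization as execute_tool in the lesson
    if result is None:
        return "The operation completed but didn't return any results."
    if isinstance(result, list):
        return ", ".join(str(item) for item in result)
    if isinstance(result, dict):
        return json.dumps(result, indent=2)
    return str(result)


def stub_search(topic: str, max_results: int = 5):
    """Stand-in for search_papers so the sketch runs without network access."""
    return ["0000.00001v1", "0000.00002v1"][:max_results]


stub_registry = {"search_papers": stub_search}

print(safe_execute_tool(stub_registry, "search_papers", {"topic": "x"}))
print(safe_execute_tool(stub_registry, "not_a_tool", {}))
```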

## Chatbot Code

The chatbot handles the user's queries one at a time and does not keep any memory between queries: each call to `process_query` starts with a fresh `messages` list.

In [10]:
load_dotenv() 
client = anthropic.Anthropic()

### Query Processing

In [11]:
def process_query(query):
    
    messages = [{'role': 'user', 'content': query}]
    
    response = client.messages.create(max_tokens = 2024,
                                  model = 'claude-3-7-sonnet-20250219', 
                                  tools = tools,
                                  messages = messages)
    
    # Loop flag (named differently from the function to avoid shadowing it)
    keep_processing = True
    while keep_processing:
        assistant_content = []

        for content in response.content:
            if content.type == 'text':

                print(content.text)
                assistant_content.append(content)

                # A response made of a single text block is the final answer
                if len(response.content) == 1:
                    keep_processing = False

            elif content.type == 'tool_use':

                # Record the assistant turn (any text so far plus the tool call)
                assistant_content.append(content)
                messages.append({'role': 'assistant', 'content': assistant_content})

                tool_id = content.id
                tool_args = content.input
                tool_name = content.name
                print(f"Calling tool {tool_name} with args {tool_args}")

                # Run the tool and send its output back as a tool_result block
                result = execute_tool(tool_name, tool_args)
                messages.append({"role": "user",
                                  "content": [
                                      {
                                          "type": "tool_result",
                                          "tool_use_id": tool_id,
                                          "content": result
                                      }
                                  ]
                                })
                response = client.messages.create(max_tokens = 2024,
                                  model = 'claude-3-7-sonnet-20250219',
                                  tools = tools,
                                  messages = messages)

                # Stop once the model replies with text only
                if len(response.content) == 1 and response.content[0].type == "text":
                    print(response.content[0].text)
                    keep_processing = False
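The message list that `process_query` builds during one tool call has a fixed shape: the user's query, an assistant turn containing a `tool_use` block, and a user turn containing the matching `tool_result`. The sketch below writes that sequence out as plain dictionaries so the structure is easy to see. The lesson code appends the SDK's content block objects for the assistant turn, so the dictionaries, the `tool_result_message` helper, and the `toolu_example` ID are illustrative assumptions.

```python
from typing import Any, Dict, List


def tool_result_message(tool_use_id: str, result: str) -> Dict[str, Any]:
    """Build the user turn that returns a tool's output (hypothetical helper)."""
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": tool_use_id, "content": result}
        ],
    }


# 1) The user's query
messages: List[Dict[str, Any]] = [
    {"role": "user", "content": "Search for 2 papers on LLM interpretability"}
]

# 2) The assistant turn: optional text followed by a tool call (made-up ID)
messages.append({
    "role": "assistant",
    "content": [
        {"type": "text", "text": "Let me search for that."},
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "search_papers",
            "input": {"topic": "LLM interpretability", "max_results": 2},
        },
    ],
})

# 3) The tool output, linked to the call through tool_use_id (made-up paper IDs)
messages.append(tool_result_message("toolu_example", "0000.00001v1, 0000.00002v1"))

for message in messages:
    print(message["role"])
```

The `tool_use_id` in the result must match the `id` of the `tool_use` block it answers, which is why `process_query` saves `content.id` before executing the tool.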

### Chat Loop

In [12]:
def chat_loop():
    print("Type your queries or 'quit' to exit.")
    while True:
        try:
            query = input("\nQuery: ").strip()
            if query.lower() == 'quit':
                break
    
            process_query(query)
            print("\n")
        except Exception as e:
            print(f"\nError: {str(e)}")
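If you want to exercise the loop without typing at the keyboard, one option is to pass the input function and the query handler in as parameters. This is only a sketch of that idea: `run_chat_loop`, `handle_query`, and `read_input` are hypothetical names, and unlike the lesson's loop it reads input outside the `try` block so that an input error ends the loop instead of being retried.

```python
from typing import Callable, List


def run_chat_loop(handle_query: Callable[[str], None],
                  read_input: Callable[[str], str] = input) -> int:
    """Read queries until 'quit'; return how many were handled without error."""
    handled = 0
    while True:
        query = read_input("\nQuery: ").strip()
        if query.lower() == "quit":
            break
        try:
            handle_query(query)
            handled += 1
        except Exception as e:
            # Keep the session alive if a single query fails
            print(f"\nError: {e}")
    return handled


# Scripted run: one question, then 'quit' (case and spacing are ignored)
scripted = iter(["first question", "  QUIT  "])
seen: List[str] = []
count = run_chat_loop(seen.append, read_input=lambda prompt: next(scripted))
print(count, seen)
```

In the notebook, the equivalent call would pass `process_query` as `handle_query` and leave `read_input` at its default.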

Feel free to interact with the chatbot. Here's an example query: 

- Search for 2 papers on "LLM interpretability"

To access the `papers` folder: 1) click on the `File` option on the top menu of the notebook and 2) click on `Open` and then 3) click on `L3`.

In [None]:
chat_loop()

Type your queries or 'quit' to exit.



Query:  Hi, can you please tell me about the attention is all you need paper from google


I'd be happy to tell you about the "Attention Is All You Need" paper from Google. This is a famous paper in the field of natural language processing and machine learning. Let me search for it and provide you with the details.
Calling tool search_papers with args {'topic': 'Attention Is All You Need', 'max_results': 5}


**Complete output from one run (the cell output above is truncated):**

Type your queries or 'quit' to exit.

Query:  Hi, can you please tell me about the attention is all you need paper from google
I'd be happy to tell you about the "Attention Is All You Need" paper from Google. This is a famous paper in the field of natural language processing and machine learning. Let me search for it and provide you with the details.
Calling tool search_papers with args {'topic': 'Attention Is All You Need', 'max_results': 5}
Results are saved in: papers/attention_is_all_you_need/papers_info.json
I need to get more specific information about these papers to find the original "Attention Is All You Need" paper. Let me check each of them to find the correct one.
Calling tool extract_info with args {'paper_id': '2407.15516v1'}
Calling tool extract_info with args {'paper_id': '2107.08000v1'}
Let me check the other papers to find the original Google paper:
Calling tool extract_info with args {'paper_id': '2306.01926v1'}
Calling tool extract_info with args {'paper_id': '2112.05993v1'}
Calling tool extract_info with args {'paper_id': '2501.09166v1'}
I don't see the original Google "Attention Is All You Need" paper in these results. Let me search for it with a more specific query:
Calling tool search_papers with args {'topic': 'Attention Is All You Need Vaswani Google', 'max_results': 5}
Results are saved in: papers/attention_is_all_you_need_vaswani_google/papers_info.json
Let me extract information about the first paper, which appears to be the correct one:
Calling tool extract_info with args {'paper_id': '1706.03762v7'}
Great! I found the original "Attention Is All You Need" paper that you're asking about. Here's the information about this landmark paper:

# Attention Is All You Need

This groundbreaking paper was published on June 12, 2017, by a team of researchers primarily from Google:
- Ashish Vaswani
- Noam Shazeer
- Niki Parmar
- Jakob Uszkoreit
- Llion Jones
- Aidan N. Gomez
- Lukasz Kaiser
- Illia Polosukhin

## Key Contributions:

1. **Introduction of the Transformer Architecture**: This paper introduced the Transformer, a novel neural network architecture that relies entirely on attention mechanisms, eliminating the need for recurrent neural networks (RNNs) or convolutional neural networks (CNNs) that were previously dominant in sequence-to-sequence tasks.

2. **Self-Attention Mechanism**: The paper proposed a self-attention mechanism that allows the model to weigh the importance of different words in a sentence when processing a specific word, regardless of their positions.

3. **Parallel Processing**: Unlike RNNs that process sequences sequentially, the Transformer can process all elements of a sequence in parallel, dramatically reducing training time.

4. **State-of-the-Art Results**: The authors demonstrated superior performance on machine translation tasks, achieving new state-of-the-art results on English-to-German and English-to-French translation benchmarks.

5. **Efficiency**: The model required significantly less training time compared to previous state-of-the-art approaches.

## Impact:

This paper has been immensely influential in the field of natural language processing. The Transformer architecture has become the foundation for most modern language models, including BERT, GPT, T5, and many others. The attention mechanism introduced in this paper revolutionized how machines process sequential data and has been adapted for numerous applications beyond language processing, including computer vision, audio processing, and multimodal tasks.

The paper's PDF is available at: http://arxiv.org/pdf/1706.03762v7

Would you like me to explain any specific aspect of the paper in more detail?



Query:  quit

<p style="background-color:#f7fff8; padding:15px; border-width:3px; border-color:#e0f0e0; border-style:solid; border-radius:6px"> 🚨
&nbsp; <b>Different Run Results:</b> The output generated by AI chat models can vary with each execution due to their dynamic, probabilistic nature. Don't be surprised if your results differ from those shown in the video.</p>

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
<p> 💻 &nbsp; <b> To Access the <code>requirements.txt</code> file or the <code>papers</code> folder: </b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Open"</em> and finally 3) click on <em>"L3"</em>.</p>
</div>

In the next lessons, you will move the tool definitions out of the chatbot and wrap them in an MCP server. Then you will create an MCP client inside the chatbot to make it MCP-compatible.

## Resources

[Guide on how to implement tool use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview#how-to-implement-tool-use)

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">


<p> ⬇ &nbsp; <b>Download Notebooks:</b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Download as"</em> and select <em>"Notebook (.ipynb)"</em>.</p>

</div>