# Write a chatbot

In this lesson, you will familiarize yourself with the chatbot example you will work on during this course. The example includes the tool definitions and execution, as well as the chatbot code. Make sure to interact with the chatbot at the end of this notebook.

## Import Libraries

In [None]:
%pip install arxiv python-dotenv anthropic google-api-python-client google-generativeai protobuf==5.27.3

Remember to set your API keys in the `cred.py` file, with content as shown below:

keys = {'GEMINI_API_KEY':'XXX',}

In [None]:
import arxiv
import json
import os
from typing import List
from dotenv import load_dotenv
import anthropic
import cred 
import google.generativeai as genai



## Tool Functions

In [3]:
PAPER_DIR = "papers"

The first tool searches for relevant arXiv papers based on a topic and stores each paper's info (title, authors, summary, PDF URL, and publication date) in a JSON file. The JSON files are organized by topic in the `papers` directory. The tool does not download the papers.
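As a small illustration of this layout, the sketch below shows how a topic string maps to a JSON file path. The helper name `topic_to_file_path` is hypothetical (it is not part of the notebook); it only mirrors the path logic that `search_papers` uses.

```python
import os

PAPER_DIR = "papers"

def topic_to_file_path(topic: str, base_dir: str = PAPER_DIR) -> str:
    """Hypothetical helper: return the JSON path used for a given topic."""
    # Lowercase the topic and replace spaces with underscores to get the folder name
    folder = topic.lower().replace(" ", "_")
    return os.path.join(base_dir, folder, "papers_info.json")

print(topic_to_file_path("LLM interpretability"))
```

Because the folder name is derived from the topic text, two searches with the same topic write to the same `papers_info.json`, while a different spelling of the topic creates a new folder.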

In [None]:
def search_papers(topic: str, max_results: float = 5.0) -> List[str]:
    """
    Search for papers on arXiv based on a topic and store their information.
    
    Args:
        topic: The topic to search for
        max_results: Maximum number of results to retrieve (default: 5)
        
    Returns:
        List of paper IDs found in the search
    """
    max_results = int(max_results)  # Ensure max_results is an integer
    # Use arxiv to find the papers 
    client = arxiv.Client()
    if max_results <= 0:
        raise ValueError("max_results must be greater than 0")

    # Search for the most relevant articles matching the queried topic
    search = arxiv.Search(
        query = topic,
        max_results = max_results,
        sort_by = arxiv.SortCriterion.Relevance
    )
    # print(topic, max_results)
    papers = client.results(search)
    
    # Create directory for this topic
    path = os.path.join(PAPER_DIR, topic.lower().replace(" ", "_"))
    os.makedirs(path, exist_ok=True)
    
    file_path = os.path.join(path, "papers_info.json")

    # Try to load existing papers info
    try:
        with open(file_path, "r") as json_file:
            papers_info = json.load(json_file)
    except (FileNotFoundError, json.JSONDecodeError):
        papers_info = {}

    # Process each paper and add to papers_info  
    paper_ids = []
    for paper in papers:
        paper_ids.append(paper.get_short_id())
        paper_info = {
            'title': paper.title,
            'authors': [author.name for author in paper.authors],
            'summary': paper.summary,
            'pdf_url': paper.pdf_url,
            'published': str(paper.published.date())
        }
        papers_info[paper.get_short_id()] = paper_info
    
    # Save updated papers_info to json file
    with open(file_path, "w") as json_file:
        json.dump(papers_info, json_file, indent=2)
    
    print(f"Results are saved in: {file_path}")
    
    return paper_ids

In [5]:
search_papers("computers")

Results are saved in: papers\computers\papers_info.json


['1310.7911v2',
 'math/9711204v1',
 '2208.00733v1',
 '2504.07020v1',
 '2403.03925v1']

The second tool looks for information about a specific paper across all topic directories inside the `papers` directory.

In [None]:
def extract_info(paper_id: str) -> str:
    """
    Search for information about a specific paper across all topic directories.
    
    Args:
        paper_id: The ID of the paper to look for
        
    Returns:
        JSON string with paper information if found, error message if not found
    """
 
    for item in os.listdir(PAPER_DIR):
        item_path = os.path.join(PAPER_DIR, item)
        if os.path.isdir(item_path):
            file_path = os.path.join(item_path, "papers_info.json")
            if os.path.isfile(file_path):
                try:
                    with open(file_path, "r") as json_file:
                        papers_info = json.load(json_file)
                        if paper_id in papers_info:
                            return json.dumps(papers_info[paper_id], indent=2)
                except (FileNotFoundError, json.JSONDecodeError) as e:
                    print(f"Error reading {file_path}: {str(e)}")
                    continue
    
    return f"There's no saved information related to paper {paper_id}."

In [16]:
extract_info('1310.7911v2')

'{\n  "title": "Compact manifolds with computable boundaries",\n  "authors": [\n    "Zvonko Iljazovic"\n  ],\n  "summary": "We investigate conditions under which a co-computably enumerable closed set\\nin a computable metric space is computable and prove that in each locally\\ncomputable computable metric space each co-computably enumerable compact\\nmanifold with computable boundary is computable. In fact, we examine the notion\\nof a semi-computable compact set and we prove a more general result: in any\\ncomputable metric space each semi-computable compact manifold with computable\\nboundary is computable. In particular, each semi-computable compact\\n(boundaryless) manifold is computable.",\n  "pdf_url": "http://arxiv.org/pdf/1310.7911v2",\n  "published": "2013-10-29"\n}'
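Since `extract_info` returns a JSON string rather than a dictionary, a caller that wants individual fields has to parse it first. The sketch below assumes a shortened copy of the output above as sample input; `summarize_paper` is a hypothetical helper, not part of the notebook.

```python
import json

# Shortened stand-in for the string returned by extract_info('1310.7911v2')
raw = (
    '{"title": "Compact manifolds with computable boundaries", '
    '"authors": ["Zvonko Iljazovic"], "published": "2013-10-29"}'
)

def summarize_paper(info_json: str) -> str:
    """Hypothetical helper: one-line summary from an extract_info-style string."""
    try:
        info = json.loads(info_json)
    except json.JSONDecodeError:
        # extract_info returns a plain-text message when the paper is not found
        return info_json
    authors = ", ".join(info["authors"])
    return f"{info['title']} ({authors}, {info['published']})"

print(summarize_paper(raw))
```

The `except` branch passes the "not found" message through unchanged, because that message is plain text and not valid JSON.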

## Tool Schema

Here is the schema of each tool, which you will provide to the LLM.

In [9]:
tools = [
  {
    "function_declarations": [
      {
        "name": "search_papers",
        "description": "Search for papers on arXiv based on a topic and store their information.",
        "parameters": {
          "type": "object",
          "properties": {
            "topic": {
              "type": "string",
              "description": "The topic to search for"
            },
            "max_results": {
              "type": "integer",
              "description": "Maximum number of results to retrieve. Must be an integer (not a float) greater than 0. Defaults to 5.",
            }
          },
          "required": [
            "topic"
          ]
        }
      },
      {
        "name": "extract_info",
        "description": "Search for information about a specific paper across all topic directories.",
        "parameters": {
          "type": "object",
          "properties": {
            "paper_id": {
              "type": "string",
              "description": "The ID of the paper to look for"
            }
          },
          "required": [
            "paper_id"
          ]
        }
      }
    ]
  }
]
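The sample chat run in this notebook shows the model passing `'max_results': 5.0` even though the schema declares an integer, which is why `search_papers` casts the value with `int()`. A more generic option is to cast arguments based on the schema itself. The following is a minimal sketch under that assumption; `coerce_args` and `search_parameters` are hypothetical names, and the schema here is a trimmed copy of the `search_papers` parameters.

```python
def coerce_args(parameters: dict, args: dict) -> dict:
    """Hypothetical helper: cast whole-number floats to int where the schema says 'integer'."""
    properties = parameters.get("properties", {})
    coerced = {}
    for name, value in args.items():
        declared = properties.get(name, {}).get("type")
        if declared == "integer" and isinstance(value, float) and value.is_integer():
            coerced[name] = int(value)
        else:
            # Leave every other value untouched
            coerced[name] = value
    return coerced

# Trimmed copy of the search_papers parameter schema
search_parameters = {
    "type": "object",
    "properties": {
        "topic": {"type": "string"},
        "max_results": {"type": "integer"},
    },
    "required": ["topic"],
}

result = coerce_args(search_parameters, {"topic": "computer", "max_results": 5.0})
print(result)
```

Only whole-number floats are converted, so a value such as `2.5` is left as-is for the tool to reject.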

## Tool Mapping

This code handles tool mapping and execution.

In [10]:
mapping_tool_function = {
    "search_papers": search_papers,
    "extract_info": extract_info
}

def execute_tool(tool_name, tool_args):
    
    result = mapping_tool_function[tool_name](**tool_args)

    if result is None:
        result = "The operation completed but didn't return any results."
        
    elif isinstance(result, list):
        result = ', '.join(result)
        
    elif isinstance(result, dict):
        # Convert dictionaries to formatted JSON strings
        result = json.dumps(result, indent=2)
    
    else:
        # For any other type, convert using str()
        result = str(result)
    return result
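`execute_tool` raises an exception if the model asks for a tool name that is not in the mapping or passes unexpected arguments. One possible variant, sketched below, returns the error as a string so it can be sent back to the model. `safe_execute_tool`, `fake_search`, and `stub_mapping` are hypothetical names used only for this illustration.

```python
import json

def fake_search(topic: str, max_results: int = 5):
    """Stub standing in for search_papers; returns made-up IDs."""
    return [f"{topic}-{i}" for i in range(max_results)]

stub_mapping = {"fake_search": fake_search}

def safe_execute_tool(tool_name: str, tool_args: dict, mapping: dict) -> str:
    """Hypothetical variant of execute_tool that reports errors as strings."""
    func = mapping.get(tool_name)
    if func is None:
        return f"Unknown tool: {tool_name}"
    try:
        result = func(**tool_args)
    except Exception as e:  # report the failure instead of raising
        return f"Error while running {tool_name}: {e}"
    if result is None:
        return "The operation completed but didn't return any results."
    if isinstance(result, list):
        # Convert items to str so non-string lists can be joined too
        return ", ".join(str(item) for item in result)
    if isinstance(result, dict):
        return json.dumps(result, indent=2)
    return str(result)

print(safe_execute_tool("fake_search", {"topic": "ai", "max_results": 2}, stub_mapping))
print(safe_execute_tool("missing_tool", {}, stub_mapping))
```

Whether to surface errors to the model or let them propagate to `chat_loop` is a design choice; the notebook's version does the latter.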

## Chatbot Code

The chatbot handles the user's queries one by one, but it does not persist memory across the queries.

In [11]:
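load_dotenv()
# Note: process_query uses Gemini through genai, so this Anthropic client is not used by the code in this notebook.
client = anthropic.Anthropic()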

### Query Processing

In [None]:
def process_query(query):
    # Configure the API key
    genai.configure(api_key=cred.keys['GEMINI_API_KEY'])
    
    # Initialize the model and register the tool schema
    model = genai.GenerativeModel(model_name="gemini-1.5-flash", tools=tools)

    # Initial conversation history
    messages = [{'role': 'user', 'parts': [query]}]
    
    process_query_loop = True
    while process_query_loop:
        # Call generate_content with the full history
        response = model.generate_content(messages)
        
        # Track whether this response requested any tool call
        called_tool = False
        
        # Inspect each part of the response
        for part in response.candidates[0].content.parts:
            # Handle a text part
            if part.text:
                print(part.text)
                
            # Handle a tool call
            elif part.function_call:
                called_tool = True
                tool_name = part.function_call.name
                # Convert the MapComposite object to a standard Python dict
                tool_args = dict(part.function_call.args) 
                print(f"Calling tool {tool_name} with args {tool_args}")
                
                # Add the model's tool call to the message history
                messages.append({'role': 'model', 'parts': [part]})
                
                # Execute the tool and get the result
                result = execute_tool(tool_name, tool_args)
                
                # Add the tool result to the history so the model can continue
                messages.append({
                    "role": "user",
                    "parts": [
                        {
                            "function_response": {
                                "name": tool_name,
                                "response": {"content": result}
                            }
                        }
                    ]
                })

        # Stop once the model replies without requesting a tool; otherwise
        # loop again so the model can use the tool results
        process_query_loop = called_tool
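To see the control flow of `process_query` without calling a real model, the sketch below replays the same loop with a scripted stand-in. Everything here is an assumption made for illustration: `run_tool_loop`, `scripted_model`, the simplified message dictionaries (which are not the Gemini message format), and the `max_turns` safeguard.

```python
def run_tool_loop(query, generate, tools_by_name, max_turns=5):
    """Hypothetical, model-agnostic version of the tool-call loop."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_turns):
        reply = generate(messages)
        messages.append({"role": "model", "content": reply})
        if reply["type"] == "text":
            # A plain text reply ends the loop
            return reply["text"], messages
        # Otherwise run the requested tool and feed the result back
        result = tools_by_name[reply["name"]](**reply["args"])
        messages.append(
            {"role": "user", "content": {"tool": reply["name"], "result": result}}
        )
    return None, messages  # gave up after max_turns

def scripted_model(messages):
    """Stub model: asks for one tool call, then answers using the tool result."""
    last = messages[-1]["content"]
    if isinstance(last, dict) and "result" in last:
        return {"type": "text", "text": f"Found: {last['result']}"}
    return {"type": "tool_call", "name": "lookup", "args": {"paper_id": "1310.7911v2"}}

answer, history = run_tool_loop(
    "Tell me about paper 1310.7911v2",
    scripted_model,
    {"lookup": lambda paper_id: f"info for {paper_id}"},
)
print(answer)
print(len(history))
```

The history ends with four entries: the user query, the model's tool request, the tool result, and the final text reply.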


### Chat Loop

In [None]:
def chat_loop():
    print("Type your queries or 'quit' to exit.")
    while True:
        try:
            query = input("\nQuery: ").strip()
            if query.lower() == 'quit':
                break
    
            process_query(query)
            print("\n")
        except Exception as e:
            print(f"\nError: {str(e)}")
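Because `chat_loop` reads from `input()`, it is awkward to exercise automatically. The sketch below is one way to make the same loop testable, and it also shows what "no memory across queries" means in practice: each query starts from a fresh one-message history. `run_chat` and `fresh_messages` are hypothetical names, not part of the notebook.

```python
def fresh_messages(query: str) -> list:
    """Build the per-query history the same way process_query starts it."""
    return [{"role": "user", "parts": [query]}]

def run_chat(read_line, handle_query) -> int:
    """Hypothetical chat_loop variant with injectable input and query handler."""
    handled = 0
    while True:
        query = read_line().strip()
        if query.lower() == "quit":
            break
        try:
            handle_query(query)
            handled += 1
        except Exception as e:  # keep the loop alive after a failed query
            print(f"Error: {e}")
    return handled

scripted_input = iter(["first question", "second question", "quit"])
history_sizes = []
count = run_chat(
    lambda: next(scripted_input),
    lambda q: history_sizes.append(len(fresh_messages(q))),
)
print(count, history_sizes)
```

Both queries start with a history of length 1, so nothing from the first question is visible when the second one is processed.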

Feel free to interact with the chatbot. Here's an example query:

- Search for 2 papers on "LLM interpretability"

To access the `papers` folder:

1. Click the `File` option on the top menu of the notebook.
2. Click `Open`.
3. Click `L3`.

In [20]:
chat_loop()

Type your queries or 'quit' to exit.
Calling tool search_papers with args {'max_results': 5.0, 'topic': 'computer'}
Results are saved in: papers\computer\papers_info.json
Here's the information for the papers found:


Calling tool extract_info with args {'paper_id': '1310.7911v2'}
Calling tool extract_info with args {'paper_id': 'math/9711204v1'}
Calling tool extract_info with args {'paper_id': '2208.00733v1'}
Calling tool extract_info with args {'paper_id': '2504.07020v1'}
Calling tool extract_info with args {'paper_id': '2403.03925v1'}
Here are some papers about computers I found:

* **Compact manifolds with computable boundaries:** This paper investigates conditions under which a co-computably enumerable closed set in a computable metric space is computable.
* **Aspects of Computability in Physics:** This paper reviews connections between physics and computation, exploring topics such as computational hardness of physical systems and quantum computation.
* **The Rise of Quantum Inte

<p style="background-color:#f7fff8; padding:15px; border-width:3px; border-color:#e0f0e0; border-style:solid; border-radius:6px"> 🚨
&nbsp; <b>Different Run Results:</b> The output generated by AI chat models can vary with each execution due to their dynamic, probabilistic nature. Don't be surprised if your results differ from those shown in the video.</p>

In the next lessons, you will move the tool definitions out of the chatbot and wrap them in an MCP server. Then you will create an MCP client inside the chatbot to make the chatbot MCP compatible.

## Resources

[Guide on how to implement tool use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview#how-to-implement-tool-use)