# Tool Use in AutoGen
Authors: [Jiale Liu](https://github.com/LeoLjl), [Linxin Song](https://linxins.net/), [Jieyu Zhang](https://jieyuz2.github.io/)

This notebook shows how to use tools in AutoGen. Given a query, a `ToolBuilder` retrieves the tools whose descriptions are most semantically similar to it.

## Preparations
To use all the tools in the library, install the requirements and obtain a Bing API key and a RapidAPI key, as described in the sections below.

In [None]:
# %pip install -r ../tools/requirements.txt

## Setup API endpoint

The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file. It first looks for an environment variable with the specified name, whose value must be a valid JSON string. If that variable is not found, it looks for a JSON file with the same name. It then filters the configurations by `filter_dict`.

The config list should look like this:
```python
config_list = [
    {
        'model': 'gpt-4',
        'api_key': '<your OpenAI API key here>',
    },  # OpenAI API endpoint for gpt-4
    {
        'model': 'gpt-4',
        'api_key': '<your Azure OpenAI API key here>',
        'base_url': '<your Azure OpenAI API base here>',
        'api_type': 'azure',
        'api_version': '2024-02-15-preview',
    },  # Azure OpenAI API endpoint for gpt-4
    {
        'model': 'gpt-4-32k',
        'api_key': '<your Azure OpenAI API key here>',
        'base_url': '<your Azure OpenAI API base here>',
        'api_type': 'azure',
        'api_version': '2024-02-15-preview',
    },  # Azure OpenAI API endpoint for gpt-4-32k
]
```

In [None]:
import autogen

config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4-1106-preview"],
    },
)
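
The lookup order described above can be pictured with a simplified, hypothetical stand-in. `load_config_list` below is not AutoGen's implementation; it only mirrors the documented behavior (environment variable first, then a file of the same name, then filtering). The `DEMO_CONFIG_LIST` variable and the placeholder keys are made up for illustration.

```python
import json
import os


def load_config_list(name, filter_dict=None):
    """Hypothetical sketch of the lookup order, not AutoGen's actual code."""
    raw = os.environ.get(name)
    if raw is not None:
        # 1. The environment variable must hold a valid JSON string.
        configs = json.loads(raw)
    else:
        # 2. Otherwise fall back to a JSON file with the same name.
        with open(name) as f:
            configs = json.load(f)
    if filter_dict:
        # 3. Keep a config only if, for every filter key, its value is allowed.
        configs = [
            c
            for c in configs
            if all(c.get(key) in allowed for key, allowed in filter_dict.items())
        ]
    return configs


# Assumed demo data: two placeholder configs stored in an environment variable.
os.environ["DEMO_CONFIG_LIST"] = json.dumps(
    [
        {"model": "gpt-4", "api_key": "<key>"},
        {"model": "gpt-4-1106-preview", "api_key": "<key>"},
    ]
)
demo_configs = load_config_list(
    "DEMO_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}
)
print([c["model"] for c in demo_configs])
```

If the filter matches no entry, the resulting list is empty, so it is worth checking that the model name in `filter_dict` actually appears in your configuration.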

## Configure Bing
For the web search tool (`perform_web_search`) to work, a Bing API key is needed. You can read more about how to get a key on the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) page.

Once you have your key, fill it in below.

In [18]:
import os

os.environ["BING_API_KEY"] = ""

## Configure RapidAPI key
Some tools in the `information_retrieval` category require access to RapidAPI. For these tools to work, you need to subscribe to two specific APIs ([link1](https://rapidapi.com/illmagination/api/youtube-captions-and-transcripts/), [link2](https://rapidapi.com/420vijay47/api/youtube-mp3-downloader2/)). Both APIs have a free pricing tier, so you need not worry about extra cost.

Once you have the API key, fill it in below.

In [19]:
import os

os.environ["RAPID_API_KEY"] = ""
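
Because both keys above are left as empty strings, it can help to check them before running the tools. The following is a minimal sketch, assuming the tools read their keys from the `BING_API_KEY` and `RAPID_API_KEY` environment variables set in the cells above; `missing_keys` is a hypothetical helper, not part of AutoGen.

```python
import os


def missing_keys(names, env=os.environ):
    """Return the names that are unset or set to an empty string."""
    return [name for name in names if not env.get(name)]


missing = missing_keys(["BING_API_KEY", "RAPID_API_KEY"])
if missing:
    print("Please fill in:", ", ".join(missing))
```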

## Step 1: Initiate ToolBuilder and retrieve tools
Now that everything is set up, we initiate a `ToolBuilder` object and retrieve tools. The `ToolBuilder` takes a user query, calculates the semantic similarity between the query and each tool description, and retrieves the `top_k` most similar tools.

Suppose we would like to solve the following task: Examine the video at https://www.youtube.com/watch?v=1htKBjuUWec. What does Teal'c say in response to the question "Isn't that hot?"

This task requires video transcription skills, so the query can be: Expertise in utilizing YouTube's API or similar services to extract video captions or subtitles.

In [5]:
from autogen.agentchat.contrib.tool_retriever import ToolBuilder

builder = ToolBuilder(
    corpus_path="../tools/tool_description.tsv",
    retriever="all-mpnet-base-v2",
)
tool_query = "Expertise in utilizing YouTube's API or similar services to extract video captions or subtitles."
tools = builder.retrieve(tool_query, top_k=3)
print(tools)

['information_retrieval get_youtube_caption Retrieves the captions for a YouTube video.', 'information_retrieval youtube_download Downloads a YouTube video and returns the download link.', 'information_retrieval perform_web_search Perform a web search using Bing API.']
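
The ranking idea behind this retrieval can be sketched without any dependencies. The example below is not `ToolBuilder`'s implementation: it swaps the sentence-embedding model for plain word-count vectors and ranks the three descriptions printed above by cosine similarity to the query. The helper names `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in 'embedding': lowercase word counts instead of a neural model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, corpus, top_k=3):
    """Return the top_k corpus entries most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:top_k]


corpus = [
    "information_retrieval perform_web_search Perform a web search using Bing API.",
    "information_retrieval youtube_download Downloads a YouTube video and returns the download link.",
    "information_retrieval get_youtube_caption Retrieves the captions for a YouTube video.",
]
query = "Expertise in utilizing YouTube's API or similar services to extract video captions or subtitles."
top_tools = retrieve(query, corpus, top_k=2)
print(top_tools)
```

Word counts only capture exact word overlap, whereas a sentence-embedding model such as `all-mpnet-base-v2` can also match paraphrases like "subtitles" and "captions".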


## Step 2: Get tool signatures and bind them to agents


In [17]:
import os

from autogen.tool_utils import get_full_tool_description

tool_root = "../tools"

descriptions = []
for tool in tools:
    category, tool_name = tool.split(" ")[0], tool.split(" ")[1]
    tool_path = os.path.join(tool_root, category, f"{tool_name}.py")
    descriptions.append(get_full_tool_description(tool_path))

assistant = autogen.AssistantAgent(
    name="Information retriever",
    llm_config={
        "config_list": config_list,
    },
    max_consecutive_auto_reply=5,
)
proxy = autogen.UserProxyAgent(
    name="User proxy",
    # `content` can be None for some messages, so fall back to an empty string
    is_termination_msg=lambda x: (x.get("content") or "").rstrip().endswith("TERMINATE"),
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding"},  # This will be updated later
)

# Bind the tools to the assistant
builder.bind(assistant, "\n\n".join(descriptions))

# Bind the tools to user proxy
proxy = builder.bind_user_proxy(proxy, tool_root)

Under the hood, the assistant's system message is updated with instructions on how to use the tools by writing Python code. The user proxy is equipped with an executor that can run tool-related code. This feature is based on [User Defined Functions](https://microsoft.github.io/autogen/docs/topics/code-execution/user-defined-functions) and currently cannot run in Docker.
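
The loop in this step relies on each retrieved string having the form `category tool_name description`, as in the output of Step 1. A minimal sketch of that mapping from a retrieved string to a source file path follows; `tool_path` is a hypothetical helper written for illustration, and the directory layout is assumed from the code above.

```python
import os


def tool_path(entry, tool_root="../tools"):
    """Map 'category tool_name description...' to '<tool_root>/<category>/<tool_name>.py'."""
    # Split at most twice so that spaces inside the description are left alone.
    category, tool_name, _description = entry.split(" ", 2)
    return os.path.join(tool_root, category, f"{tool_name}.py")


entry = "information_retrieval get_youtube_caption Retrieves the captions for a YouTube video."
print(tool_path(entry))
```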

## Step 3: Let the agent finish the task

In [21]:
PROMPT = """Today's date is 2024-04-14.
# Task
You need to solve the below question given by a user.

# Question
{question}
"""
question = """Examine the video at https://www.youtube.com/watch?v=1htKBjuUWec.

What does Teal'c say in response to the question "Isn't that hot?"
""".strip()

chat = proxy.initiate_chat(assistant, message=PROMPT.format(question=question))

User proxy (to Information retriever):

Today's date is 2024-04-14.
# Task
You need to solve the below question given by a user.

# Question
Examine the video at https://www.youtube.com/watch?v=1htKBjuUWec.

What does Teal'c say in response to the question "Isn't that hot?"


--------------------------------------------------------------------------------
Information retriever (to User proxy):

To solve this task, I will use the `get_youtube_caption` function to retrieve the captions for the YouTube video in question and then search the captions for the line where someone asks "Isn't that hot?" and find Teal'c's response to that question. I'll proceed step by step:

1. Retrieve the captions of the YouTube video using its video ID (the part of the URL after "watch?v=").
2. Scan through the captions to find the line containing the question.
3. Extract Teal'c's response that immediately follows the question.

Let's start with the first step.

```python
# filename: get_yo
```

If we watch the YouTube video ourselves, we find that the answer is correct. Given the relevant APIs, language models can perform vision- and audio-related tasks, which can lead to more versatile and useful agents.