Refine documentation for Tools (huggingface#23266)
* refine documentation for Tools

* + one bugfix
sgugger authored and novice03 committed Jun 23, 2023
1 parent 890a1bc commit f57e24e
Showing 4 changed files with 45 additions and 47 deletions.
2 changes: 1 addition & 1 deletion docs/source/en/_toctree.yml
@@ -102,7 +102,7 @@
  - local: community
    title: Community resources
  - local: custom_tools
-   title: Custom Tools
+   title: Custom Tools and Prompts
  - local: troubleshooting
    title: Troubleshoot
  title: Developer guides
13 changes: 8 additions & 5 deletions docs/source/en/custom_tools.mdx
@@ -124,7 +124,9 @@ what the tool does and the second states what input arguments and return values

A good tool name and tool description are very important for the agent to correctly use it. Note that the only
information the agent has about the tool is its name and description, so one should make sure that both
-are precisely written and match the style of the existing tools in the toolbox.
+are precisely written and match the style of the existing tools in the toolbox. In particular, make sure the description
+mentions all the arguments expected by name in code-style, along with the expected type and a description of what they
+are.

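For illustration, a minimal sketch of a tool whose description names its argument in code-style might look like the following (the class mirrors the model-downloads tool built later in this guide; the exact wording is only an example):

```python
from huggingface_hub import list_models

from transformers import Tool


class HFModelDownloadsTool(Tool):
    # The name the agent uses to call the tool in the code it generates.
    name = "model_download_counter"
    # The description names the expected argument (`task`) in code-style, gives its type,
    # and states what the tool returns.
    description = (
        "This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub. "
        "It takes an input named `task`, which is a string such as 'text-classification' or "
        "'depth-estimation', and it returns the name of the checkpoint."
    )

    def __call__(self, task: str):
        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
```
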
<Tip>

@@ -137,7 +139,7 @@ The third part includes a set of curated examples that show the agent exactly wh
for what kind of user request. The large language models empowering the agent are extremely good at
recognizing patterns in a prompt and repeating the pattern with new data. Therefore, it is very important
that the examples are written in a way that maximizes the likelihood of the agent generating correct,
executable code in practice.

Let's have a look at one example:

@@ -466,7 +468,8 @@ The set of curated tools already has an `image_transformer` tool which is hereby

Overwriting existing tools can be beneficial if we want to use a custom tool exactly for the same task as an existing tool
because the agent is well-versed in using the specific task. Beware that the custom tool should follow the exact same API
-as the overwritten tool in this case.
+as the overwritten tool in this case, or you should adapt the prompt template to make sure all examples using that
+tool are updated.

</Tip>

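As a sketch of what a replacement looks like in practice, an agent instantiated with a tool that advertises the same name will swap it into its toolbox (assuming `additional_tools` overrides same-named tools; the `diffusers/controlnet-canny-tool` Space is used purely as an illustration):

```python
from transformers import HfAgent, load_tool

# Illustrative: this Hub tool exposes the same name as the curated
# `image_transformer` tool, so it takes its place in the agent's toolbox.
controlnet_transformer = load_tool("diffusers/controlnet-canny-tool")

agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[controlnet_transformer],
)
```
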
@@ -627,14 +630,14 @@ In order to let others benefit from it and for simpler initialization, we recomm
namespace. To do so, just call `push_to_hub` on the `tool` variable:

```python
-tool.push_to_hub("lysandre/hf-model-downloads")
+tool.push_to_hub("hf-model-downloads")
```

You now have your code on the Hub! Let's take a look at the final step, which is to have the agent use it.

#### Having the agent use the tool

-We now have our tool that lives on the Hub which can be instantiated as such:
+We now have our tool that lives on the Hub which can be instantiated as such (change the user name for your tool):

```python
from transformers import load_tool
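
Put together, a rough end-to-end sketch might look as follows (the `lysandre/` namespace is a placeholder; use the namespace you pushed the tool to):

```python
from transformers import HfAgent, load_tool

# Placeholder namespace: replace with your own user name on the Hub.
model_download_tool = load_tool("lysandre/hf-model-downloads")

agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[model_download_tool],
)
agent.run("Which model has the most downloads for the 'text-to-video' task on the Hugging Face Hub?")
```
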
73 changes: 33 additions & 40 deletions docs/source/en/transformers_agents.mdx
@@ -19,7 +19,8 @@ can vary as the APIs or underlying models are prone to change.

</Tip>

-Transformers version v4.29.0, building on the concept of *tools* and *agents*.
+Transformers version v4.29.0 builds on the concept of *tools* and *agents*. You can play with it in
+[this colab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj).

In short, it provides a natural language API on top of transformers: we define a set of curated tools and design an
agent to interpret natural language and to use these tools. It is extensible by design; we curated some relevant tools,
@@ -60,10 +61,19 @@ agent.run(
## Quickstart

Before being able to use `agent.run`, you will need to instantiate an agent, which is a large language model (LLM).
-We recommend using the [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) checkpoint as it works very well
-for the task at hand and is open-source, but please find other examples below.
+We provide support for OpenAI models as well as open-source alternatives from BigCode and OpenAssistant. The OpenAI
+models perform better (but require you to have an OpenAI API key, so they cannot be used for free); Hugging Face is
+providing free access to endpoints for BigCode and OpenAssistant models.

-Start by logging in to have access to the Inference API:
+To use OpenAI models, you instantiate an [`OpenAiAgent`]:

+```py
+from transformers import OpenAiAgent
+
+agent = OpenAiAgent(model="text-davinci-003", api_key="<your_api_key>")
+```

+To use BigCode or OpenAssistant, start by logging in to have access to the Inference API:

```py
from huggingface_hub import login
@@ -76,17 +86,22 @@ Then, instantiate the agent
```py
from transformers import HfAgent

+# Starcoder
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
+# StarcoderBase
+# agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoderbase")
+# OpenAssistant
+# agent = HfAgent(url_endpoint="https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")
```

-This is using the inference API that Hugging Face provides for free at the moment if you have your inference
+This is using the inference API that Hugging Face provides for free at the moment. If you have your own inference
endpoint for this model (or another one) you can replace the URL above with your URL endpoint.

<Tip>

-We're showcasing StarCoder as the default in the documentation as the model is free to use and performs admirably well
-on simple tasks. However, the checkpoint doesn't hold up when handling more complex prompts. If you're facing such an
-issue, we recommend trying out the OpenAI model which, while sadly not open-source, performs better at this given time.
+StarCoder and OpenAssistant are free to use and perform admirably well on simple tasks. However, the checkpoints
+don't hold up when handling more complex prompts. If you're facing such an issue, we recommend trying out the OpenAI
+model which, while sadly not open-source, performs better at this given time.

</Tip>

@@ -97,7 +112,7 @@ You're now good to go! Let's dive into the two APIs that you now have at your di
The single execution method is when using the [`~Agent.run`] method of the agent:

```py
-agent.run("Draw me a picture of rivers and lakes")
+agent.run("Draw me a picture of rivers and lakes.")
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png" width=200>
@@ -107,7 +122,7 @@ can perform one or several tasks in the same instruction (though the more comple
the agent is to fail).

```py
-agent.chat("Draw me a picture of the sea then transform the picture to add an island.")
+agent.run("Draw me a picture of the sea then transform the picture to add an island")
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sea_and_island.png" width=200>
@@ -118,15 +133,16 @@ agent.chat("Draw me a picture of the sea then transform the picture to add an is
Every [`~Agent.run`] operation is independent, so you can run it several times in a row with different tasks.

Note that your `agent` is just a large-language model, so small variations in your prompt might yield completely
-different results. It's important to explain as clearly as possible the task you want to perform.
+different results. It's important to explain as clearly as possible the task you want to perform. We go more in-depth
+on how to write good prompts [here](custom_tools#writing-good-user-inputs).

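For instance, two consecutive runs share no state:

```py
# Each call is independent: the second run knows nothing about the first one.
agent.run("Draw me a picture of rivers and lakes.")
agent.run("Draw me a picture of the sea.")
```
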
If you'd like to keep a state across executions or to pass non-text objects to the agent, you can do so by specifying
variables that you would like the agent to use. For example, you could generate the first image of rivers and lakes,
and ask the model to update that picture to add an island by doing the following:

```python
-picture = agent.run("Draw me a picture of rivers and lakes")
-updated_picture = agent.chat("Take that `picture` and add an island to it", picture=picture)
+picture = agent.run("Generate a picture of rivers and lakes.")
+updated_picture = agent.run("Transform the image in `picture` to add an island to it.", picture=picture)
```

<Tip>
@@ -155,7 +171,7 @@ agent.run("Draw me a picture of the `prompt`", prompt="a capybara swimming in th
The agent also has a chat-based approach, using the [`~Agent.chat`] method:

```py
-agent.chat("Draw me a picture of rivers and lakes")
+agent.chat("Generate a picture of rivers and lakes")
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rivers_and_lakes.png" width=200>
@@ -197,6 +213,8 @@ agent.chat("Draw me a picture of rivers and lakes", remote=True)

### What's happening here? What are tools, and what are agents?

+<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/diagram.png">

#### Agents

The "agent" here is a large language model, and we're prompting it so that it has access to a specific set of tools.
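
One way to peek at that set of tools is the agent's `toolbox` attribute, a mapping from tool name to tool (a small sketch):

```py
# The prompt sent to the LLM is built from each tool's name and description.
for name, tool in agent.toolbox.items():
    print(f"- {name}: {tool.description}")
```
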
@@ -270,6 +288,7 @@ directly with the agent. We've added a few
- **Text downloader**: to download a text from a web URL
- **Text to image**: generate an image according to a prompt, leveraging stable diffusion
- **Image transformation**: modify an image given an initial image and a prompt, leveraging instruct pix2pix stable diffusion
+- **Text to video**: generate a small video according to a prompt, leveraging damo-vilab

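These tools can also be called on their own, outside of an agent. As a sketch, assuming `load_tool` accepts the Space id of the remote text-to-image tool:

```py
from transformers import load_tool

# Load the remote text-to-image tool directly and call it like a function.
image_generator = load_tool("huggingface-tools/text-to-image")
image = image_generator("A picture of rivers and lakes")
```
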
The text-to-image tool we have been using since the beginning is a remote tool that lives in
[*huggingface-tools/text-to-image*](https://huggingface.co/spaces/huggingface-tools/text-to-image)! We will
@@ -278,32 +297,6 @@ continue releasing such tools on this and other organizations, to further superc
The agents have by default access to tools that reside on `huggingface-tools`.
We explain how you can write and share your tools, as well as leverage any custom tool that resides on the Hub, in the [following guide](custom_tools).

-### Leveraging different agents
-
-We showcase here how to use the [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) model as an LLM, but
-it isn't the only model available. We also support the OpenAssistant model and OpenAI's davinci models (3.5 and 4).
-
-We're planning on supporting local language models in an ulterior version.
-
-The tools defined in this implementation are agnostic to the agent used; we are showcasing the agents that work with
-our prompts below, but the tools can also be used with Langchain, Minichain, or any other Agent-based library.
-
-#### Example code for the OpenAssistant model
-
-```py
-from transformers import HfAgent
-
-agent = HfAgent(url_endpoint="https://OpenAssistant/oasst-sft-1-pythia-12b", token="<HF_TOKEN>")
-```
-
-#### Example code for OpenAI models
-
-```py
-from transformers import OpenAiAgent
-
-agent = OpenAiAgent(model="text-davinci-003", api_key="<API_KEY>")
-```

### Code generation

So far we have shown how to use the agents to perform actions for you. However, the agent is only generating code
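
For example, passing `return_code=True` asks the agent to return the generated code instead of executing it (a sketch; the flag appears in the `chat` signature shown in the next file):

```py
# Retrieve the code the agent would run, without executing it.
code = agent.run("Draw me a picture of rivers and lakes", return_code=True)
print(code)
```
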
4 changes: 3 additions & 1 deletion src/transformers/tools/agents.py
@@ -264,7 +264,9 @@ def chat(self, task, *, return_code=False, remote=False, **kwargs):
"""
prompt = self.format_prompt(task, chat_mode=True)
result = self.generate_one(prompt, stop=["Human:", "====="])
-self.chat_history = prompt + result + "\n"
+self.chat_history = prompt + result
+if not self.chat_history.endswith("\n"):
+    self.chat_history += "\n"
explanation, code = clean_code_for_chat(result)

print(f"==Explanation from the agent==\n{explanation}")
