# Welcome to TapeAgents!

**TapeAgents** is a framework to build, debug, serve and optimize your AI agent. It takes a holistic view of the agent lifecycle and aims to support you at all stages. The main distinguishing feature of the framework is that, by design, a **TapeAgent** creates its **Tape**: a comprehensive semantic log of the agent's session that greatly facilitates auditing, debugging, finetuning, agent optimization, etc.

In this tutorial you will learn:
- how to create TapeAgents using the low-level API
- how to run and resume TapeAgents
- how to have one TapeAgent reuse another TapeAgent's tape as training data

In upcoming versions of this tutorial you will also learn: 
- how to make a team TapeAgent with subagents
- how to build TapeAgents using available high-level APIs
- how to build a TapeAgent that streams partial steps

Other tutorials and examples will cover:
- code execution and browser use
- finetuning
- the TapeAgents apps (Studio and Browser)

# 0. Setup

#### 0. Install conda
https://conda.io/projects/conda/en/latest/user-guide/install/index.html


#### 1. All commands assume execution from the tapeagents directory.
```bash
cd tapeagents/
```



#### 2. Create and activate conda environment with Python 3.10:
```bash
conda create -y -n tapeagents python=3.10
conda activate tapeagents
```



#### 3. Set Jupyter Notebook kernel to your newly created `tapeagents` conda environment
https://code.visualstudio.com/docs/datascience/jupyter-notebooks#_create-or-open-a-jupyter-notebook




#### 4. Install tapeagents and its dependencies
```bash
pip install -e .
```



#### 5. Set your LLM API keys and other minor things

In [None]:
import os

# os.environ["OPENAI_API_KEY"] = ""  # put your https://platform.openai.com/ key here
# os.environ["OPENAI_ORGANIZATION"] = ""  # optional
# os.environ["TOGETHER_API_KEY"] = "" # put your https://together.ai/ key here
os.environ["TRANSFORMERS_NO_ADVISORY_WARNINGS"] = "1"
today = "2024-09-17"


# 1. Your first TapeAgent

In this section we will build the simplest possible "hello world" agent. We will then go through all the new concepts that you need to know to understand the code. This section is quite long, but with the solid foundation you acquire here other TapeAgent tutorials will be easy to process.

Without further ado, here's the code!

In [None]:
from tapeagents.agent import Agent, Node
from tapeagents.core import Jump, Prompt
from tapeagents.dialog import AssistantStep, Dialog, UserStep
from tapeagents.llms import LLMStream, LiteLLM

llm = LiteLLM(model_name="gpt-4o")


def make_prompt(agent, tape: Dialog) -> Prompt:
    return Prompt(messages=tape.as_prompt_messages())


def generate_steps(agent, dialog, llm_stream: LLMStream):
    yield AssistantStep(content=llm_stream.get_text())
    yield Jump(next_node=0)


flow = [Node().with_prompt(make_prompt).with_generate_steps(generate_steps)]
agent = Agent[Dialog].create(llm, flow=flow)
start_tape = Dialog() + [UserStep(content="Tell me about Vulcan in 3 sentences")]
final_tape = agent.run(start_tape).get_final_tape()
print(final_tape.model_dump_json(indent=2))

# Please ignore "Models won't be available and only tokenizers, configuration" warning from Huggingface Transformers


Now let's learn about tapes, steps, prompts, llm streams, nodes and agents.

### Tape

The central concept in TapeAgents is the `Tape`, a comprehensive semantic-level log of the agent's session. A `Tape` contains a context and a sequence of `Step` objects. As you can see, a TapeAgent runs by adding steps (such as `UserStep` or `AssistantStep`) to the _tape_. This example uses the `Dialog` tape, which is a basic tape for user-assistant conversations. Let's see which steps are possible in a `Dialog` tape.

In [None]:
# We use Python generics to instantiate many different Tape types by
# specifying different Context and Step types. In the output of this cell,
# look at Union[UserStep, AssistantStep, ...]
# for the list of possible step types in the Dialog tape.
Dialog


Some of these steps should be familiar to you. `UserStep`, `AssistantStep`, `SystemStep` and `ToolResult` correspond to `role=user`, `role=assistant`, `role=system` and `role=tool` LLM API messages respectively. `ToolCalls` and `AssistantThought` correspond to assistant messages where the LLM requests a tool call or produces an intermediate thought that is not meant to be shown to the user. `Jump` and `Pass` are TapeAgent's internal steps that control which node the agent should run at the next iteration (more on this below).

### Prompt format; LLMs

We use the industry-standard "chat.completions" prompt format in TapeAgents: a list of user/assistant/system/tool messages plus tool schemas.

In [None]:
# Almost all classes in TapeAgents are Pydantic base models.
# This allows easy validation, serialization and introspection. For example,
# here we are able to list all the fields in the Prompt model.
Prompt.model_fields


The LLMs in TapeAgents take a `Prompt` and return an `LLMStream` object. The `LLMStream` object can be used both to fast-forward to the complete response text and to stream partial outputs.

In [None]:
from tapeagents.llms import LLAMA

llama = LLAMA(
    base_url="https://api.together.xyz",
    model_name="meta-llama/Meta-Llama-3-70B-Instruct-Turbo",
    tokenizer_name="meta-llama/Meta-Llama-3-70B-Instruct",
    # you must use stream=True if you wish to have message chunks in your LLM events
    stream=True,
)
# Streaming
prompt = Prompt(messages=[UserStep(content="Write hello world in Java")])
for event in llama.generate(prompt):
    print(event.chunk, end="")
# No streaming
# (note: a Prompt object cannot be used for more than one LLM call in TapeAgents)
prompt = Prompt(messages=[UserStep(content="Write hello world in C")])
print("\n" + "-" * 30)
print(llama.generate(prompt).get_text())
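
To make the two consumption modes concrete, here is a standalone sketch that does not import TapeAgents. `ToyEvent`, `ToyStream` and `fake_llm` are hypothetical stand-ins, not the library's `LLMEvent` and `LLMStream`. The sketch only assumes that a stream wraps a generator of events, where partial events carry a text chunk and the final event carries the complete text.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ToyEvent:
    chunk: Optional[str] = None       # partial output, set on intermediate events
    completion: Optional[str] = None  # complete text, set only on the final event


class ToyStream:
    """Hypothetical stand-in for a stream of LLM events."""

    def __init__(self, events: Iterator[ToyEvent]):
        self._events = events

    def __iter__(self) -> Iterator[ToyEvent]:
        # Streaming mode: the caller consumes events one by one.
        return self._events

    def get_text(self) -> str:
        # Fast-forward mode: drain the events until the final one arrives.
        for event in self._events:
            if event.completion is not None:
                return event.completion
        raise ValueError("stream ended without a completion")


def fake_llm(text: str) -> Iterator[ToyEvent]:
    """Emit the text word by word, then the complete text."""
    for word in text.split(" "):
        yield ToyEvent(chunk=word + " ")
    yield ToyEvent(completion=text)


# Streaming: collect the chunks as they arrive.
streamed = "".join(e.chunk for e in ToyStream(fake_llm("hello tape agents")) if e.chunk)
# No streaming: skip straight to the complete text.
full = ToyStream(fake_llm("hello tape agents")).get_text()
print(repr(streamed), repr(full))
```

Because both modes consume the same underlying generator, a stream in this sketch can be read only once, either chunk by chunk or in one go.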


In the hello-world agent from the beginning of this section we used the easiest way to create a prompt from a tape: `tape.as_prompt_messages()`. Under the hood this method calls the `llm_dict()` method of every non-control step in the tape to create the prompt:

In [None]:
print((user := UserStep(content="hi AI!")).llm_dict())
print((assistant := AssistantStep(content="hello human")).llm_dict())
print(Dialog(steps=[user, assistant]).as_prompt_messages())


A key priority in TapeAgents is making use of the data that running the agent generates. To make this possible, some TapeAgent LLMs know how to make their finetuning data:

In [None]:
from tapeagents.core import LLMMessage


prompt = Prompt(messages=[UserStep(content="Say bla 3 times and foo 2 times")])
text = llama.make_training_text(
    prompt=prompt, completion=LLMMessage(content="Write a Python program that says bla 3 times and foo 2 times")
)
print("--- ALL TEXT ---")
print(text.text)
print("--- PREDICTED CHARACTERS ---")
print(text.completion_str)


### Node

A node represents an uninterruptible atom of TapeAgent's computation. When TapeAgents runs a node, it uses the node's two main functions: `make_prompt` and `generate_steps`. To build a node, you can replace the default implementations of these functions using the `with_prompt` and `with_generate_steps` methods, or you can subclass `Node` and override these functions. Note that `generate_steps` must be a generator, a design choice we made to make TapeAgents a streaming-friendly framework.

Let's see what the node from the above example can do.

In [None]:
from tapeagents.llms import LLMEvent


def make_prompt(agent, tape: Dialog) -> Prompt:
    return Prompt(messages=tape.as_prompt_messages())


def generate_steps(agent, dialog, llm_stream: LLMStream):
    yield AssistantStep(content=llm_stream.get_text())
    yield Jump(next_node=0)


node = Node().with_prompt(make_prompt).with_generate_steps(generate_steps)

# Let's run "make_prompt" in isolation.
print("")
print(node.make_prompt(agent=None, tape=Dialog(steps=[UserStep(content="Hi, AI!")])))


# Now, let's run "generate_steps" in isolation.
# We need to construct a fake LLMStream to do that.
def _generator():
    yield LLMEvent(completion=LLMMessage(content="Hello, human!"))


stream = LLMStream(_generator(), Prompt())
print(list(node.generate_steps(agent=None, tape=Dialog(), llm_stream=stream)))

# When the agent runs a node, it is equivalent to the following three steps:
# Step 1: make a prompt
start_tape = Dialog(steps=[UserStep(content="Hi, AI!")])
prompt = node.make_prompt(agent, start_tape)
# Step 2: construct the LLMStream from the prompt (happens inside the agent)
stream = llama.generate(prompt)
# Step 3: generate steps that the agent will then add to the tape
for step in node.generate_steps(agent, start_tape, stream):
    print(step.model_dump_json(indent=2))


### Agent and its flow

A TapeAgent iteratively runs the nodes from its **flow** and appends the steps generated by each node to the tape. To select which node to run next, a TapeAgent internally computes the **tape view** object.


In [None]:
from tapeagents.view import TapeViewStack

# The "top" view in the tape view stack is the view of the current agent. Initially `top.next_node` is 0.
tape1 = Dialog(steps=[UserStep(content="Hi, AI!")])
print(TapeViewStack.compute(tape1).top.next_node)
# When the agent computes the view, it bumps up `top.next_node` every time it encounters a step with a new `prompt_id`.
# The new prompt_id on the tape signals to the agent that the current node has run.
tape2 = Dialog(steps=[UserStep(content="Hi, AI!"), AssistantStep(prompt_id="123", content="AI here, how I can help?")])
print(TapeViewStack.compute(tape2).top.next_node)
# The Jump step on the tape changes `top.next_node` to the value of the `next_node` field in the Jump step.
tape3 = Dialog(
    steps=[
        UserStep(content="Hi, AI!"),
        AssistantStep(prompt_id="123", content="AI here, how I can help?"),
        Jump(next_node=0),
    ]
)
print(TapeViewStack.compute(tape3).top.next_node)
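
The two rules described in the comments above (a new `prompt_id` bumps the node index, a `Jump` overrides it) can be sketched without the library. `ToyStep` and `compute_next_node` are hypothetical names for illustration only. The real `TapeViewStack` also tracks subagents and other state that this simplified sketch ignores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyStep:
    kind: str                        # "user", "assistant" or "jump"
    prompt_id: Optional[str] = None  # id of the prompt that produced the step
    next_node: Optional[int] = None  # only meaningful for "jump" steps


def compute_next_node(steps: list[ToyStep]) -> int:
    """Replay the steps and return the index of the node to run next."""
    next_node = 0
    last_prompt_id = None
    for step in steps:
        if step.kind == "jump":
            # A jump overrides the node index explicitly.
            next_node = step.next_node
        elif step.prompt_id is not None and step.prompt_id != last_prompt_id:
            # A new prompt_id signals that the current node has run.
            next_node += 1
        if step.prompt_id is not None:
            last_prompt_id = step.prompt_id
    return next_node


user = ToyStep(kind="user")
assistant = ToyStep(kind="assistant", prompt_id="123")
jump = ToyStep(kind="jump", prompt_id="123", next_node=0)

print(compute_next_node([user]))                   # 0
print(compute_next_node([user, assistant]))        # 1
print(compute_next_node([user, assistant, jump]))  # 0
```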


By default the agent stops after the most recently run node has produced an `Action` step. Action steps are the steps by which the agent requests information from the environment. For example, `AssistantStep` is an `Action` because it indicates that the agent awaits the user's response, and `ToolCalls` is an action that requests tool call results. Let's look at all possible steps in the `Dialog` tape and see which of them are actions, observations and thoughts.

In [13]:
from tapeagents.core import Action, Pass, Thought, Observation
from tapeagents.dialog import AssistantThought, ToolCalls, ToolResult

assert all([issubclass(step_class, Action) for step_class in [AssistantStep, ToolCalls]])
assert all([issubclass(step_class, Thought) for step_class in [AssistantThought, Jump, Pass]])
assert all([issubclass(step_class, Observation) for step_class in [UserStep, ToolResult]])


Now we are ready to look at a simplified summary of the cornerstone `agent.run` algorithm.

1. Compute the new tape view
2. Choose the active agent (more on multi-agent TapeAgents later)
3. Choose the active node
4. Run the node and add steps on the tape
5. If the last node yielded an action, then stop, else repeat.

`agent.run` returns an `AgentStream` object which allows iterating through the agent's steps (or partial steps when streaming) and fast-forwarding to the complete new tape with `get_final_tape`.
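
The five-step summary above can be sketched as a small standalone loop. This is a simplified illustration under stated assumptions, not the library's implementation: steps are plain dicts, there is a single agent, no LLM is called, and `compute_next_node`, `echo_node`, `no_jump_node` and `run` are hypothetical names.

```python
import itertools

_prompt_counter = itertools.count()


def compute_next_node(tape):
    """Replay the tape to find the index of the node to run next."""
    next_node, last_prompt_id = 0, None
    for step in tape:
        prompt_id = step.get("prompt_id")
        if step["kind"] == "jump":
            next_node = step["next_node"]
        elif prompt_id is not None and prompt_id != last_prompt_id:
            next_node += 1  # a new prompt_id means a node has run
        if prompt_id is not None:
            last_prompt_id = prompt_id
    return next_node


def echo_node(tape):
    """A toy node: answer the last user step, then jump back to node 0."""
    last_user = [s for s in tape if s["kind"] == "user"][-1]
    yield {"kind": "assistant", "content": "You said: " + last_user["content"], "is_action": True}
    yield {"kind": "jump", "next_node": 0, "is_action": False}


def no_jump_node(tape):
    """A toy node that answers but never jumps back."""
    yield {"kind": "assistant", "content": "Hello!", "is_action": True}


def run(flow, tape):
    tape = list(tape)
    while True:
        index = compute_next_node(tape)  # steps 1-3, with a single agent
        if index >= len(flow):
            raise IndexError(f"next node {index} is past the end of the flow")
        prompt_id = f"prompt-{next(_prompt_counter)}"
        new_steps = [dict(step, prompt_id=prompt_id) for step in flow[index](tape)]
        tape.extend(new_steps)  # step 4
        if any(step["is_action"] for step in new_steps):  # step 5
            return tape


user = {"kind": "user", "content": "hi"}
tape = run([echo_node], [user])
tape = run([echo_node], tape + [{"kind": "user", "content": "again"}])
print([step["kind"] for step in tape])
```

In this sketch a flow made of `no_jump_node` alone answers once, but a second `run` on the extended tape raises `IndexError`, because the node index has moved past the only node. The real framework may report the problem differently.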

#### Converse with a TapeAgent

In [None]:
tape_to_continue = final_tape + [UserStep(content="No, I mean Vulcan the company")]
continued_tape = agent.run(tape_to_continue).get_final_tape()
print(continued_tape.model_dump_json(indent=2))


Note that the agent is able to continue talking to you thanks to the `Jump(next_node=0)` step that `generate_steps` produced. If you remove this step as an exercise, the agent will crash because there is only one node in its flow.

#### Tape rendering

LLM agents create a lot of data that can be overwhelming to process. In TapeAgents we render the tape with the associated prompts and completions into a more readable HTML for you. To make this work, we store prompts and completions in an SQLite database every time you call `agent.run()`.

Here's how to use tape rendering in the notebook:

In [None]:
from tapeagents.rendering import PrettyRenderer, render_tape_with_prompts
from IPython.display import HTML

HTML(render_tape_with_prompts(continued_tape, PrettyRenderer()))


# 2. Your TapeAgent with planning and tools

Let's build a TapeAgent that plans and acts. We will be using OpenAI function calling capabilities in this example.

In [None]:
from tapeagents.core import Jump
from tapeagents.dialog import SystemStep, AssistantThought, ToolCalls
from tapeagents.environment import ToolEnvironment
from tapeagents.runtime import main_loop
from tapeagents.examples.intro_tools import get_stock_ticker, get_stock_data

system_instruction = f"""
You will help the user to learn about financials of companies.
Use as many relevant tools as possible to include more details and facts in your responses.
Today is {today}.
"""
system_message = SystemStep(content=system_instruction)

env = ToolEnvironment([get_stock_ticker, get_stock_data])


def make_planning_prompt(agent, tape: Dialog) -> Prompt:
    guidance = "Write a natural language plan on how to use tools to help the user. Output a list of numbered items, like 1., 2., 3., etc."
    guidance_message = UserStep(content=guidance)
    return Prompt(
        messages=[system_message] + tape.as_prompt_messages() + [guidance_message], tools=env.get_tool_schema_dicts()
    )


def generate_planning_steps(agent, dialog, llm_stream: LLMStream):
    if content := getattr(llm_stream.get_message(), "content", None):
        yield AssistantThought(content=content)
    else:
        raise ValueError()


def make_prompt(agent, tape: Dialog) -> Prompt:
    guidance = "Follow the plan you created earlier. When you are done, respond to the user."
    guidance_message = UserStep(content=guidance)
    return Prompt(
        messages=[system_message] + tape.as_prompt_messages() + [guidance_message], tools=env.get_tool_schema_dicts()
    )


def generate_steps(agent, dialog, llm_stream: LLMStream):
    m = llm_stream.get_message()
    if content := getattr(m, "content", None):
        yield AssistantStep(content=content)
        yield Jump(next_node=0)
    elif tool_calls := getattr(m, "tool_calls", None):
        yield ToolCalls(tool_calls=tool_calls)
        yield Jump(next_node=1)
    else:
        raise ValueError()


agent1 = Agent.create(
    LiteLLM(model_name="gpt-4o", parameters={"temperature": 0.1}),
    flow=[
        Node().with_prompt(make_planning_prompt).with_generate_steps(generate_planning_steps),
        Node().with_prompt(make_prompt).with_generate_steps(generate_steps),
    ],
)

final_tape1 = None
for event in main_loop(agent1, Dialog() + [UserStep(content="Tell me about Vulcan in 3 sentences")], env):
    if ae := event.agent_event:
        if ae.step:
            print(ae.step.model_dump_json(indent=2))
        if ae.final_tape:
            final_tape1 = ae.final_tape
    if event.observation:
        print(event.observation.model_dump_json(indent=2))
assert final_tape1
HTML(render_tape_with_prompts(final_tape1, PrettyRenderer()))


The main new thing in this example is the environment. In the TapeAgents framework the environment responds to the agent's `Action` steps with `Observation` steps. We expect you to use the environment to encapsulate tool use, retrieval, code execution: everything that is non-deterministic, non-stationary, or computationally heavy. In contrast, we encourage you to implement the agent's deterministic decision-making in the `make_prompt` and `generate_steps` methods.

Here we use a pre-defined `main_loop` orchestrator to run the agent and the environment. `main_loop` is a generator of events that you can use as you wish. You are free to implement your own orchestration paradigm with fine-grained control over which actions get executed.
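
To illustrate what a custom orchestrator might look like, here is a minimal standalone sketch of an agent-environment loop. It does not use the TapeAgents API: `toy_agent`, `toy_environment` and `toy_main_loop` are hypothetical, steps are plain dicts, and the `max_loops` semantics here are an assumption that may differ from the library's `main_loop`.

```python
from typing import Callable, Iterator


def toy_agent(tape: list[dict]) -> dict:
    """Request a tool call once, then answer using the observation."""
    observations = [s for s in tape if s["kind"] == "observation"]
    if not observations:
        return {"kind": "tool_call", "tool": "add", "args": [2, 3]}
    return {"kind": "response", "content": f"The sum is {observations[-1]['result']}"}


def toy_environment(action: dict, tools: dict[str, Callable]) -> dict:
    """Execute the requested tool and wrap the result as an observation."""
    result = tools[action["tool"]](*action["args"])
    return {"kind": "observation", "result": result}


def toy_main_loop(agent, tools, tape, max_loops: int = 3) -> Iterator[dict]:
    """Alternate between the agent and the environment, yielding every event."""
    tape = list(tape)
    for _ in range(max_loops):
        action = agent(tape)
        tape.append(action)
        yield action
        if action["kind"] == "response":
            return  # the agent now waits for the user
        observation = toy_environment(action, tools)
        tape.append(observation)
        yield observation


tools = {"add": lambda a, b: a + b}
start = [{"kind": "user", "content": "What is 2 + 3?"}]
events = list(toy_main_loop(toy_agent, tools, start))
for event in events:
    print(event)
```

Keeping the tool execution inside the environment function means the agent function stays deterministic given the tape, which is the split the paragraph above recommends.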

# 3. Agent configuration, resumption

Let's try building a similar agent with an open-weights LLAMA3 70B model. Conveniently, [Together AI](https://together.ai/) offers API endpoints. You can create an account and get an API key with some free quota.

We've found that LLAMA3 function-calling is not yet battle-ready. We will use the structured output approach to make it call tools instead. We are also making this agent trainable by adding `make_completion` methods to each node. `Node.make_completion` defines how a node can reconstruct the LLM completion message that would be required to make the steps from the given tape at the given index. You can think of `Node.make_completion` as the inverse of `Node.generate_steps`.

When you run the code below, you might see a different behavior every time. Often the LLAMA-based agent gets stuck in a loop. We will look into how TapeAgents supports you in addressing this issue by
- tuning the prompt and resuming the agent exactly where it got stuck 
- producing training text from a different agent's tape 

In [None]:
import json
from tapeagents.core import Completion, Observation
from tapeagents.llms import LLAMA
from litellm.utils import ChatCompletionMessageToolCall
from litellm.utils import Function

env = ToolEnvironment([get_stock_ticker, get_stock_data])

system_instruction = f"""
You will help the user to learn about financials of companies.
Use as many relevant tools as possible to include more details and facts in your responses.
Today is {today}.

You have access to the following tools: {env.get_tool_schema_dicts()}"""

planning_guidance = "Write a natural language plan on how to use tools to help the user. Output a list of numbered items, like 1., 2., 3., etc."

call_or_respond_guidance = """
Follow the plan you created earlier. When you are done, respond to the user.
If you want to call one or several tools, output JSON like this
{"kind": "tool_call", "tool_name": "...", "parameters": "... unquoted parameters json ..."}
If you have called all the tools in the plan, respond to the user with the JSON of the form
{"kind": "response", "content": "... your response ... "}.
Output ONE JSON OBJECT ONLY PER LINE ONLY AND NOTHING ELSE.
"""

def make_planning_prompt(agent, tape: Dialog) -> Prompt:
    system_message = SystemStep(content=system_instruction)
    guidance_message = UserStep(content=agent.templates["planning"])
    return Prompt(messages=[system_message] + tape.as_prompt_messages() + [guidance_message])


def generate_planning_steps(agent, dialog, llm_stream: LLMStream):
    if content := getattr(llm_stream.get_message(), "content", None):
        yield AssistantThought(content=content)
    else:
        raise ValueError()


def make_planning_completion(agent, tape: Dialog, index: int) -> Completion:
    if not isinstance(current := tape[index], AssistantThought):
        raise ValueError()
    return Completion(role="assistant", content=current.content)


def _llm_message_content(step: AssistantStep | ToolCalls):
    """Helper function to make both the prompt and the target completion"""
    match step:
        case AssistantStep():
            return json.dumps({"kind": "response", "content": step.content})
        case ToolCalls():
            content = ""
            for tool_call in step.tool_calls:
                if content:
                    content += "\n"
                content += json.dumps(
                    {
                        "kind": "tool_call",
                        "tool_name": tool_call.function.name,
                        "parameters": json.loads(tool_call.function.arguments),
                    }
                )
            return content
        case _:
            raise ValueError()


def make_prompt(agent, tape: Dialog) -> Prompt:
    system_message = SystemStep(content=system_instruction)
    guidance_message = UserStep(content=agent.templates["call_or_respond"])
    messages = [system_message]
    for step in tape:
        if isinstance(step, (ToolCalls)):
            messages.append(AssistantStep(content=_llm_message_content(step)))
        elif not isinstance(step, Jump):
            messages.append(tape.step_to_message(step))
    messages += [guidance_message]
    return Prompt(messages=messages)


def generate_steps(agent, dialog, llm_stream: LLMStream):
    m = llm_stream.get_message()
    try:
        assert m.content
        tool_calls = []
        response = None
        for line in m.content.split("\n"):
            data = json.loads(line)
            if data.get("kind") == "response":
                response = data["content"]
            elif data.get("kind") == "tool_call":
                tool_call = ChatCompletionMessageToolCall(
                    function=Function(name=data["tool_name"], arguments=json.dumps(data["parameters"])),
                    # tool call must be a unique string, it helps to make it something deterministic
                    id=f"tool_call_{len(tool_calls)}_node_starts_at_{len(dialog)}",
                )
                tool_calls.append(tool_call)
            else:
                yield AssistantStep(content="Invalid LLM completion: kind field must be 'response' or 'tool_call'")
        if response and tool_calls:
            yield AssistantStep(content="Invalid LLM completion: response and tool_call cannot be in the same message")
        if response:
            yield AssistantStep(content=response)
            yield Jump(next_node=0)
        if tool_calls:
            yield ToolCalls(tool_calls=tool_calls)
            yield Jump(next_node=1)
    except Exception as e:
        yield AssistantStep(content="Invalid JSON object: " + str(e))


def make_completion(agent, dialog: Dialog, index: int) -> Completion:
    if not isinstance(step := dialog[index], AssistantStep | ToolCalls):
        raise ValueError()
    content = _llm_message_content(step)
    return Completion(role="assistant", content=content)


agent2 = Agent.create(
    LLAMA(
        base_url="https://api.together.xyz",
        model_name="meta-llama/Meta-Llama-3-70B-Instruct-Turbo",
        tokenizer_name="meta-llama/Meta-Llama-3-70B-Instruct",
        parameters=dict(temperature=0.01),
    ),
    templates={
        "system": system_instruction,
        "planning": planning_guidance,
        "call_or_respond": call_or_respond_guidance,
    },
    flow=[
        Node()
        .with_prompt(make_planning_prompt)
        .with_generate_steps(generate_planning_steps)
        .with_completion(make_planning_completion),
        Node().with_prompt(make_prompt).with_generate_steps(generate_steps).with_completion(make_completion),
    ],
)

final_tape2 = None
for event in main_loop(
    agent2, Dialog() + [UserStep(content="Tell me about Vulcan Materials in 3 sentences")], env, max_loops=3
):
    if ae := event.agent_event:
        if ae.step:
            print(ae.step.model_dump_json(indent=2))
        if ae.final_tape:
            final_tape2 = ae.final_tape
    if event.observation:
        print(event.observation.model_dump_json(indent=2))
assert final_tape2
HTML(render_tape_with_prompts(final_tape2, PrettyRenderer()))


At this point you're likely seeing that the LLAMA-based agent is having trouble. Let's try to help it. For reproducibility, we'll use a pre-recorded failed tape.

In [None]:
with open("assets/failed_tape.json") as src:
    failed_tape = Dialog.model_validate(json.load(src))
agent2b = agent2.model_copy(deep=True)
agent2b.templates["call_or_respond"] += (
    "REMEMBER: check what tool calls you have already made. Do not do the same call again!"
)
resume_from_step8 = agent2b.run(failed_tape[:8]).get_final_tape()
HTML(render_tape_with_prompts(resume_from_step8, PrettyRenderer()))


We found that this helpful hint often gets the LLAMA-based agent unstuck. Note how easy it was to test the hint thanks to the agent's ability to resume from step 8!

# 4. Tape reuse and training data

Another way to help this agent (or one with an even smaller LLM) is to finetune the LLM. And the most important step towards finetuning is making the training data!

There are two ways to make training data in TapeAgents:
- the basic one: use the `LLMCall` structures that the agent created when it produced the tape. You can retrieve them from the SQLite storage and convert them into training text.
- the much more powerful one: call `agent.reuse` to reconstruct the prompts and completions **and** to validate that with the reconstructed LLMCalls the agent would indeed create the given tape

The big advantage of the second approach is that it allows you to use the tape from a teacher agent (think slower and more expensive) to train a student agent (think faster and cheaper). Or to train an agent on its own revised tapes.

Of course, restrictions apply: a tape by agent A may not be directly reusable by agent B. You might have to add or remove some steps. But at least `agent.reuse` verifies whether your modifications resulted in a tape that agent B can indeed produce.
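
The validation idea behind tape reuse can be sketched as a round trip: reconstruct the completion for each agent-made step, parse it back, and check that the same step comes out. The following standalone sketch illustrates that idea only. It is not the implementation of `agent.reuse`, and `make_completion`, `generate_step` and `reuse` here are hypothetical helpers working on plain dicts.

```python
import json


def make_completion(step: dict) -> str:
    """Reconstruct the LLM completion text that would produce the step."""
    if step["kind"] == "response":
        return json.dumps({"kind": "response", "content": step["content"]})
    if step["kind"] == "tool_call":
        return json.dumps(
            {"kind": "tool_call", "tool_name": step["tool_name"], "parameters": step["parameters"]}
        )
    raise ValueError(f"cannot reconstruct a completion for step kind {step['kind']!r}")


def generate_step(completion: str) -> dict:
    """Parse a completion back into a step (the inverse of make_completion)."""
    data = json.loads(completion)
    if data.get("kind") not in ("response", "tool_call"):
        raise ValueError("kind field must be 'response' or 'tool_call'")
    return data


def reuse(tape: list[dict]) -> list[str]:
    """Return reconstructed completions, failing if the tape is not reproducible."""
    completions = []
    for step in tape:
        if step["kind"] in ("user", "observation"):
            continue  # not produced by the agent, nothing to reconstruct
        completion = make_completion(step)
        if generate_step(completion) != step:
            raise ValueError("the student agent cannot reproduce this step")
        completions.append(completion)
    return completions


teacher_tape = [
    {"kind": "user", "content": "Tell me about Vulcan"},
    {"kind": "tool_call", "tool_name": "get_stock_ticker", "parameters": {"company_name": "Vulcan"}},
    {"kind": "observation", "content": "VMC"},
    {"kind": "response", "content": "Vulcan Materials trades as VMC."},
]
completions = reuse(teacher_tape)
print(completions[0])
```

A tape containing a step kind that the student cannot express, for example a `"thought"` step in this sketch, fails with `ValueError` and has to be edited before it can serve as training data.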

#### Make training data from the past LLM calls

In [None]:
from tapeagents.observe import retrieve_tape_llm_calls

llm_calls = retrieve_tape_llm_calls(final_tape2)
print(f"Retrieved {len(llm_calls)} LLM calls from the tape.")
# under the hood agent2 will route this request to its llm
example_text = agent2.make_training_text(list(llm_calls.values())[0])
print("From the first retrieved LLM call the LLM will be trained to predict this text:")
print("---")
print(example_text.completion_str)


#### Make training data by reusing a tape

Note how `agent2` reuses the tape produced by `agent1`, even though they have very different prompt and output formats!

You can inspect the reused tape below and see that the steps are the same as before, but the prompts and completions are different.

In [None]:
reused_tape, _ = agent2.reuse(final_tape1)
HTML(render_tape_with_prompts(reused_tape, PrettyRenderer()))


We offer a quick way to use tape reuse to make training data.

In [None]:
training_data = agent2.make_training_data(final_tape1)
print(training_data[0].completion_str)


What could be simpler than that?!