# LLMCompiler

This notebook shows how to implement [LLMCompiler, by Kim et al.](https://arxiv.org/abs/2312.04511) in LangGraph.

LLMCompiler is an agent architecture designed to **speed up** the execution of agentic tasks by eagerly executing tasks within a DAG. It also cuts redundant token usage by reducing the number of calls to the LLM. Below is an overview of its computational graph:

![LLMCompiler Graph](./img/llm-compiler.png)

It has 3 main components:

1. Planner: streams a DAG of tasks.
2. Task Fetching Unit: schedules and executes the tasks as soon as they are executable.
3. Joiner: responds to the user or triggers a second plan.


This notebook walks through each component and shows how to wire them together using LangGraph. The end result will leave a trace [like the following](https://smith.langchain.com/public/218c2677-c719-4147-b0e9-7bc3b5bb2623/r).


**First,** install the dependencies, and set up LangSmith for tracing to more easily debug and observe the agent.

In [1]:
# %pip install -U --quiet langchain_openai langsmith langgraph langchain numexpr

In [1]:
# import os
# import getpass


# def _get_pass(var: str):
#     if var not in os.environ:
#         os.environ[var] = getpass.getpass(f"{var}: ")


# Optional: Debug + trace calls using LangSmith
# os.environ["LANGCHAIN_TRACING_V2"] = "True"
# os.environ["LANGCHAIN_PROJECT"] = "LLMCompiler"
# _get_pass("LANGCHAIN_API_KEY")
# _get_pass("OPENAI_API_KEY")

In [1]:
import os
import logging
from dotenv import load_dotenv

In [2]:
logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

load_dotenv()

True

## Part 1: Tools

We'll first define the tools for the agent to use in our demo. We'll give it the classic search engine + calculator combo.

If you don't want to sign up for Tavily, you can replace it with the free [DuckDuckGo](https://python.langchain.com/docs/integrations/tools/ddg).

In [3]:
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults

# Imported from the https://github.com/langchain-ai/langgraph/tree/main/examples/plan-and-execute repo
from math_tools import get_math_tool

# _get_pass("TAVILY_API_KEY")

calculate = get_math_tool(ChatOpenAI(model="gpt-4o"))
search = TavilySearchResults(
    max_results=1,
    description='tavily_search_results_json(query="the search query") - a search engine.',
)

tools = [search, calculate]

INFO:numexpr.utils:Note: NumExpr detected 10 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO:numexpr.utils:NumExpr defaulting to 8 threads.


In [5]:
calculate.invoke(
    {
        "problem": "What's the temp of sf + 5?",
        "context": ["The temperature of sf is 32 degrees"],
    }
)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


'37'

## Part 2: Planner


Largely adapted from [the original source code](https://github.com/SqueezeAILab/LLMCompiler/blob/main/src/llm_compiler/output_parser.py), the planner accepts the input question and generates a task list to execute.

If it is provided with a previous plan, it is instructed to re-plan, which is useful if, upon completion of the first batch of tasks, the agent must take more actions.

The code below constructs the prompt template for the planner and composes it with the LLM and the output parser, defined in [output_parser.py](./output_parser.py). The output parser processes a task list in the following form:

```plaintext
1. tool_1(arg1="arg1", arg2=3.5, ...)
Thought: I then want to find out Y by using tool_2
2. tool_2(arg1="", arg2="${1}")
3. join()<END_OF_PLAN>
```

The "Thought" lines are optional. The `${#}` placeholders are variables that refer to the output of an earlier task by its index. They are used to route tool (task) outputs into the arguments of other tools.
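To make the placeholder convention concrete, here is a minimal sketch of how the task indices referenced by a task's arguments could be collected. `find_dependencies` and `PLACEHOLDER` are hypothetical names used only for illustration; the actual parser in `output_parser.py` may derive dependencies differently.

```python
import re
from typing import Any, List

# Matches $1 or ${1}; the capture group is the referenced task index.
PLACEHOLDER = re.compile(r"\$\{?(\d+)\}?")


def find_dependencies(args: Any) -> List[int]:
    """Return the sorted, de-duplicated task indices referenced by `args`.

    Hypothetical helper: it only illustrates the placeholder convention.
    """
    if isinstance(args, str):
        found = {int(idx) for idx in PLACEHOLDER.findall(args)}
    elif isinstance(args, dict):
        found = {dep for value in args.values() for dep in find_dependencies(value)}
    elif isinstance(args, (list, tuple)):
        found = {dep for value in args for dep in find_dependencies(value)}
    else:
        # Numbers, None, etc. cannot contain placeholders.
        found = set()
    return sorted(found)


example_args = {
    "problem": "What is ${1} raised to the 3rd power?",
    "context": ["$1", "${2}"],
}
print(find_dependencies(example_args))  # [1, 2]
```

A task whose arguments mention `${1}` cannot start until task 1 has produced an observation, which is the property the scheduler relies on.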

In [6]:
from typing import Sequence

from langchain_core.language_models import BaseChatModel
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableBranch
from langchain_core.tools import BaseTool
from langchain_core.messages import (
    BaseMessage,
    FunctionMessage,
    HumanMessage,
    SystemMessage,
)

from output_parser import LLMCompilerPlanParser, Task
from langchain import hub
from langchain_openai import ChatOpenAI


prompt = hub.pull("wfh/llm-compiler")
prompt.pretty_print()  # pretty_print() prints the prompt itself and returns None


Given a user query, create a plan to solve it with the utmost parallelizability. Each plan should comprise an action from the following {num_tools} types:
{tool_descriptions}
{num_tools}. join(): Collects and combines results from prior actions.

 - An LLM agent is called upon invoking join() to either finalize the user query or wait until the plans are executed.
 - join should always be the last action in the plan, and will be called in two scenarios:
   (a) if the answer can be determined by gathering the outputs from tasks to generate the final response.
   (b) if the answer cannot be determined in the planning phase before you execute the plans. Guidelines:
 - Each action described above contains input/output types and description.
    - You must strictly adhere to the input and output types for each action.
    - The action descriptions contain the guidelines. You MUST strictly follow those guidelines when you use the actions.
 -

In [7]:
def create_planner(
    llm: BaseChatModel, tools: Sequence[BaseTool], base_prompt: ChatPromptTemplate
):
    # Number the tools from 1 (enumerate starts at 0) so the list reads naturally.
    tool_descriptions = "\n".join(
        f"{i+1}. {tool.description}\n" for i, tool in enumerate(tools)
    )
    planner_prompt = base_prompt.partial(
        replan="",
        # Add one because we're adding the join() tool at the end.
        num_tools=len(tools) + 1,
        tool_descriptions=tool_descriptions,
    )
    replanner_prompt = base_prompt.partial(
        replan=' - You are given "Previous Plan" which is the plan that the previous agent created along with the execution results '
        "(given as Observation) of each plan and a general thought (given as Thought) about the executed results. "
        'You MUST use this information to create the next plan under "Current Plan".\n'
        ' - When starting the Current Plan, you should start with "Thought" that outlines the strategy for the next plan.\n'
        " - In the Current Plan, you should NEVER repeat the actions that are already executed in the Previous Plan.\n"
        " - You must continue the task index from the end of the previous one. Do not repeat task indices.",
        num_tools=len(tools) + 1,
        tool_descriptions=tool_descriptions,
    )

    def should_replan(state: list):
        # Context is passed as a system message
        return isinstance(state[-1], SystemMessage)

    def wrap_messages(state: list):
        return {"messages": state}

    def wrap_and_get_last_index(state: list):
        next_task = 0
        for message in state[::-1]:
            if isinstance(message, FunctionMessage):
                next_task = message.additional_kwargs["idx"] + 1
                break
        state[-1].content = state[-1].content + f" - Begin counting at : {next_task}"
        return {"messages": state}

    return (
        RunnableBranch(
            (should_replan, wrap_and_get_last_index | replanner_prompt),
            wrap_messages | planner_prompt,
        )
        | llm
        | LLMCompilerPlanParser(tools=tools)
    )

In [8]:
llm = ChatOpenAI(model="gpt-4o")
# This is the primary "agent" in our application
planner = create_planner(llm, tools, prompt)

In [9]:
example_question = "What's the temperature in SF raised to the 3rd power?"

for task in planner.stream([HumanMessage(content=example_question)]):
    print(task["tool"], task["args"])
    print("---")

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


description='tavily_search_results_json(query="the search query") - a search engine.' max_results=1 {'query': 'current temperature in San Francisco'}
---
name='math' description='math(problem: str, context: Optional[list[str]]) -> float:\n - Solves the provided math problem.\n - `problem` can be either a simple math problem (e.g. "1 + 3") or a word problem (e.g. "how many apples are there if there are 3 apples and 2 apples").\n - You cannot calculate multiple expressions in one call. For instance, `math(\'1 + 3, 2 + 4\')` does not work. If you need to calculate multiple expressions, you need to call them separately like `math(\'1 + 3\')` and then `math(\'2 + 4\')`\n - Minimize the number of `math` actions as much as possible. For instance, instead of calling 2. math("what is the 10% of $1") and then call 3. math("$1 + $2"), you MUST call 2. math("what is the 110% of $1") instead, which will reduce the number of math actions.\n - You can optionally provide a list of strings as `context`

## Part 3: Task Fetching Unit

This component schedules the tasks. It receives a stream of tasks, each in the following format:

```typescript
{
    tool: BaseTool,
    dependencies: number[],
}
```


The basic idea is to begin executing tools as soon as their dependencies are met. This is done through multi-threading. We will combine the task fetching unit and executor below:

![diagram](./img/diagram.png)
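Before the full implementation, the idea of "start each task as soon as its dependencies are met" can be sketched with plain Python callables in place of LangChain tools. Everything here (`ToyTask`, `run_when_ready`, `run_dag`) is a hypothetical, simplified stand-in rather than the notebook's scheduler, and it assumes the dependency graph has no cycles.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Any, Callable, Dict, List, TypedDict


class ToyTask(TypedDict):
    idx: int
    fn: Callable[..., Any]  # stand-in for a tool
    dependencies: List[int]


def run_when_ready(task: ToyTask, observations: Dict[int, Any],
                   lock: threading.Lock, poll_interval: float = 0.01) -> None:
    """Poll until every dependency has an observation, then run the task."""
    while True:
        with lock:
            ready = all(dep in observations for dep in task["dependencies"])
            if ready:
                inputs = [observations[dep] for dep in task["dependencies"]]
        if ready:
            break
        time.sleep(poll_interval)
    try:
        result = task["fn"](*inputs)
    except Exception as e:
        # Record the failure so that dependent tasks are not blocked forever.
        result = f"ERROR: {e!r}"
    with lock:
        observations[task["idx"]] = result


def run_dag(tasks: List[ToyTask]) -> Dict[int, Any]:
    observations: Dict[int, Any] = {}
    lock = threading.Lock()
    # One worker per task, so a waiting task can never starve its dependencies.
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as executor:
        futures = [
            executor.submit(run_when_ready, task, observations, lock)
            for task in tasks
        ]
        wait(futures)
    return observations


toy_tasks: List[ToyTask] = [
    {"idx": 1, "fn": lambda: 2, "dependencies": []},
    {"idx": 2, "fn": lambda: 3, "dependencies": []},
    {"idx": 3, "fn": lambda a, b: a * b, "dependencies": [1, 2]},
]
results = run_dag(toy_tasks)
print(results[3])  # 6
```

Tasks 1 and 2 have no dependencies and can run concurrently, while task 3 waits for both. The notebook's version follows the same polling pattern but consumes tasks from a stream and resolves `${#}` placeholders in the arguments.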

In [10]:
from typing import Any, Union, Iterable, List, Tuple, Dict
from typing_extensions import TypedDict
import re

from langchain_core.runnables import (
    chain as as_runnable,
)

from concurrent.futures import ThreadPoolExecutor, wait
import time


def _get_observations(messages: List[BaseMessage]) -> Dict[int, Any]:
    # Get all previous tool responses
    results = {}
    for message in messages[::-1]:
        if isinstance(message, FunctionMessage):
            results[int(message.additional_kwargs["idx"])] = message.content
    return results


class SchedulerInput(TypedDict):
    messages: List[BaseMessage]
    tasks: Iterable[Task]


def _execute_task(task, observations, config):
    tool_to_use = task["tool"]
    if isinstance(tool_to_use, str):
        return tool_to_use
    args = task["args"]
    try:
        if isinstance(args, str):
            resolved_args = _resolve_arg(args, observations)
        elif isinstance(args, dict):
            resolved_args = {
                key: _resolve_arg(val, observations) for key, val in args.items()
            }
        else:
            # This will likely fail
            resolved_args = args
    except Exception as e:
        return (
            f"ERROR(Failed to call {tool_to_use.name} with args {args}."
            f" Args could not be resolved. Error: {repr(e)})"
        )
    try:
        return tool_to_use.invoke(resolved_args, config)
    except Exception as e:
        return (
            f"ERROR(Failed to call {tool_to_use.name} with args {args}."
            + f" Args resolved to {resolved_args}. Error: {repr(e)})"
        )


def _resolve_arg(arg: Union[str, Any], observations: Dict[int, Any]):
    # $1 or ${1} -> 1
    ID_PATTERN = r"\$\{?(\d+)\}?"

    def replace_match(match):
        # If the string is ${123}, match.group(0) is ${123}, and match.group(1) is 123.

        # Return the match group, in this case the index, from the string. This is the index
        # number we get back.
        idx = int(match.group(1))
        return str(observations.get(idx, match.group(0)))

    # For dependencies on other tasks
    if isinstance(arg, str):
        return re.sub(ID_PATTERN, replace_match, arg)
    elif isinstance(arg, list):
        return [_resolve_arg(a, observations) for a in arg]
    else:
        return str(arg)


@as_runnable
def schedule_task(task_inputs, config):
    task: Task = task_inputs["task"]
    observations: Dict[int, Any] = task_inputs["observations"]
    try:
        observation = _execute_task(task, observations, config)
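    except Exception:
        import traceback

        # format_exc() formats the exception currently being handled;
        # format_exception() cannot be called without arguments.
        observation = traceback.format_exc()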
    observations[task["idx"]] = observation


def schedule_pending_task(
    task: Task, observations: Dict[int, Any], retry_after: float = 0.2
):
    while True:
        deps = task["dependencies"]
        if deps and (any([dep not in observations for dep in deps])):
            # Dependencies not yet satisfied
            time.sleep(retry_after)
            continue
        schedule_task.invoke({"task": task, "observations": observations})
        break


@as_runnable
def schedule_tasks(scheduler_input: SchedulerInput) -> List[FunctionMessage]:
    """Group the tasks into a DAG schedule."""
    # For streaming, we are making a few simplifying assumptions:
    # 1. The LLM does not create cyclic dependencies
    # 2. The LLM does not generate tasks with future dependencies
    # If these cease to be good assumptions, you can either
    # do a proper topological sort (non-streaming)
    # or use a more complicated data structure
    tasks = scheduler_input["tasks"]
    args_for_tasks = {}
    messages = scheduler_input["messages"]
    # If we are re-planning, we may have calls that depend on previous
    # plans. Start with those.
    observations = _get_observations(messages)
    task_names = {}
    originals = set(observations)
    # ^^ We assume each task inserts a different key above to
    # avoid race conditions...
    futures = []
    retry_after = 0.25  # Retry every quarter second
    with ThreadPoolExecutor() as executor:
        for task in tasks:
            deps = task["dependencies"]
            task_names[task["idx"]] = (
                task["tool"] if isinstance(task["tool"], str) else task["tool"].name
            )
            args_for_tasks[task["idx"]] = task["args"]
            if (
                # Depends on other tasks
                deps
                and (any([dep not in observations for dep in deps]))
            ):
                futures.append(
                    executor.submit(
                        schedule_pending_task, task, observations, retry_after
                    )
                )
            else:
                # No deps or all deps satisfied
                # can schedule now
                schedule_task.invoke(dict(task=task, observations=observations))
                # To run these in the thread pool as well, submit them instead:
                # futures.append(executor.submit(schedule_task.invoke, dict(task=task, observations=observations)))

        # All tasks have been submitted or enqueued
        # Wait for them to complete
        wait(futures)
    # Convert observations to new tool messages to add to the state
    new_observations = {
        k: (task_names[k], args_for_tasks[k], observations[k])
        for k in sorted(observations.keys() - originals)
    }
    tool_messages = [
        FunctionMessage(
            name=name, content=str(obs), additional_kwargs={"idx": k, "args": task_args}
        )
        for k, (name, task_args, obs) in new_observations.items()
    ]
    return tool_messages

In [11]:
import itertools


@as_runnable
def plan_and_schedule(messages: List[BaseMessage], config):
    tasks = planner.stream(messages, config)
    # Begin executing the planner immediately
    try:
        tasks = itertools.chain([next(tasks)], tasks)
    except StopIteration:
        # Handle the case where tasks is empty.
        tasks = iter([])
    scheduled_tasks = schedule_tasks.invoke(
        {
            "messages": messages,
            "tasks": tasks,
        },
        config,
    )
    return scheduled_tasks

#### Example Plan

We still haven't introduced any cycles in our computation graph, so this is all easily expressed in LCEL.

In [12]:
tool_messages = plan_and_schedule.invoke([HumanMessage(content=example_question)])

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [13]:
tool_messages

[FunctionMessage(content="[{'url': 'http://www.sfchronicle.com/weather-forecast/article/sf-bay-area-heat-19468160.php', 'content': 'Advertisement Article continues below this ad Highs will reach the mid-60s in the Richmond and Sunset districts, upper 60s in the Presidio and the low 70s in Pacific Heights, Glen Park, the Marina District and the Embarcadero. He joins the Chronicle from the University of Washington where he was previously the president of the campus weather forecasting team and an editor at the student newspaper, The Daily UW.  It will be warmer away from the beaches, in the low 70s in San Bruno and South San Francisco, the mid-70s in San Mateo and near 80 in Redwood City and Menlo Park. Tuesday will likely be the city’s warmest day since May 10, when the high was 78. Advertisement Article continues below this ad Inland portions of the North Bay, East Bay and South Bay will be even hotter. The sea breeze should become stronger as the week progresses, and Thursday afternoo

## Part 4: "Joiner"

So now we have the planning and initial execution done. We need a component to process these outputs and either:

1. Respond with the correct answer.
2. Loop with a new plan.

The paper refers to this as the "joiner". It's another LLM call. We are using function calling to improve parsing reliability.

In [15]:
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain.chains.openai_functions import create_structured_output_runnable
from langchain_core.messages import AIMessage


class FinalResponse(BaseModel):
    """The final response/answer."""

    response: str


class Replan(BaseModel):
    feedback: str = Field(
        description="Analysis of the previous attempts and recommendations on what needs to be fixed."
    )


class JoinOutputs(BaseModel):
    """Decide whether to replan or whether you can return the final response."""

    thought: str = Field(
        description="The chain of thought reasoning for the selected action"
    )
    action: Union[FinalResponse, Replan]


joiner_prompt = hub.pull("wfh/llm-compiler-joiner").partial(
    examples=""
)  # You can optionally add examples
llm = ChatOpenAI(model="gpt-4o")

runnable = create_structured_output_runnable(JoinOutputs, llm, joiner_prompt)

We will select only the most recent messages in the state, and format the output to be more useful for
the planner, should the agent need to loop.

In [16]:
def _parse_joiner_output(decision: JoinOutputs) -> List[BaseMessage]:
    response = [AIMessage(content=f"Thought: {decision.thought}")]
    if isinstance(decision.action, Replan):
        return response + [
            SystemMessage(
                content=f"Context from last attempt: {decision.action.feedback}"
            )
        ]
    else:
        return response + [AIMessage(content=decision.action.response)]


def select_recent_messages(messages: list) -> dict:
    selected = []
    for msg in messages[::-1]:
        selected.append(msg)
        if isinstance(msg, HumanMessage):
            break
    return {"messages": selected[::-1]}


joiner = select_recent_messages | runnable | _parse_joiner_output

In [17]:
input_messages = [HumanMessage(content=example_question)] + tool_messages

In [18]:
joiner.invoke(input_messages)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


[AIMessage(content="Thought: The temperature in San Francisco is in the range of low 70s. For calculation purposes, we'll use 70°F. Raising it to the 3rd power results in 70^3, which is 343,000."),
 AIMessage(content='The temperature in San Francisco raised to the 3rd power is approximately 343,000.')]

## Part 5: Compose using LangGraph

We'll define the agent as a stateful graph, with the main steps being:

1. Plan and execute (the `plan_and_schedule` runnable defined above)
2. Join: determine if we should finish or replan
3. Recontextualize: update the graph state based on the output from the joiner (in this notebook, that happens inside the `join` node via `_parse_joiner_output`)
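The control flow these steps describe is a loop: plan and execute, join, then either stop or plan again. The sketch below shows that loop in plain Python, without LangGraph. The `(role, content)` tuples, `run_agent`, the stub functions, and the `max_loops` guard are all hypothetical stand-ins for illustration.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content); stand-in for LangChain messages
Node = Callable[[List[Message]], List[Message]]


def run_agent(plan_and_schedule: Node, join: Node, question: str,
              max_loops: int = 5) -> List[Message]:
    """Alternate plan/execute and join until the joiner ends on an AI message."""
    state: List[Message] = [("human", question)]
    for _ in range(max_loops):
        state = state + plan_and_schedule(state)
        state = state + join(state)
        if state[-1][0] == "ai":
            # The joiner produced a final response rather than replan context.
            return state
    raise RuntimeError(f"No final answer after {max_loops} plan/join loops")


def fake_plan_and_schedule(state: List[Message]) -> List[Message]:
    # Stub: pretend each round of planning yields one new observation.
    attempt = sum(1 for role, _ in state if role == "function") + 1
    return [("function", f"observation {attempt}")]


def fake_join(state: List[Message]) -> List[Message]:
    # Stub: ask for a replan until two observations are available.
    observations = [content for role, content in state if role == "function"]
    if len(observations) < 2:
        return [
            ("ai", "Thought: not enough information yet"),
            ("system", "Context from last attempt: search again"),
        ]
    return [
        ("ai", "Thought: enough information"),
        ("ai", f"Final answer based on {len(observations)} observations"),
    ]


final_state = run_agent(fake_plan_and_schedule, fake_join, "toy question")
print(final_state[-1][1])  # Final answer based on 2 observations
```

In the graph version, the `if` check becomes the conditional edge leaving the `join` node, and the loop bound plays a role similar to LangGraph's `recursion_limit`.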

In [36]:
from langgraph.graph import MessageGraph, END
from typing import Dict

graph_builder = MessageGraph()

# 1.  Define vertices
# We defined plan_and_schedule above already
# Assign each node to a state variable to update
graph_builder.add_node("plan_and_schedule", plan_and_schedule)
graph_builder.add_node("join", joiner)


# 2. Define edges
graph_builder.add_edge("plan_and_schedule", "join")

# This condition determines the looping logic


def should_continue(state: List[BaseMessage]):
    if isinstance(state[-1], AIMessage):
        return END
    return "plan_and_schedule"


graph_builder.add_conditional_edges(
    "join",
    # Next, we pass in the function that will determine which node is called next.
    should_continue,
)
graph_builder.set_entry_point("plan_and_schedule")
chain = graph_builder.compile()

In [37]:
from IPython.display import Image, display

display(Image(chain.get_graph(xray=True).draw_mermaid_png()))

ValueError: Failed to render the graph using the Mermaid.INK API. Status code: 400.

#### Simple question

Let's ask a simple question of the agent.

In [27]:
for step in chain.stream([HumanMessage(content="What's the GDP of New York?")]):
    print(step)
    print("---")

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.nbcnewyork.com/news/national-international/many-large-u-s-cities-are-in-deep-financial-trouble-heres-why/5352780/\', \'content\': \'"If I don\\\'t pay that invoice, I don\\\'t have to include it in my balanced budget," said Sheila Weinberg, the group\\\'s founder and CEO. Truth in Accounting estimates that 53 of the largest cities in the U.S. were not generating enough revenue to pay their bills at the end of fiscal year 2022. For example, New York City had a total public debt of $177.6 billion at the end of fiscal year 2022, according to researchers at Truth in Accounting, a nonprofit that partners with the University of Denver to promote transparency in public accounting. Lander in 2024 voiced support for a $12 billion expansion of New York City\\\'s debt limit to fund existing city services like community colleges and the police department, alongside an expansionary capital program in the face of issues such as

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content="Thought: The search results provided information on New York City's public debt but did not include the GDP of New York.", id='b14a712c-cefc-4d74-9b2f-f95512969b34'), SystemMessage(content="Context from last attempt: The search results did not provide the GDP of New York. We need to search again specifically for New York's GDP.", id='0715ff2c-5994-48c1-8f07-acf2c3fdf8c6')]}
---


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.nbcnewyork.com/news/national-international/many-large-u-s-cities-are-in-deep-financial-trouble-heres-why/5352780/\', \'content\': \'"If I don\\\'t pay that invoice, I don\\\'t have to include it in my balanced budget," said Sheila Weinberg, the group\\\'s founder and CEO. Truth in Accounting estimates that 53 of the largest cities in the U.S. were not generating enough revenue to pay their bills at the end of fiscal year 2022. For example, New York City had a total public debt of $177.6 billion at the end of fiscal year 2022, according to researchers at Truth in Accounting, a nonprofit that partners with the University of Denver to promote transparency in public accounting. Lander in 2024 voiced support for a $12 billion expansion of New York City\\\'s debt limit to fund existing city services like community colleges and the police department, alongside an expansionary capital program in the face of issues such as

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content="Thought: The search results still did not provide the GDP of New York. We need to search again specifically for New York's GDP.", id='749983b6-ae37-4370-a2b2-d33ed1623338'), SystemMessage(content="Context from last attempt: The search results provided information on New York City's public debt and fiscal challenges but did not include the GDP of New York. We need to refine the search to find the specific GDP information.", id='e8f86c70-c5fe-4371-9fad-322743277590')]}
---


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.nbcnewyork.com/news/national-international/many-large-u-s-cities-are-in-deep-financial-trouble-heres-why/5352780/\', \'content\': \'"If I don\\\'t pay that invoice, I don\\\'t have to include it in my balanced budget," said Sheila Weinberg, the group\\\'s founder and CEO. Truth in Accounting estimates that 53 of the largest cities in the U.S. were not generating enough revenue to pay their bills at the end of fiscal year 2022. For example, New York City had a total public debt of $177.6 billion at the end of fiscal year 2022, according to researchers at Truth in Accounting, a nonprofit that partners with the University of Denver to promote transparency in public accounting. Lander in 2024 voiced support for a $12 billion expansion of New York City\\\'s debt limit to fund existing city services like community colleges and the police department, alongside an expansionary capital program in the face of issues such as

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content="Thought: The search results repeatedly provided information on New York City's public debt and fiscal challenges but did not include the GDP of New York. We need to refine the search to find the specific GDP information.", id='94a13556-6e53-4d9d-b084-7477e6d80483'), SystemMessage(content="Context from last attempt: The search results continue to provide information on New York City's public debt and fiscal issues without addressing the GDP. A more targeted search or different source may be necessary to retrieve the correct data.", id='d246df95-e8b8-4ad7-b6e6-9aae6046da76')]}
---


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.nbcnewyork.com/news/national-international/many-large-u-s-cities-are-in-deep-financial-trouble-heres-why/5352780/\', \'content\': \'"If I don\\\'t pay that invoice, I don\\\'t have to include it in my balanced budget," said Sheila Weinberg, the group\\\'s founder and CEO. Truth in Accounting estimates that 53 of the largest cities in the U.S. were not generating enough revenue to pay their bills at the end of fiscal year 2022. For example, New York City had a total public debt of $177.6 billion at the end of fiscal year 2022, according to researchers at Truth in Accounting, a nonprofit that partners with the University of Denver to promote transparency in public accounting. Lander in 2024 voiced support for a $12 billion expansion of New York City\\\'s debt limit to fund existing city services like community colleges and the police department, alongside an expansionary capital program in the face of issues such as

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content='Thought: The repeated search results have not provided the GDP of New York, only details on its public debt and fiscal issues. Further attempts have not yielded the needed information.', id='fbd58e9c-7704-4f4b-8952-2c34004ab94f'), SystemMessage(content="Context from last attempt: The previous attempts to find the GDP of New York have repeatedly resulted in information about the city's public debt and fiscal issues. A different approach or source is needed to retrieve the correct data.", id='a440bfa9-7d06-47e8-ba83-a55cde9b56cf')]}
---


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.nbcnewyork.com/news/national-international/many-large-u-s-cities-are-in-deep-financial-trouble-heres-why/5352780/\', \'content\': \'"If I don\\\'t pay that invoice, I don\\\'t have to include it in my balanced budget," said Sheila Weinberg, the group\\\'s founder and CEO. Truth in Accounting estimates that 53 of the largest cities in the U.S. were not generating enough revenue to pay their bills at the end of fiscal year 2022. For example, New York City had a total public debt of $177.6 billion at the end of fiscal year 2022, according to researchers at Truth in Accounting, a nonprofit that partners with the University of Denver to promote transparency in public accounting. Lander in 2024 voiced support for a $12 billion expansion of New York City\\\'s debt limit to fund existing city services like community colleges and the police department, alongside an expansionary capital program in the face of issues such as

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content="Thought: Repeated searches have not provided the GDP of New York, only details on its public debt and fiscal issues. I will inform the user about the difficulty in finding the exact GDP information and suggest they search for specific sources like the U.S. Bureau of Economic Analysis or New York City's official economic reports.", id='b7aa595b-77a4-4180-89ea-e49eb96b7c46'), AIMessage(content="I wasn't able to find the specific GDP of New York from the search results. You might want to check official sources like the U.S. Bureau of Economic Analysis or New York City's official economic reports for the most accurate information.", id='f6f4c593-b87e-4c15-bffe-e028b4de449f')]}
---


In [28]:
# Final answer: the last streamed step comes from the "join" node
print(step["join"][-1].content)

#### Multi-hop question

This question requires that the agent perform multiple searches.

In [38]:
steps = chain.stream(
    [
        HumanMessage(
            content="What's the oldest parrot alive, and how much longer is that than the average?"
        )
    ],
    {
        "recursion_limit": 100,
    },
)
for step in steps:
    print(step)
    print("---")

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.foxnews.com/world/worlds-oldest-known-gorilla-turns-67-berlin-zoo\', \'content\': \'Berlin\\\'s zoo is celebrating the 67th birthday of Fatou the gorilla, its oldest resident, who it believes is also the oldest gorilla in the world.  Berlin\\\'s zoo is celebrating the 67th birthday of Fatou the gorilla, its oldest resident, who it believes is also the oldest gorilla in the world. (Paul Zinken/dpa via AP) Vet Andre Schüle said there is no gorilla older than Fatou in any other zoo, "and we have to assume that there is no animal older than her in the wild," where animals do not live so long.  CLICK HERE TO GET THE FOX NEWS APP Fatou became the zoo\\\'s oldest resident only recently, following the death earlier this year of Ingo the flamingo. NEW ENGLAND AQUARIUM\\\'S 500-POUND, 95-YEAR-OLD SEA TURTLE GETS CLEAN BILL OF HEALTH Fatou was born in 1957 and came to the zoo in what was then West Berlin in 1959.\'}]', addit

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content='Thought: The search results do not provide any information about the oldest parrot alive or how much longer it has lived compared to the average lifespan of parrots.', id='2232465b-dab1-4319-8b73-03309bde1ecb'), SystemMessage(content='Context from last attempt: The search results did not return relevant information about the oldest parrot alive or its age compared to the average lifespan of parrots. A more specific search focused on parrots is needed.', id='1685770e-ce68-4c5d-b9fd-c640191f5ed0')]}
---


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.foxnews.com/world/worlds-oldest-known-gorilla-turns-67-berlin-zoo\', \'content\': \'Berlin\\\'s zoo is celebrating the 67th birthday of Fatou the gorilla, its oldest resident, who it believes is also the oldest gorilla in the world.  Berlin\\\'s zoo is celebrating the 67th birthday of Fatou the gorilla, its oldest resident, who it believes is also the oldest gorilla in the world. (Paul Zinken/dpa via AP) Vet Andre Schüle said there is no gorilla older than Fatou in any other zoo, "and we have to assume that there is no animal older than her in the wild," where animals do not live so long.  CLICK HERE TO GET THE FOX NEWS APP Fatou became the zoo\\\'s oldest resident only recently, following the death earlier this year of Ingo the flamingo. NEW ENGLAND AQUARIUM\\\'S 500-POUND, 95-YEAR-OLD SEA TURTLE GETS CLEAN BILL OF HEALTH Fatou was born in 1957 and came to the zoo in what was then West Berlin in 1959.\'}]', addit

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content='Thought: The search results still do not provide any information about the oldest parrot alive or its age compared to the average lifespan of parrots. A more targeted search is needed to find this specific information.', id='13caecba-0a36-459f-aadf-d7f6ec0439ef'), SystemMessage(content='Context from last attempt: The current search results are irrelevant. A more focused search on the oldest parrot alive and the average lifespan of parrots is required.', id='fe521f22-0a31-4f62-ad77-99bbde04c0fc')]}
---


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='[{\'url\': \'https://www.foxnews.com/world/worlds-oldest-known-gorilla-turns-67-berlin-zoo\', \'content\': \'Berlin\\\'s zoo is celebrating the 67th birthday of Fatou the gorilla, its oldest resident, who it believes is also the oldest gorilla in the world.  Berlin\\\'s zoo is celebrating the 67th birthday of Fatou the gorilla, its oldest resident, who it believes is also the oldest gorilla in the world. (Paul Zinken/dpa via AP) Vet Andre Schüle said there is no gorilla older than Fatou in any other zoo, "and we have to assume that there is no animal older than her in the wild," where animals do not live so long.  CLICK HERE TO GET THE FOX NEWS APP Fatou became the zoo\\\'s oldest resident only recently, following the death earlier this year of Ingo the flamingo. NEW ENGLAND AQUARIUM\\\'S 500-POUND, 95-YEAR-OLD SEA TURTLE GETS CLEAN BILL OF HEALTH Fatou was born in 1957 and came to the zoo in what was then West Berlin in 1959.\'}]', addit

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content='Thought: The search results are still irrelevant and do not provide information on the oldest parrot alive or its age relative to the average lifespan of parrots. As we have made multiple attempts without success, it is best to inform the user of our current status.', id='6dbf683a-3ead-411c-88ab-1fb0911b2f28'), AIMessage(content="I wasn't able to find specific information on the oldest parrot alive or how much longer it has lived compared to the average lifespan of parrots. You might want to try a more targeted search on this topic or consult a reliable database or expert on parrot lifespans.", id='87bb461a-8358-4697-816d-82096f2efd47')]}
---


In [39]:
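# Final answer
# The last streamed step is keyed by the node name ("join"), not END.
print(step["join"][-1].content)

I wasn't able to find specific information on the oldest parrot alive or how much longer it has lived compared to the average lifespan of parrots. You might want to try a more targeted search on this topic or consult a reliable database or expert on parrot lifespans.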
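Each streamed step printed above is a dict that maps a node name (`plan_and_schedule` or `join`) to the list of messages that node produced, so the final answer is the last message of the last step. A minimal sketch of a helper that reads it without hard-coding the key follows. `final_answer` and `FakeMessage` are hypothetical names introduced for illustration; the stand-in class only mimics the `.content` attribute of a real message.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a LangChain message; only `.content` is used."""

    content: str


def final_answer(step: dict) -> str:
    """Return the content of the last message in a streamed step.

    Assumes `step` maps node names to lists of message-like objects,
    as in the printed output above.
    """
    if not step:
        raise ValueError("step is empty; nothing was streamed")
    # Take the messages of the last (usually only) node in the step.
    messages = list(step.values())[-1]
    return messages[-1].content


last_step = {"join": [FakeMessage("Thought: ..."), FakeMessage("final reply")]}
print(final_answer(last_step))
```

With the real graph, passing the last `step` from the loop above should give the same text as `step["join"][-1].content`.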

#### Multi-step math

In [40]:
for step in chain.stream(
    [
        HumanMessage(
            content="What's ((3*(4+5)/0.5)+3245) + 8? What's 32/4.23? What's the sum of those two values?"
        )
    ]
):
    print(step)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'plan_and_schedule': [FunctionMessage(content='3299.0', additional_kwargs={'idx': 1, 'args': {'problem': '3*(4+5)/0.5 + 3245'}}, name='math', id='996c6417-b012-4584-bc66-7467b6688666'), FunctionMessage(content='7.565011820330969', additional_kwargs={'idx': 2, 'args': {'problem': '32/4.23'}}, name='math', id='4ee57a16-eaee-4fda-ab68-5cafb626ab19'), FunctionMessage(content='join', additional_kwargs={'idx': 3, 'args': ()}, name='join', id='2b6df2f4-645e-491c-919d-efec894a76d9')]}


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


{'join': [AIMessage(content='Thought: I have calculated the two values successfully. Now, I need to add them together to give the final answer.', id='fd5df46f-b757-4fae-8653-1854071f6ee2'), AIMessage(content='The sum of ((3*(4+5)/0.5)+3245) + 8 and 32/4.23 is 3306.565011820330969.', id='9a6b1a47-7d64-44c7-a88b-0f0a7f12eb29')]}


In [41]:
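# Final answer
# The last streamed step is keyed by the node name ("join"), not END.
print(step["join"][-1].content)

The sum of ((3*(4+5)/0.5)+3245) + 8 and 32/4.23 is 3306.565011820330969.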
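The recorded answer is worth checking by hand. In the trace above, the planner passed `3*(4+5)/0.5 + 3245` to the math tool, without the trailing `+ 8`, and the joiner's total of 3306.565... matches the sum of the two tool outputs. The plain-Python check below (independent of the agent) evaluates the expression as the question states it and suggests the reported sum is off by 8.

```python
# Expression exactly as written in the question.
first = ((3 * (4 + 5) / 0.5) + 3245) + 8
second = 32 / 4.23
total = first + second

# Expression the math tool received in the recorded trace (no "+ 8").
tool_first = 3 * (4 + 5) / 0.5 + 3245

print(first)            # 3307.0
print(tool_first)       # 3299.0
print(round(total, 6))  # 3314.565012
```

Discrepancies like this are easy to miss in the final message alone, which is one reason to inspect the `args` recorded on each `FunctionMessage`.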

## Conclusion

Congrats on building your first LLMCompiler agent! I'll leave you with some known limitations of the implementation above:

1. The planner output parsing format is fragile if your function requires more than 1 or 2 arguments. We could make it more robust by using streaming tool calling.
2. Variable substitution is fragile in the example above. It could be made more robust by using a fine-tuned model and a stricter syntax (e.g., a grammar parsed with Lark, or a tool calling schema).
3. The state can grow quite long if you require multiple re-planning runs. To handle this, you could add a message compressor that runs once the state exceeds a certain token limit.
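To make the third point concrete, here is a minimal sketch of one simple form of compression: trimming. It drops the oldest intermediate messages once an approximate token budget is exceeded, rather than summarizing them. `Msg`, `approx_tokens`, and `compress` are hypothetical names for illustration, and the four-characters-per-token estimate is a rough assumption; a real tokenizer would be more accurate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Msg:
    """Hypothetical stand-in for a chat message."""

    role: str
    content: str


def approx_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def compress(messages: List[Msg], max_tokens: int) -> List[Msg]:
    """Keep the first message (the user question) plus the most recent
    messages that still fit in the remaining token budget."""
    if not messages:
        return []
    head, tail = messages[0], messages[1:]
    budget = max_tokens - approx_tokens(head.content)
    kept: List[Msg] = []
    # Walk backwards so the newest messages are kept first.
    for msg in reversed(tail):
        cost = approx_tokens(msg.content)
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return [head] + kept[::-1]


history = [
    Msg("human", "q" * 40),
    Msg("function", "a" * 400),  # a long, stale search result
    Msg("ai", "b" * 40),
    Msg("system", "c" * 40),
]
print([m.role for m in compress(history, max_tokens=40)])
```

In the graph, a function like this could run before each re-plan so the planner sees the original question and the latest joiner feedback but not every earlier tool result. Summarizing the dropped messages with an LLM call is an alternative that keeps more information at extra cost.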
