## Solving a complex task with a multi-agent hierarchy

The reception is approaching! With your help, Alfred is now nearly finished with the preparations.

But now there's a problem: the Batmobile has disappeared. Alfred needs to find a replacement, and find it quickly.

Fortunately, a few biopics have been made about Bruce Wayne's life, so maybe Alfred could get a car left behind on one of the movie sets and re-engineer it to modern standards, which would certainly include a full self-driving option.

But the car could be at any of the filming locations around the world, and there could be many of them.

So Alfred wants your help. Could you build an agent able to solve this task?

    👉 Find all Batman filming locations in the world, calculate the time to transfer via cargo plane to there, and represent them on a map, with a color varying by cargo plane transfer time. Also represent some supercar factories with the same cargo plane transfer time.

Let's build this!

Additional Packages needed: `pip install 'smolagents[litellm]' plotly geopandas shapely kaleido -q`


In [None]:
OPENAI_API_KEY = ''
RUNPOD_BASE_URL = 'https://copiv0isvfz56v-11434.proxy.runpod.net'
CODE_MODEL = 'deepseek-coder:33b'
TEXT_MODEL = 'deepseek-r1:32b'

### Tool to get the cargo plane transfer time

In [2]:
import math
from typing import Optional, Tuple
from smolagents import tool

@tool
def calculate_cargo_travel_time(
    origin_coords: Tuple[float, float],
    destination_coords: Tuple[float, float],
    cruising_speed: Optional[float] = 750.0,    # average speed for cargo planes
) -> float:
    """
    Calculate the travel time for a cargo plane between two points on Earth using Great Circle Distance.

    Args:
        origin_coords: Tuple of (latitude, longitude) for the starting point
        destination_coords: Tuple of (latitude, longitude) for the destination
        cruising_speed: Optional cruising speed of the cargo plane in km/h (defaults to 750 km/h for typical cargo planes)

    Returns:
        float: Travel time in hours

    Example:
        >>> # Chicago (41.8781° N, 87.6298° W) to Sydney (33.8688° S, 151.2093° E)
        >>> result = calculate_cargo_travel_time((41.8781, -87.6298), (-33.8688, 151.2093))
    """

    def to_radians(degrees: float) -> float:
        return degrees * (math.pi / 180)

    # Extract coordinates
    lat1, lon1 = map(to_radians, origin_coords)
    lat2, lon2 = map(to_radians, destination_coords)

    # Earth's radius in kilometers
    EARTH_RADIUS = 6371

    # Calculate great-circle distance using the haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = (
        math.sin(dlat/2) **2 
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2) **2
    )
    c = 2 * math.asin(math.sqrt(a))
    distance = EARTH_RADIUS * c

    # Add 10% to account for non-direct routes and air traffic controls
    actual_distance = distance * 1.1

    # Calculate flight time
    # Add 1 hour for takeoff and landing procedures
    flight_time = (actual_distance / cruising_speed) + 1.0

    return round(flight_time, 2)



In [3]:
print(calculate_cargo_travel_time((41.8781, -87.6298), (-33.8688, 151.2093)))

22.82
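To see where the 22.82 hours comes from, the calculation can be broken into its parts. The snippet below is a standalone sketch that repeats the same haversine formula outside the `@tool` wrapper; the helper name `great_circle_km` is introduced here for illustration and is not part of smolagents.

```python
import math

def great_circle_km(origin, destination, radius_km=6371.0):
    """Haversine distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))

chicago, sydney = (41.8781, -87.6298), (-33.8688, 151.2093)
distance = great_circle_km(chicago, sydney)
routed = distance * 1.1        # 10% detour allowance, as in the tool
hours = routed / 750.0 + 1.0   # 750 km/h cruise plus 1 h for takeoff and landing

print(f"great-circle: {distance:.0f} km, routed: {routed:.0f} km, time: {hours:.2f} h")
```

The final figure should agree with the tool output above, since both use the same radius, detour factor, and cruising speed.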


### Set up the agent

In [9]:
import os
from PIL import Image
from smolagents import CodeAgent, DuckDuckGoSearchTool, VisitWebpageTool
from smolagents import HfApiModel, LiteLLMModel, OpenAIServerModel
import litellm

# model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct", provider="together")   # HF ran out of requests
# Use Ollama model using the LiteLLMModel API as stated here https://www.reddit.com/r/AI_Agents/comments/1iip2wx/building_a_smolagent_with_ollama_and_external/
# litellm._turn_on_debug()
litellm._turn_on_json()
model = LiteLLMModel(
    model_id = 'ollama_chat/krith/qwen2.5-coder-32b-instruct:IQ2_M',
    api_base = RUNPOD_BASE_URL,
    max_tokens=8096
    # model_id = 'ollama/deepseek-r1:7b',
)


agent = CodeAgent(
    model = model,
    tools = [DuckDuckGoSearchTool(), VisitWebpageTool(), calculate_cargo_travel_time],
    additional_authorized_imports=["pandas", 'numpy'],
    max_steps = 20
)

task = """Find all Batman filming locations in the world, calculate the time to transfer via cargo plane to here (we're in Gotham, 40.7128° N, 74.0060° W), and return them to me as a pandas dataframe.
Also give me some supercar factories with the same cargo plane transfer time."""
result = agent.run(task)

In [10]:
result

6.77

### Adding more prompting, some planning steps
More detailed prompting, combined with periodic planning steps, helps the agent produce more contextualised results.



In [11]:
### Adding more prompting, some planning steps
agent.planning_interval = 4
detailed_report = agent.run(f"""
You're an expert analyst. You make comprehensive reports after visiting many websites.
Don't hesitate to search for many queries at once in a for loop.
For each data point that you find, visit the source url to confirm numbers.

{task}
""")
print(detailed_report)


Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.


Error in generating final LLM output:
litellm.APIConnectionError: Ollama_chatException - Server error '524 ' for url 'https://copiv0isvfz56v-11434.proxy.runpod.net/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/524


In [12]:
print(detailed_report)

Error in generating final LLM output:
litellm.APIConnectionError: Ollama_chatException - Server error '524 ' for url 'https://copiv0isvfz56v-11434.proxy.runpod.net/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/524
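A 524 is typically a proxy-side timeout (Cloudflare uses this code when the origin server takes too long to answer), so the run may succeed on a later attempt or with a faster model. One possible workaround is to wrap the call in a small retry loop with exponential backoff. The helper below is a hypothetical sketch, not a smolagents feature; it treats both raised exceptions and returned strings starting with "Error" (the shape of the output printed above) as failures.

```python
import time

def run_with_retries(run_once, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call run_once() until it succeeds or the attempts are exhausted.

    A result counts as failed if run_once raises, or if it returns a string
    starting with "Error".
    """
    last_failure = None
    for attempt in range(attempts):
        try:
            result = run_once()
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_failure = exc
        else:
            if not (isinstance(result, str) and result.startswith("Error")):
                return result
            last_failure = RuntimeError(result)
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # wait 2 s, then 4 s, then 8 s, ...
    raise RuntimeError(f"all {attempts} attempts failed") from last_failure
```

With the agent from this notebook, usage might look like `run_with_retries(lambda: agent.run(task))`. Retrying only helps with transient timeouts; if the model is simply too slow for the proxy limit, a smaller model or a shorter prompt is the more likely fix.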


### Splitting the task between multiple agents

Multi-agent structures make it possible to keep separate memories for different sub-tasks, which has two major benefits:
- Each agent is more focused on its core task, thus more performant
- Separating memories reduces the count of input tokens at each step, thus reducing latency and cost.
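
The second point can be illustrated with a toy calculation. This is not how smolagents stores memory internally; it is a sketch with made-up message contents and a crude word count standing in for tokens, showing why a manager that only receives a summary carries far less context than a single agent that keeps every page dump.

```python
def count_tokens(messages):
    """Crude token proxy: whitespace-separated words across all messages."""
    return sum(len(m.split()) for m in messages)

# Hypothetical transcript: three long web-browsing steps and a short summary of them.
page_dumps = ["page " * 500, "page " * 800, "page " * 300]
summary = "Found 3 filming locations with coordinates."

# Single agent: every page dump stays in the one shared memory.
single_memory = ["task: build the map"] + page_dumps

# Two agents: the dumps live in the web agent's memory,
# and the manager keeps only the summary it got back.
web_memory = ["subtask: find filming locations"] + page_dumps
manager_memory = ["task: build the map", summary]

print("single agent:", count_tokens(single_memory))
print("manager only:", count_tokens(manager_memory))
```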

Let's create a team with a dedicated web search agent, managed by another agent.

#### Web Agent

In [25]:
# model = HfApiModel(
#     "Qwen/Qwen2.5-Coder-32B-Instruct", provider="together", max_tokens=8096
# )
model = LiteLLMModel(
    model_id = f'ollama_chat/{CODE_MODEL}',
    api_base= RUNPOD_BASE_URL,
    max_tokens=8096
)
web_agent = CodeAgent(
    model=model,
    tools=[
        DuckDuckGoSearchTool(),
        VisitWebpageTool(),
        calculate_cargo_travel_time,
    ],
    name="web_agent",
    description="Browses the web to find information",
    verbosity_level=0,
    max_steps=10,
)

#### Manager agent
The manager agent will need to do the mental heavy lifting: planning, visualisation, and so on.

In [26]:
from smolagents.utils import encode_image_base64, make_image_url
from smolagents import OpenAIServerModel

def check_reasoning_and_plot(final_answer, agent_memory):
    multimodal_model = OpenAIServerModel("gpt-4o-mini", api_key=OPENAI_API_KEY, max_tokens=8096)
    filepath = 'saved_map.png'
    assert os.path.exists(filepath), "Make sure to save the plot under 'saved_map.png'"

    image = Image.open(filepath)

    prompt = (
        f"Here is a user-given task and the agent steps: {agent_memory.get_succinct_steps()}. Now here is the plot that was made. "
        "Please check that the reasoning process and plot are correct: do they correctly answer the given task? "
        "First list reasons why yes/no, then write your final decision: PASS in caps lock if it is satisfactory, FAIL if it is not. "
        "Don't be harsh: if the plot mostly solves the task, it should pass. "
        "To pass, a plot should be made using px.scatter_map and not any other method (scatter_map looks nicer)."
    )

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": make_image_url(encode_image_base64(image))}},
            ],
        }
    ]
    
    output = multimodal_model(messages).content
    print("Feedback: ", output)


    if "FAIL" in output:
        raise Exception(output)
    return True
    

manager_agent = CodeAgent(
    # model=HfApiModel('deepseek-ai/DeepSeek-R1', provider="together", max_tokens=8096),
    model=LiteLLMModel(f"ollama_chat/{TEXT_MODEL}", max_tokens=8096, api_base=RUNPOD_BASE_URL),
    tools=[calculate_cargo_travel_time],
    managed_agents=[web_agent],
    additional_authorized_imports=[
        "geopandas",
        "plotly",
        "shapely",
        "json",
        "pandas",
        "numpy"
    ],
    planning_interval=4,
    verbosity_level=2,
    final_answer_checks=[check_reasoning_and_plot],
    max_steps=15

)

In [27]:
manager_agent.visualize()

In [28]:
manager_agent.run("""
    Find all Batman filming locations in the world, calculate the time to transfer via cargo plane to here (we're in Gotham, 40.7128° N, 74.0060° W).
Also give me some supercar factories with the same cargo plane transfer time. You need at least 6 points in total.
Represent this as spatial map of the world, with the locations represented as scatter points with a color that depends on the travel time, and save it to saved_map.png!

Here's an example of how to plot and return a map:
import plotly.express as px
df = px.data.carshare()
fig = px.scatter_map(df, lat="centroid_lat", lon="centroid_lon", color="peak_hour", size="car_hours",
     color_continuous_scale=px.colors.sequential.Magma, size_max=15, zoom=1)
fig.show()
fig.write_image("saved_map.png")
final_answer(fig)

Never try to process strings using code: when you have a string to read, just print it and you'll see it.
"""
)


Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.


'<think>\nAlright, I\'m trying to figure out how to help the user with their query about Batman filming locations and supercar factories. Let\'s break down what they\'re asking for.\n\nFirst, the user wants all Batman filming locations worldwide. Since Gotham is a fictional city, but in reality, we can use New York City as its approximate location (40.7128° N, 74.0060° W). So, I\'ll need to list these filming locations along with their coordinates.\n\nNext, they want the travel time via cargo plane from Gotham to each of these locations and also to some supercar factories. The user specified that there should be at least six points in total. That means if there are three Batman filming locations, we need three supercar factories to meet the six-point requirement.\n\nI recall that the previous attempt used the \'shapely\' library but ran into an error because it didn\'t have the \'geodesic\' attribute. I think this might be a misunderstanding; perhaps the user meant to use another metho

In [29]:
manager_agent.python_executor.state['fig']

KeyError: 'fig'
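
The `KeyError` suggests that the run never assigned a variable named `fig` in the executor state, which is consistent with the truncated reasoning output above. Assuming the state behaves like a plain dict (as the subscript lookup implies), a defensive lookup avoids the exception and shows which names are available. The dict below is a made-up stand-in for `manager_agent.python_executor.state`.

```python
# Hypothetical stand-in for manager_agent.python_executor.state
# after a run that never defined `fig`.
state = {"locations": [(51.5, -0.1)], "travel_times": [7.3]}

fig = state.get("fig")  # returns None instead of raising KeyError
if fig is None:
    print("No 'fig' in executor state; available names:", sorted(state))
else:
    fig.show()
```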