👉 Find all Batman filming locations in the world, calculate the cargo plane transfer time to each of them, and represent them on a map, with a color that varies with the transfer time. Also represent some supercar factories with the same cargo plane transfer time.

In [16]:
!pip install 'smolagents[litellm]' plotly matplotlib geopandas shapely kaleido duckduckgo_search -q


In [2]:
from huggingface_hub import notebook_login
notebook_login()


In [3]:
# We first make a tool to get the cargo plane transfer time
import math
from typing import Optional, Tuple
from smolagents import tool

In [5]:
@tool
def calculate_cargo_travel_time(
    origin_coords: Tuple[float, float],
    destination_coords: Tuple[float, float],
    cruising_speed_kmh: Optional[float] = 750.0,  # Average speed for cargo planes
) -> float:

    """ Calculate the travel time for a cargo plane between two points on Earth using great-circle distance.
    
    Args:
        origin_coords: Tuple of (latitude,longitude) for starting point
        destination_coords: Tuple of (latitude, longitude) for end point
        cruising_speed_kmh: Optional cruising speed in km/h (defaults to 750.0 km/h for typical cargo planes)
    
    Returns:
        float:  The estimated travel time in hours

    Example:
        >>> # Ikeja (6.6018° N, 3.3515° E) to Apapa (6.4483° N, 3.3675° E)
        >>> result = calculate_cargo_travel_time((6.6018, 3.3515), (6.4483, 3.3675))
    """
    def to_radians(degrees:float) -> float:
        return degrees * (math.pi/180)

    #Extract coordinates
    lat1, lon1 = map(to_radians, origin_coords)
    lat2, lon2 = map(to_radians, destination_coords)

    #Earth's radius in km
    Earth_radius_km = 6371.0

    #Calculate great_circle distance using haversine formula
    dlon= lon2 - lon1
    dlat= lat2 - lat1

    a = (
        math.sin(dlat/2) **2 
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2 
    )
    c = 2 * math.asin(math.sqrt(a))
    distance = Earth_radius_km * c

    #Add 10% to account for routes that are not direct and air-traffic controls
    actual_distance = distance * 1.1

    #Calculate flight time
    #Add 1 hour for take off and landing procedure
    flight_time = (actual_distance/cruising_speed_kmh) + 1.0
    #format the results
    return round(flight_time,2)
    

In [8]:
print(calculate_cargo_travel_time((41.8781, -87.6298), (-33.8688, 151.2093))) #Chicago to Sydney

22.82
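
To see where that number comes from, the estimate can be split into its parts. The sketch below is a standalone re-implementation of the same steps as the tool (haversine distance, a 10% routing margin, one hour for take-off and landing); `haversine_km` and `breakdown` are illustrative helper names, not part of smolagents.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(origin, destination):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return EARTH_RADIUS_KM * 2 * math.asin(math.sqrt(a))


def breakdown(origin, destination, speed_kmh=750.0):
    """Return each intermediate quantity of the travel time estimate."""
    direct_km = haversine_km(origin, destination)
    routed_km = direct_km * 1.1  # 10% margin for indirect routes
    cruise_hours = routed_km / speed_kmh
    return {
        "direct_km": direct_km,
        "routed_km": routed_km,
        "cruise_hours": cruise_hours,
        "total_hours": round(cruise_hours + 1.0, 2),  # +1 h take-off and landing
    }


# Chicago to Sydney, the same pair of points as in the cell above
parts = breakdown((41.8781, -87.6298), (-33.8688, 151.2093))
for name, value in parts.items():
    print(f"{name}: {value:.2f}")
```

The `total_hours` entry should match the value printed by the tool, and the other entries show how much of it is actual cruising time versus fixed overhead.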


For the model provider, we use Together AI, one of the new inference providers on the Hub!

Regarding the GoogleSearchTool: it requires either setting the SERPAPI_API_KEY environment variable and passing provider="serpapi", or setting SERPER_API_KEY and passing provider="serper".

If you don't have a SERP API provider set up, you can use DuckDuckGoSearchTool instead, but beware that it has a rate limit.
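
One possible way to make that choice automatically is to look at which key is present in the environment. This is only a sketch: `pick_search_provider` is a hypothetical helper, and the commented line shows how its result might be passed to the smolagents tools named above.

```python
import os
from typing import Mapping, Optional


def pick_search_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return "serpapi" or "serper" if the matching API key is set, else None."""
    if env.get("SERPAPI_API_KEY"):
        return "serpapi"
    if env.get("SERPER_API_KEY"):
        return "serper"
    return None


provider = pick_search_provider()
# With smolagents imported, the result could be used like this:
# search_tool = GoogleSearchTool(provider=provider) if provider else DuckDuckGoSearchTool()
print(provider or "no key found: fall back to DuckDuckGoSearchTool (rate-limited)")
```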

In [12]:
import os
from PIL import Image
from smolagents import CodeAgent, GoogleSearchTool, HfApiModel, VisitWebpageTool, DuckDuckGoSearchTool

model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct", provider="together")

We can start by creating a simple baseline agent to give us a first report.

In [11]:
task = """Find all Batman filming locations in the world, calculate the time to transfer via cargo
plane to here (we're in Gotham, 40.7128° N, 74.0060° W), and return them to me as a pandas dataframe.
Also give me some supercar factories with the same cargo plane transfer time."""

In [18]:
agent = CodeAgent(model=model, tools=[DuckDuckGoSearchTool(), VisitWebpageTool(), calculate_cargo_travel_time],
                  additional_authorized_imports=["pandas"], max_steps=20)

In [19]:
result = agent.run(task)

In [20]:
result

(    Location   Latitude  Longitude  Travel Time (hours)
 0     London  51.511531  -0.112160             7.428140
 1    Glasgow  55.862116  -4.230996             6.910755
 2  Liverpool  53.408366  -2.991572             7.098441
 3    Chicago  41.878113 -87.629799             1.525722,
                            Location  Latitude  Longitude  Travel Time (hours)
 0                 Ferrari Maranello   44.5964    10.8856             8.836413
 1  Lamborghini Sant'Agata Bolognese   44.6975    11.1948             8.857686
 2     Pagani San Cesario sul Panaro   44.7500    10.9667             8.832893
 3                    McLaren Woking   51.3778    -0.5714             7.394046
 4            Koenigsegg Gemeneboden   58.4110    15.6215             8.312714)

We could already improve this a bit by throwing in some dedicated planning steps, and adding more prompting.

In [21]:
agent.planning_interval = 4

detailed_report = agent.run(f"""You're an expert analyst. You make comprehensive reports after visiting many websites.
Don't hesitate to search for many queries at once in a for loop.
For each data point that you find, visit the source url to confirm numbers.

{task}""")
print(detailed_report)

                          Location          Coordinates  Travel Time (hours)
0  Royal Liver Building, Liverpool   (53.4084, -2.9916)                 8.23
1      Glasgow Necropolis, Glasgow    (55.8621, -4.231)                 8.04
2           2 Temple Place, London   (51.5115, -0.1122)                 8.56
3      Holborn Tram Tunnel, London      (51.52, -0.105)                 8.56
4                          Chicago  (41.8781, -87.6298)                 2.55


In [22]:
detailed_report

Unnamed: 0,Location,Coordinates,Travel Time (hours)
0,"Royal Liver Building, Liverpool","(53.4084, -2.9916)",8.23
1,"Glasgow Necropolis, Glasgow","(55.8621, -4.231)",8.04
2,"2 Temple Place, London","(51.5115, -0.1122)",8.56
3,"Holborn Tram Tunnel, London","(51.52, -0.105)",8.56
4,Chicago,"(41.8781, -87.6298)",2.55


Thanks to these quick changes, we obtained a much more concise report by simply providing our agent a detailed prompt, and giving it planning capabilities!

💸 But as you can see, the context window is quickly filling up. So if we ask our agent to combine the results of one detailed search with another, it will be slower and will quickly ramp up token usage and costs.

➡️ We need to improve the structure of our system.

✌️ Splitting the task between two agents
Multi-agent structures make it possible to separate memories between different sub-tasks, with two great benefits:

- Each agent is more focused on its core task, and thus more performant.
- Separating memories reduces the count of input tokens at each step, thus reducing latency and cost.

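
The second benefit can be illustrated with a rough back-of-the-envelope model. The sketch below assumes that each step adds a fixed number of tokens to the agent's memory and that the whole memory is sent back to the model at every step; the step counts and token sizes are made-up numbers, and the cost of the manager reading the sub-agent's report is ignored.

```python
def total_input_tokens(steps, tokens_per_step, prompt_tokens):
    """Sum the input tokens sent to the model over a whole run.

    Toy model: at step i the model receives the prompt plus the memory
    accumulated over the i previous steps.
    """
    return sum(prompt_tokens + i * tokens_per_step for i in range(steps))


# One agent doing everything in 20 steps (hypothetical sizes)
single = total_input_tokens(steps=20, tokens_per_step=1000, prompt_tokens=500)

# Two agents with separate memories, 10 steps each
split = 2 * total_input_tokens(steps=10, tokens_per_step=1000, prompt_tokens=500)

print(single, split)  # 200000 100000
```

Because memory grows with every step, the total input grows roughly with the square of the step count, which is why splitting one long run into two shorter ones roughly halves it in this toy setting.
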
Let's create a team with a dedicated web search agent, managed by another agent.

The manager agent needs plotting capabilities to write its final report, so let us give it access to additional imports, including plotly, and geopandas + shapely for spatial plotting.

In [24]:
model = HfApiModel(
    "Qwen/Qwen2.5-Coder-32B-Instruct", provider="together", max_tokens=8096
)

web_agent = CodeAgent(
    model=model,
    tools=[
        DuckDuckGoSearchTool(),
        VisitWebpageTool(),
        calculate_cargo_travel_time,
    ],
    name="web_agent",
    description="Browses the web to find information",
    verbosity_level=0,
    max_steps=10,
)

In [25]:
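import os
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
# Expose the key as an environment variable so the OpenAI client can pick it up
os.environ["OPENAI_API_KEY"] = user_secrets.get_secret("OPENAI_API_KEY")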

In [28]:
from smolagents.utils import encode_image_base64, make_image_url
from smolagents import OpenAIServerModel


def check_reasoning_and_plot(final_answer, agent_memory):
    multimodal_model = OpenAIServerModel("gpt-4o", max_tokens=8096)
    filepath = "saved_map.png"
    assert os.path.exists(filepath), "Make sure to save the plot under saved_map.png!"
    image = Image.open(filepath)
    # Each fragment ends with a space so the concatenated sentences do not run together
    prompt = (
        f"Here is a user-given task and the agent steps: {agent_memory.get_succinct_steps()}. Now here is the plot that was made. "
        "Please check that the reasoning process and plot are correct: do they correctly answer the given task? "
        "First list reasons why yes/no, then write your final decision: PASS in caps lock if it is satisfactory, FAIL if it is not. "
        "Don't be harsh: if the plot mostly solves the task, it should pass. "
        "To pass, a plot should be made using px.scatter_map and not any other method (scatter_map looks nicer)."
    )
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt,
                },
                {
                    "type": "image_url",
                    "image_url": {"url": make_image_url(encode_image_base64(image))},
                },
            ],
        }
    ]
    output = multimodal_model(messages).content
    print("Feedback: ", output)
    if "FAIL" in output:
        raise Exception(output)
    return True


manager_agent = CodeAgent(
    model=HfApiModel("deepseek-ai/DeepSeek-R1", provider="together", max_tokens=8096),
    tools=[calculate_cargo_travel_time],
    managed_agents=[web_agent],
    additional_authorized_imports=[
        "geopandas",
        "plotly",
        "shapely",
        "json",
        "pandas",
        "numpy",
    ],
    planning_interval=5,
    verbosity_level=2,
    final_answer_checks=[check_reasoning_and_plot],
    max_steps=15,
)

In [29]:
manager_agent.visualize()

In [30]:
manager_agent.run("""
Find all Batman filming locations in the world, calculate the time to transfer via cargo plane to here 
(we're in Gotham, 40.7128° N, 74.0060° W).
Also give me some supercar factories with the same cargo plane transfer time. You need at least
6 points in total.
Represent this as spatial map of the world, with the locations represented as scatter points with 
a color that depends on the travel time, and save it to saved_map.png!

Here's an example of how to plot and return a map:
import plotly.express as px
df = px.data.carshare()
fig = px.scatter_map(df, lat="centroid_lat", lon="centroid_lon", text="name", color="peak_hour", size=100,
     color_continuous_scale=px.colors.sequential.Magma, size_max=15, zoom=1)
fig.show()
fig.write_image("saved_map.png")
final_answer(fig)

Never try to process strings using code: when you have a string to read, just print it and you'll see it.
""")

'<think>\n\nOkay, let\'s tackle this problem step by step. The user wants a map showing Batman filming locations and supercar factories with similar cargo plane transfer times from Gotham (NYC coordinates). \n\nFirst, I need to gather Batman filming locations. The previous attempts showed that using the web_agent to search for these locations worked, but there were issues with parsing the data. The agent returned a list of locations with coordinates. However, when trying to access them, there were errors because the data wasn\'t properly converted from a string to a dictionary. So, I need to make sure that the data is correctly parsed. Since using eval or ast wasn\'t allowed, using regex to extract the tuples seems like the right approach.\n\nNext, calculate the travel times from Gotham to each Batman location using the calculate_cargo_travel_time function. This function likely computes the great-circle distance and divides by the cargo plane speed (assumed 750 km/h). \n\nThen, find su

In [33]:
manager_agent.python_executor.state["fig"]

{'data': [{'type': 'scattermapbox',
   'lat': [53.408366, 51.507351, 51.653811, 52.132633, 55.864237, 55.804719, 41.878113,
    51.508014, 40.712776, 37.774929, 52.520007, 41.385064, 43.65107, 53.349805,
    47.497912, 44.5323, 44.6586, 48.5318, 51.3334, 44.9524, 57.4907],
   'lon': [-2.991572, -0.127758, -0.406253, -0.431811, -4.251806, -3.782781, -87.629799,
    -0.128035, -74.005974, -122.419418, 13.404954, 2.173403, -79.347015, -6.26031,
    19.040235, 10.8641, 11.1277, 7.5124, -0.7425, 9.9936, 12.6735],
   'mode': 'markers',
   'marker': {'size': 9,
    'color': [8.81, 9.17, 9.14, 9.11, 8.6, 8.64, 2.68, 9.17, 1.0, 7.06, 10.36,
     10.04, 1.8, 8.5, 11.28, 10.72, 10.74, 10.09, 9.12, 10.6, 9.97],
    'colorscale': 'Magma',
    'showscale': True},
 