### **Swarm-Style Agent Tutorial with LanceDB for Retrieval**  

This tutorial builds a **Swarm-style agent system** in which multiple agents hand off to one another to answer a user's request collaboratively. **LanceDB** handles retrieval, supporting both vector search and full-text search (FTS).  



![Trip Planner Agent](https://github.com/akashAD98/vectordb-recipes/blob/new_planner_swarm_lancedb/assets/trip_planner_agent.png?raw=1)


### **Key Features**  
🚀 **Multi-Agent Collaboration** – Specialized agents work together, seamlessly passing context to one another.  
🔧 **Customizable Handoff Tools** – Built-in mechanisms for smooth communication between agents.  
📂 **LanceDB for Data Retrieval** – High-performance vector search and full-text search for accurate and fast information retrieval.  
🌍 **Travel Agent Use Case** – Agents collaborate to handle different aspects of travel planning, ensuring efficient and context-aware responses.  
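
To make the handoff idea concrete before any libraries are installed, here is a minimal, dependency-free sketch. It is not how `langgraph_swarm` is implemented: `ToySwarm`, `flight_agent`, and `hotel_agent` are hypothetical names, and keyword matching stands in for the LLM's decision to call a handoff tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Each toy agent returns (reply, name_of_agent_that_should_be_active_next).
# Returning another agent's name models a handoff.
def flight_agent(message: str) -> Tuple[str, str]:
    if "hotel" in message.lower():
        return "Transferring you to the hotel assistant.", "hotel_assistant"
    return f"[flights] searching for: {message}", "flight_assistant"


def hotel_agent(message: str) -> Tuple[str, str]:
    if "flight" in message.lower():
        return "Transferring you to the flight assistant.", "flight_assistant"
    return f"[hotels] searching for: {message}", "hotel_assistant"


@dataclass
class ToySwarm:
    agents: Dict[str, Callable[[str], Tuple[str, str]]]
    active: str
    history: List[str] = field(default_factory=list)

    def send(self, message: str) -> str:
        self.history.append(f"user: {message}")
        reply, next_agent = self.agents[self.active](message)
        if next_agent != self.active:
            # Handoff: record the transfer, switch agents, and let the
            # new agent answer the same message with the shared history.
            self.history.append(f"{self.active}: {reply}")
            self.active = next_agent
            reply, _ = self.agents[self.active](message)
        self.history.append(f"{self.active}: {reply}")
        return reply


swarm = ToySwarm(
    agents={"flight_assistant": flight_agent, "hotel_assistant": hotel_agent},
    active="flight_assistant",
)
print(swarm.send("can you check delhi hotels"))
print("active agent:", swarm.active)
```

The point of the sketch is that the swarm remembers which agent is active, so follow-up messages go straight to that agent until another handoff occurs.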



### Importing the relevant libraries

In [None]:
%%capture
!pip install lancedb pandas sentence-transformers requests openai tantivy langchain_openai langgraph langgraph_swarm

In [None]:
import datetime
from typing import Callable, List, Dict, Any
from langchain_core.runnables import RunnableConfig
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.prebuilt import create_react_agent
from langgraph_swarm import create_handoff_tool, create_swarm

import lancedb
from lancedb.embeddings import EmbeddingFunctionRegistry
from lancedb.pydantic import LanceModel, Vector
import pandas as pd
from openai import OpenAI

import os
import json
import ast

# Set OpenAI API key
OPENAI_API_KEY = "sk-proj-your_key"


os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
client = OpenAI(api_key=OPENAI_API_KEY)

# Initialize LanceDB and load data
db = lancedb.connect("/tmp/plannerh&f")

# Define directory and file paths
save_dir = "/content/data"
flight_url = "https://raw.githubusercontent.com/akashAD98/dummy_data/main/CSV_Data/final_flight_data.csv"
hotel_url = "https://raw.githubusercontent.com/akashAD98/dummy_data/main/CSV_Data/final_hotels_dataN.csv"

flight_path = os.path.join(save_dir, "final_flight_data.csv")
hotel_path = os.path.join(save_dir, "final_hotels_dataN.csv")

# Create directory if not exists
os.makedirs(save_dir, exist_ok=True)

# Download files if not already present
if not os.path.exists(flight_path):
    os.system(f"wget -O {flight_path} {flight_url}")

if not os.path.exists(hotel_path):
    os.system(f"wget -O {hotel_path} {hotel_url}")

# Load data
flight_data = pd.read_csv(flight_path, index_col=0)
df_hotel = pd.read_csv(hotel_path, index_col=0)

# Print confirmation
print("Files downloaded and loaded successfully!")

Files downloaded and loaded successfully!


In [None]:
# Set up embeddings
registry = EmbeddingFunctionRegistry.get_instance()
func = registry.get("sentence-transformers").create(device="cpu")


# Define schemas
class FlightWords(LanceModel):
    id: str = func.SourceField()  # Index column
    Flight_Number: str = func.SourceField()
    Airline: str = func.SourceField()
    Origin_City: str = func.SourceField()
    Origin_Airport: str = func.SourceField()
    Destination_City: str = func.SourceField()
    Destination_Airport: str = func.SourceField()
    Departure_Time: str = func.SourceField()
    Arrival_Time: str = func.SourceField()
    Duration: str = func.SourceField()
    Price: str = func.SourceField()
    Available_Seats: str = func.SourceField()
    Class: str = func.SourceField()
    Flight_details: str = func.SourceField()
    vector: Vector(func.ndims()) = func.VectorField()


class HotelWords(LanceModel):
    id: str = func.SourceField()  # Index column
    ID: str = func.SourceField()
    Name: str = func.SourceField()
    City: str = func.SourceField()
    Neighborhood: str = func.SourceField()
    Address: str = func.SourceField()
    Stars: str = func.SourceField()
    Price_Per_Night: str = func.SourceField()
    Amenities: str = func.SourceField()
    Rating: str = func.SourceField()

    all_hotel_details: str = func.SourceField()
    vector: Vector(func.ndims()) = func.VectorField()


# Reset index to make it a column
flight_data = flight_data.reset_index().rename(columns={"index": "id"})
df_hotel = df_hotel.reset_index().rename(columns={"index": "id"})

# Create and populate tables
flight_table = db.create_table(
    "flight_recommendations", schema=FlightWords, mode="overwrite"
)
flight_table.add(data=flight_data)
# Build the FTS index after the rows are added so that they are indexed
flight_table.create_fts_index(
    ["Airline", "Flight_details"], use_tantivy=True, replace=True
)

hotel_table = db.create_table(
    "hotel_recommendations", schema=HotelWords, mode="overwrite"
)
hotel_table.add(data=df_hotel)
# Build the FTS index after the rows are added so that they are indexed
hotel_table.create_fts_index(
    ["City", "all_hotel_details"], use_tantivy=True, replace=True
)


#### Processing the user query into structured data

In [None]:
def format_flight_details(flight_details):
    """Extract and format flight details from the string"""
    parts = {}
    try:
        # Split the flight details string into components
        details = flight_details["Flight_details"].split("|")
        parts["flight_info"] = details[0].strip()
        parts["route"] = details[1].strip()
        parts["timing"] = details[2].strip()
        parts["price_seats"] = details[3].strip()
        parts["class"] = details[4].strip() if len(details) > 4 else "N/A"
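    except (KeyError, IndexError, AttributeError):
        # Fall back to the raw value when the expected pipe-delimited layout is missing
        parts["raw_details"] = flight_details.get("Flight_details", "N/A")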
    return parts


def format_search_results(results):
    """Format search results in a clean, structured way"""
    formatted_results = {"total_results": len(results), "flights": []}

    for flight in results:
        details = format_flight_details(flight)
        formatted_flight = {
            "flight_number": flight["Flight_Number"],
            "airline": flight.get("Airline", "N/A"),
            "origin": flight.get("Origin_City", "N/A"),
            "destination": flight.get("Destination_City", "N/A"),
            "departure": flight.get("Departure_Time", "N/A"),
            "price": flight.get("Price", "N/A"),
            "details": details,
        }
        formatted_results["flights"].append(formatted_flight)

    return formatted_results
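
The helper above assumes that `Flight_details` is a pipe-delimited string. The sketch below shows the same splitting idea in isolation. The function name `split_flight_details` and the sample string are hypothetical, since the real column contents are not shown in this notebook.

```python
from typing import Dict


def split_flight_details(raw: str) -> Dict[str, str]:
    """Split a pipe-delimited details string into named parts."""
    keys = ["flight_info", "route", "timing", "price_seats", "class"]
    fields = [part.strip() for part in raw.split("|")]
    if len(fields) < 4:
        # Too few fields to label reliably: keep the raw text instead
        return {"raw_details": raw}
    parts = dict(zip(keys, fields))
    # The class field is optional, as in format_flight_details
    parts.setdefault("class", "N/A")
    return parts


# Hypothetical sample row; the actual dataset may use a different layout.
sample = "SP992 SpiceJet | Pune -> Delhi | 21:45, 2h 00m | 6563.82, 66 seats | Economy"
print(split_flight_details(sample))
```

Strings with fewer than four fields are returned under `raw_details`, mirroring the fallback branch of `format_flight_details`.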

In [None]:
# Define tools for flight search
def search_flights(query: str) -> Dict:
    """Search flights using LanceDB with improved formatting"""
    try:
        # Get structured parameters from GPT
        completion = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {
                    "role": "user",
                    "content": f"""Query String: {query}

        Now from the query string above extract these following entities for flight search:
        1. Origin City/Airport: Extract the departure city or airport
        2. Destination City/Airport: Extract the destination city or airport
        3. Travel Date/Time (if mentioned): Extract any date or time preferences
        4. Class Preference (if mentioned): Extract class preference (Economy, Business, etc.)
        5. Airline Preference (if mentioned): Extract specific airline if mentioned
        6. Price Range (if mentioned): Extract any price preferences or budget constraints
        7. Other Preferences: Extract any other specific requirements

        Return ONLY a Python dictionary with these keys. If any value is not mentioned in the query, set it as None. Do not include any additional text or explanation.""",
                }
            ],
        )

        # Parse the GPT response using ast.literal_eval for safety
        try:
            params = ast.literal_eval(completion.choices[0].message.content)
        except (ValueError, SyntaxError, TypeError):
            # Fallback to using the raw query if parsing fails
            params = {"query": query}

        # Search flights using the parameters
        results = (
            flight_table.search(str(params))
            .limit(5)
            .select(
                [
                    "Flight_Number",
                    "Airline",
                    "Origin_City",
                    "Destination_City",
                    "Departure_Time",
                    "Price",
                    "Flight_details",
                ]
            )
            .to_list()
        )

        # Format the results using our formatting functions
        formatted_results = format_search_results(results)
        return formatted_results
    except Exception as e:
        # Return a structured error response
        return {"error": str(e), "status": "error", "total_results": 0, "flights": []}


def search_hotels(query: str) -> Dict:
    """Search hotels using LanceDB with improved formatting"""
    try:
        # Get structured parameters from GPT
        completion = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {
                    "role": "user",
                    "content": f"""Query String: {query}

        Now from the query string above extract these following entities for hotel search:
        1. City: Extract the city where the user wants to stay
        2. Neighborhood (if mentioned): Extract specific area or neighborhood preference
        3. Price Range (if mentioned): Extract budget constraints or price preferences per night
        4. Star Rating (if mentioned): Extract preferred hotel star rating
        5. Amenities (if mentioned): Extract specific amenities requirements
        6. Guest Rating (if mentioned): Extract minimum rating requirements
        7. Special Requirements (if mentioned): Extract any other specific needs

        Return ONLY a Python dictionary with these keys. If any value is not mentioned in the query, set it as None. Do not include any additional text or explanation.""",
                }
            ],
        )

        # Parse the GPT response using ast.literal_eval for safety
        try:
            params = ast.literal_eval(completion.choices[0].message.content)
        except (ValueError, SyntaxError, TypeError):
            # Fallback to using the raw query if parsing fails
            params = {"query": query}

        # Search hotels using the parameters
        results = (
            hotel_table.search(str(params))
            .limit(5)
            .select(
                [
                    "Name",
                    "City",
                    "Neighborhood",
                    "Stars",
                    "Price_Per_Night",
                    "Rating",
                    "Amenities",
                    "all_hotel_details",
                ]
            )
            .to_list()
        )

        # Format the results
        formatted_results = {
            "total_results": len(results),
            "hotels": [
                {
                    "name": hotel["Name"],
                    "city": hotel["City"],
                    "neighborhood": hotel["Neighborhood"],
                    "stars": hotel["Stars"],
                    "price_per_night": hotel["Price_Per_Night"],
                    "rating": hotel["Rating"],
                    "amenities": hotel["Amenities"],
                    "details": hotel["all_hotel_details"],
                }
                for hotel in results
            ],
        }

        return formatted_results
    except Exception as e:
        # Return a structured error response
        return {"error": str(e), "status": "error", "total_results": 0, "hotels": []}
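
Both search tools ask the model for a Python dictionary and parse the reply with `ast.literal_eval`. A model may instead reply with JSON (using `null` rather than `None`) or wrap the reply in a Markdown code fence, and in either case the tools silently fall back to the raw query. The sketch below is one possible, more tolerant parser. `parse_llm_dict` is a hypothetical helper, not part of the notebook or of any library.

```python
import ast
import json
from typing import Any, Dict

FENCE = "`" * 3  # Markdown code-fence marker


def parse_llm_dict(text: str, fallback_query: str) -> Dict[str, Any]:
    """Best-effort conversion of a model reply into a dictionary."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with its optional language tag)
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence, if present
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    # Try JSON first, then a Python literal
    for loader in (json.loads, ast.literal_eval):
        try:
            value = loader(cleaned)
        except (ValueError, SyntaxError, TypeError):
            continue
        if isinstance(value, dict):
            return value
    # Same fallback as the search tools: use the raw query
    return {"query": fallback_query}


print(parse_llm_dict('{"City": "Pune", "Stars": null}', "pune hotels"))
print(parse_llm_dict("Sorry, I cannot help.", "pune hotels"))
```

Inside `search_flights` or `search_hotels`, the inner `try`/`except` could be replaced by a call such as `params = parse_llm_dict(completion.choices[0].message.content or "", query)`.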

In [None]:
# Create handoff tools
transfer_to_hotel_assistant = create_handoff_tool(
    agent_name="hotel_assistant",
    description="Transfer user to the hotel-booking assistant that can search for and book hotels.",
)

transfer_to_flight_assistant = create_handoff_tool(
    agent_name="flight_assistant",
    description="Transfer user to the flight-booking assistant that can search for and book flights.",
)


# Define agent prompt
def make_prompt(base_system_prompt: str) -> Callable:
    def prompt(state: dict, config: RunnableConfig) -> list:
        system_prompt = base_system_prompt + f"\nToday is: {datetime.datetime.now()}"
        return [{"role": "system", "content": system_prompt}] + state["messages"]

    return prompt


# Initialize the LLM
model = ChatOpenAI(model="gpt-4")


# Define flight assistant
flight_assistant = create_react_agent(
    model,
    [search_flights, transfer_to_hotel_assistant],
    prompt=make_prompt(
        """You are a flight booking assistant for domestic Indian flights only. You help users find flights within India using the search_flights tool.
    When users ask about hotels, transfer them to the hotel assistant.

    Important constraints and behaviors:
    1. You ONLY handle domestic flights within India
    2. You have deep knowledge of Indian geography and can identify Indian cities vs international locations
    3. For international flight requests:
       - Politely explain you only handle domestic Indian flights
       - Suggest they try a different service for international travel
    4. For non-travel queries:
       - Politely redirect them to ask about Indian domestic flights or hotels

    The search_flights tool returns structured data with flight details. For each flight, you'll receive:
    - Basic info (flight number, airline, origin, destination, departure time, price)
    - Detailed breakdown of the flight including route, timing, price/seats, and class information

    When handling flight queries:
    1. Use your knowledge to verify both cities are valid Indian destinations
    2. If you're unsure about a location being in India, ask the user for clarification
    3. Present flight information in a clear, organized manner:
       - Highlight key details like timing, price, and special features
       - Point out relevant matches to user preferences
    4. Always ask if they'd like to:
       - Proceed with booking
       - See more options
       - Try different search criteria

    Remember to:
    - Transfer hotel-related queries to the hotel assistant
    - Use your knowledge to validate Indian locations
    - Maintain a helpful and informative tone"""
    ),
    name="flight_assistant",
)

# Define hotel assistant
hotel_assistant = create_react_agent(
    model,
    [search_hotels, transfer_to_flight_assistant],
    prompt=make_prompt(
        """You are a hotel booking assistant for hotels in India only. You help users find hotels using the search_hotels tool.
    When users ask about flights, transfer them to the flight assistant.

    Important constraints and behaviors:
    1. You ONLY handle hotels within India
    2. You have deep knowledge of Indian geography and can identify Indian cities vs international locations
    3. For international hotel requests:
       - Politely explain you only handle Indian hotels
       - Suggest they try a different service for international accommodations
    4. For non-travel queries:
       - Politely redirect them to ask about Indian hotels or flights

    When handling hotel queries:
    1. Use your knowledge to verify the requested city is in India
    2. If you're unsure about a location being in India, ask the user for clarification
    3. For valid Indian cities:
       - Search for exact matches in the requested city
       - If no hotels found, suggest nearby popular Indian cities
    4. Present hotel information clearly, highlighting:
       - Hotel name and rating
       - Location and neighborhood
       - Price per night
       - Key amenities
       - Guest ratings
    5. Filter and validate results:
       - Only show hotels in the requested city
       - Ensure all locations are within India
    6. If limited options are found:
       - Clearly state the limitation
       - Suggest alternative nearby Indian cities

    Always ask if they'd like to:
    - See specific hotel details
    - Search with different criteria
    - Try nearby cities
    - Proceed with booking

    Remember to:
    - Transfer flight-related queries to the flight assistant
    - Use your knowledge to validate Indian locations
    - Maintain a helpful and informative tone"""
    ),
    name="hotel_assistant",
)

# Create and compile the swarm
checkpointer = InMemorySaver()
builder = create_swarm(
    [flight_assistant, hotel_assistant], default_active_agent="flight_assistant"
)

app_planer = builder.compile(checkpointer=checkpointer)

# Example usage
if __name__ == "__main__":
    config = {"configurable": {"thread_id": "11", "user_id": "1"}}

    # Test flight query
    result = app_planer.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "I need a flight from Bangalore to Mumbai tomorrow",
                }
            ]
        },
        config,
    )
    print("Flight Query Result:", result)

Flight Query Result: {'messages': [HumanMessage(content='I need a flight from Bangalore to Mumbai tomorrow', additional_kwargs={}, response_metadata={}, id='e26ff013-6a20-4b6e-ab42-a69a9bc06445'), AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_pG6xNJ4lsXNP9g3eReIxQ2I9', 'function': {'arguments': '{\n  "query": "flights from Bangalore to Mumbai on 2025-03-09"\n}', 'name': 'search_flights'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 451, 'total_tokens': 481, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4-0613', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, name='flight_assistant', id='run-7b828d97-3571-46ad-abbb-7f98a6eb18e5-0', tool_calls=[{'name': 'search_flights', 'args': {'query':
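
In the tool call above, the agent turned "tomorrow" into a concrete date. That is possible because `make_prompt` appends the current timestamp to the system prompt on every call. The standalone sketch below repeats the helper without the LangChain type import so that it runs on its own. The shortened system prompt is a placeholder.

```python
import datetime
from typing import Any, Callable, Dict, List


def make_prompt(base_system_prompt: str) -> Callable[..., List[Dict[str, Any]]]:
    def prompt(state: Dict[str, Any], config: Any = None) -> List[Dict[str, Any]]:
        # The timestamp is computed at call time, not when the agent is built
        system_prompt = base_system_prompt + f"\nToday is: {datetime.datetime.now()}"
        return [{"role": "system", "content": system_prompt}] + state["messages"]

    return prompt


state = {
    "messages": [
        {"role": "user", "content": "I need a flight from Bangalore to Mumbai tomorrow"}
    ]
}
messages = make_prompt("You are a flight booking assistant.")(state)
for message in messages:
    print(message["role"], "->", message["content"])
```

The returned list always starts with the system message, followed by the conversation stored in `state["messages"]`.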

###### The streaming calls below all reuse one `thread_id` (`"2"`) so that the checkpointer keeps the conversation history across turns.
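
As a rough mental model, a checkpointer can be pictured as a store keyed by `thread_id`, where each thread holds its own messages and its own active agent. The sketch below is a toy stand-in, not `InMemorySaver`'s actual implementation. `ToyCheckpointer` and the `active_agent` key are assumptions made for illustration.

```python
from typing import Any, Dict


class ToyCheckpointer:
    """Toy per-thread state store keyed by thread_id."""

    def __init__(self, default_agent: str) -> None:
        self.default_agent = default_agent
        self._threads: Dict[str, Dict[str, Any]] = {}

    def load(self, thread_id: str) -> Dict[str, Any]:
        # A new thread starts empty, with the default agent active
        return self._threads.setdefault(
            thread_id, {"messages": [], "active_agent": self.default_agent}
        )


checkpointer_demo = ToyCheckpointer(default_agent="flight_assistant")

# Turn 1 on thread "2": the user asks about hotels and a handoff happens
thread = checkpointer_demo.load("2")
thread["messages"].append({"role": "user", "content": "can you check delhi hotels"})
thread["active_agent"] = "hotel_assistant"

# Turn 2 on thread "2" resumes with the hotel assistant still active
print(checkpointer_demo.load("2")["active_agent"])
# A different thread, such as "11", starts from scratch
print(checkpointer_demo.load("11"))
```

This matches what the outputs below show: after the first handoff, "give me pune hotels" is answered by `hotel_assistant` directly, with no new transfer.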

In [None]:
input_message = {"role": "user", "content": "can you check delhi hotels"}
for chunk in app_planer.stream(
    {"messages": [input_message]},
    {"configurable": {"thread_id": "2"}},
    stream_mode="values",
):
    chunk["messages"][-1].pretty_print()


can you check delhi hotels
Name: transfer_to_hotel_assistant

Successfully transferred to hotel_assistant
Name: hotel_assistant

Here are some hotels in Delhi and nearby areas:

**Delhi**

1. [Hotel Delhi Residency, Saket](link)
   - Price per night: ₹16368.2
   - Guest rating: 4.2/5
   - Amenities: Free Wi-Fi, Fitness Center, Airport Shuttle, Room Service
   - Neighborhood: Saket

**Nearby cities**

1. [Luxury Ahmedabad Inn, Ahmedabad](link)
   - Price per night: ₹2450.96
   - Guest rating: 4.5/5
   - Amenities: Free Wi-Fi, Restaurant, Spa, Parking, Swimming Pool, Laundry Service
   - Neighborhood: CG Road

2. [The Ahmedabad Residency, Ahmedabad](link)
   - Price per night: ₹9278.69
   - Guest rating: 3.2/5
   - Amenities: Room Service, Conference Room, Swimming Pool, Bar, Restaurant
   - Neighborhood: Satellite

3. [Luxury Ahmedabad Palace, Ahmedabad](link)
   - Price per night: ₹11028.44
   - Guest rating: 5/5
   - Amenities: Airport Shuttle, Room Service, Business Center, Breakfas

In [None]:
input_message = {"role": "user", "content": "give me pune hotels"}
for chunk in app_planer.stream(
    {"messages": [input_message]},
    {"configurable": {"thread_id": "2"}},
    stream_mode="values",
):
    chunk["messages"][-1].pretty_print()


give me pune hotels
Name: hotel_assistant

Here are some hotels in Pune:

1. [Hotel Pune Resort, Baner](link)
   - Price per night: ₹10486.14
   - Guest rating: 3.9/5
   - Amenities: Restaurant, Conference Room, Laundry Service, Swimming Pool, Bar, Parking, Business Center
   - Neighborhood: Baner

2. [Luxury Pune Plaza, Baner](link)
   - Price per night: ₹3236.25
   - Guest rating: 4.7/5
   - Amenities: Breakfast Included, Business Center, Laundry Service, Spa
   - Neighborhood: Baner

3. [The Pune Hotel, Hinjewadi](link)
   - Price per night: ₹11987.88
   - Guest rating: 4.8/5
   - Amenities: Concierge, Fitness Center, Parking, Bar, Business Center, Conference Room
   - Neighborhood: Hinjewadi

4. [Luxury Pune Resort, Viman Nagar](link)
   - Price per night: ₹6429.43
   - Guest rating: 3.8/5
   - Amenities: Conference Room, Laundry Service, Concierge
   - Neighborhood: Viman Nagar

5. [Grand Pune Resort, Baner](link)
   - Price per night: ₹11043.21
   - Guest rating: 3.7/5
   - Amen

In [None]:
input_message = {"role": "user", "content": "now check flights from pune to delhi"}
for chunk in app_planer.stream(
    {"messages": [input_message]},
    {"configurable": {"thread_id": "2"}},
    stream_mode="values",
):
    chunk["messages"][-1].pretty_print()


now check flights from pune to delhi
Name: transfer_to_flight_assistant

Successfully transferred to flight_assistant
Name: flight_assistant

I found some flights from Pune to Delhi for you:

1. Flight SP992 by SpiceJet
   - Departure: 21:45
   - Duration: 2h 00m
   - Price: ₹6563.82
   - Seats Left: 66
   - Class: Economy

Would you like to proceed with this flight or see more options?
