### Using MCP to get accessible stay options based on multiple data sources

In [0]:
%pip install -U -qqq llama-index llama-index-llms-databricks mlflow python-dotenv llama-index-tools-mcp
%restart_python

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


After setting the Nimble API key and configuring the LLM, we use the LlamaIndex `BasicMCPClient` to connect to the MCP server and wrap that client in a `McpToolSpec` object. Calling its `to_tool_list_async` method then returns the server's tools as a list of LlamaIndex tools that an agent can call.

In [0]:
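import os

# Placeholder only: never hardcode or commit a real API key in a notebook.
# Load the real value from a secret store or a local .env file instead.
os.environ["NIMBLE_API_KEY"] = "<your-nimble-api-key>"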

In [0]:
from llama_index.llms.databricks import Databricks
from databricks.sdk import WorkspaceClient
import mlflow

w = WorkspaceClient()

tmp_token = w.tokens.create(comment="for model serving", lifetime_seconds=1200).token_value

llm = Databricks(
    model="databricks-llama-4-maverick",
    api_key=tmp_token,
    api_base=f"{w.config.host}/serving-endpoints/"
)


In [0]:
from dotenv import load_dotenv
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
import os

load_dotenv()

async def setup_nimble_tools():
    mcp_client = BasicMCPClient(
        "https://mcp.nimbleway.com/sse",
        headers={"Authorization": f"Bearer {os.environ['NIMBLE_API_KEY']}"}
    )
    
    mcp_tool_spec = McpToolSpec(
        client=mcp_client,
        # Optional: filter to specific tools
        # allowed_tools=["nimble_web_search", "nimble_google_maps_search"]
    )
    
    tools = await mcp_tool_spec.to_tool_list_async()
    
    print(f"Loaded {len(tools)} Nimble tools:")
    for tool in tools:
        print(f"- {tool.metadata.name}: {tool.metadata.description}")
    
    return tools

# Get the tools
nimble_tools = await setup_nimble_tools()
nimble_tools

Loaded 9 Nimble tools:
- nimble_targeted_engines: 
    Fetch lists of available Nimble Web Agent templates for targeted data extraction.

    Returns templates with: id, name, display_name, description, domain, entity_type (PDP/SERP).
    Use the 'id' field as template_number in targeted_retrieval.
    
- nimble_targeted_retrieval: 
    Executes data extraction using Nimble Web Agent's pre-trained templates.

    Leverages Nimble's intelligent extraction technology for accurate, structured data
    from e-commerce sites, job boards, and other supported domains.

    Args:
        input: Format depends on template entity_type:
            • SERP: Search query (e.g., "wireless headphones")  
            • PDP: Product identifier - Amazon: ASIN only ("B08N5WRWNW"), 
            Best Buy: SKU, Target: TCIN, Walmart: Item ID, Others: product ID or path
            • CLP: Category path or identifier
            
        template_number: Template ID from nimble_targeted_engines

    Check tem

[<llama_index.core.tools.function_tool.FunctionTool at 0x7efd7efe7090>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7e61ff50>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7efe6e50>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7e637090>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7f00dc90>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7f00fa10>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7e640710>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7e641dd0>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x7efd7e641e90>]
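
The tool descriptions printed above are long and multi-line, which makes the listing hard to scan. A minimal sketch of a more compact listing, assuming each tool exposes `metadata.name` and `metadata.description` as the loop above does. `summarize_tools` is a hypothetical helper, not part of LlamaIndex, and the stand-in object below exists only so the snippet runs without the MCP server.

```python
from types import SimpleNamespace


def summarize_tools(tools, width=100):
    """Return one 'name: first description line' string per tool."""
    lines = []
    for tool in tools:
        description = (tool.metadata.description or "").strip()
        # Keep only the first non-empty line of the description.
        first_line = next(
            (ln.strip() for ln in description.splitlines() if ln.strip()), ""
        )
        # Truncate overly long lines so each tool fits on one row.
        if len(first_line) > width:
            first_line = first_line[: width - 3] + "..."
        lines.append(f"{tool.metadata.name}: {first_line}")
    return lines


# Stand-in object for illustration only; in the notebook, pass `nimble_tools`.
demo_tools = [
    SimpleNamespace(
        metadata=SimpleNamespace(
            name="nimble_targeted_engines",
            description=(
                "\n    Fetch lists of available Nimble Web Agent templates "
                "for targeted data extraction.\n\n"
                "    Returns templates with: id, name, display_name, description, "
                "domain, entity_type (PDP/SERP).\n"
            ),
        )
    )
]

for line in summarize_tools(demo_tools):
    print(line)
```

In the notebook itself, `summarize_tools(nimble_tools)` would give one line per loaded tool, which may be easier to review before deciding on an `allowed_tools` filter.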

We will use the LlamaIndex `ReActAgent` to build our agent. We pass in the list of tools, the LLM, and a system prompt.

In [0]:
from llama_index.core.agent.workflow import ReActAgent

agent = ReActAgent(
    tools=nimble_tools,
    llm=llm,
    system_prompt="""You are a helpful research assistant with access to web scraping and data collection tools. 
    Always explain your answer in the final output and structure the output as a valid json array.""",
)

### Running the agent

We can run the agent with `await agent.run()`, passing in a question.


In [0]:
mlflow.llama_index.autolog()
# await agent.run("What are the most accessible stay options for a vacation in Chicago?")
response = await agent.run(
    "What are the most accessible stay options for a vacation in Chicago that allow service dogs? "
    "Also list the Google review rating of each recommended hotel. "
    "Structure the response as JSON that can be converted to a pandas DataFrame. "
    "Be concise and respond with valid JSON only."
)

Trace(request_id=tr-51170c945061461bb38e41fa6576eb55)

In [0]:
blocks = response.response.blocks
response.final_response_text = "\n".join(block.text for block in blocks if hasattr(block, "text"))


In [0]:

# Parse the agent's JSON reply into a pandas DataFrame

import json
import pandas as pd

# 1. Get the raw text from the block
text_block = response.response.blocks[0].text

# 2. Strip the Markdown code-fence wrapper (a fence tagged "json") if present
clean_json_str = text_block.strip().strip("`").strip()
if clean_json_str.startswith("json"):
    clean_json_str = clean_json_str[len("json"):].lstrip()  # drop the language tag

# 3. Parse JSON
try:
    data = json.loads(clean_json_str)
except json.JSONDecodeError as e:
    print("JSON decode error:", e)
    data = []

# 4. Convert to DataFrame
df = pd.DataFrame(data)
print(df)


                             hotel_name  ... num_reviews
0          Hotel Lincoln - JDV by Hyatt  ...        2382
1                  The Guesthouse Hotel  ...         370
2  Hotel Saint Clair - Magnificent Mile  ...        1430
3                        Claridge House  ...         992
4         The Villa Toscana Guest House  ...         142
5     Hyatt Place Chicago / Wicker Park  ...         399

[6 rows x 4 columns]
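
Stripping backticks works when the reply is exactly one fenced block, but it can fail if the model adds prose before or after the fence. A hedged sketch of a more tolerant parser follows; `extract_json_payload` is a hypothetical helper, and the sample reply is built from one row of the output above rather than taken from a real agent run.

```python
import json
import re

# A Markdown code fence, built from single characters to avoid nesting fences here.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def extract_json_payload(text):
    """Parse JSON from an LLM reply, with or without a Markdown code fence."""
    match = FENCED_BLOCK.search(text)
    candidate = match.group(1) if match else text
    try:
        return json.loads(candidate.strip())
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost [...] span when prose surrounds a bare array.
    start, end = text.find("["), text.rfind("]")
    if start != -1 and end > start:
        return json.loads(text[start : end + 1])
    raise ValueError("no JSON array found in response text")


# Illustrative reply: leading prose followed by a fenced JSON array.
reply = (
    "Here are the results:\n"
    + FENCE
    + "json\n"
    + '[{"hotel_name": "Claridge House", "rating": 4.0, "num_reviews": 992}]\n'
    + FENCE
)
records = extract_json_payload(reply)
print(records[0]["hotel_name"])
```

With this helper, step 2 and step 3 of the cell above could collapse into `data = extract_json_payload(text_block)`. The fallback only looks for a top-level array, which matches the system prompt's request for a JSON array.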


In [0]:
df

|   | hotel_name | address | rating | num_reviews |
|---|---|---|---|---|
| 0 | Hotel Lincoln - JDV by Hyatt | 1816 N Clark St, Chicago, IL 60614 | 4.3 | 2382 |
| 1 | The Guesthouse Hotel | 4872 N Clark St, Chicago, IL 60640 | 4.8 | 370 |
| 2 | Hotel Saint Clair - Magnificent Mile | 162 E Ontario St, Chicago, IL 60611 | 3.5 | 1430 |
| 3 | Claridge House | 1244 N Dearborn Pkwy, Chicago, IL 60610 | 4.0 | 992 |
| 4 | The Villa Toscana Guest House | 3447 N Halsted St, Chicago, IL 60657 | 4.5 | 142 |
| 5 | Hyatt Place Chicago / Wicker Park | 1551 W North Ave Ashland, Chicago, IL 60622 | 4.5 | 399 |
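
Because the columns come from free-form LLM output, a later run may return missing keys or non-numeric ratings. The sketch below shows one way to check records before building the DataFrame. It assumes the four fields seen in the output above; `validate_records` and the malformed sample row are hypothetical.

```python
# Expected fields and the type each value is coerced to.
REQUIRED_FIELDS = {
    "hotel_name": str,
    "address": str,
    "rating": float,
    "num_reviews": int,
}


def validate_records(records):
    """Split records into (valid, rejected); valid rows have every field coerced."""
    valid, rejected = [], []
    for record in records:
        try:
            valid.append(
                {name: cast(record[name]) for name, cast in REQUIRED_FIELDS.items()}
            )
        except (KeyError, TypeError, ValueError):
            # Missing key, or a value that cannot be coerced (for example rating="N/A").
            rejected.append(record)
    return valid, rejected


sample = [
    {
        "hotel_name": "The Guesthouse Hotel",
        "address": "4872 N Clark St, Chicago, IL 60640",
        "rating": "4.8",  # string on purpose, to show coercion
        "num_reviews": 370,
    },
    {"hotel_name": "Incomplete entry", "rating": "N/A"},  # hypothetical malformed row
]

valid, rejected = validate_records(sample)
print(len(valid), len(rejected))  # 1 1
```

In the notebook, `valid` could be passed to `pd.DataFrame(valid)` in place of the raw `data`, while `rejected` is worth logging so that silent drops do not go unnoticed.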
