# Putting it all together - Preparing the Environment

Welcome back! In the previous sections, you've explored the foundational building blocks of developing with Llama Stack: understanding basic agents, delving into the power of ReAct agents, and seeing how the Model Context Protocol (MCP) allows seamless integration of external services as tools.

Now, it's time to bring these concepts together and prepare our environment for building a more sophisticated agent capable of interacting with multiple external services and leveraging internal knowledge. This section is dedicated to setting the stage, ensuring all the necessary components are up and running and configured correctly within Llama Stack.

By the end of this preparatory section, you will have:

* Understood the setup required for a multi-tool, RAG-enabled agent.
* Launched and verified the operation of multiple MCP servers exposing diverse functionalities.
* Configured your Llama Stack client to interact with these new tools.
* Registered the MCP servers as discoverable toolgroups within Llama Stack.
* Created and populated a Vector Database, essential for providing our agent with domain-specific knowledge via Retrieval Augmented Generation (RAG).

Think of this section as gathering and organizing all the ingredients and setting up your workspace before you start cooking a complex and exciting dish! Let's get everything ready so we can build something truly powerful in the next section.

***

### Setting up our Environment

Before we dive into the exciting part of building our multi-tool agent, let's make sure our environment is set up correctly. We'll start by importing the essential libraries and configuring our connection to the Llama Stack server, just like we did in previous labs.

In [10]:
!pip install -U llama-stack-client==0.2.5 python-dotenv > /dev/null 2>&1 && echo "Python prerequisites installed successfully"
import os
import uuid

# Load environment variables from the local .env file
from dotenv import load_dotenv
load_dotenv()

# Lab helper module
from src.utils import step_printer

# For communication with Llama Stack
from llama_stack_client import LlamaStackClient

# These libraries are only here to print the agent's results in a more human-readable way
from termcolor import cprint
from llama_stack_client.lib.agents.event_logger import EventLogger

# Defaults to False; set it to True anywhere in this section to see the output in another format (using EventLogger)
stream = False

# For this lab we define the server address manually. In a regular application it would be
# read directly from the local .env file and this line would be commented out.
os.environ["LLAMA_STACK_SERVER"] = "http://localhost:8321"
LLAMA_STACK_SERVER = os.getenv("LLAMA_STACK_SERVER")

# We will be using the Tavily web search service (docs.tavily.com/).
# The key is read from the environment rather than hardcoded; the variable name
# TAVILY_SEARCH_API_KEY is an assumption of this lab, so set it in your .env file.
tavily_search_api_key = os.getenv("TAVILY_SEARCH_API_KEY", "")
provider_data = {"tavily_search_api_key": tavily_search_api_key}

# Only models whose identifier contains one of these strings may be selected
allowed_models_list = ["meta-llama/Llama-3.2-3B-Instruct"]
# allowed_models_list = ["granite3.2:8b"]

client = LlamaStackClient(
    base_url=LLAMA_STACK_SERVER,
)

models = client.models.list()

selected_model = None

print("--- Available models: ---")
for m in models:
    print(f"{m.identifier} - {m.provider_id} - {m.provider_resource_id}")
    # Select the first model whose identifier contains an allowed substring.
    # There is no break, so every available model is still printed.
    if selected_model is None and any(substring in m.identifier for substring in allowed_models_list):
        selected_model = m.identifier

# Report the case where no allowed model was found
if selected_model is None:
    print("No allowed model found in the list.")

print(f"Selected model (from allowed list): {selected_model}")

SELECTED_MODEL = selected_model


Python prerequisites installed successfully
--- Available models: ---
all-MiniLM-L6-v2 - ollama - all-minilm:latest
granite3.2:8b - ollama - granite3.2:8b
meta-llama/Llama-3.2-3B-Instruct - ollama - llama3.2:3b-instruct-fp16
Selected model (from allowed list): meta-llama/Llama-3.2-3B-Instruct
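
The selection loop above leaves `selected_model` as `None` when nothing matches, which only surfaces as an error later on. As a minimal sketch, the same first-match logic can be wrapped in a small function that fails loudly instead. The function name `pick_first_allowed` is hypothetical and not part of `llama_stack_client`; the identifiers are the ones printed by the server above.

```python
def pick_first_allowed(available_ids, allowed_substrings):
    """Return the first available identifier that contains any allowed substring."""
    for identifier in available_ids:
        if any(substring in identifier for substring in allowed_substrings):
            return identifier
    raise LookupError(f"No model matching {allowed_substrings} in {available_ids}")


# Identifiers copied from the model listing above
available = [
    "all-MiniLM-L6-v2",
    "granite3.2:8b",
    "meta-llama/Llama-3.2-3B-Instruct",
]

print(pick_first_allowed(available, ["meta-llama/Llama-3.2-3B-Instruct"]))
```

With a connected client, you would pass `[m.identifier for m in client.models.list()]` as the first argument.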


### Launching our External Services (MCP Servers)

To create a truly capable agent, we need to connect it to the outside world! We'll be using a few specialized external services, exposed via the Model Context Protocol (MCP), to give our agent powers like checking the weather, finding locations, and accessing specific park information.

Run the following commands in your terminal to start these essential MCP servers in the background:

* `mcp-weather`: Adds real-time weather capabilities.
* `mcp-googlemaps`: Connects to Google Maps for location and direction data.
* `mcp-parks-info`: Offers information about our fictional parks, retrieved using RAG.

```bash
# Registry that hosts the MCP server images
REMOTE_REGISTRY="quay.dev.demo.redhat.com/rhdp/"

# MCP-weather configuration
podman run -d --name mcp-weather --network=host ${REMOTE_REGISTRY}mcp-weather:latest --port 8005

# MCP-google-maps configuration (replace the placeholder with your own API key)
YOUR_API_KEY="AIzaSyAYcxmnVi7ODNOT_A_REAL_KEY_REPLACE_ME"
podman run -d --name mcp-googlemaps-sse --network=host -e Maps_API_KEY="${YOUR_API_KEY}" -e MCP_PORT=8006 ${REMOTE_REGISTRY}mcp-googlemaps-sse:latest

# MCP-parks-info
podman run -d --name mcp-parks-info --network=host -e PORT=8007 ${REMOTE_REGISTRY}mcp-parks-info:latest
```

### Verify Servers are Running

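It's good practice to check that our external services started successfully. Each server exposes a Server-Sent Events (SSE) endpoint at `/sse`, and a healthy server answers with an `event: endpoint` line followed by a `data:` line containing a session URL. Run the code below to verify they are live and ready. The one-second `--max-time` limit makes `curl` return instead of holding the stream open.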

In [11]:
!echo Testing MCP-WEATHER ; curl --max-time 1 http://localhost:8005/sse 2>/dev/null
!echo Testing MCP-GOOGLEMAPS-sse ; curl --max-time 1 http://localhost:8006/sse 2>/dev/null 
!echo Testing mcp-parks-info ; curl --max-time 1 http://localhost:8007/sse 2>/dev/null 


Testing MCP-WEATHER
event: endpoint
data: /messages/?session_id=346da1b78ad24cd3a43ba9f329f7fa5a

Testing MCP-GOOGLEMAPS-sse
event: endpoint
data: /message?sessionId=97181020-f683-4cf5-83c1-cacb0d1eb53e

Testing mcp-parks-info
event: endpoint
data: /messages/?session_id=a36ce378f8cc409ba46bddb23458c1aa
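
Each healthy server replies with an `event: endpoint` line and a `data:` line holding the session URL. If you would rather check this in code than by eye, the sketch below parses such a response. `find_sse_endpoint` is a hypothetical helper written for this lab, and the sample text is copied from the weather server's reply above.

```python
def find_sse_endpoint(raw):
    """Return the data of the first `endpoint` event in SSE text, or None."""
    event_name = None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:") and event_name == "endpoint":
            return line[len("data:"):].strip()
        elif not line:
            event_name = None  # a blank line ends the current event
    return None


sample = (
    "event: endpoint\n"
    "data: /messages/?session_id=346da1b78ad24cd3a43ba9f329f7fa5a\n"
    "\n"
)
print(find_sse_endpoint(sample))
```

A return value of `None` means the server did not announce an endpoint, which usually indicates the container is not running or is listening on a different port.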



### Connect to Llama Stack and Select Our Model

Now that our external services are running, let's establish the connection from our notebook to the Llama Stack server and select the powerful language model our agent will use for reasoning and action.

In [12]:
# The prerequisites, imports, and environment variables were set up in the first cell of this section
LLAMA_STACK_SERVER = os.getenv("LLAMA_STACK_SERVER")

# Only models whose identifier contains one of these strings may be selected
allowed_models_list = ["meta-llama/Llama-3.2-3B-Instruct"]
# allowed_models_list = ["granite3.2:8b"]

client = LlamaStackClient(
    base_url=LLAMA_STACK_SERVER,
)

models = client.models.list()

selected_model = None

print("--- Available models: ---")
for m in models:
    print(f"{m.identifier} - {m.provider_id} - {m.provider_resource_id}")
    # Select the first model whose identifier contains an allowed substring.
    # There is no break, so every available model is still printed.
    if selected_model is None and any(substring in m.identifier for substring in allowed_models_list):
        selected_model = m.identifier

# Report the case where no allowed model was found
if selected_model is None:
    print("No allowed model found in the list.")

print(f"Selected model (from allowed list): {selected_model}")

SELECTED_MODEL = selected_model


--- Available models: ---
all-MiniLM-L6-v2 - ollama - all-minilm:latest
granite3.2:8b - ollama - granite3.2:8b
meta-llama/Llama-3.2-3B-Instruct - ollama - llama3.2:3b-instruct-fp16
Selected model (from allowed list): meta-llama/Llama-3.2-3B-Instruct


### Making External Services Available as Tools

This is where the magic happens! We need to tell Llama Stack about the external services we just started. By registering them as 'toolgroups', Llama Stack understands their capabilities and can make them available for our agent to discover and use in its problem-solving process.

In [19]:
# Register each MCP server as a toolgroup, pointing Llama Stack at its SSE endpoint
client.toolgroups.register(
    toolgroup_id="mcp::mcp-weather",
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://localhost:8005/sse"},
)

client.toolgroups.register(
    toolgroup_id="mcp::mcp-googlemaps",
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://localhost:8006/sse"},
)

client.toolgroups.register(
    toolgroup_id="mcp::mcp-parks-info",
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://localhost:8007/sse"},
)
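
The three registrations differ only in the service name and port, so they can also be generated from a small table. The sketch below builds the keyword arguments without contacting a server; `build_registrations` and `MCP_PORTS` are hypothetical names introduced for illustration, while the toolgroup IDs, provider ID, and ports are the ones used in this section.

```python
MCP_PROVIDER_ID = "model-context-protocol"

# Service name -> local port, matching the containers started earlier
MCP_PORTS = {
    "mcp-weather": 8005,
    "mcp-googlemaps": 8006,
    "mcp-parks-info": 8007,
}


def build_registrations(ports, host="localhost"):
    """Build one keyword-argument dict per MCP server."""
    return [
        {
            "toolgroup_id": f"mcp::{name}",
            "provider_id": MCP_PROVIDER_ID,
            "mcp_endpoint": {"uri": f"http://{host}:{port}/sse"},
        }
        for name, port in ports.items()
    ]


registrations = build_registrations(MCP_PORTS)
for kwargs in registrations:
    print(kwargs["toolgroup_id"], "->", kwargs["mcp_endpoint"]["uri"])
    # With a connected client: client.toolgroups.register(**kwargs)
```

Keeping the ports in one place makes it harder for a registration to drift out of sync with the port a container was actually started on.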

### Building Our Agent's Knowledge Base (RAG)

A truly smart agent can also access specific, domain-level knowledge beyond its initial training. We'll achieve this using Retrieval Augmented Generation (RAG). The first step is to create and populate a Vector Database with the information our agent might need – in this case, details about our parks!

In [20]:
# Unregister any existing vector databases, in case this cell is run over and over again as part of a learning experience
for vector_db in client.vector_dbs.list():
    print(f"Unregistering vector database: {vector_db.identifier}")
    client.vector_dbs.unregister(vector_db_id=vector_db.identifier)

# Select a VectorDB provider from the available providers
vector_providers = [
    provider for provider in client.providers.list() if provider.api == "vector_io"
]

# In this example we only have one provider, but another server might have many. Here, we simply select the first one.
selected_vector_provider = vector_providers[0]

# Register our DB
vector_db_id = "Our_Parks_DB"
client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id=selected_vector_provider.provider_id,
)

# Ingest documents from a local directory (this makes it easy to test inserting changes)
from pathlib import Path
from llama_stack_client.types import Document

files = [
    "Azure_Mongrove_Wilderness.md",
    "Crimson_Basin.md",
    "Granite_Spire.md",
    "Obsidian_Rainforest.md",
    "Prismatic_Painted_Prairie.md",
]

document_directory = Path("assets/Parks")

# Read each file into a Document; read_text opens and closes the file for us
documents = [
    Document(
        document_id=f"num-{i}",
        content=(document_directory / file).read_text(encoding="utf-8"),
        mime_type="text/plain",
        metadata={},
    )
    for i, file in enumerate(files)
]

# Insert the documents into the vector DB
client.tool_runtime.rag_tool.insert(
    documents=documents,
    vector_db_id=vector_db_id,
    chunk_size_in_tokens=300,
)

print(f"Created {len(documents)} documents from local files into {vector_db_id}")

Unregistering vector database: Our_Parks_DB
Created 5 documents from local files into Our_Parks_DB
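
The `chunk_size_in_tokens=300` argument asks Llama Stack to split each document into pieces of roughly 300 tokens before embedding them, so that retrieval can return only the relevant passages instead of whole files. The sketch below illustrates the idea only: it counts whitespace-separated words rather than model tokens, and Llama Stack's own chunker may tokenize and overlap chunks differently. `chunk_words` is a hypothetical helper, and the sample text is synthetic.

```python
def chunk_words(text, chunk_size=300):
    """Split text into consecutive chunks of at most `chunk_size` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


# Synthetic stand-in for a park document: 650 whitespace-separated words
sample_text = " ".join(f"word{i}" for i in range(650))
chunks = chunk_words(sample_text, chunk_size=300)
print([len(chunk.split()) for chunk in chunks])  # → [300, 300, 50]
```

Smaller chunks give more precise matches but less surrounding context per hit, so the chunk size is worth experimenting with once the agent is running.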


# Summary: Putting it all together - Preparing the Environment

In this preparatory section, you've successfully laid the groundwork for building a more advanced agent within the Llama Stack framework. You've taken the essential steps to integrate multiple external services and prepare a knowledge base for your agent to utilize.

You have accomplished the following key tasks:

* **Deployed Multiple MCP Servers:** You initiated and confirmed the operation of several MCP-enabled containers (`mcp-weather`, `mcp-googlemaps`, `mcp-parks-info`), demonstrating how Llama Stack can connect to a variety of external functionalities.
* **Configured the Llama Stack Client:** You set up your Python client, specifying the server address and selecting the appropriate language model that will power your agent.
* **Registered External Tools:** You registered the different MCP servers as toolgroups within Llama Stack, making the functions they expose available for discovery and use by agents.
* **Established a Knowledge Base:** You created and populated a Vector Database with relevant documents, setting up the critical component for Retrieval Augmented Generation (RAG) to give your agent access to specific information.

By completing these steps, your environment is now ready for you to build and experiment with an agent that can combine reasoning, acting with diverse tools (both external services via MCP and internal RAG), and leveraging a custom knowledge base. This is a significant step towards creating more intelligent and capable AI applications with Llama Stack!

### Cleaning Up (Optional)

If you need to reset your environment later, the cell below unregisters the three MCP toolgroups from Llama Stack. To stop and remove the MCP server containers themselves, run `podman rm -f mcp-weather mcp-googlemaps-sse mcp-parks-info` in your terminal.

In [21]:
# WARNING: don't run this cell unless you mean to. It unregisters the MCP toolgroups,
# and agents cannot use them again until they are re-registered.

toolgroups_to_unregister = [
    "mcp::mcp-weather",
    "mcp::mcp-googlemaps",
    "mcp::mcp-parks-info",
]

for toolgroup_id in toolgroups_to_unregister:
    try:
        print(f"Attempting to unregister toolgroup: {toolgroup_id}")
        client.toolgroups.unregister(toolgroup_id=toolgroup_id)
        print(f"Successfully unregistered toolgroup: {toolgroup_id}")
    except Exception as e:
        print(f"Failed to unregister toolgroup {toolgroup_id}. Error: {e}")

print("\nFinished attempting to unregister toolgroups.")