# Level 6: Agents, MCP and RAG 

This notebook extends the [Level 5 Agentic & MCP notebook](./Level5_agents_and_mcp.ipynb) by adding RAG.
It is intended for developers who are already familiar with [agentic RAG workflows](./Level4_RAG_agent.ipynb). It highlights a couple of slightly more advanced agent use cases in which a single tool call is not enough to complete the task. Here we rely on both agentic RAG and MCP servers to expand our agent's capabilities.

## Overview

This tutorial covers the following steps:
1. Review OpenShift logs for a failing pod.
2. Categorize the pod and summarize its error.
3. Search the available troubleshooting documentation for ideas on how to resolve the error.
4. Send a Slack message to the ops team with a brief summary of the error and next steps to take.

### MCP Tools:

#### OpenShift MCP Server
Throughout this notebook we will be relying on the [kubernetes-mcp-server](https://github.com/manusa/kubernetes-mcp-server) by [manusa](https://github.com/manusa) to interact with our OpenShift cluster. Please see installation instructions below if you do not already have this deployed in your environment. 

* [OpenShift MCP installation instructions](../../../kubernetes/mcp-servers/openshift-mcp/README.md)

#### Slack MCP Server
We will also be using the [Slack MCP Server](https://github.com/modelcontextprotocol/servers-archived/tree/main/src/slack) in this notebook. Please see installation instructions below if you do not already have this deployed in your environment. 

* [Slack MCP installation instructions](../../../kubernetes/mcp-servers/slack-mcp/README.md)

### Pre-Requisites

Before starting, ensure you have the following:
- A running Llama Stack server
- A running Slack MCP server. Refer to our [documentation](https://github.com/opendatahub-io/llama-stack-demos/tree/main/kubernetes/mcp-servers/slack-mcp) on how to set this up on your OpenShift cluster.
- Access to an OpenShift cluster with a deployment of the [OpenShift MCP server](../../../kubernetes/mcp-servers/openshift-mcp) (see the [deployment manifests](https://github.com/opendatahub-io/llama-stack-on-ocp/tree/main/kubernetes/mcp-servers/openshift-mcp) for assistance with this).

## Setting Up this Notebook
We will initialize our environment as described in detail in our ["Getting Started" notebook](./Level0_getting_started_with_Llama_Stack.ipynb). Please refer to it for additional explanations.

In [3]:
# for communication with Llama Stack
from llama_stack_client import LlamaStackClient
from llama_stack_client import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client import RAGDocument
from llama_stack_client.lib.agents.react.agent import ReActAgent
from llama_stack_client.lib.agents.react.tool_parser import ReActOutput

# pretty print of the results returned from the model/agent
from termcolor import cprint
import sys
sys.path.append('..')  
from src.utils import step_printer
import uuid
import os

from dotenv import load_dotenv
load_dotenv()

True

In [4]:
base_url = os.getenv("REMOTE_BASE_URL")


# Tavily search API key is required for some of our demos and must be provided to the client upon initialization.
# We will cover it in the agentic demos that use the respective tool. Please ignore this parameter for all other demos.
tavily_search_api_key = os.getenv("TAVILY_SEARCH_API_KEY")
if tavily_search_api_key is None:
    provider_data = None
else:
    provider_data = {"tavily_search_api_key": tavily_search_api_key}


client = LlamaStackClient(
    base_url=base_url,
    provider_data=provider_data
)

print("Connected to Llama Stack server")

# model_id will later be used to pass the name of the desired inference model to Llama Stack Agents/Inference APIs
# model_id = "granite"
model_id = "qwen"

temperature = float(os.getenv("TEMPERATURE", 0.0))
if temperature > 0.0:
    top_p = float(os.getenv("TOP_P", 0.95))
    strategy = {"type": "top_p", "temperature": temperature, "top_p": top_p}
else:
    strategy = {"type": "greedy"}

max_tokens = int(os.getenv("MAX_TOKENS", 512))

# sampling_params will later be used to pass the parameters to Llama Stack Agents/Inference APIs
sampling_params = {
    "strategy": strategy,
    "max_tokens": max_tokens,
}

# For this demo, we are using Milvus Lite, which is our preferred solution. Any other Vector DB supported by Llama Stack can be used.

# RAG vector DB settings
VECTOR_DB_EMBEDDING_MODEL = os.getenv("VDB_EMBEDDING")
VECTOR_DB_EMBEDDING_DIMENSION = int(os.getenv("VDB_EMBEDDING_DIMENSION", 384))
VECTOR_DB_CHUNK_SIZE = int(os.getenv("VECTOR_DB_CHUNK_SIZE", 512))
VECTOR_DB_PROVIDER_ID = os.getenv("VDB_PROVIDER")

# Unique DB ID for session
vector_db_id = f"test_vector_db_{uuid.uuid4()}"

stream_env = os.getenv("STREAM", "False")
# the Boolean 'stream' parameter will later be passed to Llama Stack Agents/Inference APIs
# any value other than the exact string 'False' is treated as True
stream = (stream_env != "False")

print(f"Inference Parameters:\n\tModel: {model_id}\n\tSampling Parameters: {sampling_params}\n\tstream: {stream}")

Connected to Llama Stack server
Inference Parameters:
	Model: qwen
	Sampling Parameters: {'strategy': {'type': 'greedy'}, 'max_tokens': 2048}
	stream: False
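
Note that the `stream` flag above treats any value other than the exact string `False` as true, so `STREAM=false` or `STREAM=0` would enable streaming. One way to avoid that surprise is a stricter parser. The sketch below is an assumption, not part of this notebook or of Llama Stack: `parse_bool_env` and the accepted spellings are hypothetical choices.

```python
import os

# Hypothetical sets of accepted spellings; adjust to your own conventions.
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off", ""}


def parse_bool_env(name: str, default: bool = False) -> bool:
    """Read a boolean environment variable, accepting common spellings."""
    raw = os.getenv(name)
    if raw is None:
        return default  # variable not set at all
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    # Fail loudly instead of silently guessing.
    raise ValueError(f"{name}={raw!r} is not a recognised boolean value")
```

With this helper, `stream = parse_bool_env("STREAM")` would treat `false`, `False` and `0` the same way.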


### Validate tools are available in our Llama Stack instance

We will be using the [OpenShift MCP Server](https://github.com/manusa/kubernetes-mcp-server) and the [Slack MCP Server](https://github.com/modelcontextprotocol/servers-archived/tree/main/src/slack) for this task. If you haven't already, follow [these instructions](https://github.com/opendatahub-io/llama-stack-demos/tree/main/kubernetes/mcp-servers/slack-mcp#setting-up-on-ocp) to install the Slack MCP server and [these instructions](https://github.com/opendatahub-io/llama-stack-demos/tree/main/kubernetes/mcp-servers/openshift-mcp#steps-for-deploying-the-openshift-mcp-server-on-openshift) to install the OpenShift MCP server. Once we confirm that both MCP servers are running and registered with your Llama Stack server, we can move on to defining an agent to help with our task.

#### Registering tools on Llama Stack
When a Llama Stack instance is redeployed, your tools need to be re-registered. Conversely, if a toolgroup is already registered with a Llama Stack instance, trying to register another one with the same `toolgroup_id` raises an error.

For this reason, it is recommended to include some code that validates your tools and toolgroups. This is where the `mcp_url` comes into play. The following code checks that the `builtin::rag`, `mcp::openshift` and `mcp::slack` toolgroups are registered. If an MCP toolgroup is not listed, the code attempts to register it using the corresponding MCP URL.

If you are running the MCP server from source, the default value for this is: `http://localhost:8000/sse`.

If you are running the MCP server from a container, the default value for this is: `http://host.containers.internal:8000/sse`.

Make sure to pass the corresponding MCP URL for the server you are trying to register/validate tools for.

In [5]:
# Optional: provide your MCP server URLs through these environment variables
ocp_mcp_url = os.getenv("REMOTE_OCP_MCP_URL")
slack_mcp_url = os.getenv("REMOTE_SLACK_MCP_URL")

# Get list of registered tools and extract their toolgroup IDs
registered_tools = client.tools.list()
registered_toolgroups = [tool.toolgroup_id for tool in registered_tools]

if "builtin::rag" not in registered_toolgroups: # required
    client.toolgroups.register(
        toolgroup_id="builtin::rag",
        provider_id="milvus"
    )

if "mcp::openshift" not in registered_toolgroups: # required
    client.toolgroups.register(
        toolgroup_id="mcp::openshift",
        provider_id="model-context-protocol",
        mcp_endpoint={"uri":ocp_mcp_url},
    )

if "mcp::slack" not in registered_toolgroups: # required
    client.toolgroups.register(
        toolgroup_id="mcp::slack",
        provider_id="model-context-protocol",
        mcp_endpoint={"uri":slack_mcp_url},
    )

# Log the current toolgroups registered
print(f"Your Llama Stack server is already registered with the following tool groups: {set(registered_toolgroups)}\n")

INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools "HTTP/1.1 200 OK"


Your Llama Stack server is already registered with the following tool groups: {'builtin::websearch', 'builtin::rag', 'mcp::slack', 'mcp::openshift'}
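
One gap worth noting in the cell above: if `REMOTE_OCP_MCP_URL` or `REMOTE_SLACK_MCP_URL` is unset, `os.getenv` returns `None`, and a registration attempt would be made with `{"uri": None}`. The following is a minimal sketch of a pre-check. The helper `plan_registrations` is hypothetical; it only decides what needs registering and makes no Llama Stack calls.

```python
from typing import Dict, Iterable, Mapping, Optional


def plan_registrations(
    registered: Iterable[str],
    required: Mapping[str, Optional[str]],
) -> Dict[str, str]:
    """Return {toolgroup_id: mcp_url} for required toolgroups not yet registered.

    Raises ValueError when a missing toolgroup has no URL to register it with.
    """
    already = set(registered)
    plan: Dict[str, str] = {}
    for toolgroup_id, url in required.items():
        if toolgroup_id in already:
            continue  # already registered, so nothing to do
        if not url:
            raise ValueError(
                f"{toolgroup_id} is not registered and no MCP URL was provided"
            )
        plan[toolgroup_id] = url
    return plan


# Example with made-up inputs: Slack is registered, OpenShift is not.
plan = plan_registrations(
    registered=["builtin::rag", "mcp::slack"],
    required={"mcp::openshift": "http://localhost:8000/sse", "mcp::slack": None},
)
print(plan)  # prints {'mcp::openshift': 'http://localhost:8000/sse'}
```

Each entry in the returned plan could then be passed to `client.toolgroups.register(...)` as in the cell above.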



## Discover and report solutions to errors that might be happening on my OpenShift Cluster

We will now try to use an agent to:
- **Retrieve the logs of a failing pod running on OpenShift**
- **Use RAG to find solutions to the errors in the pod logs based on OpenShift troubleshooting documents**
- **Send a summary of the solution as a message on Slack.**

### Pod with simulated error logs

**Important** - In the [Level5 Agents and MCP notebook](./Level5_agents_and_mcp.ipynb) we deployed a pod called `slack-test` on an OpenShift cluster that generates simulated error logs. For the purpose of testing the pod log retrieval, we will be using that pod for this notebook and assume that it is already running on your cluster. **If you haven't already, please make sure to go through the Level5 notebook before continuing this notebook.**

### RAG: Indexing the Documents

We will be leveraging RAG to help us find solutions from OpenShift troubleshooting documents for debugging the logs of erroneous pods running on OpenShift. We will need to:
- Initialize a new document collection in the target vector DB. All parameters related to the vector DB, such as the embedding model and dimension, must be specified here.
- Provide a list of document URLs to the RAG tool. Llama Stack will handle fetching, conversion and chunking of the documents' content.
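
To build intuition for what a chunk size controls (the next cell passes one as `chunk_size_in_tokens`), here is a simplified illustration of fixed-size chunking. It splits on whitespace instead of using a real tokenizer and is not Llama Stack's actual implementation; `chunk_words` and its `overlap` parameter are hypothetical.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace-separated words.

    Consecutive chunks share `overlap` words, which helps keep context that
    would otherwise be cut at a chunk boundary.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Smaller chunks tend to give more precise retrieval hits but carry less surrounding context, so the chunk size is usually tuned per document collection.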

In [6]:
# define and register the document collection to be used
client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model=VECTOR_DB_EMBEDDING_MODEL,
    embedding_dimension=VECTOR_DB_EMBEDDING_DIMENSION,
    provider_id=VECTOR_DB_PROVIDER_ID,
)

# ingest the documents into the newly created document collection
urls = [
    ("https://docs.redhat.com/en/documentation/openshift_container_platform/4.11/pdf/support/OpenShift_Container_Platform-4.11-Support-en-US.pdf", "application/pdf"),
]
documents = [
    RAGDocument(
        document_id=f"num-{i}",
        content=url,
        mime_type=url_type,
        metadata={},
    )
    for i, (url, url_type) in enumerate(urls)
]
client.tool_runtime.rag_tool.insert(
    documents=documents,
    vector_db_id=vector_db_id,
    chunk_size_in_tokens=VECTOR_DB_CHUNK_SIZE,
)

INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/vector-dbs "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/tool-runtime/rag-tool/insert "HTTP/1.1 200 OK"


### System Prompt for LLM model

**Note:** If you have multiple models configured with your Llama Stack server, you can choose which one to run your queries against. When switching to a different model, you may need to adjust the system prompt to align with that model's expected behavior. Many models provide recommended system prompts for optimal and reliable outputs; these are typically documented on their respective websites.

In [7]:
model_prompt = """You are a helpful assistant. You have access to a number of tools.
Whenever a tool is called, be sure to return the response in a friendly and helpful tone."""

## Defining our Agent - Prompt Chaining

In [8]:
# Create simple agent with tools
agent = Agent(
    client,
    model=model_id, # replace this with your choice of model
    instructions=model_prompt, # update system prompt based on the model you are using
    tools=[dict(
            name="builtin::rag",
            args={
                "vector_db_ids": [vector_db_id],  # list of IDs of document collections to consider during retrieval
            },
        ),"mcp::openshift", "mcp::slack"],
    tool_config={"tool_choice":"auto"},
    sampling_params=sampling_params
)

user_prompts = ["View the logs for pod slack-test in the llama-serve OpenShift namespace. Categorize it as normal or error.",
                "Search for solutions on this error and provide a summary of the steps to take in just 1-2 sentences.",
                "Send a message with the summarization to the demos channel on Slack with a tool call"]
session_id = agent.create_session(session_name="OCP_Slack_demo")
for i, prompt in enumerate(user_prompts):
    response = agent.create_turn(
        messages=[
            {
                "role":"user",
                "content": prompt
            }
        ],
        session_id=session_id,
        stream=stream,
    )
    if stream:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 

INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=builtin%3A%3Arag "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=mcp%3A%3Aopenshift "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=mcp%3A%3Aslack "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/b9f4bcbd-1d38-4aee-99ad-c2ea23ad1126/session "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/b9f4bcbd-1d38-4aee-99ad-c2ea23ad1126/session/04a77331-70a3-4013-bbee-3a7bc7044962/turn "HTTP/1.1 200 OK"



---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: pods_log, Arguments: {'name': 'slack-test', 'namespace': 'llama-serve'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...


INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/b9f4bcbd-1d38-4aee-99ad-c2ea23ad1126/session/04a77331-70a3-4013-bbee-3a7bc7044962/turn "HTTP/1.1 200 OK"



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
<think>
Okay, let's see. The user asked to view the logs for the pod named slack-test in the llama-serve namespace and categorize them as normal or error. I called the pods_log function with the correct name and namespace. The response from the tool shows some log entries. 

Looking at the logs: "Starting container...\nFailure: Unknown Error\nError details: Container failed due to an unexpected issue during startup.\nPotential cause: Missing dependencies, configuration errors, or permission issues.\n" 

These entries indicate that the container started but then failed with an unknown error. The error details mention it's an unexpected issue during startup, and possible causes are listed. Since there's a "Failure" and "Error" in the logs, this clearly falls under the error category.

I need to present this to the user by first showing the logs and then categorizing them. I should explain the categorization based on th

INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/b9f4bcbd-1d38-4aee-99ad-c2ea23ad1126/session/04a77331-70a3-4013-bbee-3a7bc7044962/turn "HTTP/1.1 200 OK"



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
<think>
Okay, let's see. The user is asking for a summary of steps to resolve the Kubernetes pod failure with an unknown error during startup, mentioning missing dependencies, configuration errors, or permission issues.

First, I need to go through the knowledge_search results provided. The results include various OpenShift and Kubernetes troubleshooting steps. Let me parse each result to find relevant information.

Result 1 talks about draining nodes, stopping kubelet and containers, and cleaning up storage. That seems related to node issues but maybe not directly the pod's configuration.

Result 2 mentions checking pod logs, inspecting base images, and reviewing events. This is directly relevant because the user's issue could be due to image or configuration problems. The steps here include using 'oc logs' and 'oc get events' to diagnose.

Result 3 discusses checking SDN pods and network configurations. If the pod 


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
<think>
Okay, the user tried to send a message to the Slack #demos channel using the slack_post_message tool, but it failed with an "invalid_auth" error. Hmm, that usually means the authentication token is missing or incorrect. Let me check the tool's parameters. The function requires channel_id and text. The user provided the channel_id as "#demos" and the text. But maybe the channel_id should be just "demos" without the # symbol? Wait, the Slack API typically uses channel IDs like "C1234567890" or names without the #. Maybe the tool expects the channel name without the #. Let me see the tool's description again. The slack_post_message function's parameters for channel_id say "The ID of the channel to post to." Oh, right, the ID is usually the numeric ID, not the name with a #. So using "#demos" is incorrect. The correct channel ID would be something like "C0123456789". The user probably used the channel name with t

### Output Analysis

Let's step through the output to better understand what's happening in this agentic demo.

1. First, the LLM issues a call to the `pods_log` tool provided by the OpenShift MCP server to fetch the logs for the specified pod from the OpenShift cluster.
2. The tool successfully retrieves the logs for the pod.
3. The LLM receives the tool response (the pod logs) along with the original query and, using its own knowledge, categorizes the pod as 'Normal' or 'Error'.
4. Next, the LLM calls the RAG tool to query the vector DB for the error found in the pod logs and to look for a way to resolve it in the indexed document.
5. The LLM receives the response from the tool call and summarizes the information it retrieved from the document.
6. Finally, the LLM sends the summary as a Slack message using the `slack_post_message` tool provided by the Slack MCP server. In the run captured above, this call returned an `invalid_auth` error, which suggests that the Slack credentials need to be checked before the message can be delivered.
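
The chain works because all three prompts share one session: the second prompt's "this error" only makes sense with the first turn in the history. The toy sketch below illustrates that pattern without any Llama Stack calls. `FakeAgent` and `run_chain` are hypothetical stand-ins, and the fake turn only reports how much history it can see.

```python
from typing import Dict, List


class FakeAgent:
    """Hypothetical stand-in for an agent; it records prompts per session."""

    def __init__(self) -> None:
        self.sessions: Dict[str, List[str]] = {}

    def create_session(self, name: str) -> str:
        session_id = f"{name}-{len(self.sessions)}"
        self.sessions[session_id] = []
        return session_id

    def create_turn(self, prompt: str, session_id: str) -> str:
        history = self.sessions[session_id]
        history.append(prompt)
        # A real agent would call the model here with the full history.
        return f"turn {len(history)} sees {len(history) - 1} earlier prompt(s)"


def run_chain(agent: FakeAgent, prompts: List[str]) -> List[str]:
    """Send each prompt as a new turn in the same session, in order."""
    session_id = agent.create_session("chain")
    return [agent.create_turn(prompt, session_id) for prompt in prompts]


results = run_chain(FakeAgent(), ["view logs", "find a fix", "post to Slack"])
```

Creating a new session per prompt would break the chain, because each turn would start with an empty history.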

## Defining our Agent - ReAct

Now that we've walked through this multi-step, multi-tool task using prompt chaining, let's see if we can give our agent a bit more autonomy to perform the same task with a single prompt instead of a chain. To do this, we will instantiate a ReAct agent (which is included in the Llama Stack Python client by default). The ReAct agent is a variant of the simple agent with the ability to loop through "Reason then Act" iterations, thinking through the problem and then using tools until it determines that its task has been completed successfully.

Unlike prompt chaining which follows fixed steps, ReAct dynamically breaks down tasks and adapts its approach based on the results of each step. This makes it more flexible and capable of handling complex, real-world queries effectively.
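
As a rough mental model, the "Reason then Act" loop can be sketched in plain Python. This is a toy illustration under stated assumptions, not the `ReActAgent` implementation: `react_loop` and `scripted_model` are hypothetical, the model is a scripted function, and the JSON shape (`thought`, `action`, `answer`) mirrors the step output printed later in this notebook.

```python
import json
from typing import Callable, Dict, List


def react_loop(
    model: Callable[[List[str]], str],
    tools: Dict[str, Callable[..., str]],
    prompt: str,
    max_steps: int = 5,
) -> str:
    """Alternate between model reasoning and tool calls until an answer appears."""
    transcript = [prompt]
    for _ in range(max_steps):
        step = json.loads(model(transcript))
        if step.get("answer") is not None:
            return step["answer"]  # the model decided the task is finished
        action = step["action"]
        params = {p["name"]: p["value"] for p in action["tool_params"]}
        observation = tools[action["tool_name"]](**params)
        transcript.append(observation)  # feed the tool result back to the model
    raise RuntimeError("no answer within max_steps")


def scripted_model(transcript: List[str]) -> str:
    """Hypothetical model: call one tool, then answer from its result."""
    if len(transcript) == 1:
        return json.dumps({
            "thought": "I need the pod logs first.",
            "action": {
                "tool_name": "pods_log",
                "tool_params": [{"name": "name", "value": "slack-test"}],
            },
            "answer": None,
        })
    return json.dumps({
        "thought": "I have what I need.",
        "action": None,
        "answer": f"Logs say: {transcript[-1]}",
    })


tools = {"pods_log": lambda name: f"{name}: Failure: Unknown Error"}
result = react_loop(scripted_model, tools, "Check pod slack-test")
```

The `max_steps` cap is a design choice in this sketch: it keeps a confused model from looping forever.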

Below you will see the slight differences in the agent definition and the prompt used to accomplish our task.

In [9]:
builtin_rag = dict(
            name="builtin::rag",
            args={
                "vector_db_ids": [vector_db_id],  # list of IDs of document collections to consider during retrieval
            },
        )



agent = ReActAgent(
            client=client,
            model=model_id,
            tools=["mcp::slack","mcp::openshift", builtin_rag],
            response_format={
                "type": "json_schema",
                "json_schema": ReActOutput.model_json_schema(),
            },
            sampling_params={"max_tokens":512},
        )

user_prompts = ["""View the logs for pod slack-test in the llama-serve OpenShift namespace. Categorize it as normal or error, 
                search for solutions on this error and provide a summary of the steps to take in just 1-2 sentences.
                Send a message with the summarization to the demos channel on Slack. Let me know when you've sent the message.
                """
                ]

session_id = agent.create_session("web-session")
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream
    )
    if stream:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 

INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=mcp%3A%3Aslack "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=mcp%3A%3Aopenshift "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=builtin%3A%3Arag "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=mcp%3A%3Aslack "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=mcp%3A%3Aopenshift "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://llamastack:8321/v1/tools?toolgroup_id=builtin%3A%3Arag "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/773687cb-d023-41e0-b692-801e0adb5789/session "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/773687cb-d023-41e0-b692-801e0adb5789/session/ac0e47c8-06a8-4678-8f89-ddcf590


Processing user query: View the logs for pod slack-test in the llama-serve OpenShift namespace. Categorize it as normal or error, 
                search for solutions on this error and provide a summary of the steps to take in just 1-2 sentences.
                Send a message with the summarization to the demos channel on Slack. Let me know when you've sent the message.


INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/tool-runtime/invoke "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/773687cb-d023-41e0-b692-801e0adb5789/session/ac0e47c8-06a8-4678-8f89-ddcf59073242/turn/9d422617-fc92-4893-8640-dbbf9eb0fe19/resume "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/tool-runtime/invoke "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/773687cb-d023-41e0-b692-801e0adb5789/session/ac0e47c8-06a8-4678-8f89-ddcf59073242/turn/9d422617-fc92-4893-8640-dbbf9eb0fe19/resume "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/tool-runtime/invoke "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://llamastack:8321/v1/agents/773687cb-d023-41e0-b692-801e0adb5789/session/ac0e47c8-06a8-4678-8f89-ddcf59073242/turn/9d422617-fc92-4893-8640-dbbf9eb0fe19/resume "HTTP/1.1 200 OK"



---------- 📍 Step 1: InferenceStep ----------
🤖 Model Response:
{"thought": "I will proceed step by step: first, I will get the logs for the pod 'slack-test' in the 'llama-serve' namespace. Then, I will analyze the logs to categorize them as normal or error. After that, I will search for solutions related to any errors found. Finally, I will summarize the steps to take and send a message to the 'demos' channel on Slack.", "action": {"tool_name": "pods_log", "tool_params": [{"name": "name", "value": "slack-test"}, {"name": "namespace", "value": "llama-serve"}]}, "answer": null}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
{"thought": "The logs indicate an error with the container startup. I will categorize this as an error and proceed to search for solutions related to the 'Unknown Error' and 'Container failed during startup' issues.", "action": {"tool_name": "knowledge_search", "tool_params": [{"name": "query", "value": "Kubernetes container startup error unknown error missing dependencies configuration errors"}]}, "answer": null}

---------- 📍 Step 4: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 5: InferenceStep ----------
🤖 Model Response:
{"thought": "The logs indicate an error during container startup, and the knowledge search results suggest common issues like missing dependencies, configuration errors, or permission problems. Based on this, I will summarize the steps to take and prepare a message to send to the 'demos' channel on Slack.", "action": {"tool_name": "slack_post_message", "tool_params": [{"name": "channel_id", "value": "demos"}, {"name": "text", "value": "The 'slack-test' pod in the 'llama-serve' namespace encountered an error during startup. The error is categorized as a container startup failure, likely due to missing dependencies, configuration errors, or permission issues. Steps to resolve include checking the pod's configuration, verifying dependencies, and ensuring proper permissions. For detailed troubleshooting, review the logs and consider redeploying the pod after fixing the underlying issues."}]}, "answer": null}

------


---------- 📍 Step 7: InferenceStep ----------
🤖 Model Response:
{"thought": "The response indicates an authentication error when trying to post to Slack. I need to ensure that the correct authentication token is configured for the Slack API. Since I cannot reconfigure authentication directly, I will inform the user of the issue and suggest checking the Slack API credentials.", "action": null, "answer": "There was an authentication error when attempting to send the message to the 'demos' channel on Slack. Please verify that the correct Slack API credentials are configured and try again."}



### Output Analysis

Above, we can see that the ReAct agent took a nearly identical approach to the prompt chaining method, but with a single prompt instead of a chain.

1. First, the agent reasoned that it needed to fetch the logs for the specified pod and generated a call to the `pods_log` tool provided by the OpenShift MCP server.
2. Next, the agent reasoned that it should query the available vector database for more information about the error in the logs and acted using the built-in `knowledge_search` tool.
3. Then, the agent summarized the findings from the knowledge search step and reasoned that it needed to share them with the team, which it did by calling the `slack_post_message` tool provided by the Slack MCP server.
4. In this captured run, the Slack call returned an authentication error, so the agent's final answer reported the failure and asked for the Slack API credentials to be verified.


## Key Takeaways

This tutorial demonstrates how to implement agentic RAG and MCP applications with Llama Stack. We do so by initializing an agent with access to the MCP tools and the RAG tool configured in Llama Stack, then invoking the agent on the specified query. We showed that this can be done with more directed prompt chaining or with the more open-ended ReAct pattern. Please check out our other notebooks for more examples using Llama Stack.

#### Any Feedback?

If you have any feedback on this or any other notebook in this demo series we'd love to hear it! Please go to https://www.feedback.redhat.com/jfe/form/SV_8pQsoy0U9Ccqsvk and help us improve our demos. 