## LlamaStack - Using MCP tools from the watsonx Developer Hub template

#### Prerequisites

1. Run the `mcp-autoai-template` from **Watsonx Developer Hub**: https://github.com/IBM/watsonx-developer-hub/tree/main/agents/community/mcp-autoai-template


2. Start the llama-stack server (version 0.3.0) with the watsonx distribution from a terminal:
```
llama stack list-deps watsonx | xargs -L1 uv pip install
uv run llama stack run watsonx
```

Before starting the server, set the following environment variables:
- **WATSONX_PROJECT_ID**
- **WATSONX_BASE_URL**
- **WATSONX_API_KEY**

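A quick check that the variables are present can save a confusing server startup failure. The snippet below is a minimal sketch, not part of the original notebook; the helper name `missing_env_vars` is hypothetical, and only the three variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("WATSONX_PROJECT_ID", "WATSONX_BASE_URL", "WATSONX_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Report on the current process environment.
missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All watsonx environment variables are set.")
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.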

> [!WARNING]
> llama-stack version 0.3.2 does not support embedding models from providers, so upgrading llama-stack may cause complications.


#### Install llama-stack-client

In [1]:
!uv pip install "llama-stack-client==0.3.0"
!uv pip install wget | tail -n 1

Using Python 3.12.11 environment at: /Users/dorotalaczak/Documents/GitHub/red_hat_sandbox/.venv_py312
Audited 1 package in 22ms


In [1]:
!pip list | grep llama

llama_stack                              0.3.0
llama_stack_client                       0.3.2
ollama                                   0.6.1


#### Import dependencies

In [3]:
import os
from dotenv import load_dotenv
load_dotenv()

from llama_stack_client import LlamaStackClient

#### Create LlamaStackClient object

In [4]:
base_url = os.getenv("REMOTE_BASE_URL", "http://localhost:8321")

client = LlamaStackClient(base_url=base_url)

print("Client created!")

Client created!


#### Available Tools

In [5]:
client.toolgroups.list()

INFO:httpx:HTTP Request: GET http://127.0.0.1:8321/v1/toolgroups "HTTP/1.1 200 OK"


[ToolGroup(identifier='builtin::rag', provider_id='rag-runtime', type='tool_group', args=None, mcp_endpoint=None, provider_resource_id='builtin::rag'),
 ToolGroup(identifier='builtin::websearch', provider_id='tavily-search', type='tool_group', args=None, mcp_endpoint=None, provider_resource_id='builtin::websearch')]

In [9]:
tools_providers = []
for provider in client.providers.list():
    if "tool" in provider.api:
        tools_providers.append(provider)
tools_providers

INFO:httpx:HTTP Request: GET http://127.0.0.1:8321/v1/providers "HTTP/1.1 200 OK"


[ProviderInfo(api='tool_runtime', config={'api_key': '********', 'max_results': 3.0}, health={'status': 'Not Implemented', 'message': 'Provider does not implement health check'}, provider_id='brave-search', provider_type='remote::brave-search'),
 ProviderInfo(api='tool_runtime', config={'api_key': '********', 'max_results': 3.0}, health={'status': 'Not Implemented', 'message': 'Provider does not implement health check'}, provider_id='tavily-search', provider_type='remote::tavily-search'),
 ProviderInfo(api='tool_runtime', config={}, health={'status': 'Not Implemented', 'message': 'Provider does not implement health check'}, provider_id='rag-runtime', provider_type='inline::rag-runtime'),
 ProviderInfo(api='tool_runtime', config={}, health={'status': 'Not Implemented', 'message': 'Provider does not implement health check'}, provider_id='model-context-protocol', provider_type='remote::model-context-protocol')]

#### Register MCP tool group


In [14]:
client.toolgroups.register(provider_id='model-context-protocol', toolgroup_id="mcp::wx-dev-hub", mcp_endpoint=dict(uri="http://localhost:8000/sse"))


INFO:httpx:HTTP Request: POST http://127.0.0.1:8321/v1/toolgroups "HTTP/1.1 200 OK"

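Re-running the registration cell with a slightly different identifier or URI leaves several tool groups pointing at the same server. One way to avoid that is to check the existing identifiers first. This is a minimal sketch: `ensure_toolgroup` is a hypothetical helper, and it relies only on the `toolgroups.list()` and `toolgroups.register()` calls already used in this notebook.

```python
def ensure_toolgroup(client, toolgroup_id, provider_id, uri):
    """Register the tool group unless one with the same identifier already exists.

    Returns True if a registration call was made, False otherwise.
    """
    existing = {group.identifier for group in client.toolgroups.list()}
    if toolgroup_id in existing:
        return False
    client.toolgroups.register(
        provider_id=provider_id,
        toolgroup_id=toolgroup_id,
        mcp_endpoint={"uri": uri},
    )
    return True


# Usage with the client created earlier (requires both servers to be running):
# ensure_toolgroup(client, "mcp::wx-dev-hub", "model-context-protocol",
#                  "http://localhost:8000/sse")
```

The helper compares identifiers only, so a group registered under a different name for the same endpoint would not be detected.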

#### List MCP tools

In [15]:
client.toolgroups.list()

INFO:httpx:HTTP Request: GET http://127.0.0.1:8321/v1/toolgroups "HTTP/1.1 200 OK"


[ToolGroup(identifier='builtin::rag', provider_id='rag-runtime', type='tool_group', args=None, mcp_endpoint=None, provider_resource_id='builtin::rag'),
 ToolGroup(identifier='builtin::websearch', provider_id='tavily-search', type='tool_group', args=None, mcp_endpoint=None, provider_resource_id='builtin::websearch'),
 ToolGroup(identifier='mcp_wx_dev_hub', provider_id='model-context-protocol', type='tool_group', args=None, mcp_endpoint=McpEndpoint(uri='http://localhost:8000/'), provider_resource_id='mcp_wx_dev_hub'),
 ToolGroup(identifier='mcp::wx-dev-hub', provider_id='model-context-protocol', type='tool_group', args=None, mcp_endpoint=McpEndpoint(uri='http://localhost:8000/sse'), provider_resource_id='mcp::wx-dev-hub')]

In [17]:
tools_mcp = client.tools.list(toolgroup_id="mcp::wx-dev-hub")

INFO:httpx:HTTP Request: GET http://127.0.0.1:8321/v1/tools?toolgroup_id=mcp%3A%3Awx-dev-hub "HTTP/1.1 200 OK"


In [21]:
import pandas as pd

tools_mcp_data = [tool.model_dump() for tool in tools_mcp]
pd.DataFrame(tools_mcp_data)


| | name | description | input_schema | metadata | output_schema | toolgroup_id |
|---|---|---|---|---|---|---|
| 0 | add | Add two numbers | `{'properties': {'a': {'title': 'A', 'type': 'i...` | `{'endpoint': 'http://localhost:8000/sse'}` | `{'properties': {'result': {'title': 'Result', ...` | mcp::wx-dev-hub |
| 1 | sub | Subtract two numbers | `{'properties': {'a': {'title': 'A', 'type': 'i...` | `{'endpoint': 'http://localhost:8000/sse'}` | `{'properties': {'result': {'title': 'Result', ...` | mcp::wx-dev-hub |
| 2 | invoke_credit_risk_deployemnt | Invoke deployment about credit risk informatio... | `{'$defs': {'PersonInformation': {'properties':...` | `{'endpoint': 'http://localhost:8000/sse'}` | | mcp::wx-dev-hub |

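The DataFrame view truncates each schema, so it can help to pull out just the parameter names per tool. The sketch below assumes each record from `tool.model_dump()` carries a JSON-Schema-like `input_schema` with a top-level `properties` mapping; the helper `parameter_names` and the example record are hypothetical and do not reproduce the real schemas.

```python
def parameter_names(tool_record):
    """Return the sorted top-level parameter names declared in a tool's input schema."""
    schema = tool_record.get("input_schema") or {}
    return sorted(schema.get("properties", {}))


# Hypothetical record, shaped like one entry of tools_mcp_data.
example = {
    "name": "example_tool",
    "input_schema": {
        "properties": {
            "y": {"title": "Y", "type": "number"},
            "x": {"title": "X", "type": "number"},
        }
    },
}

print(example["name"], parameter_names(example))
```

With the real data this could be applied as `{t["name"]: parameter_names(t) for t in tools_mcp_data}`. Schemas that nest their parameters under `$defs` would need extra handling.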

#### Invoke MCP tool

In [35]:
result = client.tool_runtime.invoke_tool(tool_name="sub", kwargs={"a": 10, "b": 3})

INFO:httpx:HTTP Request: POST http://127.0.0.1:8321/v1/tool-runtime/invoke "HTTP/1.1 200 OK"


In [36]:
result.model_dump()

{'content': [{'text': '7', 'type': 'text'}],
 'error_code': 0,
 'error_message': None,
 'metadata': None}

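The tool output sits inside a list of content items, so a small helper can make it easier to consume. This is a sketch under the assumption that the result always has the shape printed above (a list of items with `type` and `text`, plus `error_code` and `error_message`); `tool_result_text` is a hypothetical name.

```python
def tool_result_text(result):
    """Join the text items of a dumped invoke_tool result, raising if an error is reported."""
    if result.get("error_code"):
        message = result.get("error_message") or f"tool failed with code {result['error_code']}"
        raise RuntimeError(message)
    items = result.get("content") or []
    return "".join(item["text"] for item in items if item.get("type") == "text")


# The dictionary printed above, copied as literal input.
dumped = {
    "content": [{"text": "7", "type": "text"}],
    "error_code": 0,
    "error_message": None,
    "metadata": None,
}

print(tool_result_text(dumped))  # prints 7
```

In the notebook this would be called as `tool_result_text(result.model_dump())`.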
#### Next Steps - build agent [WIP]

In [23]:
LLM_MODEL = "mistralai/mistral-medium-2505"

CREDIT_RISK_QUESTION = """Please answer, based on the information below about the person, whether he or she poses a credit risk.

Marek is a 35-year-old man applying for a loan. His checking account status shows a small positive balance. He is requesting a loan to be repaid over a duration of 24 months. Marek has a good credit history, having previously repaid all loans successfully. The purpose of the loan is to purchase a new car. The loan amount he is applying for is 8,000 EUR. He has moderate existing savings in his savings account. Marek has been employed for over 4 years at his current job. He plans to allocate 20% of his monthly income to cover the loan installments.
Marek is male and is applying for the loan individually (no others on the loan). He has been living at his current residence for 5 years. He owns a house in the city. Marek's age is 35. He has no other installment plans ongoing. His housing situation is stable, as he owns his property. He currently has one existing credit with another financial institution.
Marek works as a skilled employee. He has one dependent child. He also owns a telephone. Lastly, Marek is a domestic worker, not a foreign worker."""

In [37]:
resp = client.responses.create(
    model=LLM_MODEL,
    input=CREDIT_RISK_QUESTION,
    tools=[{"type": "mcp"}],
)

print(resp.output[-1].content[-1].text)

INFO:httpx:HTTP Request: POST http://127.0.0.1:8321/v1/responses "HTTP/1.1 400 Bad Request"


BadRequestError: Error code: 400 - {'error': {'detail': {'errors': [{'loc': ['body', 'tools', 0, 'mcp', 'server_label'], 'msg': 'Field required', 'type': 'missing'}, {'loc': ['body', 'tools', 0, 'mcp', 'server_url'], 'msg': 'Field required', 'type': 'missing'}]}}}
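
The 400 response names two missing fields in the MCP tool entry: `server_label` and `server_url`. A possible next step is to supply both, as sketched below. The helper `mcp_tool_spec` and the label `wx-dev-hub` are hypothetical, the URL is the SSE endpoint registered earlier, and it has not been verified here that the request succeeds once these fields are present.

```python
def mcp_tool_spec(server_label, server_url):
    """Build an MCP tool entry containing the fields the error message reports as required."""
    return {
        "type": "mcp",
        "server_label": server_label,  # hypothetical label, chosen for illustration
        "server_url": server_url,
    }


tool = mcp_tool_spec("wx-dev-hub", "http://localhost:8000/sse")
print(tool)

# With the client, model and question defined earlier (requires the running servers):
# resp = client.responses.create(
#     model=LLM_MODEL,
#     input=CREDIT_RISK_QUESTION,
#     tools=[tool],
# )
```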