### MCP Server Usage (SSE and streamable-http)

This notebook showcases how to use the NVIDIA RAG MCP server via MCP transports instead of REST APIs. It covers:
- Launching the server (SSE and streamable-http)
- Connecting with the MCP Python client
- Listing tools
- Calling all MCP tools: `generate`, `search`, and `get_summary`

You can execute each cell in sequence to test the MCP server tools.

#### 1. Install Dependencies

Purpose:
Install the libraries needed to run the MCP server and client locally in this notebook environment.

- Ensure your environment has:
  - `mcp`, `anyio`, `httpx`, `httpx-sse`, `uvicorn`
- If using Workbench/docker, these may already be installed.
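
Before installing anything, it can help to see which of these packages are already present. The following is a minimal sketch that uses only the standard library; the helper name `installed_versions` is hypothetical and not part of the repository.

```python
from importlib import metadata

# Distribution names taken from the list above.
REQUIRED = ["mcp", "anyio", "httpx", "httpx-sse", "uvicorn"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can then be installed with the `%pip` cell below.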

In [None]:
# %pip install -qq -r ../nvidia_rag_mcp/requirements.txt


Note: you may need to restart the kernel to use updated packages.



#### 2. Setup Base Configuration

Purpose:
Capture the API key and basic variables that the server, the client, and the rest of this notebook rely on.


In [None]:
import os

NVIDIA_API_KEY = os.environ.get("NVIDIA_API_KEY", "")

#### Prerequisite: Create a collection and upload a sample document (with summary)

Purpose: Seed the knowledge base so search, generate, and get_summary produce meaningful results. We'll upload `data/multimodal/woods_frost.pdf` with `generate_summary=true` after creating a collection.

Notes:
- Ensure the RAG server is running and reachable before using MCP tools.
- Set `VITE_API_CHAT_URL` to the RAG server URL. If it is not set, the default `http://127.0.0.1:8081` is used.
- Ensure the Ingestor server is running and reachable.
- Set `INGESTOR_URL` to the Ingestor server URL. If it is not set, the default `http://127.0.0.1:8082` is used.
- This step creates/uses the collection `my_collection` used later in the MCP calls.
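
The reachability checks in the notes above can be sketched as a quick TCP probe. This is an assumption-laden illustration: `resolve_url` and `is_reachable` are hypothetical helpers, and a successful TCP connection only shows that something is listening on the port, not that the service itself is healthy.

```python
import os
import socket
from urllib.parse import urlparse

# Defaults as stated in the notes above.
DEFAULTS = {
    "VITE_API_CHAT_URL": "http://127.0.0.1:8081",
    "INGESTOR_URL": "http://127.0.0.1:8082",
}


def resolve_url(var, env=os.environ):
    """Return the URL from the environment, falling back to the documented default."""
    return (env.get(var) or DEFAULTS[var]).rstrip("/")


def is_reachable(url, timeout=2.0):
    """True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


for var in DEFAULTS:
    url = resolve_url(var)
    print(f"{var}={url} reachable={is_reachable(url)}")
```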

In [None]:
# Upload a document to the Ingestor with generate_summary=True
import os
import json
import subprocess

INGESTOR_URL = os.environ.get("INGESTOR_URL", "http://127.0.0.1:8082").rstrip("/")
COLLECTION = "my_collection"
PDF_PATH = os.path.join(os.path.dirname(os.getcwd()), "data", "multimodal", "woods_frost.pdf")

# 1) Ensure the collection exists
create_cmd = [
    "curl", "-sS", "-X", "POST", f"{INGESTOR_URL}/v1/collections",
    "-H", "Content-Type: application/json",
    "-d", json.dumps([COLLECTION]),
]
print("Creating collection:", " ".join(create_cmd))
create_res = subprocess.run(create_cmd, capture_output=True, text=True)
print(create_res.stdout or create_res.stderr)

# 2) Upload document with summary generation enabled
payload = {
    "collection_name": COLLECTION,
    "blocking": True,
    "custom_metadata": [],
    "generate_summary": True,
}

cmd = [
    "curl", "-sS", "-X", "POST", f"{INGESTOR_URL}/v1/documents",
    "-F", f"documents=@{PDF_PATH}",
    "-F", f"data={json.dumps(payload)}",
]
print("Uploading document:", " ".join(cmd))
res = subprocess.run(cmd, capture_output=True, text=True)
print(res.stdout or res.stderr)
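
The cell above prints whatever `curl` returns without checking whether the call succeeded. A minimal sketch of a stricter wrapper follows; `run_json_command` is a hypothetical helper, and the shape of the Ingestor's response is not assumed beyond "JSON if possible". Note that `curl` exits with status 0 even on HTTP 4xx/5xx responses unless `--fail` is added to the command.

```python
import json
import subprocess


def run_json_command(cmd, timeout=300):
    """Run a command, raise on a non-zero exit, and parse stdout as JSON if possible."""
    res = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if res.returncode != 0:
        raise RuntimeError(f"Command failed ({res.returncode}): {res.stderr.strip()}")
    try:
        return json.loads(res.stdout)
    except json.JSONDecodeError:
        return res.stdout  # not JSON; hand back the raw text
```

With this in place, `run_json_command(create_cmd)` or `run_json_command(cmd)` could replace the `subprocess.run` and `print` pairs used above.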

## Launch MCP Server (SSE)

Purpose:
Start the MCP server locally over SSE so that the client can connect and call tools.

This launches the MCP server on `http://127.0.0.1:8000`.

In [None]:
# Kill any process listening on port 8000 to avoid port conflicts
import subprocess

PORT = 8000
print(f"Kill any process running on port {PORT} to start sse server in the next cell")

# Try fuser first (common on Linux)
try:
    subprocess.run(["fuser", "-k", f"{PORT}/tcp"], check=False)
except FileNotFoundError:
    print("'fuser' not found, skipping fuser-based cleanup.")
except Exception as e:
    print(f"Error while running fuser: {e}")

In [None]:
import os
import sys
import atexit
import subprocess  # needed for Popen below; imported here so the cell runs on its own
import time

sse_proc = None
try:
    env = dict(os.environ)
    if NVIDIA_API_KEY:
        env.setdefault("NVIDIA_API_KEY", NVIDIA_API_KEY)

    repo_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
    server_path = os.path.join(repo_root, "nvidia_rag_mcp", "mcp_server.py")
    cmd = [
        sys.executable,
        server_path,
        "--transport",
        "sse",
        "--host",
        "127.0.0.1",
        "--port",
        "8000",
    ]

    print("Launching:", " ".join(cmd))
    sse_proc = subprocess.Popen(cmd, env=env)
    atexit.register(lambda: sse_proc and sse_proc.poll() is None and sse_proc.terminate())
    time.sleep(2.0)
    print("SSE server PID:", sse_proc.pid)
except Exception as e:
    print("Failed to start SSE server:", e)
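
The fixed `time.sleep(2.0)` above is a guess at how long the server takes to start. One alternative, sketched here with a hypothetical helper `wait_for_port`, is to poll the port until it accepts connections and to give up early if the server process has already exited.

```python
import socket
import time


def wait_for_port(host, port, proc=None, timeout=30.0, interval=0.25):
    """Poll until host:port accepts TCP connections.

    Returns True once a connection succeeds, and False on timeout or if
    `proc` (a subprocess.Popen handle) exits before the port opens.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc is not None and proc.poll() is not None:
            return False  # the server process died before it started listening
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("127.0.0.1", 8000, proc=sse_proc)` could be called right after `subprocess.Popen` in place of the sleep.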

## Connect with MCP Client (SSE), List Tools, and Call MCP Tools

Purpose:
Verify connectivity by listing available tools and invoking the MCP tools (`generate`, `search`, `get_summary`) using the SSE transport.

This uses the `mcp_client.py` CLI to connect over SSE, list tools, and invoke the RAG tools.

**Note:** Ensure the SSE server launched in the previous cell is still running before executing this cell.


In [None]:
import os
import sys
import json
import subprocess

SSE_URL = "http://127.0.0.1:8000/sse"
repo_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
client_path = os.path.join(repo_root, "nvidia_rag_mcp", "mcp_client.py")

print("="*80)
print("Listing available tools...")
print("="*80)
subprocess.run([
    sys.executable,
    client_path,
    "list",
    "--transport=sse",
    f"--url={SSE_URL}",
])

print("\n" + "="*80)
print("Calling 'generate' tool...")
print("="*80)
generate_args = json.dumps({
    "messages": [{"role": "user", "content": "Hello from SSE demo"}],
    "collection_name": "my_collection",
})
subprocess.run([
    sys.executable,
    client_path,
    "call",
    "--transport=sse",
    f"--url={SSE_URL}",
    "--tool=generate",
    f"--json-args={generate_args}",
])

print("\n" + "="*80)
print("Calling 'search' tool...")
print("="*80)
search_args = json.dumps({
    "query": "Tell me about Robert Frost's poems",
    "collection_name": "my_collection",
    "reranker_top_k": 2,
    "vdb_top_k": 5,
    "enable_query_rewriting": False,
    "enable_reranker": True,
})
subprocess.run([
    sys.executable,
    client_path,
    "call",
    "--transport=sse",
    f"--url={SSE_URL}",
    "--tool=search",
    f"--json-args={search_args}",
])

print("\n" + "="*80)
print("Calling 'get_summary' tool...")
print("="*80)
summary_args = json.dumps({
    "collection_name": "my_collection",
    "file_name": "woods_frost.pdf",
    "blocking": False,
    "timeout": 60,
})
subprocess.run([
    sys.executable,
    client_path,
    "call",
    "--transport=sse",
    f"--url={SSE_URL}",
    "--tool=get_summary",
    f"--json-args={summary_args}",
])
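
The cell above drives everything through the `mcp_client.py` CLI. For comparison, here is a minimal sketch of listing tools directly with the `mcp` Python SDK, assuming its `ClientSession` and `sse_client` interfaces; the function name `list_tools_over_sse` is hypothetical, and this is not necessarily how `mcp_client.py` is implemented.

```python
import asyncio


async def list_tools_over_sse(url):
    """Connect to an MCP server over SSE and return (name, description) pairs."""
    # Imported here so the cell still loads when `mcp` is not installed.
    from mcp import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(url) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()  # MCP handshake must precede any request
            listing = await session.list_tools()
            return [(tool.name, tool.description) for tool in listing.tools]


# In a notebook: tools = await list_tools_over_sse("http://127.0.0.1:8000/sse")
# In a script:   tools = asyncio.run(list_tools_over_sse("http://127.0.0.1:8000/sse"))
```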


## Launch MCP Server (streamable_http)

Purpose:
Start the MCP server locally using the **streamable_http** transport so that the client can connect and call tools.

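This launches the MCP server with FastMCP's streamable-http support. No host or port is passed, so the server's defaults apply; the client cell further down connects to it at `http://127.0.0.1:8000/mcp`.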

In [None]:
import subprocess

PORT = 8000
print(f"Kill any process running on port {PORT} to start streamable_http server in the next cell")

try:
    subprocess.run(["fuser", "-k", f"{PORT}/tcp"], check=False)
except FileNotFoundError:
    print("'fuser' not found, skipping fuser-based cleanup.")
except Exception as e:
    print(f"Error while running fuser: {e}")

In [None]:
import os
import sys
import atexit
import subprocess  # needed for Popen below; imported here so the cell runs on its own
import time

stream_proc = None
try:
    env = dict(os.environ)
    if NVIDIA_API_KEY:
        env.setdefault("NVIDIA_API_KEY", NVIDIA_API_KEY)

    repo_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
    server_path = os.path.join(repo_root, "nvidia_rag_mcp", "mcp_server.py")
    cmd = [
        sys.executable,
        server_path,
        "--transport",
        "streamable_http",
    ]

    print("Launching streamable_http server:", " ".join(cmd))
    stream_proc = subprocess.Popen(cmd, env=env)
    atexit.register(lambda: stream_proc and stream_proc.poll() is None and stream_proc.terminate())
    time.sleep(2.0)
    print("streamable_http server PID:", stream_proc.pid)
except Exception as e:
    print("Failed to start streamable_http server:", e)

## Connect with MCP Client (streamable_http), List Tools, and Call MCP Tools

Purpose:
Verify connectivity by listing available tools and invoking the MCP tools (`generate`, `search`, `get_summary`) using the streamable_http transport.

This uses the `mcp_client.py` CLI to connect over streamable_http, list tools, and invoke the RAG tools.

**Note:** Ensure the streamable_http server is running from the cell above before executing this cell.



In [29]:
import os
import sys
import json
import subprocess

# streamable_http endpoint (note the /mcp path, not /sse)
STREAM_URL = "http://127.0.0.1:8000/mcp"
repo_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
client_path = os.path.join(repo_root, "nvidia_rag_mcp", "mcp_client.py")

print("="*80)
print("Listing available tools...")
print("="*80)
subprocess.run([
    sys.executable,
    client_path,
    "list",
    "--transport=streamable_http",
    f"--url={STREAM_URL}",
])

print("\n" + "="*80)
print("Calling 'generate' tool...")
print("="*80)
generate_args = json.dumps({
    "messages": [{"role": "user", "content": "Hello from streamable_http demo"}],
    "collection_name": "my_collection",
})
subprocess.run([
    sys.executable,
    client_path,
    "call",
    "--transport=streamable_http",
    f"--url={STREAM_URL}",
    "--tool=generate",
    f"--json-args={generate_args}",
])

print("\n" + "="*80)
print("Calling 'search' tool...")
print("="*80)
search_args = json.dumps({
    "query": "Tell me about Robert Frost's poems",
    "collection_name": "my_collection",
    "reranker_top_k": 2,
    "vdb_top_k": 5,
    "enable_query_rewriting": False,
    "enable_reranker": True,
})
subprocess.run([
    sys.executable,
    client_path,
    "call",
    "--transport=streamable_http",
    f"--url={STREAM_URL}",
    "--tool=search",
    f"--json-args={search_args}",
])

print("\n" + "="*80)
print("Calling 'get_summary' tool...")
print("="*80)
summary_args = json.dumps({
    "collection_name": "my_collection",
    "file_name": "woods_frost.pdf",
    "blocking": False,
    "timeout": 60,
})
subprocess.run([
    sys.executable,
    client_path,
    "call",
    "--transport=streamable_http",
    f"--url={STREAM_URL}",
    "--tool=get_summary",
    f"--json-args={summary_args}",
])
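
The SSE and streamable_http client cells differ only in the transport name and URL. As a sketch, the repeated argument lists could be built by one function; `build_client_cmd` is a hypothetical helper that uses only the CLI flags already shown in this notebook.

```python
import json
import sys


def build_client_cmd(client_path, action, transport, url, tool=None, args=None):
    """Build the argv list for one mcp_client.py invocation."""
    cmd = [sys.executable, client_path, action, f"--transport={transport}", f"--url={url}"]
    if tool is not None:
        cmd.append(f"--tool={tool}")
    if args is not None:
        cmd.append(f"--json-args={json.dumps(args)}")  # arguments are passed as one JSON string
    return cmd
```

A call such as `subprocess.run(build_client_cmd(client_path, "call", "streamable_http", "http://127.0.0.1:8000/mcp", tool="search", args={"query": "...", "collection_name": "my_collection"}))` would then mirror the `search` invocation above.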


Listing available tools...


[12/04/25 13:23:59] INFO     Created new transport with session ID: ac1e4821d04f45238b051088df19b561 (streamable_http_manager.py:239)
                    INFO     Processing request of type ... (server.py)

INFO:     127.0.0.1:50702 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50704 - "POST /mcp HTTP/1.1" 202 Accepted
INFO:     127.0.0.1:50710 - "GET /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50714 - "POST /mcp HTTP/1.1" 200 OK
generate: Generate an answer using NVIDIA RAG (optionally with knowledge base). Provide chat messages and optional generation parameters.
search: Search the vector database and return citations for a given query.
get_summary: Retrieve the pre-generated summary for a document from a collection. Set blocking=true to wait up to timeout seconds for summary generation.
INFO:     127.0.0.1:50726 - "DELETE /mcp HTTP/1.1" 200 OK

Calling 'generate' tool...


                    INFO     Created new transport with session ID: 1f40ea48c84546228fa4e414a531657d (streamable_http_manager.py:239)
                    INFO     Processing request of type ... (server.py)

INFO:     127.0.0.1:50732 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50736 - "POST /mcp HTTP/1.1" 202 Accepted
INFO:     127.0.0.1:50750 - "GET /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50760 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50772 - "POST /mcp HTTP/1.1" 200 OK
{
  "meta": null,
  "content": [
    {
      "type": "text",
      "text": "Hello!",
      "annotations": null,
      "meta": null
    }
  ],
  "structuredContent": {
    "result": "Hello!"
  },
  "isError": false
}
INFO:     127.0.0.1:50774 - "DELETE /mcp HTTP/1.1" 200 OK

Calling 'search' tool...


[12/04/25 13:24:01] INFO     Processing request of type ListToolsRequest (server.py:709)
                    INFO     Terminating session: 1f40ea48c84546228fa4e414a531657d (streamable_http.py:750)

INFO:     127.0.0.1:50776 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50778 - "POST /mcp HTTP/1.1" 202 Accepted
INFO:     127.0.0.1:50788 - "GET /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50796 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50810 - "POST /mcp HTTP/1.1" 200 OK
{
  "meta": null,
  "content": [
    {
      "type": "text",
      "text": "{\n  \"total_results\": 2,\n  \"results\": [\n    {\n      \"document_id\": \"\",\n      \"content\": \"iVBORw0KGgoAAAANSUhEUgAAAmAAAAFdCAIAAAArBUZpAAAgAElEQVR4nOydBXgVRxeGF4m7YyFECBJcAgR3irsXCVK8xYpLC5RC0ULxYoXg7hAkBCdICJIASSBGiLtD/i97uPtfkguENkJuzvs8XPbOnZ2d3ezZ75yZ2Zki6enpAsMwDMMwH1M0vyvAMAzDMN8iLJAMwzAMowAWSIZhGIZRAAskwzAMwyiABZJhGIZhFMACyTAMwzAKYIFkGIZhGAWwQDIMwzCMAlggGYZhGEYBLJAMwzAMowAWSIZhGIZRAAskwzAMwyiABZJhGIZhFMACyTAMwzAKYIFkGIZhGAWwQDIMwzCMAlggGYZhGEYBLJAMwzAMowAWSIZhGIZRAAskwzAMwyiABZJhGIZhFMACyTAMwzAKYIFkGIZhGAWwQDIMwzCMAlggGYZhGEYBLJAMwzAMowAWSIZhGIZRAAskwzAMwyiABZJhGIZhFMACyTAMwzAKYIFkGIZhGAWwQDIMwzCMAlggGYZhGEYBLJA

[12/04/25 13:24:02] INFO     Processing request of type ListToolsRequest (server.py:709)
                    INFO     Terminating session: 8003c1237b7140f1a8d26ab9c5e9b5ca (streamable_http.py:750)

INFO:     127.0.0.1:50834 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50850 - "POST /mcp HTTP/1.1" 202 Accepted
INFO:     127.0.0.1:50864 - "GET /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50878 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:50894 - "POST /mcp HTTP/1.1" 200 OK
{
  "meta": null,
  "content": [
    {
      "type": "text",
      "text": "{\n  \"message\": \"Summary retrieved successfully.\",\n  \"summary\": \"Here is the concise summary:\\n\\n**Stopping by Woods on a Snowy Evening, Poem by Robert Frost**\\nThe poem describes a serene winter scene where the narrator stops to observe woods filling with snow, prompting reflection on duties and journeys ahead. Key themes include the allure of nature, responsibility, and the passage of time. Notable lines highlight the contrast between the woods' beauty and the narrator's obligations: \\\"The woods are lovely, dark and deep, But I have promises to keep, And miles to go before I sleep.\\\" The poem was published as part 

                    INFO     Created new transport with session ID: b70a604d68b74825a1a5bd46e1e6d7ce (streamable_http_manager.py:239)
                    INFO     Processing request of type ... (server.py)

CompletedProcess(args=['/home/niyati/anaconda3/bin/python', '/home/niyati/rag/nvidia_rag_mcp/mcp_client.py', 'call', '--transport=streamable_http', '--url=http://127.0.0.1:8000/mcp', '--tool=get_summary', '--json-args={"collection_name": "my_collection", "file_name": "woods_frost.pdf", "blocking": false, "timeout": 60}'], returncode=0)

[12/04/25 13:26:51] INFO     Created new transport with session ID: dc2c33348f6b49dd92dbcd336dcb47b2 (streamable_http_manager.py:239)
                    INFO     Processing request of type ... (server.py)

INFO:     127.0.0.1:53774 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:53786 - "POST /mcp HTTP/1.1" 202 Accepted
INFO:     127.0.0.1:53798 - "GET /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:53812 - "POST /mcp HTTP/1.1" 200 OK
INFO:     127.0.0.1:53824 - "DELETE /mcp HTTP/1.1" 200 OK
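
In the output above, each tool result is a JSON object whose `content` items carry a `text` field; for `search` and `get_summary` that text is itself a JSON document, while for `generate` it is plain text. A minimal sketch for unpacking such a result follows, assuming the result has already been loaded into a Python dict; `extract_text_payloads` is a hypothetical helper.

```python
import json


def extract_text_payloads(result):
    """Return the text items of a tool result, decoding any that are JSON."""
    payloads = []
    for item in result.get("content", []):
        if item.get("type") != "text":
            continue  # skip non-text content items
        text = item.get("text", "")
        try:
            payloads.append(json.loads(text))
        except json.JSONDecodeError:
            payloads.append(text)  # plain text such as "Hello!"
    return payloads
```

Checking the `isError` field before using the payloads is a sensible extra step.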


## Cleanup & Troubleshooting

Purpose:
Wrap up the session, stop background processes, and provide guidance for common errors (401/404) and environment/version mismatches.

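- To stop the SSE or streamable-http server started above, restart the kernel or terminate the `sse_proc` and `stream_proc` processes (for example, `sse_proc.terminate()`).
- If SSE returns 404, check the URL passed to the client (the client probes standard SSE endpoints). Because both servers in this notebook use port 8000, a 404 on `/sse` can also mean that the streamable_http server, which serves `/mcp`, is the one currently running.
- If a tool call returns 401, first check that `NVIDIA_API_KEY` was set in the environment before the server was launched.
- Ensure the installed versions of `mcp`, `anyio`, and `uvicorn` are compatible with the ones listed in `../nvidia_rag_mcp/requirements.txt`.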
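
The launch cells register an `atexit` hook, but that only runs when the kernel shuts down. To stop the servers explicitly, something like the following sketch could be used; `stop_process` is a hypothetical helper, and `sse_proc` and `stream_proc` are the handles created by the launch cells.

```python
import subprocess


def stop_process(proc, timeout=5.0):
    """Terminate a subprocess.Popen, escalating to kill if it does not exit in time."""
    if proc is None or proc.poll() is not None:
        return  # never started, or already exited
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()


# Look the handles up by name so this cell also works if a launch cell was skipped.
for name in ("sse_proc", "stream_proc"):
    stop_process(globals().get(name))
```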
