# MCP Demo with RAG - Colab Adaptation

This notebook demonstrates the core concepts of the Model Context Protocol (MCP) with Retrieval-Augmented Generation (RAG), adapted from the [MCP Demo](https://github.com/thinklytics/mcp-demo).

It simulates an MCP server's tools (including RAG) as Python functions and shows how a client can interact with them.

**Note:** For simplicity in Colab, this version calls the tool functions directly instead of going through MCP's actual transports (stdio or SSE).

## 1. Setup & Data Models

Install Pydantic and define the models used for structured data exchange (adapted from `models.py`). The code calls `model_dump()`, so it assumes Pydantic v2.

In [1]:
%pip install pydantic -q

from pydantic import BaseModel, ConfigDict, Field
from typing import List, Dict, Any, Optional
from datetime import datetime
from enum import Enum

class DocumentMetadata(BaseModel):
    # Pydantic v2 configuration: accept metadata keys beyond the declared fields.
    model_config = ConfigDict(extra='allow')

    source: Optional[str] = None
    topic: Optional[str] = None
    created_at: Optional[str] = Field(default_factory=lambda: datetime.now().isoformat())

class DocumentStatus(str, Enum):
    SUCCESS = "success"
    ERROR = "error"
    DUPLICATE = "duplicate"

class Document(BaseModel):
    id: str
    preview: str
    metadata: Optional[DocumentMetadata] = None

class SearchResult(BaseModel):
    document: str
    score: float
    metadata: Optional[DocumentMetadata] = None

class AddDocumentRequest(BaseModel):
    content: str
    metadata: Optional[Dict[str, Any]] = None

class AddDocumentResponse(BaseModel):
    status: DocumentStatus
    id: Optional[str] = None
    message: Optional[str] = None

class SearchRequest(BaseModel):
    query: str
    top_k: int = 3

print("✅ Libraries installed and Models defined.")


✅ Libraries installed and Models defined.


## 2. Simulate Server-Side Logic (Tools)

Define Python functions that mimic the MCP server's tools. The RAG knowledge base is a simple in-memory dictionary, and retrieval is keyword matching rather than embedding-based search.

In [2]:
import json
from datetime import datetime

# Simulate the in-memory document store
in_memory_docs: Dict[str, Dict[str, Any]] = {}

# --- Tool Definitions ---

def echo(message: str) -> str:
    # print(f"--- Tool: echo(message='{message}') ---")
    return f"Echo: {message}"

def add(a: float, b: float) -> float:
    # print(f"--- Tool: add(a={a}, b={b}) ---")
    return a + b

def _check_for_duplicate_document(content: str):
    for doc_id, doc_data in in_memory_docs.items():
        if doc_data.get("content") == content: return True, doc_id
    return False, None

def add_document(content: str, metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    # print(f"--- Tool: add_document(content='...', metadata={metadata}) ---")
    try:
        metadata = metadata or {}
        validated_meta = DocumentMetadata(**metadata)
        metadata_dict = validated_meta.model_dump(exclude_none=True)
        exists, duplicate_id = _check_for_duplicate_document(content)
        if exists:
            response = AddDocumentResponse(status=DocumentStatus.DUPLICATE, id=duplicate_id,
                                         message=f"Duplicate content. Exists as ID: {duplicate_id}")
        else:
            doc_id = f"doc_{len(in_memory_docs) + 1}"
            in_memory_docs[doc_id] = {"content": content, "metadata": metadata_dict}
            response = AddDocumentResponse(status=DocumentStatus.SUCCESS, id=doc_id)
        return response.model_dump()
    except Exception as e:
        return AddDocumentResponse(status=DocumentStatus.ERROR, message=str(e)).model_dump()

def list_documents() -> List[Dict[str, Any]]:
    # print("--- Tool: list_documents() ---")
    results = []
    for doc_id, doc_data in in_memory_docs.items():
        content = doc_data.get("content", "")
        preview = (content[:80] + "...") if len(content) > 80 else content
        metadata_obj = DocumentMetadata(**doc_data.get("metadata", {}))
        doc = Document(id=doc_id, preview=preview, metadata=metadata_obj)
        results.append(doc.model_dump(exclude_none=True))
    return results

def rag_search(query: str, top_k: int = 3) -> List[Dict[str, Any]]:
    # print(f"--- Tool: rag_search(query='{query}', top_k={top_k}) ---")
    query_terms = query.lower().split()
    matches = []
    for doc_id, doc_data in in_memory_docs.items():
        content = doc_data.get("content", "").lower()
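        # Substring match: a term counts if it occurs anywhere in the content,
        # even inside a longer word. The score is the fraction of query terms matched.
        score = sum(1 for term in query_terms if term in content) / len(query_terms) if query_terms else 0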
        if score > 0:
            metadata_obj = DocumentMetadata(**doc_data.get("metadata", {}))
            search_res = SearchResult(document=doc_data.get("content", ""), metadata=metadata_obj, score=score)
            matches.append(search_res.model_dump(exclude_none=True))

    matches.sort(key=lambda x: x["score"], reverse=True)
    results = matches[:top_k]
    if not results:
        return [SearchResult(document="No relevant documents found", metadata=None, score=0.0).model_dump()]
    return results

# Simulate the sample data resource
SAMPLE_DATA = "This is sample data.\nIt contains multiple lines."
def read_resource(uri: str) -> str:
    # print(f"--- Resource: read_resource(uri='{uri}') ---")
    return SAMPLE_DATA if uri == "sample://data" else f"Resource not found: {uri}"

print("✅ Simulated Server Tools Defined.")

✅ Simulated Server Tools Defined.
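
In a real MCP deployment these functions would not be called directly: the client sends a JSON-RPC 2.0 request with method `tools/call`, naming the tool and its arguments. The following is a minimal sketch of that dispatch step, not the demo's actual server code. The `TOOLS` registry, the `_error` helper, and `handle_request` are hypothetical names, and the result payload is simplified compared with what the MCP specification defines.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry; the lambdas mirror the notebook's `echo` and `add` tools.
TOOLS: Dict[str, Callable[..., Any]] = {
    "echo": lambda message: f"Echo: {message}",
    "add": lambda a, b: a + b,
}

def _error(req_id: Any, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "error": {"code": code, "message": message}})

def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 'tools/call' request to a local function."""
    request = json.loads(raw)
    req_id = request.get("id")
    if request.get("method") != "tools/call":
        # -32601 is the JSON-RPC 2.0 "Method not found" code.
        return _error(req_id, -32601, "Method not found")
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        # -32602 is the JSON-RPC 2.0 "Invalid params" code.
        return _error(req_id, -32602, f"Unknown tool: {params.get('name')}")
    result = tool(**params.get("arguments", {}))
    # Assumption: a simplified result shape, chosen for illustration only.
    return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": {"value": result}})

request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 15, "b": 7.5}},
})
print(handle_request(request))
```

The direct calls used in the rest of this notebook skip the serialization and dispatch steps, but the tool functions themselves would look the same.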


## 3. Client Test Helper

A helper function to parse the simulated dictionary responses back into Pydantic models for easier handling on the 'client' side.

In [5]:
# Helper function to parse responses
def parse_simulated_response(response_data: Any, model_class: type) -> Any:
    """Parses simulated tool response (dict/list) into Pydantic models."""
    try:
        if isinstance(response_data, list):
            return [model_class(**item) for item in response_data]
        elif isinstance(response_data, dict):
            return model_class(**response_data)
        elif isinstance(response_data, (str, int, float, bool)):
            return response_data  # Primitive types
        else:
            print(f"Warning: Unexpected response type: {type(response_data)}.")
            return response_data
    except Exception as e:
        print(f"Error parsing response into {model_class.__name__}: {e}")
        return None if isinstance(response_data, dict) else []

print("✅ Client Helper Defined.")

✅ Client Helper Defined.


## 4. Client Interaction Tests

These cells simulate a client calling the server's tools. Run them sequentially.

### 4.1 Test Basic Tools (Echo, Add)

In [6]:
print("--- Testing Basic Tools ---")
echo_resp = echo(message="Hello from Colab!")
print(f"Echo Result: {parse_simulated_response(echo_resp, str)}")

add_resp = add(a=15, b=7.5)
print(f"Add Result: {parse_simulated_response(add_resp, float)}")

--- Testing Basic Tools ---
Echo Result: Echo: Hello from Colab!
Add Result: 22.5


### 4.2 Test Adding Documents

In [7]:
print("--- Testing Add Document ---")
# Clear any previous docs for clean test run
in_memory_docs.clear()

docs_to_add = [
    {"content": "MCP allows AI models to interact with external tools.", "metadata": {"topic": "MCP"}},
    {"content": "RAG combines LLMs with information retrieval.", "metadata": {"topic": "RAG"}},
    {"content": "Thinklytics focuses on strategic thinking and data analytics for decision making.", "metadata": {"topic": "Blog"}}
]

for doc in docs_to_add:
    add_raw = add_document(content=doc["content"], metadata=doc["metadata"])
    add_resp = parse_simulated_response(add_raw, AddDocumentResponse)
    if add_resp: print(f"Add Status: {add_resp.status}, ID: {add_resp.id}, Msg: {add_resp.message}")

# Test duplicate
print("\nTesting duplicate add:")
add_dup_raw = add_document(content=docs_to_add[0]["content"], metadata=docs_to_add[0]["metadata"])
add_dup_resp = parse_simulated_response(add_dup_raw, AddDocumentResponse)
if add_dup_resp: print(f"Add Status: {add_dup_resp.status}, ID: {add_dup_resp.id}, Msg: {add_dup_resp.message}")

--- Testing Add Document ---
Add Status: DocumentStatus.SUCCESS, ID: doc_1, Msg: None
Add Status: DocumentStatus.SUCCESS, ID: doc_2, Msg: None
Add Status: DocumentStatus.SUCCESS, ID: doc_3, Msg: None

Testing duplicate add:
Add Status: DocumentStatus.DUPLICATE, ID: doc_1, Msg: Duplicate content. Exists as ID: doc_1
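
The duplicate was caught because `_check_for_duplicate_document` compares the new content against every stored document. That is fine for three documents, but the cost grows linearly with the store. One possible alternative is an index keyed on a content digest. The sketch below assumes a hypothetical `ContentIndex` class that is not part of the original demo; like the original check, it only detects exact matches.

```python
import hashlib
from typing import Dict, Optional

class ContentIndex:
    """Hypothetical helper: maps a SHA-256 digest of content to a document ID."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, str] = {}

    @staticmethod
    def _digest(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def find(self, content: str) -> Optional[str]:
        """Return the ID of a document with identical content, if any."""
        return self._by_digest.get(self._digest(content))

    def register(self, doc_id: str, content: str) -> None:
        """Record a newly stored document so later duplicates are detected."""
        self._by_digest[self._digest(content)] = doc_id

index = ContentIndex()
index.register("doc_1", "MCP allows AI models to interact with external tools.")
print(index.find("MCP allows AI models to interact with external tools."))
print(index.find("Something new"))
```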


### 4.3 Test Listing Documents

In [8]:
print("--- Testing List Documents ---")
list_raw = list_documents()
documents = parse_simulated_response(list_raw, Document)

if not documents:
    print("No documents found.")
else:
    print(f"Found {len(documents)} documents:")
    for doc in documents:
        meta_str = ", ".join([f"{k}={v}" for k, v in doc.metadata.model_dump(exclude_none=True).items()])
        print(f"- ID: {doc.id}, Preview: '{doc.preview}', Meta: [{meta_str}]")

--- Testing List Documents ---
Found 3 documents:
- ID: doc_1, Preview: 'MCP allows AI models to interact with external tools.', Meta: [topic=MCP, created_at=2025-04-11T18:48:42.008682]
- ID: doc_2, Preview: 'RAG combines LLMs with information retrieval.', Meta: [topic=RAG, created_at=2025-04-11T18:48:42.008754]
- ID: doc_3, Preview: 'Thinklytics focuses on strategic thinking and data analytics for decision making...', Meta: [topic=Blog, created_at=2025-04-11T18:48:42.008786]


### 4.4 Test RAG Search

In [9]:
print("--- Testing RAG Search ---")
search_query = "What is MCP?"
search_req = SearchRequest(query=search_query, top_k=2)
search_raw = rag_search(query=search_req.query, top_k=search_req.top_k)
search_results = parse_simulated_response(search_raw, SearchResult)

print(f"Search results for '{search_query}':")
if not search_results or (len(search_results) == 1 and search_results[0].document == "No relevant documents found"):
    print("No relevant documents found.")
else:
    for item in search_results:
        meta_str = ", ".join([f"{k}={v}" for k, v in item.metadata.model_dump(exclude_none=True).items()]) if item.metadata else "None"
        print(f"- Doc: '{item.document[:50]}...', Score: {item.score:.2f}, Meta: [{meta_str}]")

--- Testing RAG Search ---
Search results for 'What is MCP?':
- Doc: 'Thinklytics focuses on strategic thinking and data...', Score: 0.33, Meta: [topic=Blog, created_at=2025-04-11T18:48:42.008786]
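
The only hit above is the Thinklytics document, not the MCP one. This follows from how `rag_search` scores: each query term is tested as a substring, so `is` matches inside "decision", while `mcp?` (question mark included) matches nothing. A minimal sketch of a whole-word variant is shown below. The `tokenize` and `token_overlap_score` helpers are hypothetical and not part of the original demo.

```python
import re
from typing import List

def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def token_overlap_score(query: str, content: str) -> float:
    """Fraction of query tokens that appear as whole words in the content."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    content_tokens = set(tokenize(content))
    return sum(1 for t in query_tokens if t in content_tokens) / len(query_tokens)

docs = [
    "MCP allows AI models to interact with external tools.",
    "RAG combines LLMs with information retrieval.",
    "Thinklytics focuses on strategic thinking and data analytics for decision making.",
]
for text in docs:
    print(f"{token_overlap_score('What is MCP?', text):.2f}  {text}")
```

With this scoring, only the MCP document gets a non-zero score for the query, because `mcp` is the one query token that appears as a whole word in any of the three documents.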


### 4.5 Test Reading Resource

In [10]:
print("--- Testing Read Resource ---")
resource_uri = "sample://data"
resource_raw = read_resource(resource_uri)
resource_data = parse_simulated_response(resource_raw, str)
print(f"Content of resource '{resource_uri}':\n{resource_data}")

--- Testing Read Resource ---
Content of resource 'sample://data':
This is sample data.
It contains multiple lines.


## 5. Conclusion

This notebook simulated the MCP RAG demo flow:
1. Defined data models (Pydantic).
2. Implemented server-side tools (Python functions) with in-memory RAG.
3. Executed client-side calls to test adding, listing, searching (RAG), and resource access.

This illustrates the structured request/response pattern that MCP defines between AI models and external capabilities, here without the network layer.