# The Context-Aware MAS Implementation

Copyright 2025, Denis Rothman

This notebook implements an educational Multi-Agent System (MAS) architecture that uses Retrieval-Augmented Generation (RAG) via Pinecone together with MCP.

It is the execution layer of our Context-Aware system. We'll build a complete MAS in which specialized agents collaborate to fulfill a high-level goal, putting the data we previously ingested into Pinecone to work. The core architecture cleanly separates procedural instructions (the how) from factual data (the what), so the same facts can be rendered in different styles under explicit control.

Here's a breakdown of the plan:

- **Agent Definitions:** We will code three specialized agents that form the core of the system:
  - The **Context Librarian** performs procedural RAG to fetch stylistic "Semantic Blueprints."
  - The **Researcher** uses factual RAG to retrieve and synthesize knowledge on a given topic.
  - The **Writer** intelligently combines the blueprint and the research to generate the final output.
- **The Orchestrator:** This agent acts as the manager. It uses an LLM to analyze a user's goal, breaking it down into distinct intent and topic queries for the other agents.
- **Agent Communication:** A simple Message Communication Protocol (MCP) is defined to ensure agents interact in a structured and traceable way.
- **End-to-End Execution:** We'll run several examples to demonstrate how the MAS can generate unique outputs for various topics by dynamically retrieving the correct context and knowledge.
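
Before wiring in Pinecone and the LLM, the plan above can be sketched offline. The following is a minimal illustration, not the notebook's implementation: `make_message`, the in-memory `BLUEPRINTS` and `FACTS` dictionaries, and the stub agents are all hypothetical stand-ins, and the fact text is a placeholder. It only shows how the "how" (blueprint) and the "what" (facts) travel separately until the Writer combines them.

```python
import json

def make_message(sender, content, metadata=None):
    """Wrap agent output in a small, uniform envelope (hypothetical helper)."""
    return {"sender": sender, "content": content, "metadata": metadata or {}}

# Hypothetical in-memory stand-ins for the two Pinecone namespaces.
BLUEPRINTS = {"suspense": {"instruction": "Build tension with short sentences."}}
FACTS = {"apollo 11": "Placeholder fact about the Apollo 11 landing."}

def librarian(msg):
    """Procedural lookup: return the blueprint (the 'how') as a JSON string."""
    default = {"instruction": "Write neutrally."}
    blueprint = BLUEPRINTS.get(msg["content"]["intent_query"], default)
    return make_message("Librarian", {"blueprint": json.dumps(blueprint)})

def researcher(msg):
    """Factual lookup: return the facts (the 'what') for the topic."""
    facts = FACTS.get(msg["content"]["topic_query"], "No data found.")
    return make_message("Researcher", {"facts": facts})

def writer(msg):
    """Combine blueprint and facts; a real Writer would call an LLM here."""
    blueprint = json.loads(msg["content"]["blueprint"])
    text = f"[{blueprint['instruction']}] {msg['content']['facts']}"
    return make_message("Writer", {"output": text})

def run_pipeline(intent_query, topic_query):
    """Orchestrate the three stub agents in the order described above."""
    how = librarian(make_message("Orchestrator", {"intent_query": intent_query}))
    what = researcher(make_message("Orchestrator", {"topic_query": topic_query}))
    task = {"blueprint": how["content"]["blueprint"], "facts": what["content"]["facts"]}
    return writer(make_message("Orchestrator", task))["content"]["output"]

print(run_pipeline("suspense", "apollo 11"))
```

Swapping the dictionary lookups for vector searches and the string formatting for an LLM call gives the shape of the system built in the rest of this notebook.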

# 1.Installation and Setup

In [1]:
# 1.Installation and Setup
# -------------------------------------------------------------------------
# We install specific versions for stability and reproducibility.
# We include tiktoken for token-based chunking and tenacity for robust API calls.

In [2]:
!pip install tqdm==4.67.1 --upgrade
!pip install openai==2.8.1
!pip install pinecone==7.0.0 tenacity==8.3.0

Collecting openai==2.8.1
  Downloading openai-2.8.1-py3-none-any.whl.metadata (29 kB)
Downloading openai-2.8.1-py3-none-any.whl (1.0 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.0/1.0 MB[0m [31m12.7 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.109.1
    Uninstalling openai-1.109.1:
      Successfully uninstalled openai-1.109.1
Successfully installed openai-2.8.1
Collecting pinecone==7.0.0
  Downloading pinecone-7.0.0-py3-none-any.whl.metadata (9.5 kB)
Collecting tenacity==8.3.0
  Downloading tenacity-8.3.0-py3-none-any.whl.metadata (1.2 kB)
Collecting pinecone-plugin-interface<0.0.8,>=0.0.7 (from pinecone==7.0.0)
  Downloading pinecone_plugin_interface-0.0.7-py3-none-any.whl.metadata (1.2 kB)
Downloading pinecone-7.0.0-py3-none-any.whl (516 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m516.3/516.3 kB[0m [31m7.8 MB/s[0m eta [36m0:00:00[0

In [3]:
# Imports and API Key Setup
# We will use the OpenAI library to interact with the LLM and Google Colab's
# secret manager to securely access your API key.

import os
from openai import OpenAI
from google.colab import userdata

# Load the API key from Colab secrets, set the env var, then init the client
try:
    api_key = userdata.get("API_KEY")
    if not api_key:
        raise userdata.SecretNotFoundError("API_KEY not found.")

    # Set environment variable for downstream tools/libraries
    os.environ["OPENAI_API_KEY"] = api_key

    # Create client (will read from OPENAI_API_KEY)
    client = OpenAI()
    print("OpenAI API key loaded and environment variable set successfully.")

except userdata.SecretNotFoundError:
    print('Secret "API_KEY" not found.')
    print('Please add your OpenAI API key to the Colab Secrets Manager.')
except Exception as e:
    print(f"An error occurred while loading the API key: {e}")

# Configuration
EMBEDDING_MODEL = "text-embedding-3-small"
EMBEDDING_DIM = 1536 # Dimension for text-embedding-3-small
GENERATION_MODEL = "gpt-5.1"

OpenAI API key loaded and environment variable set successfully.


In [4]:
# Imports for this notebook
import json
import time
from tqdm.auto import tqdm
import tiktoken
from pinecone import Pinecone, ServerlessSpec
from tenacity import retry, stop_after_attempt, wait_random_exponential
# general imports required in the notebooks of this book
import re
import textwrap
from IPython.display import display, Markdown
import copy

In [5]:
try:
    # Standard way to access secrets securely in Google Colab
    from google.colab import userdata
    PINECONE_API_KEY = userdata.get('PINECONE_API_KEY')
    if not PINECONE_API_KEY:
        raise ValueError("API Keys not found in Colab secrets.")
    print("API Keys loaded successfully.")
except ImportError:
    # Fallback for non-Colab environments (e.g., local Jupyter)
    PINECONE_API_KEY = os.environ.get('PINECONE_API_KEY')
    if not PINECONE_API_KEY:
        print("Warning: API Keys not found. Ensure environment variables are set.")

API Keys loaded successfully.


# 2.Initialize Clients

In [6]:
# 2.Initialize Clients
# The OpenAI client was created in the setup cell above.

# --- Initialize Pinecone Client ---
pc = Pinecone(api_key=PINECONE_API_KEY)

# --- Define Index and Namespaces (must match the names used during ingestion) ---
INDEX_NAME = 'genai-mas-mcp-ch3'
NAMESPACE_KNOWLEDGE = "KnowledgeStore"
NAMESPACE_CONTEXT = "ContextLibrary"
spec = ServerlessSpec(cloud='aws', region='us-east-1')

# Check if index exists
if INDEX_NAME not in pc.list_indexes().names():
    print(f"Index '{INDEX_NAME}' not found. Creating new serverless index...")
    pc.create_index(
        name=INDEX_NAME,
        dimension=EMBEDDING_DIM, # Make sure EMBEDDING_DIM is defined
        metric='cosine',
        spec=spec
    )
    # Wait for index to be ready
    while not pc.describe_index(INDEX_NAME).status['ready']:
        print("Waiting for index to be ready...")
        time.sleep(1)
    print("Index created successfully. It is new and empty.")
else:
    # This block runs ONLY if the index already existed.
    print(f"Index '{INDEX_NAME}' already exists.")

# Connect to the index for subsequent operations (new or existing)
index = pc.Index(INDEX_NAME)


Index 'genai-mas-mcp-ch3' already exists.
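
The index is created with `metric='cosine'`, so the match scores printed in the traces later in this notebook are cosine similarities between the query embedding and the stored vectors. As a rough illustration of what that number measures, here is a pure-Python sketch. It is not how Pinecone computes scores internally, and the three-dimensional vectors are toy values chosen for readability (real embeddings here have 1536 dimensions).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical); higher score means closer direction.
query = [1.0, 0.0, 1.0]
candidates = {
    "vec_a": [1.0, 0.0, 1.0],  # same direction as the query
    "vec_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "vec_c": [1.0, 1.0, 0.0],  # partially aligned
}

ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked)  # ['vec_a', 'vec_c', 'vec_b']
```

A `top_k=1` query corresponds to keeping only the first entry of such a ranking.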


# 3.Helper Functions (LLM, Embeddings, and MCP)

In [7]:
#3.Helper Functions (LLM, Embeddings, and MCP)
# -------------------------------------------------------------------------
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def call_llm(system_prompt, user_prompt, temperature=1, json_mode=False):
    """A centralized function to handle all LLM interactions with retries."""
    try:
        response_format = {"type": "json_object"} if json_mode else {"type": "text"}
        response = client.chat.completions.create(
            model=GENERATION_MODEL,
            response_format=response_format,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt}
            ],
            temperature=temperature
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        # Log and re-raise: returning an error string here would hide the
        # failure from @retry, so no retry would ever happen.
        # After the final attempt, tenacity raises a RetryError to the caller.
        print(f"Error calling LLM: {e}")
        raise

@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def get_embedding(text):
    """Generates embeddings for a single text query with retries."""
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=EMBEDDING_MODEL)
    return response.data[0].embedding

def create_mcp_message(sender, content, metadata=None):
    """Creates a standardized MCP message (Educational Version)."""
    return {
        "protocol_version": "1.1 (RAG-Enhanced)",
        "sender": sender,
        "content": content,
        "metadata": metadata or {}
    }

def display_mcp(message, title="MCP Message"):
    """Helper function to display MCP messages clearly during the trace."""
    print(f"\n--- {title} (Sender: {message['sender']}) ---")
    # Display content snippet or keys if content is complex
    if isinstance(message['content'], dict):
        print(f"Content Keys: {list(message['content'].keys())}")
    else:
        print(f"Content: {textwrap.shorten(str(message['content']), width=100)}")
    # Display metadata keys
    print(f"Metadata Keys: {list(message['metadata'].keys())}")
    print("-" * (len(title) + 25))

def query_pinecone(query_text, namespace, top_k=1):
    """Embeds the query text and searches the specified Pinecone namespace."""
    try:
        query_embedding = get_embedding(query_text)
        response = index.query(
            vector=query_embedding,
            namespace=namespace,
            top_k=top_k,
            include_metadata=True
        )
        return response['matches']
    except Exception as e:
        print(f"Error querying Pinecone (Namespace: {namespace}): {e}")
        return []

print("Helper functions and MCP structure defined.")

Helper functions and MCP structure defined.
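
Because every agent exchanges the same four-field dictionary, a receiving agent could check a message's shape before using it. The sketch below is an optional addition, not part of the notebook's protocol: `validate_mcp_message` and `REQUIRED_KEYS` are hypothetical names, and the required fields are simply those produced by `create_mcp_message`.

```python
# Fields emitted by create_mcp_message in this notebook.
REQUIRED_KEYS = {"protocol_version", "sender", "content", "metadata"}

def validate_mcp_message(message):
    """Return a list of problems found in an MCP-style message (empty if none)."""
    if not isinstance(message, dict):
        return ["message is not a dict"]
    problems = []
    missing = REQUIRED_KEYS - message.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if "sender" in message and not isinstance(message["sender"], str):
        problems.append("sender must be a string")
    if "metadata" in message and not isinstance(message["metadata"], dict):
        problems.append("metadata must be a dict")
    return problems

good = {
    "protocol_version": "1.1 (RAG-Enhanced)",
    "sender": "Orchestrator",
    "content": {"intent_query": "casual summary"},
    "metadata": {},
}
bad = {"sender": "Orchestrator", "content": "hello"}

print(validate_mcp_message(good))  # []
print(validate_mcp_message(bad))   # ["missing keys: ['metadata', 'protocol_version']"]
```

Returning a list of problems rather than raising keeps the check easy to log in a trace next to `display_mcp`.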


In [8]:
#@title 4.Agent Definitions
# -------------------------------------------------------------------------

# === 4.1. Context Librarian Agent (Procedural RAG) ===
def agent_context_librarian(mcp_message):
    """
    Retrieves the appropriate Semantic Blueprint from the Context Library.
    """
    print("\n[Librarian] Activated. Analyzing intent...")
    requested_intent = mcp_message['content']['intent_query']

    # Query Pinecone Context Namespace
    results = query_pinecone(requested_intent, NAMESPACE_CONTEXT, top_k=1)

    if results:
        match = results[0]
        print(f"[Librarian] Found blueprint '{match['id']}' (Score: {match['score']:.2f})")
        # Retrieve the blueprint JSON string stored in metadata
        blueprint_json = match['metadata']['blueprint_json']
        content = {"blueprint": blueprint_json}
    else:
        print("[Librarian] No specific blueprint found. Returning default.")
        # Fallback default
        content = {"blueprint": json.dumps({"instruction": "Generate the content neutrally."})}

    return create_mcp_message("Librarian", content)

# === 4.2. Researcher Agent (Factual RAG) ===
def agent_researcher(mcp_message):
    """
    Retrieves and synthesizes factual information from the Knowledge Base.
    """
    print("\n[Researcher] Activated. Investigating topic...")
    topic = mcp_message['content']['topic_query']

    # Query Pinecone Knowledge Namespace
    results = query_pinecone(topic, NAMESPACE_KNOWLEDGE, top_k=3)

    if not results:
        print("[Researcher] No relevant information found.")
        return create_mcp_message("Researcher", {"facts": "No data found."})

    # Synthesize the findings (Retrieve-and-Synthesize)
    print(f"[Researcher] Found {len(results)} relevant chunks. Synthesizing...")
    source_texts = [match['metadata']['text'] for match in results]

    system_prompt = """You are an expert research synthesis AI.
    Synthesize the provided source texts into a concise, bullet-pointed summary relevant to the user's topic. Focus strictly on the facts provided in the sources. Do not add outside information."""

    user_prompt = f"Topic: {topic}\n\nSources:\n" + "\n\n---\n\n".join(source_texts)

    findings = call_llm(system_prompt, user_prompt)

    return create_mcp_message("Researcher", {"facts": findings})

# === 4.3. Writer Agent (Generation) ===
def agent_writer(mcp_message):
    """
    Combines the factual research with the semantic blueprint to generate the final output.
    """
    print("\n[Writer] Activated. Applying blueprint to facts...")

    facts = mcp_message['content']['facts']
    # The blueprint is passed as a JSON string
    blueprint_json_string = mcp_message['content']['blueprint']

    # The Writer's System Prompt incorporates the dynamically retrieved blueprint
    system_prompt = f"""You are an expert content generation AI.
    Your task is to generate content based on the provided RESEARCH FINDINGS.
    Crucially, you MUST structure, style, and constrain your output according to the rules defined in the SEMANTIC BLUEPRINT provided below.

    --- SEMANTIC BLUEPRINT (JSON) ---
    {blueprint_json_string}
    --- END SEMANTIC BLUEPRINT ---

    Adhere strictly to the blueprint's instructions, style guides, and goals. The blueprint defines HOW you write; the research defines WHAT you write about.
    """

    user_prompt = f"""
    --- RESEARCH FINDINGS ---
    {facts}
    --- END RESEARCH FINDINGS ---

    Generate the content now.
    """

    # Generate the final content (uses call_llm's default temperature of 1)
    final_output = call_llm(system_prompt, user_prompt)

    return create_mcp_message("Writer", {"output": final_output})
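
To make the Writer's separation of concerns concrete, the sketch below assembles the two prompts without calling an LLM. It is an illustration under assumptions: `build_writer_prompts` is a hypothetical helper, the blueprint's field names (`scene_goal`, `style_guide`) are invented for this example since the real blueprints were ingested earlier and are not shown in this notebook, and the facts are placeholders. The blueprint lands only in the system prompt and the research only in the user prompt.

```python
import json

def build_writer_prompts(blueprint_json_string, facts):
    """Return (system_prompt, user_prompt) with the 'how' and the 'what' kept apart."""
    # Parse first so a malformed blueprint fails here, not inside the LLM call.
    blueprint = json.loads(blueprint_json_string)
    system_prompt = (
        "You are an expert content generation AI.\n"
        "--- SEMANTIC BLUEPRINT (JSON) ---\n"
        f"{json.dumps(blueprint, indent=2)}\n"
        "--- END SEMANTIC BLUEPRINT ---"
    )
    user_prompt = (
        "--- RESEARCH FINDINGS ---\n"
        f"{facts}\n"
        "--- END RESEARCH FINDINGS ---"
    )
    return system_prompt, user_prompt

# Hypothetical blueprint: the field names are made up for this illustration.
example_blueprint = json.dumps({
    "scene_goal": "Build tension before a resolution.",
    "style_guide": {"sentence_length": "short", "tone": "urgent"},
})
example_facts = "- Placeholder fact one.\n- Placeholder fact two."

system_prompt, user_prompt = build_writer_prompts(example_blueprint, example_facts)
print(system_prompt)
print(user_prompt)
```

Changing only `example_blueprint` changes how the output should be written while the facts stay untouched, which is the property the three-agent design relies on.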

In [9]:
#@title 5.The Orchestrator
# -------------------------------------------------------------------------

def orchestrator(high_level_goal):
    """
    Manages the workflow of the Context-Aware MAS.
    Analyzes the goal, retrieves context and facts, and coordinates generation.
    """
    print("=== [Orchestrator] Starting New Task ===")
    print(f"Goal: {high_level_goal}")

    # Step 0: Analyze Goal (Determine Intent and Topic)
    # We use the LLM to separate the desired style (intent) from the subject matter (topic).
    print("\n[Orchestrator] Analyzing Goal...")
    analysis_system_prompt = """You are an expert goal analyst. Analyze the user's high-level goal and extract two components:
    1. 'intent_query': A descriptive phrase summarizing the desired style, tone, or format, optimized for searching a context library (e.g., "suspenseful narrative blueprint", "objective technical explanation structure").
    2. 'topic_query': A concise phrase summarizing the factual subject matter required (e.g., "Juno mission objectives and power", "Apollo 11 landing details").

    Respond ONLY with a JSON object containing these two keys."""

    # We request JSON mode for reliable parsing
    analysis_result = call_llm(analysis_system_prompt, high_level_goal, json_mode=True)

    try:
        analysis = json.loads(analysis_result)
        intent_query = analysis['intent_query']
        topic_query = analysis['topic_query']
    except (json.JSONDecodeError, KeyError):
        print(f"[Orchestrator] Error: Could not parse analysis JSON. Raw Analysis: {analysis_result}. Aborting.")
        return

    print(f"Orchestrator: Intent Query: '{intent_query}'")
    print(f"Orchestrator: Topic Query: '{topic_query}'")


    # Step 1: Get the Context Blueprint (Procedural RAG)
    mcp_to_librarian = create_mcp_message(
        sender="Orchestrator",
        content={"intent_query": intent_query}
    )
    # display_mcp(mcp_to_librarian, "Orchestrator -> Librarian")
    mcp_from_librarian = agent_context_librarian(mcp_to_librarian)
    display_mcp(mcp_from_librarian, "Librarian -> Orchestrator")

    context_blueprint = mcp_from_librarian['content'].get('blueprint')
    if not context_blueprint: return

    # Step 2: Get the Factual Knowledge (Factual RAG)
    mcp_to_researcher = create_mcp_message(
        sender="Orchestrator",
        content={"topic_query": topic_query}
    )
    # display_mcp(mcp_to_researcher, "Orchestrator -> Researcher")
    mcp_from_researcher = agent_researcher(mcp_to_researcher)
    display_mcp(mcp_from_researcher, "Researcher -> Orchestrator")

    research_findings = mcp_from_researcher['content'].get('facts')
    if not research_findings: return

    # Step 3: Generate the Final Output
    # Combine the outputs for the Writer Agent
    writer_task = {
        "blueprint": context_blueprint,
        "facts": research_findings
    }

    mcp_to_writer = create_mcp_message(
        sender="Orchestrator",
        content=writer_task
    )
    # display_mcp(mcp_to_writer, "Orchestrator -> Writer")
    mcp_from_writer = agent_writer(mcp_to_writer)
    display_mcp(mcp_from_writer, "Writer -> Orchestrator")

    final_result = mcp_from_writer['content'].get('output')

    print("\n=== [Orchestrator] Task Complete ===")
    return final_result
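
The goal-analysis step depends on the LLM returning a JSON object with two usable keys. The orchestrator already handles invalid JSON and missing keys; the sketch below shows one way that check could be made slightly stricter, also rejecting non-object JSON and empty values. `parse_goal_analysis` is a hypothetical helper, not part of the notebook, and the sample strings are made up.

```python
import json

REQUIRED_FIELDS = ("intent_query", "topic_query")

def parse_goal_analysis(raw):
    """Return (intent_query, topic_query), or None if the reply is unusable."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None  # Not JSON at all, or not a string
    if not isinstance(data, dict):
        return None  # Valid JSON, but not an object
    values = []
    for key in REQUIRED_FIELDS:
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            return None  # Missing, wrong type, or blank
        values.append(value.strip())
    return tuple(values)

print(parse_goal_analysis('{"intent_query": "casual summary", "topic_query": "Mars rovers"}'))
print(parse_goal_analysis("this is not JSON"))
```

Returning `None` mirrors the orchestrator's existing behavior of aborting the task when the analysis cannot be used.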

# 6.Running examples


In [10]:
#@title Example 1: Requesting a specific style (Suspense) for a topic (Apollo 11)
print("********  1: SUSPENSEFUL NARRATIVE **********")
goal_1 = "Write a short, suspenseful scene for a children's story about the Apollo 11 moon landing, highlighting the danger."
result_1 = orchestrator(goal_1)

print("\n******** FINAL OUTPUT 1 **********\n")
print(result_1)

print("\n\n" + "="*50 + "\n\n")

********  1: SUSPENSEFUL NARRATIVE **********
=== [Orchestrator] Starting New Task ===
Goal: Write a short, suspenseful scene for a children's story about the Apollo 11 moon landing, highlighting the danger.

[Orchestrator] Analyzing Goal...
Orchestrator: Intent Query: 'short suspenseful children's story scene structure'
Orchestrator: Topic Query: 'Apollo 11 moon landing dangers'

[Librarian] Activated. Analyzing intent...
[Librarian] Found blueprint 'blueprint_suspense_narrative' (Score: 0.57)

--- Librarian -> Orchestrator (Sender: Librarian) ---
Content Keys: ['blueprint']
Metadata Keys: []
--------------------------------------------------

[Researcher] Activated. Investigating topic...
[Researcher] Found 2 relevant chunks. Synthesizing...

--- Researcher -> Orchestrator (Sender: Researcher) ---
Content Keys: ['facts']
Metadata Keys: []
---------------------------------------------------

[Writer] Activated. Applying blueprint to facts...

--- Writer -> Orchestrator (Sender: Writer

In [11]:
#@title Example 2: Requesting a different style (Technical) for another topic (Juno)
print("******** 2: TECHNICAL EXPLANATION **********")
goal_2 = "Provide a clear technical explanation of the Juno mission's objectives and how it is powered."
result_2 = orchestrator(goal_2)

print("\n******** FINAL OUTPUT 2 **********\n")
print(result_2)

print("\n\n" + "="*50 + "\n\n")

******** 2: TECHNICAL EXPLANATION **********
=== [Orchestrator] Starting New Task ===
Goal: Provide a clear technical explanation of the Juno mission's objectives and how it is powered.

[Orchestrator] Analyzing Goal...
Orchestrator: Intent Query: 'clear technical explanation structure'
Orchestrator: Topic Query: 'Juno mission objectives and power system'

[Librarian] Activated. Analyzing intent...
[Librarian] Found blueprint 'blueprint_technical_explanation' (Score: 0.49)

--- Librarian -> Orchestrator (Sender: Librarian) ---
Content Keys: ['blueprint']
Metadata Keys: []
--------------------------------------------------

[Researcher] Activated. Investigating topic...
[Researcher] Found 2 relevant chunks. Synthesizing...

--- Researcher -> Orchestrator (Sender: Researcher) ---
Content Keys: ['facts']
Metadata Keys: []
---------------------------------------------------

[Writer] Activated. Applying blueprint to facts...

--- Writer -> Orchestrator (Sender: Writer) ---
Content Keys: ['

In [12]:
#@title Example 3: Requesting a casual style (Casual) for a third topic (Mars rovers)
print("******** 3: CASUAL SUMMARY **********")
goal_3 = "Give me a quick, casual summary of what Mars rovers do."
result_3 = orchestrator(goal_3)

print("\n******** FINAL OUTPUT 3 **********\n")
print(result_3)

******** 3: CASUAL SUMMARY **********
=== [Orchestrator] Starting New Task ===
Goal: Give me a quick, casual summary of what Mars rovers do.

[Orchestrator] Analyzing Goal...
Orchestrator: Intent Query: 'quick casual explanatory summary'
Orchestrator: Topic Query: 'purpose and activities of Mars rovers'

[Librarian] Activated. Analyzing intent...
[Librarian] Found blueprint 'blueprint_casual_summary' (Score: 0.64)

--- Librarian -> Orchestrator (Sender: Librarian) ---
Content Keys: ['blueprint']
Metadata Keys: []
--------------------------------------------------

[Researcher] Activated. Investigating topic...
[Researcher] Found 2 relevant chunks. Synthesizing...

--- Researcher -> Orchestrator (Sender: Researcher) ---
Content Keys: ['facts']
Metadata Keys: []
---------------------------------------------------

[Writer] Activated. Applying blueprint to facts...

--- Writer -> Orchestrator (Sender: Writer) ---
Content Keys: ['output']
Metadata Keys: []
---------------------------------