# Agentic RAG Chat System for Telecommunication Standards Training

This notebook demonstrates the creation of an **agentic Retrieval Augmented Generation (RAG) chat system** using **Langchain** and a pre-existing **Pinecone** vector database. The purpose of this system is to act as a **training agent for telecommunication standard contributors**, allowing them to query the database and receive answers based on relevant documents.

The system is configured to use:
- A **Pinecone** database named "telecom-rag-index" for storing and retrieving document embeddings.
- The **'all-MiniLM-L6-v2'** embedding model for vectorizing queries and documents.
- An **open-source Mistral language model** (`EYEDOL/teleLLM` via HuggingFacePipeline) for generating human-like responses based on retrieved information.
- A **Langchain agent** that orchestrates the process, deciding when to use a tool (in this case, a RAG tool) to answer user queries.
- A **custom RAG tool** that queries the Pinecone database and retrieves the top 3 relevant document chunks.
- A **simple chat interface** for user interaction, which also displays the source documents used to generate the answer.

The notebook walks through the steps of:
1. Installing necessary libraries.
2. Initializing the connection to Pinecone and connecting to the specific index.
3. Setting up the embedding model and the language model.
4. Creating the RAG retriever and wrapping the RAG process as an agent tool.
5. Defining and initializing the Langchain agent with the RAG tool.
6. Implementing a chat loop to interact with the agent, including displaying retrieved source documents.

This system serves as a foundation for a conversational AI that can provide information and training on telecommunication standards by leveraging a domain-specific knowledge base stored in Pinecone.

# Task
Create an agentic RAG chat system using Langchain, connected to a Pinecone database named "telecom-rag-index" with the provided API key and environment. The system should use the 'all-MiniLM-L6-v2' embedding model and the teleLLM model on Hugging Face. The purpose of the system is to act as a training agent for telecommunication standard contributors.

## Install necessary libraries

### Subtask:
Install Langchain, Pinecone, and any other required libraries, including libraries for using Hugging Face models and potentially for building an agentic system with Langchain.


In [20]:
%pip install --upgrade langchain pinecone sentence-transformers transformers accelerate



## Log in to Hugging Face Hub

### Subtask:
Authenticate with the Hugging Face Hub using a personal access token.


**Reasoning**:
Log in to the Hugging Face Hub before loading models from it. The Pinecone connection is initialized in the next section.



In [21]:
from huggingface_hub import login

# Set this to your own Hugging Face access token
hf_token = ""

# Login using the token
login(token=hf_token)

print("Successfully logged into Hugging Face Hub!")

Successfully logged into Hugging Face Hub!


## Initialize Pinecone

### Subtask:
Initialize the connection to Pinecone using the provided API key and environment.

**Reasoning**:
Initialize the connection to Pinecone using the provided API key and environment so we can connect to the existing index.

In [23]:
from pinecone import Pinecone

PINECONE_API_KEY = ""
PINECONE_ENVIRONMENT = ""

# Initialize Pinecone
pc = Pinecone(api_key=PINECONE_API_KEY, environment=PINECONE_ENVIRONMENT)

print("Pinecone initialized successfully.")

Pinecone initialized successfully.


## Connect to Pinecone index

### Subtask:
Connect to the existing Pinecone index with the specified name ("telecom-rag-index").

**Reasoning**:
Connect to the existing Pinecone index to be able to retrieve documents for the RAG system.

In [24]:
PINECONE_INDEX_NAME = "telecom-rag-index"

# Connect to the existing index
index = pc.Index(PINECONE_INDEX_NAME)

print(f"Connected to Pinecone index: {PINECONE_INDEX_NAME}")

Connected to Pinecone index: telecom-rag-index


## Set up the embedding model

### Subtask:
Initialize the 'all-MiniLM-L6-v2' embedding model.

**Reasoning**:
Initialize the same embedding model that was used to create the embeddings in the Pinecone index. This is crucial for ensuring that the queries are embedded in the same vector space for effective retrieval.

In [25]:
!pip install langchain_huggingface



In [26]:
from langchain_huggingface import HuggingFaceEmbeddings

# Instantiate the 'all-MiniLM-L6-v2' embedding model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

print("Embedding model initialized successfully.")

Embedding model initialized successfully.
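
The requirement that queries and stored documents share one vector space can be partly checked before any retrieval happens. The sketch below is illustrative only: `check_embedding_dimension` is a hypothetical helper, and 384 is the output size of `all-MiniLM-L6-v2`. In the notebook the query vector would come from `embeddings.embed_query(...)` and the expected dimension from the Pinecone index configuration; here both are stand-in values so the snippet runs without network access.

```python
def check_embedding_dimension(query_vector, index_dimension):
    """Raise ValueError if the query vector cannot belong to the index's vector space."""
    if len(query_vector) != index_dimension:
        raise ValueError(
            f"Query has {len(query_vector)} dimensions, index expects {index_dimension}."
        )
    return True


# Stand-in values (assumptions for illustration, not read from the real index).
MINILM_DIMENSION = 384
fake_query_vector = [0.0] * MINILM_DIMENSION

dimension_ok = check_embedding_dimension(fake_query_vector, MINILM_DIMENSION)
print(dimension_ok)
```

A matching dimension is necessary but not sufficient: two different models can share an output size while producing incompatible spaces, so the model name used at indexing time still has to match.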


In [27]:
from langchain_huggingface import HuggingFacePipeline

# Instantiate an open-source language model using HuggingFacePipeline.
# Everything is wrapped in try/except so that a failed load leaves llm = None
# and later cells can detect it.
try:
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
    import torch

    model_id = "EYEDOL/teleLLM"

    # Load the tokenizer and model.
    # bfloat16 halves memory use relative to float32 but needs hardware support.
    # device_map="auto" lets accelerate place layers on the available devices (GPU if present).
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        device_map="auto",
    )

    # Create a text generation pipeline
    pipe = pipeline(
        "text-generation", # Keep text-generation for causal models
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=512, # Adjust as needed
        do_sample=True,
        temperature=0.7, # Adjust as needed
        top_k=50, # Adjust as needed
        top_p=0.95 # Adjust as needed
        # Add device specification if needed, e.g., device=0 for GPU
        # device=0 if torch.cuda.is_available() else -1
    )

    # Instantiate HuggingFacePipeline
    llm = HuggingFacePipeline(pipeline=pipe)

except Exception as e:
    print(f"Error setting up Language model with HuggingFacePipeline: {e}")
    llm = None
    print("Language model setup failed.")


Device set to use cuda:0
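
The `temperature`, `top_k`, and `top_p` arguments passed to the pipeline all shape which next tokens can be sampled. The following is a simplified sketch of the idea on a toy vocabulary, not the `transformers` implementation; the function name and the example logits are assumptions made for illustration.

```python
import math


def filter_next_token_distribution(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return the renormalized candidate distribution after temperature, top-k and top-p."""
    # Temperature: divide logits before the softmax; values below 1 sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    max_logit = max(scaled.values())
    exps = {tok: math.exp(value - max_logit) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top-k: keep only the k most probable tokens, then renormalize.
    ranked = ranked[:top_k]
    k_total = sum(p for _, p in ranked)
    ranked = [(tok, p / k_total) for tok, p in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    kept_total = sum(p for _, p in kept)
    return {tok: p / kept_total for tok, p in kept}


# Toy logits over a four-token vocabulary (made-up numbers).
toy_logits = {"QoS": 2.0, "KPI": 1.0, "latency": 0.0, "banana": -3.0}
candidates = filter_next_token_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.8)
print(candidates)
```

Lower `temperature`, smaller `top_k`, or smaller `top_p` each narrow the candidate set, which tends to make answers more repeatable at the cost of variety.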


## Set up the retriever

### Subtask:
Create a retriever from the Pinecone index using the embedding model.

**Reasoning**:
Create a retriever instance that the RAG system will use to search the Pinecone index for relevant documents based on user queries.

In [28]:
!pip install langchain_pinecone



In [29]:
from langchain_pinecone import PineconeVectorStore

# Create a PineconeVectorStore instance using the existing index and the embedding model
vectorstore = PineconeVectorStore(index=index, embedding=embeddings)

# Get a retriever from the vectorstore
# Configure the retriever to return the top 3 documents
retriever = vectorstore.as_retriever(search_kwargs={'k': 3})

print("Retriever set up successfully to return top 3 chunks.")

Retriever set up successfully to return top 3 chunks.
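
What "return the top 3 chunks" means can be sketched without Pinecone at all: score every stored vector against the query vector and keep the three best. The snippet below assumes cosine similarity as the metric, which is an assumption because the index configuration is not shown in this notebook; the chunk names and two-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, chunk_vectors, k=3):
    """Return the ids of the k chunks most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_vector, vector), chunk_id)
        for chunk_id, vector in chunk_vectors.items()
    ]
    scored.sort(reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Hypothetical chunk ids with toy 2-D embeddings.
toy_chunks = {
    "kpi-definitions": [1.0, 0.1],
    "reporting": [0.8, 0.6],
    "scope": [0.1, 1.0],
    "glossary": [-1.0, 0.0],
}
best_three = top_k_chunks([1.0, 0.0], toy_chunks, k=3)
print(best_three)
```

A vector database performs the same ranking with an approximate index so that it does not have to score every stored vector.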


## Build the agentic RAG system with Langchain

### Subtask:
Design and implement the agentic system using Langchain, incorporating the retriever and the language model to answer user queries about telecommunication standards.

**Reasoning**:
Now that we have the language model and the retriever, we can integrate them into an agentic system using Langchain. This will allow the system to not just retrieve documents but also reason and interact with the user in a more dynamic way. We will start by creating a tool that the agent can use to perform RAG lookups.

In [30]:
from langchain.agents import tool
from langchain.chains import RetrievalQA

# Create a RetrievalQA chain to be used as a tool by the agent.
# If the LLM setup failed (llm is None), the guard below skips tool creation,
# and later cells that reference telecommunication_qa will not work.
if llm and retriever:
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever
    )

    @tool
    def telecommunication_qa(query: str) -> str:
        """Answers questions about telecommunication standards by searching a knowledge base."""
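        # RetrievalQA takes its input under "query" and returns the answer under "result"
        return qa_chain.invoke({"query": query})["result"]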

    print("RetrievalQA chain and telecommunication_qa tool created.")
else:
    print("Cannot create RetrievalQA chain and tool because LLM or Retriever is not available.")

RetrievalQA chain and telecommunication_qa tool created.
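
The `"stuff"` chain type places all retrieved chunks into a single prompt together with the question. A minimal sketch of that idea follows; the template wording and the function name `build_stuff_prompt` are assumptions for illustration and are not LangChain's actual prompt.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate every retrieved chunk into one context block ahead of the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks standing in for the three documents the retriever returns.
example_chunks = ["First chunk text.", "Second chunk text.", "Third chunk text."]
stuffed_prompt = build_stuff_prompt("What does the standard require?", example_chunks)
print(stuffed_prompt)
```

Because every chunk is included verbatim, the prompt length grows with `k` and with chunk size, which is one reason to keep `k` small for a model with a limited context window.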


## Define the Agent

### Subtask:
Define a Langchain agent that can use the `telecommunication_qa` tool.

**Reasoning**:
Define the agent that will orchestrate the interaction, decide when to use the RAG tool, and generate the final response to the user.

In [31]:
from langchain.agents import AgentExecutor, initialize_agent, AgentType # Changed import: added initialize_agent and AgentType
from langchain_core.prompts import ChatPromptTemplate # Keep ChatPromptTemplate if needed for initialize_agent, though it might use a default prompt
# Removed create_tool_calling_agent as it's not compatible

# Define the tools the agent can use
tools = [telecommunication_qa]

# Define the prompt for the agent
# initialize_agent uses built-in prompts based on the agent_type, so a custom prompt might not be needed here initially.
# Let's remove the custom prompt for now and rely on the agent_type's default.
# prompt = ChatPromptTemplate.from_messages([
#     ("system", "You are a helpful AI assistant that answers questions about telecommunication standards using the provided tools."),
#     ("human", "{input}"),
#     ("placeholder", "{agent_scratchpad}"),
# ])

# Create the agent using initialize_agent
# We need to choose an agent type compatible with models that don't have .bind_tools
# 'zero-shot-react-description' is a common choice for models that can follow instructions and use tool descriptions.
# We also need to provide the LLM and the tools.
if llm and tools: # Ensure LLM and tools are available
    try:
        agent_executor = initialize_agent(
            tools,
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Using a different agent type
            verbose=True # Keep verbose for debugging
            # handle_parsing_errors=True # Optional: Add error handling for parsing if needed
        )
        print("Agent and AgentExecutor created using initialize_agent.")
    except Exception as e:
        print(f"Error creating agent with initialize_agent: {e}")
        agent_executor = None
else:
    print("Cannot create AgentExecutor because LLM or tools are not available.")

# Removed the separate agent creation and then AgentExecutor creation as initialize_agent does both.
# agent = create_tool_calling_agent(llm, tools, prompt)
# agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

Agent and AgentExecutor created using initialize_agent.
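
A ReAct-style agent depends on the model emitting text in a `Thought / Action / Action Input` layout that can be parsed back into a tool call. The sketch below shows one simplified way to do that parsing and to reject tool names that were never registered; it is not LangChain's parser, and `parse_react_step` and `known_tools` are hypothetical names.

```python
import re

# Matches an "Action:" line followed by an "Action Input:" line.
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_react_step(llm_text, known_tools):
    """Return ("finish", answer) or (tool_name, tool_input) parsed from model output."""
    if "Final Answer:" in llm_text:
        return ("finish", llm_text.split("Final Answer:")[-1].strip())
    match = ACTION_PATTERN.search(llm_text)
    if match is None:
        raise ValueError("Could not parse an action from the model output.")
    tool_name = match.group(1).strip()
    tool_input = match.group(2).strip().strip('"')
    if tool_name not in known_tools:
        raise ValueError(f"Model requested an unregistered tool: {tool_name}")
    return (tool_name, tool_input)


sample_output = (
    "Thought: I should search the knowledge base.\n"
    "Action: telecommunication_qa\n"
    'Action Input: "critical KPIs for major events"'
)
parsed_step = parse_react_step(sample_output, known_tools={"telecommunication_qa"})
print(parsed_step)
```

Smaller open models often drift from this layout or name tools that do not exist, which is when the commented-out `handle_parsing_errors=True` option in the cell above becomes relevant.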


## Create a chat interface

### Subtask:
Build a simple chat interface to interact with the agentic RAG system.

**Reasoning**:
Now that the agent is ready, I can create a simple loop to allow the user to input queries and get responses from the agentic RAG system.

## Create Custom Tools

### Subtask:
Define and implement custom tools for the agent.

**Reasoning**:
Define custom tools as Python functions and make them available to the agent using the `@tool` decorator.

In [32]:
from langchain.agents import tool

# Example of a simple custom tool
@tool
def get_current_time(query: str) -> str:
    """Returns the current time. Use this tool when the user asks about the current time."""
    from datetime import datetime
    # 'query' is unused here; it is kept because the ReAct agent passes a single string input to every tool
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# You can define more custom tools here following the same pattern.
# Make sure each tool function has a clear docstring explaining when the agent should use it.

print("Example custom tool 'get_current_time' created.")

Example custom tool 'get_current_time' created.
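
The reason each tool needs a clear docstring is that the description is what the agent reads when deciding which tool to call. The following stand-in for the `@tool` decorator illustrates that relationship; it is a simplified sketch, not LangChain's implementation, and `simple_tool`, `TOOL_REGISTRY`, `echo_upper`, and `render_tool_descriptions` are hypothetical names.

```python
TOOL_REGISTRY = {}


def simple_tool(func):
    """Register func under its name, using its docstring as the tool description."""
    if not func.__doc__:
        raise ValueError(f"Tool {func.__name__} needs a docstring to describe its use.")
    TOOL_REGISTRY[func.__name__] = {
        "description": func.__doc__.strip(),
        "run": func,
    }
    return func


@simple_tool
def echo_upper(query: str) -> str:
    """Returns the input text in upper case. Use this when the user asks for capitals."""
    return query.upper()


def render_tool_descriptions(registry):
    """Build the 'name: description' listing that would be shown to the model."""
    return "\n".join(
        f"{name}: {entry['description']}" for name, entry in registry.items()
    )


tool_listing = render_tool_descriptions(TOOL_REGISTRY)
print(tool_listing)
```

A vague or missing description gives the model nothing to distinguish one tool from another, so tool selection degrades even when the tool itself works.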


## Add Custom Tools to the Agent

### Subtask:
Include the custom tools in the list of tools available to the agent.

**Reasoning**:
Update the list of tools provided to the AgentExecutor to include the newly defined custom tools.

In [33]:
# Add your custom tools to the tools list
# Assuming 'telecommunication_qa' is already defined and you want to keep it.
# Replace 'get_current_time' with the names of your actual custom tool functions.
custom_tools = [
    telecommunication_qa, # Keep the RAG tool
    get_current_time # Add the example custom tool
    # Add other custom tool functions here
]

# Re-create the agent executor with the updated list of tools
if llm and custom_tools: # Ensure LLM and tools are available
    try:
        # Using initialize_agent again with the updated tools list
        agent_executor = initialize_agent(
            custom_tools, # Use the updated list of tools
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Keep the same agent type
            verbose=True # Keep verbose for debugging
            # handle_parsing_errors=True # Optional: Add error handling for parsing if needed
        )
        print("Agent and AgentExecutor updated with custom tools.")
    except Exception as e:
        print(f"Error updating agent with custom tools: {e}")
        agent_executor = None
else:
    print("Cannot update AgentExecutor because LLM or custom tools are not available.")

Agent and AgentExecutor updated with custom tools.
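
By default the executor's result holds only the input and the final output, so anything a tool returned along the way is not directly visible to the caller. One option, offered here as an assumption rather than the notebook's method, is to build the executor with `return_intermediate_steps=True`, which adds an `intermediate_steps` list of `(action, observation)` pairs to the result. The sketch below uses a stand-in `FakeAction` tuple and a made-up response so it runs without LangChain; `observations_from` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in for LangChain's AgentAction, which exposes .tool and .tool_input.
FakeAction = namedtuple("FakeAction", ["tool", "tool_input"])


def observations_from(response, tool_name):
    """Collect the raw outputs of every call the agent made to the given tool."""
    steps = response.get("intermediate_steps", [])
    return [observation for action, observation in steps if action.tool == tool_name]


# Made-up response shaped like an executor result with intermediate steps enabled.
fake_response = {
    "input": "example question",
    "output": "example final answer",
    "intermediate_steps": [
        (FakeAction("get_current_time", ""), "2025-01-01 12:00:00"),
        (
            FakeAction("telecommunication_qa", "example question"),
            "Example answer\n\nSources:\nSource 1: example.pdf",
        ),
    ],
}
rag_observations = observations_from(fake_response, "telecommunication_qa")
print(rag_observations)
```

This only surfaces source information if the tool's returned string already contains it, since the agent sees tool output as plain text.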


In [20]:
# Create chat loop
print("Chat with your agentic RAG system with sources (type 'quit' to exit):")
if 'agent_executor' in locals() and agent_executor is not None:
    while True:
        try:
            query = input("You: ")
            if query.lower() == "quit":
                break

            print("\nThinking...") # Indicate that the agent is processing

            # Use the agent_executor to handle the query
            # The verbose=True flag in agent_executor will show the agent's thought process and tool usage.
            response = agent_executor.invoke({"input": query}) # Use invoke with dictionary input

            # Extract the generated answer
            # The agent's final answer is typically in the 'output' key
            generated_answer = response.get('output', 'No answer generated.')
            print(f"\nBot: {generated_answer}")

            # Attempt to extract and print source documents.
            # AgentExecutor.invoke returns only 'input' and 'output' by default, and the
            # telecommunication_qa tool returns a plain string, so 'source_documents' is
            # normally absent from the response and the block below is skipped.
            source_docs = response.get('source_documents')

            if source_docs:
                print("\nSource Documents:")
                for i, doc in enumerate(source_docs):
                    print(f"Source {i+1}:")
                    print(f"  Content: {doc.page_content[:500]}...") # Print first 500 characters of content
                    if doc.metadata:
                        # Print metadata, format it nicely if possible
                        print(f"  Metadata: {doc.metadata}")
                    print("-" * 30) # Separator for sources
            elif source_docs is not None: # Key exists but holds an empty list
                print("\nNo specific source documents found for this query in the direct response.")


        except Exception as e:
            print(f"\nAn error occurred during agent execution: {e}")
            # Print the traceback for more details during debugging
            import traceback
            traceback.print_exc() # Print the full traceback
else:
    print("\nAgentExecutor is not available. Please ensure the previous steps were completed successfully.")


Chat with your agentic RAG system with sources (type 'quit' to exit):

Question: what are the critical KPIs for major events  
Thought: I need to pinpoint where ITU‑T E.811 defines “critical KPIs for major events.”  
Action: pdf_loader  
Action Input: "/mnt/data/T-REC-E.811-201703-I!!PDF-E.pdf"  
Observation: PDF loaded with 50 pages.  

Thought: Search the table of contents for the relevant clause.  
Action: vector_store_search  
Action Input: query="major events KPIs clause", k=3  
Observation:  
1) "Clause 8.1 Key Performance Indicators for Major Events" (Page 22)  
2) "Clause 3 Definitions" (Page 5)  
3) "Clause 9 Reporting" (Page 25)  

Thought: Extract the full text of Clause 8.1.  
Action: text_extractor  
Action Input: page_nums=[22]  
Observation: Raw text of Clause 8.1.  

Thought: Now distill just the KPI list.  
Action: summarizer  
Action Input: text="…Clause 8.1 raw text…"  
Observation:  
> “voice block call rate; data session block rate; voice call drop rate; download/

# Task
Create a multi-agent system using Langchain for gap identification in telecommunication standards, drawing from a Pinecone knowledge base ("telecom-rag-index"). The system should consist of five agents: The Cartographer (topic modeling/summarization), The Analyst (comparative analysis), The Hypothesiser (hypothesis generation), The Verifier (evidence gathering), and The Synthesiser (report generation). Use the Pinecone index with the provided API key and environment, the 'all-MiniLM-L6-v2' embedding model, and an open-source teleLLM Model on Hugging Face. The system should output a structured report detailing identified gaps, supporting evidence, and implications.
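
The five roles can be read as a pipeline in which each agent consumes the previous agent's output. The sketch below shows that hand-off with plain functions standing in for the agents; the strictly sequential order and the `run_pipeline` helper are assumptions for illustration, and a real stage would call an agent executor instead of formatting a string.

```python
def run_pipeline(topic, stages):
    """Run the stages in order; each stage receives the previous stage's output."""
    artifact = topic
    trace = []
    for name, stage in stages:
        artifact = stage(artifact)
        trace.append((name, artifact))
    return artifact, trace


# Placeholder stages: each one just wraps its input so the data flow is visible.
stages = [
    ("Cartographer", lambda topic: f"themes({topic})"),
    ("Analyst", lambda themes: f"comparison({themes})"),
    ("Hypothesiser", lambda comparison: f"hypotheses({comparison})"),
    ("Verifier", lambda hypotheses: f"evidence({hypotheses})"),
    ("Synthesiser", lambda evidence: f"report({evidence})"),
]

final_report, trace = run_pipeline("5G security", stages)
print(final_report)
```

Keeping the per-stage trace makes it possible to inspect what each agent produced when the final report looks wrong.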

## Set up the environment

### Subtask:
Ensure necessary libraries (Langchain, Pinecone, Hugging Face libraries for LLMs, etc.) are installed and dependencies are met.


**Reasoning**:
The subtask is to ensure necessary libraries are installed. The provided code cell executes the installation command. After the execution, I will check the output to confirm successful installation.



In [1]:
%pip install --upgrade langchain pinecone sentence-transformers transformers accelerate

Collecting pinecone
  Downloading pinecone-7.3.0-py3-none-any.whl.metadata (9.5 kB)
Collecting sentence-transformers
  Downloading sentence_transformers-5.0.0-py3-none-any.whl.metadata (16 kB)
Collecting transformers
  Downloading transformers-4.53.1-py3-none-any.whl.metadata (40 kB)
Collecting pinecone-plugin-assistant<2.0.0,>=1.6.0 (from pinecone)
  Downloading pinecone_plugin_assistant-1.7.0-py3-none-any.whl.metadata (28 kB)
Collecting pinecone-plugin-interface<0.0.8,>=0.0.7 (from pinecone)
  Downloading pinecone_plugin_interface-0.0.7-py3-none-any.whl.metadata (1.2 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.11.0->sentence-transformers)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.11.0->sentence-transformers)
  Downloading nvidia_cud

## Initialize pinecone and connect to index

### Subtask:
Initialize the connection to Pinecone using the provided API key and environment, and connect to the existing Pinecone index named "telecom-rag-index".


**Reasoning**:
Initialize the connection to Pinecone using the provided API key and environment, and connect to the existing Pinecone index.



In [2]:
from pinecone import Pinecone
import os

# Read the API key and environment from environment variables.
# Never hardcode credentials in a notebook; set PINECONE_API_KEY before running this cell.
PINECONE_API_KEY = os.environ.get("PINECONE_API_KEY", "")
PINECONE_ENVIRONMENT = os.environ.get("PINECONE_ENVIRONMENT", "us-east-1-aws")
PINECONE_INDEX_NAME = "telecom-rag-index"

# Initialize Pinecone
pc = Pinecone(api_key=PINECONE_API_KEY, environment=PINECONE_ENVIRONMENT)

# Connect to the existing index
index = pc.Index(PINECONE_INDEX_NAME)

print(f"Connected to Pinecone index: {PINECONE_INDEX_NAME}")

Connected to Pinecone index: telecom-rag-index
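
Since credentials should not be hardcoded, it helps to fail early and clearly when an expected environment variable is missing. A minimal sketch follows; `require_env` is a hypothetical helper, and `EXAMPLE_SERVICE_KEY` is a made-up variable name set inside the snippet only so that it runs on its own.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set.")
    return value


# Demonstration only: a placeholder value, not a real credential.
os.environ["EXAMPLE_SERVICE_KEY"] = "not-a-real-key"
example_key = require_env("EXAMPLE_SERVICE_KEY")
print("Key loaded:", bool(example_key))
```

In the notebook, the same helper could be called with `"PINECONE_API_KEY"` so that a missing key stops the cell before any request is sent.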


## Set up shared resources

### Subtask:
Configure the embedding model ('all-MiniLM-L6-v2') and a suitable language model (teleLLM) that will be used by the agents.


**Reasoning**:
Instantiate the embedding model and initialize the language model using HuggingFacePipeline as per the instructions.



**Reasoning**:
The previous attempt to import `langchain_huggingface` failed because the module was not found. Although a previous cell attempted to install it, the environment might have reset or the installation was incomplete. Reinstalling the necessary libraries should resolve this issue.



In [3]:
%pip install --upgrade langchain pinecone sentence-transformers transformers accelerate langchain_huggingface

Collecting langchain_huggingface
  Downloading langchain_huggingface-0.3.0-py3-none-any.whl.metadata (996 bytes)
Downloading langchain_huggingface-0.3.0-py3-none-any.whl (27 kB)
Installing collected packages: langchain_huggingface
Successfully installed langchain_huggingface-0.3.0


**Reasoning**:
The libraries have been successfully reinstalled, including `langchain_huggingface`. Now, I can proceed with instantiating the embedding model and the language model as originally intended.



In [5]:
from huggingface_hub import login

# Set this to your own Hugging Face access token
hf_token = ""

# Login using the token
login(token=hf_token)

print("Successfully logged into Hugging Face Hub!")

Successfully logged into Hugging Face Hub!


In [6]:
from langchain_huggingface import HuggingFacePipeline, HuggingFaceEmbeddings
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

# Instantiate the 'all-MiniLM-L6-v2' embedding model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

print("Embedding model 'all-MiniLM-L6-v2' initialized successfully.")

# Initialize a suitable open-source language model using HuggingFacePipeline
# Using the same model as in previous execution for consistency
try:
    model_id = "EYEDOL/teleLLM"

    # Load the tokenizer and model.
    # bfloat16 halves memory use relative to float32 but needs hardware support.
    # device_map="auto" lets accelerate place layers on the available devices (GPU if present).
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        device_map="auto",
    )

    # Create a text generation pipeline
    pipe = pipeline(
        "text-generation", # Keep text-generation for causal models
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=512, # Adjust as needed
        do_sample=True,
        temperature=0.7, # Adjust as needed
        top_k=50, # Adjust as needed
        top_p=0.95 # Adjust as needed
        # Add device specification if needed, e.g., device=0 for GPU
        # device=0 if torch.cuda.is_available() else -1
    )

    # Instantiate HuggingFacePipeline
    llm = HuggingFacePipeline(pipeline=pipe)

    print(f"Model set up using HuggingFacePipeline with task 'text-generation'.")

except Exception as e:
    print(f"Error setting up Language model with HuggingFacePipeline: {e}")
    llm = None
    print("Language model setup failed.")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Embedding model 'all-MiniLM-L6-v2' initialized successfully.


Device set to use cuda:0


Model set up using HuggingFacePipeline with task 'text-generation'.


## Define core tools

### Subtask:
Create essential tools that multiple agents might need, such as the RAG retrieval tool (`telecommunication_qa`) to query the Pinecone knowledge base.


**Reasoning**:
Create a RAG retriever from the Pinecone index using the initialized embedding model.



In [7]:
!pip install langchain_pinecone

Collecting langchain_pinecone
  Downloading langchain_pinecone-0.2.8-py3-none-any.whl.metadata (5.3 kB)
Collecting langchain-tests<1.0.0,>=0.3.7 (from langchain_pinecone)
  Downloading langchain_tests-0.3.20-py3-none-any.whl.metadata (3.3 kB)
Collecting langchain-openai>=0.3.11 (from langchain_pinecone)
  Downloading langchain_openai-0.3.27-py3-none-any.whl.metadata (2.3 kB)
Collecting pytest-asyncio<1,>=0.20 (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading pytest_asyncio-0.26.0-py3-none-any.whl.metadata (4.0 kB)
Collecting syrupy<5,>=4 (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading syrupy-4.9.1-py3-none-any.whl.metadata (38 kB)
Collecting pytest-socket<1,>=0.6.0 (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading pytest_socket-0.7.0-py3-none-any.whl.metadata (6.7 kB)
Collecting pytest-benchmark (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading pytest_benchmark-5.1.0-py3-none-any.whl.metadata (25 kB)


In [8]:
from langchain_pinecone import PineconeVectorStore

# Check if the 'index' and 'embeddings' objects are available
if 'index' in locals() and index is not None and 'embeddings' in locals() and embeddings is not None:
    # Create a PineconeVectorStore instance using the existing index and the embedding model
    vectorstore = PineconeVectorStore(index=index, embedding=embeddings)

    # Get a retriever from the vectorstore
    # Configure the retriever to return the top 3 documents as specified in the original task
    retriever = vectorstore.as_retriever(search_kwargs={'k': 3})

    print("Retriever set up successfully to return top 3 chunks.")
else:
    retriever = None
    print("Cannot set up retriever because Pinecone index or embeddings are not available.")

Retriever set up successfully to return top 3 chunks.


**Reasoning**:
Create the RAG retrieval tool for the agent to query the Pinecone knowledge base.



In [9]:
from langchain.agents import tool
from langchain.chains import RetrievalQA

# Check if the 'llm' and 'retriever' objects are available
if 'llm' in locals() and llm is not None and 'retriever' in locals() and retriever is not None:
    # Create a RetrievalQA chain instance
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # Use the "stuff" chain type
        retriever=retriever,
        return_source_documents=True # Include source documents in the output
    )

    # Define a Python function and decorate it with @tool
    @tool
    def telecommunication_qa(query: str) -> str:
        """Answers questions about telecommunication standards by searching a knowledge base and returns the answer with source documents."""
        # Invoke the qa_chain with the query
        result = qa_chain.invoke({"query": query})
        # With return_source_documents=True, RetrievalQA returns a dictionary:
        # 'result' holds the generated answer and 'source_documents' holds
        # the retrieved documents. Both are flattened into one string here
        # because the tool returns a string.
        answer = result.get("result", "Could not find an answer.")
        source_docs = result.get("source_documents", [])
        sources_text = ""
        if source_docs:
            sources_text = "\n\nSources:"
            for i, doc in enumerate(source_docs):
                sources_text += f"\nSource {i+1}: {doc.metadata.get('source', 'N/A')}"
                # Optionally include content snippet: sources_text += f"\nContent: {doc.page_content[:200]}..."

        return f"{answer}{sources_text}"


    print("RetrievalQA chain and telecommunication_qa tool created successfully.")
else:
    print("Cannot create RetrievalQA chain and tool because LLM or Retriever is not available.")

RetrievalQA chain and telecommunication_qa tool created successfully.


## Develop agent 1: the cartographer

### Subtask:
Develop The Cartographer agent responsible for mapping the thematic landscape of the telecommunication standards, potentially involving topic modeling and summarization.


**Reasoning**:
Define the role and objectives of The Cartographer agent, identify and implement its tools, and create the agent logic using Langchain's agent framework.



In [10]:
from langchain.agents import AgentExecutor, initialize_agent, AgentType
from langchain_core.prompts import ChatPromptTemplate

# 1. Define the role and objectives of The Cartographer agent.
# Role: The Cartographer Agent
# Objectives:
# - Understand the overall scope or area of interest in telecommunication standards.
# - Explore the Pinecone knowledge base to identify key themes, topics, and prevalent concepts within the specified area.
# - Potentially perform topic modeling or identify clusters of related documents.
# - Generate a thematic overview or structured summary of the identified themes and their relationships.
# - Pass this thematic map or summary to the next agent (The Analyst).

# 2. Identify or create specific tools this agent will use.
# - telecommunication_qa: Already defined, useful for initial exploration and getting summaries of specific concepts.
# - (Potential future tools): Tools for more sophisticated topic modeling, clustering, or summarizing larger document sets. For this iteration, we will primarily rely on telecommunication_qa for initial exploration and the agent's LLM capabilities for synthesis.

# 3. Implement the agent's logic using Langchain's agent framework.
# 4. Define how this agent will process input and produce output.

# Input: An initial prompt defining the broad area of telecommunication standards to explore (e.g., "security in 5G networks", "IoT connectivity standards").
# Output: A structured summary or thematic map of the identified topics and their relationships within the specified area.

# Let's create a simpler agent using existing tools for this iteration.
# The Cartographer will use the telecommunication_qa tool to explore the topic
# and the LLM's reasoning ability to synthesize a thematic overview.

# Define the tools The Cartographer can use
# Assuming 'telecommunication_qa' is available from previous steps
cartographer_tools = [telecommunication_qa]

# Define the prompt for The Cartographer agent
# This prompt guides the agent to act as a cartographer of knowledge.
cartographer_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are The Cartographer, an AI agent specializing in mapping the thematic landscape of telecommunication standards.
    Your goal is to identify key themes, topics, and concepts within a given area of telecommunication standards.
    You have access to a knowledge base through the 'telecommunication_qa' tool.
    Use this tool to explore relevant concepts and documents based on the user's query.
    Synthesize the information you retrieve to create a structured overview or thematic map of the key areas and their relationships.
    Your output should be a clear summary of the main topics and their connections, focusing on identifying the landscape of information available.
    Do not attempt to identify gaps yourself; focus solely on understanding and describing the existing thematic landscape.
    """),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])


# Create The Cartographer agent executor
# Using initialize_agent with a suitable agent type for tool use and reasoning.
# 'zero-shot-react-description' is a good choice here.
if 'llm' in locals() and llm is not None and cartographer_tools: # Ensure LLM and tools are available
    try:
        # initialize_agent builds its own ReAct prompt from the agent type and the tool descriptions,
        # so the cartographer_prompt defined above is not used by this call.
        # To apply the persona, pass its text via agent_kwargs={"prefix": ...} or switch to a custom runnable agent.
        the_cartographer_agent = initialize_agent(
            cartographer_tools,
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Using a ReAct agent type
            verbose=True, # Keep verbose for debugging
            handle_parsing_errors=True # Add error handling for parsing
        )
        print("The Cartographer agent created successfully.")
    except Exception as e:
        print(f"Error creating The Cartographer agent: {e}")
        the_cartographer_agent = None
        print("The Cartographer agent creation failed.")
else:
    the_cartographer_agent = None
    print("Cannot create The Cartographer agent because LLM or tools are not available.")


The Cartographer agent created successfully.
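
Because `initialize_agent` with `ZERO_SHOT_REACT_DESCRIPTION` assembles its own ReAct prompt, the Cartographer persona text does not reach the agent on its own. One possible way to carry it in is the `prefix` entry of `agent_kwargs`. The sketch below only builds that dictionary. The names `CARTOGRAPHER_PERSONA` and `build_react_agent_kwargs` are hypothetical, and the commented call at the end assumes the legacy `initialize_agent` API used in this notebook.

```python
# Hypothetical helper: turns a persona description into agent_kwargs for a
# zero-shot ReAct agent. The prefix text is placed before the tool list,
# so it should end by introducing the tools.
CARTOGRAPHER_PERSONA = (
    "You are The Cartographer, an AI agent specializing in mapping the "
    "thematic landscape of telecommunication standards. "
    "Describe the existing themes and their relationships; do not identify gaps."
)


def build_react_agent_kwargs(persona: str) -> dict:
    """Return agent_kwargs whose 'prefix' carries the persona text."""
    prefix = persona.strip() + "\n\nYou have access to the following tools:"
    return {"prefix": prefix}


cartographer_agent_kwargs = build_react_agent_kwargs(CARTOGRAPHER_PERSONA)
print(cartographer_agent_kwargs["prefix"])

# Assumed usage with the objects defined earlier in this notebook:
# the_cartographer_agent = initialize_agent(
#     cartographer_tools, llm,
#     agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
#     agent_kwargs=cartographer_agent_kwargs,
#     verbose=True, handle_parsing_errors=True,
# )
```

Keeping the persona in a plain string makes it easy to reuse the same text if the agent is later rebuilt with a different constructor.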




**Reasoning**:
The Cartographer agent has been defined. The next step is to define how it will process input and produce output as a standalone component, ready to pass its output to the next agent in the multi-agent system pipeline.



In [11]:
# 4. Define how this agent will process input and produce output.

# Define a function to run The Cartographer agent
def run_cartographer_agent(input_query: str) -> str:
    """
    Runs The Cartographer agent with a given input query.

    Args:
        input_query: The query defining the area of telecommunication standards to explore.

    Returns:
        A string representing the thematic overview or structured summary generated by the agent.
        Returns an error message if the agent is not initialized or execution fails.
    """
    if the_cartographer_agent is not None:
        try:
            print(f"Running The Cartographer for query: '{input_query}'")
            # Invoke the agent with the input query.
            # The agent's output will be in the 'output' key of the result dictionary.
            result = the_cartographer_agent.invoke({"input": input_query})
            thematic_overview = result.get('output', 'The Cartographer could not generate a thematic overview.')
            print("The Cartographer finished processing.")
            return thematic_overview
        except Exception as e:
            print(f"An error occurred while running The Cartographer: {e}")
            # Print the traceback for debugging
            import traceback
            traceback.print_exc()
            return f"Error: An error occurred while running The Cartographer: {e}"
    else:
        return "Error: The Cartographer agent is not initialized."

# Example of how to use the function (for testing purposes)
# Note: This is just for demonstration; the actual agent will be part of a larger workflow.
# print("\n--- Testing The Cartographer ---")
# example_query = "Overview of security standards in telecommunication networks"
# cartographer_output = run_cartographer_agent(example_query)
# print("\nCartographer Output:")
# print(cartographer_output)
# print("--- End of Cartographer Test ---")

print("Function 'run_cartographer_agent' defined.")


Function 'run_cartographer_agent' defined.
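
The invoke-and-extract pattern inside `run_cartographer_agent` (check for `None`, call `.invoke`, read the `output` key, convert exceptions to an error string) can be factored into a small helper so that later agents reuse it. The following is a minimal sketch, not the notebook's implementation: `invoke_agent`, `EchoAgent`, and `FailingAgent` are hypothetical names, and the stand-in classes only mimic the `.invoke({"input": ...})` interface of an agent executor.

```python
import traceback
from typing import Any, Optional


def invoke_agent(agent: Optional[Any], prompt: str, name: str, fallback: str) -> str:
    """Invoke an agent-like object and return its 'output' text.

    `agent` is assumed to expose .invoke({"input": ...}) and return a dict.
    """
    if agent is None:
        return f"Error: {name} agent is not initialized."
    try:
        result = agent.invoke({"input": prompt})
        return result.get("output", fallback)
    except Exception as exc:
        # Report the failure as text so a calling pipeline can decide what to do.
        traceback.print_exc()
        return f"Error: An error occurred while running {name}: {exc}"


class EchoAgent:
    """Stand-in for an agent executor, used only to exercise the helper."""

    def invoke(self, inputs: dict) -> dict:
        return {"output": f"overview of {inputs['input']}"}


class FailingAgent:
    """Stand-in that always raises, to show the error path."""

    def invoke(self, inputs: dict) -> dict:
        raise RuntimeError("model unavailable")


print(invoke_agent(EchoAgent(), "5G security", "The Cartographer", "no overview"))
print(invoke_agent(None, "5G security", "The Cartographer", "no overview"))
```

With such a helper, each `run_*_agent` function would only need to build its prompt and pass the matching agent object.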


## Develop agent 2: the analyst

### Subtask:
Develop The Analyst agent responsible for detailed comparative analysis of the thematic overview from The Cartographer to uncover inconsistencies, outdated standards, and underexplored areas.


**Reasoning**:
I need to define and initialize The Analyst agent, including its role, objectives, necessary tools, and then create a function to run this agent with the thematic overview from The Cartographer as input. This involves importing necessary Langchain components, defining the agent's tools (starting with `telecommunication_qa`), initializing the agent, and creating the `run_analyst_agent` function. I will use the `initialize_agent` function with `AgentType.ZERO_SHOT_REACT_DESCRIPTION` as used for The Cartographer.



In [12]:
from langchain.agents import initialize_agent, AgentType
import traceback

# Ensure the necessary tools (at least telecommunication_qa) are available
# telecommunication_qa was defined in a previous step

# Define the tools available to The Analyst. Initially, it uses telecommunication_qa.
# We will add other tools later if needed and if they are developed.
analyst_tools = [telecommunication_qa]

# Check if the LLM and tools are available before initializing the agent
if 'llm' in locals() and llm is not None and analyst_tools:
    try:
        # Initialize The Analyst agent
        the_analyst_agent = initialize_agent(
            analyst_tools,
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Use a suitable agent type
            verbose=True, # Keep verbose to see the agent's thought process
            handle_parsing_errors=True # Add error handling for parsing
        )
        print("The Analyst agent initialized successfully.")
    except Exception as e:
        print(f"Error initializing The Analyst agent: {e}")
        the_analyst_agent = None
else:
    the_analyst_agent = None
    print("Cannot initialize The Analyst agent because LLM or tools are not available.")


# Define a function to run The Analyst agent
def run_analyst_agent(thematic_overview: str) -> str:
    """
    Runs The Analyst agent to perform comparative analysis on the thematic overview.

    Args:
        thematic_overview: A string containing the thematic overview generated by The Cartographer.

    Returns:
        A string detailing the findings of the comparative analysis (inconsistencies, gaps, etc.).
        Returns an error message if the agent is not initialized or execution fails.
    """
    if the_analyst_agent is not None:
        try:
            print(f"\nRunning The Analyst with thematic overview:\n{thematic_overview[:500]}...") # Print snippet of input

            # Construct the prompt for the analyst
            # This prompt instructs the agent on how to perform the comparative analysis
            analyst_input = f"""Analyze the following thematic overview of telecommunication standards to identify inconsistencies, outdated standards, and underexplored or missing areas (gaps). Use your tools to gather more details on specific topics as needed to support your analysis.

Thematic Overview:
{thematic_overview}

Based on this overview and your knowledge base, provide a detailed analysis including:
1. Identified inconsistencies or conflicts between standards or topics.
2. Areas where standards appear outdated or potentially superseded by newer technologies or requirements.
3. Underexplored areas or potential gaps in the documented standards.
4. Provide brief justification or reference where possible using information from your tools.

Present your findings in a structured format.
"""

            # Invoke the agent with the analysis prompt
            # The agent's output will be in the 'output' key of the result dictionary.
            result = the_analyst_agent.invoke({"input": analyst_input})
            analysis_findings = result.get('output', 'The Analyst could not generate analysis findings.')

            print("The Analyst finished processing.")
            return analysis_findings

        except Exception as e:
            print(f"An error occurred while running The Analyst: {e}")
            traceback.print_exc() # Print traceback for debugging
            return f"Error: An error occurred while running The Analyst: {e}"
    else:
        return "Error: The Analyst agent is not initialized."

print("Function 'run_analyst_agent' defined.")

The Analyst agent initialized successfully.
Function 'run_analyst_agent' defined.
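
Both `run_cartographer_agent` and `run_analyst_agent` report failures as strings that start with `Error:`. If those strings are passed along unchecked, the next agent would analyse an error message as if it were a thematic overview. The sketch below shows one way to stop a chain at the first error string. `is_error_output`, `run_stages`, and the three stand-in stage functions are hypothetical names introduced for illustration.

```python
from typing import Callable


def is_error_output(text: str) -> bool:
    """True when a stage returned one of the notebook's 'Error: ...' strings."""
    return text.strip().startswith("Error:")


def run_stages(initial_input: str, stages: list[Callable[[str], str]]) -> str:
    """Feed each stage's output to the next, stopping at the first error string."""
    current = initial_input
    for stage in stages:
        current = stage(current)
        if is_error_output(current):
            break  # do not hand an error message to the next agent as content
    return current


# Stand-in stages; in the notebook these would be run_cartographer_agent
# and run_analyst_agent.
def fake_cartographer(query: str) -> str:
    return f"Themes for {query}: authentication, slicing"


def fake_analyst(overview: str) -> str:
    return f"Analysis of [{overview}]"


def broken_cartographer(query: str) -> str:
    return "Error: The Cartographer agent is not initialized."


print(run_stages("5G security", [fake_cartographer, fake_analyst]))
print(run_stages("5G security", [broken_cartographer, fake_analyst]))
```

A prefix check is fragile if a model ever begins a legitimate answer with "Error:", so raising exceptions or returning a small result object would be a sturdier design if the pipeline grows.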


## Develop agent 3: the hypothesiser

### Subtask:
Develop The Hypothesiser agent responsible for transforming analytical findings from The Analyst into testable hypotheses or probing questions about potential gaps.


**Reasoning**:
Define the Hypothesiser agent's role and objectives, identify its tools (initially relying on `telecommunication_qa`), and implement its logic using `initialize_agent`. Also, define the `run_hypothesiser_agent` function to encapsulate the agent's execution, construct its prompt, invoke the agent, and handle output and errors.



In [13]:
from langchain.agents import initialize_agent, AgentType
import traceback

# 1. Define the role and objectives of The Hypothesiser agent.
# Role: Transform analytical findings from The Analyst into testable hypotheses or probing questions about potential gaps.
# Objectives:
# - Take structured analysis findings as input.
# - Review the findings, focusing on identified inconsistencies, outdated areas, and gaps.
# - Formulate specific, clear, and testable hypotheses or pointed questions based on these findings.
# - Ensure the hypotheses/questions are actionable and can guide further investigation by The Verifier.

# 2. Identify or create tools specific to this agent.
# Initially, The Hypothesiser will primarily use the core telecommunication_qa tool
# to potentially refine or check preliminary ideas or details related to the analysis findings.
hypothesiser_tools = [telecommunication_qa] # Assuming telecommunication_qa is already defined and available

# 3. Implement the agent's logic using initialize_agent.
# Check if the LLM and tools are available before initializing the agent
if 'llm' in locals() and llm is not None and hypothesiser_tools:
    try:
        # Initialize The Hypothesiser agent
        the_hypothesiser_agent = initialize_agent(
            hypothesiser_tools,
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Use a suitable agent type
            verbose=True, # Keep verbose to see the agent's thought process
            handle_parsing_errors=True # Add error handling for parsing
        )
        print("The Hypothesiser agent initialized successfully.")
    except Exception as e:
        print(f"Error initializing The Hypothesiser agent: {e}")
        the_hypothesiser_agent = None
else:
    the_hypothesiser_agent = None
    print("Cannot initialize The Hypothesiser agent because LLM or tools are not available.")

# 4-8. Define a Python function run_hypothesiser_agent.
def run_hypothesiser_agent(analysis_findings: str) -> str:
    """
    Runs The Hypothesiser agent to generate testable hypotheses or probing questions
    based on the analysis findings.

    Args:
        analysis_findings: A string containing the detailed analysis findings from The Analyst.

    Returns:
        A string listing the generated hypotheses/questions.
        Returns an error message if the agent is not initialized or execution fails.
    """
    if the_hypothesiser_agent is not None:
        try:
            print(f"\nRunning The Hypothesiser with analysis findings:\n{analysis_findings[:500]}...") # Print snippet of input

            # 5. Construct a clear prompt for The Hypothesiser agent.
            hypothesiser_input = f"""Based on the following analysis findings about potential inconsistencies, outdated areas, or gaps in telecommunication standards, generate a list of specific, clear, and testable hypotheses or probing questions. These should guide further investigation to confirm or explore these potential gaps.

Analysis Findings:
{analysis_findings}

Generate a numbered list of hypotheses or questions. Each item should be concise and focused on a single potential gap or inconsistency. Ensure they are formulated in a way that can be investigated or tested.

Example:
1. Hypothesis: Standard X is outdated because it does not address the security requirements of 5G slicing.
2. Question: Does the lack of specification for inter-device communication in Standard Y create a security vulnerability for IoT deployments?

Generate the list now:
"""

            # 6. Invoke the initialized the_hypothesiser_agent within the function.
            result = the_hypothesiser_agent.invoke({"input": hypothesiser_input})

            # 7. Extract the generated hypotheses/questions from the 'output' key of the result dictionary.
            generated_hypotheses = result.get('output', 'The Hypothesiser could not generate hypotheses.')

            print("The Hypothesiser finished processing.")

            # 8. Errors raised above are handled by the enclosing try...except block.
            return generated_hypotheses

        except Exception as e:
            print(f"An error occurred while running The Hypothesiser: {e}")
            traceback.print_exc() # Print traceback for debugging
            return f"Error: An error occurred while running The Hypothesiser: {e}"
    else:
        return "Error: The Hypothesiser agent is not initialized."

print("Function 'run_hypothesiser_agent' defined.")

The Hypothesiser agent initialized successfully.
Function 'run_hypothesiser_agent' defined.
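
`run_hypothesiser_agent` returns a single string containing a numbered list, so any consumer that expects one hypothesis or question per entry needs that string split into items first. The following is a minimal sketch of such a splitter. `parse_numbered_items` is a hypothetical helper, and it assumes items start with `1.` or `1)` at the beginning of a line, as requested in the Hypothesiser prompt.

```python
import re

# Matches the start of a numbered item such as "1. " or "2) " at line start.
_ITEM_START = re.compile(r"^\s*\d+[.)]\s+")


def parse_numbered_items(text: str) -> list[str]:
    """Split a numbered list into items, joining wrapped continuation lines."""
    items: list[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if _ITEM_START.match(line):
            items.append(_ITEM_START.sub("", line, count=1).strip())
        elif items:
            items[-1] += " " + stripped  # continuation of the previous item
        # Text before the first numbered item (e.g. a preamble) is ignored.
    return items


sample_output = """Here are the hypotheses:
1. Hypothesis: Standard X is outdated because it does not address
   the security requirements of 5G slicing.
2. Question: Does Standard Y leave inter-device communication unspecified?
"""

for item in parse_numbered_items(sample_output):
    print("-", item)
```

Model output does not always follow the requested format, so an empty result from this parser is worth treating as a signal to inspect the raw text rather than as "no hypotheses".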


## Develop agent 4: the verifier

### Subtask:
Develop The Verifier agent responsible for testing hypotheses from The Hypothesiser through targeted semantic search and evidence gathering from the Pinecone knowledge base.


**Reasoning**:
Define the role and objectives of The Verifier agent, identify its tools (primarily `telecommunication_qa`), implement its logic using `initialize_agent`, and define the `run_verifier_agent` function that iterates through hypotheses, constructs prompts, invokes the agent, collects results, and includes error handling.



In [14]:
from langchain.agents import initialize_agent, AgentType
import traceback

# 1. Define the role and objectives of The Verifier agent.
# Role: Test hypotheses or answer probing questions generated by The Hypothesiser.
# Objectives:
# - Take a list of hypotheses/questions as input.
# - For each hypothesis/question, use available tools to search the knowledge base for supporting or contradictory evidence.
# - Summarize the findings for each hypothesis/question, explicitly stating whether evidence was found.
# - Reference the source documents for any evidence found.
# - Collect and return the verification results for all hypotheses/questions.

# 2. Identify or create tools specific to this agent.
# The Verifier will heavily rely on the telecommunication_qa tool to search the knowledge base.
verifier_tools = [telecommunication_qa] # Assuming telecommunication_qa is already defined and available

# 3. Implement the agent's logic using initialize_agent.
# Check if the LLM and tools are available before initializing the agent
if 'llm' in locals() and llm is not None and verifier_tools:
    try:
        # Initialize The Verifier agent
        the_verifier_agent = initialize_agent(
            verifier_tools,
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Use a suitable agent type
            verbose=True, # Keep verbose to see the agent's thought process
            handle_parsing_errors=True # Add error handling for parsing
        )
        print("The Verifier agent initialized successfully.")
    except Exception as e:
        print(f"Error initializing The Verifier agent: {e}")
        the_verifier_agent = None
else:
    the_verifier_agent = None
    print("Cannot initialize The Verifier agent because LLM or tools are not available.")

# 4-8. Define a Python function run_verifier_agent.
def run_verifier_agent(hypotheses_questions: list[str]) -> list[dict]:
    """
    Runs The Verifier agent to test hypotheses or answer probing questions
    by searching the knowledge base for evidence.

    Args:
        hypotheses_questions: A list of strings, where each string is a hypothesis or question.

    Returns:
        A list of dictionaries, where each dictionary contains the original
        hypothesis/question and the verification result (evidence summary and sources).
        Returns an empty list and prints an error message if the agent is not
        initialized. If verifying an individual item fails, that item's
        verification result contains the error message instead.
    """
    verification_results = []

    if the_verifier_agent is not None:
        print("\nRunning The Verifier for hypotheses/questions...")
        for i, item in enumerate(hypotheses_questions):
            print(f"\n--- Verifying Item {i+1}/{len(hypotheses_questions)} ---")
            print(f"Item: {item}")

            try:
                # 6. Construct a prompt for The Verifier agent.
                # Instruct the agent to use its tools to find evidence for/against the hypothesis
                # or to answer the question. Ask it to summarize findings and reference sources.
                verifier_input = f"""Investigate the following statement or question using your available tools:

Statement/Question: {item}

Search your knowledge base for evidence that supports or contradicts this statement, or to find the answer to the question.

Based on your search:
1. State explicitly whether relevant evidence was found ("Evidence Found" or "No Sufficient Evidence Found").
2. Summarize the evidence found, if any.
3. Reference the source documents for the evidence (e.g., "Source: Document Title, Section").

Present your findings clearly for this item.
"""

                # 7. Invoke the initialized the_verifier_agent for each item.
                # The agent's output will be in the 'output' key of the result dictionary.
                result = the_verifier_agent.invoke({"input": verifier_input})
                verification_finding = result.get('output', 'The Verifier could not generate verification findings.')

                # 8. Collect the verification results.
                verification_results.append({
                    "item": item,
                    "verification_result": verification_finding
                })

                print(f"Verification for Item {i+1} finished.")

            except Exception as e:
                print(f"An error occurred while verifying item '{item}': {e}")
                traceback.print_exc() # Print traceback for debugging
                verification_results.append({
                    "item": item,
                    "verification_result": f"Error: An error occurred during verification: {e}"
                })

        print("\n--- The Verifier finished processing all items ---")
        return verification_results
    else:
        print("Error: The Verifier agent is not initialized.")
        return []

print("Function 'run_verifier_agent' defined.")

# Example of how to use the function (for testing purposes - requires hypotheses_questions list)
# Note: This is just for demonstration; the actual agent will be part of a larger workflow.
# print("\n--- Testing The Verifier ---")
# example_hypotheses = [
#     "Standard X is outdated because it does not address the security requirements of 5G slicing.",
#     "Does the lack of specification for inter-device communication in Standard Y create a security vulnerability for IoT deployments?",
#     "What are the key performance indicators for major events in telecommunication networks?",
#     "Is there any evidence in the standards for using blockchain in network management?"
# ]
# verifier_output = run_verifier_agent(example_hypotheses)
# print("\nVerifier Output:")
# for res in verifier_output:
#     print(f"Item: {res['item']}")
#     print(f"Result: {res['verification_result']}")
#     print("-" * 50)
# print("--- End of Verifier Test ---")


The Verifier agent initialized successfully.
Function 'run_verifier_agent' defined.
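
The Verifier prompt asks the agent to state "Evidence Found" or "No Sufficient Evidence Found" explicitly, which makes it possible to sort the returned dictionaries by verdict before further processing. The sketch below is one way to do that with plain string checks. `classify_verdict`, `tally_verdicts`, and the sample data are hypothetical, and real model output may not always contain either label.

```python
from collections import Counter


def classify_verdict(verification_result: str) -> str:
    """Map a Verifier finding to 'evidence', 'no_evidence', 'error', or 'unclear'."""
    text = verification_result.strip().lower()
    if text.startswith("error:"):
        return "error"
    # Check the negative label first, because it contains the positive one.
    if "no sufficient evidence found" in text:
        return "no_evidence"
    if "evidence found" in text:
        return "evidence"
    return "unclear"


def tally_verdicts(results: list[dict]) -> Counter:
    """Count how many verification results fall into each verdict class."""
    return Counter(classify_verdict(r.get("verification_result", "")) for r in results)


# Invented sample data in the shape returned by run_verifier_agent.
sample_results = [
    {"item": "H1", "verification_result": "Evidence Found: Document Y covers slicing security."},
    {"item": "Q2", "verification_result": "No Sufficient Evidence Found: Standard Y is silent here."},
    {"item": "H3", "verification_result": "Error: An error occurred during verification: timeout"},
]

print(tally_verdicts(sample_results))
```

Items classified as `unclear` are the ones where the agent ignored the requested labels and are probably worth reading manually.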


## Develop agent 5: the synthesiser

### Subtask:
Develop The Synthesiser agent responsible for compiling findings and evidence from The Verifier into a structured, prioritized report on identified gaps.


**Reasoning**:
Define the role and objectives of The Synthesiser agent, identify its tools (relying on the LLM's capabilities), and implement its logic using `initialize_agent`. Also, define the `run_synthesiser_agent` function which will construct the prompt and invoke the agent with the verification results.



In [15]:
from langchain.agents import initialize_agent, AgentType
import traceback

# 1. Define the role and objectives of The Synthesiser agent.
# Role: Compile findings and evidence from The Verifier into a structured, prioritized report on identified gaps.
# Objectives:
# - Take a list of structured verification results (output from The Verifier) as input.
# - Consolidate the findings, focusing on items where evidence of a gap, inconsistency, or lack of information was highlighted.
# - Prioritize potential gaps based on the strength of evidence or implied significance.
# - Structure the information into a coherent report format.
# - Generate the final report output for the user.

# 2. Identify or create tools specific to this agent.
# The Synthesiser will primarily use the llm's inherent ability to process text, synthesize information, and format output.
# For this iteration, no explicit tools beyond the LLM's capabilities are required for synthesis and reporting.
synthesiser_tools = [] # No specific tools needed for this agent's core task in this iteration.

# 3. Implement the agent's logic using initialize_agent.
# Check if the LLM is available before initializing the agent
if 'llm' in locals() and llm is not None:
    try:
        # Initialize The Synthesiser agent
        # Note: AgentType.ZERO_SHOT_REACT_DESCRIPTION requires at least one tool, so passing an empty list raises an error here (see the cell output).
        the_synthesiser_agent = initialize_agent(
            synthesiser_tools, # Pass the empty tools list
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Use a suitable agent type
            verbose=True, # Keep verbose to see the agent's thought process
            handle_parsing_errors=True # Add error handling for parsing
        )
        print("The Synthesiser agent initialized successfully.")
    except Exception as e:
        print(f"Error initializing The Synthesiser agent: {e}")
        the_synthesiser_agent = None
else:
    the_synthesiser_agent = None
    print("Cannot initialize The Synthesiser agent because LLM is not available.")

# 4-10. Define a Python function run_synthesiser_agent.
def run_synthesiser_agent(verification_results: list[dict]) -> str:
    """
    Runs The Synthesiser agent to compile verification findings into a structured report.

    Args:
        verification_results: A list of dictionaries, where each dictionary contains
                              the original hypothesis/question and the verification result
                              (evidence summary and sources) from The Verifier.

    Returns:
        A string containing the generated structured report on identified gaps,
        or an error message if the agent is not initialized or execution fails.
    """
    if the_synthesiser_agent is not None:
        print("\nRunning The Synthesiser to generate report...")

        # 6. Construct a clear and detailed prompt for The Synthesiser agent.
        # Format the verification results into a readable string for the prompt.
        verification_results_str = ""
        if verification_results:
            for i, res in enumerate(verification_results):
                verification_results_str += f"--- Item {i+1} ---\n"
                verification_results_str += f"Original Item: {res.get('item', 'N/A')}\n"
                verification_results_str += f"Verification Finding: {res.get('verification_result', 'No result provided.')}\n"
                verification_results_str += "-" * 20 + "\n"
        else:
            verification_results_str = "No verification results were provided."


        synthesiser_input = f"""Compile a structured report based on the following verification results regarding potential gaps or inconsistencies in telecommunication standards.

Review the Verification Results below. Focus on items where evidence of a gap, inconsistency, or lack of sufficient information was noted. Prioritize the most significant potential gaps based on the findings.

Verification Results:
{verification_results_str}

Based on your review, generate a report with the following structure:

Report Title: Analysis of Potential Gaps in Telecommunication Standards

1.  Introduction: Briefly state the purpose of this report, which is to synthesize findings from an analysis and verification process aimed at identifying potential gaps in telecommunication standards based on a knowledge base.
2.  Identified Potential Gaps: For each item where evidence suggested a potential gap, inconsistency, or area lacking sufficient information, create a subsection.
    *   Subtitle: Summarize the potential gap or the area of focus (e.g., "Gap in 5G Slicing Security Specifications").
    *   Original Hypothesis/Question: State the original hypothesis or question that was investigated.
    *   Verification Summary: Briefly summarize the findings from the verification step, including whether relevant evidence was found and what that evidence indicated (or that evidence was lacking).
    *   Supporting Evidence/Sources: List or reference the source documents or specific pieces of evidence mentioned in the verification findings.
    *   Potential Implications: Briefly comment on the possible significance or impact of this potential gap (e.g., security risks, interoperability issues, missed opportunities).
3.  Conclusion: Provide a brief concluding summary of the key potential gaps identified and the overall findings.

Ensure the report is clear, concise, well-structured, and easy to read.
"""

        try:
            # 7. Invoke the initialized the_synthesiser_agent.
            result = the_synthesiser_agent.invoke({"input": synthesiser_input})

            # 8. Extract the generated report from the 'output' key of the result dictionary.
            generated_report = result.get('output', 'The Synthesiser could not generate the report.')

            print("The Synthesiser finished processing.")

            # 9. Errors raised above are handled by the enclosing try...except block.
            return generated_report

        except Exception as e:
            print(f"An error occurred while running The Synthesiser: {e}")
            traceback.print_exc() # Print traceback for debugging
            return f"Error: An error occurred while running The Synthesiser: {e}"
    else:
        print("Error: The Synthesiser agent is not initialized.")
        return "Error: The Synthesiser agent is not initialized."

print("Function 'run_synthesiser_agent' defined.")

# Example of how to use the function (for testing purposes - requires example_verification_results list)
# Note: This is just for demonstration; the actual agent will be part of a larger workflow.
# print("\n--- Testing The Synthesiser ---")
# # Example verification results format (simulated)
# example_verification_results = [
#     {
#         "item": "Hypothesis: Standard X is outdated because it does not address the security requirements of 5G slicing.",
#         "verification_result": """Evidence Found: Found a document discussing the need for enhanced security in 5G slicing that is not covered by Standard X's scope.
# Source: Document Y, Section Z"""
#     },
#     {
#         "item": "Question: Does the lack of specification for inter-device communication in Standard Y create a security vulnerability for IoT deployments?",
#         "verification_result": """No Sufficient Evidence Found: Searched for information on inter-device communication specifications in Standard Y and potential vulnerabilities. The standard is largely silent on this specific interaction type.
# Source: Standard Y (reviewed clauses on device interaction)"""
#     },
#      {
#         "item": "Hypothesis: Standard A comprehensively covers quantum-resistant cryptography for telecommunications.",
#         "verification_result": """No Sufficient Evidence Found: Found mentions of future considerations for quantum computing impacts, but no specific requirements or recommendations for quantum-resistant cryptography in Standard A.
# Source: Standard A, Foreword"""
#     }
# ]
#
# synthesiser_report = run_synthesiser_agent(example_verification_results)
# print("\nSynthesiser Report:")
# print(synthesiser_report)
# print("--- End of Synthesiser Test ---")

Error initializing The Synthesiser agent: Got no tools for ZeroShotAgent. At least one tool must be provided.
Function 'run_synthesiser_agent' defined.


**Reasoning**:
The previous attempt to initialize The Synthesiser failed because the `ZERO_SHOT_REACT_DESCRIPTION` agent type requires at least one tool, even when the LLM does all the work through prompt interpretation. There are two ways to keep an agent here: provide a dummy tool, or use an agent type that does not strictly require tools. A simpler alternative would be to call the LLM directly with a structured prompt inside `run_synthesiser_agent`, bypassing `initialize_agent` entirely, since synthesis needs no complex tool use. Because the subtask requires implementing the agent's logic with `langchain.agents.initialize_agent`, I will add a minimal dummy tool that performs no real action but allows the agent to initialize. I will then redefine the agent and the `run_synthesiser_agent` function.
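
For comparison, the direct-LLM alternative mentioned above could look roughly like the following. This is a sketch under stated assumptions rather than the approach taken in this notebook: `format_verification_results`, `synthesise_report`, and `fake_generate` are hypothetical names, and `generate` stands for any callable that maps a prompt string to a completion string.

```python
from typing import Callable


def format_verification_results(results: list[dict]) -> str:
    """Render Verifier results as a readable block for inclusion in a prompt."""
    if not results:
        return "No verification results were provided."
    blocks = []
    for i, res in enumerate(results, start=1):
        blocks.append(
            f"--- Item {i} ---\n"
            f"Original Item: {res.get('item', 'N/A')}\n"
            f"Verification Finding: {res.get('verification_result', 'No result provided.')}"
        )
    return "\n".join(blocks)


def synthesise_report(generate: Callable[[str], str], results: list[dict]) -> str:
    """Build the synthesis prompt and pass it to a text-generation callable."""
    prompt = (
        "Compile a structured report on potential gaps in telecommunication "
        "standards from the verification results below.\n\n"
        f"Verification Results:\n{format_verification_results(results)}"
    )
    return generate(prompt)


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the sketch."""
    return f"REPORT based on {prompt.count('--- Item')} item(s)"


example_results = [
    {"item": "Hypothesis A", "verification_result": "Evidence Found: ..."},
    {"item": "Question B", "verification_result": "No Sufficient Evidence Found: ..."},
]
print(synthesise_report(fake_generate, example_results))
```

With the notebook's `llm` object, `generate` could plausibly be `llm.invoke`, assuming it returns a plain string as LangChain's text-completion wrappers generally do. This route avoids the ReAct parsing loop entirely, at the cost of no longer being an `initialize_agent` agent.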



In [16]:
from langchain.agents import initialize_agent, AgentType, tool
import traceback

# Define a simple dummy tool to satisfy the agent's requirement for at least one tool.
@tool
def dummy_tool(query: str) -> str:
    """A dummy tool that does nothing. Used to satisfy agent requirements."""
    return "Dummy tool executed."

# 1. Define the role and objectives of The Synthesiser agent. (Already defined in the previous attempt)
# Role: Compile findings and evidence from The Verifier into a structured, prioritized report on identified gaps.
# Objectives:
# - Take a list of structured verification results (output from The Verifier) as input.
# - Consolidate the findings, focusing on items where evidence of a gap, inconsistency, or lack of information was highlighted.
# - Prioritize potential gaps based on the strength of evidence or implied significance.
# - Structure the information into a coherent report format.
# - Generate the final report output for the user.

# 2. Identify or create tools specific to this agent.
# The Synthesiser will primarily use the llm's inherent ability to process text, synthesize information, and format output.
# Add the dummy tool to the tools list to allow agent initialization.
synthesiser_tools = [dummy_tool] # Include the dummy tool

# 3. Implement the agent's logic using initialize_agent.
# Check if the LLM is available before initializing the agent
if 'llm' in locals() and llm is not None:
    try:
        # Initialize The Synthesiser agent
        # Using AgentType.ZERO_SHOT_REACT_DESCRIPTION with the dummy tool.
        the_synthesiser_agent = initialize_agent(
            synthesiser_tools, # Pass the tools list including the dummy tool
            llm,
            agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # Use a suitable agent type
            verbose=True, # Keep verbose to see the agent's thought process
            handle_parsing_errors=True # Add error handling for parsing
        )
        print("The Synthesiser agent initialized successfully with dummy tool.")
    except Exception as e:
        print(f"Error initializing The Synthesiser agent: {e}")
        the_synthesiser_agent = None
else:
    the_synthesiser_agent = None
    print("Cannot initialize The Synthesiser agent because LLM is not available.")

# 4-10. Define a Python function run_synthesiser_agent. (Redefine the function to use the potentially re-initialized agent)
def run_synthesiser_agent(verification_results: list[dict]) -> str:
    """
    Runs The Synthesiser agent to compile verification findings into a structured report.

    Args:
        verification_results: A list of dictionaries, where each dictionary contains
                              the original hypothesis/question and the verification result
                              (evidence summary and sources) from The Verifier.

    Returns:
        A string containing the generated structured report on identified gaps,
        or an error message if the agent is not initialized or execution fails.
    """
    if the_synthesiser_agent is not None:
        print("\nRunning The Synthesiser to generate report...")

        # 6. Construct a clear and detailed prompt for The Synthesiser agent.
        # Format the verification results into a readable string for the prompt.
        verification_results_str = ""
        if verification_results:
            for i, res in enumerate(verification_results):
                verification_results_str += f"--- Item {i+1} ---\n"
                verification_results_str += f"Original Item: {res.get('item', 'N/A')}\n"
                verification_results_str += f"Verification Finding: {res.get('verification_result', 'No result provided.')}\n"
                verification_results_str += "-" * 20 + "\n"
        else:
            verification_results_str = "No verification results were provided."


        synthesiser_input = f"""Compile a structured report based on the following verification results regarding potential gaps or inconsistencies in telecommunication standards.

Review the Verification Results below. Focus on items where evidence of a gap, inconsistency, or lack of sufficient information was noted. Prioritize the most significant potential gaps based on the findings.

Verification Results:
{verification_results_str}

Based on your review, generate a report with the following structure:

Report Title: Analysis of Potential Gaps in Telecommunication Standards

1.  Introduction: Briefly state the purpose of this report, which is to synthesize findings from an analysis and verification process aimed at identifying potential gaps in telecommunication standards based on a knowledge base.
2.  Identified Potential Gaps: For each item where evidence suggested a potential gap, inconsistency, or area lacking sufficient information, create a subsection.
    *   Subtitle: Summarize the potential gap or the area of focus (e.g., "Gap in 5G Slicing Security Specifications").
    *   Original Hypothesis/Question: State the original hypothesis or question that was investigated.
    *   Verification Summary: Briefly summarize the findings from the verification step, including whether relevant evidence was found and what that evidence indicated (or that evidence was lacking).
    *   Supporting Evidence/Sources: List or reference the source documents or specific pieces of evidence mentioned in the verification findings.
    *   Potential Implications: Briefly comment on the possible significance or impact of this potential gap (e.g., security risks, interoperability issues, missed opportunities).
3.  Conclusion: Provide a brief concluding summary of the key potential gaps identified and the overall findings.

Ensure the report is clear, concise, well-structured, and easy to read.
"""

        # 7-9. Invoke the agent, extract the report, and handle errors.
        try:
            # The agent's final report is in the 'output' key of the result dictionary.
            result = the_synthesiser_agent.invoke({"input": synthesiser_input})
            generated_report = result.get('output', 'The Synthesiser could not generate the report.')

            print("The Synthesiser finished processing.")
            return generated_report

        except Exception as e:
            print(f"An error occurred while running The Synthesiser: {e}")
            traceback.print_exc() # Print traceback for debugging
            return f"Error: An error occurred while running The Synthesiser: {e}"
    else:
        return "Error: The Synthesiser agent is not initialized."

print("Function 'run_synthesiser_agent' defined.")

The Synthesiser agent initialized successfully with dummy tool.
Function 'run_synthesiser_agent' defined.
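
The prompt-formatting step inside `run_synthesiser_agent` does not need an LLM, so it can be checked in isolation. The sketch below pulls that loop into a hypothetical helper, `format_verification_results` (not part of the notebook), and runs it on a made-up item shaped like the example data.

```python
def format_verification_results(verification_results: list[dict]) -> str:
    """Render Verifier output as the text block embedded in the Synthesiser prompt."""
    if not verification_results:
        return "No verification results were provided."
    parts = []
    for i, res in enumerate(verification_results, start=1):
        parts.append(f"--- Item {i} ---")
        parts.append(f"Original Item: {res.get('item', 'N/A')}")
        parts.append(f"Verification Finding: {res.get('verification_result', 'No result provided.')}")
        parts.append("-" * 20)
    return "\n".join(parts) + "\n"


# Illustrative data only; the keys match those used by run_synthesiser_agent.
sample = [{"item": "Hypothesis: Standard A covers topic X.",
           "verification_result": "No Sufficient Evidence Found."}]
formatted = format_verification_results(sample)
print(formatted)
```

Collecting the lines in a list and joining once produces the same text as repeated string concatenation, and keeping the helper separate makes the empty-input case easy to verify.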


## Orchestrate the multi-agent workflow

### Subtask:
Design and implement the overall workflow that connects the five agents (Cartographer, Analyst, Hypothesiser, Verifier, Synthesiser), managing the inputs and outputs between them to perform the gap identification process from an initial query to the final report.


**Reasoning**:
The workflow is a single main function that calls the five agent functions in sequence, passing the output of each as the input to the next. Every step is wrapped in error handling, so the function returns a descriptive error message as soon as any agent fails instead of continuing with bad input.
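
The sequential hand-off described above can also be expressed as a loop over named stages, which avoids repeating the same try/except block for every agent. This is a sketch of that pattern, not the notebook's implementation: `run_pipeline` and the stub stages are hypothetical, and it assumes every stage signals failure by returning a string that starts with "Error:".

```python
from typing import Any, Callable

Stage = tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: list[Stage], initial_input: Any) -> Any:
    """Feed each stage's output into the next; stop at the first failure."""
    data = initial_input
    for name, stage_fn in stages:
        try:
            data = stage_fn(data)
        except Exception as exc:
            return f"Workflow failed due to unexpected error in {name} step: {exc}"
        if isinstance(data, str) and data.startswith("Error:"):
            return f"Workflow failed at {name} step: {data}"
    return data


# Stub stages standing in for the real agent functions.
stages = [
    ("The Cartographer", lambda q: f"themes for: {q}"),
    ("The Analyst", lambda t: f"analysis of ({t})"),
]
print(run_pipeline(stages, "spectrum allocation"))
```

Steps that need extra glue, such as parsing the Hypothesiser's text into a list, would be wrapped in a small function and added as a stage of their own.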



In [17]:
import traceback

# Define the main workflow orchestrator function
def run_gap_identification_workflow(initial_query: str) -> str:
    """
    Orchestrates the multi-agent workflow for identifying gaps in telecommunication standards.

    Args:
        initial_query: The initial user query defining the area of telecommunication standards to analyze.

    Returns:
        A string containing the final structured report on identified gaps, or
        an error message if any step in the workflow fails.
    """
    print("\n--- Starting Gap Identification Workflow ---")
    print(f"Initial Query: '{initial_query}'")

    # Step 1: Run The Cartographer agent
    try:
        print("\nStep 1: Running The Cartographer...")
        # Assuming run_cartographer_agent is defined in a previous cell
        thematic_overview = run_cartographer_agent(initial_query)
        if thematic_overview.startswith("Error:"):
            return f"Workflow failed at The Cartographer step: {thematic_overview}"
        print("The Cartographer completed successfully.")
        print(f"Thematic Overview (snippet):\n{thematic_overview[:500]}...")
    except Exception as e:
        print(f"An unexpected error occurred during The Cartographer step: {e}")
        traceback.print_exc()
        return f"Workflow failed due to unexpected error in The Cartographer step: {e}"

    # Step 2: Run The Analyst agent
    try:
        print("\nStep 2: Running The Analyst...")
        # Assuming run_analyst_agent is defined in a previous cell
        analysis_findings = run_analyst_agent(thematic_overview)
        if analysis_findings.startswith("Error:"):
            return f"Workflow failed at The Analyst step: {analysis_findings}"
        print("The Analyst completed successfully.")
        print(f"Analysis Findings (snippet):\n{analysis_findings[:500]}...")
    except Exception as e:
        print(f"An unexpected error occurred during The Analyst step: {e}")
        traceback.print_exc()
        return f"Workflow failed due to unexpected error in The Analyst step: {e}"


    # Step 3: Run The Hypothesiser agent
    try:
        print("\nStep 3: Running The Hypothesiser...")
        # Assuming run_hypothesiser_agent is defined in a previous cell
        # The Hypothesiser is expected to return a string which needs to be parsed into a list of strings
        # for The Verifier. This parsing logic needs to be robust.
        generated_hypotheses_str = run_hypothesiser_agent(analysis_findings)
        if generated_hypotheses_str.startswith("Error:"):
            return f"Workflow failed at The Hypothesiser step: {generated_hypotheses_str}"

        # Simple parsing: keep lines that start with a numbered-list marker (e.g., "1. ", "12. ").
        # Lines are stripped first so that indented or multi-digit items are not dropped.
        # This assumes the Hypothesiser's prompt consistently generates a numbered list.
        # More robust parsing might be needed depending on the Hypothesiser's actual output format.
        stripped_lines = [line.strip() for line in generated_hypotheses_str.split('\n') if line.strip()]
        hypotheses_questions = [
            line for line in stripped_lines
            if '.' in line and line.split('.', 1)[0].isdigit()
        ]
        # Fallback parsing if numbered list parsing returns nothing: treat every non-empty line as an item
        if not hypotheses_questions:
            hypotheses_questions = stripped_lines


        if not hypotheses_questions:
            print("Warning: The Hypothesiser did not generate any valid hypotheses or questions.")
            # Proceed with an empty list; The Verifier should handle it gracefully.

        print(f"The Hypothesiser completed successfully. Generated {len(hypotheses_questions)} items.")
        # print(f"Generated Items: {hypotheses_questions}") # Uncomment for detailed debugging

    except Exception as e:
        print(f"An unexpected error occurred during The Hypothesiser step: {e}")
        traceback.print_exc()
        return f"Workflow failed due to unexpected error in The Hypothesiser step: {e}"


    # Step 4: Run The Verifier agent
    try:
        print("\nStep 4: Running The Verifier...")
        # Assuming run_verifier_agent is defined in a previous cell
        # The Verifier expects a list of strings
        verification_results = run_verifier_agent(hypotheses_questions)
        # run_verifier_agent is designed to return [] and print an error if it fails internally.
        # Also guard against an "Error:" string, the failure convention the other agent functions use.
        if isinstance(verification_results, str) and verification_results.startswith("Error:"):
            return f"Workflow failed at The Verifier step: {verification_results}"

        if not verification_results:
            print("Warning: The Verifier did not produce any verification results.")
            # Proceed with an empty list; The Synthesiser should handle it gracefully.

        print(f"The Verifier completed successfully. Produced {len(verification_results)} results.")
        # print(f"Verification Results: {verification_results}") # Uncomment for detailed debugging

    except Exception as e:
        print(f"An unexpected error occurred during The Verifier step: {e}")
        traceback.print_exc()
        return f"Workflow failed due to unexpected error in The Verifier step: {e}"


    # Step 5: Run The Synthesiser agent
    try:
        print("\nStep 5: Running The Synthesiser...")
        # Assuming run_synthesiser_agent is defined in a previous cell
        # The Synthesiser expects a list of dictionaries (verification_results)
        final_report = run_synthesiser_agent(verification_results)
        if final_report.startswith("Error:"):
            return f"Workflow failed at The Synthesiser step: {final_report}"
        print("The Synthesiser completed successfully.")
        print("\n--- Workflow Completed Successfully ---")
        return final_report

    except Exception as e:
        print(f"An unexpected error occurred during The Synthesiser step: {e}")
        traceback.print_exc()
        return f"Workflow failed due to unexpected error in The Synthesiser step: {e}"

# Example of how to run the workflow
# initial_user_query = "Analyze security standards related to IoT devices in 5G networks."
# gap_report = run_gap_identification_workflow(initial_user_query)
#
# print("\n\n--- FINAL GAP IDENTIFICATION REPORT ---")
# print(gap_report)
# print("--- END OF REPORT ---")

print("Function 'run_gap_identification_workflow' defined.")


Function 'run_gap_identification_workflow' defined.
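
The numbered-list parsing in Step 3 is the most fragile part of the workflow, since it depends on the exact text the Hypothesiser emits. A regex-based variant is sketched below. `parse_numbered_items` is a hypothetical helper, and the accepted markers (`1.` or `1)`, optionally indented) are an assumption about the model's output format. Unlike the workflow's parser, it strips the numbering from each item.

```python
import re

# Optional indent, one or more digits, "." or ")", whitespace, then the item text.
_NUMBERED = re.compile(r"^\s*\d+[.)]\s+(.*\S)\s*$")


def parse_numbered_items(text: str) -> list[str]:
    """Return the items of a numbered list; fall back to all non-empty lines."""
    items = [m.group(1) for line in text.splitlines() if (m := _NUMBERED.match(line))]
    if items:
        return items
    return [line.strip() for line in text.splitlines() if line.strip()]


raw = "Here are my hypotheses:\n1. Standard X omits slicing security.\n 10) Is roaming covered?\n"
print(parse_numbered_items(raw))
```

Preamble lines such as "Here are my hypotheses:" are discarded whenever at least one numbered item is found, which keeps chatty model output from reaching The Verifier as a bogus hypothesis.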


## Create an interface

### Subtask:
Develop a way for the user to initiate the gap identification process and view the final report from The Synthesiser.


**Reasoning**:
Implement the command-line interface loop to initiate the gap identification process and display the final report.
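
A loop built directly on `input()` is hard to test. One option, sketched here under the assumption that the workflow is any callable taking a query string and returning a report string, is to inject the input, output, and workflow functions. `chat_loop` and its parameters are hypothetical names, not part of the notebook.

```python
from typing import Callable


def chat_loop(workflow: Callable[[str], str],
              read: Callable[[str], str] = input,
              write: Callable[[str], None] = print) -> int:
    """Run queries until the user types 'quit'; return how many reports were produced."""
    reports = 0
    while True:
        query = read("\nYou: ").strip()
        if query.lower() == "quit":
            write("Exiting the system.")
            return reports
        if not query:
            write("Please enter a query to start the gap identification.")
            continue
        write(workflow(query))
        write("--- END OF REPORT ---")
        reports += 1


# Scripted session: an empty line, one real query, then quit.
scripted = iter(["", "5G IoT security", "quit"])
count = chat_loop(lambda q: f"Report for: {q}", read=lambda _prompt: next(scripted))
```

In the notebook itself the call would simply be `chat_loop(run_gap_identification_workflow)`, which falls back to the built-in `input` and `print`.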



In [19]:
# Implement the command-line interface loop
print("Telecommunication Standards Gap Identification System")
print("Enter your query to start the analysis, or type 'quit' to exit.")

while True:
    user_query = input("\nYou: ")

    if user_query.strip().lower() == 'quit':
        print("Exiting the system.")
        break

    if not user_query.strip():
        print("Please enter a query to start the gap identification.")
        continue

    # Call the workflow function with the user's input
    # Assuming run_gap_identification_workflow is defined in a previous cell
    final_report = run_gap_identification_workflow(user_query)
    print(final_report)
    print("--- END OF REPORT ---")



> Entering new AgentExecutor chain...
> Question: Are spectrum allocation guidelines adaptable to the regulatory and economic constraints in African nations?

=== [Agent: The Cartographer] ===
Thought: I need to extract and summarize the key themes around spectrum allocation and African regulatory/economic contexts.
Action: topic-modeling/summarization  
Action Input:  
«Pinecone index “telecom-rag-index” documents on spectrum allocation guidelines; focus on regulatory frameworks, economic models, case studies from Africa.»

Observation:  
• Theme 1: Varied national regulator capacities  
• Theme 2: Cost-recovery vs universal service goals  
• Theme 3: Regional harmonization efforts (e.g., ECOWAS, AU)  
• Theme 4: Technology-neutral licensing  
• Theme 5: Secondary markets and spectrum trading

Thought: Summaries ready for analysis.

=== [Agent: The Analyst] ===
Thought: Compare ideal spectrum models with on‑the‑ground constraints in African nations.
Action: comparative-analysis  
Act