In [1]:
import os
from dotenv import load_dotenv
import nest_asyncio
nest_asyncio.apply()
load_dotenv()

True

In [2]:
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')


## 1. Define a Simple Tool

In [4]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int: 
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)

# Wrap each function as a tool; its name, signature, and docstring describe it to the LLM.
add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)


In [5]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool], 
    "Tell me the output of the mystery function on 2 and 9", 
    verbose=True
)
print(str(response))

=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121
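
The call above works because each tool exposes its function's name, signature, and docstring to the model, which then picks a tool and fills in its arguments. The following is a minimal sketch of that idea using only the standard library. `describe_tool` and `dispatch` are hypothetical helpers written for illustration, not LlamaIndex APIs, and the hard-coded `tool_call` stands in for the decision the LLM would make.

```python
import inspect
from typing import Any, Callable, Dict


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Hypothetical helper: build a schema-like description from a function."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: param.annotation.__name__
            for name, param in sig.parameters.items()
        },
    }


def dispatch(tools: Dict[str, Callable[..., Any]], call: Dict[str, Any]) -> Any:
    """Hypothetical helper: run the tool named in a model-chosen call."""
    return tools[call["name"]](**call["args"])


tools = {fn.__name__: fn for fn in (add, mystery)}
schema = describe_tool(mystery)

# Stand-in for the model's decision, mirroring the verbose log above.
tool_call = {"name": "mystery", "args": {"x": 2, "y": 9}}
result = dispatch(tools, tool_call)
print(schema)
print(result)
```

Because the docstring is the only hint the model gets about what a tool does, a vague docstring such as the one on `mystery` only works here because the prompt names the function explicitly.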


## 2. Define an Auto-Retrieval Tool

In [9]:
from llama_index.core import SimpleDirectoryReader
# load documents
file_path=["./2_ Multidoc_Agentic_RAG/data/metagpt.pdf"]
documents = SimpleDirectoryReader(input_files=file_path).load_data()

In [10]:
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
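
To give a feel for what a sentence-aware splitter does, here is a rough sketch that greedily packs whole sentences into chunks. It is an illustration under simplifying assumptions, not the `SentenceSplitter` implementation: `split_by_sentence` is a hypothetical helper, it measures size in characters rather than tokens, and it uses a naive regular expression to find sentence boundaries.

```python
import re
from typing import List


def split_by_sentence(text: str, chunk_size: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most chunk_size characters.

    A single sentence longer than chunk_size is kept as its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > chunk_size:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "Agents plan tasks. Agents write code. Agents review results."
chunks = split_by_sentence(text, chunk_size=40)
print(chunks)
```

Keeping sentences intact means a chunk boundary never cuts a statement in half, which tends to make retrieved chunks easier for the LLM to use.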

In [11]:
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metagpt.pdf
file_path: 2_ Multidoc_Agentic_RAG/data/metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2024-05-24
last_modified_date: 2024-05-24

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to 

In [18]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)

In [19]:
response = query_engine.query(
    "What are some high-level results of MetaGPT?", 
)
print(str(response))

MetaGPT achieves an average score of 3.9, surpassing ChatDev's score of 2.1. It simplifies the process of transforming abstract requirements into detailed class and function designs through a specialized division of labor and SOPs workflow. Additionally, MetaGPT's structured messaging and feedback mechanisms reduce loss of communication information and improve the execution of code.


In [20]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '23', 'file_name': 'metagpt.pdf', 'file_path': '2_ Multidoc_Agentic_RAG/data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-24', 'last_modified_date': '2024-05-24'}
{'page_label': '9', 'file_name': 'metagpt.pdf', 'file_path': '2_ Multidoc_Agentic_RAG/data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-24', 'last_modified_date': '2024-05-24'}


To apply metadata filtering in RAG, pass a list of dictionaries that name each metadata key and the value it must match. The example below restricts retrieval to nodes whose `page_label` is "2".

In [41]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"},
        ]
    )
)


In [42]:
response = query_engine.query(
    "What are some high-level results of MetaGPT?", 
)
print(str(response))

MetaGPT achieves a new state-of-the-art in code generation benchmarks with 85.9% and 87.7% in Pass@1, surpassing other popular frameworks like AutoGPT, LangChain, AgentVerse, and ChatDev. Additionally, MetaGPT demonstrates robustness and efficiency by achieving a 100% task completion rate in experimental evaluations.


In [43]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': '2_ Multidoc_Agentic_RAG/data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-24', 'last_modified_date': '2024-05-24'}
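
The effect of the filter can be pictured as "discard nodes whose metadata does not match, then rank what is left by similarity". The sketch below shows that idea on toy data. It is an assumption-laden illustration, not how any particular vector store applies filters: `retrieve` is a hypothetical helper and the similarity scores are made up.

```python
from typing import Dict, List, Tuple

# Toy nodes as (metadata, similarity score to the query). Scores are invented.
Node = Tuple[Dict[str, str], float]

nodes: List[Node] = [
    ({"page_label": "2"}, 0.71),
    ({"page_label": "9"}, 0.88),
    ({"page_label": "23"}, 0.93),
    ({"page_label": "2"}, 0.64),
]


def retrieve(nodes: List[Node], filters: List[Dict[str, object]], top_k: int) -> List[Node]:
    """Keep nodes whose metadata equals every filter value, then take the top_k by score."""
    matching = [
        node for node in nodes
        if all(node[0].get(f["key"]) == f["value"] for f in filters)
    ]
    return sorted(matching, key=lambda node: node[1], reverse=True)[:top_k]


unfiltered = retrieve(nodes, [], top_k=2)
page_two = retrieve(nodes, [{"key": "page_label", "value": "2"}], top_k=2)
print([n[0]["page_label"] for n in unfiltered])
print([n[0]["page_label"] for n in page_two])
```

Note that `page_label` is stored as a string in the node metadata printed above, so in this sketch a filter value of the integer `2` would match nothing.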


### Define the Auto-Retrieval Tool with Metadata Filtering as an Optional Parameter That the LLM Can Infer from the User Query

In [44]:
from typing import List, Optional
from llama_index.core.vector_stores import FilterCondition


def vector_query(
    query: str, 
    page_numbers: Optional[List[str]] = None
) -> str:
    """Perform a vector search over an index.
    
    query (str): the string query to be embedded.
    page_numbers (Optional[List[str]]): Filter by set of pages. Leave as None if we want to
        perform a vector search over all pages. Otherwise, filter by the set of specified pages.
    
    """

    # Treat a missing page list as "no filter", as the docstring promises.
    page_numbers = page_numbers or []
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]
    
    # OR condition: a node qualifies if its page_label equals any requested page.
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    # Return the full response object so its source nodes stay inspectable.
    return response
    

vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)

Pass the query to the LLM along with the vector query tool defined above.

In [63]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool], 
    "What are the high-level results of MetaGPT as described on page 2,3,10 and 20", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["2", "3", "10", "20"]}
=== Function Output ===
MetaGPT achieves state-of-the-art performance on HumanEval and MBPP, demonstrating its effectiveness as a meta-programming framework for developing LLM-based multi-agent systems. It significantly elevates code generation quality and has shown a 5.4% absolute improvement on MBPP. Additionally, MetaGPT achieves a new state-of-the-art in code generation benchmarks with 85.9% and 87.7% in Pass@1, outperforming other popular frameworks in creating complex software projects. In experimental evaluations, MetaGPT achieves a 100% task completion rate, showcasing its robustness and efficiency in handling higher levels of software complexity.


In [64]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '3', 'file_name': 'metagpt.pdf', 'file_path': '2_ Multidoc_Agentic_RAG/data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-24', 'last_modified_date': '2024-05-24'}
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': '2_ Multidoc_Agentic_RAG/data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-24', 'last_modified_date': '2024-05-24'}
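
Two details of the run above are worth spelling out. Four pages were requested but only two source nodes came back, because `similarity_top_k=2` caps the result after filtering. The filters are also combined with OR, since a single node has only one `page_label` and could never equal four different pages at once. The sketch below illustrates the difference between the two conditions. `matches` is a hypothetical helper, and treating an empty filter list as "match everything" is an assumption of this sketch.

```python
from typing import Dict, List


def matches(
    metadata: Dict[str, str],
    filters: List[Dict[str, str]],
    condition: str = "and",
) -> bool:
    """Return True if the metadata satisfies the filters under the given condition."""
    checks = [metadata.get(f["key"]) == f["value"] for f in filters]
    if not checks:
        # No filters: nothing to exclude, so every node is eligible.
        return True
    return any(checks) if condition == "or" else all(checks)


page_filters = [{"key": "page_label", "value": p} for p in ["2", "3", "10", "20"]]
node_metadata = {"page_label": "3"}

or_match = matches(node_metadata, page_filters, condition="or")
and_match = matches(node_metadata, page_filters, condition="and")
no_filter_match = matches(node_metadata, [], condition="or")
print(or_match, and_match, no_filter_match)
```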


## Let's add some other tools!

In [65]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful if you want to get a summary of MetaGPT"
    ),
)
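
The `tree_summarize` response mode builds a summary bottom-up: groups of chunks are summarized, then the summaries are summarized, until a single answer remains. The following is a rough sketch of that recursion under simplifying assumptions. `tree_summarize` here is a hypothetical stand-alone function, the fixed `fan_in` replaces whatever grouping the library derives from prompt size, and `combine` is a stand-in for the LLM call that would write each summary.

```python
from typing import Callable, List


def tree_summarize(
    chunks: List[str],
    summarize: Callable[[List[str]], str],
    fan_in: int = 2,
) -> str:
    """Summarize groups of fan_in chunks, then repeat on the results until one remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not chunks:
        return ""
    level = list(chunks)
    while len(level) > 1:
        level = [
            summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


def combine(parts: List[str]) -> str:
    """Stand-in for an LLM summary call; it only records which parts were merged."""
    return "(" + " + ".join(parts) + ")"


result = tree_summarize(["c1", "c2", "c3", "c4"], combine)
print(result)
```

Since every chunk feeds into the final answer, this mode suits whole-document questions such as "summarize the paper", while the top-k vector tool suits questions about a specific passage.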

In [66]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What are the MetaGPT comparisons with ChatDev described on page 8?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "MetaGPT comparisons with ChatDev", "page_numbers": ["8"]}
=== Function Output ===
MetaGPT outperforms ChatDev on the SoftwareDev dataset in various aspects. For example, MetaGPT achieves a higher score in executability, takes less time for execution, requires more tokens but fewer tokens to generate one line of code compared to ChatDev. Additionally, MetaGPT demonstrates superior performance in code statistics and human revision cost when compared to ChatDev.


In [67]:
for n in response.source_nodes:
    print(n.metadata)

{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': '2_ Multidoc_Agentic_RAG/data/metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2024-05-24', 'last_modified_date': '2024-05-24'}


In [68]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool], 
    "What is a summary of the paper?", 
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "Get a summary of the paper."}
=== Function Output ===
The paper introduces MetaGPT, a meta-programming framework that enhances multi-agent systems based on Large Language Models (LLMs) by incorporating Standardized Operating Procedures (SOPs). MetaGPT utilizes role specialization, workflow management, and structured communication interfaces to streamline software development tasks. The framework includes agents like Product Managers, Architects, Engineers, and QA Engineers, each with specific roles and responsibilities. MetaGPT also introduces an executable feedback mechanism to improve code generation quality during runtime. Experimental results show that MetaGPT outperforms existing approaches in various benchmarks, demonstrating its effectiveness in software development tasks. Additionally, the paper discusses future directions for self-improvement mechanisms and multi-agent economies within the framework.