# **Maven Super Agent**

## Setup

In [117]:
import os
from IPython.display import Markdown, display
# Setting environment variables
os.environ['OPENAI_API_KEY'] = ''
research_topic = "AVX512"
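
The setup cell assigns an empty string to `OPENAI_API_KEY`, which would overwrite a key already exported in the shell. A minimal sketch of an alternative, assuming a hypothetical helper named `ensure_env` (not part of any library), keeps an existing value and only asks for one when the variable is unset:

```python
import os
from getpass import getpass


def ensure_env(name, prompt=getpass):
    """Return the value of environment variable `name`, asking for it only if unset.

    `ensure_env` is a hypothetical helper written for this notebook.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        # Nothing usable in the environment: ask interactively without echoing.
        value = prompt(f"Enter {name}: ").strip()
        if not value:
            raise RuntimeError(f"{name} is required but was not provided")
        os.environ[name] = value
    return value


# Possible usage in the setup cell:
# ensure_env("OPENAI_API_KEY")
```

The `prompt` parameter exists only so the interactive step can be replaced in tests or automated runs.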

In [118]:
sources = ["pdf",
           "wikipedia",
           "arxiv",
           "youtube",
           "webpage"]

## PDF Reader

In [119]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents_pdf = SimpleDirectoryReader("local_pdfs").load_data()

In [120]:
# PDF unit testing
index_pdf = VectorStoreIndex.from_documents(documents_pdf)
index_pdf.storage_context.persist(persist_dir="pdf_reader")
query_engine_pdf = index_pdf.as_query_engine()
response = query_engine_pdf.query("What is SIMD width of avx 512?")
display(Markdown(f"{response}"))

The SIMD width of AVX-512 is 512 bits.

## Wikipedia Reader

In [61]:
# wikipedia pages
from llama_index.readers.wikipedia import WikipediaReader

documents_wiki = WikipediaReader().load_data(
    pages=[research_topic]
)

In [64]:
# Wikipedia unit testing
from llama_index.core import VectorStoreIndex
index_wikipedia = VectorStoreIndex.from_documents(documents_wiki)
index_wikipedia.storage_context.persist(persist_dir="wiki_reader")
query_engine_wiki = index_wikipedia.as_query_engine()
response = query_engine_wiki.query("What is SIMD width of avxvnni?")
display(Markdown(f"{response}"))

The SIMD width of AVX-VNNI is 512 bits.

In [65]:
from llama_index.tools.wikipedia.base import WikipediaToolSpec
wiki_spec = WikipediaToolSpec()
wiki_spec.search_data(research_topic)

'AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for x86 instruction set architecture (ISA) proposed by Intel in July 2013, and first implemented in the 2016 Intel Xeon Phi x200 (Knights Landing), and then later in a number of AMD and other Intel CPUs (see list below). AVX-512 consists of multiple extensions that may be implemented independently. This policy is a departure from the historical requirement of implementing the entire instruction block. Only the core extension AVX-512F (AVX-512 Foundation) is required by all AVX-512 implementations.\nBesides widening most 256-bit instructions, the extensions introduce various new operations, such as new data conversions, scatter operations, and permutations. The number of AVX registers is increased from 16 to 32, and eight new "mask registers" are added, which allow for variable selection and blending of the results of instructions. In CPUs with the vector length (VL) extension—included in most AV

## Arxiv Reader

In [81]:
from llama_index.readers.papers import ArxivReader

loader = ArxivReader()
documents_arxiv = loader.load_data(
    search_query=research_topic, max_results=5
)

In [82]:
# Arxiv unit test
index_arxiv = VectorStoreIndex.from_documents(documents_arxiv)
index_arxiv.storage_context.persist(persist_dir="arxiv_reader")
query_engine_arxiv = index_arxiv.as_query_engine()
response = query_engine_arxiv.query("What is the difference between avx512 and avx2?")
display(Markdown(f"{response}"))

AVX512 introduces new features and improvements over AVX2, such as wider registers (512-bit versus 256-bit in AVX2), additional instructions, and enhanced capabilities for parallel processing. AVX512 also offers better performance for certain operations due to its increased register width and expanded instruction set.

In [42]:
from llama_index.tools.arxiv.base import ArxivToolSpec
arxiv_tool = ArxivToolSpec()
arxiv_tool.arxiv_query(research_topic)

<llama_index.tools.arxiv.base.ArxivToolSpec object at 0x000001C1DF9C4430>


## YouTube Reader

In [113]:
from llama_index.readers.youtube_transcript import YoutubeTranscriptReader
from llama_index.core import VectorStoreIndex

# YouTube videos to index for the research topic
yt_links = ["https://youtu.be/P1s6ZQcDvUs?si=I4HJFhv241Pnl4Bp",
            "https://youtu.be/bskEGP0r3hE?si=y23RB5OzZS3_JvOp"]
documents_youtube = YoutubeTranscriptReader().load_data(ytlinks=yt_links)
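
Both links carry an `si=` query parameter, which appears to be a share identifier rather than part of the video ID. Whether the transcript reader needs it removed is not stated in this notebook. As a precaution, the links could be normalized first. The helper name `strip_share_params` is hypothetical:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_share_params(url, drop=("si",)):
    """Remove share-tracking query parameters (by default `si`) from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


yt_links_clean = [
    strip_share_params(link)
    for link in [
        "https://youtu.be/P1s6ZQcDvUs?si=I4HJFhv241Pnl4Bp",
        "https://youtu.be/bskEGP0r3hE?si=y23RB5OzZS3_JvOp",
    ]
]
```

Other query parameters, such as `v=` on full `youtube.com/watch` URLs, are left untouched.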

In [75]:
# YouTube unit test
index_youtube = VectorStoreIndex.from_documents(documents_youtube)
index_youtube.storage_context.persist(persist_dir="youtube_reader")
query_engine_youtube = index_youtube.as_query_engine()
response = query_engine_youtube.query("What's Ian's take on avx 512?")
display(Markdown(f"{response}"))

Ian's take on AVX 512 is that Intel's communication regarding AVX 512 has been somewhat lacking, especially in terms of explaining the technology to the average consumer. He also mentions that there have been thermal issues with AVX 512 in the past, causing CPUs to get hot and experience clock speed decreases. Ian discusses how AMD implemented AVX 512 differently, leading to performance improvements in some cases. Additionally, he suggests that Intel could make AVX 512 more accessible and efficient by unifying it with a 512-bit vector length, which would simplify code writing and improve adoption.

## Webpage

In [77]:
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec
from llama_index.readers.web import SimpleWebPageReader
search_tool = DuckDuckGoSearchToolSpec()
full_search = search_tool.duckduckgo_full_search(research_topic, max_results=3)
urls = [article['href'] for article in full_search]
documents_webpage = SimpleWebPageReader(html_to_text=True).load_data(
    urls
)
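
The list comprehension above assumes every search result is a dict with an `href` key. A slightly more defensive sketch, using a hypothetical helper `extract_urls`, skips results without a URL and drops duplicates while keeping the original order:

```python
def extract_urls(results, key="href"):
    """Collect unique, non-empty URLs from search results, preserving order."""
    seen = set()
    urls = []
    for item in results:
        url = (item.get(key) or "").strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Possible usage with the search results from the cell above:
# urls = extract_urls(full_search)
```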

In [78]:
# Webpage unit test
from llama_index.core import SummaryIndex
from IPython.display import Markdown, display

index_webpage = SummaryIndex.from_documents(documents_webpage)
index_webpage.storage_context.persist(persist_dir="webpage_reader")
query_engine_webpage = index_webpage.as_query_engine()
response = query_engine_webpage.query("What are the downsides of using avx512?")
display(Markdown(f"{response}"))

The downsides of utilizing AVX-512 involve potential difficulties in software optimization, compatibility issues with older hardware, higher power consumption due to increased computational requirements, and the necessity for specialized expertise to fully utilize the advanced capabilities of AVX-512. Moreover, some applications may not experience substantial benefits from AVX-512 instructions, resulting in potential inefficiencies in specific situations.

## Multi document agent

In [24]:
import os

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    SimpleKeywordTableIndex,
    StorageContext,
    SummaryIndex,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.callbacks import CallbackManager
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import IndexNode
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.agent.openai import OpenAIAgent
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

node_parser = SentenceSplitter()

# Build agents dictionary
agents = {}
query_engines = {}

# All nodes from every source, used later for the single-index baseline
all_nodes = []

# Map each source name to the documents loaded in the reader sections above
source_dict = {
    "pdf": documents_pdf,
    "arxiv": documents_arxiv,
    "youtube": documents_youtube,
    "wikipedia": documents_wiki,
    "webpage": documents_webpage,
}

In [106]:

# Build one agent per source
for source in sources:
    nodes = node_parser.get_nodes_from_documents(source_dict[source])
    all_nodes.extend(nodes)

    # Build and persist a vector index for this source unless one is already stored on disk
    if not os.path.exists(f"./data/{source}"):
        vector_index = VectorStoreIndex(nodes)  # Create a vector index from the nodes
        vector_index.storage_context.persist(persist_dir=f"./data/{source}")  # Persist the vector index to disk
    else:
        # If directory exists, load the existing vector index from storage
        vector_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=f"./data/{source}"),
        )

    # Create a summary index from the nodes
    summary_index = SummaryIndex(nodes)

    # Define query engines for vector and summary indexes
    vector_query_engine = vector_index.as_query_engine(llm=Settings.llm)
    summary_query_engine = summary_index.as_query_engine(llm=Settings.llm)

    # Define tools for querying the vector and summary indexes
    query_engine_tools = [
        QueryEngineTool(
            query_engine=vector_query_engine,
            metadata=ToolMetadata(
                name="vector_tool",
                description=(
                    f"Useful for questions related to specific aspects of {source}.\n\n"
                    "You are a research assistant providing accurate information from local PDFs, Wikipedia, YouTube, webpages, and arXiv papers. Follow these guidelines:\n"
                    "1. For broad overviews, historical context, and general information, use Wikipedia.\n"
                    "2. For in-depth research and technical details, use arXiv papers.\n"
                    "3. For the latest news and updates, use webpages.\n"
                    "4. For tutorials, demonstrations, reviews and feedback from the industry, use YouTube.\n"
                    "5. For specific, pre-verified documents, use local PDFs.\n"
                    "General Instructions:\n"
                    "Choose the source that best fits the query. Combine information from multiple sources when necessary. Ensure the information is accurate and relevant.\n\n"
                    "You must ALWAYS use at least one of the provided tools when answering a question; do NOT rely on prior knowledge alone."
                ),
            ),
        ),
        QueryEngineTool(
            query_engine=summary_query_engine,
            metadata=ToolMetadata(
                name="summary_tool",
                description=(
                    f"Useful for questions related to specific aspects of {source}.\n\n"
                    "You are a research assistant providing accurate information from local PDFs, Wikipedia, YouTube, webpages, and arXiv papers. Follow these guidelines:\n"
                    "1. For broad overviews, historical context, and general information, use Wikipedia.\n"
                    "2. For in-depth research and technical details, use arXiv papers.\n"
                    "3. For the latest news and updates, use webpages.\n"
                    "4. For tutorials, demonstrations, reviews and feedback from the industry, use YouTube.\n"
                    "5. For specific, pre-verified documents, use local PDFs.\n"
                    "General Instructions:\n"
                    "Choose the source that best fits the query. Combine information from multiple sources when necessary. Ensure the information is accurate and relevant.\n\n"
                    "You must ALWAYS use at least one of the provided tools when answering a question; do NOT rely on prior knowledge alone."
                ),
            ),
        ),
    ]

    # Build an OpenAI agent with the defined tools and a custom system prompt
    function_llm = OpenAI(model="gpt-4")  # Initialize the LLM with GPT-4 model
    agent = OpenAIAgent.from_tools(
        query_engine_tools,
        llm=function_llm,
        verbose=True,
        system_prompt=f"""\
You are a specialized agent designed to answer queries about {research_topic}.
You must ALWAYS use at least one of the tools provided when answering a question; do NOT rely on prior knowledge.\
""",
    )

    # Store the agent and the query engine in their respective dictionaries
    agents[source] = agent
    query_engines[source] = vector_index.as_query_engine(similarity_top_k=2)
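
The loop above builds the vector index only when `./data/{source}` does not exist and otherwise loads it from disk. The same load-or-build pattern can be sketched on its own, without any LlamaIndex dependency. The function name `load_or_build` and the callables `build` and `load` are hypothetical. One deliberate difference from the loop: an empty directory, for example one left behind by an interrupted run, is treated as missing.

```python
import os


def load_or_build(persist_dir, build, load):
    """Load from `persist_dir` if it holds files, otherwise build into it.

    `build(persist_dir)` is expected to create the object and write it to
    `persist_dir`; `load(persist_dir)` is expected to read it back.
    """
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load(persist_dir)
    os.makedirs(persist_dir, exist_ok=True)
    return build(persist_dir)


# Possible usage inside the loop (names taken from the cell above):
# def build_index(path):
#     index = VectorStoreIndex(nodes)
#     index.storage_context.persist(persist_dir=path)
#     return index
#
# vector_index = load_or_build(
#     f"./data/{source}",
#     build=build_index,
#     load=lambda path: load_index_from_storage(
#         StorageContext.from_defaults(persist_dir=path)
#     ),
# )
```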
    

In [107]:
all_tools = []
for source in sources:
    tool_summary = (
        f"Useful for questions related to specific aspects of {source}.\n\n"
        "You are a research assistant providing accurate information from local PDFs, Wikipedia, YouTube, webpages, and arXiv papers. Follow these guidelines:\n"
        "1. For broad overviews, historical context, and general information, use Wikipedia.\n"
        "2. For in-depth research and technical details, use arXiv papers.\n"
        "3. For the latest news and updates, use webpages.\n"
        "4. For tutorials, demonstrations, reviews and feedback from the industry, use YouTube.\n"
        "5. For specific, pre-verified documents, use local PDFs.\n"
        "General Instructions:\n"
        "Choose the source that best fits the query. Combine information from multiple sources when necessary. Ensure the information is accurate and relevant.\n\n"
        "You must ALWAYS use at least one of the provided tools when answering a question; do NOT rely on prior knowledge alone."
    )
    
    doc_tool = QueryEngineTool(
        query_engine=agents[source],
        metadata=ToolMetadata(
            name=f"tool_{source}",
            description=tool_summary,
        ),
    )
    all_tools.append(doc_tool)
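
Every tool in this notebook receives the same guideline block, so the descriptions differ only in the source name. Because the object index retrieves tools through their metadata, near-identical descriptions may give the retriever little to separate them by. A minimal sketch of source-specific descriptions follows, reusing the wording of the five guidelines. The `SOURCE_HINTS` mapping and `describe_source_tool` helper are hypothetical.

```python
# Routing hints copied from the numbered guidelines in the tool description.
SOURCE_HINTS = {
    "wikipedia": "broad overviews, historical context, and general information",
    "arxiv": "in-depth research and technical details",
    "webpage": "the latest news and updates",
    "youtube": "tutorials, demonstrations, reviews and feedback from the industry",
    "pdf": "specific, pre-verified documents",
}


def describe_source_tool(source):
    """Build a short, source-specific tool description from the routing hints."""
    hint = SOURCE_HINTS.get(source)
    if hint is None:
        raise KeyError(f"no routing hint defined for source {source!r}")
    return f"Answers questions using {source} content. Best suited for {hint}."


# Possible usage in the loop above:
# description=describe_source_tool(source)
```

Keeping the shared instructions in the top-level system prompt and only the distinguishing hint in each description is one design option; the notebook does not evaluate whether it changes routing.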


In [108]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [109]:
from llama_index.agent.openai import OpenAIAgent

top_agent = OpenAIAgent.from_tools(
    tool_retriever=obj_index.as_retriever(similarity_top_k=3),
    system_prompt="""\
You are a research assistant providing accurate information from local PDFs, Wikipedia, YouTube, webpages, and arXiv papers. Follow these guidelines:
1. For broad overviews, historical context, and general information, use Wikipedia.
2. For in-depth research and technical details, use arXiv papers.
3. For the latest news and updates, use webpages.
4. For tutorials, demonstrations, reviews and feedback from the industry, use YouTube.
5. For specific, pre-verified documents, use local PDFs.
General Instructions:
Choose the source that best fits the query. Combine information from multiple sources when necessary. Ensure the information is accurate and relevant.

You must ALWAYS use at least one of the provided tools when answering a question; do NOT rely on prior knowledge alone.
""",
    verbose=True,
)

In [110]:
base_index = VectorStoreIndex(all_nodes)
base_query_engine = base_index.as_query_engine(similarity_top_k=4)
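
`base_query_engine` is defined here as a single-index baseline, but the cells that follow only query `top_agent`. A small sketch of a side-by-side comparison follows, assuming only that each engine exposes a `.query()` method as used elsewhere in this notebook. The helper name `compare_answers` is hypothetical.

```python
def compare_answers(question, engines):
    """Run `question` through each engine and return {name: answer text}."""
    return {name: str(engine.query(question)) for name, engine in engines.items()}


# Possible usage once both objects exist:
# answers = compare_answers(
#     f"What are the registers used in {research_topic}",
#     {"agent": top_agent, "baseline": base_query_engine},
# )
# for name, text in answers.items():
#     display(Markdown(f"**{name}**: {text}"))
```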

In [102]:
# Historical context should use wikipedia
response = top_agent.query(f"What was used before the invention of {research_topic}")
display(Markdown(f"{response}"))

Added user message to memory: What was used before the invention of AVX512
=== Calling Function ===
Calling function: tool_wikipedia with args: {"input":"AVX512"}
Added user message to memory: AVX512
=== Calling Function ===
Calling function: summary_tool with args: {
  "input": "AVX512"
}
Got output: AVX512 is a set of advanced vector extensions designed to significantly improve performance for certain types of computations, particularly in fields like scientific simulations, artificial intelligence, and data analytics. These extensions provide a wider vector register file, new instructions, and additional capabilities for parallel processing, allowing for faster and more efficient processing of data.

Got output: AVX-512 (Advanced Vector Extensions 512) is a set of instructions that are part of the x86 instruction set architecture. It extends the existing AVX instruction set, and is designed to improve performance for specific types of computations. AVX-512 is particularly beneficial

Before the invention of AVX-512, the AVX (Advanced Vector Extensions) instruction set was used. AVX provided a set of instructions for SIMD (Single Instruction, Multiple Data) operations on Intel processors. AVX-512 is an extension of the AVX instruction set, offering wider vector register files, new instructions, and enhanced capabilities for parallel processing.
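
The verbose trace above shows which tool the top-level agent routed to (`tool_wikipedia`) and which inner tool ran (`summary_tool`). To check routing expectations such as "historical context should use wikipedia" without reading the trace by hand, the tool names can be pulled out with a regular expression. This sketch assumes the trace has been captured as a string, for example with `contextlib.redirect_stdout`. The helper names are hypothetical.

```python
import re

# Matches lines of the form "Calling function: <name> with args: ..."
CALL_PATTERN = re.compile(r"Calling function: (\w+) with args:")


def called_tools(verbose_log):
    """Return every tool name that appears in the trace, in call order."""
    return CALL_PATTERN.findall(verbose_log)


def routed_sources(verbose_log, prefix="tool_"):
    """Return only the per-source tools (named `tool_<source>`), minus the prefix."""
    return [
        name[len(prefix):]
        for name in called_tools(verbose_log)
        if name.startswith(prefix)
    ]
```

Applied to the trace above, `routed_sources` yields `["wikipedia"]`, which matches the expectation stated in the cell's comment.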

In [104]:
# Technical details should refer to arxiv
response = top_agent.query(f"Give me the technical details behind {research_topic}")
display(Markdown(f"{response}"))

Added user message to memory: Give me the technical details behind AVX512
=== Calling Function ===
Calling function: tool_arxiv with args: {"input":"AVX512"}
Added user message to memory: AVX512
=== Calling Function ===
Calling function: summary_tool with args: {
  "input": "AVX512"
}
Got output: Repeat.

Got output: AVX-512 (Advanced Vector Extensions) is a set of new instructions that can accelerate performance for workloads and usages such as scientific simulations, financial analytics, artificial intelligence (AI)/deep learning, 3D modeling and analysis, image and audio/video processing, cryptography and data compression.

AVX-512 instructions are important because they offer higher performance for the most demanding computational tasks. AVX-512 can operate on arrays of data in one operation, which is called SIMD (Single Instruction, Multiple Data) computing. This can significantly increase performance for applications that are able to take advantage of it.

AVX-512 has a larger nu

AVX-512 (Advanced Vector Extensions) is a set of new instructions that can accelerate performance for various workloads such as scientific simulations, financial analytics, AI/deep learning, 3D modeling, image and audio/video processing, cryptography, and data compression. These instructions offer higher performance for demanding computational tasks by operating on arrays of data in a single operation, known as SIMD (Single Instruction, Multiple Data) computing. AVX-512 provides a larger number of registers, wider vectors, more operations, and a more flexible programming model compared to AVX2. It also includes features for efficient software multi-threading and data alignment.

In [112]:
# Reviews and feedback should use youtube
response = top_agent.query(f"What is the review or feedback from the industry on {research_topic}")
display(Markdown(f"{response}"))

Added user message to memory: What is the review or feedback from the industry on AVX512
=== Calling Function ===
Calling function: tool_youtube with args: {"input":"AVX512 review"}
Added user message to memory: AVX512 review
=== Calling Function ===
Calling function: summary_tool with args: {
  "input": "AVX512 review"
}
Got output: AVX512 is not mentioned in the provided context information.

=== Calling Function ===
Calling function: vector_tool with args: {
  "input": "AVX512 review"
}
Got output: AVX512 is a set of advanced instructions designed for demanding computational tasks, particularly in the realm of scientific computing, artificial intelligence, and data analytics. It offers enhanced performance by allowing for parallel processing of a larger number of data elements compared to previous instruction sets.

Got output: AVX512 is a set of advanced instructions designed for demanding computational tasks, particularly in the realm of scientific computing, artificial intelligen

AVX512 is a set of advanced instructions designed for demanding computational tasks, particularly in scientific computing, artificial intelligence, and data analytics. It offers enhanced performance by allowing for parallel processing of a larger number of data elements compared to previous instruction sets. For more detailed reviews and feedback, you might want to check out specific industry forums or technical blogs.

In [115]:
# Latest news should use webpages / wikipedia
response = top_agent.query(f"What is the latest news and updates on {research_topic}")
display(Markdown(f"{response}"))

Added user message to memory: What is the latest news and updates on AVX512
=== Calling Function ===
Calling function: tool_wikipedia with args: {"input":"AVX512"}
Added user message to memory: AVX512
=== Calling Function ===
Calling function: summary_tool with args: {
  "input": "AVX512"
}
Got output: AVX512 is not mentioned in the provided context information.

=== Calling Function ===
Calling function: vector_tool with args: {
  "input": "AVX512"
}
Got output: AVX512 is a set of advanced instructions for processors that can significantly accelerate performance for certain types of computations.

Got output: AVX512, or Advanced Vector Extensions 512, is a set of instructions for processors that can significantly enhance performance for specific types of computations. These instructions allow for the simultaneous processing of multiple data points, which can be particularly beneficial in fields such as scientific computing, machine learning, and multimedia applications.



AVX512, or Advanced Vector Extensions 512, is a set of instructions for processors that can significantly enhance performance for specific types of computations. These instructions allow for the simultaneous processing of multiple data points, which can be particularly beneficial in fields such as scientific computing, machine learning, and multimedia applications.

In [116]:
# Reference question about registers
response = top_agent.query(f"What are the registers used in {research_topic}")
display(Markdown(f"{response}"))

Added user message to memory: What are the registers used in AVX512
=== Calling Function ===
Calling function: tool_wikipedia with args: {"input":"AVX512 registers"}
Added user message to memory: AVX512 registers
=== Calling Function ===
Calling function: summary_tool with args: {
  "input": "AVX512 registers"
}
Got output: Sundar Pichai's career and background do not relate to AVX512 registers.

=== Calling Function ===
Calling function: vector_tool with args: {
  "input": "AVX512 registers"
}
Got output: Sundar Pichai's background and career details are provided in the context.

=== Calling Function ===
Calling function: summary_tool with args: {
  "input": "AVX512 registers"
}
Got output: Sundar Pichai's career and background do not relate to AVX512 registers.

=== Calling Function ===
Calling function: vector_tool with args: {
  "input": "AVX512 registers"
}
Got output: Sundar Pichai's background and career information do not include any details related to AVX512 registers.

=== Ca

AVX-512 registers are 512 bits wide and can hold up to 64 bytes of data. They are used to enhance performance for certain types of computational tasks, such as vector and matrix operations.
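
In this last trace the Wikipedia tools return text about Sundar Pichai even though the topic is AVX512. One possible explanation, which this notebook does not confirm, is that `./data/wikipedia` already existed from an earlier run on a different topic, so the loop loaded that stored index instead of rebuilding it. A sketch of one way to avoid such collisions is to include the topic in the persist path. The helper name `persist_dir_for` is hypothetical.

```python
import re
from pathlib import Path


def persist_dir_for(topic, source, root="./data"):
    """Return a persist directory that is unique per (topic, source) pair."""
    # Lowercase the topic and collapse anything non-alphanumeric into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    return (Path(root) / slug / source).as_posix()


# Possible usage in the per-source loop:
# persist_dir = persist_dir_for(research_topic, source)
```

With this layout, changing `research_topic` leads to a fresh directory, so an index built for another topic is not picked up by accident.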