# Run Once

In [1]:
from IPython.display import Markdown, display
from workshop_code.agents import NaiveQaRagAgent


def display_md(content):
    """Render a Markdown string as formatted output in the notebook."""
    display(Markdown(content))

# Summary
You can combine the indexer, retriever, and generator behind a single interface, as in the code below:

In [2]:
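doc_uri = "https://arxiv.org/html/2312.10997v5"
qa_agent = NaiveQaRagAgent()

# Indexer step: process the document so it can be searched.
qa_agent.index(doc_uri)

# Retriever and generator steps: answer a question about the document.
question = "What is naive RAG?"
completion = qa_agent.query(question)
display_md(completion)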

The term "Naive RAG" refers to a research paradigm in the field of AI, specifically in the context of the Retrieve-Augment-Generate (RAG) framework. Naive RAG represents the earliest methodology within the RAG family and is characterized by a traditional process that involves indexing, retrieval, and generation of information.

In the Naive RAG approach, the process starts with indexing, where raw data in various formats like PDF, HTML, Word, and Markdown is cleaned, extracted, and converted into a uniform plain text format. This indexed data is then segmented into smaller, digestible chunks, encoded into vector representations using an embedding model, and stored in a vector database. This step is crucial for enabling efficient similarity searches during the retrieval phase.

During the retrieval phase, when a user query is received, the RAG system transforms the query into a vector representation using the same encoding model used during indexing. It then computes similarity scores between the query vector and the vector representations of chunks within the indexed corpus. The system retrieves the top K chunks that demonstrate the greatest similarity to the query, which are then used as expanded context in the prompt.

Finally, in the generation phase, the posed query and selected documents are synthesized into a coherent prompt. A large language model
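
To make the index, retrieve, and generate phases described in the output above concrete, here is a minimal, dependency-free sketch. It is not the workshop's `NaiveQaRagAgent`. The names `ToyRagAgent`, `chunk_text`, `embed`, and `cosine` are hypothetical and exist only for illustration. The sketch also makes several simplifying assumptions: it indexes raw text rather than a URI, it uses bag-of-words counts as a stand-in for a real embedding model, it keeps vectors in a plain list instead of a vector database, and it takes the language model as a caller-supplied `generate` function.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    # Stop once a window has reached the end of the text.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy embedding: lowercase term counts (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyRagAgent:
    """Hypothetical agent exposing index() and query(), for illustration only."""

    def __init__(self, generate, top_k=2):
        self.generate = generate  # assumed: any callable mapping prompt -> answer
        self.top_k = top_k
        self.store = []  # list of (chunk, vector) pairs

    def index(self, text):
        # Indexing: chunk the text, embed each chunk, store the vectors.
        for chunk in chunk_text(text):
            self.store.append((chunk, embed(chunk)))

    def retrieve(self, question):
        # Retrieval: embed the query the same way, rank chunks by similarity.
        q_vec = embed(question)
        ranked = sorted(self.store, key=lambda item: cosine(q_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:self.top_k]]

    def query(self, question):
        # Generation: combine the retrieved chunks and the question into one prompt.
        context = "\n\n".join(self.retrieve(question))
        prompt = (
            "Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )
        return self.generate(prompt)


# Usage with an echo "model" that returns the prompt it was given.
agent = ToyRagAgent(generate=lambda prompt: prompt, top_k=1)
agent.index("Naive RAG follows a process of indexing, retrieval, and generation.")
agent.index("Cats purr and chase mice.")
print(agent.query("What is naive RAG?"))
```

Passing the generator in as a function keeps the sketch runnable offline. In practice you would replace `generate` with a call to a language model and `embed` with an embedding model.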

## Task: inspect the code
Look at the code in `./cheat_code/agents.py`. If anything doesn't make sense, let one of us know.