# LongRAG example

This LlamaPack implements LongRAG based on [this paper](https://arxiv.org/pdf/2406.15319).

LongRAG retrieves large chunks of text at a time: each retrieval unit is ~6k tokens long and consists of an entire document or a group of documents. This contrasts with the short retrieval units (100-word passages) of traditional RAG. LongRAG has two advantages: results can be achieved using only the top 4-8 retrieval units, and long-context LLMs can better understand the documents because long retrieval units preserve their semantic integrity.
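
To make the idea of a long retrieval unit concrete, here is a minimal sketch that packs whole documents into units of at most ~6k tokens. It is a simplification, not the pack's implementation: the function name `group_documents` is hypothetical, tokens are approximated by whitespace-separated words, and documents are packed greedily in order rather than grouped by their relationships.

```python
from typing import List


def group_documents(docs: List[str], max_tokens: int = 6000) -> List[List[str]]:
    """Greedily pack consecutive documents into units of at most max_tokens.

    Token counts are approximated by whitespace-separated words (an assumption
    made for this sketch). A document larger than max_tokens becomes its own unit.
    """
    units: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for doc in docs:
        size = len(doc.split())
        # Start a new unit when adding this document would exceed the budget.
        if current and current_size + size > max_tokens:
            units.append(current)
            current, current_size = [], 0
        current.append(doc)
        current_size += size
    if current:
        units.append(current)
    return units


# Four toy documents of 2500, 3000, 4000, and 7000 "tokens".
docs = ["a " * 2500, "b " * 3000, "c " * 4000, "d " * 7000]
units = group_documents(docs)
print([len(unit) for unit in units])  # [2, 1, 1]
```

Documents are never split in this sketch, which mirrors the point above that long retrieval units keep documents whole.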

## Setup

In [None]:
%pip install llama-index llama-index-packs-longrag


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


## Usage

The example below uses `LongRAGPack` with the default OpenAI embedding model and the `gpt-4o` LLM, which can handle long-context inputs.

In [None]:
from llama_index.packs.longrag import LongRAGPack
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

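# Use a long-context LLM so the large retrieval units fit in the prompt.
Settings.llm = OpenAI(model="gpt-4o")
embed_model = OpenAIEmbedding()

# Build the pack over the documents in ./data.
pack = LongRAGPack(data_dir="./data", embed_model=embed_model)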


In [None]:
from IPython.display import Markdown, display

query_str = (
    "How can Pittsburgh become a startup hub, and what are the two types of moderates?"
)
res = pack.run(query_str)
display(Markdown(str(res)))

To transform Pittsburgh into a startup hub, several strategies can be employed. Encouraging the youth-driven food boom is one approach, as it attracts young people, particularly those aged 25 to 29, who are crucial for startups. The city should also focus on maintaining its affordable yet desirable housing, preserving historic buildings, and enhancing its bicycle and pedestrian infrastructure to make it more appealing to young professionals. Additionally, leveraging Carnegie Mellon University's (CMU) strengths and striving to make it an even better research institution can attract ambitious talent. Finally, fostering a culture of tolerance and attracting investors, despite the current lack of a strong investor community, are also essential steps.

Regarding the two types of moderates, they are intentional moderates and accidental moderates. Intentional moderates deliberately choose positions midway between the extremes of right and left, while accidental moderates end up in the middle on average because they make up their own minds about each question, and the far right and far left are roughly equally wrong.

Other parameters include `chunk_size`, `similarity_top_k`, and `small_chunk_size`.
- `chunk_size`: To demonstrate how different documents are grouped together, documents are split into nodes of `chunk_size` tokens and then regrouped based on the relationships between the nodes. Because this step does not affect the final answer, it can be disabled by setting `chunk_size` to -1. The default is 4096.
- `similarity_top_k`: The number of large retrieval units to retrieve. The default is 8; based on the paper, the ideal range is 4-8.
- `small_chunk_size`: To compare similarities, each large retrieval unit is split into smaller child retrieval units of `small_chunk_size` tokens. The embeddings of these child units are compared to the query embedding, and the top k large parent units are chosen by the maximum score among their children. The default is 512.
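
The parent selection described for `small_chunk_size` can be sketched in plain Python. This is an illustration under stated assumptions, not the pack's code: `top_k_parents` is a hypothetical helper, and the similarity scores are made-up numbers standing in for query-to-chunk embedding similarities.

```python
from typing import Dict, List, Tuple


def top_k_parents(
    child_scores: List[Tuple[str, float]], k: int
) -> List[Tuple[str, float]]:
    """Rank parent units by the best score among their child chunks.

    child_scores holds (parent_id, similarity) pairs, one per small chunk.
    """
    best: Dict[str, float] = {}
    for parent_id, score in child_scores:
        # Keep only the maximum child score seen for each parent.
        if parent_id not in best or score > best[parent_id]:
            best[parent_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical similarities between a query and six small chunks.
scores = [
    ("unit_a", 0.62), ("unit_a", 0.91),
    ("unit_b", 0.75), ("unit_b", 0.70),
    ("unit_c", 0.40), ("unit_c", 0.88),
]
print(top_k_parents(scores, k=2))  # [('unit_a', 0.91), ('unit_c', 0.88)]
```

Note that `unit_b` has the highest average child score but is not selected, because ranking uses each parent's single best child rather than the mean.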

In [None]:
pack = LongRAGPack(data_dir="./data", chunk_size=-1, similarity_top_k=4)
query_str = (
    "How can Pittsburgh become a startup hub, and what are the two types of moderates?"
)
res = pack.run(query_str)
display(Markdown(str(res)))


Pittsburgh can become a startup hub by leveraging its increasing population of young people, particularly those aged 25 to 29, who are crucial for the startup ecosystem. The city should encourage the youth-driven food boom, streamline the permit process for new restaurants and cafes, and focus on historic preservation to maintain its unique character. Additionally, Pittsburgh should capitalize on its pre-car city layout to become the most bicycle and pedestrian-friendly city in the country. Carnegie Mellon University (CMU) can contribute by continuing to be a top-tier research university and attracting ambitious talent. The city should also foster a culture of tolerance and pragmatism, reminiscent of its industrial roots, and gradually build an investor community as startups grow and succeed.

There are two types of moderates: intentional moderates and accidental moderates. Intentional moderates deliberately choose positions midway between the extremes of right and left, often shifting their stance as the median opinion changes. Accidental moderates, on the other hand, form their opinions independently on each issue, resulting in a broad range of views that average out to a moderate position. Intentional moderates' opinions are predictable and acquired in bulk, while accidental moderates' opinions are more varied and individually chosen.