# Ensemble Retrieval Guide

When building a RAG application, there are often many retrieval parameters and strategies to choose from (chunk size, or vector vs. keyword vs. hybrid search, for instance).

Thought: what if we could try a bunch of strategies at once, and have a reranker or LLM prune the results?

This achieves two purposes:
- Better (albeit more costly) retrieved results by pooling results from multiple strategies, assuming the reranker is good
- A way to benchmark different retrieval strategies against each other (with respect to the reranker)

This guide demonstrates the idea on The Great Gatsby. We do ensemble retrieval over different chunk sizes and also different indices.
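The pooling idea can be sketched without any library. What follows is a toy illustration only, not the LlamaIndex implementation: `score`, `pool_and_rerank`, and the sample strings are all hypothetical stand-ins, and a real reranker would be a model rather than word overlap.

```python
# Toy sketch of ensemble retrieval: pool candidates from several strategies,
# then let one scoring function (standing in for a reranker) prune them.
# Every name below is hypothetical and exists only for illustration.


def score(query: str, text: str) -> float:
    """Fraction of query words that appear in the text (toy relevance score)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def pool_and_rerank(query, results_by_strategy, top_n):
    """Merge results from every strategy, score them once, keep the best top_n."""
    pooled = []
    for strategy, texts in results_by_strategy.items():
        for text in texts:
            pooled.append(
                {"strategy": strategy, "text": text, "score": score(query, text)}
            )
    # Highest score first; sorted() is stable, so ties keep their pooled order.
    return sorted(pooled, key=lambda c: c["score"], reverse=True)[:top_n]


results_by_strategy = {
    "chunk_128": ["Gatsby looked at Daisy", "the beach that morning"],
    "chunk_1024": ["Daisy told Gatsby that she loved him", "Tom went inside"],
}
top = pool_and_rerank("Gatsby and Daisy", results_by_strategy, top_n=2)
for candidate in top:
    print(candidate["strategy"], round(candidate["score"], 2), candidate["text"])
```

Keeping track of which strategy each surviving candidate came from is what makes the second purpose (benchmarking strategies against each other) possible.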

## Setup

In [1]:
# NOTE: This is ONLY necessary in a Jupyter notebook.
# Details: Jupyter runs an event loop behind the scenes.
#          This results in nested event loops when we start an event loop to make async queries.
#          This is normally not allowed, so we use nest_asyncio to allow it for convenience.
import nest_asyncio

nest_asyncio.apply()

In [2]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().handlers = []
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index import (
    VectorStoreIndex,
    ListIndex,
    SimpleDirectoryReader,
    ServiceContext,
    StorageContext,
    SimpleKeywordTableIndex,
)
from llama_index.response.notebook_utils import display_response
from llama_index.llms import OpenAI



## Load Data

We first load the full text of The Great Gatsby as a Document. We then parse it into Nodes at several chunk sizes and build a vector index for each chunk size.

In [3]:
# try loading great gatsby

documents = SimpleDirectoryReader(
    input_files=["../../../examples/gatsby/gatsby_full.txt"]
).load_data()

In [4]:
# initialize service context (set chunk size)
llm = OpenAI(model="gpt-4")
chunk_sizes = [128, 256, 512, 1024]
service_contexts = []
nodes_list = []
vector_indices = []
query_engines = []
for chunk_size in chunk_sizes:
    print(f"Chunk Size: {chunk_size}")
    service_context = ServiceContext.from_defaults(chunk_size=chunk_size, llm=llm)
    service_contexts.append(service_context)
    nodes = service_context.node_parser.get_nodes_from_documents(documents)

    # add chunk size to nodes to track later
    for node in nodes:
        node.metadata["chunk_size"] = chunk_size
        node.excluded_embed_metadata_keys = ["chunk_size"]
        node.excluded_llm_metadata_keys = ["chunk_size"]

    nodes_list.append(nodes)

    # build vector index
    vector_index = VectorStoreIndex(nodes)
    vector_indices.append(vector_index)

    # query engines
    query_engines.append(vector_index.as_query_engine())

Chunk Size: 128
Chunk Size: 256
Chunk Size: 512
Chunk Size: 1024


In [5]:
# try ensemble retrieval

from llama_index.tools import RetrieverTool

retriever_tools = []
for chunk_size, vector_index in zip(chunk_sizes, vector_indices):
    retriever_tool = RetrieverTool.from_defaults(
        retriever=vector_index.as_retriever(),
        description=f"Retrieves relevant context from the Great Gatsby (chunk size {chunk_size})",
    )
    retriever_tools.append(retriever_tool)

In [6]:
from llama_index.selectors.pydantic_selectors import PydanticMultiSelector
from llama_index.retrievers import RouterRetriever


retriever = RouterRetriever(
    selector=PydanticMultiSelector.from_defaults(llm=llm, max_outputs=4),
    retriever_tools=retriever_tools,
)

In [7]:
nodes = await retriever.aretrieve(
    "Describe and summarize the interactions between Gatsby and Daisy"
)

Selecting retriever 0: This choice retrieves a moderate amount of context from the Great Gatsby, which could provide a balanced amount of detail for describing and summarizing the interactions between Gatsby and Daisy..
Selecting retriever 1: This choice retrieves a larger amount of context from the Great Gatsby, which could provide more detail for describing and summarizing the interactions between Gatsby and Daisy..
Selecting retriever 2: This choice retrieves an even larger amount of context from the Great Gatsby, which could provide a comprehensive summary of the interactions between Gatsby and Daisy..
Selecting retriever 3: This choice retrieves the largest amount of context from the Great Gatsby, which could provide the most detailed and comprehensive summary of the interactions between Gatsby and Daisy..

In [8]:
for node in nodes:
    print(node.node.metadata["chunk_size"])
    print(node.node.get_text())

128
the beach that morning. Finally we came to Gatsby’s own
apartment, a bedroom and a bath, and an Adam’s study, where we sat
down and drank a glass of some Chartreuse he took from a cupboard in
the wall.

He hadn’t once ceased looking at Daisy, and I think he revalued
everything in his house according to the measure of response it drew
from her well-loved eyes. Sometimes too, he stared around at his
possessions in a dazed
128
turn out as he had
imagined. He had intended, probably, to take what he could and go—but
now he found that he had committed himself to the following of a
grail. He knew that Daisy was extraordinary, but he didn’t realize
just how extraordinary a “nice” girl could be. She vanished into her
rich house, into her rich, full life, leaving Gatsby—nothing. He felt
married to her, that was all.

When they met again, two days later, it
256
the
direction. In this heat every extra gesture was an affront to the
common store of life.

The room, shadowed well with awnings, wa

In [9]:
# define reranker
from llama_index.indices.postprocessor import (
    LLMRerank,
    SentenceTransformerRerank,
    CohereRerank,
)

# reranker = LLMRerank()
# reranker = SentenceTransformerRerank(top_n=10)
reranker = CohereRerank(top_n=10)

In [10]:
# define RetrieverQueryEngine
from llama_index.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine(retriever, node_postprocessors=[reranker])

In [11]:
response = query_engine.query(
    "Describe and summarize the interactions between Gatsby and Daisy"
)

Selecting retriever 0: This choice provides a moderate chunk size that could contain relevant interactions between Gatsby and Daisy without being too overwhelming..
Selecting retriever 1: This choice provides a larger chunk size that could contain more detailed interactions between Gatsby and Daisy..
Selecting retriever 2: This choice provides an even larger chunk size that could contain extensive interactions between Gatsby and Daisy, providing a more comprehensive summary..
Selecting retriever 3: This choice provides the largest chunk size that could contain the most detailed and comprehensive interactions between Gatsby and Daisy, but it might also include a lot of irrelevant information..


In [12]:
display_response(
    response, show_source=True, source_length=500, show_source_metadata=True
)

**`Final Response:`** The interactions between Gatsby and Daisy are full of tension and emotion. Daisy is drawn to Gatsby's cool demeanor and his mysterious past, and Gatsby is captivated by Daisy's beauty and her voice, which he describes as "full of money." They are constantly aware of each other's presence, and when they are together, they are often lost in their own world. At one point, Daisy tells Gatsby that she loves him, and Tom Buchanan sees. Tom is astounded and looks between Gatsby and Daisy as if he has just recognized her from a long time ago. Daisy then innocently remarks that Gatsby looks like the advertisement of a man, and Tom quickly interrupts and suggests they all go to town. Daisy then begs Tom to let them have some fun, but he does not answer.

When they are all ready to leave, Daisy suggests that they take something to drink, and Tom goes inside to get whisky. Gatsby then turns to Nick and says he can't say anything in Tom's house. Nick remarks that Daisy has an indiscreet voice, and Gatsby responds that it is full of money. When they arrive at the party, Gatsby takes them ceremoniously from group to group, introducing them as Mr. and Mrs. Buchanan. Tom then remarks that he doesn't know anyone there, and Gatsby suggests they look around. Daisy is captivated by a gorgeous woman, and Gatsby reveals that the man bending over her is her director.

When they return to Gatsby's house, Gatsby turns on the lights and they all go into the music room. Gatsby lights Daisy's cigarette and they sit on a couch far away from the light. Klipspringer plays the piano, and Gatsby commands him to play. As Nick is about to leave, he notices that Gatsby and Daisy are lost in their own world, and Gatsby's expression of bewilderment has returned, as if he is doubting the quality of his present happiness. The tension between them is palpable, and it is clear that Gatsby is deeply in love with Daisy, while Daisy is drawn to Gatsby's mysterious past and his passionate devotion to her. They are both aware of the power of their connection, and it is clear that they are both deeply affected by it.

---

**`Source Node 1/8`**

**Node ID:** 237d8160-70be-4506-8639-2b9ef11bcbf8<br>**Similarity:** 0.9875684<br>**Text:** Daisy insistently. Gatsby’s eyes
floated toward her. “Ah,” she cried, “you look so cool.”

Their eyes met, and they stared together at each other, alone in
space. With an effort she glanced down at the table.

“You always look so cool,” she repeated.

She had told him that she loved him, and Tom Buchanan saw. He was
astounded. His mouth opened a little, and he looked at Gatsby, and
then back at Daisy as if he had just recognized her as someone he knew
a long time ago.

“You resemble the adver...<br>**Metadata:** {'chunk_size': 1024}<br>

---

**`Source Node 2/8`**

**Node ID:** e2fd481c-f587-4edb-8eba-07ed266e5d87<br>**Similarity:** 0.97419035<br>**Text:** world complete
in itself, with its own standards and its own great figures, second to
nothing because it had no consciousness of being so, and now I was
looking at it again, through Daisy’s eyes. It is invariably saddening
to look through new eyes at things upon which you have expended your
own powers of adjustment.

They arrived at twilight, and, as we strolled out among the sparkling
hundreds, Daisy’s voice was playing murmurous tricks in her throat.

“These things excite me so,” she whispe...<br>**Metadata:** {'chunk_size': 512}<br>

---

**`Source Node 3/8`**

**Node ID:** 3761970f-1920-4708-8f53-f59e6639ce11<br>**Similarity:** 0.9717254<br>**Text:** The
grey windows disappeared as the house glowed full of light.

In the music-room Gatsby turned on a solitary lamp beside the piano.
He lit Daisy’s cigarette from a trembling match, and sat down with her
on a couch far across the room, where there was no light save what the
gleaming floor bounced in from the hall.

When Klipspringer had played “The Love Nest” he turned around on the
bench and searched unhappily for Gatsby in the gloom.

“I’m all out of practice, you see. I told you I couldn’...<br>**Metadata:** {'chunk_size': 1024}<br>

---

**`Source Node 4/8`**

**Node ID:** 51691a7e-0ba1-4caa-b233-462bf45b452e<br>**Similarity:** 0.9529258<br>**Text:** go downstairs,” interrupted Gatsby. He flipped a switch. The
grey windows disappeared as the house glowed full of light.

In the music-room Gatsby turned on a solitary lamp beside the piano.
He lit Daisy’s cigarette from a trembling match, and sat down with her
on a couch far across the room, where there was no light save what the
gleaming floor bounced in from the hall.

When Klipspringer had played “The Love Nest” he turned around on the
bench and searched unhappily for Gatsby in the gloom....<br>**Metadata:** {'chunk_size': 512}<br>

---

**`Source Node 5/8`**

**Node ID:** e6a478a2-fadc-442c-8c47-9f44eba98190<br>**Similarity:** 0.93245333<br>**Text:** the beach that morning. Finally we came to Gatsby’s own
apartment, a bedroom and a bath, and an Adam’s study, where we sat
down and drank a glass of some Chartreuse he took from a cupboard in
the wall.

He hadn’t once ceased looking at Daisy, and I think he revalued
everything in his house according to the measure of response it drew
from her well-loved eyes. Sometimes too, he stared around at his
possessions in a dazed...<br>**Metadata:** {'chunk_size': 128}<br>

---

**`Source Node 6/8`**

**Node ID:** 978bcc28-e5ab-46a0-830c-4420c5099311<br>**Similarity:** 0.93033177<br>**Text:** In the meantime, In between time—”

As I went over to say goodbye I saw that the expression of
bewilderment had come back into Gatsby’s face, as though a faint doubt
had occurred to him as to the quality of his present happiness. Almost
five years! There must have been moments even that afternoon when
Daisy tumbled short of his dreams—not through her own fault, but
because of the colossal vitality of his illusion. It had gone beyond
her, beyond everything. He had thrown himself into it with a...<br>**Metadata:** {'chunk_size': 256}<br>

---

**`Source Node 7/8`**

**Node ID:** 94550d7f-2e92-4c27-8249-57d1c3d7a709<br>**Similarity:** 0.92135763<br>**Text:** turn out as he had
imagined. He had intended, probably, to take what he could and go—but
now he found that he had committed himself to the following of a
grail. He knew that Daisy was extraordinary, but he didn’t realize
just how extraordinary a “nice” girl could be. She vanished into her
rich house, into her rich, full life, leaving Gatsby—nothing. He felt
married to her, that was all.

When they met again, two days later, it...<br>**Metadata:** {'chunk_size': 128}<br>

---

**`Source Node 8/8`**

**Node ID:** f9e88ebd-cc13-493f-a4c2-483124fed766<br>**Similarity:** 0.82290184<br>**Text:** the
direction. In this heat every extra gesture was an affront to the
common store of life.

The room, shadowed well with awnings, was dark and cool. Daisy and
Jordan lay upon an enormous couch, like silver idols weighing down
their own white dresses against the singing breeze of the fans.

“We can’t move,” they said together.

Jordan’s fingers, powdered white over their tan, rested for a moment
in mine.

“And Mr. Thomas Buchanan, the athlete?” I inquired.

Simultaneously I heard his voice, g...<br>**Metadata:** {'chunk_size': 256}<br>

In [13]:
# compute the reciprocal rank for each chunk size based on positioning in combined ranking
import pandas as pd


def mrr_all(metadata_values, metadata_key, source_nodes):
    # source_nodes is a ranked list
    # for each value, find the position of its first occurrence in source_nodes
    value_to_mrr_dict = {}
    for metadata_value in metadata_values:
        mrr = 0
        for idx, source_node in enumerate(source_nodes):
            if source_node.node.metadata[metadata_key] == metadata_value:
                mrr = 1 / (idx + 1)
                break

        # store the reciprocal rank (stays 0 if the value never appears)
        value_to_mrr_dict[metadata_value] = mrr

    df = pd.DataFrame(value_to_mrr_dict, index=["MRR"])
    return df

In [14]:
# Compute the Mean Reciprocal Rank for each chunk size (higher is better)
# we can see that chunk size of 1024 has the highest ranked results.
print("Mean Reciprocal Rank for each Chunk Size")
mrr_all(chunk_sizes, "chunk_size", response.source_nodes)

Mean Reciprocal Rank for each Chunk Size


|     | 128 | 256      | 512 | 1024 |
|-----|-----|----------|-----|------|
| MRR | 0.2 | 0.166667 | 0.5 | 1.0  |
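As a sanity check, these values can be reproduced from the order of chunk sizes in the eight source nodes displayed earlier (1024, 512, 1024, 512, 128, 256, 128, 256). The sketch below uses plain integers in place of node objects. `reciprocal_rank_by_value` is a hypothetical helper written for this illustration, not part of LlamaIndex.

```python
# Chunk size of each source node, in the reranked order displayed earlier.
ranked_chunk_sizes = [1024, 512, 1024, 512, 128, 256, 128, 256]


def reciprocal_rank_by_value(values, ranked):
    """Return 1 / (rank of first occurrence) for each value, or 0.0 if absent."""
    result = {}
    for value in values:
        if value in ranked:
            result[value] = 1 / (ranked.index(value) + 1)
        else:
            result[value] = 0.0
    return result


rr = reciprocal_rank_by_value([128, 256, 512, 1024], ranked_chunk_sizes)
print(rr)
```

Chunk size 128 first appears at rank 5 (1/5 = 0.2), 256 at rank 6 (1/6 ≈ 0.1667), 512 at rank 2 (0.5), and 1024 at rank 1 (1.0). With a single query, each value is a plain reciprocal rank. Averaging the same quantity over many queries would give a mean.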


## Compare Against Baseline

We compare against a baseline: the single vector index with chunk size 1024, retrieving the top 2 nodes (k=2).

In [15]:
query_engine_1024 = query_engines[-1]

In [16]:
response_1024 = query_engine_1024.query(
    "Describe and summarize the interactions between Gatsby and Daisy"
)

In [17]:
display_response(response_1024, show_source=True, source_length=500)

**`Final Response:`** Gatsby and Daisy have a passionate and intense relationship. They are deeply in love, and Daisy is drawn to Gatsby's mysterious and glamorous lifestyle. Gatsby is devoted to Daisy, and he is willing to do anything to make her happy. In the scene described, Daisy and Gatsby are alone in the music room, and Gatsby has lit Daisy's cigarette from a trembling match. Klipspringer is playing the piano, and Gatsby tells him to stop talking and start playing. Daisy compliments Gatsby on his coolness, and they share a moment of intense eye contact. Tom Buchanan then interrupts them, and Daisy and Gatsby are forced to part. Gatsby is left with a look of bewilderment on his face, as if he is questioning the quality of his happiness. Daisy and Gatsby then part ways, with Daisy holding out her hand to Gatsby and Gatsby not recognizing the reporter who is watching them. The reporter is left with the impression that Gatsby and Daisy are deeply in love and that their relationship is full of emotion and passion.

---

**`Source Node 1/2`**

**Node ID:** 3761970f-1920-4708-8f53-f59e6639ce11<br>**Similarity:** 0.8610318997979541<br>**Text:** The
grey windows disappeared as the house glowed full of light.

In the music-room Gatsby turned on a solitary lamp beside the piano.
He lit Daisy’s cigarette from a trembling match, and sat down with her
on a couch far across the room, where there was no light save what the
gleaming floor bounced in from the hall.

When Klipspringer had played “The Love Nest” he turned around on the
bench and searched unhappily for Gatsby in the gloom.

“I’m all out of practice, you see. I told you I couldn’...<br>

---

**`Source Node 2/2`**

**Node ID:** 237d8160-70be-4506-8639-2b9ef11bcbf8<br>**Similarity:** 0.8589377236179734<br>**Text:** Daisy insistently. Gatsby’s eyes
floated toward her. “Ah,” she cried, “you look so cool.”

Their eyes met, and they stared together at each other, alone in
space. With an effort she glanced down at the table.

“You always look so cool,” she repeated.

She had told him that she loved him, and Tom Buchanan saw. He was
astounded. His mouth opened a little, and he looked at Gatsby, and
then back at Daisy as if he had just recognized her as someone he knew
a long time ago.

“You resemble the adver...<br>
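One quick way to relate the two runs is to check where the baseline's source nodes landed in the ensemble ranking. The sketch below hard-codes the node IDs from the outputs shown in this guide. The variable names are illustrative only, and node IDs will differ on a fresh run.

```python
# Node IDs copied from the displayed outputs, in ranked order.
ensemble_ids = [
    "237d8160-70be-4506-8639-2b9ef11bcbf8",
    "e2fd481c-f587-4edb-8eba-07ed266e5d87",
    "3761970f-1920-4708-8f53-f59e6639ce11",
    "51691a7e-0ba1-4caa-b233-462bf45b452e",
    "e6a478a2-fadc-442c-8c47-9f44eba98190",
    "978bcc28-e5ab-46a0-830c-4420c5099311",
    "94550d7f-2e92-4c27-8249-57d1c3d7a709",
    "f9e88ebd-cc13-493f-a4c2-483124fed766",
]
baseline_ids = [
    "3761970f-1920-4708-8f53-f59e6639ce11",
    "237d8160-70be-4506-8639-2b9ef11bcbf8",
]

# 1-based rank of each baseline node within the ensemble ranking (None if absent).
baseline_ranks = {
    node_id: (ensemble_ids.index(node_id) + 1) if node_id in ensemble_ids else None
    for node_id in baseline_ids
}
print(baseline_ranks)
```

Both baseline nodes also appear in the ensemble result, at ranks 3 and 1. The ensemble adds six more nodes drawn from the smaller chunk sizes.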