# LLM Reranker Demonstration

This tutorial demonstrates two-stage retrieval. First, use embedding-based retrieval with a high top-k value
to maximize recall and collect a large set of candidate nodes. Then, use LLM-based reranking
to select, from those candidates, the nodes that are actually relevant to the query.
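
As a rough, library-independent sketch of the two stages described above (not how LlamaIndex implements them): a cheap scorer ranks every chunk and keeps a generous top-k, then a more expensive scorer rescoring only those candidates decides the final top-n. The names `two_stage_retrieve`, `word_overlap`, and `toy_judge` are hypothetical; the two scorers are toy stand-ins for embedding similarity and an LLM relevance judgment.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def two_stage_retrieve(
    query: str,
    chunks: List[str],
    cheap_score: Scorer,
    expensive_score: Scorer,
    top_k: int = 10,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Keep top_k chunks by cheap_score, then rerank them with expensive_score."""
    # Stage 1: high-recall candidate selection over every chunk.
    candidates = sorted(chunks, key=lambda c: cheap_score(query, c), reverse=True)[:top_k]
    # Stage 2: precise scoring, applied only to the small candidate set.
    rescored = [(c, expensive_score(query, c)) for c in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_n]


def word_overlap(query: str, chunk: str) -> float:
    """Toy stand-in for embedding similarity: count of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(chunk.lower().split())))


def toy_judge(query: str, chunk: str) -> float:
    """Toy stand-in for an LLM relevance judgment (hypothetical rule, demo only)."""
    return 10.0 if "daisy" in chunk.lower() else 0.0


chunks = [
    "The car that hit Myrtle did not stop.",
    "Was Daisy driving? Yes, he said after a moment.",
    "Gatsby bought a mansion in West Egg.",
]
results = two_stage_retrieve(
    "Who was driving the car that hit Myrtle?",
    chunks,
    word_overlap,
    toy_judge,
    top_k=2,
    top_n=1,
)
print(results)
```

The first stage alone would rank the "did not stop" chunk highest because it shares the most words with the query; the second stage promotes the chunk that actually answers the question.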

In [1]:
import nest_asyncio
nest_asyncio.apply()

In [2]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, ServiceContext, LLMPredictor
from llama_index.indices.postprocessor import (
    LLMRerank
)
from langchain.chat_models import ChatOpenAI
from IPython.display import Markdown, display


## Load Data, Build Index

In [16]:
# LLM Predictor (gpt-3.5-turbo) + service context
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, chunk_size_limit=512)

Unknown max input size for gpt-3.5-turbo, using defaults.


In [17]:
# load documents
documents = SimpleDirectoryReader('../../../examples/gatsby/data').load_data()

In [18]:
documents

[Document(text='\n                                 VII\n\nIt was when curiosity about Gatsby was at its highest that the lights\nin his house failed to go on one Saturday night—and, as obscurely as\nit had begun, his career as Trimalchio was over. Only gradually did I\nbecome aware that the automobiles which turned expectantly into his\ndrive stayed for just a minute and then drove sulkily away. Wondering\nif he were sick I went over to find out—an unfamiliar butler with a\nvillainous face squinted at me suspiciously from the door.\n\n“Is Mr. Gatsby sick?”\n\n“Nope.” After a pause he added “sir” in a dilatory, grudging way.\n\n“I hadn’t seen him around, and I was rather worried. Tell him Mr.\nCarraway came over.”\n\n“Who?” he demanded rudely.\n\n“Carraway.”\n\n“Carraway. All right, I’ll tell him.”\n\nAbruptly he slammed the door.\n\nMy Finn informed me that Gatsby had dismissed every servant in his\nhouse a week ago and replaced them with half a dozen others, who never\nwent into West 

In [19]:
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 49266 tokens
> [build_index_from_nodes] Total embedding token usage: 49266 tokens


## Retrieval

In [20]:
from llama_index.retrievers import VectorIndexRetriever
from llama_index.indices.query.schema import QueryBundle

def get_retrieved_nodes(query_str, vector_top_k=10, reranker_top_n=3, with_reranker=False):
    query_bundle = QueryBundle(query_str)
    # configure retriever
    retriever = VectorIndexRetriever(
        index=index, 
        similarity_top_k=vector_top_k,
    )
    retrieved_nodes = retriever.retrieve(query_bundle)

    if with_reranker:
        # configure reranker
        reranker = LLMRerank(choice_batch_size=5, top_n=reranker_top_n, service_context=service_context)
        retrieved_nodes = reranker.postprocess_nodes(retrieved_nodes, query_bundle)
    
    return retrieved_nodes


def visualize_retrieved_nodes(nodes) -> None:
    for node in nodes:
        print(f'\n\n****Score****: {node.score}\n****Node text****\n: {node.node.get_text()}')
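
The `choice_batch_size` and `top_n` arguments passed to `LLMRerank` suggest that candidates are scored a few at a time and only the best few are kept. A minimal sketch of that batching pattern follows, assuming a hypothetical `score_batch` callable in place of the LLM call; it illustrates the idea and is not LlamaIndex's actual implementation.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_in_batches(
    items: Sequence[str],
    score_batch: Callable[[Sequence[str]], List[float]],
    batch_size: int = 5,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score items batch by batch, then return the top_n highest-scoring items."""
    scored: List[Tuple[str, float]] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # One scorer call per batch (in the real setting, one LLM call).
        scores = score_batch(batch)
        scored.extend(zip(batch, scores))
    # Merge all batches and keep only the best top_n.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def length_scorer(batch: Sequence[str]) -> List[float]:
    """Hypothetical scorer for the demo: longer text gets a higher score."""
    return [float(len(text)) for text in batch]


items = ["a", "bbbb", "cc", "ddddd", "eee", "ffffff", "g"]
top = rerank_in_batches(items, length_scorer, batch_size=5, top_n=3)
print(top)
```

With 7 items and a batch size of 5, the scorer is called twice (5 items, then 2), which mirrors how 10 retrieved nodes and `choice_batch_size=5` would require two scoring calls.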

In [21]:
new_nodes = get_retrieved_nodes(
    "Who was driving the car that hit Myrtle?", vector_top_k=3, with_reranker=False
)

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 10 tokens
> [retrieve] Total embedding token usage: 10 tokens


In [22]:
visualize_retrieved_nodes(new_nodes)



****Score****: 0.8288724419764687
****Node text****
: and some garrulous man telling over and over what
had happened, until it became less and less real even to him and he
could tell it no longer, and Myrtle Wilson’s tragic achievement was
forgotten. Now I want to go back a little and tell what happened at
the garage after we left there the night before.

They had difficulty in locating the sister, Catherine. She must have
broken her rule against drinking that night, for when she arrived she
was stupid with liquor and unable to understand that the ambulance had
already gone to Flushing. When they convinced her of this, she
immediately fainted, as if that was the intolerable part of the
affair. Someone, kind or curious, took her in his car and drove her in
the wake of her sister’s body.

Until long after midnight a changing crowd lapped up against the front
of the garage, while George Wilson rocked himself back and forth on
the couch inside. For a while the door of the office was open

In [25]:
new_nodes = get_retrieved_nodes(
    "Who was driving the car that hit Myrtle?", vector_top_k=10, reranker_top_n=3, with_reranker=True
)

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 10 tokens
> [retrieve] Total embedding token usage: 10 tokens
Doc: 3, Relevance: 10
No relevant documents for this question.
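
The `Doc: 3, Relevance: 10` line above is the raw LLM answer printed during reranking. A minimal sketch of how lines in that shape could be turned into (document number, relevance) pairs is shown here; the regular expression and the `parse_choices` name are assumptions for illustration, not the library's own parser.

```python
import re
from typing import List, Tuple

# Matches lines shaped like "Doc: 3, Relevance: 10" (assumed format).
_CHOICE_LINE = re.compile(r"^\s*Doc:\s*(\d+)\s*,\s*Relevance:\s*(\d+(?:\.\d+)?)\s*$")


def parse_choices(raw: str) -> List[Tuple[int, float]]:
    """Return (doc_number, relevance) for each well-formed line; skip everything else."""
    pairs: List[Tuple[int, float]] = []
    for line in raw.splitlines():
        match = _CHOICE_LINE.match(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


raw = "Doc: 3, Relevance: 10\nNo relevant documents for this question."
parsed = parse_choices(raw)
print(parsed)
```

Lines that do not match, such as the trailing "No relevant documents" sentence, are ignored rather than treated as errors.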


In [26]:
visualize_retrieved_nodes(new_nodes)



****Score****: 10.0
****Node text****
: went on, “and left the car in
my garage. I don’t think anybody saw us, but of course I can’t be
sure.”

I disliked him so much by this time that I didn’t find it necessary to
tell him he was wrong.

“Who was the woman?” he inquired.

“Her name was Wilson. Her husband owns the garage. How the devil did
it happen?”

“Well, I tried to swing the wheel—” He broke off, and suddenly I
guessed at the truth.

“Was Daisy driving?”

“Yes,” he said after a moment, “but of course I’ll say I was. You see,
when we left New York she was very nervous and she thought it would
steady her to drive—and this woman rushed out at us just as we were
passing a car coming the other way. It all happened in a minute, but
it seemed to me that she wanted to speak to us, thought we were
somebody she knew. Well, first Daisy turned away from the woman toward
the other car, and then she lost her nerve and turned back. The second
my hand reached the wheel I felt the shock—it must

In [27]:
new_nodes = get_retrieved_nodes(
    "What did Gatsby want Daisy to do in front of Tom?", vector_top_k=3, with_reranker=False
)

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 14 tokens
> [retrieve] Total embedding token usage: 14 tokens


In [28]:
visualize_retrieved_nodes(new_nodes)



****Score****: 0.8647796939111776
****Node text****
: got to make your house into a pigsty in order to have any
friends—in the modern world.”

Angry as I was, as we all were, I was tempted to laugh whenever he
opened his mouth. The transition from libertine to prig was so
complete.

“I’ve got something to tell you, old sport—” began Gatsby. But Daisy
guessed at his intention.

“Please don’t!” she interrupted helplessly. “Please let’s all go
home. Why don’t we all go home?”

“That’s a good idea,” I got up. “Come on, Tom. Nobody wants a drink.”

“I want to know what Mr. Gatsby has to tell me.”

“Your wife doesn’t love you,” said Gatsby. “She’s never loved you.
She loves me.”

“You must be crazy!” exclaimed Tom automatically.

Gatsby sprang to his feet, vivid with excitement.

“She never loved you, do you hear?” he cried. “She only married you
because I was poor and she was tired of waiting for me. It was a
terrible mistake, but in her heart she never loved anyone except me!”

At this p

In [29]:
new_nodes = get_retrieved_nodes(
    "What did Gatsby want Daisy to do in front of Tom?", vector_top_k=10, reranker_top_n=3, with_reranker=True
)

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 14 tokens
> [retrieve] Total embedding token usage: 14 tokens
Doc: 2, Relevance: 10
No relevant documents found. Please provide a different question.


In [30]:
visualize_retrieved_nodes(new_nodes)



****Score****: 10.0
****Node text****
: to keep your
shoes dry?” There was a husky tenderness in his tone … “Daisy?”

“Please don’t.” Her voice was cold, but the rancour was gone from it.
She looked at Gatsby. “There, Jay,” she said—but her hand as she tried
to light a cigarette was trembling. Suddenly she threw the cigarette
and the burning match on the carpet.

“Oh, you want too much!” she cried to Gatsby. “I love you now—isn’t
that enough? I can’t help what’s past.” She began to sob
helplessly. “I did love him once—but I loved you too.”

Gatsby’s eyes opened and closed.

“You loved me too?” he repeated.

“Even that’s a lie,” said Tom savagely. “She didn’t know you were
alive. Why—there’s things between Daisy and me that you’ll never know,
things that neither of us can ever forget.”

The words seemed to bite physically into Gatsby.

“I want to speak to Daisy alone,” he insisted. “She’s all excited
now—”

“Even alone I can’t say I never loved Tom,” she admitted in a pitiful
voice. “

In [33]:
new_nodes = get_retrieved_nodes(
    "What did Tom admit when Nick met him for the last time?", vector_top_k=3, with_reranker=False
)

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 13 tokens
> [retrieve] Total embedding token usage: 13 tokens


In [34]:
visualize_retrieved_nodes(new_nodes)



****Score****: 0.8486235346565064
****Node text****
: guess. I thought you were rather an honest,
straightforward person. I thought it was your secret pride.”

“I’m thirty,” I said. “I’m five years too old to lie to myself and
call it honour.”

She didn’t answer. Angry, and half in love with her, and tremendously
sorry, I turned away.

------------------------------------------------------------------------

One afternoon late in October I saw Tom Buchanan. He was walking ahead
of me along Fifth Avenue in his alert, aggressive way, his hands out a
little from his body as if to fight off interference, his head moving
sharply here and there, adapting itself to his restless eyes. Just as
I slowed up to avoid overtaking him he stopped and began frowning into
the windows of a jewellery store. Suddenly he saw me and walked back,
holding out his hand.

“What’s the matter, Nick? Do you object to shaking hands with me?”

“Yes. You know what I think of you.”

“You’re crazy, Nick,” he said quic

In [36]:
new_nodes = get_retrieved_nodes(
    "What did Tom admit when Nick met him for the last time?", vector_top_k=10, reranker_top_n=3, with_reranker=True
)

INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 13 tokens
> [retrieve] Total embedding token usage: 13 tokens
Document 1: 
Tom admits to telling Wilson the truth about who owned the car that hit Myrtle. (Relevance: 10)

Document 2: 
Tom talks about Gatsby and accuses him of causing a row in his house. (Relevance: 2)

Document 3: 
Tom accuses Gatsby of trying to make love to his wife and causing a row in his house. (Relevance: 4)

Document 4: 
Tom accuses Gatsby of having something on him that Walter is afraid to tell him about. (Relevance: 1)

Document 5: 
Nick reflects on turning thirty and the promise of a decade of loneliness. (Relevance: 0)

Therefore, the relevant documents are 1, 3, and 2 with relevance scores of 10, 4, and 2 respectively.
No relevant documents provided.
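
In the response above, the model answered in a free-form `Document N: ... (Relevance: M)` layout rather than the `Doc: N, Relevance: M` lines seen in the earlier cells. Whether the library parses this layout is not shown here. If you post-process such output yourself, one option is a more tolerant pattern; the sketch below assumes that layout, and both the regex and the `parse_verbose_choices` name are hypothetical.

```python
import re
from typing import List, Tuple

# "Document N:" on one line, then a summary line ending in "(Relevance: M)".
_VERBOSE = re.compile(
    r"Document\s+(\d+):[ \t]*\n[^\n]*\(Relevance:\s*(\d+(?:\.\d+)?)\)"
)


def parse_verbose_choices(raw: str, top_n: int = 3) -> List[Tuple[int, float]]:
    """Extract (doc_number, relevance) pairs and keep the top_n by relevance."""
    pairs = [(int(doc), float(rel)) for doc, rel in _VERBOSE.findall(raw)]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:top_n]


raw = (
    "Document 1: \n"
    "Tom admits to telling Wilson the truth about who owned the car that hit Myrtle. (Relevance: 10)\n"
    "\n"
    "Document 2: \n"
    "Tom talks about Gatsby and accuses him of causing a row in his house. (Relevance: 2)\n"
    "\n"
    "Document 3: \n"
    "Tom accuses Gatsby of trying to make love to his wife and causing a row in his house. (Relevance: 4)\n"
)
ranked = parse_verbose_choices(raw)
print(ranked)
```

On this sample the ranking comes out as documents 1, 3, 2, which agrees with the model's own closing sentence.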


In [35]:
visualize_retrieved_nodes(new_nodes)



****Score****: 0.8486235346565064
****Node text****
: guess. I thought you were rather an honest,
straightforward person. I thought it was your secret pride.”

“I’m thirty,” I said. “I’m five years too old to lie to myself and
call it honour.”

She didn’t answer. Angry, and half in love with her, and tremendously
sorry, I turned away.

------------------------------------------------------------------------

One afternoon late in October I saw Tom Buchanan. He was walking ahead
of me along Fifth Avenue in his alert, aggressive way, his hands out a
little from his body as if to fight off interference, his head moving
sharply here and there, adapting itself to his restless eyes. Just as
I slowed up to avoid overtaking him he stopped and began frowning into
the windows of a jewellery store. Suddenly he saw me and walked back,
holding out his hand.

“What’s the matter, Nick? Do you object to shaking hands with me?”

“Yes. You know what I think of you.”

“You’re crazy, Nick,” he said quic

## Query Engine

In [None]:
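# Reranker used as a node postprocessor (same settings as in get_retrieved_nodes);
# it must be defined here because the earlier one is local to that function
reranker = LLMRerank(choice_batch_size=5, top_n=3, service_context=service_context)

# With reranking: retrieve 10 candidates, then let the LLM keep the top 3
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[reranker],
    response_mode="tree_summarize",
)
response = query_engine.query(
    "What did Tom admit when Nick met him for the last time?",
)

In [None]:
# Baseline without reranking: top 3 nodes by embedding similarity only
query_engine = index.as_query_engine(
    similarity_top_k=3,
    response_mode="tree_summarize",
)
response = query_engine.query(
    "What did Tom admit when Nick met him for the last time?",
)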

In [None]:
retrieval =