# Retrieval: Re-ranking
 
![re-ranking](../images/images-re-ranking.png)

**Re-ranking** is a process used in information retrieval systems to improve the relevance of search results. After an initial set of documents or items is retrieved based on a user's query, re-ranking involves reordering these results to better match the user's intent.

**How Re-ranking Works:**

1. **Initial Retrieval:**
   - The system retrieves a broad set of documents or items that potentially match the user's query.

2. **Scoring:**
   - Each retrieved item is scored against the query by a stronger, usually slower, relevance model (for example a cross-encoder) or by additional criteria.

3. **Re-ordering:**
   - Based on the new relevance scores, the items are reordered so that the most pertinent results appear at the top.
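
The three steps above can be sketched in a few lines of plain Python. This is only an illustration: `overlap_score` is a toy stand-in for a real relevance model, and the function names and sample texts are made up for this example.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: int = 3,
) -> List[Tuple[float, str]]:
    """Score each candidate against the query and return the top_n, best first."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Step 1: pretend these came back from a broad initial retrieval (hypothetical data)
candidates = [
    "Pasta recipes for beginners",
    "Task decomposition breaks a complex task into smaller steps",
    "What is task decomposition in agent planning",
]

# Steps 2 and 3: score every candidate, then reorder and keep the best two
results = rerank("what is task decomposition", candidates, top_n=2)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

In a real system the scoring function would typically call a dedicated re-ranking model rather than count shared words.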

**Benefits of Re-ranking:**

- **Enhanced Relevance:** By refining the order of search results, re-ranking moves the most relevant items to the top, so users see them first.
- **Improved User Satisfaction:** More accurate top results increase user satisfaction and the overall effectiveness of the retrieval system.

**Example in Practice:**

Imagine you're searching for "best Italian restaurants in New York." The initial search might retrieve a wide range of Italian restaurants. Re-ranking would then reorder these results to prioritize those with the highest ratings, best reviews, or closest proximity, ensuring that the top suggestions align closely with your intent.
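
As a rough sketch of that restaurant example, the snippet below re-orders an initial result list by a weighted mix of rating and proximity. The restaurant names, ratings, distances, and weights are all hypothetical values chosen for illustration, not a recommended scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Restaurant:
    name: str
    rating: float       # average rating on a 0-5 scale (made-up values)
    distance_km: float  # distance from the user (made-up values)


def combined_score(
    r: Restaurant,
    rating_weight: float = 0.7,
    distance_weight: float = 0.3,
    max_distance_km: float = 10.0,
) -> float:
    """Blend a normalised rating with a normalised proximity (closer is better)."""
    rating_part = r.rating / 5.0
    proximity_part = max(0.0, 1.0 - r.distance_km / max_distance_km)
    return rating_weight * rating_part + distance_weight * proximity_part


# Order in which the initial search happened to return the results
initial_results = [
    Restaurant("Trattoria A", rating=4.0, distance_km=1.0),
    Restaurant("Osteria B", rating=4.8, distance_km=2.0),
    Restaurant("Pizzeria C", rating=3.5, distance_km=0.5),
]

# Re-rank: highest combined score first
reranked = sorted(initial_results, key=combined_score, reverse=True)
for r in reranked:
    print(f"{combined_score(r):.3f}  {r.name}")
```

Changing the weights shifts the balance between quality and convenience, which is one way a re-ranker can be tuned toward a particular notion of user intent.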

Incorporating re-ranking into retrieval systems enhances their ability to deliver precise and user-centric results, making information access more efficient and intuitive. 

![re-ranking](../images/re-ranking.png)


## Setup

In [1]:
%run "../Z - Common/setup.ipynb"

!pip install -qU cohere dill==0.3.8 multiprocess==0.70.16

Stored 'enable_langsmith' (bool)


USER_AGENT environment variable not set, consider setting it to identify your requests.


In [2]:
docs = load_sample_data()
split_docs = split_sample_data(docs)
retriever = seed_sample_data(split_docs)

We touched on this subject earlier when we looked at RAG-Fusion, where we implemented a _reciprocal rank fusion_ function to re-rank results. Here, let's look at how to achieve the same using [Cohere Re-Rank](https://python.langchain.com/docs/integrations/retrievers/cohere-reranker#doing-reranking-with-coherererank).

Links:

- Cohere [blog](https://cohere.com/blog/rerank)
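
For comparison, reciprocal rank fusion can be sketched as follows. This is a minimal generic version and not necessarily the function used in the RAG-Fusion notebook; the constant `k=60` is a commonly used default, and the document IDs are placeholders.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(
    ranked_lists: List[List[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists: each appearance adds 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):  # ranks start at 1 here
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Three ranked result lists, e.g. one per generated query (placeholder IDs)
fused = reciprocal_rank_fusion(
    [
        ["a", "b", "c"],
        ["b", "a", "d"],
        ["b", "c", "a"],
    ]
)
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Rank fusion only uses the positions of documents in each list, whereas a model-based re-ranker scores the text of each document against the query.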

Let's start by defining the prompt and chain to retrieve related documents (in the same way as for the RAG-Fusion example):

In [3]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Prompt that asks the LLM to expand one question into several search queries
template = """You are a helpful assistant that generates multiple search queries based on a single input query. \n
Generate multiple search queries related to: {question} \n
Output a total of 4 queries. Just the queries as text, nothing else, separated via new line character:"""

prompt_generate_queries = ChatPromptTemplate.from_template(template)

# question -> prompt -> LLM -> plain string -> list with one query per line
chain_generate_queries = (
    prompt_generate_queries 
    | llm
    | StrOutputParser() 
    | (lambda x: x.split("\n"))
)

# Run the retriever once per generated query; the result is a list of document lists
chain_retrieval = (
    chain_generate_queries 
    | retriever.map()
)
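
Splitting on `"\n"` can yield empty strings when the model emits blank lines, and an empty query would still be sent to the retriever. A slightly more defensive parser might look like the sketch below; `parse_queries` is a hypothetical helper, not part of the notebook.

```python
from typing import List


def parse_queries(raw_output: str) -> List[str]:
    """Split LLM output into one query per line, dropping blank lines and stray whitespace."""
    queries = []
    for line in raw_output.split("\n"):
        cleaned = line.strip()
        if cleaned:
            queries.append(cleaned)
    return queries


# Example with a blank line, padding, and a trailing newline
parsed = parse_queries("query one\n\n  query two \nquery three\n")
print(parsed)
```

Because a plain function can be piped into a chain, a helper like this could take the place of the `lambda` in `chain_generate_queries`.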

Each generated query returns its own list of documents. Serialize each list to text and pass the results to the Cohere reranker.
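
Since the retrieval chain returns one list of documents per generated query, the same chunk can appear in several lists. One possible pre-processing step, which this notebook does not use, is to flatten the lists and drop duplicates so that each chunk is scored on its own. The sketch below uses a small `Doc` dataclass as a stand-in for a LangChain `Document`; both `Doc` and `flatten_unique` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Stand-in for a LangChain Document, used only for this illustration."""
    page_content: str


def flatten_unique(doc_lists: List[List[Doc]]) -> List[Doc]:
    """Flatten per-query result lists, keeping the first occurrence of each text."""
    seen = set()
    unique = []
    for docs_for_query in doc_lists:
        for doc in docs_for_query:
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                unique.append(doc)
    return unique


# Two queries whose results overlap on chunk "B" (placeholder content)
doc_lists = [[Doc("A"), Doc("B")], [Doc("B"), Doc("C")]]
unique_docs = flatten_unique(doc_lists)
texts = [d.page_content for d in unique_docs]
print(texts)
```

The resulting `texts` list could then be passed as the `documents` argument of a re-rank call, with each returned index mapping back to a single entry in `unique_docs`.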

In [4]:
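import os

import cohere
from langchain.load import dumps

# Cohere client that calls the model through Amazon Bedrock;
# credentials and region are read from environment variables
llm_reranker = cohere.BedrockClientV2(
    aws_access_key=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    aws_session_token=os.environ["AWS_SESSION_TOKEN"], 
    aws_region=os.environ["AWS_REGION"]
)

question = "What is task decomposition?"

# One list of retrieved documents per generated query (a list of lists)
docs = chain_retrieval.invoke({"question": question})
# print("docs: ", docs)

# Serialize each per-query list to a string, since the reranker scores text
docs_str = [dumps(doc) for doc in docs]

# Score every serialized list against the original question and keep the top 3
reranked_docs = llm_reranker.rerank(
    model="cohere.rerank-v3-5:0",
    query=question,
    documents=docs_str,
    top_n=3,
)

# print(reranked_docs)

for idx, doc in enumerate(reranked_docs.results):
    # print(doc)
    print("Sequence: ",  idx + 1)
    print("Relevance score: ", doc.relevance_score)
    # doc.index points back into docs_str (and docs); show the first document of that list
    print(docs[doc.index][0].page_content)
    print("\n")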

Sequence:  1
Relevance score:  0.8312667
Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a clas