### Re-ranking
Same as for RAG-Fusion.



##### Re-ranking using FlashRank
FlashRank is an ultra-lite, very fast Python library for adding re-ranking to existing search and retrieval pipelines. It is based on state-of-the-art (SoTA) cross-encoders. The example below uses FlashRank for document compression and retrieval.
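
A cross-encoder scores each (query, passage) pair jointly, so re-ranking is usually run as a second stage over a small set of candidates. The sketch below illustrates only that two-stage pattern in plain Python. It does not use FlashRank: `overlap_score` is a toy stand-in for a cross-encoder, and the helper names and sample passages are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, passage):
    """Toy stand-in for a cross-encoder: fraction of query tokens found in the passage."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(passage)) / len(query_tokens)


def rerank(query, passages, top_n=3):
    """Score every (query, passage) pair, sort by score, and keep the top_n passages."""
    scored = [(overlap_score(query, p), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical candidates, as a first-stage retriever might return them
candidates = [
    "By the end of this year, the deficit will be down.",
    "Vice President Harris and I ran for office with a new economic vision.",
    "I signed 80 bipartisan bills into law last year.",
]

for score, passage in rerank("What did the president say about Harris?", candidates, top_n=2):
    print(f"{score:.2f}  {passage}")
```

A real re-ranker replaces `overlap_score` with a model that reads the query and passage together; the surrounding sort-and-truncate logic stays the same.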

In [None]:
!pip install --upgrade --quiet  flashrank

In [9]:
import os

import google.generativeai as genai
from dotenv import load_dotenv, dotenv_values
from IPython.display import Markdown, display

# Workaround for the "duplicate OpenMP runtime" error that can occur on some systems
os.environ["KMP_DUPLICATE_LIB_OK"] = "True"

# Read GOOGLE_API_KEY from a local .env file and configure the Gemini client
load_dotenv()
my_api_key = os.getenv("GOOGLE_API_KEY")
genai.configure(api_key=my_api_key)

In [2]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI

# Embedding model
embedding = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
# Chat LLM
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

In [3]:
# Helper function for printing docs


def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            [
                f"Document {i+1}:\n\n{d.page_content}\nMetadata: {d.metadata}"
                for i, d in enumerate(docs)
            ]
        )
    )

In [4]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

documents = TextLoader(
    "Data/state_of_the_union.txt", encoding="UTF-8"
).load()
# Split into overlapping 500-character chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(documents)
# Tag each chunk with an id so rankings can be compared before and after re-ranking
for idx, text in enumerate(texts):
    text.metadata["id"] = idx
# Base retriever: fetch the 20 nearest chunks as candidates for the re-ranker
retriever = FAISS.from_documents(texts, embedding).as_retriever(search_kwargs={"k": 20})

query = "What did the president say about Harris?"
docs = retriever.invoke(query)
pretty_print_docs(docs)

Document 1:

But that trickle-down theory led to weaker economic growth, lower wages, bigger deficits, and the widest gap between those at the top and everyone else in nearly a century. 

Vice President Harris and I ran for office with a new economic vision for America. 

Invest in America. Educate Americans. Grow the workforce. Build the economy from the bottom up  
and the middle out, not from the top down.
Metadata: {'source': 'Data/state_of_the_union.txt', 'id': 23}
----------------------------------------------------------------------------------------------------
Document 2:

As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. 

While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military

##### Reranking with FlashRank

In [None]:
!pip install langchain-community -qU
#pip install langchain-google-genai langchain-core -qU

In [5]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import FlashrankRerank

# Re-rank the base retriever's candidates and keep only the best-scoring ones
compressor = FlashrankRerank()
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.invoke(query)
# ids of the chunks that survive re-ranking, in ranked order
print([doc.metadata["id"] for doc in compressed_docs])

[23, 50, 3]


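After re-ranking, only three documents remain (ids 23, 50, and 3), and they are ordered by the re-ranker's relevance score rather than by embedding similarity. Document 23 stays in first place, but the second result is no longer the one returned by the base retriever.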
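
One way to see what the re-ranker changed is to compare each chunk's position before and after. The following is a minimal sketch: `rank_changes` is an illustrative helper, not a LangChain function, and `base_ids` is a hypothetical ordering, since the output above does not list the base retriever's ids.

```python
def rank_changes(base_ids, reranked_ids):
    """Return (doc_id, old_rank, new_rank) for each re-ranked document, ranks starting at 1."""
    old_rank = {doc_id: pos for pos, doc_id in enumerate(base_ids, start=1)}
    return [
        (doc_id, old_rank.get(doc_id), new_pos)
        for new_pos, doc_id in enumerate(reranked_ids, start=1)
    ]


# Hypothetical base ordering for illustration; in the notebook it would be
# [doc.metadata["id"] for doc in docs]
base_ids = [23, 41, 8, 50, 17, 3]
reranked_ids = [23, 50, 3]  # ids printed by the re-ranking cell

for doc_id, old, new in rank_changes(base_ids, reranked_ids):
    print(f"id {doc_id}: rank {old} -> {new}")
```

An `old_rank` of `None` would mean the id was not among the base candidates, which should not happen when the re-ranker only reorders what the base retriever returned.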

In [6]:
pretty_print_docs(compressed_docs)

Document 1:

But that trickle-down theory led to weaker economic growth, lower wages, bigger deficits, and the widest gap between those at the top and everyone else in nearly a century. 

Vice President Harris and I ran for office with a new economic vision for America. 

Invest in America. Educate Americans. Grow the workforce. Build the economy from the bottom up  
and the middle out, not from the top down.
Metadata: {'source': 'Data/state_of_the_union.txt', 'id': 23, 'relevance_score': 0.9970603}
----------------------------------------------------------------------------------------------------
Document 2:

And tonight, I’m announcing that the Justice Department will name a chief prosecutor for pandemic fraud. 

By the end of this year, the deficit will be down to less than half what it was before I took office.  

The only president ever to cut the deficit by more than one trillion dollars in a single year. 

Lowering your costs also means demanding more competition. 

I’m a capit
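
The re-ranker attaches a `relevance_score` to each document's metadata, as the output above shows. A minimal sketch of using that score to drop weak matches follows. `Doc` is a stand-in for a LangChain `Document`, `filter_by_score` is a hypothetical helper, and only the first score is taken from the output above; the other two are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_by_score(docs, min_score):
    """Keep documents whose 'relevance_score' metadata is at least min_score."""
    return [d for d in docs if d.metadata.get("relevance_score", 0.0) >= min_score]


# Only the first score comes from the notebook output; the other two are made up
sample_docs = [
    Doc("chunk 23", {"id": 23, "relevance_score": 0.9970603}),
    Doc("chunk 50", {"id": 50, "relevance_score": 0.42}),
    Doc("chunk 3", {"id": 3, "relevance_score": 0.05}),
]

kept = filter_by_score(sample_docs, min_score=0.3)
print([d.metadata["id"] for d in kept])
```

The threshold of 0.3 is arbitrary; a sensible value depends on the re-ranking model and the data.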

##### QA reranking with FlashRank

In [7]:
from langchain.chains import RetrievalQA

chain = RetrievalQA.from_chain_type(llm=llm, retriever=compression_retriever)

In [8]:
chain.invoke(query)

{'query': 'What did the president say about Harris?',
 'result': 'The provided text states that the president and Vice President Harris "ran for office with a new economic vision for America." \n'}