Start by retrieving a set of candidate documents based on a query. This can be done using traditional methods like TF-IDF or BM25 for keyword-based retrieval.


Using BM25 for Initial Retrieval

In [None]:
from rank_bm25 import BM25Okapi

# Sample documents
documents = [
    "The cat sat on the mat.",
    "Dogs are great companions.",
    "The sun is shining today.",
    "Cats and dogs are popular pets.",
    "It is a beautiful day.",
    "Pets provide emotional support."
]

import re

# Tokenize on word characters so punctuation does not block matches
# (with a plain split(), "pets." in a document never matches "pets" in a query)
def tokenize(text):
    return re.findall(r"\w+", text.lower())

tokenized_docs = [tokenize(doc) for doc in documents]
bm25 = BM25Okapi(tokenized_docs)

# Function to retrieve top K documents
def retrieve_documents(query, k=3):
    scores = bm25.get_scores(tokenize(query))
    top_indices = scores.argsort()[-k:][::-1]  # Indices of the k highest scores, best first
    return [documents[i] for i in top_indices]

query = "What pets are popular?"
initial_retrieval = retrieve_documents(query)
print("Initial Retrieval:", initial_retrieval)
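
To make the scoring step less of a black box, here is a minimal pure-Python sketch of how a BM25 score can be computed. It is an illustration under stated assumptions, not a reimplementation of `rank_bm25`: the helper names (`tokenize`, `toy_bm25_scores`, `sample_docs`) are hypothetical, `k1=1.5` and `b=0.75` are conventional values chosen for the sketch, and it uses a non-negative IDF variant, so the exact numbers may differ from the library's.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep only runs of word characters
    return re.findall(r"\w+", text.lower())

def toy_bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in `docs` against `query` with a BM25 variant."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents that contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Longer-than-average documents are penalized, shorter ones boosted
        length_norm = 1 - b + b * len(tokens) / avg_len
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Assumed IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5))
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores

sample_docs = [
    "The cat sat on the mat.",
    "Dogs are great companions.",
    "The sun is shining today.",
    "Cats and dogs are popular pets.",
    "It is a beautiful day.",
    "Pets provide emotional support.",
]
sample_scores = toy_bm25_scores("What pets are popular?", sample_docs)
best = max(range(len(sample_docs)), key=sample_scores.__getitem__)
print(sample_docs[best])  # Cats and dogs are popular pets.
```

Documents that share no term with the query score exactly zero, which is the main limitation the semantic step in the next section is meant to address.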

Step 2: Semantic Scoring with a Simple Model

Next, implement a simple scoring mechanism to rerank the retrieved documents. You can use cosine similarity between embeddings of the query and documents.

Example: Using Sentence Embeddings

In [None]:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load a pre-trained model for embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

# Function to compute semantic scores
def semantic_score(query, documents):
    # Generate embeddings for the query and documents
    query_embedding = model.encode([query])
    doc_embeddings = model.encode(documents)
    
    # Calculate cosine similarity scores
    scores = cosine_similarity(query_embedding, doc_embeddings)
    return scores.flatten()

# Rerank documents based on semantic scores
def rerank_documents(query, retrieved_docs):
    semantic_scores = semantic_score(query, retrieved_docs)
    ranked_docs = sorted(zip(retrieved_docs, semantic_scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, score in ranked_docs]

reranked_results = rerank_documents(query, initial_retrieval)
print("Reranked Results:", reranked_results)
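
For readers who want to see what the cosine similarity step computes, the following sketch works through it by hand on tiny vectors. The three-dimensional "embeddings" and the names `cosine`, `query_vec`, and `doc_vecs` are made up for illustration; real embeddings from `all-MiniLM-L6-v2` have 384 dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention for zero vectors, which have no direction
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings
query_vec = [1.0, 1.0, 0.0]
doc_vecs = {
    "doc_a": [2.0, 2.0, 0.0],  # Same direction as the query, different length
    "doc_b": [1.0, 0.0, 0.0],  # Partially aligned with the query
    "doc_c": [0.0, 0.0, 3.0],  # Orthogonal to the query
}

ranked = sorted(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b', 'doc_c']
```

Because the measure depends only on direction, `doc_a` scores about 1.0 even though it is twice as long as the query vector.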

Step 3: Combine Scores (Optional)

You can further enhance the reranking by combining keyword-based scores with semantic scores. This can be done by normalizing and weighting both scores.

In [None]:
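import re
import numpy as np

def combined_rerank(query, retrieved_docs, alpha=0.5):
    # bm25.get_scores returns one score per corpus document, so map each
    # retrieved document back to its position in `documents`
    corpus_scores = bm25.get_scores(re.findall(r"\w+", query.lower()))
    bm25_scores = np.array([corpus_scores[documents.index(d)] for d in retrieved_docs])
    semantic_scores = np.asarray(semantic_score(query, retrieved_docs))
    # Min-max normalize each score set to [0, 1] so neither scale dominates
    def normalize(s):
        return (s - s.min()) / (s.max() - s.min() + 1e-9)
    # alpha weights the keyword score; alpha=0.5 is an equal-weight average
    combined_scores = alpha * normalize(bm25_scores) + (1 - alpha) * normalize(semantic_scores)
    ranked_docs = sorted(zip(retrieved_docs, combined_scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, score in ranked_docs]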

final_results = combined_rerank(query, initial_retrieval)
print("Final Combined Reranked Results:", final_results)
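
The effect of normalizing and weighting can be seen on a small worked example. The sketch below uses invented scores for three documents (`A`, `B`, `C`) and hypothetical helpers (`min_max`, `fuse`, `rank`); it only illustrates why raw BM25 values, which are unbounded, tend to swamp cosine similarities unless both are rescaled first.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; constant inputs map to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(keyword_scores, semantic_scores, alpha=0.5):
    """Weighted sum of normalized scores; alpha is the keyword weight."""
    kw = min_max(keyword_scores)
    sem = min_max(semantic_scores)
    return [alpha * k + (1 - alpha) * s for k, s in zip(kw, sem)]

def rank(names, scores):
    """Return names ordered from highest to lowest score."""
    return [n for n, _ in sorted(zip(names, scores), key=lambda p: p[1], reverse=True)]

names = ["A", "B", "C"]
keyword = [4.0, 1.0, 0.0]    # Invented BM25-style scores (unbounded scale)
semantic = [0.2, 0.8, 0.5]   # Invented cosine-style scores

raw_average = [(k + s) / 2 for k, s in zip(keyword, semantic)]
print(rank(names, raw_average))                   # ['A', 'B', 'C']
print(rank(names, fuse(keyword, semantic, 0.5)))  # ['B', 'A', 'C']
print(rank(names, fuse(keyword, semantic, 0.9)))  # ['A', 'B', 'C']
```

With a raw average, document `A` wins purely because its keyword score is numerically large. After normalization, equal weighting promotes `B`, and raising `alpha` shifts the ranking back toward the keyword signal.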

Summary
This implementation provides a straightforward way to create a reranking system for RAG applications from a few off-the-shelf libraries. The process involves:

1. Initial Retrieval: Using BM25 to fetch candidate documents.
2. Semantic Scoring: Using sentence embeddings to compute query-document relevance.
3. Reranking: Sorting the candidates by their semantic scores.
4. Optional Combination: Merging normalized keyword and semantic scores for improved ranking.

This approach allows you to build an effective reranking mechanism tailored to your specific needs using only a handful of Python packages.


## Cross-Encoders

Overview of Cross-Encoders in Sentence Transformers

Cross-Encoders are a type of model in the Sentence Transformers framework, designed for scoring and classifying pairs of sentences. They differ fundamentally from Bi-Encoders, which encode each sentence on its own and are therefore more efficient for large-scale search and clustering.

Key Differences: Cross-Encoder vs. Bi-Encoder

Input Handling:
- Cross-Encoders process both sentences in a single forward pass, joined by a special separator token (e.g., [SEP] in BERT-based models). This allows them to evaluate the relationship between the sentences directly.
- Bi-Encoders, on the other hand, encode each sentence independently into an embedding; the embeddings can then be compared using methods like cosine similarity.

Output:
- Cross-Encoders produce a score for the sentence pair (often between 0 and 1, although some models return unbounded logits depending on their output activation) but do not generate standalone embeddings for individual sentences.
- Bi-Encoders generate embeddings that can be reused for various tasks like clustering or semantic search.

Performance and Scalability:
- Cross-Encoders typically achieve higher accuracy in scoring and classification tasks because they consider both sentences together.
- However, they scale poorly to large datasets because every pair of sentences needs its own forward pass. For example, comparing 100,000 sentences with one another would require scoring nearly 5 billion pairs (100,000 × 99,999 / 2) with a Cross-Encoder, whereas a Bi-Encoder would only need to encode the 100,000 sentences once [1][3][4].
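
The pair count quoted above follows from the number of unordered pairs among n items, n(n - 1) / 2. A quick check, with variable names chosen only for this illustration:

```python
import math

n_sentences = 100_000

# A Cross-Encoder needs one forward pass per unordered sentence pair
cross_encoder_passes = math.comb(n_sentences, 2)

# A Bi-Encoder needs one forward pass per sentence
bi_encoder_passes = n_sentences

print(f"{cross_encoder_passes:,}")                 # 4,999,950,000
print(cross_encoder_passes // bi_encoder_passes)   # 49999
```

The Bi-Encoder still has to compare embeddings afterwards, but those comparisons are inexpensive vector operations rather than full model passes.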

Use Cases for Cross-Encoders

Cross-Encoders are particularly useful when:
- You have a predefined set of sentence pairs and need to evaluate their similarity.
- Tasks require high accuracy in classification or ranking, such as:
  - Natural Language Inference (NLI)
  - Semantic Textual Similarity (STS)

In practice, Cross-Encoders are often combined with Bi-Encoders in applications like Information Retrieval. A typical approach is to first use a Bi-Encoder to retrieve a smaller set of candidate sentences and then apply a Cross-Encoder to re-rank these candidates for better accuracy [1][3].
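
The retrieve-then-rerank pattern can be sketched independently of any particular model. In the example below, `retrieve_then_rerank` and both scorers are hypothetical stand-ins written for this illustration: `token_overlap` plays the role of the cheap first stage and `length_normalized_overlap` plays the role of the expensive pairwise scorer. The call counter shows the point of the design, namely that the expensive scorer only ever sees the shortlist.

```python
def retrieve_then_rerank(query, corpus, cheap_score, pair_score, k=3):
    """Two-stage search: shortlist k candidates cheaply, then reorder
    only that shortlist with a more expensive pairwise scorer."""
    # Stage 1: score every document with the cheap scorer, keep the top k
    shortlist = sorted(corpus, key=lambda doc: cheap_score(query, doc), reverse=True)[:k]
    # Stage 2: score each (query, candidate) pair with the expensive scorer
    return sorted(shortlist, key=lambda doc: pair_score(query, doc), reverse=True)

calls = {"cheap": 0, "pair": 0}

def token_overlap(query, doc):
    # Stand-in for a fast first-stage scorer (e.g. BM25 or a Bi-Encoder)
    calls["cheap"] += 1
    return len(set(query.lower().split()) & set(doc.lower().split()))

def length_normalized_overlap(query, doc):
    # Stand-in for a slow pairwise scorer (e.g. a Cross-Encoder)
    calls["pair"] += 1
    doc_tokens = set(doc.lower().split())
    return len(set(query.lower().split()) & doc_tokens) / len(doc_tokens)

corpus = [
    "Cats and dogs are popular pets",
    "Dogs are great companions",
    "Pets provide emotional support",
    "The sun is shining today",
]

results = retrieve_then_rerank(
    "are dogs great pets", corpus, token_overlap, length_normalized_overlap, k=3
)
print(results[0])  # Dogs are great companions
print(calls)       # {'cheap': 4, 'pair': 3}
```

In a real system the second-stage callable would typically wrap a Cross-Encoder's pair scoring, and the pairs would usually be scored in one batched call rather than one document at a time.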

Implementing Cross-Encoders

Using a Cross-Encoder is straightforward. Here’s an example implementation:

from sentence_transformers import CrossEncoder

# Load a pre-trained Cross-Encoder model
# (named cross_encoder so it does not overwrite the SentenceTransformer `model` defined earlier)
cross_encoder = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2-v2")

# Define sentence pairs
sentence_pairs = [
    ["How many people live in Berlin?", "Berlin had a population of 3,520,031 registered inhabitants."],
    ["What is the capital of France?", "Paris is the capital city of France."]
]

# Predict one score per pair (higher means a better match)
scores = cross_encoder.predict(sentence_pairs)
print(scores)  # Output: array with one score per pair

This code snippet demonstrates how to load a Cross-Encoder model and use it to predict similarity scores for predefined sentence pairs [1][4][5].

Conclusion

Cross-Encoders offer a powerful method for evaluating sentence pairs with high accuracy but come with scalability challenges. They are best suited for tasks where precision is critical and where the number of sentence pairs is manageable. For broader applications requiring efficiency and speed, Bi-Encoders remain the preferred choice.