# Understanding Reranking in RAG Applications
### What is Reranking?
Reranking is a second-stage step in Retrieval-Augmented Generation (RAG) systems that refines the initial search results to improve relevance. After the initial retrieval phase returns a set of candidate documents, a reranker reorders them using a more expensive relevance score, typically from a cross-encoder model that reads the query and each document together and can therefore model how they relate.
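
As a rough illustration of the reorder step, the sketch below scores each query-document pair jointly and sorts by that score. The names `rerank` and `overlapScore` are hypothetical, and the word-overlap scorer is only a stand-in for a real cross-encoder, which this sketch does not call.

```javascript
// Toy stand-in for a cross-encoder: fraction of query words found in the document.
// A real reranker would run a model over the (query, document) pair instead.
function overlapScore(query, text) {
  const words = (s) => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const docWords = new Set(words(text));
  const queryWords = words(query);
  if (queryWords.length === 0) return 0;
  const hits = queryWords.filter((w) => docWords.has(w)).length;
  return hits / queryWords.length;
}

// Score every candidate against the query, sort descending, keep the top n.
function rerank(query, candidates, scoreFn, topN) {
  return candidates
    .map((doc) => ({ ...doc, rerankScore: scoreFn(query, doc.text) }))
    .sort((a, b) => b.rerankScore - a.rerankScore)
    .slice(0, topN);
}

const candidates = [
  { id: 'a', text: 'Machine learning finds patterns in data.' },
  { id: 'b', text: 'Neural networks learn by adjusting weights during training.' },
  { id: 'c', text: 'Computer vision interprets images.' },
];

const top = rerank('How do neural networks learn?', candidates, overlapScore, 2);
console.log(top.map((d) => d.id));
```

Passing the scorer in as `scoreFn` keeps the sorting logic independent of whichever model eventually produces the scores.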

### When is Reranking Done?
Reranking occurs after initial retrieval and before the documents are passed to the LLM:

1. Query Processing → The user submits a question
2. Initial Retrieval → Vector search returns the top-k documents (e.g., top 20)
3. Reranking → The retrieved documents are reordered by relevance and the top-n are kept (e.g., top 5)
4. Generation → The LLM uses the reranked documents to generate an answer
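
The four stages above can be sketched as a single function that wires them together. This is a minimal sketch under stated assumptions: `answerQuery`, the `stages` object, and the tiny `corpus` are hypothetical, and the stand-in stages are synchronous for brevity, whereas a real vector store or LLM client would normally be asynchronous.

```javascript
// Compose the stages. Each one is passed in as a function so that a real
// vector store, reranker, or LLM client can be swapped in later.
function answerQuery(query, stages, { k = 20, n = 5 } = {}) {
  // Initial retrieval: fetch the top-k candidates.
  const retrieved = stages.retrieve(query, k);
  // Reranking: reorder by relevance, then keep only the top-n.
  const reranked = stages.rerank(query, retrieved).slice(0, n);
  // Generation: hand the reranked documents to the LLM as numbered context.
  const context = reranked.map((d, i) => `[${i + 1}] ${d.text}`).join('\n');
  return { documents: reranked, answer: stages.generate(query, context) };
}

// Hypothetical stand-ins so the sketch runs without external services.
const corpus = [
  { id: 1, text: 'Overview of machine learning.', relevance: 0.2 },
  { id: 2, text: 'Backpropagation updates network weights.', relevance: 0.9 },
  { id: 3, text: 'Loss functions measure prediction error.', relevance: 0.7 },
];

const stages = {
  retrieve: (query, k) => corpus.slice(0, k),
  rerank: (query, docs) => [...docs].sort((a, b) => b.relevance - a.relevance),
  generate: (query, context) => `Answer to "${query}" using:\n${context}`,
};

const result = answerQuery('How do neural networks learn?', stages, { k: 3, n: 2 });
console.log(result.documents.map((d) => d.id));
```

Keeping `k` larger than `n` gives the reranker room to promote documents that the first stage ranked low.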

### Why Reranking Matters
Initial retrieval (such as vector similarity search) often suffers from a "semantic gap": documents whose embeddings are close to the query's can be topically similar without actually answering the specific question. Reranking helps bridge this gap by using a model that scores the query and the document together and can therefore capture more nuanced relationships.
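
To make the first-stage score concrete, here is a small sketch of cosine-similarity search. The three-dimensional embeddings are made up for illustration (real models use hundreds of dimensions), and `cosine` and `vectorSearch` are hypothetical helper names. With these invented vectors the broad introductory document comes out on top, which is the kind of outcome the semantic gap describes.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Rank documents by similarity to the query embedding and keep the top k.
function vectorSearch(queryEmbedding, docs, k) {
  return docs
    .map((doc) => ({ ...doc, vectorScore: cosine(queryEmbedding, doc.embedding) }))
    .sort((a, b) => b.vectorScore - a.vectorScore)
    .slice(0, k);
}

// Made-up embeddings: each document is embedded on its own, without seeing the query.
const docs = [
  { title: 'Introduction to Machine Learning', embedding: [0.9, 0.4, 0.1] },
  { title: 'Training Deep Learning Models', embedding: [0.5, 0.8, 0.3] },
  { title: 'Computer Vision Techniques', embedding: [0.1, 0.2, 0.9] },
];
const queryEmbedding = [0.8, 0.5, 0.1];

const hits = vectorSearch(queryEmbedding, docs, 2);
console.log(hits.map((d) => d.title));
```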

### Key Differences in Output Quality
The comparison below uses the demo query "How do neural networks learn and what are the key components?".

Without Reranking:

- Top Result: "Introduction to Machine Learning", which covers general ML concepts
- Focus: Broader, less specific to the neural network learning process
- Context: May miss nuanced aspects of how neural networks specifically learn
- Generated Answer: Likely to be more generic, about machine learning in general

With Reranking:

- Top Result: "Training Deep Learning Models", which directly addresses neural network learning
- Focus: Specific to neural network training, architecture, and key components
- Context: Better match to the query's intent
- Generated Answer: More precise and relevant to neural network learning processes
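
The change in ordering can be checked directly from the demo scores. The sketch below copies the `vectorScore` and `rerankScore` values from the demo data in this notebook's React component and reports each document's rank before and after reranking; `orderBy` and `movement` are hypothetical names introduced for illustration.

```javascript
// Scores copied from the demo data in this notebook's React component.
const scored = [
  { title: 'Introduction to Machine Learning', vectorScore: 0.85, rerankScore: 0.72 },
  { title: 'Deep Learning Neural Networks', vectorScore: 0.82, rerankScore: 0.91 },
  { title: 'Natural Language Processing Applications', vectorScore: 0.78, rerankScore: 0.45 },
  { title: 'Computer Vision Techniques', vectorScore: 0.75, rerankScore: 0.38 },
  { title: 'Neural Network Architecture Design', vectorScore: 0.73, rerankScore: 0.89 },
  { title: 'Training Deep Learning Models', vectorScore: 0.70, rerankScore: 0.94 },
];

// Return titles ordered by the given score field, highest first.
const orderBy = (field) =>
  [...scored].sort((a, b) => b[field] - a[field]).map((d) => d.title);

const before = orderBy('vectorScore');
const after = orderBy('rerankScore');

// For each document, record its 1-based rank before and after reranking.
const movement = after.map((title, i) => ({
  title,
  from: before.indexOf(title) + 1,
  to: i + 1,
}));

for (const m of movement) {
  console.log(`#${m.from} -> #${m.to}  ${m.title}`);
}
```

With these scores, "Training Deep Learning Models" moves from last place to first, while "Introduction to Machine Learning" drops from first to fourth.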

### Technical Implementation
Common reranking models include:

- Cross-encoders: Models like `cross-encoder/ms-marco-MiniLM-L-6-v2`
- Sentence-BERT based: Fine-tuned for passage ranking
- Cohere Rerank: Commercial API for reranking
- Custom models: Fine-tuned on domain-specific data
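
Because these options differ mainly in where the scoring happens, one possible design is to hide the backend behind a common scoring interface. The sketch below assumes a hypothetical `scorePairs` function that takes `[query, text]` pairs and returns one score per pair; `createReranker` and `fakeBackend` are also hypothetical, and no real model or API is called. It is synchronous for brevity, whereas a hosted API or local model would usually be wrapped in an asynchronous function.

```javascript
// Wrap any backend that can score (query, document) pairs behind one interface.
// `scorePairs` receives an array of [query, text] pairs and returns one number per pair.
function createReranker(scorePairs) {
  return function rerank(query, docs, topN = docs.length) {
    const scores = scorePairs(docs.map((d) => [query, d.text]));
    if (scores.length !== docs.length) {
      throw new Error('Backend returned a different number of scores than documents');
    }
    return docs
      .map((doc, i) => ({ doc, score: scores[i] }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topN);
  };
}

// Hypothetical backend used only to make the sketch runnable: it favors
// documents that mention "training". A cross-encoder, a hosted rerank API, or a
// fine-tuned model would be wrapped in a function with the same shape.
const fakeBackend = (pairs) =>
  pairs.map(([, text]) => (text.toLowerCase().includes('training') ? 0.9 : 0.1));

const rerankDocs = createReranker(fakeBackend);
const ranked = rerankDocs('How do neural networks learn?', [
  { text: 'A general introduction to machine learning.' },
  { text: 'Training adjusts weights with backpropagation.' },
], 1);
console.log(ranked[0].doc.text);
```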

import React, { useState } from 'react';
import { Search, ArrowUp, ArrowDown } from 'lucide-react';

const RerankingDemo = () => {
  const [activeTab, setActiveTab] = useState('without');
  
  // Demo data: documents in a knowledge base about AI/ML
  const documents = [
    {
      id: 1,
      title: "Introduction to Machine Learning",
      content: "Machine learning is a subset of artificial intelligence that enables computers to learn and improve from experience without being explicitly programmed. It involves algorithms that can identify patterns in data.",
      vectorScore: 0.85,
      rerankScore: 0.72
    },
    {
      id: 2,
      title: "Deep Learning Neural Networks",
      content: "Deep learning uses neural networks with multiple layers to model and understand complex patterns. These networks can automatically learn hierarchical representations of data.",
      vectorScore: 0.82,
      rerankScore: 0.91
    },
    {
      id: 3,
      title: "Natural Language Processing Applications",
      content: "NLP enables machines to understand, interpret and generate human language. Applications include chatbots, translation services, and sentiment analysis.",
      vectorScore: 0.78,
      rerankScore: 0.45
    },
    {
      id: 4,
      title: "Computer Vision Techniques",
      content: "Computer vision allows machines to interpret and understand visual information from the world. It includes image recognition, object detection, and facial recognition technologies.",
      vectorScore: 0.75,
      rerankScore: 0.38
    },
    {
      id: 5,
      title: "Neural Network Architecture Design",
      content: "Designing effective neural network architectures involves choosing the right layers, activation functions, and connections. Modern architectures like transformers have revolutionized AI.",
      vectorScore: 0.73,
      rerankScore: 0.89
    },
    {
      id: 6,
      title: "Training Deep Learning Models",
      content: "Training deep learning models requires careful consideration of loss functions, optimizers, and regularization techniques. The process involves forward and backward propagation through neural networks.",
      vectorScore: 0.70,
      rerankScore: 0.94
    }
  ];

  const query = "How do neural networks learn and what are the key components?";

  // Sort documents by vector similarity (initial retrieval)
  const vectorRanked = [...documents].sort((a, b) => b.vectorScore - a.vectorScore);
  
  // Sort documents by rerank score
  const reranked = [...documents].sort((a, b) => b.rerankScore - a.rerankScore);

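  // Tailwind detects class names only when they appear as complete strings,
  // so look up the full class instead of interpolating `bg-${color}-500`.
  const barColors = { blue: 'bg-blue-500', green: 'bg-green-500' };

  const ScoreBar = ({ score, color = "blue" }) => (
    <div className="w-full bg-gray-200 rounded-full h-2 mt-1">
      <div 
        className={`${barColors[color] ?? barColors.blue} h-2 rounded-full`} 
        style={{ width: `${score * 100}%` }}
      />
    </div>
  );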

  const DocumentCard = ({ doc, rank, scoreType, showChange = false, originalRank = null }) => (
    <div className="border rounded-lg p-4 mb-3 bg-white shadow-sm">
      <div className="flex justify-between items-start mb-2">
        <div className="flex items-center gap-2">
          <span className="text-lg font-bold text-gray-600">#{rank}</span>
          <h3 className="font-semibold text-gray-800">{doc.title}</h3>
          {showChange && originalRank !== null && originalRank !== rank && (
            <div className="flex items-center ml-2">
              {originalRank > rank ? (
                <ArrowUp className="text-green-500 w-4 h-4" />
              ) : (
                <ArrowDown className="text-red-500 w-4 h-4" />
              )}
              <span className="text-sm text-gray-500 ml-1">
                (was #{originalRank})
              </span>
            </div>
          )}
        </div>
        <div className="text-right">
          <div className="text-sm font-medium">
            {scoreType === 'vector' ? 'Vector Score' : 'Rerank Score'}
          </div>
          <div className="text-lg font-bold">
            {(scoreType === 'vector' ? doc.vectorScore : doc.rerankScore).toFixed(2)}
          </div>
          <ScoreBar 
            score={scoreType === 'vector' ? doc.vectorScore : doc.rerankScore} 
            color={scoreType === 'vector' ? 'blue' : 'green'}
          />
        </div>
      </div>
      <p className="text-gray-600 text-sm leading-relaxed">{doc.content}</p>
    </div>
  );

  return (
    <div className="max-w-6xl mx-auto p-6 bg-gray-50 min-h-screen">
      <h1 className="text-3xl font-bold text-center mb-2">RAG Reranking Demo</h1>
      <div className="text-center mb-6">
        <div className="bg-blue-100 p-3 rounded-lg inline-block">
          <Search className="inline mr-2" size={18} />
          <span className="font-medium">Query: "{query}"</span>
        </div>
      </div>

      <div className="flex justify-center mb-6">
        <div className="bg-white rounded-lg p-1 shadow-sm">
          <button
            onClick={() => setActiveTab('without')}
            className={`px-4 py-2 rounded-md mr-2 transition-colors ${
              activeTab === 'without' 
                ? 'bg-blue-500 text-white' 
                : 'text-gray-600 hover:bg-gray-100'
            }`}
          >
            Without Reranking
          </button>
          <button
            onClick={() => setActiveTab('with')}
            className={`px-4 py-2 rounded-md mr-2 transition-colors ${
              activeTab === 'with' 
                ? 'bg-green-500 text-white' 
                : 'text-gray-600 hover:bg-gray-100'
            }`}
          >
            With Reranking
          </button>
          <button
            onClick={() => setActiveTab('comparison')}
            className={`px-4 py-2 rounded-md transition-colors ${
              activeTab === 'comparison' 
                ? 'bg-purple-500 text-white' 
                : 'text-gray-600 hover:bg-gray-100'
            }`}
          >
            Side-by-Side
          </button>
        </div>
      </div>

      {activeTab === 'without' && (
        <div className="bg-white rounded-lg shadow-sm p-6">
          <h2 className="text-xl font-semibold mb-4 text-blue-600">
            Results: Initial Vector Search Only
          </h2>
          <p className="text-gray-600 mb-4">
            Documents ranked purely by vector similarity to the query embedding.
          </p>
          {vectorRanked.slice(0, 4).map((doc, index) => (
            <DocumentCard 
              key={doc.id} 
              doc={doc} 
              rank={index + 1} 
              scoreType="vector"
            />
          ))}
        </div>
      )}

      {activeTab === 'with' && (
        <div className="bg-white rounded-lg shadow-sm p-6">
          <h2 className="text-xl font-semibold mb-4 text-green-600">
            Results: After Reranking
          </h2>
          <p className="text-gray-600 mb-4">
            Documents reordered using a cross-encoder model that better understands query-document relevance.
          </p>
          {reranked.slice(0, 4).map((doc, index) => {
            const originalRank = vectorRanked.findIndex(d => d.id === doc.id) + 1;
            return (
              <DocumentCard 
                key={doc.id} 
                doc={doc} 
                rank={index + 1} 
                scoreType="rerank"
                showChange={true}
                originalRank={originalRank}
              />
            );
          })}
        </div>
      )}

      {activeTab === 'comparison' && (
        <div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
          <div className="bg-white rounded-lg shadow-sm p-6">
            <h2 className="text-xl font-semibold mb-4 text-blue-600">
              Without Reranking
            </h2>
            {vectorRanked.slice(0, 4).map((doc, index) => (
              <DocumentCard 
                key={doc.id} 
                doc={doc} 
                rank={index + 1} 
                scoreType="vector"
              />
            ))}
          </div>
          
          <div className="bg-white rounded-lg shadow-sm p-6">
            <h2 className="text-xl font-semibold mb-4 text-green-600">
              With Reranking
            </h2>
            {reranked.slice(0, 4).map((doc, index) => {
              const originalRank = vectorRanked.findIndex(d => d.id === doc.id) + 1;
              return (
                <DocumentCard 
                  key={doc.id} 
                  doc={doc} 
                  rank={index + 1} 
                  scoreType="rerank"
                  showChange={true}
                  originalRank={originalRank}
                />
              );
            })}
          </div>
        </div>
      )}

      <div className="mt-8 bg-yellow-50 border border-yellow-200 rounded-lg p-6">
        <h3 className="text-lg font-semibold text-yellow-800 mb-3">
          Key Differences Observed:
        </h3>
        <div className="grid grid-cols-1 md:grid-cols-2 gap-4 text-sm">
          <div>
            <h4 className="font-medium text-yellow-800">Without Reranking:</h4>
            <ul className="mt-2 space-y-1 text-yellow-700">
              <li>• Top result: "Introduction to Machine Learning" (general)</li>
              <li>• Focuses on broader ML concepts</li>
              <li>• Less specific to neural networks</li>
              <li>• Vector similarity reflects embedding closeness, not query-specific relevance</li>
            </ul>
          </div>
          <div>
            <h4 className="font-medium text-yellow-800">With Reranking:</h4>
            <ul className="mt-2 space-y-1 text-yellow-700">
              <li>• Top result: "Training Deep Learning Models" (specific)</li>
              <li>• Directly addresses how neural networks learn</li>
              <li>• Better contextual understanding</li>
              <li>• Cross-encoder considers query-document relationship</li>
            </ul>
          </div>
        </div>
      </div>

      <div className="mt-6 bg-gray-100 rounded-lg p-6">
        <h3 className="text-lg font-semibold mb-3">How Reranking Works:</h3>
        <div className="space-y-3 text-sm">
          <div className="flex items-start gap-3">
            <span className="bg-blue-500 text-white rounded-full w-6 h-6 flex items-center justify-center text-xs font-bold">1</span>
            <div>
              <strong>Initial Retrieval:</strong> Vector search finds documents with similar embeddings to the query
            </div>
          </div>
          <div className="flex items-start gap-3">
            <span className="bg-green-500 text-white rounded-full w-6 h-6 flex items-center justify-center text-xs font-bold">2</span>
            <div>
              <strong>Cross-Encoder Scoring:</strong> A more sophisticated model evaluates each query-document pair
            </div>
          </div>
          <div className="flex items-start gap-3">
            <span className="bg-purple-500 text-white rounded-full w-6 h-6 flex items-center justify-center text-xs font-bold">3</span>
            <div>
              <strong>Reordering:</strong> Documents are reranked based on the cross-encoder scores
            </div>
          </div>
          <div className="flex items-start gap-3">
            <span className="bg-orange-500 text-white rounded-full w-6 h-6 flex items-center justify-center text-xs font-bold">4</span>
            <div>
              <strong>Top-n Selection:</strong> Only the most relevant documents are passed to the LLM
            </div>
          </div>
        </div>
      </div>
    </div>
  );
};

export default RerankingDemo;
