@enitrat enitrat commented Jul 25, 2025

PR Summary: Introduction of a Retrieval Judge to the RAG Pipeline

This PR adds a RetrievalJudge to the RAG pipeline. Its goal is to improve the relevance of the context passed to the generation model. The judge acts as a filter: it uses an LLM to score each document retrieved from the vector store and discards the irrelevant ones, so the final answer is generated from more relevant context.

This change also refactors the RAG pipeline's unit tests to use pytest fixtures for better modularity, and adds test coverage for the new judging functionality.


Key Changes

  • New RetrievalJudge Module:

    • A new dspy.Module (retrieval_judge.py) is implemented.
    • It uses an LLM (Gemini Flash) to rate each retrieved document's relevance on a scale of 0.0 to 1.0 against the user's query.
    • Documents scoring below a configurable threshold (defaulting to 0.4) are filtered out.
    • It scores multiple documents in parallel rather than one at a time.
  • Pipeline Integration:

    • The RetrievalJudge is integrated into the RagPipeline and runs immediately after the DocumentRetriever and before the GenerationProgram.
    • Token usage from the judge is now tracked and included in the total get_lm_usage() report.
  • Optimizations and Resiliency:

    • Template Passthrough: Standard contract and test templates are automatically kept without being sent to the judge, saving latency and cost.
    • Robust Error Handling: If the RetrievalJudge fails for any reason, the pipeline logs a warning and proceeds with all retrieved documents, so the user still receives an answer.
  • Test Suite Overhaul:

    • A new test file (test_retrieval_judge.py) provides focused unit tests for the judge's logic, including score clamping, error handling, and template passthrough.
    • The main pipeline test file (test_rag_pipeline.py) has been completely refactored to use pytest fixtures, improving readability and maintainability.
    • A new test suite (TestRagPipelineWithJudge) specifically validates the pipeline's behavior with the judge enabled, covering filtering, error fallbacks, and metadata enrichment.
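
The judge behaviour listed under Key Changes (parallel scoring, score clamping, threshold filtering, template passthrough) can be illustrated with a minimal sketch. This is not the code from retrieval_judge.py: `Doc`, `TEMPLATE_TITLES`, `clamp_score`, `judge_documents`, the `judge_score` metadata key, and the injected `score_fn` (a stand-in for the LLM call) are all hypothetical names chosen for illustration, and the real module is a dspy.Module rather than a plain function.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document type; the PR's actual type may differ."""
    title: str
    content: str
    metadata: dict = field(default_factory=dict)


# Assumption: templates are recognised by title. The PR does not say how.
TEMPLATE_TITLES = {"Contract Template", "Test Template"}


def clamp_score(raw) -> float:
    """Coerce a raw judge output to a float in [0.0, 1.0]; unparsable -> 0.0."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return 0.0
    if value != value:  # NaN is the only float not equal to itself
        return 0.0
    return max(0.0, min(1.0, value))


async def judge_documents(query, docs, score_fn, threshold=0.4):
    """Keep templates unscored; keep other docs whose score >= threshold."""
    to_score = [d for d in docs if d.title not in TEMPLATE_TITLES]
    # Score all non-template documents concurrently.
    raw_scores = await asyncio.gather(*(score_fn(query, d) for d in to_score))
    scores = {id(d): clamp_score(s) for d, s in zip(to_score, raw_scores)}

    kept = []
    for d in docs:  # iterate the original list to preserve retrieval order
        if d.title in TEMPLATE_TITLES:
            kept.append(d)  # passthrough: never sent to the judge
            continue
        score = scores[id(d)]
        if score >= threshold:
            d.metadata["judge_score"] = score
            kept.append(d)
    return kept


async def fake_score(query, doc):
    """Stand-in for the LLM scorer, returning canned values by content."""
    return {"a": 0.9, "b": 0.1, "c": 1.7}.get(doc.content, 0.0)


if __name__ == "__main__":
    docs = [Doc("A", "a"), Doc("B", "b"), Doc("Contract Template", "x"), Doc("C", "c")]
    kept = asyncio.run(judge_documents("how do I write a contract?", docs, fake_score))
    print([d.title for d in kept])
```

Injecting `score_fn` keeps the filtering logic testable without a live model, which mirrors the kind of unit tests the PR describes for clamping and passthrough.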

Updated RAG Pipeline Flow

The diagram below illustrates the new RAG pipeline, highlighting the addition of the Retrieval Judge step.

flowchart TD
    subgraph RAG Pipeline
        A[User Query] --> B("Query Processor");
        B -- Semantic Queries --> C("Document Retriever");
        C -- Retrieves from --> D[Vector DB];
        C -- Retrieved Docs --> E{{"Retrieval Judge"}};
        E -- Filters using LLM --> F[Filtered, Relevant Docs];
        F -- High-Quality Context --> G("Generation Program");
        G -- Final Answer --> H[User Response];
    end

    style E fill:#cde4ff,stroke:#0066ff,stroke-width:2px,stroke-dasharray: 5 5

Component Interaction

This sequence diagram shows how the RagPipeline orchestrates the new retrieval and judging process.

sequenceDiagram
    User->>Pipeline: forward(query)
    Pipeline->>Retriever: aforward(query)
    Retriever-->>Pipeline: Returns list of documents
    
    Note over Pipeline, Judge: Pipeline invokes the new Judge
    Pipeline->>Judge: aforward(query, documents)
    Judge-->>Pipeline: Returns filtered list of documents
    
    Pipeline->>Generator: aforward(query, filtered_documents)
    Generator-->>Pipeline: Returns final answer
    Pipeline-->>User: Streams final answer
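
The orchestration in the sequence diagram, combined with the fallback described under Key Changes, might look roughly like the sketch below. It is an illustration under assumptions, not the RagPipeline implementation: `run_pipeline` and its `retrieve`, `judge`, and `generate` callables are hypothetical stand-ins for the retriever, judge, and generator `aforward` calls, and the answer is returned in one piece rather than streamed.

```python
import asyncio
import logging

logger = logging.getLogger("rag_pipeline_sketch")


async def run_pipeline(query, retrieve, judge, generate):
    """Retrieve, judge, then generate; fall back to all docs if the judge fails."""
    documents = await retrieve(query)
    try:
        context = await judge(query, documents)
    except Exception as exc:
        # Judge failure must not block the answer: log a warning and
        # continue with the unfiltered retrieval results.
        logger.warning("Retrieval judge failed (%s); using all retrieved documents", exc)
        context = documents
    return await generate(query, context)


# Hypothetical stubs standing in for the real components.
async def retrieve(query):
    return ["doc1", "doc2", "doc3"]


async def ok_judge(query, docs):
    return docs[:1]


async def failing_judge(query, docs):
    raise RuntimeError("LLM unavailable")


async def generate(query, docs):
    return f"{query}:{len(docs)}"


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("q", retrieve, ok_judge, generate)))
    print(asyncio.run(run_pipeline("q", retrieve, failing_judge, generate)))
```

Catching the exception at the pipeline level, rather than inside the judge, means any failure mode in the judging step leads to the same fallback.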

@enitrat enitrat changed the base branch from feat/migrate-dspy to main July 30, 2025 15:29
@enitrat enitrat force-pushed the feat/llm-judge-docs branch 2 times, most recently from c339e58 to b0ed8d1 Compare July 30, 2025 15:33
@enitrat enitrat force-pushed the feat/llm-judge-docs branch from dfb17fe to bb56031 Compare July 31, 2025 17:59
@enitrat enitrat force-pushed the feat/llm-judge-docs branch from bb56031 to 2a90d98 Compare July 31, 2025 17:59
@enitrat enitrat merged commit 17f12fc into main Jul 31, 2025
3 checks passed