# Grading

Assign a relevance score to every retrieved chunk by prompting an LLM with a structured output schema and a precise rubric.

Optionally, filter and sort the chunks by score to obtain a reranking effect.

![Self-RAG](docs/selfrag.png)


In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import os
from typing import List, TypedDict
from dotenv import load_dotenv

from langchain.schema import Document
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_qdrant import QdrantVectorStore
from langchain_core.prompts import ChatPromptTemplate

from src import utils, conf



# Params

In [3]:
conf_settings = conf.load(file="settings.yaml")
conf_infra = conf.load(file="infra.yaml")    

LLM_WORKHORSE = conf_settings.llm_workhorse
LLM_FLAGSHIP = conf_settings.llm_flagship
EMBEDDINGS = conf_settings.embeddings
VDB_URL = conf_infra.vdb_url
INDEX_NAME = conf_settings.vdb_index


# Environment Variables

In [4]:
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
QDRANT_API_KEY = os.getenv("QDRANT_API_KEY")

# Clients

In [5]:
llm = ChatOpenAI(
    api_key=OPENAI_API_KEY,
    model=LLM_WORKHORSE,
    )
try:
    _ = llm.invoke("tell me a joke about devops")
except Exception as err:
    print(err)
    
embeddings = OpenAIEmbeddings(api_key=OPENAI_API_KEY, model=EMBEDDINGS)
try:
    _ = embeddings.embed_query("healthcheck")

except Exception as err:
    print(err)



vector_store = QdrantVectorStore.from_existing_collection(
    embedding=embeddings,
    collection_name=INDEX_NAME,
    url=VDB_URL,
    api_key=QDRANT_API_KEY,
)
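try:
    # Use the synchronous call: the un-awaited async variant only creates a coroutine and never queries the store
    _ = vector_store.similarity_search("healthcheck")
except Exception as err:
    print(err)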





# Context Example

In [7]:
docs = [
    Document(
        page_content="John J. Hopfield and Geoffrey Hinton received the Nobel Prize in Physics in 2024 for their groundbreaking work on artificial neural networks, a foundation of modern AI. Hopfield developed an associative memory model in the 1980s that allows networks to store and reconstruct patterns. Building on this, Hinton developed the Boltzmann machine, which uses statistical physics principles to recognize and classify data. These pioneering contributions are essential for today's machine learning technologies, enhancing applications from medical imaging to material science.",
        metadata={"source": "wikipedia"}
    ),
    Document(
        page_content="In Chemistry, David Baker, Demis Hassabis, and John Jumper were honored with the Nobel Prize in 2024 for their breakthroughs in protein structure prediction. Baker’s work in computational protein design enables the creation of novel proteins, while Hassabis and Jumper, known for their work with DeepMind's AlphaFold, developed an AI that accurately predicts protein structures, a long-standing challenge in biology. This advancement could lead to transformative applications in drug development and synthetic biology.",
        metadata={"source": "wikipedia"}
    ),
]


# Grading

In [8]:
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

class RelevancyScore(BaseModel):
    """
    **Scoring Relevancy Rubric:**
    * 3: Highly Relevant. The document directly and comprehensively answers the query. It contains key facts, data, or arguments that are essential for a complete response. It is a primary source of information for the query.
    * 2: Partially Relevant. The document contains some useful information related to the query, but it is not a direct or complete answer. It may provide background, related context, or a partial answer that needs to be supplemented by other sources.
    * 1: Tangentially Relevant. The document mentions a keyword or a related concept from the query, but it does not provide any substantive information that would help answer the query. Its content is not useful for generating a response.
    * 0: Not Relevant. The document has no connection to the query whatsoever. It is completely unrelated.
    """
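    # `assessment` comes first so the model writes a brief justification before the score (lightweight chain of thought)
    assessment: str = Field(..., description="Concise, one-sentence relevancy assessment")
    score: int = Field(..., description="Relevancy Score following the rubric")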
    


grader_template = """You are a highly specialized document relevance grader. Your task is to evaluate a document's relevance to a given query.
Provide a concise, one-sentence relevancy assessment, focusing on facts.
Then output a single integer score from 0 to 3, following the rubric in the output schema.
User message: {question}
Document: {chunk}
"""

llm_grading = llm.with_structured_output(RelevancyScore)
prompt_grading = ChatPromptTemplate.from_template(grader_template)
chain_grading = prompt_grading | llm_grading

chain_grading.invoke(
    {
        "question": "What are the primary causes of the decline in bee populations globally?",
        "chunk": "A recent study by the EPA highlights the multifaceted causes of global bee population decline.\
            The report identifies habitat loss due to monoculture farming and urbanization as a major factor.\
            The widespread use of neonicotinoid pesticides, which impair the bees' neurological functions and navigation, is another critical issue.\
            Climate change also plays a role by disrupting flowering seasons and increasing the prevalence of bee-specific parasites like the Varroa mite. \
            The study emphasizes that these factors often interact, creating a synergistic negative effect on bee health."
    }
)


RelevancyScore(assessment='The document directly identifies habitat loss, pesticide use, and climate change as primary causes of global bee population decline, providing a comprehensive answer.', score=3)
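
The rubric in the `RelevancyScore` docstring attaches a label to each integer, but `score` is declared as a plain `int`, so the schema itself does not reject an out-of-range value. The following is a minimal sketch of a check applied after grading. The names `RUBRIC_LABELS` and `rubric_label` are hypothetical and not part of the notebook.

```python
# Hypothetical lookup mirroring the four levels of the rubric docstring.
RUBRIC_LABELS = {
    3: "Highly Relevant",
    2: "Partially Relevant",
    1: "Tangentially Relevant",
    0: "Not Relevant",
}


def rubric_label(score: int) -> str:
    """Return the rubric label for a score, or raise if it is outside 0..3."""
    if score not in RUBRIC_LABELS:
        raise ValueError(f"score {score!r} is outside the 0-3 rubric")
    return RUBRIC_LABELS[score]


print(rubric_label(3))
```

Calling `rubric_label(result.score)` on a grader result would surface a malformed score early instead of letting it flow into later filtering.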

In [9]:
chain_grading.invoke(
    {
        "question": "What are the primary causes of the decline in bee populations globally?",
        "chunk": "The agricultural industry is increasingly adopting precision farming techniques to improve crop yields.\
            These technologies, which include GPS-guided tractors and drone-based crop monitoring, help farmers use resources more efficiently. \
            While these methods are designed to be more sustainable, their implementation in large-scale operations often relies on a single crop, which can reduce biodiversity."
    }
)

RelevancyScore(assessment='The document discusses agricultural techniques and biodiversity but does not directly address the causes of the global decline in bee populations.', score=1)

In [10]:

chain_grading.invoke(
    {
        "question": "What are the primary causes of the decline in bee populations globally?",
        "chunk": "In the spring, many flowers begin to bloom, attracting various insects. \
            Pollinators, such as bees and butterflies, are essential for the reproduction of many plant species. \
            The honey produced by bees is a valuable commodity and has been used by humans for centuries for its nutritional and medicinal properties."
    }
)

RelevancyScore(assessment='The document mentions bees and pollinators but does not address causes of the global decline in bee populations.', score=1)
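
The optional filter and ordering mentioned in the introduction can be sketched as below. This is an illustration under assumptions, not the notebook's own implementation: `GradedChunk`, `filter_and_rank`, and the `min_score` threshold are hypothetical names, and the scores are typed in by hand (3, 1, 1 match the outputs above; the fourth entry is invented for the example) so the snippet runs without an LLM call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradedChunk:
    """Hypothetical container pairing a chunk with its grader score."""
    content: str
    score: int


def filter_and_rank(graded: List[GradedChunk], min_score: int = 2) -> List[GradedChunk]:
    """Drop chunks scoring below min_score, then sort by score, highest first."""
    kept = [g for g in graded if g.score >= min_score]
    # sorted() is stable, so chunks with equal scores keep their retrieval order.
    return sorted(kept, key=lambda g: g.score, reverse=True)


graded = [
    GradedChunk("precision farming techniques", 1),
    GradedChunk("EPA study on bee population decline", 3),
    GradedChunk("spring flowers and honey", 1),
    GradedChunk("background on pollinator habitats", 2),  # invented example entry
]

top = filter_and_rank(graded)
print([g.content for g in top])
```

In the notebook, each `GradedChunk` would presumably be built from one `chain_grading.invoke(...)` result per retrieved chunk. A threshold of 2 keeps only chunks the rubric rates as partially or highly relevant, and lowering it trades precision for recall.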