<a href="https://colab.research.google.com/github/Pranov1984/QA_RAG_Langchain_AdvancedRetriever/blob/main/V1_LIC_RAG_Extraction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Load OpenAI API Credentials

Here we load the API key from a local file so that we don't expose the credentials on the internet by mistake.

In [1]:
import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [2]:
import yaml

with open('api_keys.yml', 'r') as file:
    api_creds = yaml.safe_load(file)

In [3]:
api_creds.keys()

dict_keys(['openai_key', 'ngrok_key'])

In [4]:
import os

os.environ['OPENAI_API_KEY'] = api_creds['openai_key']
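
If the key is missing from the YAML file, the failure only shows up later as an authentication error. A minimal sketch of an earlier check is shown below. `resolve_api_key` is a hypothetical helper (not part of the notebook), and it assumes an already exported environment variable should take precedence over the file.

```python
import os

def resolve_api_key(creds, name="openai_key", env_var="OPENAI_API_KEY", environ=None):
    """Return the key from the environment if set, else from the loaded creds dict."""
    environ = os.environ if environ is None else environ
    # Prefer an exported variable; fall back to the dict loaded from the YAML file
    key = environ.get(env_var) or (creds or {}).get(name)
    if not key:
        raise KeyError(f"No API key found in ${env_var} or under '{name}' in the credentials file")
    return key

# Stand-in values for illustration only (not real keys)
print(resolve_api_key({"openai_key": "sk-from-file"}, environ={}))
```

In the notebook this could replace the direct dictionary lookup, e.g. `os.environ['OPENAI_API_KEY'] = resolve_api_key(api_creds)`.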

## 🔹 Step 1.1 — Install Dependencies

In [5]:
# Install all required libraries
!pip install -q \
  langchain==0.3.11 \
  langchain-openai==0.2.12 \
  langchain-community==0.3.11 \
  langchain-chroma==0.2.2 \
  langchainhub \
  chromadb==0.6.3 \
  tiktoken \
  sentence-transformers==2.7.0 \
  pydantic==2.10.1 \
  PyMuPDF==1.24.0 \
  pyngrok==7.2.2 \
  pypdf \
  transformers # Add transformers explicitly as it's a dependency and might need reinstallation


## Step 1.2 — Load and Split LIC Document

In [6]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("/content/Final Policy document_LICs New Jeevan Shanti.pdf")
docs = loader.load_and_split()

## 🔹 Step 1.3 — Embedding with OpenAI & ChromaDB Setup

In [7]:
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma  # provided by the langchain-chroma package installed above

embedding = OpenAIEmbeddings(model='text-embedding-3-small')
vectorstore = Chroma.from_documents(documents=docs, embedding=embedding, persist_directory="./chromadb")
retriever = vectorstore.as_retriever()

## 🔹 Step 1.4 — Advanced Retriever: Contextual Compression Retriever

In [8]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import ChatOpenAI  # langchain.chat_models.ChatOpenAI is deprecated

compressor_llm = ChatOpenAI(model="gpt-4", temperature=0)
compressor = LLMChainExtractor.from_llm(compressor_llm)
compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)


## 🔹 Step 1.5 — Build LCEL Chain: Retriever + Prompt + LLM

In [9]:
from langchain.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableMap
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI  # langchain.chat_models.ChatOpenAI is deprecated

# LLM
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Prompt Template
prompt = ChatPromptTemplate.from_template(
    """You are a helpful assistant answering questions from LIC policy documents.

Context:
{context}

Question: {question}
Answer:"""
)

# RAG Chain using LCEL
rag_chain = (
    # Pass only the 'question' string to the retriever
    {"context": lambda x: compression_retriever.invoke(x["question"]), "question": lambda x: x["question"]}
    | RunnableMap({
        "context": lambda x: "\n\n".join([doc.page_content for doc in x["context"]]),
        "question": lambda x: x["question"]
    })
    | prompt
    | llm
    | StrOutputParser()
)
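
To make the data flow of the chain easier to follow, here is a plain-Python sketch of the first three stages (retrieve, join, fill the prompt) without LangChain. `FakeDoc`, `fake_retriever`, `TEMPLATE` and `build_prompt` are hypothetical stand-ins for illustration; the real chain uses `compression_retriever`, `RunnableMap` and `prompt`.

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    page_content: str

def fake_retriever(question):
    # Hypothetical stand-in for compression_retriever.invoke(question)
    return [
        FakeDoc("There is no maturity benefit under this policy."),
        FakeDoc("The Free Look Period is a period of 30 days."),
    ]

TEMPLATE = "Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def build_prompt(inputs, retriever=fake_retriever):
    # Stage 1: only the question string goes to the retriever
    stage1 = {"context": retriever(inputs["question"]), "question": inputs["question"]}
    # Stage 2: the retrieved documents are flattened into one context string
    stage2 = {
        "context": "\n\n".join(doc.page_content for doc in stage1["context"]),
        "question": stage1["question"],
    }
    # Stage 3: the prompt template is filled; the real chain then calls the LLM
    return TEMPLATE.format(**stage2)

print(build_prompt({"question": "Is there any maturity benefit under this policy?"}))
```

Each stage receives a dictionary and returns a dictionary with the same two keys, which is why the question has to be passed through explicitly at every step.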

## 🔹 Step 1.6 — Run the Chain

In [10]:
query = {"question": "What is the death benefit under this LIC policy?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The death benefit under this LIC policy is the higher of the following two options: the purchase price plus accrued additional benefit on death (as specified in Condition 3 of Part C of the policy document) minus the total annuity amount payable till the date of death, or 105% of the purchase price. This benefit is payable to the nominee(s) as per the option exercised by the annuitant as specified in Condition 3 of Part D of the policy document. The options for receiving the death benefit include a lumpsum payment, annuitisation of the death benefit, or in installments over a chosen period of 5, 10, or 15 years.


## Correct Answer Retrieved.

Let's try a few more questions.

In [None]:
query = {"question": "What are the conditions under which the policy can be surrendered and what is the surrender value calculation method?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The policy can be surrendered at any time during the policy term. The surrender value payable will be the higher of the Guaranteed Surrender Value or the Special Surrender Value.

The Guaranteed Surrender Value is calculated as the product of the GSV Factor and the Purchase Price, minus the total annuity amount payable up to the date of surrender. The GSV Factor varies depending on the policy year, being 75% for the first three years and 90% for the fifth year and beyond.

The Special Surrender Value is determined by the Corporation from time to time, subject to prior approval of IRDAI.

If a loan has been taken under the policy, the gross annuity amount originally payable will be deducted for the calculation of the Surrender Value. Any outstanding loan amount along with interest and/or any other amount recoverable from the Annuitant will be recovered from the surrender value payment.

In case of QROPS, the surrender provisions will be subject to any specific provisions regardi

## Correct Answer is retrieved

In [None]:
query = {"question": "Is there any maturity benefit under this policy?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: No, there is no maturity benefit under this policy.


In [None]:
query = {"question": "Is there any maturity benefit under this policy?Justify from policy and reference to the section"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: No, there is no maturity benefit under this policy. This is explicitly stated in the section titled "Maturity Benefit" which says, "There is no maturity benefit under this policy."


In [None]:
query = {"question": "What are the options available to the nominee(s) for receiving the death benefit amount?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The nominee(s) can receive the death benefit amount in one of the following ways, as chosen by the Annuitant(s):

1. Lumpsum Death Benefit: The entire benefit amount payable on death is paid to the nominee(s) in a lump sum.

2. Annuitisation of Death Benefit: The benefit amount payable on death is used to purchase an Immediate Annuity from the Corporation for the nominee(s), effective from the date of death of the annuitant/last survivor. The annuity amount payable to the nominee(s) is based on the age of the nominee(s) and immediate annuity rates prevailing as on the date of death of the Annuitant.

3. In Installment: The benefit amount payable on death can be received in installments over the chosen period of 5 or 10 or 15 years instead of a lump sum amount. The installments are paid in advance at yearly or half-yearly or quarterly or monthly intervals, subject to minimum installment amounts for different modes of payments. If the Net Claim Amount is less than the required am

## Correct Answer

In [None]:
query = {"question": "How is the “Additional Benefit on Death” calculated during the deferment period?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The "Additional Benefit on Death" during the deferment period is calculated monthly. The formula for this is: 

Additional Benefit on Death per month = (Purchase Price * Annuity rate p.a. payable monthly) / 12 

The Annuity rate per annum payable monthly is equal to the Monthly tabular annuity rate and depends on the Option chosen, Age at entry of the annuitant(s) and the Deferment Period opted for. 

In case of death of the annuitant or surrender of the policy during the Deferment Period, Additional Benefit on Death for the policy year in which the death/surrender has occurred shall accrue till the completed policy month as on the date of death/surrender.


In [None]:
query = {"question": "Provide the formula for 'Additional Benefit on Death'"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The formula for 'Additional Benefit on Death' per month is = (Purchase Price * Annuity rate p.a. payable monthly) / 12


In [None]:
query = {"question": "What is the policy name"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The policy name is LIC’s New Jeevan Shanti.


In [None]:
query = {"question": "Plan Type?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: Non-Linked, Non-Participating, Individual, Single Premium, Deferred Annuity Plan


In [None]:
query = {"question": "Minimum Vesting Age?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: 55 years


In [None]:
query = {"question": "Free-Look Period?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: The Free Look Period is a period of 30 days from the date of receipt of the electronic or physical mode of Policy Document, whichever is earlier, by the Policyholder. During this period, the Policyholder can review the terms and conditions of the policy. If the Policyholder disagrees with any of the terms and conditions, they have the option to return the policy. If the policy is returned during the Free Look Period, the Corporation will cancel the Policy and return the Premium paid after deducting charges for stamp duty and annuity paid, if any. This condition is only applicable in case of new purchase of Deferred Annuity plan.


In [None]:
query = {"question": "Loan Eligibility?"}
response = rag_chain.invoke(query)
print("Answer:", response)

Answer: A loan facility is available at any time after three months from the completion of the policy or after the expiry of the free-look period, whichever is later. This is subject to the terms and conditions within the surrender value of the policy. Under joint life, the loan can be availed by the Primary Annuitant and in the absence of the Primary Annuitant, the same can be availed by the Secondary Annuitant. The maximum amount of loan that can be granted under the policy shall be such that the effective annual interest amount payable on loan does not exceed 50% of the annual annuity amount payable under the policy subject to a maximum of 80% of Surrender Value.


## Phase 2: Modularize for LLM & Embedding Switching (LCEL Compatible)

## 🔹 Step 2.1 — Configurable Embedding Loader


Define a function to switch between OpenAI and HuggingFace embeddings:

In [11]:
def get_embedding_model(provider="openai"):
    if provider == "openai":
        from langchain_openai import OpenAIEmbeddings
        return OpenAIEmbeddings(model='text-embedding-3-small')
    elif provider == "huggingface":
        from langchain_community.embeddings import HuggingFaceEmbeddings
        return HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    else:
        raise ValueError("Unsupported embedding provider.")

## 🔹 Step 2.2 — Configurable LLM Loader
Supports OpenAI chat models (GPT-4, GPT-4o, GPT-3.5) and local models such as Mistral served through Ollama:

In [12]:
def get_llm(model_name="gpt-4", provider="openai"):
    if provider == "openai":
        from langchain_openai import ChatOpenAI
        return ChatOpenAI(model=model_name, temperature=0)
    elif provider == "ollama":
        from langchain_community.llms import Ollama
        return Ollama(model=model_name)
    else:
        raise ValueError("Unsupported LLM provider.")
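
As more providers are added, the `if`/`elif` chains in both loaders grow in parallel. One alternative is a small registry of factory functions. The sketch below is only an illustration: `make_loader` and `llm_factories` are hypothetical names, and the lambdas return strings as stand-ins for the real constructors so the example runs without any LangChain package.

```python
def make_loader(factories, kind):
    """Build a loader that dispatches on provider name and reports valid choices."""
    def load(provider, **kwargs):
        try:
            factory = factories[provider]
        except KeyError:
            raise ValueError(
                f"Unsupported {kind} provider: {provider!r}. Choose from {sorted(factories)}"
            ) from None
        return factory(**kwargs)
    return load

# Stub factories; in the notebook these would import and construct the real classes
llm_factories = {
    "openai": lambda model_name="gpt-4": f"ChatOpenAI({model_name})",
    "ollama": lambda model_name="mistral": f"Ollama({model_name})",
}

load_llm = make_loader(llm_factories, "LLM")
print(load_llm("ollama", model_name="mistral"))
```

Keeping the imports inside each factory, as the notebook's loaders already do, means an unused provider's package never has to be installed.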

## 🔹 Step 2.3 — Vectorstore & Compression Retriever Builder

In [13]:
def build_advanced_retriever(docs, embedding, llm, persist_path="./chromadb"):
    from langchain_chroma import Chroma
    from langchain.retrievers import ContextualCompressionRetriever
    from langchain.retrievers.document_compressors import LLMChainExtractor

    vs = Chroma.from_documents(documents=docs, embedding=embedding, persist_directory=persist_path)
    retriever = vs.as_retriever()
    compressor = LLMChainExtractor.from_llm(llm)
    return ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)
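
Note that every call uses the same default `persist_path`. Assuming Chroma adds new documents to whatever is already stored in that directory, rebuilding the retriever can duplicate chunks, and switching to an embedding model with a different vector size would not fit the existing collection. A possible workaround is one directory per embedding configuration. `persist_path_for` below is a hypothetical helper, not part of the notebook.

```python
import re
from pathlib import Path

def persist_path_for(provider, model, root="./chromadb"):
    """Derive a separate persist directory for each embedding provider/model pair."""
    # Replace characters that are awkward in directory names (e.g. '/' in model ids)
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", f"{provider}-{model}").strip("-")
    return str(Path(root) / slug)

print(persist_path_for("openai", "text-embedding-3-small"))
print(persist_path_for("huggingface", "sentence-transformers/all-MiniLM-L6-v2"))
```

The result could then be passed as `persist_path` to `build_advanced_retriever`.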

## 🔹 Step 2.4 — Build Full RAG Chain (Modular)

In [14]:
def build_rag_chain(llm, compression_retriever):
    from langchain.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnableMap

    prompt = ChatPromptTemplate.from_template(
        """You are a helpful assistant answering questions from LIC policy documents.

Context:
{context}

Question: {question}
Answer:"""
    )

    chain = (
        {"context": lambda x: compression_retriever.invoke(x["question"]), "question": lambda x: x["question"]}
        | RunnableMap({
            "context": lambda x: "\n\n".join([doc.page_content for doc in x["context"]]),
            "question": lambda x: x["question"]
        })
        | prompt
        | llm
        | StrOutputParser()
    )
    return chain

In [15]:
# Load the PDF (one Document per page; load() does not split further)
loader = PyPDFLoader("/content/Final Policy document_LICs New Jeevan Shanti.pdf")
raw_docs = loader.load()

# Build components
embedding = get_embedding_model("openai")
llm_core = get_llm("gpt-4", provider="openai")
compression_retriever = build_advanced_retriever(raw_docs, embedding, llm_core)

# Build and run RAG chain
rag_chain = build_rag_chain(llm_core, compression_retriever)

In [16]:
result = rag_chain.invoke({"question": "What is the death benefit under this LIC policy?"})
print(result)

The death benefit under this LIC policy is the higher of either the Purchase Price plus Accrued Additional Benefit on Death (as specified in Condition 3 of Part C of this policy document) minus the total annuity amount payable till the date of death, or 105% of the Purchase Price. This benefit is payable to the nominee(s) as per the option exercised by the Annuitant as specified in Condition 3 of Part D of this policy document.


In [None]:
result = rag_chain.invoke({"question": "What is the policy type?"})
print(result)

The policy type is Joint Life annuity.


In [None]:
result = rag_chain.invoke({"question": "What is the Plan Type?"})
print(result)

The plan type is a Non-Linked, Non-Participating, Individual, Single Premium, Deferred Annuity Plan.


## Step 2.5 — Example Run with Different Combos

## Try with GPT-3.5-turbo & embedding model = "text-embedding-3-small"

In [None]:
result = rag_chain.invoke({"question": "What is the policy name?"})
print(result)

In [17]:
# Reload the PDF (one Document per page; load() does not split further)
loader = PyPDFLoader("/content/Final Policy document_LICs New Jeevan Shanti.pdf")
raw_docs = loader.load()

# Build components
embedding = get_embedding_model("openai")
llm_core = get_llm("gpt-3.5-turbo", provider="openai")
compression_retriever = build_advanced_retriever(raw_docs, embedding, llm_core)

# Build and run RAG chain
rag_chain = build_rag_chain(llm_core, compression_retriever)
result = rag_chain.invoke({"question": "What is the Plan Type of the LIC's New Jeevan Shanti?"})
print(result)

The plan type of LIC's New Jeevan Shanti is a Non-Linked, Non-Participating, Individual, Single Premium, Deferred Annuity Plan.


In [18]:
result = rag_chain.invoke({"question":"Provide the formula for 'Additional Benefit on Death'"})
print(result)

The formula for Additional Benefit on Death per month is:

Additional Benefit on Death per month = (Purchase Price * Annuity rate p.a. payable monthly) / 12

Where Annuity rate p.a. payable monthly is equal to the Monthly tabular annuity rate and depends on the Option chosen, Age at entry of the annuitant(s), and the Deferment Period opted for.


In [21]:
result = rag_chain.invoke({"question":"What is the Minimum Vesting Age in the New Jeevan Shanti LIC policy?"})
print(result)

The minimum vesting age in the New Jeevan Shanti LIC policy is 30 years.


In [22]:

result = rag_chain.invoke({"question":"Free-Look Period?"})
print(result)

The Free Look Period is a period of 30 days from the date of receipt of the electronic or physical mode of the Policy Document, whichever is earlier, during which the Policyholder can review the terms and conditions of the policy. If the Policyholder is not satisfied with the terms and conditions, they have the option to return the policy to the Corporation within this period. The Corporation will then cancel the policy and return the Premium paid after deducting charges for stamp duty and annuity paid, if any. Please note that the Free Look Period is only applicable in the case of a new purchase of a Deferred Annuity plan and not for purchases from existing funds.


## ✅ Phase 3: Evaluate LLM Answers Using Metrics
We’ll compute:

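| Metric          | Method                                                |
| --------------- | ----------------------------------------------------- |
| ✅ Faithfulness | GPT-based evaluator (via LangChain)                   |
| ✅ Relevance    | GPT-based evaluator (via LangChain)                   |
| ✅ Precision    | Overlap between prediction and ground truth tokens    |
| ✅ Recall       | Overlap between prediction and ground truth tokens    |
| ✅ F1 Score     | Harmonic mean of precision and recall                 |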
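
For reference, the three token-overlap metrics can be written out as below. The symbols are a notational choice made here, not taken from the notebook: $T_{\text{pred}}$ and $T_{\text{ref}}$ are the sets of unique lowercase tokens in the predicted answer and in the ground truth.

```latex
P = \frac{|T_{\text{pred}} \cap T_{\text{ref}}|}{|T_{\text{pred}}|}, \qquad
R = \frac{|T_{\text{pred}} \cap T_{\text{ref}}|}{|T_{\text{ref}}|}, \qquad
F_1 = \frac{2\,P\,R}{P + R}
```

When $P + R = 0$, $F_1$ is taken to be 0.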

## Step 3.2 — Prepare Inputs: Questions + Ground Truth

In [23]:
questions = [
    "What is the death benefit under the Joint Life deferred annuity option after the deferment period?",
    "What are the conditions under which the policy can be surrendered and what is the surrender value calculation method?",
    "Is there any maturity benefit under this policy? Justify your answer from the policy.",
    "What are the options available to the nominee(s) for receiving the death benefit amount?",
    "How is the “Additional Benefit on Death” calculated during the deferment period?"
]

ground_truths = [
    "After the deferment period, under the Joint Life deferred annuity option, on the first death, the annuity continues for the surviving annuitant. Upon the death of the last survivor, the death benefit payable is higher of (a) Purchase Price plus Accrued Additional Benefit on Death minus Total annuity amount paid, or (b) 105% of Purchase Price.",
    "The policy can be surrendered at any time during its term. The surrender value is the higher of Guaranteed Surrender Value (GSV) or Special Surrender Value. GSV is calculated as (GSV Factor * Purchase Price) minus total annuity amount paid. The GSV Factor varies by policy year.",
    "No, there is no maturity benefit under this policy. This is explicitly stated in Part C of the policy document.",
    "The nominee can choose from: (1) Lump sum death benefit, (2) Annuitisation of the benefit amount into an immediate annuity, or (3) Receiving the benefit in installments over 5, 10, or 15 years.",
    "It is calculated as (Purchase Price * Monthly annuity rate) / 12. This accrues at the end of each policy month only during the deferment period."
]

## 🔹 Step 3.3 — Run the Chain on All Questions

In [24]:
rag_outputs = []
for q in questions:
    result = rag_chain.invoke({"question": q})
    rag_outputs.append(result)

## 🔹 Step 3.4 — Compute Faithfulness & Relevance (LLM-Eval)

In [25]:
from langchain.evaluation import load_evaluator

faithfulness_criteria = {"faithfulness": "Is the answer fully supported by the relevant information?"}
relevance_criteria = {"relevance": "Is the answer directly relevant to the question asked?"}

# "labeled_criteria" is the evaluator type that takes `reference` into account;
# the plain "criteria" evaluator ignores it and prints
# "To use references, use the labeled_criteria instead." on every call.
faithfulness_evaluator = load_evaluator("labeled_criteria", llm=llm_core, criteria=faithfulness_criteria)
relevance_evaluator = load_evaluator("labeled_criteria", llm=llm_core, criteria=relevance_criteria)

faithfulness_scores = []
relevance_scores = []

for pred, ref, q in zip(rag_outputs, ground_truths, questions):
    faith_eval_result = faithfulness_evaluator.evaluate_strings(prediction=pred, reference=ref, input=q)
    rel_eval_result = relevance_evaluator.evaluate_strings(prediction=pred, reference=ref, input=q)

    # Robust access with fallback
    faithfulness_scores.append(faith_eval_result.get("score", 0.0))
    relevance_scores.append(rel_eval_result.get("score", 0.0))

print("Faithfulness Scores:", faithfulness_scores)
print("Relevance Scores:", relevance_scores)


Faithfulness Scores: [1, 1, 1, 1, 1]
Relevance Scores: [1, 1, 1, 1, 1]
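
The `.get("score", 0.0)` fallback only helps when the `score` key is missing. Assuming the evaluator can also return a result whose `score` is `None` (for example when the verdict cannot be parsed), a later `round()` call would fail. A minimal sketch of a stricter accessor follows; `safe_score` is a hypothetical helper, and the dictionaries are hand-written stand-ins for evaluator results.

```python
def safe_score(result, default=0.0):
    """Return a numeric score from an evaluator result dict, or `default`."""
    score = result.get("score") if isinstance(result, dict) else None
    if score is None:
        # Covers both a missing key and an explicit None value
        return default
    return float(score)

# Hand-written stand-ins for evaluator results
print(safe_score({"score": 1, "value": "Y"}))
print(safe_score({"score": None, "value": ""}))
print(safe_score({}))
```

Using a sentinel such as `default=float("nan")` instead of `0.0` would keep unparsed verdicts distinguishable from genuine failures.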


## Step 3.5 — Compute Precision, Recall, F1

In [31]:
def compute_token_overlap_metrics(pred, ref):
    pred_tokens = set(pred.lower().split())
    ref_tokens = set(ref.lower().split())

    true_positives = len(pred_tokens & ref_tokens)
    precision = true_positives / len(pred_tokens) if pred_tokens else 0
    recall = true_positives / len(ref_tokens) if ref_tokens else 0
    f1 = (2 * precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
    return precision, recall, f1

precision_scores = []
recall_scores = []
f1_scores = []

for pred, ref in zip(rag_outputs, ground_truths):
    p, r, f1 = compute_token_overlap_metrics(pred, ref)
    precision_scores.append(p)
    recall_scores.append(r)
    f1_scores.append(f1)
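
Because the function above splits on whitespace, punctuation stays attached to words, so `benefit.` and `benefit` count as different tokens. The sketch below compares both behaviours on a short pair of sentences. `tokenize` and `overlap_metrics` are hypothetical variants written for this illustration, and the regular expression is just one possible normalisation.

```python
import re

def tokenize(text, strip_punct=True):
    text = text.lower()
    if strip_punct:
        # Keep letters, digits and '%' (for values such as 105%)
        return set(re.findall(r"[a-z0-9%]+", text))
    return set(text.split())

def overlap_metrics(pred, ref, strip_punct=True):
    pred_tokens = tokenize(pred, strip_punct)
    ref_tokens = tokenize(ref, strip_punct)
    true_positives = len(pred_tokens & ref_tokens)
    precision = true_positives / len(pred_tokens) if pred_tokens else 0.0
    recall = true_positives / len(ref_tokens) if ref_tokens else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

pred = "No, there is no maturity benefit under this policy."
ref = "There is no maturity benefit."

print("whitespace split:", overlap_metrics(pred, ref, strip_punct=False))
print("punctuation stripped:", overlap_metrics(pred, ref, strip_punct=True))
```

With punctuation stripped, this pair gives precision 0.625 and recall 1.0. With whitespace splitting, recall drops to 0.8 because `benefit.` no longer matches `benefit`.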

## 🔹 Step 3.6 — Final Metrics Table

In [30]:
!pip install -q tabulate
from tabulate import tabulate

results_table = []
for i, (q, r, f) in enumerate(zip(questions, relevance_scores, faithfulness_scores), 1):
    results_table.append([i, q[:70] + ("..." if len(q) > 70 else ""), round(r, 2), round(f, 2)])

print(tabulate(results_table, headers=["#", "Question", "Relevance", "Faithfulness"], tablefmt="fancy_grid"))

╒═════╤═══════════════════════════════════════════════════════════════════════════╤═════════════╤════════════════╕
│   # │ Question                                                                  │   Relevance │   Faithfulness │
╞═════╪═══════════════════════════════════════════════════════════════════════════╪═════════════╪════════════════╡
│   1 │ What is the death benefit under the Joint Life deferred annuity option... │           1 │              1 │
├─────┼───────────────────────────────────────────────────────────────────────────┼─────────────┼────────────────┤
│   2 │ What are the conditions under which the policy can be surrendered and ... │           1 │              1 │
├─────┼───────────────────────────────────────────────────────────────────────────┼─────────────┼────────────────┤
│   3 │ Is there any maturity benefit under this policy? Justify your answer f... │           1 │              1 │
├─────┼─────────────────────────────────────────────────────────────────────────

In [34]:
from langchain.evaluation import load_evaluator

# Define binary correctness evaluation against the ground-truth reference
correctness_criteria = {
    "correctness": "Is the answer factually correct according to the expected ground truth?"
}
# "labeled_criteria" makes the evaluator use `reference`; plain "criteria" ignores it
correctness_evaluator = load_evaluator("labeled_criteria", llm=llm_core, criteria=correctness_criteria)

correctness_scores = []

for pred, ref, q in zip(rag_outputs, ground_truths, questions):
    result = correctness_evaluator.evaluate_strings(prediction=pred, reference=ref, input=q)
    score = result.get("score", 0.0)  # Fallback if no score key
    correctness_scores.append(score)


In [35]:
from tabulate import tabulate

# Build the results
results_table = []
for i, (q, f, rel, corr) in enumerate(zip(
    questions, faithfulness_scores, relevance_scores, correctness_scores), 1):
    results_table.append([
        i,
        q[:60] + ("..." if len(q) > 60 else ""),
        round(f, 2),
        round(rel, 2),
        round(corr, 2)
    ])

# Display
print(tabulate(
    results_table,
    headers=["#", "Question", "Faithfulness", "Relevance", "Correctness"],
    tablefmt="fancy_grid"
))


╒═════╤═════════════════════════════════════════════════════════════════╤════════════════╤═════════════╤═══════════════╕
│   # │ Question                                                        │   Faithfulness │   Relevance │   Correctness │
╞═════╪═════════════════════════════════════════════════════════════════╪════════════════╪═════════════╪═══════════════╡
│   1 │ What is the death benefit under the Joint Life deferred annu... │              1 │           1 │             1 │
├─────┼─────────────────────────────────────────────────────────────────┼────────────────┼─────────────┼───────────────┤
│   2 │ What are the conditions under which the policy can be surren... │              1 │           1 │             1 │
├─────┼─────────────────────────────────────────────────────────────────┼────────────────┼─────────────┼───────────────┤
│   3 │ Is there any maturity benefit under this policy? Justify you... │              1 │           1 │             1 │
├─────┼─────────────────────────

So far, faithfulness and relevance were scored against the ground-truth answers. Let's score them against the retrieved context instead and keep the ground truth for correctness only.

| Metric       | Input                   | Reference              |
| ------------ | ----------------------- | ---------------------- |
| Faithfulness | 🔹 `question`, `answer` | 🔹 Retrieved context   |
| Relevance    | 🔹 `question`, `answer` | 🔹 Retrieved context   |
| Correctness  | 🔹 `question`, `answer` | 🔹 Ground truth answer |


In [36]:
from langchain.evaluation import load_evaluator

# Evaluators ("labeled_criteria" is the variant that actually uses `reference`;
# plain "criteria" ignores it and warns on every call)
faithfulness_evaluator = load_evaluator(
    "labeled_criteria", llm=llm_core, criteria={"faithfulness": "Is the answer fully supported by the given context?"})
relevance_evaluator = load_evaluator(
    "labeled_criteria", llm=llm_core, criteria={"relevance": "Is the answer directly relevant to the question, given the context?"})
correctness_evaluator = load_evaluator(
    "labeled_criteria", llm=llm_core, criteria={"correctness": "Is the answer factually correct according to the expected ground truth?"})

# Final scores
faithfulness_scores = []
relevance_scores = []
correctness_scores = []
rag_outputs = []  # answers that are actually scored here, reused by the tables below

for q, gt_answer in zip(questions, ground_truths):
    # 1. Get retrieved context and model output
    #    (the chain runs its own retrieval, so this context may differ slightly from what it saw)
    retrieved_docs = compression_retriever.invoke(q)
    retrieved_context = "\n".join([doc.page_content for doc in retrieved_docs])
    model_answer = rag_chain.invoke({"question": q})
    rag_outputs.append(model_answer)

    # 2. Evaluate
    faith_score = faithfulness_evaluator.evaluate_strings(prediction=model_answer, reference=retrieved_context, input=q)
    rel_score = relevance_evaluator.evaluate_strings(prediction=model_answer, reference=retrieved_context, input=q)
    corr_score = correctness_evaluator.evaluate_strings(prediction=model_answer, reference=gt_answer, input=q)

    # 3. Store scores (fallback to 0 if not found)
    faithfulness_scores.append(faith_score.get("score", 0.0))
    relevance_scores.append(rel_score.get("score", 0.0))
    correctness_scores.append(corr_score.get("score", 0.0))


In [37]:
from tabulate import tabulate

results_table = []
for i, (q, f, rel, corr) in enumerate(zip(
    questions, faithfulness_scores, relevance_scores, correctness_scores), 1):
    results_table.append([
        i,
        q[:60] + ("..." if len(q) > 60 else ""),
        round(f, 2),
        round(rel, 2),
        round(corr, 2)
    ])

print(tabulate(
    results_table,
    headers=["#", "Question", "Faithfulness", "Relevance", "Correctness"],
    tablefmt="fancy_grid"
))

╒═════╤═════════════════════════════════════════════════════════════════╤════════════════╤═════════════╤═══════════════╕
│   # │ Question                                                        │   Faithfulness │   Relevance │   Correctness │
╞═════╪═════════════════════════════════════════════════════════════════╪════════════════╪═════════════╪═══════════════╡
│   1 │ What is the death benefit under the Joint Life deferred annu... │              1 │           1 │             1 │
├─────┼─────────────────────────────────────────────────────────────────┼────────────────┼─────────────┼───────────────┤
│   2 │ What are the conditions under which the policy can be surren... │              1 │           1 │             1 │
├─────┼─────────────────────────────────────────────────────────────────┼────────────────┼─────────────┼───────────────┤
│   3 │ Is there any maturity benefit under this policy? Justify you... │              1 │           1 │             1 │
├─────┼─────────────────────────

In [40]:
from tabulate import tabulate

results_table = []
for i, (q, f, rel, corr, ans, gt) in enumerate(zip(
    questions, faithfulness_scores, relevance_scores, correctness_scores, rag_outputs, ground_truths), 1):

    results_table.append([
        i,
        q[:60] + ("..." if len(q) > 60 else ""),
        ans[:100] + ("..." if len(ans) > 100 else ""),
        gt[:100] + ("..." if len(gt) > 100 else ""),
        round(f, 2),
        round(rel, 2),
        round(corr, 2)
    ])

print(tabulate(
    results_table,
    headers=["#", "Question", "Answer", "Ground Truth", "Faithfulness", "Relevance", "Correctness"],
    tablefmt="fancy_grid"
))

╒═════╤═════════════════════════════════════════════════════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════╤════════════════╤═════════════╤═══════════════╕
│   # │ Question                                                        │ Answer                                                                                                  │ Ground Truth                                                                                            │   Faithfulness │   Relevance │   Correctness │
╞═════╪═════════════════════════════════════════════════════════════════╪═════════════════════════════════════════════════════════════════════════════════════════════════════════╪═════════════════════════════════════════════════════════════════════════════════════════════════════════╪════════════════╪═════════════╪═══════════════╡
│

In [41]:
for i, (q, ans, gt, f, rel, corr) in enumerate(zip(
    questions, rag_outputs, ground_truths, faithfulness_scores, relevance_scores, correctness_scores), 1):

    print("=" * 100)
    print(f"🔹 Question {i}:")
    print(q)
    print("\n📝 Model Answer:\n", ans)
    print("\n🎯 Ground Truth:\n", gt)
    print(f"\n📊 Scores → Faithfulness: {f:.2f} | Relevance: {rel:.2f} | Correctness: {corr:.2f}")

🔹 Question 1:
What is the death benefit under the Joint Life deferred annuity option after the deferment period?

📝 Model Answer:
 The death benefit under the Joint Life deferred annuity option after the deferment period is the higher of:
1. Purchase Price plus Accrued Additional Benefit on Death (as specified in Condition 3 of Part C of the policy document) minus the total annuity amount payable till the date of death, or
2. 105% of the Purchase Price. 

The Additional Benefit on Death accrues at the end of each policy month until the end of the Deferment Period only.

🎯 Ground Truth:
 After the deferment period, under the Joint Life deferred annuity option, on the first death, the annuity continues for the surviving annuitant. Upon the death of the last survivor, the death benefit payable is higher of (a) Purchase Price plus Accrued Additional Benefit on Death minus Total annuity amount paid, or (b) 105% of Purchase Price.

📊 Scores → Faithfulness: 1.00 | Relevance: 1.00 | Correctnes