# rag4rag: Using RAG to generate data for model fine-tuning

In this notebook, we accomplish the following:
* Load our dataset of synthetically generated answers, along with their hallucinated spans and confidence scores
* Choose a confidence score threshold for classifying an answer as a hallucination
* Subset the dataset to include only hallucinations
* Use a vanilla RAG pipeline (LangChain + FAISS) to generate new answers
* Run the new RAG answers through LettuceDetect to check for hallucinations
* Compare the new RAG answers with the previous answers

In [1]:
import openai

import pandas as pd
import plotly.express as px

from datasets import Dataset, load_dataset
from lettucedetect.models.inference import HallucinationDetector

from langchain.prompts import PromptTemplate
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.docstore.document import Document
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

In [2]:
import warnings
import logging

# Suppress warnings
warnings.filterwarnings("ignore")

# Suppress all logging below ERROR level for the root logger
logging.getLogger().setLevel(logging.ERROR)

In [3]:
# Load the synthetic RAG dataset with detected hallucinations
ds = load_dataset("m-newhauser/rag-synthetic-distilabel-hallucinations")
ds

DatasetDict({
    train: Dataset({
        features: ['context', 'anchor', 'human_positive', 'synthetic_positive', 'synthetic_negative', 'hallucinated_span', 'confidence'],
        num_rows: 4989
    })
})

In [4]:
# Transform dataset to dataframe
df = ds["train"].to_pandas()

# Preview the dataset
df.head()

| | context | anchor | human_positive | synthetic_positive | synthetic_negative | hallucinated_span | confidence |
|---|---|---|---|---|---|---|---|
| 0 | Architecturally, the school has a Catholic cha... | To whom did the Virgin Mary allegedly appear i... | Saint Bernadette Soubirous | The Virgin Mary allegedly appeared to Bernadet... | The Virgin Mary appeared in the sky as the sun... | | |
| 1 | Architecturally, the school has a Catholic cha... | What is in front of the Notre Dame Main Building? | a copper statue of Christ | In front of the Notre Dame Main Building, you'... | The main building's roof is painted in bright ... | | |
| 2 | Architecturally, the school has a Catholic cha... | The Basilica of the Sacred heart at Notre Dame... | the Main Building | The Basilica of the Sacred Heart at Notre Dame... | The basilica's heart-shaped design was inspire... | | |
| 3 | Architecturally, the school has a Catholic cha... | What is the Grotto at Notre Dame? | a Marian place of prayer and reflection | The Grotto at Notre Dame is a sacred replica o... | The grotto was filled with colorful lights and... | | |
| 4 | Architecturally, the school has a Catholic cha... | What sits on top of the Main Building at Notre... | a golden statue of the Virgin Mary | The iconic Golden Dome sits on top of the Main... | The main course sits on top of the dining tabl... | | |


In [5]:
# Manually set threshold
threshold = 0.9

# Filter the DataFrame for hallucinations based on the threshold
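hallucinations_df = (
    df
    # Coerce blanks and strings to NaN so the numeric comparison is well-defined
    .assign(confidence=lambda d: pd.to_numeric(d["confidence"], errors="coerce"))
    # NaN compares as False, so rows without a detected span are dropped
    .query("confidence >= @threshold")
)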
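The 0.9 cutoff above is set by hand. One quick way to sanity-check it is to count how many rows would be flagged at a few candidate thresholds. The sketch below uses made-up confidence scores and a hypothetical helper, `flagged_counts`. It treats blank or missing scores as "no hallucination detected", which is an assumption about how the dataset encodes them.

```python
def flagged_counts(confidences, thresholds):
    """Return {threshold: number of scores >= threshold}, ignoring blanks."""
    # Keep only real numbers; '' and None are assumed to mean "no span detected"
    scores = [c for c in confidences if isinstance(c, (int, float))]
    return {t: sum(s >= t for s in scores) for t in thresholds}

# Toy scores for illustration only, not taken from the dataset
toy_confidences = [0.95, 0.40, "", 0.91, None, 0.72, 0.99]
counts = flagged_counts(toy_confidences, thresholds=[0.5, 0.7, 0.9])
print(counts)
```

A threshold near a point where the count changes sharply deserves a closer look at the rows on either side of it.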

In [9]:
# OpenAI API key variable name
openai_api_key_var = "OPENAI_API_KEY"  # Replace with the name of your secret/env var

# Fetch API key from environment variable
import os
openai_api_key = os.getenv(openai_api_key_var)
if not openai_api_key:
    raise EnvironmentError(
        f"Environment variable '{openai_api_key_var}' is not set. "
        "Please define it before running this script."
    )

# Assign the key itself, not the name of the environment variable
openai.api_key = openai_api_key

This LangChain RAG pipeline first builds a [FAISS](https://github.com/facebookresearch/faiss) vector store from the unique contexts, embedding them with OpenAI's embeddings. It then uses LangChain's `RetrievalQA` chain, configured with an OpenAI LLM, a custom extraction prompt, and a retriever that returns the single most similar context (`k=1`), to generate an answer for each input question.
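To make the retrieval step concrete, here is a minimal sketch of top-`k` retrieval by cosine similarity. It is not what FAISS or OpenAI's embeddings do internally: the `embed` function is a toy word-count vector standing in for a learned embedding, the exhaustive sort stands in for FAISS's index search, and the two contexts are invented for illustration. The names `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: counts of lowercase words (stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, contexts, k=1):
    """Return the k contexts most similar to the question."""
    q = embed(question)
    ranked = sorted(contexts, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Invented contexts for illustration only
toy_contexts = [
    "The Grotto is a Marian place of prayer and reflection.",
    "The library holds several million volumes.",
]
top = retrieve("What is the Grotto?", toy_contexts, k=1)
print(top)
```

With `k=1`, the LLM sees only the single best-matching context, so a retrieval miss leaves it without the passage that contains the answer.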

In [10]:
# Prepare data
contexts = hallucinations_df["context"].tolist()
anchors = hallucinations_df["anchor"].tolist()
human_positives = hallucinations_df["human_positive"].tolist()

# Create a vectorstore from the *unique* contexts
unique_contexts = hallucinations_df["context"].unique().tolist()
docs = [Document(page_content=ctx) for ctx in unique_contexts]
embedding = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embedding)

prompt_template = """You are an expert information extraction system.
Your task is to answer the question using ONLY the information provided in the following context.
The answer to the question is GUARANTEED to be found DIRECTLY within the context.
You MUST provide the exact answer as it appears in the context, without adding any extra words or explanations.
Your answer must be as concise as possible and not a full sentence.
Do not say things like "According to the context," or rephrase the answer.
If the question asks for a specific piece of information (e.g., a year, a name), provide ONLY that specific piece of information.

Context:
{context}

Question: {question}

Answer:"""

CUSTOM_PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"]
)

# Create a retriever + QA chain
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(), 
    retriever=retriever,
    chain_type_kwargs={"prompt": CUSTOM_PROMPT},
)

# Run the RAG pipeline for each anchor and strip newlines
rag_answers = []
for anchor in anchors:
    answer = qa.run(anchor)
    cleaned_answer = answer.replace('\n', ' ').strip()  # Replace newlines with spaces and trim
    rag_answers.append(cleaned_answer)

# Put the answers into a dataframe, aligning by index
df = pd.DataFrame({
    "context": contexts,
    "question": anchors,
    "human_positive": human_positives,
    "rag_positive": rag_answers,
})

In [6]:
# Load the hallucination detector model
detector = HallucinationDetector(
    method="transformer", model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1"
)

In [12]:
# Run over the RAG dataset
def predict_hallucinations(row):
    predictions = detector.predict(
        context=[row['context']],
        question=row['question'],
        answer=row['rag_positive'],
        output_format="spans"
    )
    # Assuming predictions is a list of dictionaries
    if predictions:
        return predictions[0].get('text', ''), predictions[0].get('confidence', 0.0)
    return '', ''

# Apply the function to each row of the DataFrame
df[['hallucinated_span', 'confidence']] = df.apply(predict_hallucinations, axis=1, result_type='expand')
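`predict_hallucinations` keeps only the first span returned for each answer. If an answer yields several spans, one alternative is to keep the span with the highest confidence. The sketch below assumes, as the cell above does, that each prediction is a dictionary with `text` and `confidence` keys. `best_span` is a hypothetical helper, and the sample spans are invented.

```python
def best_span(predictions):
    """Return (text, confidence) of the most confident span, or ('', None) if none."""
    if not predictions:
        return "", None
    # Pick the span whose confidence is largest; missing scores count as 0.0
    top = max(predictions, key=lambda p: p.get("confidence", 0.0))
    return top.get("text", ""), top.get("confidence", 0.0)

# Invented spans for illustration only
toy_predictions = [
    {"text": "in 1920", "confidence": 0.62},
    {"text": "a silver statue", "confidence": 0.97},
]
print(best_span(toy_predictions))
print(best_span([]))
```

Returning `None` rather than an empty string for a missing score keeps the confidence column numeric once it is loaded into a DataFrame.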

In [13]:
# Replace blank strings with NaN
df['confidence'] = df['confidence'].replace('', pd.NA)

# Convert the column to numeric (float or int)
df['confidence'] = pd.to_numeric(df['confidence'], errors='raise')

In [14]:
# Convert to Dataset
ds = Dataset.from_pandas(df)

# Save to Hub
ds.push_to_hub("m-newhauser/rag4rag-synthetic-hallucinations")

In [21]:
# Subset hallucinations based on the threshold
threshold = 0.9
# confidence is numeric here; rows with no detected span are NaN and fail the comparison
hallucinations_df = df.query(f"confidence >= {threshold}")

In [None]:
# Print stats regarding hallucinations
print(f"Total hallucinations detected: {hallucinations_df.shape[0]} ({hallucinations_df.shape[0]/df.shape[0] * 100:.2f}%)")
