# Explainable retrieval in document search
This notebook demonstrates how to build an explainable retriever — a system that not only finds relevant documents for a user query but also explains why those documents were selected. This is useful when we want to make our retrieval process transparent, interpretable, and trustworthy. Being able to justify why a piece of text was retrieved is just as important as the retrieval itself.

Most document retrieval systems, especially those using vector similarity, return results as a black box. We get a result, but we are not told why. This can be frustrating, especially when trust or traceability is important. To address this, we add natural language explanations using an LLM: after we retrieve a document, we pass both the document and the original query to the LLM and ask it to explain why that document is relevant.

In [1]:
import os
from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS  # vector stores now live in langchain_community
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# Load environment variables from a .env file
load_dotenv()

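# Access the API key and fail early with a clear message if it is missing
api_key = os.getenv("OPENAI_API_KEY")
if api_key is None:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
os.environ["OPENAI_API_KEY"] = api_key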

### Prepare the document collection
Here we define the “document collection” — a few sample texts. These are what we will be searching through.

In [2]:
texts = [
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
    "Photosynthesis is the process by which plants use sunlight to produce energy.",
    "Global warming is caused by the increase of greenhouse gases in Earth's atmosphere."
]

We are keeping it simple for now with just three facts, but this approach works just as well for paragraphs or entire documents.

### Prepare the core retrieval components
To build our explainable retriever, we need:
- Embeddings: To convert text into vector space.
- Vector store (FAISS): To store and search those vectors.
- LLM (ChatOpenAI): To explain results in natural language.

#### Create the vector store
Now let's embed texts using OpenAI embeddings and store them using FAISS for similarity search.


In [3]:
# Convert input texts to embeddings and store them
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(texts, embeddings)

- First, we load OpenAI’s embedding model. This is a tool that turns each sentence into a numerical vector based on its meaning.
- Then, we pass our list of `texts` into that embedding model.
- The results — those numerical representations — are then stored in FAISS.
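To make the embed-then-search idea concrete without calling any external service, here is a minimal sketch in plain Python. It is not how OpenAI embeddings or FAISS work internally: `toy_embed` is a hypothetical stand-in that uses word counts instead of a learned model, and `search` is a brute-force scan rather than a FAISS index. The names `toy_embed`, `cosine_similarity`, `sample_texts`, `index`, and `search` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

sample_texts = [
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
    "Photosynthesis is the process by which plants use sunlight to produce energy.",
    "Global warming is caused by the increase of greenhouse gases in Earth's atmosphere.",
]

# "Index" the collection once, as a vector store does at build time
index = [(text, toy_embed(text)) for text in sample_texts]

def search(query, k=1):
    """Embed the query, score it against every stored vector, return the best k texts."""
    query_vector = toy_embed(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

print(search("Why is the sky blue?"))
```

A real embedding model captures meaning rather than shared words, so it can match a query to a text that uses entirely different vocabulary; the flow of embedding, storing, and comparing is the same.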

#### Set up the retriever
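We now create a retriever that lets us search through the vector store. It is configured to return up to 5 of the most relevant pieces of text; since our collection holds only three texts, every query returns at most three results.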

In [4]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

What this does is wrap our FAISS vector store in a retriever interface, so the rest of the pipeline can query it with a single call.
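The `k` setting acts as an upper bound rather than a guarantee. The sketch below illustrates that with a hypothetical `top_k` helper and made-up similarity scores; it is not the retriever's actual implementation.

```python
def top_k(scored_items, k):
    """Return up to k items, highest score first."""
    ranked = sorted(scored_items, key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:k]]

# Made-up scores for a three-document collection (illustrative only)
scored = [("sky", 0.62), ("photosynthesis", 0.26), ("global warming", 0.24)]

results = top_k(scored, k=5)
print(len(results))  # 3: asking for 5 cannot return more than the collection holds
```

With a larger collection, `k` becomes the knob that trades coverage against the number of LLM calls needed to explain each result.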

### Build the explanation generator
Here, we use a prompt to tell the language model: "Explain why this chunk of text is relevant to this query." We then wire that prompt into GPT-4o-mini to create a full explanation chain.

In [5]:
llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18", max_tokens=4000)

# Define a prompt that asks the LLM to explain why a document is relevant to the query
explain_prompt = PromptTemplate(
    input_variables=["query", "context"],
    template="""
    Analyze the relationship between the following query and the retrieved context.
    Explain why this context is relevant to the query and how it might help answer the query.

    Query: {query}

    Context: {context}

    Explanation:
    """
)

# Chain the prompt to the LLM
explain_chain = explain_prompt | llm

- First, we initialize an instance of OpenAI’s GPT model. Setting `temperature=0` makes the output as repeatable as possible: the model will usually give the same output for the same input, although exact determinism is not guaranteed. This suits explanations, where we want consistent reasoning rather than creative variations.
- Then, we are defining a prompt template. This is a reusable structure that we will fill in later with a real query and a real chunk of retrieved content. This kind of prompt engineering helps the LLM give structured and focused responses.
  - `input_variables=["query", "context"]`: This tells the system which variables will be filled in when the template is used.
  - The `template` itself is a clear instruction to the LLM: it gets the query and the retrieved content and is asked to generate an explanation of their connection.
- Then, we use LangChain's LCEL (LangChain Expression Language) syntax to create a chain by "piping" the filled-in prompt directly into the language model. What this means practically is: for every query + retrieved document pair, the system will inject those values into the prompt template, send it to the model, and return the model’s generated explanation.
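To see what "injecting values into the template" amounts to, the following sketch fills the same template text with plain `str.format` and hands the result to a stub. It assumes the default f-string style placeholders; `TEMPLATE_TEXT`, `render_prompt`, `fake_llm`, and `explain` are hypothetical names, and `fake_llm` only stands in for the real model so the example runs offline.

```python
TEMPLATE_TEXT = """
Analyze the relationship between the following query and the retrieved context.
Explain why this context is relevant to the query and how it might help answer the query.

Query: {query}

Context: {context}

Explanation:
"""

def render_prompt(query, context):
    """Fill the placeholders, roughly what the prompt step of the chain does."""
    return TEMPLATE_TEXT.format(query=query, context=context)

def fake_llm(prompt_text):
    """Hypothetical stand-in for the model call; returns a canned string."""
    return f"[explanation for a prompt of {len(prompt_text)} characters]"

def explain(query, context):
    """Two steps in sequence: render the prompt, then call the model."""
    return fake_llm(render_prompt(query, context))

rendered = render_prompt(
    "Why is the sky blue?",
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
)
print(rendered)
```

The pipe operator in the notebook composes the same two steps, with the real prompt template and chat model in place of these stand-ins.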

### Run a query and get explanations
Now we are ready to try a real query. We will retrieve the relevant results and pass them through the explanation generator.

In [7]:
query = "Why is the sky blue?"

# Retrieve matching documents
docs = retriever.invoke(query)

# Generate explanations for each result
explained_results = []

for doc in docs:
    # Prepare inputs
    input_data = {
        "query": query,
        "context": doc.page_content
    }
    # Ask the LLM to explain the relevance
    explanation = explain_chain.invoke(input_data).content

    # Collect the result with its explanation
    explained_results.append({
        "content": doc.page_content,
        "explanation": explanation
    })

- First, the user provides a natural language query.
- Then we call `.invoke(query)` on the retriever. Behind the scenes, the retriever takes the query, converts it into an embedding, and compares it to the precomputed embeddings in the FAISS vector store. It returns the top matches: we configured `k=5`, but with only three texts in the collection it returns at most three.
- So now we have a list of documents that are likely relevant, but we don’t yet know why they were chosen. That is where the explanation chain comes in. For each retrieved document, we:
  - Take the original query (the user's question).
  - Take the document content (the candidate match).
  - Format them into a structured input for the LLM (via the explanation prompt we defined earlier).
  - Pass this to the language model, which responds with a natural-language explanation of how the document is related to the query.
- We then store each document along with its explanation in a list of dictionaries, so we can easily display or analyze them later.
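The steps above can be sketched as a small function that takes the explainer as a parameter, which makes the loop easy to test without an API key. `explain_results` and `stub_explainer` are hypothetical names introduced here; the stub only reports shared words and is not a substitute for the LLM's reasoning.

```python
def explain_results(query, documents, explain_fn):
    """Pair each retrieved text with an explanation produced by explain_fn."""
    results = []
    for content in documents:
        results.append({
            "content": content,
            "explanation": explain_fn(query, content),
        })
    return results

def stub_explainer(query, content):
    """Hypothetical offline explainer: lists the words the query and text share."""
    query_words = set(query.lower().strip("?").split())
    content_words = set(content.lower().strip(".").split())
    shared = sorted(query_words & content_words)
    return "Shares these words with the query: " + ", ".join(shared)

demo = explain_results("Why is the sky blue?", ["The sky is blue."], stub_explainer)
print(demo[0]["explanation"])
```

In the notebook, the equivalent of `explain_fn` is the call to `explain_chain.invoke(...)` followed by reading `.content` from the response.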

### Show the final results
Let’s print out each result and its explanation. This makes the retrieval process clear and understandable — we see not just *what* was retrieved, but *why*.

In [8]:
for i, result in enumerate(explained_results, 1):
    print(f"Result {i}:")
    print(f"Content: {result['content']}")
    print(f"Explanation: {result['explanation']}")
    print()

Result 1:
Content: The sky is blue because of the way sunlight interacts with the atmosphere.
Explanation: The context provided directly addresses the query by explaining the reason behind the phenomenon of a blue sky. The query asks for an explanation of why the sky appears blue, and the context succinctly states that this is due to the interaction of sunlight with the atmosphere.

The relevance of the context lies in its focus on the scientific principles involved in the scattering of light. Specifically, it implies that the blue color of the sky is a result of Rayleigh scattering, where shorter wavelengths of light (blue) are scattered more than longer wavelengths (red) when sunlight passes through the Earth's atmosphere. This explanation is fundamental to understanding the query.

By providing this context, it helps answer the query by offering a clear and concise reason for the blue appearance of the sky, which is likely what the person asking the question is seeking. The context 