<a href="https://colab.research.google.com/github/duper203/RAG_Techniques_with_upstage/blob/main/upstage/25_explainable_retrieval.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Explainable Retrieval in Document Search


## Key Components
1. Vector store creation from input texts
2. Base retriever using FAISS for efficient similarity search
3. Language model (LLM) for generating explanations
4. Custom ExplainableRetriever class that combines retrieval and explanation generation
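
To show how the four components fit together before any API is involved, here is a minimal, dependency-free sketch. It is not the notebook's implementation: `ToyExplainableRetriever`, `embed`, and `cosine` are hypothetical names, the bag-of-words vectors stand in for a real embedding model, and the "explanation" is a simple template listing shared terms where the notebook uses an LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (assumed stand-in for a real model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyExplainableRetriever:
    def __init__(self, texts, k=2):
        self.k = k
        # Component 1: the "vector store" is a list of (text, vector) pairs.
        self.index = [(text, embed(text)) for text in texts]

    def retrieve(self, query):
        # Component 2: rank stored texts by similarity to the query, keep the top k.
        q = embed(query)
        ranked = sorted(self.index, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[: self.k]]

    def explain(self, query, text):
        # Component 3: a template stands in for the LLM-generated explanation.
        shared = sorted(set(embed(query)) & set(embed(text)))
        return f"Shares terms with the query: {', '.join(shared) or 'none'}"

    def retrieve_and_explain(self, query):
        # Component 4: pair every retrieved text with its explanation.
        return [
            {"content": text, "explanation": self.explain(query, text)}
            for text in self.retrieve(query)
        ]


toy = ToyExplainableRetriever([
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
    "Photosynthesis is the process by which plants use sunlight to produce energy.",
])
results = toy.retrieve_and_explain("Why is the sky blue?")
print(results[0]["explanation"])
```

The return value has the same shape as the real class further down (a list of dicts with `content` and `explanation` keys), so the display loop at the end of the notebook would work on either.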

In [None]:
! pip3 install -qU langchain-upstage langchain langchain-community faiss-cpu sentence_transformers

In [3]:
import os
from langchain_upstage import UpstageEmbeddings, ChatUpstage
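from langchain_core.prompts import PromptTemplate
# FAISS lives in langchain-community; the old langchain.vectorstores path is deprecated
from langchain_community.vectorstores import FAISS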

from google.colab import userdata
os.environ["UPSTAGE_API_KEY"] = userdata.get("UPSTAGE_API_KEY")

## Define the explainable retriever class

In [4]:
class ExplainableRetriever:
    def __init__(self, texts):
        self.embeddings = UpstageEmbeddings(model="solar-embedding-1-large")

        self.vectorstore = FAISS.from_texts(texts, self.embeddings)
        self.llm = ChatUpstage(model='solar-pro')


        # Create a base retriever that returns up to the 5 most similar texts
        self.retriever = self.vectorstore.as_retriever(search_kwargs={"k": 5})

        # Create an explanation chain
        explain_prompt = PromptTemplate(
            input_variables=["query", "context"],
            template="""
            Analyze the relationship between the following query and the retrieved context.
            Explain why this context is relevant to the query and how it might help answer the query.

            Query: {query}

            Context: {context}

            Explanation:
            """
        )
        self.explain_chain = explain_prompt | self.llm

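    def retrieve_and_explain(self, query):
        """Return a list of {"content", "explanation"} dicts, one per retrieved document."""
        # Retrieve relevant documents (invoke replaces the deprecated get_relevant_documents)
        docs = self.retriever.invoke(query)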

        explained_results = []

        for doc in docs:
            # Generate explanation
            input_data = {"query": query, "context": doc.page_content}
            explanation = self.explain_chain.invoke(input_data).content

            explained_results.append({
                "content": doc.page_content,
                "explanation": explanation
            })

        return explained_results
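
Because the prompt template above is a triple-quoted string written inside an indented method, every line of it carries the cell's indentation into the text sent to the model. A minimal sketch of one way to strip it, using only the standard library (`raw_template`, `clean_template`, and `prompt_text` are illustrative names, not part of the notebook):

```python
import textwrap

# Same wording as the notebook's prompt, including the cell's indentation.
raw_template = """
            Analyze the relationship between the following query and the retrieved context.
            Explain why this context is relevant to the query and how it might help answer the query.

            Query: {query}

            Context: {context}

            Explanation:
            """

# dedent removes the common leading whitespace; strip drops the blank first and last lines.
clean_template = textwrap.dedent(raw_template).strip()

# Fill in the placeholders the same way the prompt would be filled at query time.
prompt_text = clean_template.format(
    query="Why is the sky blue?",
    context="The sky is blue because of the way sunlight interacts with the atmosphere.",
)
print(prompt_text)
```

Under these assumptions, `clean_template` could be passed as the `template=` argument of `PromptTemplate` in place of the indented string; the placeholders `{query}` and `{context}` are unchanged.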

## Create a mock example and explainable retriever instance

In [5]:
# Usage
texts = [
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
    "Photosynthesis is the process by which plants use sunlight to produce energy.",
    "Global warming is caused by the increase of greenhouse gases in Earth's atmosphere."
]

explainable_retriever = ExplainableRetriever(texts)


## Show the results

In [7]:
query = "Why is the sky blue?"
results = explainable_retriever.retrieve_and_explain(query)

for i, result in enumerate(results, 1):
    print(f"Result {i}:")
    print(f"Content: {result['content']}")
    print(f"Explanation: {result['explanation']}")
    print()

Result 1:
Content: The sky is blue because of the way sunlight interacts with the atmosphere.
Explanation: 
            The context is directly relevant to the query as it provides a concise and accurate answer to the question "Why is the sky blue?" The explanation given in the context, "The sky is blue because of the way sunlight interacts with the atmosphere," highlights the primary reason for the sky's blue color, which is the scattering of sunlight by the Earth's atmosphere. This context can help answer the query by satisfying the user's curiosity about the phenomenon and potentially encouraging further learning about the topic.

Result 2:
Content: Global warming is caused by the increase of greenhouse gases in Earth's atmosphere.
Explanation: 
Explanation: The provided context is not directly relevant to the query "Why is the sky blue?" as it discusses global warming and greenhouse gases, while the query is about the color of the sky. However, it might be indirectly related becaus