# RAG Retrieval

This notebook demonstrates the **retrieval phase** of a
Retrieval-Augmented Generation (RAG) pipeline.

Two retrieval algorithms are explored:

- **Similarity Search**
- **Maximal Marginal Relevance (MMR) Search**

Additionally, the notebook introduces the concept of a
**Runnable Retriever** backed by a vector store.


## Retrieval Algorithms Overview

During retrieval, the goal is to select the most relevant
document chunks from a vector store given a user query.

Two common strategies are used:

1. **Similarity Search**  
   Retrieves documents that are most similar to the query embedding.

2. **Maximal Marginal Relevance (MMR)**  
   Retrieves documents that balance:
   - Relevance to the query
   - Diversity among retrieved documents
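
The trade-off between the two criteria is usually written as a single selection rule. The notation below is an assumption introduced for illustration and does not come from this notebook: $q$ is the query embedding, $R$ the candidate set, $S$ the documents already selected, $\mathrm{sim}$ a similarity function such as cosine similarity, and $\lambda \in [0, 1]$ the relevance/diversity weight.

$$
d^{*} = \arg\max_{d_i \in R \setminus S} \Big[ \lambda \, \mathrm{sim}(d_i, q) - (1 - \lambda) \max_{d_j \in S} \mathrm{sim}(d_i, d_j) \Big]
$$

With $\lambda = 1$ the rule reduces to plain similarity ranking; lower values penalize candidates that resemble documents already chosen.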


In [None]:
import getpass
import os
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_chroma import Chroma


In [None]:
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

## Vector Store Initialization

A Chroma vector store is loaded from disk.
It must be initialized with the **same embedding function**
used during indexing, so that query vectors and stored vectors
share the same embedding space.


In [None]:
embedding = OpenAIEmbeddings(model="text-embedding-3-small")

In [None]:
vectorstore = Chroma(
    persist_directory="./vectorstore/rag-practice",
    embedding_function=embedding,
)

### Optional: Adding a New Document

New documents can be embedded and added to the vector store.
This step is optional if the vector store is already populated.


In [None]:
added_document = Document(page_content='Alright! So… How are the techniques used in data, business intelligence, or predictive analytics applied in real life? Certainly, with the help of computers. You can basically split the relevant tools into two categories—programming languages and software. Knowing a programming language enables you to devise programs that can execute specific operations. Moreover, you can reuse these programs whenever you need to execute the same action', 
                          metadata={'Course Title': 'Introduction to Data and Data Science', 
                                    'Lecture Title': 'Programming Languages & Software Employed in Data Science - All the Tools You Need'})

In [None]:
# Uncomment to embed and store the document.
# Running this more than once adds duplicate entries.
# vectorstore.add_documents([added_document])

## Example Queries

Two related questions are used to demonstrate retrieval behavior.


In [None]:
question = "What programming languages do data scientists use?"

In [None]:
question2 = "What software do data scientists use?"

## Similarity Search

Similarity search retrieves the top-k documents
whose embeddings are closest to the query embedding.
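
To make "closest" concrete, here is a minimal pure-Python sketch of top-k retrieval. It assumes cosine similarity as the metric; the metric a real Chroma collection uses depends on how it was configured. The helper names (`cosine_similarity`, `top_k_similar`) and the two-dimensional toy vectors are hypothetical and only stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, doc_vecs, k):
    """Return the indices of the k vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(doc_vecs)
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-D "embeddings" (hypothetical values, not real model output)
toy_docs = [
    [1.0, 0.0],  # same direction as the query
    [0.9, 0.1],  # nearly the same direction
    [0.0, 1.0],  # orthogonal to the query
    [0.7, 0.7],  # halfway between
]
toy_query = [1.0, 0.0]

top_two = top_k_similar(toy_query, toy_docs, k=2)
print(top_two)
```

A vector store does the same ranking over stored chunk embeddings, typically with an approximate nearest-neighbor index rather than the exhaustive scan used here.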


In [None]:
retrieved_docs = vectorstore.similarity_search(query=question, k=5)

In [None]:
for doc in retrieved_docs:
    print(f"Page Content: {doc.page_content}\n----------------\nLecture Title: {doc.metadata['Lecture Title']}\n")

In [None]:
retrieved_docs2 = vectorstore.similarity_search(query=question2, k=3)

In [None]:
for doc in retrieved_docs2:
    print(f"Page Content: {doc.page_content}\n----------------\nLecture Title: {doc.metadata['Lecture Title']}\n")

## Maximal Marginal Relevance (MMR) Search

MMR search retrieves documents that are both:
- Relevant to the query
- Diverse relative to each other

This is useful when retrieved documents are highly similar
and redundancy should be reduced.
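
The greedy selection behind MMR can be sketched in a few lines of pure Python. This is an illustrative sketch under stated assumptions, not LangChain's implementation: it assumes cosine similarity, the function name `mmr_select` is hypothetical, and the parameter names `k`, `fetch_k`, and `lambda_mult` are chosen only to mirror the arguments of `max_marginal_relevance_search`. The toy vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, doc_vecs, k, fetch_k, lambda_mult):
    """Greedily pick k indices that trade off relevance against redundancy."""
    relevance = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    # Step 1: shortlist the fetch_k most relevant candidates
    candidates = sorted(
        range(len(doc_vecs)), key=lambda i: relevance[i], reverse=True
    )[:fetch_k]
    selected = []
    # Step 2: repeatedly take the candidate with the best MMR score
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            # Redundancy = similarity to the closest already-selected document
            redundancy = max(
                (cosine_similarity(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy 2-D "embeddings": docs 0 and 1 are near-duplicates, doc 2 is different
docs = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
query = [1.0, 0.5]

balanced = mmr_select(query, docs, k=2, fetch_k=3, lambda_mult=0.5)
relevance_only = mmr_select(query, docs, k=2, fetch_k=3, lambda_mult=1.0)
print(balanced, relevance_only)
```

In this toy setup, `lambda_mult=1.0` behaves like plain similarity search and returns both near-duplicates, while `lambda_mult=0.5` swaps the second near-duplicate for the dissimilar document. A small `fetch_k` limits how much diversity is possible, because only shortlisted candidates can be selected.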


### Why MMR Can Fail in Practice

MMR search may fail or raise errors in some environments:
it currently appears to be unstable with certain LangChain setups.
For that reason, the call below is left commented out.


In [None]:
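# retrieved_docs = vectorstore.max_marginal_relevance_search(
#     query=question2,
#     k=3,              # number of documents to return
#     fetch_k=10,       # number of candidates fetched before re-ranking
#     lambda_mult=0.5,  # 0 = maximum diversity, 1 = minimum diversity
# )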


## Runnable Retriever

LangChain provides a retriever abstraction that wraps
a vector store as a **Runnable**.

This allows retrieval to be composed into LCEL chains.


In [None]:
# Number of document chunks currently stored in the vector store
len(vectorstore.get()['documents'])

In [None]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

In [None]:
retriever  # a VectorStoreRetriever, which implements the Runnable interface

In [None]:
retrieved_docs_runnable = retriever.invoke(question2)

In [None]:
retrieved_docs_runnable

## Summary

This notebook demonstrated the retrieval phase of a RAG pipeline:

- Similarity-based document retrieval
- Conceptual understanding of MMR search
- A caveat that MMR search may fail in some environments
- Using a vector store as a Runnable retriever

These retrieval strategies form the foundation
for retrieval-augmented question answering systems.
