# **Online Query: Retrieval & Reasoning**

This notebook implements the **Online Query stage**, focused on *retrieval and reasoning*.  
It connects to the persisted `ChromaDB` vector store created by the Ingestion Pipeline, retrieves the most relevant context for a user question, and uses an LLM to generate answers grounded in that context.

The main goals of this phase are:
- To load and query the existing vector database (`db/chroma`);
- To test and evaluate the retriever and prompt templates;
- To execute reasoning chains combining context and user questions;
- To validate the full Retrieval-Augmented Generation (RAG) pipeline in interactive mode.

> 💡 Use this notebook to explore how retrieval and reasoning work together to produce context-aware, explainable answers.


**Notebook Setup and Autoreload**

In [1]:
# --- Notebook setup and autoreload configuration ---
# This cell runs the initial setup script and enables the autoreload extension.
# The autoreload feature ensures that updates made to imported modules (e.g., in src/)
# are automatically reloaded without restarting the kernel.
%run notebook_setup.py

%load_ext autoreload

%autoreload 2

Notebook environment configured successfully!

Project root: /home/ilfn/datascience/workspace/rag-movie-plots
Added to sys.path:
  - /home/ilfn/datascience/workspace/rag-movie-plots/src
  - /home/ilfn/datascience/workspace/rag-movie-plots/src/backend
PYTHONPATH: /home/ilfn/datascience/workspace/rag-movie-plots/src
Current working directory: /home/ilfn/datascience/workspace/rag-movie-plots/notebooks
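
The contents of `notebook_setup.py` are not shown in this notebook. As a rough sketch only, a setup script that produces output like the above might prepend the project's source folders to `sys.path`. The helper name `add_project_paths` and its default sub-directories are assumptions made for illustration, not the project's actual code.

```python
import sys
from pathlib import Path
from typing import List, Sequence


def add_project_paths(
    project_root: Path,
    subdirs: Sequence[str] = ("src", "src/backend"),  # assumed layout
) -> List[str]:
    """Prepend project source folders to sys.path (hypothetical helper).

    Returns the paths that were newly added, so the caller can print them.
    """
    added = []
    for sub in subdirs:
        path = str((project_root / sub).resolve())
        # Skip paths that are already importable to keep the call idempotent.
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added
```

From a notebook stored in `notebooks/`, such a helper could be called as `add_project_paths(Path.cwd().parent)`. Calling it a second time adds nothing.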


In [None]:
# import os, sys
# print("CWD:", os.getcwd())
# print("PYTHONPATH:", sys.path[:3])

**1. Retriever Sanity Check**

This initial cell verifies that the persisted Chroma vector store can be successfully loaded and queried.

It performs a quick retrieval test using a few example questions, printing the top-k most similar documents and their metadata.

This helps confirm that:
* the embeddings model matches the stored vectors
* the Chroma database is accessible and not corrupted
* the retriever is returning meaningful context
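
To make the idea of "top-k most similar documents" concrete, here is a minimal, dependency-free sketch of ranking stored vectors against a query vector. It assumes cosine similarity as the metric, whereas the real vector store may be configured with a different distance function. The function names and toy vectors are illustrative and not part of the project. The dimension check shows why the query embedding model must match the one used at ingestion time.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal dimension."""
    if len(a) != len(b):
        # A mismatch here is what happens when the query embedding model
        # differs from the model used to build the stored vectors.
        raise ValueError(f"Dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query_vec: Sequence[float],
    stored: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors; real sentence embeddings have hundreds of dimensions.
toy_store = [
    ("doc_a", [1.0, 0.0, 0.0]),
    ("doc_b", [0.7, 0.7, 0.0]),
    ("doc_c", [0.0, 0.0, 1.0]),
]
toy_results = top_k_similar([1.0, 0.1, 0.0], toy_store, k=2)
for doc_id, score in toy_results:
    print(f"{doc_id}: {score:.3f}")
```

A production vector store uses an approximate nearest-neighbour index instead of this exhaustive scan, but the ranking principle is the same.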

In [2]:
from backend.retriever.retriever import Retriever

top_k = 5

# Load the retriever (top_k controls how many documents to fetch)
retriever = Retriever(top_k=top_k).load()

# Example test questions
test_questions = [
    "Who directed Titanic and what is the movie about?",
    "List some science fiction movies from the 1990s.",
    "Which movies were made in India?"
]

print(f"\nTesting {len(test_questions)} questions with top_k={top_k}...\n")

# Run retrieval and print results
for i, question in enumerate(test_questions, start=1):
    print(f"\n=== Question {i} ===")
    print(f"Q: {question}\n")

    try:
        docs = retriever.invoke(question)
        print("\nRetrieved Documents:")
        for j, doc in enumerate(docs, start=1):
            metadata = doc.metadata
            source_id = metadata.get("source_id", "N/A")
            title = metadata.get("Title", "N/A")
            release_year = metadata.get("Release Year", "N/A")
            wiki_page = metadata.get("Wiki Page", "N/A")
            origin = metadata.get("Origin/Ethnicity", "N/A")
            director = metadata.get("Director", "N/A")
            cast = metadata.get("Cast", "N/A")
            genre = metadata.get("Genre", "N/A")

            print(f"\nDoc {j}")
            print(f"  • Source ID: {source_id}")
            print(f"  • Title: {title}")
            print(f"  • Release Year: {release_year}")
            print(f"  • Wiki Page: {wiki_page}")
            print(f"  • Origin/Ethnicity: {origin}")
            print(f"  • Director: {director}")
            print(f"  • Cast: {cast}")
            print(f"  • Genre: {genre}")
            print(f"  • Content Preview: {doc.page_content[:300]}...\n")
    except Exception as e:
        print(f"[Error] Could not retrieve results: {e}")

    print("-" * 100)



Loading vector store from: /home/ilfn/datascience/workspace/rag-movie-plots/db/chroma
Using embedding model: sentence-transformers/all-MiniLM-L6-v2
Retrieval: top_k=5
Retriever loaded successfully.

Testing 3 questions with top_k=5...


=== Question 1 ===
Q: Who directed Titanic and what is the movie about?


Retrieved Documents:

Doc 1
  • Source ID: 13153
  • Title: Titanic
  • Release Year: 1997
  • Wiki Page: https://en.wikipedia.org/wiki/Titanic_(1997_film)
  • Origin/Ethnicity: American
  • Director: James Cameron
  • Cast: Leonardo DiCaprio, Kate Winslet, Billy Zane, Frances Fisher, Victor Garber, Kathy Bates, Bill Paxton, Gloria Stuart, David Warner, Suzy Amis
  • Genre: Historical Epic, Disaster
  • Content Preview: Title: Titanic
Director: James Cameron
Cast: Leonardo DiCaprio, Kate Winslet, Billy Zane, Frances Fisher, Victor Garber, Kathy Bates, Bill Paxton, Gloria Stuart, David Warner, Suzy Amis
Genre: Historical Epic, Disaster
Release Year: 1997...


Doc 

This cell performs an end-to-end test of the Retrieval-Augmented Generation (RAG) pipeline using the ChatRAG class.

It compares two responses:
1. **RAG Mode** – combines retrieved movie documents with the LLM for grounded, context-aware answers.
2. **LLM-Only Mode** – queries the language model directly, without using any retrieval context.

**Purpose**:
* Validate that the retriever and the LLM interact correctly.
* Inspect how the RAG answer differs from a pure-LLM answer.
* Ensure that document metadata and context printing work as expected.
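
The difference between the two modes comes down to what the model is shown. The sketch below illustrates this with two hypothetical prompt builders. The internals of `ChatRAG` are not shown in this notebook, so the function names, the prompt wording, and the `content` key are assumptions for illustration only.

```python
from typing import Dict, List


def build_rag_prompt(question: str, docs: List[Dict[str, str]]) -> str:
    """Build a prompt that includes retrieved context (hypothetical template)."""
    context_blocks = []
    for i, doc in enumerate(docs, start=1):
        title = doc.get("Title", "N/A")
        # Number each document so an answer can be traced back to its source.
        context_blocks.append(f"[{i}] {title}\n{doc.get('content', '')}")
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def build_llm_only_prompt(question: str) -> str:
    """Build a prompt with no retrieved context (hypothetical template)."""
    return f"Question: {question}"


# Stand-in for a retrieved document; real documents come from the retriever.
sample_docs = [
    {"Title": "Titanic", "content": "Director: James Cameron\nRelease Year: 1997"}
]
sample_question = "Who directed Titanic and what is the movie about?"

rag_prompt = build_rag_prompt(sample_question, sample_docs)
llm_only_prompt = build_llm_only_prompt(sample_question)
print(rag_prompt)
```

With the RAG prompt, the model can ground its answer in the retrieved text. With the LLM-only prompt, it must rely on what it learned during training.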

In [8]:
from backend.chat.chat_rag import ChatRAG

# Initialize the ChatRAG pipeline (verbose=True prints retrieved docs)
chat = ChatRAG(verbose=True)

# Example question to test both modes
question = "Who directed Titanic and what is the movie about?"

# --- Run LLM-only chat (no retrieval) ---
print("\n\n============ WITHOUT RAG ============\n")
answer_llm = chat.ask_llm_only(question)
print("\n**Answer**:\n")
print(answer_llm)

# --- Run RAG-based chat ---
print("\n\n============ WITH RAG ============\n")
answer_rag = chat.ask(question)
print("\n**Answer**:\n")
print(answer_rag)

Using model: gpt-4o-mini
Loading vector store from: /home/ilfn/datascience/workspace/rag-movie-plots/db/chroma
Using embedding model: sentence-transformers/all-MiniLM-L6-v2
Retrieval: top_k=5
Retriever loaded successfully.




Running LLM only (no RAG) with model: gpt-4o-mini
Question: Who directed Titanic and what is the movie about?
LLM-only answer generated.


**Answer**:

"Titanic" was directed by James Cameron and was released in 1997. The film is a romantic drama that tells the story of a fictional love affair between two characters, Jack Dawson (played by Leonardo DiCaprio) and Rose DeWitt Bukater (played by Kate Winslet), set against the backdrop of the real-life sinking of the RMS Titanic in 1912.

The narrative unfolds through a combination of a present-day exploration of the wreck of the Titanic and flashbacks to the ship's ill-fated maiden voyage. Jack, a poor artist, and Rose, a young woman from a wealthy family, meet aboard the ship and fall in love despite the constraint