## Self-RAG (RAG with self-critique) and RAGAS

In [1]:
# Setup & Imports
import time

from src.langchain.chains.movie_rag import MovieRAGChain
from src.langchain.prompts import ZERO_SHOT_QA_PROMPT

print("✓ Imports successful")

✓ Imports successful


In [2]:
# Configuration
PLOTS_PATH = "/Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_plots.csv"
REVIEWS_PATH = "/Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/reviews_w_movies_full.csv"
MAX_MOVIES = 500  # Limit for faster demos

In [3]:
def run(chain, query, strategy_name):
    """Run a query through a chain and print the answer, timing, and top sources."""
    start = time.time()
    result = chain.query(query)
    elapsed = time.time() - start

    print(f"\nStrategy: {strategy_name}")
    print(f"\nQuery: {query}")
    print(f"\nTime: {elapsed:.2f}s")
    print(f"\nAnswer: {result['answer']}")
    print("\nTop 5 Retrieved:")
    sources = result.get('sources', [])
    # Guard against an empty source list before peeking at the first entry
    if sources and 'hyde_query' in sources[0]['metadata']:
        print(f"\n  hyde query: {sources[0]['metadata']['hyde_query']}")
    for src in sources[:5]:
        score = src['metadata'].get('score', 'N/A')
        print(f"\n  {src['metadata']['movie_title']} - {src['metadata']['release_year']} (initial score={score})")
        print(f"\n content: {src['content']}...")
        if 'rerank_score' in src['metadata']:
            print(f"\n  rerank score: {src['metadata']['rerank_score']:.3f}")
    if 'citations' in result:
        print(f"\n  Citations: {', '.join(result['citations'][:3])}")

#### RAG + HyDE + Reranking

In [4]:
print("="*60)
print("RAG (HyDE + Reranking)")
print("="*60)

# Build chain with all features
full_chain = MovieRAGChain(
    plots_path=PLOTS_PATH,
    reviews_path=REVIEWS_PATH,
    max_movies=MAX_MOVIES,
    use_custom_retriever=True,
    use_custom_chunk=True,
    custom_prompt=ZERO_SHOT_QA_PROMPT,
    k=5,
    use_hyde=True,
    hyde_model="gpt-4o-mini",
    use_reranking=True,
    reranker_cfg={'type':'llm'},
    initial_k=20, 
)

full_chain.build()

run(full_chain, "recommend mind-bending sci-fi with great visuals", "RAG + HyDE + Reranking")

RAG (HyDE + Reranking)
✓ MovieRAGChain initialized
  Retriever type: custom + reranking + HyDE
  LLM: gpt-4o-mini

Building RAG Pipeline

1. Loading documents...
Limiting to 500 movies
Loading plots from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_plots.csv...
Created 383 plot docs.
  ✓ 383 plot documents
Loading reviews from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/reviews_w_movies_full.csv...
Created 500 review docs.
  ✓ 500 review documents
✓ Total: 883 reviews and plots documents

2. Chunking with custom func...

Chunking documents...
Chunked 883 docs → 10114 chunks using 'sentence' strategy.

Building base retriever...

3. Building custom retriever...
Loading embedding model: text-embedding-3-small (provider: openai)
✓ Model loaded (dimension: 1536)
✓ FaissDenseRetriever initialized (index_type=flat)
Generating embeddings for 10114 documents...
Embeddings generated
Saving index...
✓ Added 10114 documents to FAISS index
 




Strategy: RAG + HyDE + Reranking

Query: recommend mind-bending sci-fi with great visuals

Time: 17.16s

Answer: For mind-bending sci-fi with great visuals, I recommend **Limitless**. It features stylized camerawork, vibrant colors, and a captivating soundtrack, making it visually appealing. While it has some imperfections, it offers an engaging performance by Bradley Cooper and a unique premise. 

Another option is **Transcendence**, which, despite its mixed reviews, is noted for its eye-catching imagery and imaginative style. However, be aware that it may not fully engage you throughout. 

If you're looking for a classic, **Alien** is also a great choice, known for its thrilling atmosphere and imaginative storytelling, though it leans more towards horror.

Top 5 Retrieved:

  hyde query: If you're seeking mind-bending sci-fi films that deliver not only intricate narratives but also stunning visuals, I highly recommend "Inception" (2010) directed by Christopher Nolan. This film maste
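
The run above follows the usual HyDE pattern: the LLM first writes a hypothetical answer (the `hyde query` in the output), that text is used to retrieve `initial_k` candidates, and a reranker narrows them down to `k`. A minimal, self-contained sketch of that flow, assuming a toy bag-of-words similarity in place of the real embeddings and stub callables (`fake_llm`, `overlap_reranker`) in place of the LLM and reranker. None of these names come from `MovieRAGChain`; they are purely illustrative.

```python
from collections import Counter
from math import sqrt


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hyde_retrieve(query, docs, llm, reranker, initial_k=20, k=5):
    """HyDE: search with a hypothetical answer instead of the raw query, then rerank."""
    hypothetical_doc = llm(query)                    # step 1: imagined answer
    q_vec = embed(hypothetical_doc)
    ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    candidates = ranked[:initial_k]                  # step 2: wide first pass
    reranked = sorted(candidates, key=lambda d: reranker(query, d), reverse=True)
    return hypothetical_doc, reranked[:k]            # step 3: keep the best k


# Hypothetical stand-ins for the LLM and the reranker (assumptions, not real APIs).
def fake_llm(query):
    return "a mind-bending sci-fi film with stunning visuals and dreams"


def overlap_reranker(query, doc):
    """Score a document by the number of words it shares with the original query."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


docs = [
    "a romantic comedy set in paris",
    "stunning visuals in a sci-fi film about dreams",
    "a gritty crime drama with strong acting",
]
hyde_query, top = hyde_retrieve(
    "recommend sci-fi with great visuals", docs, fake_llm, overlap_reranker,
    initial_k=2, k=1,
)
print(top)
```

Note that the first pass is driven by the hypothetical answer while the reranker scores against the original query, so a misleading hypothetical document can still be corrected in the second stage.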

#### Wrapping with Self-RAG

In [5]:
from src.langchain.chains.self_rag import SelfRAGWrapper

In [6]:
self_rag_chain = SelfRAGWrapper(full_chain)

In [8]:
run(self_rag_chain, "recommend mind-bending sci-fi with great visuals", "self rag -> RAG + HyDE + Reranking")


Question: recommend mind-bending sci-fi with great visuals

Initial answer:
For mind-bending sci-fi with great visuals, I recommend **Akira**. It's noted for being "easily the most breathtaking and kinetic anime ever made," offering a surreal experience that departs from the typical good vs. evil narrative. 

Another option is **Transcendence**, which, despite its mixed reviews, is described as "well-made" with "eye-catching images." It explores themes of artificial intelligence and consciousness, making it a visually interesting choice, though it may not fully engage viewers. 

Both films provide unique visual experiences and thought-provoking concepts in the sci-fi genre.

Critique: GOOD

Answer is good, no refinement needed

Strategy: self rag -> RAG + HyDE + Reranking

Query: recommend mind-bending sci-fi with great visuals

Time: 19.12s

Answer: For mind-bending sci-fi with great visuals, I recommend **Akira**. It's noted for being "easily the most breathtaking and kinetic anime 

In [13]:
_ = self_rag_chain.query("a good movie that is kind of crazy but isn't scary and can make me cry with that Spanish man acting in it")


Question: a good movie that is kind of crazy but isn't scary and can make me cry with that Spanish man acting in it

Initial answer:
Based on your description, "Let the Sunshine In" (Un beau soleil intérieur) might be a good fit. It features a complex emotional journey and explores themes of love and self-realization, which can evoke strong feelings. While it doesn't specifically feature a Spanish actor, it stars Juliette Binoche, who delivers a powerful performance that could resonate with you. The film is known for its emotional depth and sophisticated storytelling, making it a potential choice for a moving experience.

Critique: BAD

Refining using existing sources...

Refined answer:
Based on your description, I recommend "Certified Copy" (Copie Conforme) as a great choice. This film features the talented French actress Juliette Binoche, who delivers a nuanced performance that captures a range of emotions. "Certified Copy" explores complex themes of love and identity through the i
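
The logs above show the wrapper's control flow: produce an initial answer, ask for a GOOD/BAD critique, and refine against the existing sources only when the critique is BAD. A minimal sketch of that loop, assuming hypothetical stub callables (`stub_critic`, `stub_refiner`) and a wrapped chain whose `query()` returns a dict with `answer` and `sources`. This is an illustration of the pattern, not the actual `SelfRAGWrapper` implementation.

```python
class SelfCritiqueWrapper:
    """Hypothetical wrapper: generate, critique, and refine at most once."""

    def __init__(self, chain, critic, refiner):
        self.chain = chain      # any object with .query(question) -> dict
        self.critic = critic    # (question, answer, sources) -> "GOOD" or "BAD"
        self.refiner = refiner  # (question, answer, sources) -> improved answer

    def query(self, question):
        result = dict(self.chain.query(question))  # shallow copy of the chain output
        verdict = self.critic(question, result["answer"], result["sources"])
        result["critique"] = verdict
        if verdict == "BAD":
            # Reuse the already-retrieved sources instead of retrieving again.
            result["answer"] = self.refiner(question, result["answer"], result["sources"])
        return result


class StubChain:
    """Stand-in for a real RAG chain; always returns the same refusal."""

    def query(self, question):
        return {
            "answer": "I cannot answer from the provided information.",
            "sources": [{"content": "Review: eye-catching images."}],
        }


def stub_critic(question, answer, sources):
    # Toy rule (assumption): treat refusals as BAD, everything else as GOOD.
    return "BAD" if "cannot answer" in answer else "GOOD"


def stub_refiner(question, answer, sources):
    # Toy refinement (assumption): restate the first source.
    return "Based on the sources: " + sources[0]["content"]


wrapped = SelfCritiqueWrapper(StubChain(), stub_critic, stub_refiner)
out = wrapped.query("recommend sci-fi with great visuals")
print(out["critique"], "->", out["answer"])
```

Because refinement reuses the same sources, it cannot add evidence that retrieval missed; whatever new material appears in a refined answer has to come from the model itself.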

My conclusion: the task is fairly easy, so HyDE and reranking seem unnecessary here (being fast matters more). Self-critique, though, may still be worth keeping.

### RAGAS

In [None]:
from src.langchain.ragas import create_movie_test_set, evaluate_chain, print_results

Note: movie-specific questions only work if the movie happens to be among the 500 movies we loaded, so I'm removing them. The `create_movie_test_set` defined below replaces the imported one.

In [43]:
def create_movie_test_set():
    """Movie-specific test questions with ground truth."""
    return [
        {
            "question": "What makes Christopher Nolan's directing style unique?",
            "ground_truth": "Nolan is known for non-linear storytelling, complex narratives, practical effects over CGI, philosophical themes about time and memory, and intricate puzzle-like plots that reward multiple viewings.",
            "query_type": "analytical"
        },
        {
            "question": "Recommend sci-fi movies with time travel themes",
            "ground_truth": "Sci-fi movies with time travel include Primer (complex low-budget), 12 Monkeys (dystopian), Looper (action-focused), Interstellar (space-time), Back to the Future (classic), and The Terminator (action).",
            "query_type": "recommendation"
        },
        {
            "question": "Movies with twist endings?",
            "ground_truth": "Movies with twist endings include The Sixth Sense, Fight Club, and Shutter Island.",
            "query_type": "recommendation"
        },
        {
            "question": "Compare Tarantino and Scorsese's directing styles",
            "ground_truth": "Tarantino is known for non-linear narratives, stylized violence, pop culture references, and sharp dialogue. Scorsese focuses on character studies, moral complexity, crime dramas, and masterful use of music and editing.",
            "query_type": "comparison"
        },
        {
            "question": "Recommend coming of age movies about love, poverty, and friendships growing apart",
            "ground_truth": "Coming of age movies with these themes include Moonlight (explores identity, love, and poverty in three chapters), City of God (Brazilian favela, friendship torn by crime and poverty), Stand By Me (childhood friendships that drift apart over time), The Florida Project (childhood innocence against backdrop of poverty), The Outsiders (class divisions and friendship bonds), and Lady Bird (navigating relationships while dealing with family financial struggles). These films authentically portray how economic hardship and life circumstances can strain friendships while young people navigate first love.",
            "query_type": "recommendation"
        }
    ]
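
For evaluation, each test item is typically combined with the chain's answer and the retrieved context strings into one record per question. A minimal sketch of that collection step, assuming a stub chain and the `question` / `answer` / `contexts` / `ground_truth` column names used by earlier RAGAS releases (newer releases rename these, so check the installed version). `collect_eval_rows` is a hypothetical helper, not the project's `evaluate_chain`.

```python
def collect_eval_rows(chain, test_set):
    """Run the chain on each test item and gather the fields an evaluator needs."""
    rows = []
    for item in test_set:
        result = chain.query(item["question"])
        rows.append({
            "question": item["question"],
            "answer": result["answer"],
            # Contexts should be a list of plain strings, one per retrieved chunk.
            "contexts": [src["content"] for src in result["sources"]],
            "ground_truth": item["ground_truth"],
        })
    return rows


class StubChain:
    """Stand-in for a real chain so the sketch runs without an index or API key."""

    def query(self, question):
        return {
            "answer": "Stub answer to: " + question,
            "sources": [{"content": "Movie title: Heat\n\nReview: Ominous, operatic."}],
        }


tiny_test_set = [{
    "question": "Movies with twist endings?",
    "ground_truth": "Movies with twist endings include The Sixth Sense, Fight Club, and Shutter Island.",
    "query_type": "recommendation",
}]

rows = collect_eval_rows(StubChain(), tiny_test_set)
print(rows[0]["contexts"])
```

Reference-based metrics such as context recall need the ground truth on every row, so printing one collected row like this is a cheap way to check that nothing is missing or misnamed before running the evaluator.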

Let's evaluate a basic chain first (no HyDE, no reranking).

In [60]:
# Build basic chain
basic_chain = MovieRAGChain(
    plots_path=PLOTS_PATH,
    reviews_path=REVIEWS_PATH,
    max_movies=MAX_MOVIES,
    use_custom_retriever=True,
    use_custom_chunk=True,
    custom_prompt=ZERO_SHOT_QA_PROMPT,
    k=5,
    use_hyde=False,
    use_reranking=False,
)

basic_chain.build()

✓ MovieRAGChain initialized
  Retriever type: custom
  LLM: gpt-4o-mini

Building RAG Pipeline

1. Loading documents...
Limiting to 500 movies
Loading plots from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_plots.csv...
Created 383 plot docs.
  ✓ 383 plot documents
Loading reviews from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/reviews_w_movies_full.csv...
Created 500 review docs.
  ✓ 500 review documents
✓ Total: 883 reviews and plots documents

2. Chunking with custom func...

Chunking documents...
Chunked 883 docs → 10114 chunks using 'sentence' strategy.

Building base retriever...

3. Building custom retriever...
Loading embedding model: text-embedding-3-small (provider: openai)
✓ Model loaded (dimension: 1536)
✓ FaissDenseRetriever initialized (index_type=flat)
Generating embeddings for 10114 documents...
Embeddings generated
Saving index...
✓ Added 10114 documents to FAISS index
  Index size: 10114

5. Creating QA chain..

<src.langchain.chains.movie_rag.MovieRAGChain at 0x3002beab0>

In [65]:
print("="*60)
print("Evaluating basic chain")
print("="*60)
test_set = create_movie_test_set()
scores = evaluate_chain(basic_chain, test_set)
print_results(scores)

Evaluating basic chain
Running chain on test set...
  [1/5] What makes Christopher Nolan's directing style unique?
answer:The provided information does not include any details about Christopher Nolan's directing style. Therefore, I cannot answer your question based on the available information....
contexts:["Movie title: Mission to Mars\n\nReview: It's hardly an original movie ... But its cinematography is li", "Movie title: Control\n\nReview: On a certain level, Corbijn's approach to filmmaking reminds me of Gus", 'Movie title: Heat\n\nReview: Ominous, operatic, often emulated but never equaled. This is go-for-broke', 'Movie title: Transcendence\n\nReview: In his first film as director, acclaimed cinematographer Wally P', "Movie title: Collateral\n\nReview: ... first and foremost a director's movie ... [one] that shows off "]...
  [2/5] Recommend sci-fi movies with time travel themes
answer:I recommend the following sci-fi movies with time travel themes:

1. **Time After Time** (1979)



RAGAS EVALUATION RESULTS

Scores:
  Faithfulness:      [1.0, 1.0, 0.8888888888888888, 0.75, 0.7777777777777778]
  Answer Relevancy:  [np.float64(0.0), np.float64(0.9749786110846346), np.float64(0.9698494710799194), np.float64(0.0), np.float64(0.9555845319613662)]


In [66]:
print("="*60)
print("Evaluating full chain (hyde + reranking)")
print("="*60)
scores = evaluate_chain(full_chain, test_set)
print_results(scores)

Evaluating full chain (hyde + reranking)
Running chain on test set...
  [1/5] What makes Christopher Nolan's directing style unique?
answer:The provided information does not include specific details about Christopher Nolan's directing style. It primarily discusses the film "Transcendence," directed by Wally Pfister, and does not elaborate on Nolan's unique qualities as a director. Therefore, I cannot answer your question based on the av...
contexts:['Movie title: Transcendence\n\nReview: Depending on how the next few decades unfold, people will look b', 'Movie title: Transcendence\n\nReview: In his first film as director, acclaimed cinematographer Wally P', 'Movie title: Limitless\n\nReview: The cinematic equivalent of a meal high in calories and low in nutri', "Movie title: Collateral\n\nReview: Mann is not a playwright who films his work. He's an instinctive vi", "Movie title: Transcendence\n\nReview: I've always maintained that even the best actors cannot bring li"]...
  [2/5] Recom



RAGAS EVALUATION RESULTS

Scores:
  Faithfulness:      [1.0, 0.8461538461538461, 1.0, 0.8823529411764706, 0.7058823529411765]
  Answer Relevancy:  [np.float64(0.0), np.float64(0.9749805168489232), np.float64(0.9783818627841634), np.float64(0.9464787211704856), np.float64(0.9140418491582442)]


In [67]:
print("="*60)
print("Evaluating self-rag, full chain (hyde + reranking)")
print("="*60)
scores = evaluate_chain(self_rag_chain, test_set)
print_results(scores)

Evaluating self-rag, full chain (hyde + reranking)
Running chain on test set...
  [1/5] What makes Christopher Nolan's directing style unique?

Question: What makes Christopher Nolan's directing style unique?

Initial answer:
The provided information does not specifically address Christopher Nolan's directing style. However, it does mention that Wally Pfister, who directed "Transcendence," worked with Nolan on films like "The Dark Knight Rises" and "Inception," suggesting a connection in visual style. For a detailed understanding of Nolan's unique directing style, additional information would be needed.

Critique: BAD

Refining using existing sources...

Refined answer:
Christopher Nolan's directing style is characterized by several unique elements that set him apart from other filmmakers. While the provided sources do not directly analyze Nolan's work, they do highlight aspects of his collaborations with cinematographer Wally Pfister, who worked on films like "The Dark Knight Rises" a



RAGAS EVALUATION RESULTS

Scores:
  Faithfulness:      [0.16666666666666666, 0.9230769230769231, 1.0, 1.0, 0.75]
  Answer Relevancy:  [np.float64(0.9704492160679127), np.float64(0.9749885841397106), np.float64(0.9695695275407932), np.float64(0.9464787211704856), np.float64(0.9601143922522034)]


#### Conclusions:

1- I could not get valid context precision and context recall scores from RAGAS. I'm not sure why; perhaps my contexts are too hard to parse.

2- As I guessed, the basic chain is enough for good performance. HyDE and reranking improved only one answer, turning an "unknown" into a real answer (question 4, answer relevancy 0.0 to 0.95).

3- Self-RAG turned one more "unknown" into a real answer (question 1, answer relevancy 0.0 to 0.97). However, the needed context does not exist in the index, so the refined answer drew on the model's own generative abilities rather than the sources, and its faithfulness dropped to 0.17.
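
To compare the three chains at a glance, the per-question scores printed in the evaluation runs above can be averaged per metric. A small sketch, with the values copied from those outputs and rounded to four decimal places; the dictionary layout and the chain labels are my own choice for illustration.

```python
from statistics import mean

# Per-question scores copied from the three RAGAS runs above (rounded to 4 places).
scores = {
    "basic": {
        "faithfulness": [1.0, 1.0, 0.8889, 0.75, 0.7778],
        "answer_relevancy": [0.0, 0.9750, 0.9698, 0.0, 0.9556],
    },
    "hyde + reranking": {
        "faithfulness": [1.0, 0.8462, 1.0, 0.8824, 0.7059],
        "answer_relevancy": [0.0, 0.9750, 0.9784, 0.9465, 0.9140],
    },
    "self-rag": {
        "faithfulness": [0.1667, 0.9231, 1.0, 1.0, 0.75],
        "answer_relevancy": [0.9704, 0.9750, 0.9696, 0.9465, 0.9601],
    },
}

# Mean of each metric for each chain.
summary = {
    chain: {metric: mean(values) for metric, values in metrics.items()}
    for chain, metrics in scores.items()
}

for chain, metrics in summary.items():
    print(f"{chain:18s} "
          f"faithfulness={metrics['faithfulness']:.3f}  "
          f"answer_relevancy={metrics['answer_relevancy']:.3f}")
```

With only five questions these means are indicative at best, and a single refusal (answer relevancy 0.0) moves the average a lot.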