## Basic agent with two tools: text retrieval and visual retrieval

In [1]:
from src.langchain.chains.movie_rag import MovieRAGChain
from src.langchain.loaders import MoviePosterDocumentLoader
from src.retrievers.visual_retriever import VisualRetriever
from src.langchain.prompts import ZERO_SHOT_QA_PROMPT

In [2]:
poster_loader = MoviePosterDocumentLoader('/Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_posters.csv', max_movies=1000)   # small number since model is local
visual_docs = poster_loader.load()
visual_retriever = VisualRetriever(model_name="ViT-B/32", use_text_fusion=True, alpha=0.8)
visual_retriever.add_documents(visual_docs)

PLOTS_PATH = "/Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_plots.csv"
REVIEWS_PATH = "/Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/reviews_w_movies_full.csv"
MAX_MOVIES = 500  # Limit for faster demos

full_chain = MovieRAGChain(
    plots_path=PLOTS_PATH,
    reviews_path=REVIEWS_PATH,
    max_movies=MAX_MOVIES,
    use_custom_retriever=True,
    use_custom_chunk=True,
    custom_prompt=ZERO_SHOT_QA_PROMPT,
    k=5,
    use_hyde=True,
    hyde_model="gpt-4o-mini",
    use_reranking=True,
    reranker_cfg={'type':'llm'},
    initial_k=20, 
)
full_chain.build()

Loading posters from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_posters.csv...
Created 947 poster docs.
  ✓ 947 poster documents
Loading CLIP model: ViT-B/32
✓ CLIP loaded on cpu
✓ VisualRetriever (text_fusion=True, method=weight_average)
  Alpha: 0.8 (image=0.8, text=0.2)
Encoding 947 posters with CLIP...
Encoding 947 text descriptions with CLIP...
✓ Fused embeddings using weight_average method (dim=512)
✓ Added 947 posters to index
✓ MovieRAGChain initialized
  Retriever type: custom + reranking + HyDE
  LLM: gpt-4o-mini

Building RAG Pipeline

1. Loading documents...
Limiting to 500 movies
Loading plots from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/movie_plots.csv...
Created 383 plot docs.
  ✓ 383 plot documents
Loading reviews from /Users/saghar/Desktop/movie-rag/datasets/rotten-tomatoes-reviews/prep/reviews_w_movies_full.csv...
Created 500 review docs.
  ✓ 500 review documents
✓ Total: 883 reviews and plots documents

2

<src.langchain.chains.movie_rag.MovieRAGChain at 0x16797aea0>
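
The log above reports that poster and text embeddings are fused with the `weight_average` method at `alpha=0.8` (image weight 0.8, text weight 0.2). The internals of `VisualRetriever` are not shown in this notebook, so the following is only a minimal sketch of what such a fusion might look like, assuming both embeddings are L2-normalized before and after mixing. `l2_normalize` and `fuse_embeddings` are hypothetical helper names, and the toy 3-dimensional vectors stand in for the 512-dimensional CLIP embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumption: cosine-style similarity search)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_embeddings(image_emb, text_emb, alpha=0.8):
    """Hypothetical weighted-average fusion: alpha * image + (1 - alpha) * text."""
    if len(image_emb) != len(text_emb):
        raise ValueError("image and text embeddings must have the same dimension")
    image_emb = l2_normalize(image_emb)
    text_emb = l2_normalize(text_emb)
    mixed = [alpha * i + (1.0 - alpha) * t for i, t in zip(image_emb, text_emb)]
    # Re-normalize so the fused vector is comparable to a normalized query vector
    return l2_normalize(mixed)


# Toy vectors: the image points along axis 0, the text along axis 1
image_emb = [1.0, 0.0, 0.0]
text_emb = [0.0, 1.0, 0.0]
fused = fuse_embeddings(image_emb, text_emb, alpha=0.8)
print(fused)
```

With `alpha=0.8` the fused vector stays four times closer to the image direction than to the text direction, which matches the "image=0.8, text=0.2" split in the log.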

#### Creating and testing tools

In [3]:
from src.agents.tools import TextMovieTool, VisualMovieTool, CombinedMovieTool, SQLMovieTool

In [6]:
# Create tools
text_tool = TextMovieTool(full_chain).get_tool()
visual_tool = VisualMovieTool(visual_retriever).get_tool()
combined_tool = CombinedMovieTool(full_chain, visual_retriever).get_tool()
sql_tool = SQLMovieTool().get_tool()

print("=" * 60)
print("Testing TEXT tool:")
print("=" * 60)
result = text_tool.invoke("What is a funny romantic movie set in Europe?")
print(result)

print("\n" + "=" * 60)
print("Testing VISUAL tool:")
print("=" * 60)
result = visual_tool.invoke("dark moody sci-fi films")
print(result)

print("\n" + "=" * 60)
print("Testing COMBINED tool:")
print("=" * 60)
result = combined_tool.invoke("what cartoon is about family and has bright visuals?")
print(result)

print("\n" + "=" * 60)
print("Testing SQL tool:")
print("=" * 60)
result = sql_tool.invoke("what are the top rated romantic movies made after 2010?")
print(result)


Testing TEXT tool:


  documents = self.base_retriever.get_relevant_documents(hypothetical_doc)


A funny romantic movie set in Europe is "A Royal Night Out." It features a fictional depiction of British princesses partying with commoners at the end of World War II, blending romance and comedy in a light-hearted manner. Another option is "Chalet Girl," a feel-good British romcom that offers a witty script and an infectious sense of fun.

Testing VISUAL tool:
The posters listed in the search results primarily represent films that fall into the drama, action, and romance genres, with a notable emphasis on darker themes and moody atmospheres. However, only one of the films, "Virus" (1999), explicitly fits the sci-fi category, as it combines elements of action, horror, and science fiction. 

The commonality among these films is their exploration of intense emotional experiences, whether through drama, romance, or the high-stakes scenarios presented in action and horror. The presence of notable actors and directors also suggests a focus on character-driven narratives and complex themes.

In [14]:
print("\n" + "=" * 60)
print("Testing COMBINED tool:")
print("=" * 60)
result = combined_tool.invoke("dark moody sci-fi films")
print(result)


Testing COMBINED tool:
For the category of "dark moody sci-fi films," the most relevant movie from the provided list is **"9" (2009)**. 

### Textual Alignment:
"9" is an animated film that fits the dark and moody sci-fi genre. It presents a post-apocalyptic world where sentient dolls, created from the remnants of humanity, struggle for survival against machines that have wiped out civilization. The film's atmosphere is filled with tension, isolation, and existential themes, similar to "Alien." Both films explore the fragility of life and the consequences of humanity's actions, making "9" a compelling choice for fans of dark sci-fi.

### Visual Alignment:
While the other films listed do not primarily fall into the sci-fi genre, "9" features a distinct visual style that complements its dark themes. The animation is gritty and atmospheric, enhancing the sense of dread and despair that permeates the story. This aligns with the haunting visuals mentioned in the description of "Alien."


#### Testing agent

In [4]:
from src.agents.movie_agent import MovieAgent

In [5]:
agent = MovieAgent(full_chain, visual_retriever)

queries = [
    "What is the Matrix about?",
    "Dark moody sci-fi genre films",
    "what themes does titanic touch?",
    "Movies with vibrant colors",
    "drama about love and lust with engaging red poster",
    "top rated romantic movies made after 2012",
    "comedy movies featuring tom hanks with good critic reviews"
]

for query in queries:
    agent.query(query)
    print("\n")

Creating tools...
Building agent graph...
✓ Agent ready!

What is the Matrix about?
Tool Calls:
  search_movies_by_content (call_UGp1CRuZKin1LNw9Ca7tbuIB)
 Call ID: call_UGp1CRuZKin1LNw9Ca7tbuIB
  Args:
    query: What is The Matrix about?


  documents = self.base_retriever.get_relevant_documents(hypothetical_doc)


Name: search_movies_by_content

The provided information does not include a description or details about "The Matrix." Therefore, I cannot answer your question about what "The Matrix" is about.

"The Matrix" is a science fiction film that explores themes of reality, perception, and control. The story follows a computer hacker named Neo, who discovers that the world he lives in is a simulated reality created by sentient machines to subdue humanity. As he learns the truth about the Matrix, he joins a group of rebels led by Morpheus and Trinity, who fight against the machines to free humanity from this illusion. The film delves into philosophical questions about existence, free will, and the nature of reality, all while featuring groundbreaking visual effects and action sequences.

Done!



Dark moody sci-fi genre films
Tool Calls:
  search_movies_by_visual (call_fvGMH3AaVlBVIQeVmXHkH0sY)
 Call ID: call_fvGMH3AaVlBVIQeVmXHkH0sY
  Args:
    query: dark moody sci-fi films
Name: search_movie

In [11]:
_ = agent.query("recommend some scifi movie with yellow posters from the nineties")


recommend some scifi movie with yellow posters from the nineties
Tool Calls:
  search_movies_by_visual (call_QXKmGeqt0zmTXatOq2WG2qfc)
 Call ID: call_QXKmGeqt0zmTXatOq2WG2qfc
  Args:
    query: sci-fi movies with yellow posters from the nineties
Name: search_movies_by_visual

The posters listed in the search results share a common theme of being visually striking, with a focus on the science fiction genre, particularly in the context of the nineties. The films include a mix of action, adventure, horror, and drama, indicating a diverse range of storytelling within the sci-fi realm. 

1. **Virus (1999)** - This film combines elements of action, adventure, and horror, suggesting a thrilling narrative involving technology or extraterrestrial threats.
2. **Election (1999)** - While primarily a comedy, it may incorporate satirical elements that touch on societal issues, possibly reflecting a dystopian or speculative future.
3. **Jesus' Son (1999)** - This drama, while not strictly sci-fi, m

In [12]:
_ = agent.query("what's that movie where the kid is left alone")


what's that movie where the kid is left alone
Tool Calls:
  search_movies_by_content (call_gA5sadQAStQU6pRnR2RjmKNX)
 Call ID: call_gA5sadQAStQU6pRnR2RjmKNX
  Args:
    query: kid left alone
Name: search_movies_by_content

The movie "The Ant Bully" features a young boy named Lucas who is navigating his independence in a big world, which may resonate with the theme of a kid left alone. Additionally, "Walkabout" involves children who are left to survive in the wilderness, highlighting their struggle and need to return to civilization. Both films explore themes related to children facing challenges on their own.

The movie you're thinking of might be "Home Alone," where a young boy named Kevin is accidentally left behind when his family goes on vacation, leading to a series of comedic adventures as he defends his home from burglars. 

If you're looking for other films with similar themes, "The Ant Bully" features a boy navigating independence, and "Walkabout" involves children surviving 

#### Inspecting the SQL query that the SQL tool generates

In [7]:
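# Imports the class needs when this cell is run on its own
from langchain.tools import tool
from langchain_openai import ChatOpenAI

from src.data.sqlite_database import MovieDatabase
from src.langchain.prompts import SQL_TOOL_PROMPT


class SQLMovieTool: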

    def __init__(self, llm_model: str = "gpt-4o-mini", llm_temperature: float = 0.0) -> None:
        """
        SQL tool for structured queries about movies.
        Args:
            llm_model: language model name
            llm_temperature: llm temperature
        """
        self.db = MovieDatabase()
        self.db.connect()
        self.llm = ChatOpenAI(model=llm_model, temperature=llm_temperature)

    def get_tool(self):
        db = self.db
        llm = self.llm

        @tool
        def search_movies_by_metadata(question: str) -> str:
            """
            Search movies using structured metadata (ratings, years, counts).

            Use for:
            - Counts and statistics ("How many movies...", "What percentage...")
            - Ratings and scores ("movies rated above 8")
            - Years and dates ("movies from the 1990s")
            - Sorting, filtering, and ranking ("Top 10 highest rated")
            - Aggregations ("Average rating of sci-fi movies")
            - Comparisons ("Which has more reviews, X or Y?")

            DO NOT use for:
            - Plot summaries, story content, review sentiments and review contents (use text tool)
            - Visual style queries (use visual tool)

            Args:
                question: Question about structured movie data

            Returns:
                Structured query result
            """
            # Generate SQL from natural language
            sql_prompt = SQL_TOOL_PROMPT.format(question=question)

            sql_query = llm.invoke(sql_prompt).content.strip()

            # Clean up SQL (remove markdown, etc.)
            sql_query = sql_query.replace("```sql", "").replace("```", "").strip()
            print(sql_query)  # show the generated SQL for inspection
            # Execute the query; errors are reported to the caller rather than retried
            try:
                result = db.query(sql_query)

                if result.empty:
                    return f"No movies found matching: {question}. Tried SQL: {sql_query}"

                if len(result) > 20:
                    # Summarize large results
                    summary = f"Found {len(result)} results. Top 10:\n\n"
                    summary += result.head(10).to_string(index=False)
                    return summary
                else:
                    return result.to_string(index=False)

            except Exception as e:
                # Report common SQL errors in a form the agent can act on
                if "no such column" in str(e):
                    return f"Database doesn't have that information. Error executing query: {e}\nSQL: {sql_query}"
                return f"Error executing query: {e}\nSQL: {sql_query}"

        return search_movies_by_metadata
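
The cleanup step in `search_movies_by_metadata` removes Markdown fences with two `str.replace` calls. That works for a lowercase `sql` tag, but it leaves the language tag behind when the model writes it in a different case (for example `SQL`), and it keeps any prose the model adds around the fence. One possible alternative is sketched below, assuming the model returns at most one fenced block. `strip_sql_fences` is a hypothetical helper, not part of the project code.

```python
import re

FENCE = "`" * 3  # built this way to avoid writing a literal fence inside this example

# An optional "sql..." language tag is consumed only when it is followed by a newline,
# so a query that starts on the same line as the fence is left intact.
_FENCE_RE = re.compile(
    rf"{FENCE}(?:sql\w*[ \t]*\n)?(.*?){FENCE}",
    re.DOTALL | re.IGNORECASE,
)


def strip_sql_fences(text):
    """Return the SQL inside the first Markdown fence, or the stripped text if unfenced."""
    match = _FENCE_RE.search(text)
    if match:
        return match.group(1).strip()
    return text.strip()


raw = f"Here is the query:\n{FENCE}SQL\nSELECT COUNT(*) FROM movies;\n{FENCE}"
print(strip_sql_fences(raw))
```

Unfenced responses pass through unchanged apart from whitespace trimming, so the helper could replace the two `replace` calls without affecting the common case.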

In [13]:
from src.data.sqlite_database import MovieDatabase
from langchain_openai import ChatOpenAI
from langchain.tools import tool
from src.langchain.prompts import SQL_TOOL_PROMPT

sql_tool = SQLMovieTool().get_tool()

queries = [
    "top rated romantic movies made after 2012",
    "comedy movies featuring tom hanks with good critic reviews"
]

for query in queries:
    sql_tool.invoke(query)
    print("\n")

SELECT movie_title, release_year, imdb_rating, tomatometer_rating, audience_rating
FROM movies
WHERE genres LIKE '%Romance%'
AND release_year > 2012
ORDER BY imdb_rating DESC, tomatometer_rating DESC, audience_rating DESC;


SELECT * 
FROM movies 
WHERE genres LIKE '%Comedy%' 
AND actors LIKE '%Tom Hanks%' 
AND tomatometer_rating >= 75;
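
To see how the first generated query behaves, it can be run against a small in-memory SQLite table. The schema below is an assumption inferred only from the column names in the generated SQL, and the rows are made-up placeholders rather than data from the project's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE movies (
        movie_title TEXT,
        release_year INTEGER,
        genres TEXT,
        imdb_rating REAL,
        tomatometer_rating REAL,
        audience_rating REAL
    )
    """
)

# Placeholder rows (hypothetical values chosen to exercise each filter)
rows = [
    ("Placeholder A", 2015, "Comedy, Romance", 7.1, 80, 75),
    ("Placeholder B", 2010, "Romance", 8.0, 90, 88),  # dropped: release_year <= 2012
    ("Placeholder C", 2018, "Drama, Romance", 7.9, 85, 82),
    ("Placeholder D", 2016, "Action", 8.5, 92, 90),  # dropped: genres has no Romance
]
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?, ?, ?)", rows)

# The first query printed by the tool, copied verbatim
query = """
SELECT movie_title, release_year, imdb_rating, tomatometer_rating, audience_rating
FROM movies
WHERE genres LIKE '%Romance%'
AND release_year > 2012
ORDER BY imdb_rating DESC, tomatometer_rating DESC, audience_rating DESC;
"""
titles = [row[0] for row in conn.execute(query)]
print(titles)
conn.close()
```

The generated query has no `LIMIT`, so "top rated" returns every matching row in rating order. Trimming happens later in the tool, which shows only the top 10 rows when more than 20 come back.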


