# Friends RAG

The goal of this RAG is for the user to describe a scene from a Friends episode and have the RAG return the episode's title, season, episode number and a summary of the entire episode. This way, Friends fans can search for a specific scene and get all of this information, including a description of the whole episode, to help them remember exactly what happens in it.

In [1]:
from langchain_aws import BedrockEmbeddings, ChatBedrock
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
import pandas as pd

### Data Loading

The data comes from two CSV files:
- FriendsEpisodes: General information about each episode, including the season number, the episode number and a short description
- FriendsScripts: Every line spoken in each episode, by character

In [2]:
def create_documents(episodes_df, scripts_df):
    all_documents = []
    
    # Create episode-level documents
    for _, episode in episodes_df.iterrows():
        episode_doc = Document(
            page_content=f"Episode: {episode['title']}\n\n{episode['description']}", # What gets embedded
            metadata={
                'season': int(episode['season']),
                'number': int(episode['number']),   # Episode within season
                'episode': int(episode['episode']), # Overall episode number
                'title': episode['title'],
                'air_date': episode['air_date'],
                'content_type': 'episode_summary',
            }
        )
        all_documents.append(episode_doc)

    # Create script-based documents by episode
    for (season, episode_in_season), group in scripts_df.groupby(['Season', 'Episode']):
        
        # Get episode generic info
        episode_info = episodes_df[
            (episodes_df['season'] == season) & 
            (episodes_df['number'] == episode_in_season)    # In episodes_df, 'number' is episode within season, but in scripts_df it's 'Episode'
        ]

        # Skip if no episode info found
        if len(episode_info) == 0:
            continue

        # There should be exactly one match  
        episode_info = episode_info.iloc[0]
        
        # Combine all dialogues of the episode
        script_text = '\n'.join([f"{row['Character']}: {row['Text']}" for _, row in group.iterrows()])
        
        # Split into chunks
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,     
            chunk_overlap=100,                 # Preserve context between chunks
            separators=["\n\n", "\n", " "],    # Prefer splitting at natural breaks
            length_function=len
        )
        
        chunks = text_splitter.split_text(script_text)
        
        # Create a document for each chunk
        for i, chunk in enumerate(chunks):
            chunk_doc = Document(
                page_content=chunk, # What gets embedded
                metadata={
                    'season': int(episode_info['season']),
                    'number': int(episode_info['number']),    # Episode within season
                    'episode': int(episode_info['episode']),  # Overall episode number  
                    'title': episode_info['title'],
                    'air_date': episode_info['air_date'],
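                    'scene_number': i + 1,          # 1-based chunk index within the episode (chunks are not true scenes)
                    'total_scenes': len(chunks),    # Number of chunks for this episode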
                    'episode_description': episode_info['description'],
                    'content_type': 'script_scene'
                }
            )
            all_documents.append(chunk_doc)
    
    return all_documents
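
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplified stand-in, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries the listed separators so that chunks end at line or word boundaries). `chunk_with_overlap` is a hypothetical helper and the sample string is made up.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # How far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The current window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

Each chunk repeats the last 3 characters of the previous one, which is what lets a line of dialogue that straddles a boundary still appear whole in at least one chunk.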

In [3]:
episodes_df = pd.read_csv('FriendsEpisodes.csv')
scripts_df = pd.read_csv('FriendsScripts.csv')

In [4]:
all_docs = create_documents(episodes_df, scripts_df)

In [5]:
# Display first 5 documents
all_docs[:5]  

[Document(metadata={'season': 1, 'number': 1, 'episode': 1, 'title': 'The One Where Monica Gets a Roommate', 'air_date': '1994-09-22', 'content_type': 'episode_summary'}, page_content="Episode: The One Where Monica Gets a Roommate\n\nAn introduction to the gang. After Rachel leaves her Mr Potato Head look-alike fiancé Barry at the altar, she moves in with Monica and discovers that independence sucks when you don't have Daddy's credit cards to rely on."),
 Document(metadata={'season': 1, 'number': 2, 'episode': 2, 'title': 'The One with the Sonogram at the End', 'air_date': '1994-09-29', 'content_type': 'episode_summary'}, page_content="Episode: The One with the Sonogram at the End\n\nRoss finds out his ex-wife is pregnant. Rachel returns her engagement ring to Barry. Monica becomes stressed when her and Ross's parents come to visit."),
 Document(metadata={'season': 1, 'number': 3, 'episode': 3, 'title': 'The One with the Thumb', 'air_date': '1994-10-06', 'content_type': 'episode_summar

In [6]:
# Display last 5 documents
all_docs[-5:]

[Document(metadata={'season': 10, 'number': 17, 'episode': 235, 'title': 'The Last One: Part 1', 'air_date': '2004-05-06', 'scene_number': 31, 'total_scenes': 35, 'episode_description': "Erica gives birth to the baby that Monica and Chandler are adopting. However, there's one small added surprise. Meanwhile, Ross and Rachel sleep together one last time before Rachel leaves ...", 'content_type': 'script_scene'}, page_content="Ross: Please, please stay with me. I am so in love with you. Please, don't go.\nRachel: Oh my God.\nRoss: I know, I know. I shouldn't have waited 'till now to say it, but I'm.. That was stupid, okay? I'm sorry, but I'm telling you now. I love you. Do not get on this plane.\nGate attendant #2: Miss? Are you boarding the plane?\nRoss: Hey, hey. I know you love me. I know you do.\nGate attendant #2: Miss?\nRachel: I - I have to get on the plane.\nRoss: No, you don't.\nRachel: Yes, I do.\nRoss: No, you don't.\nRachel: They're waiting for me, Ross. I can't do this right

In [7]:
print(f"Total documents: {len(all_docs)}")
print(f"Episode summaries: {sum(1 for d in all_docs if d.metadata['content_type'] == 'episode_summary')}")
print(f"Script scenes: {sum(1 for d in all_docs if d.metadata['content_type'] == 'script_scene')}")

Total documents: 4500
Episode summaries: 236
Script scenes: 4264
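
Since `create_documents` silently skips script groups that have no matching row in `episodes_df`, it may be worth checking which episodes ended up with a summary but no script chunks. A minimal sketch, working on plain metadata dictionaries shaped like the ones above. `episodes_without_scripts` is a hypothetical helper and the toy data is made up for illustration.

```python
def episodes_without_scripts(metadatas: list[dict]) -> list[tuple[int, int]]:
    """Return (season, number) pairs that have a summary but no script chunk."""
    summaries = set()
    scripted = set()
    for meta in metadatas:
        key = (meta["season"], meta["number"])
        if meta["content_type"] == "episode_summary":
            summaries.add(key)
        elif meta["content_type"] == "script_scene":
            scripted.add(key)
    return sorted(summaries - scripted)


# Toy metadata for illustration only (not taken from the real dataset).
toy_metadata = [
    {"season": 1, "number": 1, "content_type": "episode_summary"},
    {"season": 1, "number": 1, "content_type": "script_scene"},
    {"season": 1, "number": 2, "content_type": "episode_summary"},
]
missing = episodes_without_scripts(toy_metadata)
print(missing)
```

In the notebook this could be called as `episodes_without_scripts([d.metadata for d in all_docs])`.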


### Embed the documents, index the data and save it in a vector store 

Chose the amazon.titan-embed-text-v2:0 model since it is the most recent one from AWS, and for simplicity, since I'm going to use AWS services (Bedrock).

Chose FAISS for practicality since I can save the vector store locally.

In [None]:
REGION_NAME = "<your-aws-region>"  # Replace with your AWS region name
CREDENTIALS_PROFILE_NAME = "<your-aws-profile>"  # Replace with your AWS credentials profile name
EMBEDDER_MODEL_ID = "amazon.titan-embed-text-v2:0"
EMBEDDER_MODEL_KWARGS = {
    "dimensions": 1024,
    "normalize": True
}

VECTOR_STORE_PATH = "./vector_database/"

In [9]:
embedder = BedrockEmbeddings(
    model_id=EMBEDDER_MODEL_ID,
    model_kwargs=EMBEDDER_MODEL_KWARGS,
    region_name=REGION_NAME,
    credentials_profile_name=CREDENTIALS_PROFILE_NAME
)

In [16]:
vector_store = FAISS.from_documents(documents=all_docs, embedding=embedder)
vector_store.save_local(VECTOR_STORE_PATH)

Checkpoint so far:
- ✅ Embedded all documents - Each document was converted into a 1024-dimensional vector using Titan Embeddings v2
- ✅ Indexed the data - FAISS created an index structure for fast similarity search
- ✅ Saved it locally - The vector store is persisted to disk at VECTOR_STORE_PATH

What we have now:
- A searchable vector database of all Friends episodes
- Each vector is linked to its metadata (season, episode, title, etc.)
- Ready to query!
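
Conceptually, a similarity query embeds the question and returns the stored vectors closest to it. The sketch below does this by brute force over toy 3-dimensional vectors using squared L2 distance. It only illustrates the idea and is not how FAISS is implemented internally; the vectors, labels and the `top_k` helper are made up. For unit-length vectors (as produced with `"normalize": True`), ranking by L2 distance gives the same order as ranking by cosine similarity.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[str]:
    """Return the k labels whose vectors are nearest to the query (squared L2 distance)."""
    scored = []
    for label, vec in index.items():
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((dist, label))
    scored.sort()  # Smallest distance first
    return [label for _, label in scored[:k]]


# Toy "embeddings" for illustration only.
toy_index = {
    "plane scene": normalize([1.0, 0.1, 0.0]),
    "trifle scene": normalize([0.0, 1.0, 0.2]),
    "couch scene": normalize([0.1, 0.0, 1.0]),
}
query_vec = normalize([0.9, 0.2, 0.1])
nearest = top_k(query_vec, toy_index, k=2)
print(nearest)
```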

### Retriever and Generator Pipeline

Chose Claude 3 Haiku as the LLM since it's robust and adequate for this task. Sonnet would probably be overkill here.

In [10]:
LLM_MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"
LLM_MODEL_KWARGS = {
    "max_tokens": 1024,
    "temperature": 0.1
}

SEARCH_TYPE = "similarity"
RETRIEVER_KWARGS = {
    "k": 5 # Number of documents to retrieve
}
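
Two alternative retriever configurations that might be worth trying, shown as plain dictionaries. Treat them as untested assumptions rather than tuned settings: `mmr` (maximal marginal relevance) trades some relevance for diversity, and the `filter` key is assumed to be supported by the LangChain FAISS wrapper for restricting results by metadata. The variable names and the `fetch_k` and `lambda_mult` values are illustrative.

```python
# Option 1: diversify results so the 5 hits are not all chunks of the same episode.
MMR_SEARCH_TYPE = "mmr"
MMR_RETRIEVER_KWARGS = {
    "k": 5,              # Number of documents to return
    "fetch_k": 20,       # Candidates fetched before re-ranking (illustrative value)
    "lambda_mult": 0.5,  # 0 = maximum diversity, 1 = maximum relevance
}

# Option 2: search only episode summaries, using the metadata set in create_documents.
SUMMARY_ONLY_KWARGS = {
    "k": 5,
    "filter": {"content_type": "episode_summary"},
}
```

Either dictionary would be passed as `search_kwargs` to `vector_store.as_retriever(...)`, with `search_type` set accordingly.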

In [11]:
llm = ChatBedrock(
        region_name=REGION_NAME, 
        credentials_profile_name=CREDENTIALS_PROFILE_NAME,
        model_id=LLM_MODEL_ID, 
        model_kwargs=LLM_MODEL_KWARGS
    )

In [12]:
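# allow_dangerous_deserialization is required because the saved store includes a pickle file;
# only load vector stores that you created yourself
vector_store = FAISS.load_local(VECTOR_STORE_PATH, embeddings=embedder, allow_dangerous_deserialization=True)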
retriever = vector_store.as_retriever(search_type=SEARCH_TYPE, search_kwargs=RETRIEVER_KWARGS)

In [31]:
PROMPT_TEMPLATE = """
You are a Friends TV show episode assistant. Based on the episode information provided, give a clear and concise response.
Don't make up any information. If the episode is not found in the context, respond with "Episode not found in the provided information."

Episode Information:
{context}

User Query: {question}

Please respond with the season number, episode number within the season, episode number of the entire series, title, and a brief description of the episode. 
Don't forget to mention if the information is based on episode summaries or script scenes.

RAG Answer:"""

In [32]:
prompt = PromptTemplate(
    template=PROMPT_TEMPLATE,
    input_variables=["context", "question"]
)

In [34]:
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
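
In the chain above, the retriever's output (a list of documents) is inserted into `{context}` as-is, so the model sees the raw list representation. A common alternative is to render each document into a readable block first. The sketch below uses a small stand-in class instead of LangChain's `Document` so that it runs on its own; `SimpleDoc` and `format_docs` are hypothetical names, and the layout of the context string is just one possible choice.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDoc:
    """Stand-in with the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Render retrieved documents as labelled text blocks for the prompt."""
    blocks = []
    for doc in docs:
        meta = doc.metadata
        header = (
            f"[{meta.get('content_type', 'unknown')}] "
            f"Season {meta.get('season')}, Episode {meta.get('number')} "
            f"(overall {meta.get('episode')}): {meta.get('title')}"
        )
        parts = [header, doc.page_content]
        # Script chunks carry the episode description in metadata; keep it visible.
        if meta.get("episode_description"):
            parts.append(f"Episode description: {meta['episode_description']}")
        blocks.append("\n".join(parts))
    return "\n\n---\n\n".join(blocks)


example = SimpleDoc(
    page_content="Episode: The One with Ross's Tan\n\n"
                 "Ross goes to a tanning salon but fails to follow the 'simple' instructions.",
    metadata={"content_type": "episode_summary", "season": 10, "number": 3,
              "episode": 221, "title": "The One with Ross's Tan"},
)
context_text = format_docs([example])
print(context_text)
```

With LangChain, this could be wired in as `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, since piping a runnable into a plain function wraps the function automatically.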

In [35]:
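def print_response(query: str):
    """Print the RAG answer for a query, followed by the retrieved documents."""
    print(f"User Query:\n{query}")
    response = rag_chain.invoke(query)
    print(f"\nRAG Response:\n{response}")
    # Retrieve again only to display the sources; the chain already ran its own retrieval
    docs = retriever.invoke(query)
    print(f"\nRetrieved {len(docs)} documents:")
    for i, doc in enumerate(docs):
        print(f"\nDocument {i+1}:\n{doc.page_content}\nMetadata: {doc.metadata}")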

In [36]:
print_response("What is the episode where Rachel gets off the plane?")

User Query:
What is the episode where Rachel gets off the plane?

RAG Response:
The episode where Rachel gets off the plane is "The Last One: Part 1", which is the 17th episode of season 10 and the 235th episode of the entire Friends series.

Based on the provided script scenes, the episode description is:

"Erica gives birth to the baby that Monica and Chandler are adopting. However, there's one small added surprise. Meanwhile, Ross and Rachel sleep together one last time before Rachel leaves the plane and decides not to go to Paris."

This information is derived from the script scenes provided in the context.

Retrieved 5 documents:

Document 1:
Air stewardess: Miss, I can't let you off the plane.
Ross: Let her off the plane!
Air stewardess: I am afraid you are gonna have to take a seat.
Rachel: Oh, please, miss, you don't understand!
Ross: Try to understand!
Rachel: Oh, come on, miss, isn't there any way that you can just let me off...
Ross: No! No! Oh my God. Did she get off the pl

In [37]:
print_response("What is the episode where Rachel sings Copacabana?")

User Query:
What is the episode where Rachel sings Copacabana?

RAG Response:
The episode where Rachel sings Copacabana is "The One with Barry and Mindy's Wedding" (Season 2, Episode 24, Episode 48).

Based on the script scene provided, in this episode, Rachel reluctantly agrees to be the maid of honor at her ex-fiancé Barry's wedding. During the wedding, Rachel starts singing "Copacabana" in front of the guests, reminiscing about her embarrassing experience of singing the song in front of her entire school when she was in 8th grade.

This information is based on the script scene provided in the context.

Retrieved 5 documents:

Document 1:
Ross: I'm sorry. What was I supposed to do stand up and shout 'Hey, Rachel, your butt is showing!'
Rachel: Oh my God this is sooo humiliating. I think the only thing that tops that was, was, was when I was in the eight grade and I had to sing the Copa Cabana in front of the entire school. I think I got about two lines into it before I ran and freake

In [38]:
print_response("Which episode features the holiday armadillo?")

User Query:
Which episode features the holiday armadillo?

RAG Response:
The episode featuring the holiday armadillo is "The One with the Holiday Armadillo" from Season 7, Episode 10 (episode 156 of the entire series).

Based on the provided script scenes, the episode description is:

Ross wants to introduce Ben to Hanukkah. In order to entice Rachel to move back into their refurbished apartment, Phoebe must drive a wedge between Rachel and current roomie Joey.

Retrieved 5 documents:

Document 1:
Monica: What happened to Santa, Holiday Armadillo?
Ross: Santa was unavailable so close to Christmas.
Monica: Wow, come in, have a seat. You must be exhausted coming all the way fromTexas.
Ben: Texas?
Ross: That's right, Ben. I'm Santa's representative for all the southern states. And Mexico! But, Santa sent me here to give you these presents, Ben.  Maybe the Lady will help me with these presents.
Ben: Wow! Thanks!
Ross: You're welcome, Ben. Merry Christmas, ooh, and Happy Hanukkah!
Ben: Are 

In [39]:
print_response("Which episode features Rachel's trifle?")

User Query:
Which episode features Rachel's trifle?

RAG Response:
The episode that features Rachel's trifle is "The One Where Ross Got High" (Season 6, Episode 9, Episode 130).

Based on the script scenes provided, in this episode:

- Ross is forced to reveal the reason why his parents, Jack and Judy, don't like Chandler. 
- Rachel tries to make a trifle dessert for the gang, but accidentally puts beef in it, which the others find unappetizing.
- Joey and Ross try to get out of Thanksgiving when they are invited to hang out with Janine and her dancer friends.

The information about this episode is derived from the script scenes provided in the context.

Retrieved 5 documents:

Document 1:
Ross: Ah well, cant blame a guy for trying!
Joey: Oh and  Okay, and uh if anyone needs help pretending to like it, I learned something in acting class, try uh, rubbing your stomach  or uh, or saying mmm and uh, oh oh! And smiling , okay?
Chandler: Yeah, Im not gonna pay for those acting classes anymo

In [40]:
print_response("What is the episode where Ross gets a tan?")

User Query:
What is the episode where Ross gets a tan?

RAG Response:
The episode where Ross gets a tan is "The One with Ross's Tan" (Season 10, Episode 3, Episode 221).

Based on the episode summary provided, in this episode:
- Ross goes to a tanning salon but fails to follow the 'simple' instructions.
- Joey and Rachel struggle to make the transition from friends to lovers.
- Monica and Phoebe try to avoid an annoying woman that used to live in their building.

The information provided includes both episode summaries and a script scene from this episode.

Retrieved 5 documents:

Document 1:
Episode: The One with Ross's Tan

Ross goes to a tanning salon but fails to follow the 'simple' instructions. Joey and Rachel struggle to make the transition from friends to lovers. Monica and Phoebe try to avoid an annoying woman that used to live in their building.
Metadata: {'season': 10, 'number': 3, 'episode': 221, 'title': "The One with Ross's Tan", 'air_date': '2003-10-09', 'content_type': 

In [41]:
print_response("What are the episodes on Thanksgiving?")

User Query:
What are the episodes on Thanksgiving?

RAG Response:
Based on the provided information, the following Friends episodes are related to Thanksgiving:

1. Season 10, Episode 8 (Episode 226) - "The One with the Late Thanksgiving"
   Description: Joey, Ross, Rachel and Phoebe convince Monica and Chandler to host Thanksgiving, but the four of them end up arriving to dinner an hour late. (Episode summary)

2. Season 9, Episode 8 (Episode 202) - "The One with Rachel's Other Sister"
   Description: Rachel's middle sister shows up at Thanksgiving and causes arguments amongst the gang. (Episode summary)

3. Season 5, Episode 8 (Episode 105) - "The One with All the Thanksgivings"
   Description: The gang remember and share with each other their worst Thanksgivings. (Episode summary)

4. Season 1, Episode 9 (Episode 9) - "The One Where Underdog Gets Away"
   Description: The gang's plans for Thanksgiving go awry after they get locked out of Monica and Rachel's apartment. (Episode summa

In [42]:
print_response("What is the episode with the routine?")

User Query:
What is the episode with the routine?

RAG Response:
The episode with the routine is "The One with the Routine" (Season 6, Episode 10, Episode 131).

Based on the episode summary and script scene provided, in this episode:

On the set of "Dick Clark's New Year's Rockin' Eve", Joey tries to kiss Janine at midnight and Monica and Ross resurrect their dance routine from high school. Meanwhile, Rachel, Phoebe and Chandler look for Monica's Christmas presents.

The information provided is a combination of episode summary and script scene.

Retrieved 5 documents:

Document 1:
Episode: The One with the Routine

On the set of "Dick Clark's New Year's Rockin' Eve", Joey tries to kiss Janine at midnight and Monica and Ross resurrect their dance routine from high school. Meanwhile, Rachel, Phoebe and Chandler look for Monica's Christmas presents.
Metadata: {'season': 6, 'number': 10, 'episode': 131, 'title': 'The One with the Routine', 'air_date': '1999-12-16', 'content_type': 'episod

In [43]:
print_response("What is the episode with the leather pants?")

User Query:
What is the episode with the leather pants?

RAG Response:
The episode with the leather pants is "The One with All the Resolutions" (Season 5, Episode 11, Episode 108).

Based on the provided script scenes, the episode description is:

The gang make their New Years resolutions. Chandler struggles to not make jokes about everyone. Rachel uncovers a secret. Ross runs into trouble when he wears leather pants on a date.

Retrieved 5 documents:

Document 1:
Joey: Yes!
Phoebe: Then don't touch one!!
Ross: Hi!
Ben: Hi!
Monica: Hi Ben!
Ben: Auntie Monica!!
Chandler: Ross is wearing leather pants! Does nobody else see that Ross is wearing leather pants?  Someone comment on the pants!
Rachel: I think they're very nice.
Monica: I like 'em.
Joey: Yeah!
Monica: I like them a lot.
Chandler: That's not what I had in mind! See, people like Ross don't generally wear these types of pants. You see, they're very tight.  Maybe there's something in that area.
Ross: Oh see, I-I needed a new thing

In [44]:
print_response("What is the episode where Ross and Emily get married?")

User Query:
What is the episode where Ross and Emily get married?

RAG Response:
The episode where Ross and Emily get married is "The One with Ross's Wedding: Part 2" (Season 4, Episode 24, Episode 97).

Based on the script scenes provided, this episode describes the events of Ross and Emily's wedding in London. The episode includes the wedding ceremony, where Ross accidentally says "Rachel" instead of "Emily" at the altar, causing Emily to run away. The episode also shows the arguments between the Geller and Waltham families over the wedding bill, as well as Monica and Chandler sleeping together.

The information provided is from script scenes, not a full episode summary.

Retrieved 5 documents:

Document 1:
Ross: Happy too.
Minister: Ross and Emily have made their declarations and it gives me great pleasure to declare them husband and wife.
Ross: Yay!
Minister: You may kiss the bride.
Mrs. Geller: This is worse than when he married the lesbian.
Emily: Just keep smiling.
Ross: Okay.
J

In [45]:
print_response("What is the episode where Ross and Rachel sing to Emma the song Baby Got Back?")

User Query:
What is the episode where Ross and Rachel sing to Emma the song Baby Got Back?

RAG Response:
The episode where Ross sings "Baby Got Back" to Emma is "The One with Ross's Inappropriate Song" (Season 9, Episode 7, Episode 201).

According to the provided script scene, Ross finds a way to make Emma laugh by singing/rapping the song "Baby Got Back" to her. Rachel is shocked and disapproves of Ross singing that song to their baby daughter.

This information is based on the script scene provided in the document.

Retrieved 5 documents:

Document 1:
Episode: The One with Ross's Inappropriate Song

Ross finds a way to make Emma laugh - singing "Baby Got Back." Meanwhile, Phoebe meets Mike's parents, and Joey and Chandler find a video tape in Richard's apartment.
Metadata: {'season': 9, 'number': 7, 'episode': 201, 'title': "The One with Ross's Inappropriate Song", 'air_date': '2002-11-14', 'content_type': 'episode_summary'}

Document 2:
Monica: She said WHAT?
Phoebe: That's she's 

In [46]:
print_response("What is the episode where Ross finds out about Monica and Chandler?") 

User Query:
What is the episode where Ross finds out about Monica and Chandler?

RAG Response:
The episode where Ross finds out about Monica and Chandler is "The One Where Chandler Gets Caught" (Season 10, Episode 10, Episode 228).

Based on the provided script scene, the key details are:

- Ross finds out about Monica and Chandler's relationship when he sees them together. 
- Ross is shocked and upset that his best friend and sister are in a relationship.
- Chandler tries to explain that he loves Monica, and Monica also expresses her love for Chandler.
- The scene is from the script, not a full episode summary.

Retrieved 5 documents:

Document 1:
Episode: The One with Monica and Chandler's Wedding: Part 2

Ross tries to find Chandler with Phoebe's help. Meanwhile Rachel tries to hinder Monica.
Metadata: {'season': 7, 'number': 24, 'episode': 170, 'title': "The One with Monica and Chandler's Wedding: Part 2", 'air_date': '2001-05-17', 'content_type': 'episode_summary'}

Document 2:
Mo

In [47]:
print_response("What is the episode where Phoebe gives birth?")

User Query:
What is the episode where Phoebe gives birth?

RAG Response:
The episode where Phoebe gives birth is "The One Hundredth" (Season 5, Episode 3, Episode 100).

Based on the episode summary provided, the key details are:

"The gang take Phoebe to the hospital after she goes into labor. Monica threatens to go on a date with a male nurse after Chandler claims they are "just goofing around." Joey gets treated for kidney stones."

This information is from the episode summary, not script scenes.

Retrieved 5 documents:

Document 1:
Episode: The One with the Embryos

Phoebe's uterus is examined for implantation of the embryos. Meanwhile, a seemingly harmless game between Chandler and Joey against Monica and Rachel escalates into a full blown contest where the stakes are raised higher and higher.
Metadata: {'season': 4, 'number': 12, 'episode': 85, 'title': 'The One with the Embryos', 'air_date': '1998-01-15', 'content_type': 'episode_summary'}

Document 2:
Episode: The One Hundredth

In [48]:
print_response("What is the episode where Phoebe and Rachel try to distract Ross from seeing Monica and Chandler through the window?")

User Query:
What is the episode where Phoebe and Rachel try to distract Ross from seeing Monica and Chandler through the window?

RAG Response:
The episode you are referring to is "The One Where Everybody Finds Out" (Season 5, Episode 14, Episode 111). 

Based on the script scene provided, this episode features Phoebe and Rachel attempting to distract Ross from seeing Monica and Chandler through the window. Specifically, Phoebe and Rachel try to cover up Monica and Chandler's relationship after Phoebe discovers it.

This information is derived from the script scene provided in the context.

Retrieved 5 documents:

Document 1:
Episode: The One with Monica and Chandler's Wedding: Part 2

Ross tries to find Chandler with Phoebe's help. Meanwhile Rachel tries to hinder Monica.
Metadata: {'season': 7, 'number': 24, 'episode': 170, 'title': "The One with Monica and Chandler's Wedding: Part 2", 'air_date': '2001-05-17', 'content_type': 'episode_summary'}

Document 2:
Ross: Actually, it looks 

In [49]:
print_response("What is the episode with the couch?")

User Query:
What is the episode with the couch?

RAG Response:
The episode with the couch is "The One with the Cop" (Season 5, Episode 16, Episode 113).

Based on the script scenes provided, this episode involves Ross trying to get a new sofa into his apartment. He struggles with the delivery and ends up returning the couch after it gets cut in half.

The episode description states: "Joey has a dream about Monica and becomes convinced he is in love with her. Meanwhile, Phoebe finds a police badge in Central Perk, and Ross tries to get his new sofa into his apartment."

Retrieved 5 documents:

Document 1:
Chandler: No! No-no-no-no-no-no. It sounds like they really need you down there.
Monica: Well, are you just hanging out with Ross?
Chandler: It's, all good! Okay bye-bye Mon!  She's-she's gonna kill me.
Ross: Yeah, the phone was facing the other way.  And that goes back up there.
Chandler: We should start with the big stuff. Yknow? That'll be the easiest. Uh, let's start with the couch