### Processing the CSV file into multiple structured txt files

In [None]:
import pandas as pd
import os

In [None]:
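df = pd.read_csv("netflix_titles.csv")
# Fill missing values so every field written to the text files has a value.
df = df.fillna('Unknown')
# Some rows carry a duration in the rating column; treat those ratings as unknown.
df['rating'] = df['rating'].replace(['74 min', '84 min', '66 min'], 'Unknown')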
print(df['rating'].unique())

['PG-13' 'TV-MA' 'PG' 'TV-14' 'TV-PG' 'TV-Y' 'TV-Y7' 'R' 'TV-G' 'G'
 'NC-17' 'Unknown' 'NR' 'TV-Y7-FV' 'UR']
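
The three values replaced above look like durations that ended up in the rating column. As a minimal sketch, assuming every such value has the form "<number> min", a pattern check could flag them without hardcoding each one. The names `DURATION_PATTERN` and `clean_rating` are hypothetical and not part of the original notebook.

```python
import re

# Assumption: misplaced values look like "<number> min" (e.g. "74 min").
DURATION_PATTERN = re.compile(r"^\d+\s*min$")


def clean_rating(value, placeholder="Unknown"):
    """Return the placeholder when a rating value is really a duration string."""
    if isinstance(value, str) and DURATION_PATTERN.match(value.strip()):
        return placeholder
    return value


# Small hand-written sample that mirrors the values seen in the notebook.
ratings = ["PG-13", "74 min", "TV-MA", "84 min", "66 min", "Unknown"]
cleaned = [clean_rating(r) for r in ratings]
print(cleaned)
```

With the DataFrame above, the same function could be applied as `df['rating'] = df['rating'].map(clean_rating)`.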


In [None]:
data = df

# Create the output folder if it does not exist yet.
os.makedirs('netflix', exist_ok=True)

for index, row in data.iterrows():

    filename = f"netflix/{row['show_id']}.txt"

    with open(filename, 'w', encoding='utf-8') as f:
        f.write(f"Show ID: {row['show_id']}\n")
        f.write(f"Type: {row['type']}\n")
        f.write(f"Show Title: {row['title']}\n")
        f.write(f"Director: {row['director']}\n")
        f.write(f"Cast: {row['cast']}\n")
        f.write(f"Country: {row['country']}\n")
        f.write(f"Date Added to Netflix: {row['date_added']}\n")
        f.write(f"Release Year: {row['release_year']}\n")
        f.write(f"Rating: {row['rating']}\n")
        f.write(f"Duration: {row['duration']}\n")
        f.write(f"Listed In: {row['listed_in']}\n")
        f.write(f"Description: {row['description']}\n")


### Build the LangChain Retrieval Chain

In [1]:
import os

# Set your own OpenAI API key here (left blank on purpose).
os.environ["OPENAI_API_KEY"] = ' '

from langchain import hub
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.text_splitter import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores.faiss import FAISS
from langchain_community.document_loaders import DirectoryLoader

In [2]:
loader = DirectoryLoader('netflix/', glob='**/*.txt')
documents = loader.load()

In [3]:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=50, separator="\n")

docs = text_splitter.split_documents(documents)
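
With `chunk_size=1000`, a record file shorter than 1000 characters should stay in a single chunk, so splitting only matters for unusually long records. The sketch below is one way to check how many texts exceed the limit. It assumes length is measured in characters, and `count_oversized` is a hypothetical helper, not a LangChain function.

```python
def count_oversized(texts, chunk_size=1000):
    """Count texts whose length in characters exceeds chunk_size."""
    return sum(1 for text in texts if len(text) > chunk_size)


# Toy inputs for illustration only: one short record and one long string.
texts = ["Show ID: s1\nType: Movie\n", "x" * 1200]
print(count_oversized(texts))
```

On the loaded documents this could be called as `count_oversized(d.page_content for d in documents)`.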

In [4]:
embeddings = OpenAIEmbeddings()

In [5]:
# Set up the local FAISS vector database.
# Uncomment on the first run to build the index and save it to "vector_db".

# vectorstore = FAISS.from_documents(docs, embeddings)
# vectorstore.save_local("vector_db")

In [6]:
retrieval_qa_chat_prompt = hub.pull("langchain-ai/retrieval-qa-chat")

llm = ChatOpenAI()

combine_docs_chain = create_stuff_documents_chain(llm, retrieval_qa_chat_prompt)

In [7]:
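# Load the saved index. allow_dangerous_deserialization=True is required because the
# index is stored with pickle; only enable it for files you created yourself.
retriever = FAISS.load_local("vector_db", embeddings, allow_dangerous_deserialization=True).as_retriever()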

retrieval_chain = create_retrieval_chain(retriever, combine_docs_chain)

response = retrieval_chain.invoke({"input": "Can you recommend me some new comedy TV shows?"})

print(response["answer"])

Based on the information provided, I recommend checking out "The Comedy Lineup" and "COMEDIANS of the world" on Netflix. These shows feature a diverse group of up-and-coming comedians from different regions performing stand-up comedy sets, providing a fresh and entertaining comedic experience.


### Content Analysis and Correlation Exploration

In [9]:
response = retrieval_chain.invoke({"input": "Which director and cast work with each other frequently?"})

print(response["answer"])

There is no information provided in the context about any directors or casts that work together frequently.
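
The chain only sees the few per-title files retrieved for each question, so it cannot aggregate across the whole catalog, which is likely why this question gets no answer. A question like this can be answered from the table directly. The sketch below assumes `director` and `cast` hold comma-separated names and that missing values are marked "Unknown", as in the preprocessing step. `count_director_cast_pairs` is a hypothetical helper.

```python
from collections import Counter


def count_director_cast_pairs(rows, missing="Unknown"):
    """Count how often each (director, cast member) pair shares a title."""
    pairs = Counter()
    for row in rows:
        # Skip titles where either field was filled with the placeholder.
        if row["director"] == missing or row["cast"] == missing:
            continue
        directors = [d.strip() for d in row["director"].split(",") if d.strip()]
        actors = [a.strip() for a in row["cast"].split(",") if a.strip()]
        for director in directors:
            for actor in actors:
                pairs[(director, actor)] += 1
    return pairs


# Toy rows for illustration only; not taken from the dataset.
rows = [
    {"director": "Director A", "cast": "Actor X, Actor Y"},
    {"director": "Director A", "cast": "Actor X"},
    {"director": "Director B", "cast": "Unknown"},
]
pairs = count_director_cast_pairs(rows)
print(pairs.most_common(1))
```

With the DataFrame from the first section, the rows could be passed as `df.to_dict("records")`.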


### Q&A Session

In [8]:
response = retrieval_chain.invoke({"input": "What is the genre of the Birth of the Dragon"})

print(response["answer"])

The genre of "Birth of the Dragon" is Action & Adventure and Dramas.


In [9]:
response = retrieval_chain.invoke({"input": "Who is the director of the Birth of the Dragon"})

print(response["answer"])

The director of "Birth of the Dragon" is George Nolfi.


In [20]:
response = retrieval_chain.invoke({"input": "Who is Sara Colangelo?"})

print(response["answer"])

Sara Colangelo is the director of the movie "The Kindergarten Teacher."


In [22]:
response = retrieval_chain.invoke({"input": "What films has Jim Henson directed?"})

print(response["answer"])

Jim Henson has directed two films: "Labyrinth" and "The Dark Crystal."


In [23]:
response = retrieval_chain.invoke({"input": "tell me the story of Comedy Premium League"})

print(response["answer"])

Comedy Premium League is a TV show where 16 of India's wittiest entertainers compete in teams through satirical sketches, cheeky debates, and blistering roasts to be named the ultimate comedy champs. The show features a competitive atmosphere as these comedians showcase their humor and wit to win the title.


In [24]:
response = retrieval_chain.invoke({"input": "Who are the main actors and actresses of Comedy Premium League"})

print(response["answer"])

The main actors and actresses of "Comedy Premium League" are unknown as the information provided does not list the specific cast members.


#### The above answer is correct: the source data does not list cast members for this title.

In [25]:
response = retrieval_chain.invoke({"input": "Who are the main actors and actresses of Korean Cold Noodle Rhapsody"})

print(response["answer"])

The main cast of "Korean Cold Noodle Rhapsody" includes Paik Jong-won.


In [26]:
response = retrieval_chain.invoke({"input": "How long is Korean Cold Noodle Rhapsody"})

print(response["answer"])

Korean Cold Noodle Rhapsody has a duration of 1 season.


In [28]:
response = retrieval_chain.invoke({"input": "Which country produced Lokillo: Nothing's the Same"})

print(response["answer"])

Colombia produced Lokillo: Nothing's the Same.


In [32]:
response = retrieval_chain.invoke({"input": "recommend a few movies that Lee Jung-jae was mainly in?"})

print(response["answer"])

One movie where Lee Jung-jae primarily starred in is "Chief of Staff," which is a TV Show.
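
The question asked for movies, but the answer is a TV show: similarity search does not enforce the `type` field. When a hard constraint matters, a plain filter over the table is one option. The sketch below assumes `type` takes the values "Movie" and "TV Show" and that `cast` is a comma-separated string. `titles_with_actor` is a hypothetical helper.

```python
def titles_with_actor(rows, actor, content_type="Movie"):
    """Return titles of the given type whose cast list includes the actor."""
    matches = []
    for row in rows:
        cast = [name.strip() for name in row["cast"].split(",")]
        if row["type"] == content_type and actor in cast:
            matches.append(row["title"])
    return matches


# Toy rows for illustration only; not taken from the dataset.
rows = [
    {"title": "Title 1", "type": "Movie", "cast": "Actor X, Actor Y"},
    {"title": "Title 2", "type": "TV Show", "cast": "Actor X"},
    {"title": "Title 3", "type": "Movie", "cast": "Actor Z"},
]
print(titles_with_actor(rows, "Actor X"))
```

The matching titles could then be passed to the chain as part of the question, so the model only describes items that satisfy the constraint.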


In [33]:
response = retrieval_chain.invoke({"input": "recommend any recent popular movies and introduce"})

print(response["answer"])


One popular recent movie on Netflix is "Us and Them." Directed by Rene Liu, this Chinese romantic drama follows the story of two strangers who meet on a train and form a bond that evolves over the years. After a separation, they reconnect and reflect on their love for each other. The film, released in 2018, stars Jing Boran, Zhou Dongyu, Zhuangzhuang Tian, Qu Zheming, and Zhang Zixian. With a duration of 119 minutes and a TV-MA rating, "Us and Them" is listed in the categories of Dramas, International Movies, and Romantic Movies.


In [34]:
response = retrieval_chain.invoke({"input": "What are some movies with a similar plot to Jeans"})

print(response["answer"])

Some movies with a similar plot to "Jeans" could include romantic comedies where characters create alter egos or engage in elaborate schemes to win back their love interests. Examples could include movies like "Cyrano de Bergerac," "She's the Man," or "Tootsie."


In [35]:
response = retrieval_chain.invoke({"input": "It's been a long day at work and I'd like to relax. Do you have any movie recommendations?"})

print(response["answer"])

Based on your preference to relax, I recommend watching "Headspace: Unwind Your Mind." It is an interactive special that allows you to personalize the experience according to your mood or mindset, perfect for unwinding after a long day at work.


In [36]:
response = retrieval_chain.invoke({"input": "tell me something about the film industry in Singapore"})

print(response["answer"])

Singapore has a growing film industry that produces a variety of content, including movies and TV shows. The film "Singapore Social" provides a glimpse into the lives of young Singaporeans, while the movie "A Land Imagined" delves into a mystery involving a Chinese migrant worker in Singapore. The industry seems to be exploring diverse themes and collaborating with international partners, as seen in the countries listed for co-productions like France and the Netherlands.


In [37]:
response = retrieval_chain.invoke({"input": "recommend me some new comedy TV shows?"})

print(response["answer"])

Based on the information provided, I recommend checking out "The Comedy Lineup" and "COMEDIANS of the world" on Netflix. Both shows feature a diverse group of up-and-coming comedians and established comics performing stand-up sets, offering a mix of humor from different regions and perspectives.
