### Importing the libraries

In [1]:
import pandas as pd
import numpy as np
import google.generativeai as genai
from dotenv import dotenv_values

pd.set_option('display.max_colwidth', None)

### Importing the movie dataset

In [2]:
df = pd.read_csv('dataset/tmdb_5000_movies.csv')

In [3]:
columns = ['title', 'overview']

In [4]:
# Take an explicit copy so later modifications do not act on a view of the original frame
df = df[columns].copy()

In [5]:
df.head()

,title,overview
0,Avatar,"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization."
1,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, has come back to life and is headed to the edge of the Earth with Will Turner and Elizabeth Swann. But nothing is quite as it seems."
2,Spectre,"A cryptic message from Bond’s past sends him on a trail to uncover a sinister organization. While M battles political forces to keep the secret service alive, Bond peels back the layers of deceit to reveal the terrible truth behind SPECTRE."
3,The Dark Knight Rises,"Following the death of District Attorney Harvey Dent, Batman assumes responsibility for Dent's crimes to protect the late attorney's reputation and is subsequently hunted by the Gotham City Police Department. Eight years later, Batman encounters the mysterious Selina Kyle and the villainous Bane, a new terrorist leader who overwhelms Gotham's finest. The Dark Knight resurfaces to protect a city that has branded him an enemy."
4,John Carter,"John Carter is a war-weary, former military captain who's inexplicably transported to the mysterious and exotic planet of Barsoom (Mars) and reluctantly becomes embroiled in an epic conflict. It's a world on the brink of collapse, and Carter rediscovers his humanity when he realizes the survival of Barsoom and its people rests in his hands."


In [6]:
df.shape

(4803, 2)

In [7]:
df.isna().sum()

title       0
overview    3
dtype: int64

In [8]:
# Drop the 3 rows with a missing overview; reassigning avoids in-place edits on a sliced frame
df = df.dropna()

In [9]:
df.sample(10)

,title,overview
4193,Conquest of the Planet of the Apes,"In a futuristic world that has embraced ape slavery, Caesar, the son of the late simians Cornelius and Zira, surfaces after almost twenty years of hiding out from the authorities, and prepares for a slave revolt against humanity."
921,We Bought a Zoo,"Benjamin has lost his wife and, in a bid to start his life over, purchases a large house that has a zoo – welcome news for his daughter, but his son is not happy about it. The zoo is need of renovation and Benjamin sets about the work with the head keeper and the rest of the staff, but, the zoo soon runs into financial trouble."
871,Gigli,"Gigli is ordered to kidnap the psychologically challenged younger brother of a powerful federal prosecutor. When plans go awry, Gigli's boss sends in Ricki, a gorgeous free-spirited female gangster who has her own set of orders to assist with the kidnapping. But Gigli begins falling for the decidedly unavailable Ricki, which could be a hazard to his occupation."
756,Intolerable Cruelty,A revenge-seeking gold digger marries a womanizing Beverly Hills lawyer with the intention of making a killing in the divorce.
4762,George Washington,A delicately told and deceptively simple story of a group of children in a depressed small town who band together to cover up a tragic mistake.
881,Beloved,"After Paul D. finds his old slave friend Sethe in Ohio and moves in with her and her daughter Denver, a strange girl comes along by the name of ""Beloved"". Sethe and Denver take her in and then strange things start to happen..."
2530,Beetlejuice,"Thanks to an untimely demise via drowning, a young couple end up as poltergeists in their New England farmhouse, where they fail to meet the challenge of scaring away the insufferable new owners, who want to make drastic changes. In desperation, the undead newlyweds turn to an expert frightmeister, but he's got a diabolical agenda of his own."
1483,Step Up Revolution,"Emily arrives in Miami with aspirations to become a professional dancer. She sparks with Sean, the leader of a dance crew whose neighborhood is threatened by Emily's father's development plans."
172,The Twilight Saga: Breaking Dawn - Part 2,"After the birth of Renesmee, the Cullens gather other vampire clans in order to protect the child from a false allegation that puts the family in front of the Volturi."
3437,Tracker,"An ex-Boer war guerrilla in New Zealand is sent out to bring back a Maori accused of killing a British soldier. Gradually they grow to know and respect one another but a posse, led by the British Commanding officer is close behind and his sole intention is to see the Maori hang. Written by Filmfinders 1903. A guerilla fighter from the South African Boer war called Arjan (Winstone) takes on a manhunt for Maori seaman Kereama (Morrison), who is accused of murdering a British soldier. What follows is a cat and mouse pursuit through the varied landscape of NZ with both hunter and huntee testing their bushcraft and wits against that of the other. Written by Anonymous"


### Embedding the dataframe

In [10]:
ENV = dotenv_values(".env")
api_key = ENV['API_KEY']
genai.configure(api_key=api_key)

In [11]:
def embed_content(title, text, model, task_type):
    # Send the title only when one is available; the content itself is always required
    if title is None:
        return genai.embed_content(
            model=model,
            content=text,
            task_type=task_type,
        )["embedding"]
    else:
        return genai.embed_content(
            model=model,
            title=title,
            content=text,
            task_type=task_type,
        )["embedding"]

In [12]:
embedding_model = "models/embedding-001"
df["Embeddings"] = df.apply(
    lambda x: embed_content(
        x["title"],
        x["overview"],
        embedding_model,
        "RETRIEVAL_DOCUMENT",
    ),
    axis=1,
)
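The cell above issues one embedding request per row, which means roughly 4,800 requests for this dataset. A minimal sketch of grouping the rows into batches is shown below. The names `embed_in_batches`, `embed_batch`, and `fake_embed_batch` are hypothetical and not part of the notebook or of any library. In the notebook, `embed_batch` would be a small wrapper around `genai.embed_content`; whether that call accepts a list of texts, and how many per request, is an assumption to check against the library documentation.

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 100,
) -> List[List[float]]:
    """Embed texts chunk by chunk, preserving the input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        chunk = list(texts[start:start + batch_size])
        vectors = embed_batch(chunk)
        # Guard against an embedder that silently drops or adds items
        if len(vectors) != len(chunk):
            raise ValueError("embed_batch returned a different number of vectors")
        embeddings.extend(vectors)
    return embeddings


# Stand-in embedder for demonstration only: maps each text to [its length]
def fake_embed_batch(chunk: List[str]) -> List[List[float]]:
    return [[float(len(text))] for text in chunk]


demo = embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
print(demo)
```

Passing the embedder in as a function keeps the batching logic testable without an API key or network access.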

In [14]:
# Export to CSV (each embedding list is written out as its string representation)
df.to_csv('dataset/tmdb_5000_movies_embed.csv', index=False)
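Because CSV stores every cell as text, the `Embeddings` column comes back as strings rather than lists when the file is read again. The sketch below shows one way to restore the vectors, using a tiny stand-in frame and an in-memory buffer instead of the real file. The names `demo_df`, `buffer`, `loaded`, and `matrix` are illustrative only.

```python
import ast
import io

import numpy as np
import pandas as pd

# Tiny stand-in for the embedded dataframe; the values are made up for illustration
demo_df = pd.DataFrame(
    {
        "title": ["A", "B"],
        "Embeddings": [[0.1, 0.2], [0.3, 0.4]],
    }
)

# Round-trip through CSV in memory instead of writing to disk
buffer = io.StringIO()
demo_df.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

# At this point each cell is a string such as "[0.1, 0.2]"
cell_type_before = type(loaded.loc[0, "Embeddings"])

# Parse the strings back into Python lists, then stack them into a 2-D array
loaded["Embeddings"] = loaded["Embeddings"].apply(ast.literal_eval)
matrix = np.array(loaded["Embeddings"].tolist())
print(cell_type_before, matrix.shape)
```

If parsing on every load becomes inconvenient, a binary format that preserves list columns may be a better fit than CSV; that choice is outside what the notebook itself does.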