# Movie Reviews Dataset Preprocessing
In this notebook, we fetch the IMDB IDs for all the final movie titles produced in the previous notebook. We need a movie's IMDB ID in order to fetch its reviews later.

We will append the IDs to our data as a new column for further use.

## Importing the Dependencies

In [4]:
import pandas as pd
import requests
from tmdbv3api import TMDb
import time

import os
from dotenv import load_dotenv
load_dotenv()

TMDB_API_KEY = os.getenv("TMDB_API_KEY")
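`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` entry would only surface later as failed API requests. A minimal sketch of an early check follows; the helper name `require_env` is an assumption for illustration and is not part of the original notebook.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value


# Possible usage in this notebook, after load_dotenv() has run:
# TMDB_API_KEY = require_env("TMDB_API_KEY")
```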

## Loading the Data

In [54]:
data = pd.read_csv('/Users/dhruv/Desktop/Machine_Learning/Projects/Movie_Recommender/Movies_Final.csv')

In [55]:
data.shape

(26241, 3)

## Function to fetch IMDB IDs using TMDB API

In [6]:
# Set up the TMDb client with the API key
tmdb = TMDb()
tmdb.api_key = TMDB_API_KEY

In [57]:
# Function to get IMDB ID
def get_imdb_id(movie_title, max_retries=5):
    # The user inputs a movie name, from which we look up its TMDB ID
    movie_id = int(data['id'][data['title'] == movie_title].values[0])
    url = 'https://api.themoviedb.org/3/movie/{}?api_key={}&append_to_response=credits'.format(movie_id, tmdb.api_key)

    for _ in range(max_retries + 1):
        # Send a GET request to the TMDB API for this movie's details (with credits appended)
        response = requests.get(url)
        if response.status_code == 429:
            print("Rate limit exceeded. Waiting...")
            time.sleep(2)
            continue  # Retry after waiting, up to max_retries times
        if response.status_code != 200:
            print(f"Error for ID {movie_id}. Status Code: {response.status_code}")
            return None

        data_json = response.json()
        time.sleep(0.015)  # Brief pause between successive requests
        # 'imdb_id' may be missing or null, in which case we return an empty string
        return data_json.get('imdb_id') or ""

    print(f"Rate limit still exceeded for ID {movie_id}. Giving up.")
    return None
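The function above waits a fixed two seconds after a 429 response. As an alternative, the sketch below retries with an increasing wait and uses the standard HTTP `Retry-After` header when the server sends one. The helper name `get_with_backoff` and its parameters are assumptions for illustration. Whether TMDB includes `Retry-After` on its 429 responses is not confirmed here, so the code falls back to doubling the wait.

```python
import time


def get_with_backoff(send, max_retries=5, base_wait=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-429 response or retries run out.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers` attributes, such as a `requests.Response`.
    """
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_retries:
            break
        # Prefer the server's Retry-After hint (in seconds) when it is present
        # and numeric; otherwise double the wait on every attempt.
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            wait = float(retry_after)
        else:
            wait = base_wait * (2 ** attempt)
        sleep(wait)
    return response  # Still 429 after all retries; the caller decides what to do
```

Inside `get_imdb_id`, this could be called as `response = get_with_backoff(lambda: requests.get(url))`, where `url` stands for the request URL built from the TMDB ID. The usual status code checks would follow.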

Now, let us use this function to fetch the IMDB ID of every movie and store the IDs in a new column of our dataset.

In [60]:
data['IMDB_ID'] = data['title'].apply(get_imdb_id)
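Note that `get_imdb_id` finds the TMDB ID by matching the title and taking the first match. If the dataset contains two movies with the same title (for example, a remake), both rows would receive the IMDB ID of the first one. A possible alternative, sketched below, is to work from the `id` column directly. Here `build_imdb_id_map` and `fetch_imdb_id` are hypothetical names; `fetch_imdb_id` stands for any function that takes a TMDB ID and returns an IMDB ID.

```python
def build_imdb_id_map(tmdb_ids, fetch_imdb_id):
    """Map each distinct TMDB ID to an IMDB ID, calling `fetch_imdb_id` once per ID."""
    id_map = {}
    for tmdb_id in tmdb_ids:
        key = int(tmdb_id)  # Normalise NumPy integers to plain Python ints
        if key not in id_map:
            id_map[key] = fetch_imdb_id(key)
    return id_map


# Possible usage in this notebook (fetch_imdb_id is a hypothetical ID-based fetcher):
# id_map = build_imdb_id_map(data['id'], fetch_imdb_id)
# data['IMDB_ID'] = data['id'].map(id_map)
```

Keying on the TMDB ID also avoids scanning the whole DataFrame for every title, and repeated IDs trigger only one request.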

In [61]:
data.head(5)

| | id | title | tags | IMDB_ID |
|---|---|---|---|---|
| 0 | 615656 | Meg 2: The Trench | action sci-fi horror jason statham wu jing shu... | tt9224104 |
| 1 | 758323 | The Pope's Exorcist | horror mystery thriller russell crowe daniel z... | tt13375076 |
| 2 | 667538 | Transformers: Rise of the Beasts | action adventure sci-fi anthony ramos dominiqu... | tt5090568 |
| 3 | 640146 | Ant-Man and the Wasp: Quantumania | action adventure sci-fi paul rudd evangeline l... | tt10954600 |
| 4 | 677179 | Creed III | drama action michael b. jordan tessa thompson ... | tt11145118 |


We now have the IMDB IDs as well. Let us save our data.

In [62]:
data.to_csv("Movies_Final.csv",index=False)