#### Question 2: Gathering Movie Data via TMDB API

* Set up the API
    * Create a free TMDB account
    * Generate an API key and review their documentation, especially:
        * /discover/movie
        * /movie/{movie_id}
        * /search/movie
* Collect top movies (2015-2024)
    * For each year from 2015 to 2024:
        * Query TMDB for the top 100 movies (by vote count).
        * For each movie, gather:
            * Title
            * Release Year
            * Genre(s)
            * Vote Average
            * Vote Count
            * Budget
            * Revenue
            * TMDB ID
* Store all results in a single DataFrame and export to movies_2015_2024.csv.
* Hint: TMDB rate limits are generous for free accounts, but you should still pause between requests (e.g., time.sleep(0.25)).
* Some Oscar films may not appear in the top 100 by vote count. For any that are missing, use the /search/movie endpoint to add them.
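
The /search/movie fallback in the last bullet could be sketched roughly as follows. The helper names `search_movie` and `add_missing`, the `http_get` and `pause` parameters, and the rule of keeping the result with the most votes are all assumptions made for illustration; they are not part of the assignment or of TMDB's API. The `query` and `primary_release_year` parameters are the ones TMDB documents for this endpoint.

```python
import time


def search_movie(title, year, api_key, http_get=None):
    """Return the best /search/movie match for a title and year, or None."""
    if http_get is None:
        import requests  # imported lazily so a stub can be passed in instead
        http_get = requests.get

    params = {
        'api_key': api_key,
        'query': title,
        'primary_release_year': year,
    }
    response = http_get('https://api.themoviedb.org/3/search/movie', params=params)
    response.raise_for_status()
    results = response.json()['results']
    if not results:
        return None
    # Assumption: among same-title hits, the one with the most votes is the film we want
    return max(results, key=lambda m: m['vote_count'])


def add_missing(movie_data, wanted, api_key, http_get=None, pause=0.25):
    """Append search hits for (title, year) pairs that are not already collected."""
    seen_ids = {m['id'] for m in movie_data}
    for title, year in wanted:
        match = search_movie(title, year, api_key, http_get)
        if match is not None and match['id'] not in seen_ids:
            movie_data.append(match)
            seen_ids.add(match['id'])
        time.sleep(pause)  # stay polite between requests
    return movie_data
```

Matching on TMDB ID rather than title avoids adding a film twice when the search result is already among the top 100 for its year.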

In [36]:
import requests
import json
import time

In [43]:
# Load API key from keys file
with open('keys.json') as fi:
    credentials = json.load(fi)

api_key = credentials['api_key']

In [44]:
endpoint = 'https://api.themoviedb.org/3/discover/movie'

movie_data=[]

# Iterate through all years between 2015 and 2024
for year in range(2015,2025):
    
    # Each page contains 20 results, so we need to iterate through 5 pages to get 100 results
    for page in range(1,6):

        # Define params
        params = {
            'api_key': api_key,
            'primary_release_year': year,
            'sort_by': 'vote_count.desc',
            'page': page
        }
    
        # Get response and fail fast on HTTP errors (e.g., an invalid API key)
        response = requests.get(endpoint, params=params)
        response.raise_for_status()
        movie_data.extend(response.json()['results'])

        # Sleep before next API call 
        time.sleep(0.25)
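
The loop above makes 50 requests with a fixed pause. If a request is ever throttled, a small wrapper can retry it instead of failing the whole run. The sketch below is one possible approach, not part of the original notebook: `get_json` and its parameters are hypothetical names, and it assumes the API signals throttling with HTTP status 429.

```python
import time


def get_json(url, params, http_get=None, max_retries=3, pause=0.25, sleep=time.sleep):
    """GET a URL and return parsed JSON, backing off when the server replies 429."""
    if http_get is None:
        import requests  # imported lazily so a stub can be passed in instead
        http_get = requests.get

    for attempt in range(max_retries + 1):
        response = http_get(url, params=params)
        if response.status_code == 429 and attempt < max_retries:
            # Assumption: 429 means "too many requests"; wait longer on each retry
            sleep(pause * 2 ** attempt)
            continue
        response.raise_for_status()  # any other error, or a final 429, raises here
        sleep(pause)  # regular pause before the caller's next request
        return response.json()
```

With this helper, the body of the page loop would reduce to something like `movie_data.extend(get_json(endpoint, params)['results'])`.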

In [45]:
movie_data[:2]

[{'adult': False,
  'backdrop_path': '/kIBK5SKwgqIIuRKhhWrJn3XkbPq.jpg',
  'genre_ids': [28, 12, 878],
  'id': 99861,
  'original_language': 'en',
  'original_title': 'Avengers: Age of Ultron',
  'overview': 'When Tony Stark tries to jumpstart a dormant peacekeeping program, things go awry and Earth’s Mightiest Heroes are put to the ultimate test as the fate of the planet hangs in the balance. As the villainous Ultron emerges, it is up to The Avengers to stop him from enacting his terrible plans, and soon uneasy alliances and unexpected action pave the way for an epic and unique global adventure.',
  'popularity': 14.9911,
  'poster_path': '/4ssDuvEDkSArWEdyBl2X5EHvYKU.jpg',
  'release_date': '2015-04-22',
  'title': 'Avengers: Age of Ultron',
  'video': False,
  'vote_average': 7.271,
  'vote_count': 23837},
 {'adult': False,
  'backdrop_path': '/gqrnQA6Xppdl8vIb2eJc58VC1tW.jpg',
  'genre_ids': [28, 12, 878],
  'id': 76341,
  'original_language': 'en',
  'original_title': 'Mad Max: 

In [53]:
movie_titles = []
vote_counts = []
tmdb_ids = []
budgets = []
revenues = []

for movie in movie_data:
    movie_titles.append(movie['title'])
    vote_counts.append(movie['vote_count'])

    movie_id = movie['id']  # renamed to avoid shadowing the built-in id()
    tmdb_ids.append(movie_id)
    
    # Use the movie ID to look up budget and revenue
    endpoint = f'https://api.themoviedb.org/3/movie/{movie_id}'
    # Define params
    params = {
        'api_key': api_key,
    }
    # Get response and fail fast on HTTP errors
    response = requests.get(endpoint, params=params)
    response.raise_for_status()
    res = response.json()
    budgets.append(res['budget'])
    revenues.append(res['revenue'])

    # Sleep before next API call 
    time.sleep(0.25)
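
The cell above collects title, vote count, ID, budget, and revenue, while the task also asks for release year, genre(s), and vote average. The /movie/{movie_id} response already carries those fields, so one way to gather everything in a single pass is a small helper that flattens each details response into one row. This is a sketch only: `extract_fields` and the output column names are hypothetical, and joining genre names with commas is an assumption about how the final CSV should look.

```python
def extract_fields(details):
    """Flatten one /movie/{movie_id} response into a row with the required columns."""
    release_date = details.get('release_date') or ''
    return {
        'title': details['title'],
        # release_date is formatted 'YYYY-MM-DD'; keep only the year when present
        'release_year': int(release_date[:4]) if release_date else None,
        # genres is a list of {'id': ..., 'name': ...} dicts in the details response
        'genres': ', '.join(g['name'] for g in details.get('genres', [])),
        'vote_average': details['vote_average'],
        'vote_count': details['vote_count'],
        'budget': details['budget'],
        'revenue': details['revenue'],
        'tmdb_id': details['id'],
    }
```

Appending `extract_fields(res)` to a list inside the loop would replace the five parallel lists, and that list of rows could then be passed to `pandas.DataFrame(rows)` and written out with `to_csv('movies_2015_2024.csv', index=False)`.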