# Final Project for Course 3 - OMDB and TasteDive Mashup

This project will take you through the process of mashing up data from two different APIs to make movie recommendations. The TasteDive API lets you provide a movie (or a band, TV show, etc.) as a query input, and returns a set of related items. The OMDB API lets you provide a movie title as a query input and get back data about the movie, including scores from various review sites (Rotten Tomatoes, IMDb, etc.).

You will put those two together. You will use TasteDive to get related movies for a whole list of titles. You’ll combine the resulting lists of related movies and sort them by their Rotten Tomatoes scores, which will require making calls to the OMDB API.

To avoid problems with rate limits and site accessibility, we have provided a cache file with results for all the queries you need to make to both OMDB and TasteDive. Use `requests_with_caching.get()` rather than `requests.get()`. If you’re having trouble, you may not be formatting your queries properly, or you may not be asking for data that exists in our cache. We will try to provide as much information as we can to help guide you to form queries for which data exists in the cache. 

**IMPORTANT: If you run any query on either API that does not find its results in the "permanent cache", then you haven't written the right query. This will probably cause a later step in the process to fail.**



## Step 1: Retrieve related movies from TasteDive

Your first task will be to fetch data from TasteDive. The documentation for the API is at https://tastedive.com/read/api. In case the documentation has changed, you may want to access an [archived version](https://web.archive.org/web/20230530015616/https://tastedive.com/read/api). (Queries will still work even if the API has changed or disappeared because our cache is based on the archived version.)

Define a function called `get_movies_from_tastedive`. It should take one input parameter, a string that is the name of a movie or music artist. The function should return the 5 TasteDive results associated with that string; be sure to request only movies, not other kinds of media. The return value will be a Python dictionary with just one key, ‘Similar’.

Try invoking your function with the input “Black Panther”.

HINT: Include **only** `q`, `type`, and `limit` as parameters so that your query matches an entry in the cache. If any other parameters are included, the lookup will not find the cached data. Remember, you will *not* need an API key to complete the project, because all of the data is in the cache.

The cache includes data for the following queries:
<div class="alert alert-block alert-info">
<table style="width:100%">
  <tr>
    <th style="text-align:left">q</th>
    <th style="text-align:left">type</th>
    <th style="text-align:left">limit</th>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">< omitted ></td>
    <td style="text-align:left">< omitted ></td>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">< omitted ></td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">< omitted ></td>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Tony Bennett</td>
    <td style="text-align:left">< omitted ></td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Tony Bennett</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Captain Marvel</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Bridesmaids</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Sherlock Holmes</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>    
</table></div>
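
As a rough illustration of why the parameter set matters, the sketch below builds the query string for the last "Black Panther" row of the table using only the standard library. It assumes, without confirmation from the course materials, that the cache is keyed on the full request URL, so any extra parameter would produce a different key. `build_query_url` and the `extra` parameter are hypothetical names used only for this example; they are not part of `requests_with_caching` or the TasteDive API.

```python
from urllib.parse import urlencode

# Hypothetical helper for illustration only; not part of requests_with_caching.
def build_query_url(base_url, params):
    """Return base_url with params appended as a sorted query string."""
    # Sorting the items makes the result independent of dictionary order.
    return base_url + "?" + urlencode(sorted(params.items()))

cached = build_query_url(
    "https://tastedive.com/api/similar",
    {"q": "Black Panther", "type": "movies", "limit": 5},
)

# Adding any other parameter (here a made-up one called "extra") changes the
# URL, which under the stated assumption would no longer match the cache.
uncached = build_query_url(
    "https://tastedive.com/api/similar",
    {"q": "Black Panther", "type": "movies", "limit": 5, "extra": "value"},
)

print(cached)
print(cached == uncached)
```

The two URLs differ only by the extra parameter, which is enough to make them distinct keys if the cache compares whole URLs.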


In [1]:
import requests_with_caching

def get_movies_from_tastedive(movie_name):
    base_url = "https://tastedive.com/api/similar"
    params = {
        'q': movie_name,
        'type': 'movies',
        'limit': 5
    }
    response = requests_with_caching.get(base_url, params=params)
    return response.json()

# Test the function with "Black Panther"
print(get_movies_from_tastedive("Black Panther"))


found in permanent_cache
{'Similar': {'Info': [{'Name': 'Black Panther', 'Type': 'movie'}], 'Results': [{'Name': 'Captain Marvel', 'Type': 'movie'}, {'Name': 'Avengers: Infinity War', 'Type': 'movie'}, {'Name': 'Ant-Man And The Wasp', 'Type': 'movie'}, {'Name': 'The Fate Of The Furious', 'Type': 'movie'}, {'Name': 'Deadpool 2', 'Type': 'movie'}]}}


## Step 2: Extract the movie titles 
Next, you will need to write a function that extracts just the list of movie titles from a dictionary returned by `get_movies_from_tastedive`. Call the function `extract_movie_titles`.
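
Before calling the API, it can help to try the extraction on a literal dictionary. The sketch below uses the response shape copied from the "Black Panther" output, trimmed to two results. `titles_from_response` is a hypothetical name, and the use of `.get()` with defaults is one possible way to tolerate a response without results, not a requirement of the assignment.

```python
# Response shape copied from the "Black Panther" output, trimmed to two results.
sample_response = {
    "Similar": {
        "Info": [{"Name": "Black Panther", "Type": "movie"}],
        "Results": [
            {"Name": "Captain Marvel", "Type": "movie"},
            {"Name": "Avengers: Infinity War", "Type": "movie"},
        ],
    }
}

# Hypothetical helper for illustration only.
def titles_from_response(response):
    """Return the 'Name' of every entry under Similar -> Results."""
    # Fall back to empty containers so a response without results yields [].
    results = response.get("Similar", {}).get("Results", [])
    return [item["Name"] for item in results]

print(titles_from_response(sample_response))
print(titles_from_response({}))
```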

In [2]:
def extract_movie_titles(movie_dict):
    return [movie['Name'] for movie in movie_dict['Similar']['Results']]

# Test the function with "Tony Bennett" and "Black Panther"
print(extract_movie_titles(get_movies_from_tastedive("Tony Bennett")))
print(extract_movie_titles(get_movies_from_tastedive("Black Panther")))


found in permanent_cache
['The Startup Kids', 'Charlie Chaplin', 'Venus In Fur', 'Loving', 'The African Queen']
found in permanent_cache
['Captain Marvel', 'Avengers: Infinity War', 'Ant-Man And The Wasp', 'The Fate Of The Furious', 'Deadpool 2']


## Step 3: Get related titles for a movie
Next, you’ll write a function called `get_related_titles`. It takes a list of movie titles as input. It gets five related movies for each from TasteDive, extracts the titles for all of them, and combines them all into a single list. Don’t include the same movie twice. 

[Hint: make use of the two functions you already wrote; don't rewrite that code!]
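
One detail worth knowing when removing duplicates: converting to a `set` discards the original order, so the combined list may come back in a different order than the titles were collected. If order matters, a minimal alternative is `dict.fromkeys`, which keeps the first occurrence of each item. The sketch below runs on a small hand-written list rather than real TasteDive results, and `unique_in_order` is a hypothetical helper name.

```python
# Hypothetical helper for illustration only.
def unique_in_order(titles):
    """Drop repeated titles while keeping the first occurrence of each."""
    # Dictionary keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(titles))

# Hand-written list with repeats; not actual API output.
combined = ["Deadpool 2", "Venom", "Deadpool 2", "Inhumans", "Venom"]
print(unique_in_order(combined))
```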

In [3]:
def get_related_titles(movie_titles):
    related_titles = []
    for title in movie_titles:
        related = get_movies_from_tastedive(title)
        related_titles.extend(extract_movie_titles(related))
    return list(set(related_titles))  # Remove duplicates

# Test the function with ["Black Panther", "Captain Marvel"]
print(get_related_titles(["Black Panther", "Captain Marvel"]))


found in permanent_cache
found in permanent_cache
['The Fate Of The Furious', 'Deadpool 2', 'Black Panther', 'Venom', 'American Assassin', 'Captain Marvel', 'Inhumans', 'Avengers: Infinity War', 'Ant-Man And The Wasp']


## Step 4: Get OMDB data about each movie
Your next task will be to fetch data from OMDB. The documentation for the API is at https://www.omdbapi.com/. In case the API has changed or disappeared, you may want to consult an [archived version](https://web.archive.org/web/20230701045926/https://www.omdbapi.com/). (Queries will still work even if the API has changed or disappeared because our cache is based on the archived version.)

Define a function called `get_movie_data`. It takes in one parameter which is a string that represents the title of a movie you want to search for. The function should return a dictionary with information about that movie.

Again, use `requests_with_caching.get()`. For queries on movies that are already in the cache, you won’t need an API key. You will need to provide the following parameters: `t` and `r`. As with the TasteDive cache, be sure to include only those two parameters in order to retrieve existing data from the cache.

In [4]:
def get_movie_data(movie_title):
    base_url = "http://www.omdbapi.com/"
    params = {
        't': movie_title,
        'r': 'json'
    }
    response = requests_with_caching.get(base_url, params=params)
    return response.json()

# Test the function with "Venom" and "Baby Mama"
print(get_movie_data("Venom"))
print(get_movie_data("Baby Mama"))


found in permanent_cache
{'Title': 'Venom', 'Year': '2018', 'Rated': 'PG-13', 'Released': '05 Oct 2018', 'Runtime': '112 min', 'Genre': 'Action, Adventure, Sci-Fi', 'Director': 'Ruben Fleischer', 'Writer': 'Jeff Pinkner, Scott Rosenberg, Kelly Marcel', 'Actors': 'Tom Hardy, Michelle Williams, Riz Ahmed', 'Plot': 'A failed reporter is bonded to an alien entity, one of many symbiotes who have invaded Earth. But the being takes a liking to Earth and decides to protect it.', 'Language': 'English, Mandarin, Malay', 'Country': 'United States, China', 'Awards': '2 wins & 9 nominations', 'Poster': 'https://m.media-amazon.com/images/M/MV5BNTFkZjdjN2QtOGE5MS00ZTgzLTgxZjAtYzkyZWQ5MjEzYmZjXkEyXkFqcGdeQXVyMTM0NTUzNDIy._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '6.6/10'}, {'Source': 'Rotten Tomatoes', 'Value': '30%'}, {'Source': 'Metacritic', 'Value': '35/100'}], 'Metascore': '35', 'imdbRating': '6.6', 'imdbVotes': '537,584', 'imdbID': 'tt1270797', 'Type': 'movie', 'D

## Step 5: Extract Rotten Tomatoes Rating
Now write a function called `get_movie_rating`. It takes an OMDB dictionary result for one movie and extracts the Rotten Tomatoes rating as an integer. For example, if given the OMDB dictionary for “Black Panther”, it would return 97. If there is no Rotten Tomatoes rating, return 0.
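
The parsing step can be checked offline first. The sketch below uses the `Ratings` list copied from the "Venom" output shown earlier and converts the percentage string to an integer. `rotten_tomatoes_score` is a hypothetical name that takes the ratings list directly rather than the whole OMDB dictionary, so it illustrates the idea rather than the required function signature.

```python
# Ratings list copied from the "Venom" output shown earlier.
venom_ratings = [
    {"Source": "Internet Movie Database", "Value": "6.6/10"},
    {"Source": "Rotten Tomatoes", "Value": "30%"},
    {"Source": "Metacritic", "Value": "35/100"},
]

# Hypothetical helper for illustration only.
def rotten_tomatoes_score(ratings):
    """Return the Rotten Tomatoes percentage as an int, or 0 if absent."""
    for entry in ratings:
        if entry["Source"] == "Rotten Tomatoes":
            # "30%" -> "30" -> 30
            return int(entry["Value"].rstrip("%"))
    return 0

print(rotten_tomatoes_score(venom_ratings))
# A list with only the IMDb entry has no Rotten Tomatoes rating.
print(rotten_tomatoes_score(venom_ratings[:1]))
```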

In [5]:
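def get_movie_rating(movie_data):
    # Use an empty list if the response has no 'Ratings' key at all
    ratings = movie_data.get('Ratings', [])
    for rating in ratings:
        if rating['Source'] == 'Rotten Tomatoes':
            # Values look like '84%'; strip the percent sign before converting
            return int(rating['Value'].replace('%', ''))
    return 0  # Return 0 if Rotten Tomatoes rating is not found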

# Test the function with "Deadpool 2"
print(get_movie_rating(get_movie_data("Deadpool 2")))


found in permanent_cache
84


## Step 6: Put it all together to make a recommender system
Now, you’ll put it all together. Define a function `get_sorted_recommendations`. It takes a list of movie titles as an input. It returns a sorted list of related movie titles as output, up to five related movies for each input movie title. The movies should be sorted in descending order by their Rotten Tomatoes rating, as returned by the `get_movie_rating` function. Break ties in reverse alphabetic order, so that ‘Yahşi Batı’ comes before ‘Eyyvah Eyvah’.

(Hint 1: remember that if you sort based on a tuple, the second attribute of the tuple is used to break ties.)

(Hint 2: we made the sort order easier for you, not harder, by specifying that ties be broken in reverse alphabetic order. The primary sort attribute, the Rotten Tomatoes rating, is also in reverse order, from highest to lowest.)
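
The two hints can be seen in a small offline sketch. Because both the rating and the title are sorted in reverse, a single `(rating, title)` tuple key with `reverse=True` is enough. The ratings below are made up for illustration (only the two tied titles come from the task text), and `Example Film` is a placeholder title.

```python
# Made-up ratings for illustration; the two tied titles come from the task text.
ratings = {"Eyyvah Eyvah": 80, "Yahşi Batı": 80, "Example Film": 95}

# Sort by (rating, title); reverse=True flips both parts of the tuple,
# so higher ratings come first and ties are broken in reverse alphabetical order.
ranked = sorted(ratings, key=lambda title: (ratings[title], title), reverse=True)
print(ranked)
```

Had the tie-break been in ordinary alphabetical order, the two parts of the key would need opposite directions, for example by negating the rating instead of using `reverse=True`.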

In [6]:
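def get_sorted_recommendations(movie_titles):
    related_titles = get_related_titles(movie_titles)
    # Sort by (rating, title) in descending order; the title breaks ties in reverse alphabetic order
    related_titles.sort(key=lambda title: (get_movie_rating(get_movie_data(title)), title), reverse=True)
    # Return every related title: up to five per input title, with no overall cap
    return related_titles

# Test the function with ["Bridesmaids", "Sherlock Holmes"], printing the five highest-rated titles
print(get_sorted_recommendations(["Bridesmaids", "Sherlock Holmes"])[:5])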


found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
['Date Night', 'The Heat', 'The Five-Year Engagement', 'Baby Mama', 'Sherlock Holmes: A Game Of Shadows']
