# Final Project for Course 3 - OMDB and TasteDive Mashup

This project will take you through the process of mashing up data from two different APIs to make movie recommendations. The TasteDive API lets you provide a movie (or a band, TV show, etc.) as a query input, and returns a set of related items. The OMDB API lets you provide a movie title as a query input and get back data about the movie, including scores from various review sites (Rotten Tomatoes, IMDB, etc.).

You will put those two together. You will use TasteDive to get related movies for a whole list of titles. You’ll combine the resulting lists of related movies and sort them according to their Rotten Tomatoes scores (which will require making calls to the OMDB API).

To avoid problems with rate limits and site accessibility, we have provided a cache file with results for all the queries you need to make to both OMDB and TasteDive. Use `requests_with_caching.get()` rather than `requests.get()`. If you’re having trouble, you may not be formatting your queries properly, or you may not be asking for data that exists in our cache. We will try to provide as much information as we can to help guide you to form queries for which data exists in the cache. 

**IMPORTANT: If you run any query on either API that does not find its results in the "permanent cache", then you haven't written the right query. This will probably cause a later step in the process to fail.**



## Step 1: Retrieve related movies from TasteDive

Your first task will be to fetch data from TasteDive. The documentation for the API is at https://tastedive.com/read/api. In case the documentation has changed, you may want to access an [archived version](https://web.archive.org/web/20230530015616/https://tastedive.com/read/api). (Queries will still work even if the API has changed or disappeared because our cache is based on the archived version.)

Define a function called `get_movies_from_tastedive`. It should take one input parameter, a string that is the name of a movie or music artist. The function should return the 5 TasteDive results that are associated with that string; be sure to get only movies, not other kinds of media. The return value will be a Python dictionary with just one key, `Similar`.

Try invoking your function with the input “Black Panther”.

HINT: Be sure to include **only** `q`, `type`, and `limit` as parameters in order to extract data from the cache. If any other parameters are included, the lookup will not match the data that you’re attempting to pull from the cache. Remember, you will *not* need an API key to complete the project, because all data will be found in the cache.

The cache includes data for the following queries:
<div class="alert alert-block alert-info">
<table style="width:100%">
  <tr>
    <th style="text-align:left">q</th>
    <th style="text-align:left">type</th>
    <th style="text-align:left">limit</th>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">< omitted ></td>
    <td style="text-align:left">< omitted ></td>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">< omitted ></td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">< omitted ></td>
  </tr>
  <tr>
    <td style="text-align:left">Black Panther</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Tony Bennett</td>
    <td style="text-align:left">< omitted ></td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Tony Bennett</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Captain Marvel</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Bridesmaids</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>
  <tr>
    <td style="text-align:left">Sherlock Holmes</td>
    <td style="text-align:left">movies</td>
    <td style="text-align:left">5</td>
  </tr>    
</table></div>
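
To see why the exact set of parameters matters, it can help to look at how a parameter dictionary turns into a query string. The sketch below uses only the standard library. The `cache_key` helper and the `extra` parameter are hypothetical: the sketch simply assumes a cache keyed on the base URL plus the sorted parameters, which may not be how `requests_with_caching` works internally.

```python
from urllib.parse import urlencode

BASE_URL = "https://tastedive.com/api/similar"


def cache_key(base_url, params):
    """Hypothetical cache key: base URL plus the parameters in sorted order."""
    return base_url + "?" + urlencode(sorted(params.items()))


# A query that matches a row of the table above.
good = cache_key(BASE_URL, {"q": "Black Panther", "type": "movies", "limit": 5})

# One extra (made-up) parameter changes the key, so a cache keyed this way
# would not find the stored entry.
bad = cache_key(
    BASE_URL,
    {"q": "Black Panther", "type": "movies", "limit": 5, "extra": "value"},
)

print(good)
print(good == bad)
```

Under that assumption, any parameter beyond `q`, `type`, and `limit` yields a different key than the one stored in the cache.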


In [1]:
# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movies_from_tastedive("Bridesmaids")
# get_movies_from_tastedive("Black Panther")

### BEGIN SOLUTION
import requests_with_caching


def get_movies_from_tastedive(name):
    params = {"q": name, "type": "movies", "limit": 5}
    ret = requests_with_caching.get(
        "https://tastedive.com/api/similar", params=params
    ).json()
    return ret


### END SOLUTION

In [2]:
### BEGIN HIDDEN TESTS
results = get_movies_from_tastedive("Bridesmaids")["Similar"]["Info"][0]["Name"]
expected = "Bridesmaids"
assert type(get_movies_from_tastedive("Bridesmaids")) == type(
    {}
), "get_movies_from_tastedive('Bridesmaids') did not return a dictionary."
assert (
    results == expected
), "The results for get_movies_from_tastedive('Bridesmaids') are not correct."

movies = get_movies_from_tastedive("Black Panther")["Similar"]["Results"]
assert len(movies) <= 5, "get_movies_from_tastedive returned more than 5 results"

results = get_movies_from_tastedive("Tony Bennett")
non_movies = filter((lambda x: x["Type"] != "movie"), results["Similar"]["Results"])
assert (
    len(list(non_movies)) == 0
), "get_movies_from_tastedive did not retrieve only movies"
### END HIDDEN TESTS

found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache


## Step 2: Extract the movie titles 
Next, you will need to write a function that extracts just the list of movie titles from a dictionary returned by `get_movies_from_tastedive`. Call the function `extract_movie_titles`.
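
As a rough illustration of the nesting involved, here is a hand-written dictionary shaped like the response the tests in this notebook rely on (`Similar`, then `Info` and `Results`, each a list of items with `Name` and `Type`). The titles in `sample_response` are placeholders for illustration, not the actual cache contents.

```python
# Hand-written stand-in for a TasteDive response; the values are illustrative.
sample_response = {
    "Similar": {
        "Info": [{"Name": "Black Panther", "Type": "movie"}],
        "Results": [
            {"Name": "Captain Marvel", "Type": "movie"},
            {"Name": "Avengers: Infinity War", "Type": "movie"},
        ],
    }
}

# Walk down to the list of results and keep only each item's name.
titles = [item["Name"] for item in sample_response["Similar"]["Results"]]
print(titles)
```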

In [3]:
# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# extract_movie_titles(get_movies_from_tastedive("Tony Bennett"))
# extract_movie_titles(get_movies_from_tastedive("Black Panther"))

### BEGIN SOLUTION
def extract_movie_titles(res_json):
    lst_res = res_json["Similar"]["Results"]
    ret = [i["Name"] for i in lst_res]
    return ret


### END SOLUTION

In [4]:
### BEGIN HIDDEN TESTS
results = [
    "The Startup Kids",
    "Charlie Chaplin",
    "Venus In Fur",
    "Loving",
    "The African Queen",
]
assert (
    extract_movie_titles(get_movies_from_tastedive("Tony Bennett")) == results
), 'Correct results are not returned for extract_movie_titles(get_movies_from_tastedive("Tony Bennett")).'
### END HIDDEN TESTS

found in permanent_cache


## Step 3: Get related titles for a movie
Next, you’ll write a function called `get_related_titles`. It takes a list of movie titles as input. It gets five related movies for each from TasteDive, extracts the titles for all of them, and combines them all into a single list. Don’t include the same movie twice. 

[Hint: make use of the two functions you already wrote; don't rewrite that code!]
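
One possible way to combine several lists without repeating a title is sketched below. To keep it runnable offline, it uses a hypothetical `fake_related` lookup and made-up single-letter titles in place of the real TasteDive functions; `merge_unique` is likewise an illustrative name, not part of the assignment.

```python
# Hypothetical stand-in for the TasteDive lookup so the sketch runs offline.
FAKE_RELATED = {
    "A": ["X", "Y", "Z"],
    "B": ["Y", "W"],
}


def fake_related(title):
    """Return the made-up related titles for a title, or an empty list."""
    return FAKE_RELATED.get(title, [])


def merge_unique(titles, lookup):
    """Combine related titles for every input, keeping first-seen order."""
    seen = {}  # dicts preserve insertion order, so keys act as an ordered set
    for title in titles:
        for related in lookup(title):
            seen.setdefault(related, None)
    return list(seen)


merged = merge_unique(["A", "B"], fake_related)
print(merged)
```

Here "Y" is related to both inputs but appears only once in the result, and an empty input list produces an empty result.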

In [5]:
# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_related_titles(["Black Panther", "Captain Marvel"])
# get_related_titles([])

### BEGIN SOLUTION
def get_related_titles(lst_movie_names):
    ret = set()
    for movie_name in lst_movie_names:
        ret.update(extract_movie_titles(get_movies_from_tastedive(movie_name)))
    return list(ret)


### END SOLUTION


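### BEGIN SOLUTION
# Alternative solution that keeps titles in first-seen order.
# Because it is defined second, it replaces the set-based version above.
def get_related_titles(lst_movie_names):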
    results = []
    for movie_name in lst_movie_names:
        for movie in extract_movie_titles(get_movies_from_tastedive(movie_name)):
            if movie not in results:
                results.append(movie)

    return results


### END SOLUTION

In [6]:
### BEGIN HIDDEN TESTS
assert (
    get_related_titles([]) == []
), "The correct response is not returned when no titles are included."
actual_results_as_list = get_related_titles(["Black Panther", "Captain Marvel"])
actual_results_as_set = set(actual_results_as_list)
assert len(list(actual_results_as_set)) == len(
    actual_results_as_list
), "it looks like you included some titles more than once"
expected_results_from_cache = set(
    [
        "Captain Marvel",
        "Avengers: Infinity War",
        "Ant-Man And The Wasp",
        "The Fate Of The Furious",
        "Deadpool 2",
        "Inhumans",
        "Venom",
        "American Assassin",
        "Black Panther",
    ]
)
assert (
    actual_results_as_set == expected_results_from_cache
), "The correct response is not returned when searching for Black Panther and Captain Marvel."
### END HIDDEN TESTS

found in permanent_cache
found in permanent_cache


## Step 4: Get OMDB data about each movie
Your next task will be to fetch data from OMDB. The documentation for the API is at https://www.omdbapi.com/. In case the API has changed or disappeared, you may want to consult an [archived version](https://web.archive.org/web/20230701045926/https://www.omdbapi.com/). (Queries will still work even if the API has changed or disappeared because our cache is based on the archived version.)

Define a function called `get_movie_data`. It takes in one parameter which is a string that represents the title of a movie you want to search for. The function should return a dictionary with information about that movie.

Again, use `requests_with_caching.get()`. For queries on movies that are already in the cache, you won’t need an API key. You will need to provide the following parameters: `t` and `r`. As with the TasteDive cache, be sure to include only those two parameters in order to extract existing data from the cache.
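
If a query misses the cache, a quick way to rule out stray parameters is to compare the keys of the parameter dictionary against the allowed set. The helper below is a hypothetical debugging aid, not part of the assignment; `unexpected_keys` and `ALLOWED_OMDB_KEYS` are names invented for this sketch.

```python
# The two parameter names this step asks for.
ALLOWED_OMDB_KEYS = {"t", "r"}


def unexpected_keys(params, allowed=ALLOWED_OMDB_KEYS):
    """Return the parameter names that are not in the allowed set, sorted."""
    return sorted(set(params) - allowed)


ok = unexpected_keys({"t": "Venom", "r": "json"})
extra = unexpected_keys({"t": "Venom", "r": "json", "apikey": "placeholder"})

print(ok)
print(extra)
```

An empty list means the dictionary contains nothing beyond `t` and `r`.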

In [7]:
# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movie_data("Venom")
# get_movie_data("Baby Mama")

### BEGIN SOLUTION
def get_movie_data(name):
    params = {"t": name, "r": "json"}
    return requests_with_caching.get(
        "http://www.omdbapi.com/",  # NOTE: http instead of https
        params=params,
    ).json()


### END SOLUTION

In [8]:
### BEGIN HIDDEN TESTS
assert type(get_movie_data("Venom")) == type(
    {}
), "The correct python type is not returned."
assert (
    get_movie_data("Baby Mama")["Title"] == "Baby Mama"
), "The results do not match the query."
### END HIDDEN TESTS

found in permanent_cache
found in permanent_cache


## Step 5: Extract Rotten Tomatoes Rating
Now write a function called `get_movie_rating`. It takes an OMDB dictionary result for one movie and extracts the Rotten Tomatoes rating as an integer. For example, if given the OMDB dictionary for “Black Panther”, it would return 97. If there is no Rotten Tomatoes rating, return 0.
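
The conversion from a percentage string to an integer can be sketched on a hand-written dictionary. `sample_movie` below only imitates the relevant part of an OMDB result: the Rotten Tomatoes value follows the example above, the other entry is illustrative, and `rotten_tomatoes_score` is a name chosen for this sketch.

```python
# Hand-written stand-in for part of an OMDB result; values are illustrative.
sample_movie = {
    "Title": "Black Panther",
    "Ratings": [
        {"Source": "Internet Movie Database", "Value": "7.3/10"},
        {"Source": "Rotten Tomatoes", "Value": "97%"},
    ],
}


def rotten_tomatoes_score(movie):
    """Return the Rotten Tomatoes percentage as an int, or 0 if absent."""
    for rating in movie.get("Ratings", []):
        if rating["Source"] == "Rotten Tomatoes":
            # "97%" -> "97" -> 97
            return int(rating["Value"].rstrip("%"))
    return 0


print(rotten_tomatoes_score(sample_movie))
print(rotten_tomatoes_score({"Title": "No Ratings Here"}))
```

Using `movie.get("Ratings", [])` is a design choice in this sketch so that a dictionary without a `Ratings` key also falls through to 0.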

In [9]:
# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movie_rating(get_movie_data("Deadpool 2"))

### BEGIN SOLUTION
def get_movie_rating(movie_data):
    lst_ratings = movie_data["Ratings"]
    for ratings in lst_ratings:
        if ratings["Source"] == "Rotten Tomatoes":
            return int(ratings["Value"][:-1])
    return 0


### END SOLUTION

In [11]:
### BEGIN HIDDEN TESTS
assert type(get_movie_rating(get_movie_data("Deadpool 2"))) == type(
    9
), "a rating is not returned."
assert (
    get_movie_rating(get_movie_data("Venom")) == 30
), "The movie rating is not accurate for Venom."
assert (
    get_movie_rating(get_movie_data("Deadpool 2")) == 84
), "The movie rating for 'Deadpool 2' is not correct."
### END HIDDEN TESTS

found in permanent_cache
found in permanent_cache
found in permanent_cache


## Step 6: Put it all together to make a recommender system
Now, you’ll put it all together. Define a function `get_sorted_recommendations`. It takes a list of movie titles as an input. It returns a sorted list of related movie titles as output, up to five related movies for each input movie title. The movies should be sorted in descending order by their Rotten Tomatoes rating, as returned by the `get_movie_rating` function. Break ties in reverse alphabetic order, so that ‘Yahşi Batı’ comes before ‘Eyyvah Eyvah’.

(Hint 1: remember that if you sort based on a tuple, the second attribute of the tuple is used to break ties.)
(Hint 2: we made the sort order easier for you, not harder, by specifying that ties be broken in reverse alphabetic order. The primary sort attribute, the Rotten Tomatoes rating, is also in reverse order, from highest to lowest.)
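
The two hints can be tried out on a small dictionary of ratings. The numbers below are made up for illustration and are not taken from OMDB; only the tuple-sorting behaviour is the point.

```python
# Made-up ratings for illustration only.
ratings = {
    "Eyyvah Eyvah": 0,
    "Yahşi Batı": 0,
    "Date Night": 66,
    "Baby Mama": 64,
}

# Sort by (rating, title). With reverse=True both parts are reversed:
# higher ratings come first, and tied ratings fall back to reverse
# alphabetical order of the title.
ordered = sorted(ratings, key=lambda title: (ratings[title], title), reverse=True)
print(ordered)
```

Because both sort attributes run in the same (reversed) direction, a single `reverse=True` is enough and no custom comparison function is needed.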

In [12]:
# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_sorted_recommendations(["Bridesmaids", "Sherlock Holmes"])

### BEGIN SOLUTION
# import functools
# def compare(elem1, elem2):
#     if elem1['rating'] > elem2['rating']:
#         return -1
#     elif elem1['rating'] < elem2['rating']:
#         return 1
#     else: # ties
#         if elem1['name'] > elem2['name']:
#             return -1
#         else:
#             return 1
def get_sorted_recommendations(lst_movie_names):
    lst_related = get_related_titles(lst_movie_names)
    return sorted(
        lst_related,
        key=lambda title: (get_movie_rating(get_movie_data(title)), title),
        reverse=True,
    )
    # lst_movies = []
    # for rel_name in lst_related:
    #     lst_movies.append({'name': rel_name, 'rating': get_movie_rating(get_movie_data(rel_name))})
    # # lst_movies.sort(key=lambda x: (x['rating'], x['name']), reverse=True)
    # lst_movies.sort(key=functools.cmp_to_key(compare))
    # return [m['name'] for m in lst_movies]


### END SOLUTION

In [13]:
### BEGIN HIDDEN TESTS
sample_actual_recommendations = get_sorted_recommendations(
    ["Bridesmaids", "Sherlock Holmes"]
)
assert type(sample_actual_recommendations) == type(
    []
), "The correct python type is not returned."
sample_expected_recommendations = [
    "Date Night",
    "The Heat",
    "The Five-Year Engagement",
    "Baby Mama",
    "Sherlock Holmes: A Game Of Shadows",
    "Bachelorette",
    "Prince Of Persia: The Sands Of Time",
    "Pirates Of The Caribbean: On Stranger Tides",
    "Yahşi Batı",
    "Eyyvah Eyvah",
]
assert (
    sample_actual_recommendations == sample_expected_recommendations
), "The actual value returned is not the expected value."
### END HIDDEN TESTS

found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
found in permanent_cache
