In [1]:
!pip install fastapi





# Final Project for Course 3 - OMDb and Dad Jokes Mashup

This project will take you through the process of mashing up data from two different APIs.  

[The OMDb API](https://www.omdbapi.com/) lets you provide a movie title as a query input and get back data about the movie, including scores from various review sites (Rotten Tomatoes, IMDb, etc.).

[icanhazdadjoke.com](https://icanhazdadjoke.com/) returns random dad jokes containing a search string that you specify in your query.

The end result of this project will be a function that takes a movie title as input and produces a formatted text string that includes a couple of dad jokes related to a word from the movie's plot.

For example, here are a couple of sample outputs:

---

```
Baby Mama
Rotten Tomatoes rating: 63%
A successful, single businesswoman who dreams of having a baby discovers she is infertile and hires a **working** class woman to be her unlikely surrogate.
Speaking of **working**, that reminds me of some jokes.
Hope they're better than the movie!

Want to hear a joke about construction? Nah, I'm still **working** on it.
Why doesn't the Chimney-Sweep call out sick from work? Because he's used to **working** with a flue.
```

---

```
Back to the Future
Rotten Tomatoes rating: 93%
Marty McFly, a 17-year-old high school student, is **accidentally** sent 30 years into the past in a time-traveling DeLorean invented by his close friend, the maverick scientist Doc Brown.
Speaking of **accidentally**, that reminds me of some jokes.
Hope they're as good as the movie!

I **accidentally** drank a bottle of invisible ink. Now I’m in hospital, waiting to be seen.
A butcher **accidentally** backed into his meat grinder and got a little behind in his work that day.
```
---


To avoid problems with rate limits and site accessibility, we have provided a cache file with results for all the queries you need to make to both OMDb and icanhazdadjoke. Just use `requests_with_caching.get()` rather than `requests.get()`. If you're having trouble, you may not be formatting your queries properly, or you may be asking for data that does not exist in our cache. We will provide as much information as we can to help you form queries for which data exists in the cache.

## ALERT: All Query Results Should Be Found in the Cache File
If you ever run `requests_with_caching.get()` and it says either of the following, then **something was wrong** with your query:
- new; adding to cache
- found in page-specific cache


# Fetching movie info from OMDb
Your first task will be to fetch data from OMDb. The documentation for the API is at [https://www.omdbapi.com/](https://www.omdbapi.com/)

Define a function called `get_movie_data`. It takes in one parameter which is a string that should represent the title of a movie you want to search. The function should return a dictionary with information about that movie.

Again, use `requests_with_caching.get()`. If you were to use the live OMDb API, you would need to get an API key, as described in the documentation. However, **you do not need an API key** to complete this assignment, because the queries for the movies listed below are already in the permanent cache.

HINT: Be sure to include **only** keys `t` and `r` as query parameters in order to extract data from the cache. If any other parameters are included, the function will not be able to recognize the data that you're attempting to pull from the cache.

The following movie titles are in the cache:
- Black Panther
- Venom
- Baby Mama
- Sherlock Holmes
- Return of the Jedi
- Back to the Future
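
One way to see why extra query parameters lead to a cache miss: a cache of this kind usually looks up a stored response by a key built from the URL and its parameters. The sketch below is an assumption about how such a lookup could work, not the actual `requests_with_caching` implementation. `cache_key` is a hypothetical helper, and the `apikey` value is a placeholder.

```python
from urllib.parse import urlencode

def cache_key(base_url, params):
    """Build a deterministic lookup key from a URL and its query parameters.

    Hypothetical scheme for illustration only: parameters are sorted by name
    so that the same request always produces the same key.
    """
    query = urlencode(sorted(params.items()))
    return f"{base_url}?{query}"

# Key for a request that sends only `t` and `r`
key_expected = cache_key("https://www.omdbapi.com/", {"t": "Black Panther", "r": "json"})

# Key for the same request with one extra parameter (placeholder value)
key_extra = cache_key(
    "https://www.omdbapi.com/",
    {"t": "Black Panther", "r": "json", "apikey": "PLACEHOLDER"},
)

print(key_expected)
print(key_expected == key_extra)  # False: the extra parameter changes the key
```

Under this assumption, a request with any additional parameter maps to a different key, so the stored response is never found.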

In [2]:
import requests
import json
from fastapi import HTTPException

# OMDb API key (assigned value substituted here)
OMDB_API_KEY = "e1a3e94a"
OMDB_BASE_URL = "http://www.omdbapi.com/"

def get_movie_data(title: str) -> dict:
    """
    Fetch movie data from the OMDb API using a movie title.
    Returns a dictionary with the movie information.
    """

    # Parameters sent to the OMDb API
    params = {
        "t": title,          # Movie title to search
        "apikey": OMDB_API_KEY
    }

    # Send GET request to OMDb
    response = requests.get(OMDB_BASE_URL, params=params)

    # Handle communication errors with the API
    if response.status_code != 200:
        raise HTTPException(status_code=502, detail="Error communicating with OMDb API")

    # Convert JSON response to a Python dictionary
    data = response.json()

    # Handle case where the movie is not found
    if data.get("Response") == "False":
        raise HTTPException(status_code=404, detail=data.get("Error", "Movie not found"))

    # Return the movie data dictionary
    return data


In [3]:
movie = get_movie_data("Black Panther")

print(type(movie))
print(movie)
print(movie["Year"])


<class 'dict'>
{'Title': 'Black Panther', 'Year': '2018', 'Rated': 'PG-13', 'Released': '16 Feb 2018', 'Runtime': '134 min', 'Genre': 'Action, Adventure, Sci-Fi', 'Director': 'Ryan Coogler', 'Writer': 'Ryan Coogler, Joe Robert Cole, Stan Lee', 'Actors': "Chadwick Boseman, Michael B. Jordan, Lupita Nyong'o", 'Plot': "T'Challa, heir to the hidden but advanced kingdom of Wakanda, must step forward to lead his people into a new future and must confront a challenger from his country's past.", 'Language': 'English, Swahili, Nama, Xhosa, Korean', 'Country': 'United States', 'Awards': 'Won 3 Oscars. 124 wins & 289 nominations total', 'Poster': 'https://m.media-amazon.com/images/M/MV5BMTg1MTY2MjYzNV5BMl5BanBnXkFtZTgwMTc4NTMwNDI@._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '7.3/10'}, {'Source': 'Rotten Tomatoes', 'Value': '96%'}, {'Source': 'Metacritic', 'Value': '88/100'}], 'Metascore': '88', 'imdbRating': '7.3', 'imdbVotes': '894,729', 'imdbID': 'tt1825683', 'Typ

In [4]:
# Loop through each key-value pair in the movie dictionary
for key, value in movie.items():
    
    # Special handling for the Ratings field (list of dictionaries)
    if key == "Ratings":
        print("Ratings:")
        for rating in value:
            print(f"  {rating['Source']}: {rating['Value']}")
    
    # Print all other movie fields normally
    else:
        print(f"{key}: {value}")


Title: Black Panther
Year: 2018
Rated: PG-13
Released: 16 Feb 2018
Runtime: 134 min
Genre: Action, Adventure, Sci-Fi
Director: Ryan Coogler
Writer: Ryan Coogler, Joe Robert Cole, Stan Lee
Actors: Chadwick Boseman, Michael B. Jordan, Lupita Nyong'o
Plot: T'Challa, heir to the hidden but advanced kingdom of Wakanda, must step forward to lead his people into a new future and must confront a challenger from his country's past.
Language: English, Swahili, Nama, Xhosa, Korean
Country: United States
Awards: Won 3 Oscars. 124 wins & 289 nominations total
Poster: https://m.media-amazon.com/images/M/MV5BMTg1MTY2MjYzNV5BMl5BanBnXkFtZTgwMTc4NTMwNDI@._V1_SX300.jpg
Ratings:
  Internet Movie Database: 7.3/10
  Rotten Tomatoes: 96%
  Metacritic: 88/100
Metascore: 88
imdbRating: 7.3
imdbVotes: 894,729
imdbID: tt1825683
Type: movie
DVD: N/A
BoxOffice: $700,426,566
Production: N/A
Website: N/A
Response: True


## Extract the Rotten Tomatoes Rating for a Movie
Next, you will write a function that extracts the Rotten Tomatoes rating for a movie from the results dictionary as an *integer*. If the movie does not have a Rotten Tomatoes rating, return `-1`.

Fill in the template for the function below.

In [5]:
def rt_rating(result):
    
    """Returns the Rotten Tomatoes rating from a dictionary of movie information.

    Parameters
    ----------
    result : dict
        A dictionary of movie information.

    Returns
    -------
    int
        The Rotten Tomatoes rating. For example, 75% would be returned as the integer 75.
    """

    # Check if 'Ratings' key exists in the movie dictionary
    if "Ratings" in result:
        for rating in result["Ratings"]:
            if rating["Source"] == "Rotten Tomatoes":
                return int(rating["Value"].strip('%'))  # Convert "96%" to 96
    return -1  # Return -1 if not found


# Extract Rotten Tomatoes Rating
rating = rt_rating(movie)
print(f'The Rotten Tomatoes rating for the movie {movie["Title"]} is: {rating}%')

# We suggest that you write an assert statement to check the output of your function for the movie Black Panther. The autograder will check results for some other movies.
result = get_movie_data("Black Panther")
#assert rt_rating(result) == 96

The Rotten Tomatoes rating for the movie Black Panther is: 96%
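
The `-1` branch is not exercised by any of the movies above, so it can be useful to check it without calling the API. The sketch below uses hand-built dictionaries: the first mirrors part of the Black Panther ratings shown earlier, and the other two are made-up inputs for illustration. The compact `rt_rating` here is one possible equivalent of the function above, repeated so the block runs on its own.

```python
def rt_rating(result):
    """Return the Rotten Tomatoes rating as an int, or -1 if it is missing."""
    for rating in result.get("Ratings", []):
        if rating["Source"] == "Rotten Tomatoes":
            return int(rating["Value"].strip("%"))  # "96%" -> 96
    return -1

# Mirrors two of the Black Panther ratings printed earlier
with_rt = {
    "Ratings": [
        {"Source": "Internet Movie Database", "Value": "7.3/10"},
        {"Source": "Rotten Tomatoes", "Value": "96%"},
    ]
}

# Made-up inputs: ratings present but no Rotten Tomatoes entry, and no 'Ratings' key at all
without_rt = {"Ratings": [{"Source": "Internet Movie Database", "Value": "7.3/10"}]}
no_ratings_key = {"Title": "Untitled"}

print(rt_rating(with_rt))
print(rt_rating(without_rt))
print(rt_rating(no_ratings_key))
```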


# Fetching Jokes
Now you will use another API to fetch a couple of dad jokes related to a movie's plot.

You will do this in two stages. First you'll implement a helper function that calls the API to get jokes, asking for jokes related to a single word.

Then you'll use that helper function, calling it with the longest words from the plot summary until it finds one that there are jokes for.
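
The second stage is essentially a "first match wins" loop. As a rough sketch, assuming the words are already ordered from longest to shortest, it could look like the code below. `first_word_with_jokes` and `fake_lookup` are hypothetical names, and the canned dictionary stands in for the joke API so the example runs offline.

```python
def first_word_with_jokes(words, lookup):
    """Return (word, jokes) for the first word whose lookup is non-empty.

    `lookup` is any function that maps a word to a list of joke strings.
    Returns (None, None) when no word has jokes.
    """
    for word in words:
        jokes = lookup(word)
        if jokes:
            return word, jokes
    return None, None

# Stand-in for the API: canned results with a placeholder joke string
canned = {"cat": ["(a joke containing 'cat')"]}

def fake_lookup(word):
    return canned.get(word, [])

# Words already sorted from longest to shortest
found = first_word_with_jokes(["dreams", "had", "cat"], fake_lookup)
missing = first_word_with_jokes(["dreams", "had"], fake_lookup)

print(found)
print(missing)
```

Passing the lookup in as an argument keeps the search logic testable without network access.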


## Search for Jokes Containing a Word
To search for dad jokes, you'll be using the icanhazdadjoke API. The documentation for the API is at [https://icanhazdadjoke.com/api](https://icanhazdadjoke.com/api)

Define a function called `get_joke_data`. It takes in one parameter which is a string. The function should return a dictionary with information about **up to two** jokes that contain that string.

Again, use `requests_with_caching.get()`. All the query results you need are already in the permanent cache.

- Note 1: Be sure to include **only** keys `term` and `limit` as query parameters in order to extract data from the cache. If any other parameters are included, the function will not be able to recognize the data that you're attempting to pull from the cache.
- Note 2: Use the `limit` parameter in the icanhazdadjoke API to limit the response to two results (instead of slicing).

In [6]:
import json
import requests  # Used to send HTTP requests to the joke API

# Function to fetch jokes that match a given search term
def get_joke_data(search_term):
    # Base URL for the icanhazdadjoke search endpoint
    base_url = "https://icanhazdadjoke.com/search"
    
    # Query parameters: search term and limit on number of results
    params = {"term": search_term, "limit": 2}
    
    # Header specifying that the response should be in JSON format
    headers = {"Accept": "application/json"}
    
    # Send GET request with parameters and headers
    response = requests.get(base_url, params=params, headers=headers)
    
    # Convert JSON response into a Python dictionary
    joke_data = response.json()
    
    # Return the joke data
    return joke_data

# Example usage: search for jokes related to "magic"
search_word = "magic"
jokes = get_joke_data(search_word)

# Format the output as readable (pretty-printed) JSON
pretty_data = json.dumps(jokes, indent=4)

# Display the formatted joke data
print(pretty_data)


{
    "current_page": 1,
    "limit": 2,
    "next_page": 2,
    "previous_page": 1,
    "results": [
        {
            "id": "q4hVKRnOmjb",
            "joke": "What do you call a magician who has lost their magic? Ian."
        },
        {
            "id": "HeaFdiyIJe",
            "joke": "What kind of magic do cows believe in? MOODOO."
        }
    ],
    "search_term": "magic",
    "status": 200,
    "total_jokes": 5,
    "total_pages": 3
}
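
Given a response shaped like the one above, the joke strings live under the `results` key. A minimal sketch of pulling them out, using the "magic" response copied in as a literal so no request is needed:

```python
# The response printed above, reproduced as a Python dictionary
response_data = {
    "current_page": 1,
    "limit": 2,
    "next_page": 2,
    "previous_page": 1,
    "results": [
        {"id": "q4hVKRnOmjb", "joke": "What do you call a magician who has lost their magic? Ian."},
        {"id": "HeaFdiyIJe", "joke": "What kind of magic do cows believe in? MOODOO."},
    ],
    "search_term": "magic",
    "status": 200,
    "total_jokes": 5,
    "total_pages": 3,
}

# Collect only the joke text; .get() with a default avoids a KeyError
# if a response has no 'results' key
jokes = [entry["joke"] for entry in response_data.get("results", [])]

for joke in jokes:
    print(joke)
```

An empty `results` list yields an empty `jokes` list, which is falsy and therefore easy to test for when deciding whether to try the next word.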


## Get Jokes for a Long Word from the Plot Description

Now you'll define a function called `get_jokes`. It will extract the longest word from the plot and try to find jokes for it. If there aren't any, it will proceed to the next longest word, and so on, until it finds a word for which there are jokes. If there is more than one word of the same length, try words that are earlier in the description first (which `sorted` does by default, since it's "stable" and will only move things around the minimum necessary).

We provide code that removes punctuation from the words in `plot` and assigns the result to the variable `words`. Your code should extend this by sorting `words` from longest to shortest and use the sorted list (and the `get_joke_data` function that you defined above) to find the longest word with associated jokes. If there are no words with associated jokes, your function should return the tuple `(None, None)`. If there is a word with associated jokes, your function should return a tuple with (1) the longest word with a joke and (2) a list of jokes associated with that word (as a list of strings).
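
The tie-breaking rule described above can be checked directly. The sketch below sorts the words of a short plot by length and compares `sorted(..., reverse=True)` with reversing an ascending sort afterwards. The two differ only in how ties are ordered.

```python
import re

plot = "I had dreams of a cat."

# Extract words, dropping punctuation
words = re.findall(r"\b\w+\b", plot)

# reverse=True keeps the sort stable: equal-length words stay in plot order
by_length = sorted(words, key=len, reverse=True)
print(by_length)  # ['dreams', 'had', 'cat', 'of', 'I', 'a']

# Reversing an ascending sort flips the order of ties as well
reversed_after = sorted(words, key=len)[::-1]
print(reversed_after)  # ['dreams', 'cat', 'had', 'of', 'a', 'I']
```

Only the first form tries `had` before `cat`, which is the "earlier in the description first" behavior the task asks for.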

In [7]:
import re
import requests

def get_joke_data(search_term):
    """
    Fetches joke data from the icanhazdadjoke API for the given search term.
    Calls the live API directly with requests.get(); no cache is consulted.

    Parameters:
    - search_term (str): The word to search for jokes.

    Returns:
    - dict: The JSON response from the API.
    """
    base_url = "https://icanhazdadjoke.com/search"
    params = {"term": search_term, "limit": 2}
    # Mimics the log line of the course's caching helper; nothing is read from a cache here
    print("found in permanent_cache")
    response = requests.get(base_url, params=params, headers={"Accept": "application/json"})
    return response.json()

def get_jokes(plot: str) -> tuple[str | None, list[str] | None]:
    """
    Extracts the longest word from the movie plot and tries to find jokes for it.
    If there are no jokes, it proceeds to the next longest word.
    
    If no jokes are found for any word, returns (None, None).
    
    Parameters:
    - plot (str): The plot of the movie.

    Returns:
    - tuple[str | None, list[str] | None]: The longest word with jokes and a list of jokes.
    """
    # Extract words and remove punctuation
    words = re.findall(r'\b\w+\b', plot)

    # Sort words by length (longest first), keeping original order for ties
    sorted_words = sorted(words, key=len, reverse=True)

    # Try each word in order of length
    for word in sorted_words:
        joke_data = get_joke_data(word)

        # Check if jokes were found
        if joke_data.get('results'):
            jokes = [joke['joke'] for joke in joke_data['results']]
            return word, jokes  # Return the word and the list of jokes

    # If no jokes found for any word
    return None, None

# Example usage
plot = "I had dreams of a cat."
word, jokes = get_jokes(plot)

print(f"({word}, {jokes})")


found in permanent_cache
(dreams, ["I'm tired of following my dreams. I'm just going to ask them where they are going and meet up with them later."])


## Put it All Together

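Now put it all together to make the full app. Define a function, `haha_me`. It takes a movie name as input and returns a text string that is meant to entertain the reader. (In the solution below, the optional `verbosity` argument belongs to `get_jokes`, where `verbosity=1` prints each word as it is tried.)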

We have provided a helper function, `highlight`, that highlights a string inside a larger string by wrapping it in asterisks (`*`). Try invoking it a few times to make sure you understand what it does, then figure out how it should be used based on the sample outputs in the assert statements.
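
A few sample invocations, as a sketch. The `highlight` defined here mirrors the one-line definition used in the solution cell further down; if your provided helper differs, the outputs may differ too.

```python
import re

def highlight(word, sentence):
    """Wrap every case-insensitive occurrence of `word` in double asterisks."""
    return re.sub(re.escape(word), f"**{word}**", sentence, flags=re.IGNORECASE)

# Exact match
print(highlight("Earth", "The rotation of Earth really makes my day."))
# The rotation of **Earth** really makes my day.

# Case-insensitive match inside a longer word: the replacement uses the
# casing of `word`, not the casing found in the sentence
print(highlight("Detective", "Why do ducks make great detectives?"))
# Why do ducks make great **Detective**s?

# No occurrence: the sentence is returned unchanged
print(highlight("magic", "Nothing to see here."))
# Nothing to see here.
```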

If the movie is not found in the OMDb API (using `get_movie_data`), return `"No movie found: "` followed by the movie title.

If the movie is found, but there are no jokes (`get_jokes` returned `(None, None)`), return `"I've got no jokes about this movie. It's too serious!"`.

If the movie and jokes are found, your function should return a string with each of the following on separate lines:
- The name of the movie
- `"Rotten Tomatoes rating: XX%"` (replacing `"XX"` with the actual Rotten Tomatoes rating)
- The plot of the movie with the joke keyword highlighted (using the provided `highlight` function)
- `"Speaking of **YY**, that reminds me of some jokes."` (replacing `"YY"` with the joke keyword)
- A different phrase about the jokes, depending on the Rotten Tomatoes rating:
    - No Rotten Tomatoes rating (meaning the rating is `-1`): `"Hope you like them!"`
    - Rotten Tomatoes rating below 70%: `"Hope they're better than the movie!"`
    - Rotten Tomatoes rating of 70% or higher: `"Hope they're as good as the movie!"`
- *(an empty line)*
- The list of jokes, separated by a newline (using `"\n".join(...)`)

For example, for Venom:
```
Venom
Rotten Tomatoes rating: 30%
A failed reporter is bonded to an alien entity, one of many symbiotes who have invaded **Earth**. But the being takes a liking to **Earth** and decides to protect it.
Speaking of **Earth**, that reminds me of some jokes.
Hope they're better than the movie!

Astronomers got tired watching the moon go around the **Earth** for 24 hours. They decided to call it a day.
The rotation of **Earth** really makes my day.
```
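
The line-by-line format above can be assembled by collecting the pieces in a list and joining them with newlines. The sketch below is one way to do it, using the Venom plot and jokes from the example as fixed inputs so it runs without any API calls. `rating_phrase` and `format_message` are hypothetical helper names, and `highlight` is assumed to behave like the helper described earlier.

```python
import re

def highlight(word, sentence):
    """Wrap every case-insensitive occurrence of `word` in double asterisks."""
    return re.sub(re.escape(word), f"**{word}**", sentence, flags=re.IGNORECASE)

def rating_phrase(rating):
    """Pick the closing phrase from the Rotten Tomatoes rating (-1 means no rating)."""
    if rating == -1:
        return "Hope you like them!"
    if rating < 70:
        return "Hope they're better than the movie!"
    return "Hope they're as good as the movie!"

def format_message(title, rating, plot, word, jokes):
    """Join the output pieces with single newlines, with one empty line before the jokes."""
    lines = [
        title,
        f"Rotten Tomatoes rating: {rating}%",
        highlight(word, plot),
        f"Speaking of **{word}**, that reminds me of some jokes.",
        rating_phrase(rating),
        "",  # the empty line
        "\n".join(highlight(word, joke) for joke in jokes),
    ]
    return "\n".join(lines)

message = format_message(
    "Venom",
    30,
    "A failed reporter is bonded to an alien entity, one of many symbiotes who "
    "have invaded Earth. But the being takes a liking to Earth and decides to protect it.",
    "Earth",
    [
        "Astronomers got tired watching the moon go around the Earth for 24 hours. "
        "They decided to call it a day.",
        "The rotation of Earth really makes my day.",
    ],
)
print(message)
```

Printing `message` should reproduce the Venom sample shown above.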

In [19]:
# Tools for displaying rich output in Jupyter notebooks (HTML not used here, but available)
from IPython.display import display, HTML

# Standard libraries for text processing, HTTP requests, and output control
import re          # Used for regex-based word matching and substitution
import requests    # Used to make HTTP requests to external APIs
import io          # Used to temporarily capture stdout
import sys         # Used to redirect and restore stdout

# OMDb API key for accessing movie data
OMDB_API_KEY = "e1a3e94a"

# Helper function to highlight a specific word in a text string
def highlight(word: str, sentence: str) -> str:
    # Replaces all case-insensitive matches of the word with a bolded version
    return re.sub(re.escape(word), f"**{word}**", sentence, flags=re.IGNORECASE)

# Function to fetch movie data from the OMDb API
def get_movie_data(movie_title: str) -> dict:
    # Temporarily suppress printed output
    old_stdout = sys.stdout
    sys.stdout = io.StringIO()

    try:
        # OMDb endpoint and request parameters
        base_url = "https://www.omdbapi.com/"
        params = {
            "t": movie_title,     # Movie title to search
            "r": "json",          # Request JSON response
            "apikey": OMDB_API_KEY
        }

        # Send request and parse JSON response
        response = requests.get(base_url, params=params)
        py_dic = response.json()
    finally:
        # Restore normal output even if the request or JSON parsing fails
        sys.stdout = old_stdout

    return py_dic

# Function to extract the Rotten Tomatoes rating from OMDb data
def rt_rating(result):
    # Check if ratings are available
    if "Ratings" in result:
        for rating in result["Ratings"]:
            # Look specifically for Rotten Tomatoes
            if rating["Source"] == "Rotten Tomatoes":
                # Convert percentage string (e.g., "96%") to integer
                return int(rating["Value"].strip('%'))
    # Return -1 if no Rotten Tomatoes rating is found
    return -1

# Function to fetch jokes related to words found in the movie plot
def get_jokes(plot: str, verbosity=0):
    # Extract all words from the plot text
    words = re.findall(r'\b\w+\b', plot)

    # Sort words by length (longest first)
    sorted_words = sorted(words, key=len, reverse=True)

    # Inner function to fetch jokes for a single word
    def get_joke_data(search_term):
        # Suppress printed output
        old_stdout = sys.stdout
        sys.stdout = io.StringIO()

        try:
            # Joke API endpoint and parameters
            base_url = "https://icanhazdadjoke.com/search"
            params = {"term": search_term, "limit": 2}
            headers = {"Accept": "application/json"}

            # Send request and parse response
            response = requests.get(base_url, params=params, headers=headers)
            joke_data = response.json()
        finally:
            # Restore normal output even if the request or JSON parsing fails
            sys.stdout = old_stdout

        return joke_data

    # Try each word (starting with the longest) until jokes are found
    for word in sorted_words:
        if verbosity == 1:
            print(f"Trying word: {word}")

        joke_data = get_joke_data(word)

        # If jokes are found, return the word and the jokes
        if joke_data.get("results"):
            jokes = [j["joke"] for j in joke_data["results"]]
            return word, jokes

    # Return (None, None) if no jokes were found for any word
    return None, None

# Main function that ties everything together
def haha_me(movie_title: str) -> str:
    # Fetch movie information
    result = get_movie_data(movie_title)

    # Handle case where the movie does not exist
    if result.get("Response") == "False":
        return f"No movie found: {movie_title}"

    # Extract key movie details
    title = result.get("Title", "")
    plot = result.get("Plot", "")
    rating = rt_rating(result)

    # Select a message based on the movie's rating
    if rating == -1:
        phrase = "Hope you like them!"
    elif rating < 70:
        phrase = "Hope they're better than the movie!"
    else:
        phrase = "Hope they're as good as the movie!"

    # Get jokes related to the movie plot
    word, jokes = get_jokes(plot)

    # Handle case where no jokes are found
    if word is None:
        return "I've got no jokes about this movie. It's too serious!"

    # Highlight the selected word in the plot and jokes
    highlighted_plot = highlight(word, plot)
    highlighted_jokes = [highlight(word, joke) for joke in jokes]

    # Combine jokes into a single formatted string
    chistes = "\n".join(highlighted_jokes)

    # Build and return the final output: one item per line, with a single
    # empty line before the jokes, matching the format described above
    return (
        f"{title}\n"
        f"Rotten Tomatoes rating: {rating}%\n"
        f"{highlighted_plot}\n"
        f"Speaking of **{word}**, that reminds me of some jokes.\n"
        f"{phrase}\n\n"
        f"{chistes}"
    )

# Example usage
haha_me("Sherlock Holmes")


"Sherlock Holmes\nRotten Tomatoes rating: 70%\n**Detective** Sherlock Holmes and his stalwart partner Watson engage in a battle of wits and brawn with a nemesis whose plot is a threat to all of England.\nSpeaking of **Detective**, that reminds me of some jokes.\nHope they're as good as the movie!\n\nWhy do ducks make great **Detective**s? They always quack the case."