# Python Web

- REST stands for REpresentational State Transfer
- API stands for Application Programming Interface

### Anatomy of URL
- ```<scheme>://<host>:<port>/<path>```

![image.png](https://fopp.umsi.education/books/published/fopp/_images/internet_requests.png)

![image.png](https://fopp.umsi.education/books/published/fopp/_images/parameterformat.png)

![image.png](https://fopp.umsi.education/books/published/fopp/_images/urlstructure.png)
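
The URL anatomy above can be checked from Python. The sketch below is only an illustration using the standard library's urllib.parse, not something from the course text. The explicit port 443 in the sample URL is an assumption, added so that every part of the pattern shows up.

```python
from urllib.parse import urlsplit, parse_qs

# Sample URL based on the datamuse endpoint used later in these notes;
# the ":443" port is added here only for illustration.
url = "https://api.datamuse.com:443/words?rel_rhy=funny"

parts = urlsplit(url)
print(parts.scheme)    # scheme
print(parts.hostname)  # host
print(parts.port)      # port (an int, or None when the URL has no explicit port)
print(parts.path)      # path
print(parts.query)     # raw query string after the "?"

# parse_qs turns the query string into a dict mapping each key to a list of values
query_params = parse_qs(parts.query)
print(query_params)
```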

# The Requests Module

- You don’t need a browser to fetch the contents of a page. Python has a module for this called requests: its get function fetches the contents of a page and returns a Response object.

To summarize, a Response object in the full implementation of the requests module has the following useful attributes and methods that can be accessed in your program:

- .text

- .url

- .json()

- .status_code (not available in Runestone implementation)

- .headers (not available in Runestone implementation)

- .history (not available in Runestone implementation)

! self = lookup HTTP error code list
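
For the status-code lookup noted above, one option is Python's standard http.HTTPStatus enum, which maps each numeric code to its official phrase. This is a small sketch, not part of the course material, and describe_status is a hypothetical helper name.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '404 Not Found' (hypothetical helper)."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # The code is not one of the standard statuses known to this Python version
        return "{} (unknown status code)".format(code)
    return "{} {}".format(status.value, status.phrase)

for code in (200, 301, 404, 500):
    print(describe_status(code))
```

In the full requests library, the number to look up comes from the .status_code attribute of the Response object.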

In [None]:
import requests
import json

page = requests.get("https://api.datamuse.com/words?rel_rhy=funny")
print(type(page))
print(page.text[:150]) # print the first 150 characters
print(page.url) # print the url that was fetched
print("------")
x = page.json() # turn page.text into a python object
print(type(x))
print("---first item in the list---")
print(x[0])
print("---the whole list, pretty printed---")
print(json.dumps(x, indent=2)) # pretty print the results
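
What page.json() does can be tried without a network connection. The sketch below assumes the response text is the JSON shown in the output further down these notes (the first three rhymes for funny). It parses that text with json.loads, which is roughly what .json() does with page.text.

```python
import json

# The first three entries of the datamuse response, copied from the output in these notes
sample_text = '[{"word":"money","score":4415,"numSyllables":2},{"word":"honey","score":1207,"numSyllables":2},{"word":"sunny","score":717,"numSyllables":2}]'

rhymes = json.loads(sample_text)   # a list of dictionaries
print(type(rhymes))
print(rhymes[0])                   # first dictionary in the list

# Pull out just the words
words = [entry["word"] for entry in rhymes]
print(words)
```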

# Using requests.get to encode URL parameters

- Fortunately, when you want to pass information as a URL parameter value, you don’t have to remember all the substitutions that are required to encode special characters. Instead, that capability is built into the requests module.

-  The get function in the requests module takes an optional parameter called params. 

In [None]:
# doing a Google search using the requests module
d = {'q': '"violins and guitars"', 'tbm': 'isch'}
results = requests.get("https://google.com/search", params=d)
print(results.url)

![image.png](https://fopp.umsi.education/books/published/fopp/_images/urlexamples.png)
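
The substitutions that params takes care of can be seen with the standard library's urllib.parse.urlencode. This is only an illustration of the encoding rules. It builds the same kind of query string for the search dictionary above, but the exact URL that requests produces is best checked with results.url.

```python
from urllib.parse import urlencode

d = {'q': '"violins and guitars"', 'tbm': 'isch'}

# urlencode replaces spaces with "+" and quotation marks with "%22"
query_string = urlencode(d)
full_url = "https://google.com/search" + "?" + query_string
print(full_url)
```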

In [3]:
# Example of passing dictionary as arguments 
import requests

# page = requests.get("https://api.datamuse.com/words?rel_rhy=funny")
kval_pairs = {'rel_rhy': 'funny'}
page = requests.get("https://api.datamuse.com/words", params=kval_pairs)
print(page.text[:150]) # print the first 150 characters
print(page.url) # print the url that was fetched


[{"word":"money","score":4415,"numSyllables":2},{"word":"honey","score":1207,"numSyllables":2},{"word":"sunny","score":717,"numSyllables":2},{"word":"
https://api.datamuse.com/words?rel_rhy=funny


# Defining a function to make repeated invocations

In [4]:
import requests

def get_rhymes(word):
    baseurl = "https://api.datamuse.com/words"
    params_diction = {} # Set up an empty dictionary for query parameters
    params_diction["rel_rhy"] = word
    params_diction["max"] = "3" # get at most 3 results
    resp = requests.get(baseurl, params=params_diction)
    # return only the words from the results (at most three)
    word_ds = resp.json() # a python object (a list of dictionaries in this case)
    return [d['word'] for d in word_ds]

print(get_rhymes("funny"))

['money', 'honey', 'sunny']


# Caching Response Content

To avoid re-requesting the same data, we will use a programming pattern known as caching. It works like this:

- Before doing some expensive operation (like calling requests.get to get data from a REST API), check whether you have already saved (“cached”) the results that would be generated by making that request.

- If so, return that same data.

- If not, perform the expensive operation and save (“cache”) the results (e.g. the complicated data) in your cache so you won’t have to perform it again the next time.
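
The three steps above can be sketched with a plain dictionary as the cache. In this sketch, slow_lookup is a hypothetical stand-in for an expensive call such as requests.get, and cached_lookup is an illustrative name, not part of any library.

```python
calls = []   # records every time the expensive operation really runs
cache = {}   # maps each request key to its saved result

def slow_lookup(word):
    """Hypothetical stand-in for an expensive operation such as requests.get."""
    calls.append(word)
    return word.upper()

def cached_lookup(word):
    # Step 1: check whether the result was already saved
    if word in cache:
        # Step 2: return the saved data
        return cache[word]
    # Step 3: do the expensive work and save the result for next time
    result = slow_lookup(word)
    cache[word] = result
    return result

print(cached_lookup("funny"))  # runs slow_lookup
print(cached_lookup("funny"))  # served from the cache
print(len(calls))              # how many times slow_lookup actually ran
```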

## The requests_with_caching module
In this book, we are providing a special module, called requests_with_caching.

Here’s how you’ll use this module.

- Your code will include a statement to import the module, import requests_with_caching.

- Instead of invoking requests.get(), you’ll invoke requests_with_caching.get().

There are a couple of other optional parameters for the function requests_with_caching.get().

- permanent_cache_file – its value should be a string specifying the name of the file containing the permanent cache. If you don’t specify anything, the default value is “permanent_cache.txt”. For the datamuse API, we’ve provided a cache in a file called datamuse_cache.txt. It just contains the saved response to the query for “https://api.datamuse.com/words?rel_rhy=funny”.

- private_keys_to_ignore – its value should be a list of strings. These are keys from the parameters dictionary that should be ignored when deciding whether the current request matches a previous request. The main purpose is to let the cache return a result for REST APIs that would otherwise require you to provide an API key in order to make a request. By default, it is set to [“api_key”], which is a query parameter used with the Flickr API. You should not need to set this optional parameter.

In [None]:
import json
import requests  # needed for the requests.* calls in get() below

PERMANENT_CACHE_FNAME = "permanent_cache.txt"
TEMP_CACHE_FNAME = "this_page_cache.txt"

def _write_to_file(cache, fname):
    with open(fname, 'w') as outfile:
        outfile.write(json.dumps(cache, indent=2))

def _read_from_file(fname):
    try:
        with open(fname, 'r') as infile:
            res = infile.read()
            return json.loads(res)
    except:
        return {}

def add_to_cache(cache_file, cache_key, cache_value):
    temp_cache = _read_from_file(cache_file)
    temp_cache[cache_key] = cache_value
    _write_to_file(temp_cache, cache_file)

def clear_cache(cache_file=TEMP_CACHE_FNAME):
    _write_to_file({}, cache_file)

def make_cache_key(baseurl, params_d, private_keys=["api_key"]):
    """Makes a long string representing the query.
    Alphabetize the keys from the params dictionary so we get the same order each time.
    Omit keys with private info."""
    alphabetized_keys = sorted(params_d.keys())
    res = []
    for k in alphabetized_keys:
        if k not in private_keys:
            res.append("{}-{}".format(k, params_d[k]))
    return baseurl + "_".join(res)

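def get(baseurl, params={}, private_keys_to_ignore=["api_key"], permanent_cache_file=PERMANENT_CACHE_FNAME, temp_cache_file=TEMP_CACHE_FNAME):
    # NOTE: requests.requestURL and requests.Response(text, url) are helpers from the
    # Runestone implementation of requests; the full requests library does not provide them.
    full_url = requests.requestURL(baseurl, params)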
    cache_key = make_cache_key(baseurl, params, private_keys_to_ignore)
    # Load the permanent and page-specific caches from files
    permanent_cache = _read_from_file(permanent_cache_file)
    temp_cache = _read_from_file(temp_cache_file)
    if cache_key in temp_cache:
        print("found in temp_cache")
        # make a Response object containing text from the cache, and the full_url that would have been fetched
        return requests.Response(temp_cache[cache_key], full_url)
    elif cache_key in permanent_cache:
        print("found in permanent_cache")
        # make a Response object containing text from the cache, and the full_url that would have been fetched
        return requests.Response(permanent_cache[cache_key], full_url)
    else:
        print("new; adding to cache")
        # actually request it
        resp = requests.get(baseurl, params)
        # save it
        add_to_cache(temp_cache_file, cache_key, resp.text)
        return resp
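
To see what private_keys_to_ignore buys you, the sketch below repeats the make_cache_key logic from the listing above in condensed form and calls it with two different API keys. The URL and the key values are made up for illustration.

```python
def make_cache_key(baseurl, params_d, private_keys=("api_key",)):
    """Condensed version of the make_cache_key function shown above."""
    parts = []
    for k in sorted(params_d.keys()):      # alphabetical order gives a stable key
        if k not in private_keys:          # skip private values such as API keys
            parts.append("{}-{}".format(k, params_d[k]))
    return baseurl + "_".join(parts)

# Two requests that differ only in the (made-up) API key and in parameter order
key_a = make_cache_key("https://example.com/api", {"tags": "river", "api_key": "AAA"})
key_b = make_cache_key("https://example.com/api", {"api_key": "BBB", "tags": "river"})

print(key_a)
print(key_a == key_b)   # both requests map to the same cache entry
```

Because the API key never becomes part of the cache key, a response cached by one user can be served to another user who has no key at all.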

# Searching for Media on iTunes

- You’ve already seen an example using the iTunes API in Generating Request URLs. The iTunes API allows users to search for movies, podcasts, music, music videos, tv shows, and books that are hosted on the iTunes site. You can explore the official iTunes API documentation.


In [1]:
import requests_with_caching
import json

parameters = {"term": "Ann Arbor", "entity": "podcast"}
iTunes_response = requests_with_caching.get("https://itunes.apple.com/search", params = parameters, permanent_cache_file="itunes_cache.txt")

py_data = json.loads(iTunes_response.text)
for r in py_data['results']:
    print(r['trackName'])


ModuleNotFoundError: No module named 'requests_with_caching'

# Unicode for non-English characters

- Python’s strings are Unicode, which allows characters from a much larger set, including more than 75,000 ideographic characters used in the Chinese, Japanese, and Korean writing systems. Everything works fine inside Python for operations like slicing, appending, and concatenating strings, and for using .find() or the in operator.

- Text sent over the network travels as bytes, so Unicode characters have to be encoded (commonly as UTF-8, which uses one to four bytes per character) and decoded again on arrival. Fortunately, the requests module will normally handle this for us automatically. When we fetch a webpage that is in JSON format, the response will have a header called ‘content-type’ that will say something like application/json; charset=utf8. If it specifies the utf8 character set in that way, the requests module will automatically decode the contents into Unicode: requests.get('that web page').text will yield a string, with each of those sequences of one to four bytes converted into a single character.

- If, for some reason, you get JSON-formatted text that is UTF-8 encoded but the requests module hasn’t decoded it for you, the json.loads() function call can take care of the decoding for you. Since Python 3.6, loads() accepts bytes as well as strings and detects UTF-8, UTF-16, or UTF-32 input on its own. Older versions accepted an optional encoding parameter, but it was removed in Python 3.9, so for any other encoding, decode the bytes yourself (for example, raw.decode('latin-1')) before calling loads().
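
The byte-level view behind these points can be sketched as follows. The sample strings are arbitrary. The sketch encodes text to UTF-8 bytes, decodes it back, and passes UTF-8 bytes straight to json.loads, which accepts bytes in Python 3.6 and later.

```python
import json

text = "naïve café 東京"
raw = text.encode("utf-8")          # bytes, as they would travel over the network

print(len(text))                    # number of characters
print(len(raw))                     # number of bytes; non-ASCII characters take 2 to 4 each
print(raw.decode("utf-8") == text)  # decoding restores the original string

# JSON text as UTF-8 bytes; ensure_ascii=False keeps the characters instead of \u escapes
json_bytes = json.dumps({"city": "東京"}, ensure_ascii=False).encode("utf-8")
data = json.loads(json_bytes)       # loads() decodes the bytes itself
print(data["city"])
```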

# Assignment 

In [2]:
import requests

def get_movies_from_tastedive(movieName, key="327878-course3p-I4ZNBN4A"):
    baseurl="https://tastedive.com/api/similar"
    params_d = {}
    params_d["q"]= movieName
    params_d["k"]= key
    params_d["type"]= "movies"
    params_d["limit"] = "5"
    resp = requests.get(baseurl, params=params_d)
    print(resp.url)
    respDic = resp.json()
    return respDic 

def extract_movie_titles(movieName):
    result=[]
    for listRes in movieName['Similar']['Results']:
        result.append(listRes['Name'])
    return result

def get_related_titles(listMovieName):
    if listMovieName != []:
        auxList=[]
        relatedList=[]
        for movieName in listMovieName:
            auxList = extract_movie_titles(get_movies_from_tastedive(movieName))
            for movieNameAux in auxList:
                if movieNameAux not in relatedList:
                    relatedList.append(movieNameAux)
        
        return relatedList
    return listMovieName

def get_movie_data(movieName, key="546c6742"):
    baseurl= "http://www.omdbapi.com/"
    params_d = {}
    params_d["t"]= movieName
    params_d["apikey"]= key
    params_d["r"]= "json"
    resp = requests.get(baseurl, params=params_d)
    print(resp.url)
    respDic = resp.json()
    return respDic

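def get_movie_rating(movieNameJson):
    # Return the Rotten Tomatoes score as an int, or 0 if there is none
    strRating = ""
    # .get() avoids a KeyError if the response has no "Ratings" key
    for ratingEntry in movieNameJson.get("Ratings", []):
        if ratingEntry["Source"] == "Rotten Tomatoes":
            strRating = ratingEntry["Value"]
    if strRating != "":
        # Value looks like "94%"; strip the percent sign so "5%" and "100%" also parse
        rating = int(strRating.rstrip("%"))
    else:
        rating = 0
    return rating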

def get_sorted_recommendations(listMovieTitle):
    listMovie= get_related_titles(listMovieTitle)
    listMovie= sorted(listMovie, key = lambda movieName: (get_movie_rating(get_movie_data(movieName)), movieName), reverse=True)
    
    return listMovie

print(get_sorted_recommendations(["Bridesmaids", "Sherlock Holmes"]))




https://tastedive.com/api/similar?q=Bridesmaids&k=327878-course3p-I4ZNBN4A&type=movies&limit=5
https://tastedive.com/api/similar?q=Sherlock+Holmes&k=327878-course3p-I4ZNBN4A&type=movies&limit=5
http://www.omdbapi.com/?t=Baby+Mama&apikey=546c6742&r=json
http://www.omdbapi.com/?t=The+Five-Year+Engagement&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Date+Night&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Bachelorette&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Bad+Teacher&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Sherlock+Holmes%3A+A+Game+Of+Shadows&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Prince+Of+Persia%3A+The+Sands+Of+Time&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Angels+%26+Demons&apikey=546c6742&r=json
http://www.omdbapi.com/?t=Iron+Man&apikey=546c6742&r=json
http://www.omdbapi.com/?t=The+Tourist&apikey=546c6742&r=json
['Iron Man', 'Date Night', 'The Five-Year Engagement', 'Baby Mama', 'Sherlock Holmes: A Game Of Shadows', 'Bachelorette', 'Bad Teac