## Live Code: Python Sets

In this example we'll play around with responses from the [New York Times Books API](https://developer.nytimes.com/docs/books-product/1/overview) to demonstrate the use of __set operations__ (stay tuned for week 6 to learn more about APIs).

With the Books API we can access data from the NY Times Bestseller List.
The Books API has a service that returns the best sellers for a specified date and list name.
The request requires two parameters: {publishing date} and {list}.

We'll look at the following categories:
* Hardcover Fiction
* Hardcover Nonfiction
* Paperback Trade Fiction
* Paperback Nonfiction

These lists are updated weekly; we'll look at the lists of the current and the previous week for comparison.

In the first part of this code we'll create sets of titles for each category and week. In the second part we'll use set operations to get insights about the bestsellers.

Things that we can find out:
- which books have stayed in the top 15 compared to the previous week? 
- which titles are newcomers?
- ...

### Generating Sets

In [1]:
# import requests and json libraries
import requests
import json

# this function will make requests to the Books API
# and generate sets of bestsellers for different lists
# by passing 'date' as an argument, we can later call this function 
# several times for the lists of the current and the previous weeks
def generateSets(date):
    
    # if you want to play around with the API, please make your own key at https://developer.nytimes.com/
    authorized_key = "QftZeSssSfBqTSFet3RBaTE9inc3iWAw"
    # create list of the categories we want to access:
    categories = ['hardcover-fiction', 'hardcover-nonfiction', 'paperback-nonfiction', 'trade-fiction-paperback']
    
    """ This is an excerpt of the data structure the API will return:      
{(...)
 (...)
 'results': {(...)
     (...)
     'books': [{(...)
         (...)
         'title': 'LITTLE FIRES EVERYWHERE',
         'contributor': 'by Celeste Ng',
    
    """
    
    # our goal is to create a set for each of the above categories, 
    # containing the title of the top 15 books
    
    # step 1: 
    # declare a global variable
    global bestseller_titles 
    
    # create an empty, nested list (one list for each category)
    bestseller_titles = [[],[],[],[]] # they will hold information from the 'title' key 
    
    # step 2: 
    # populate those lists in a nested while loop:
                
    """ PSEUDO CODE: 
    
# iterate through list in 'bestseller_titles': 
n = 0
while n < number of lists in 'bestseller_titles'(4)
    call the api_url, and pass categories[n]
    get the response, and store as json 
    
    # access the 'books' key, define 'path' in json structure
    books = data['results']['books']
    # iterate through titles in 'books':
    j = 0
    while j < number of books in 'books':
        add books[j]['title'] to bestseller_titles[n]
        j += 1
        
    n += 1 
    
    """

    n = 0 # create variable 'n' to count
    while n < len(bestseller_titles): # for each empty list in 'bestseller_titles'
        # build the API URL
        # use string formatting to insert the date (e.g. 'current'), the category (with index 'n'), and the API key
        api_url = "https://api.nytimes.com/svc/books/v3/lists/{}/{}.json?api-key={}".format(date, categories[n], authorized_key)

        # call the API with requests
        response = requests.get(api_url)
        # create a variable called 'data' to hold the json formatted result
        data = response.json()

        # define the 'path' inside the json structure
        books = data['results']['books']
        
        # then iterate through 'titles' in 'books':
        j = 0 # create variable 'j' to count
        # while 'j' is smaller than the number of books
        while j < len(books):
            # add the title to the 'nth' list in 'bestseller_titles'
            bestseller_titles[n].append(books[j]['title'])
            j += 1 # count +1

        n += 1 # count +1
    
    # step 3:
    # print the populated lists as a sanity check
    print(bestseller_titles)
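The inner `while` loop above walks through `data['results']['books']` and collects each `'title'`. As a rough sketch of the same step, the extraction could also be written as a set comprehension. The helper name `titles_from_response` and the hand-written `sample_response` are assumptions made for illustration; the sample only mimics the excerpt of the data structure shown above and is not a real API response.

```python
def titles_from_response(data):
    """Collect the 'title' of every book in one response into a set."""
    # same 'path' as in the loop above: data['results']['books']
    return {book['title'] for book in data['results']['books']}

# hypothetical, hand-written stand-in for one API response
sample_response = {
    'results': {
        'books': [
            {'title': 'LITTLE FIRES EVERYWHERE', 'contributor': 'by Celeste Ng'},
            {'title': 'CIRCE'},
        ]
    }
}

sample_titles = titles_from_response(sample_response)
print(sample_titles)
```

Because the helper returns a set directly, no separate `set(...)` conversion would be needed afterwards.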

In [2]:
# call the generateSets() function 
# with 'date' = 'current' to receive this week's bestseller list
generateSets('current')
print(len(bestseller_titles))

4


In [3]:
# create a set from each nested list
hc_fiction_jun21 = set(bestseller_titles[0]) 
hc_nonfiction_jun21 = set(bestseller_titles[1])
pb_nonfiction_jun21 = set(bestseller_titles[2])
pb_fiction_jun21 = set(bestseller_titles[3])

print('Hardcover Fiction, June 21:\n', hc_fiction_jun21)
print('\nHardcover Nonfiction, June 21:\n', hc_nonfiction_jun21)
print('\nPaperback Nonfiction, June 21:\n', pb_nonfiction_jun21)
print('\nPaperback Fiction, June 21:\n', pb_fiction_jun21)

Hardcover Fiction, June 21:

Hardcover Nonfiction, June 21:
 {'TALKING TO STRANGERS', "THE DEVIANT'S WAR", 'BETWEEN THE WORLD AND ME', 'HOW TO BE AN ANTIRACIST', 'THE SPLENDID AND THE VILE', 'EDUCATED', 'BREATH', 'HUMANKIND', 'ME AND WHITE SUPREMACY', 'PLAGUE OF CORRUPTION', 'UNITED STATES OF SOCIALISM', 'UNTAMED', 'MY VANISHING COUNTRY', 'BECOMING', 'THE MAMBA MENTALITY'}

Paperback Nonfiction, June 21:
 {'WHITE RAGE', 'THE NEW JIM CROW', 'WHITE FRAGILITY', 'JUST MERCY', 'SO YOU WANT TO TALK ABOUT RACE', 'THE COLOR OF LAW', 'THE GREAT INFLUENZA', 'THE BODY KEEPS THE SCORE', 'RAISING WHITE KIDS', 'WAKING UP WHITE', "MY GRANDMOTHER'S HANDS", 'WHY ARE ALL THE BLACK KIDS SITTING TOGETHER IN THE CAFETERIA?', 'ELOQUENT RAGE', 'BORN A CRIME', 'STAMPED FROM THE BEGINNING'}

Paperback Fiction, June 21:
 {'BELOVED', 'THE NIGHTINGALE', 'HUSH', 'THE FAMILY UPSTAIRS', 'BEFORE WE WERE YOURS', 'THEN SHE WAS GONE', 'NORMAL PEOPLE', 'CIRCE', 'THE TATTOOIST OF AUSCHWITZ', 'LITTLE FIRES EVERYWHERE', 'AM

In [4]:
# call the generateSets() function again
# with 'date' = '2020-06-14' to receive last week's bestseller list
generateSets('2020-06-14')



In [5]:
# create a set from each nested list
hc_fiction_jun14 = set(bestseller_titles[0]) 
hc_nonfiction_jun14 = set(bestseller_titles[1]) 
pb_nonfiction_jun14 = set(bestseller_titles[2]) 
pb_fiction_jun14 = set(bestseller_titles[3]) 

print('Hardcover Fiction, June 14:\n', hc_fiction_jun14)
print('\nHardcover Nonfiction, June 14:\n', hc_nonfiction_jun14)
print('\nPaperback Nonfiction, June 14:\n', pb_nonfiction_jun14)
print('\nPaperback Fiction, June 14:\n', pb_fiction_jun14)

Hardcover Fiction, June 14:

Hardcover Nonfiction, June 14:
 {'AMERICAN CRUSADE', 'THE SPLENDID AND THE VILE', 'HOW TO BE AN ANTIRACIST', 'BREATH', 'EDUCATED', 'ME AND WHITE SUPREMACY', 'THE CHIFFON TRENCHES', 'PLAGUE OF CORRUPTION', 'MY VANISHING COUNTRY', 'UNTAMED', 'HIDDEN VALLEY ROAD', 'HOLLYWOOD PARK', 'FORTITUDE', 'BECOMING', 'THE MAMBA MENTALITY'}

Paperback Nonfiction, June 14:
 {'THE NEW JIM CROW', 'WHITE FRAGILITY', 'BRAIDING SWEETGRASS', 'JUST MERCY', 'SO YOU WANT TO TALK ABOUT RACE', 'THE GREAT INFLUENZA', 'SAPIENS', 'OUTLIERS', 'THE BODY KEEPS THE SCORE', 'A WOMAN OF NO IMPORTANCE', 'UNORTHODOX', 'THE COLOR OF LAW', 'THINKING, FAST AND SLOW', 'BORN A CRIME', 'GRIT'}

Paperback Fiction, June 14:
 {'THE BOOK WOMAN OF TROUBLESOME CREEK', 'CALL ME BY YOUR NAME', 'THE OVERSTORY', 'A GENTLEMAN IN MOSCOW', 'BEFORE WE WERE YOURS', 'THEN SHE WAS GONE', 'NORMAL PEOPLE', 'CIRCE', 'THE TATTOOIST OF AUSCHWITZ', 'LITTLE FIRES EVERYWHERE', 'THIS TENDER LAND', 'THE WOMAN IN THE WINDOW', '

### Set Operations

Now that we have declared multiple sets of books, let's make use of set operations to get insights about the bestsellers.

In [None]:
# create an intersection function
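One possible way to fill in this cell: the intersection of two weeks answers "which books have stayed on the list?". The sketch below uses two small sample sets with a few titles copied from the Hardcover Nonfiction output above, not the full lists. The function name `stayed_on_list` and the `*_sample` variables are made up for this example.

```python
# a few titles copied from the Hardcover Nonfiction output (not the full sets)
this_week_sample = {'TALKING TO STRANGERS', 'EDUCATED', 'BREATH', 'UNTAMED', 'HUMANKIND'}
last_week_sample = {'AMERICAN CRUSADE', 'EDUCATED', 'BREATH', 'UNTAMED', 'FORTITUDE'}

def stayed_on_list(this_week, last_week):
    """Return the titles that appear in both weeks (set intersection)."""
    return this_week & last_week  # same as this_week.intersection(last_week)

stayed = stayed_on_list(this_week_sample, last_week_sample)
print(sorted(stayed))  # ['BREATH', 'EDUCATED', 'UNTAMED']
```

With the real data, the same function could be called as `stayed_on_list(hc_nonfiction_jun21, hc_nonfiction_jun14)`.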

In [None]:
# create a difference function
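A possible sketch for this cell: the difference of two weeks answers "which titles are newcomers?". Set difference is not symmetric, so swapping the arguments gives the titles that dropped off the list instead. The sample sets again hold only a few titles copied from the Hardcover Nonfiction output above, and the function name `newcomers` is an assumption for illustration.

```python
# a few titles copied from the Hardcover Nonfiction output (not the full sets)
this_week_sample = {'TALKING TO STRANGERS', 'EDUCATED', 'BREATH', 'UNTAMED', 'HUMANKIND'}
last_week_sample = {'AMERICAN CRUSADE', 'EDUCATED', 'BREATH', 'UNTAMED', 'FORTITUDE'}

def newcomers(this_week, last_week):
    """Return the titles that are in this_week but not in last_week."""
    return this_week - last_week  # same as this_week.difference(last_week)

new_titles = newcomers(this_week_sample, last_week_sample)
# swapping the arguments yields the titles that left the list
dropped_titles = newcomers(last_week_sample, this_week_sample)

print(sorted(new_titles))      # ['HUMANKIND', 'TALKING TO STRANGERS']
print(sorted(dropped_titles))  # ['AMERICAN CRUSADE', 'FORTITUDE']
```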

In [None]:
# create a union function 
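A possible sketch for this cell: the union of two weeks gives every title that was on the list in at least one of them, with duplicates removed automatically. The sample sets hold a few titles copied from the Paperback Nonfiction output above, and the function name `all_listed` is made up for this example.

```python
# a few titles copied from the Paperback Nonfiction output (not the full sets)
this_week_sample = {'WHITE RAGE', 'JUST MERCY', 'BORN A CRIME'}
last_week_sample = {'SAPIENS', 'JUST MERCY', 'BORN A CRIME'}

def all_listed(this_week, last_week):
    """Return every title that appears in at least one of the two weeks."""
    return this_week | last_week  # same as this_week.union(last_week)

combined = all_listed(this_week_sample, last_week_sample)
print(sorted(combined))  # ['BORN A CRIME', 'JUST MERCY', 'SAPIENS', 'WHITE RAGE']
print(len(combined))     # 4
```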

In [None]:
# perform an operation on more than two sets
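A possible sketch for this cell: `union()` and `intersection()` accept any number of sets, so several categories can be combined in one call. The three sample sets below hold two titles each, copied from the June 21 output above; the variable names are assumptions for illustration.

```python
# two titles per category, copied from the June 21 output (not the full sets)
hc_nonfiction_sample = {'EDUCATED', 'BECOMING'}
pb_nonfiction_sample = {'JUST MERCY', 'BORN A CRIME'}
pb_fiction_sample = {'BELOVED', 'CIRCE'}

# every title that appears in at least one of the three categories
all_titles = hc_nonfiction_sample.union(pb_nonfiction_sample, pb_fiction_sample)

# titles that appear in all three categories at once
in_every_category = hc_nonfiction_sample.intersection(pb_nonfiction_sample, pb_fiction_sample)

print(len(all_titles))    # 6
print(in_every_category)  # set()
```

The empty intersection shows that, in this sample, no title is listed in more than one category.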