## Live Code: Python Sets

In this example we'll play around with responses from the [New York Times Books API](https://developer.nytimes.com/docs/books-product/1/overview) to demonstrate the use of __set operations__ (stay tuned for week 6 to learn more about APIs). 

With the Books API we can access data from the NY Times Bestseller List.
The Books API has a service that returns the best sellers for a specified date and list name.
The request requires two parameters: {publishing date} and {list}.

We'll look at the following categories: 
* Hardcover Fiction
* Hardcover Nonfiction
* Paperback Trade Fiction
* Paperback Nonfiction
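
As a rough sketch of how the two parameters end up in a request, the snippet below builds the request URL for each category without sending anything. The URL pattern and the list names are the ones used in the code later in this notebook; `build_list_url`, `LIST_NAMES`, and the `YOUR_API_KEY` placeholder are hypothetical names introduced here for illustration.

```python
# Map the display names above to the list names the Books API expects.
LIST_NAMES = {
    'Hardcover Fiction': 'hardcover-fiction',
    'Hardcover Nonfiction': 'hardcover-nonfiction',
    'Paperback Trade Fiction': 'trade-fiction-paperback',
    'Paperback Nonfiction': 'paperback-nonfiction',
}

def build_list_url(date, list_name, api_key):
    """Return the request URL for one list on one date (hypothetical helper)."""
    return "https://api.nytimes.com/svc/books/v3/lists/{}/{}.json?api-key={}".format(
        date, list_name, api_key)

# 'current' asks for the latest list; a date such as '2020-06-14' asks for an older one.
for display_name, list_name in LIST_NAMES.items():
    print(display_name, '->', build_list_url('current', list_name, 'YOUR_API_KEY'))
```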

These lists are updated weekly; we’ll look at the lists of the current and the previous week for comparison.

In the first part of this code we'll create sets of titles for each category and week. In the second part we'll use set operations to get insights about the bestsellers. 

Things that we can find out:
- which books have stayed in the top 15 compared to the previous week? 
- which titles are newcomers?
- ...
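
The first two questions map directly onto set operators. A minimal sketch, assuming one set of titles per week and using made-up placeholder titles (not real bestseller data):

```python
# Placeholder titles for illustration only, not real API results.
last_week = {'BOOK A', 'BOOK B', 'BOOK C'}
this_week = {'BOOK B', 'BOOK C', 'BOOK D'}

stayed = this_week & last_week      # titles on both lists
newcomers = this_week - last_week   # titles only on this week's list

print(sorted(stayed))     # ['BOOK B', 'BOOK C']
print(sorted(newcomers))  # ['BOOK D']
```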

### Generating Sets

In [15]:
# import requests and json libraries
import requests
import json

# this function will make requests to the Books API
# and generate sets of bestsellers for different lists
# by passing 'date' as an argument, we can later call this function 
# several times for the lists of the current and the previous weeks
def generateSets(date):
    
    # if you want to play around with the API, please make your own key at https://developer.nytimes.com/
    authorized_key = "QftZeSssSfBqTSFet3RBaTE9inc3iWAw"
    # create list of the categories we want to access:
    categories = ['hardcover-fiction', 'hardcover-nonfiction', 'paperback-nonfiction', 'trade-fiction-paperback']
    
    """ This is an excerpt of the data structure the API will return:      
{(...)
 (...)
 'results': {(...)
     (...)
     'books': [{(...)
         (...)
         'title': 'LITTLE FIRES EVERYWHERE',
         'contributor': 'by Celeste Ng',
    
    """
    
    # our goal is to create a set for each of the above categories, 
    # containing the title of the top 15 books
    
    # step 1: 
    # declare a global variable
    global bestseller_titles 
    
    # create an empty, nested list (one list for each category)
    bestseller_titles = [[],[],[],[]] # they will hold information from the 'title' key 
    
    # step 2: 
    # populate those lists in a nested while loop:
                
    """ PSEUDO CODE: 
    
# iterate through the lists in 'bestseller_titles': 
n = 0
while n < number of lists in 'bestseller_titles' (4):
    call the api_url, and pass categories[n]
    get the response, and store as json 
    
    # access the 'books' key, define 'path' in json structure
    books = data['results']['books']
    # iterate through titles in 'books':
    j = 0
    while j < number of books in 'books':
        add books[j]['title'] to bestseller_titles[n]
        j += 1
        
    n += 1 
    
    """

    n = 0 # create variable 'n' to count
    while n < len(bestseller_titles): # for each empty list in 'bestseller_titles'
        # build the API URL
        # use string formatting to insert the date (e.g. 'current'), the category (with index 'n'), and the API key
        api_url = "https://api.nytimes.com/svc/books/v3/lists/{}/{}.json?api-key={}".format(date, categories[n], authorized_key)

        # call the API with requests
        response = requests.get(api_url)
        # create a variable called 'data' to hold the json formatted result
        data = response.json()

        # define the 'path' inside the json structure
        books = data['results']['books']
        
        # then iterate through 'titles' in 'books':
        j = 0 # create variable 'j' to count
        # while 'j' is smaller than the number of books
        while j < len(books):
            # add the title to the 'nth' list in 'bestseller_titles'
            bestseller_titles[n].append(books[j]['title'])
            j += 1 # count +1

        n += 1 # count +1
    
    # step 3:
    # print the populated lists as a sanity check
    print(bestseller_titles)
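
The nested `while` loops above can also be written more compactly. The sketch below is one possible alternative, not the version used in the rest of this notebook: `extract_titles` is a hypothetical helper that takes one already-parsed response and returns a set directly, and `sample_response` is a hand-written stand-in shaped like the excerpt shown in the function, so no request is sent.

```python
def extract_titles(data):
    """Return the set of titles found under data['results']['books']."""
    # A set comprehension replaces the counter-based inner while loop.
    return {book['title'] for book in data['results']['books']}

# Hand-written stand-in for one parsed API response (assumption:
# the real response has the same 'results' -> 'books' -> 'title' path).
sample_response = {
    'results': {
        'books': [
            {'title': 'LITTLE FIRES EVERYWHERE', 'contributor': 'by Celeste Ng'},
            {'title': 'CIRCE'},
        ]
    }
}

titles = extract_titles(sample_response)
print(sorted(titles))  # ['CIRCE', 'LITTLE FIRES EVERYWHERE']
```

Returning a value instead of writing to a global variable means the result of each call can be stored under its own name, so a second call does not overwrite the first.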

In [17]:
# call the generateSets() function 
# with 'date' = 'current' to receive this week's bestseller list
generateSets('current')
print(len(bestseller_titles))

4


In [18]:
# create a set from each nested list
hc_fiction_jun21 = set(bestseller_titles[0]) 
hc_nonfiction_jun21 = set(bestseller_titles[1])
pb_nonfiction_jun21 = set(bestseller_titles[2])
pb_fiction_jun21 = set(bestseller_titles[3])

print('Hardcover Fiction, June 21:\n', hc_fiction_jun21)
print('\nHardcover Nonfiction, June 21:\n', hc_nonfiction_jun21)
print('\nPaperback Nonfiction, June 21:\n', pb_nonfiction_jun21)
print('\nPaperback Fiction, June 21:\n', pb_fiction_jun21)

Hardcover Fiction, June 21:

Hardcover Nonfiction, June 21:
 {'PLAGUE OF CORRUPTION', 'EDUCATED', 'UNITED STATES OF SOCIALISM', 'MY VANISHING COUNTRY', 'THE MAMBA MENTALITY', 'TALKING TO STRANGERS', 'BREATH', 'UNTAMED', "THE DEVIANT'S WAR", 'HOW TO BE AN ANTIRACIST', 'THE SPLENDID AND THE VILE', 'ME AND WHITE SUPREMACY', 'BETWEEN THE WORLD AND ME', 'HUMANKIND', 'BECOMING'}

Paperback Nonfiction, June 21:
 {'RAISING WHITE KIDS', 'ELOQUENT RAGE', 'WHITE RAGE', 'JUST MERCY', 'BORN A CRIME', 'WAKING UP WHITE', "MY GRANDMOTHER'S HANDS", 'THE COLOR OF LAW', 'THE BODY KEEPS THE SCORE', 'STAMPED FROM THE BEGINNING', 'WHITE FRAGILITY', 'THE GREAT INFLUENZA', 'SO YOU WANT TO TALK ABOUT RACE', 'WHY ARE ALL THE BLACK KIDS SITTING TOGETHER IN THE CAFETERIA?', 'THE NEW JIM CROW'}

Paperback Fiction, June 21:
 {'THE BLUEST EYE', 'THE FAMILY UPSTAIRS', 'THE TATTOOIST OF AUSCHWITZ', 'BELOVED', 'AMERICANAH', 'HUSH', 'LITTLE FIRES EVERYWHERE', 'THEN SHE WAS GONE', 'CIRCE', 'THE NIGHTINGALE', 'THE WOMAN I

In [19]:
# call the generateSets() function again
# with 'date' = '2020-06-14' to receive last week's bestseller list
generateSets('2020-06-14')



In [20]:
# create a set from each nested list
hc_fiction_jun14 = set(bestseller_titles[0]) 
hc_nonfiction_jun14 = set(bestseller_titles[1]) 
pb_nonfiction_jun14 = set(bestseller_titles[2]) 
pb_fiction_jun14 = set(bestseller_titles[3]) 

print('Hardcover Fiction, June 14:\n', hc_fiction_jun14)
print('\nHardcover Nonfiction, June 14:\n', hc_nonfiction_jun14)
print('\nPaperback Nonfiction, June 14:\n', pb_nonfiction_jun14)
print('\nPaperback Fiction, June 14:\n', pb_fiction_jun14)

Hardcover Fiction, June 14:

Hardcover Nonfiction, June 14:
 {'PLAGUE OF CORRUPTION', 'EDUCATED', 'MY VANISHING COUNTRY', 'THE MAMBA MENTALITY', 'HIDDEN VALLEY ROAD', 'BREATH', 'UNTAMED', 'AMERICAN CRUSADE', 'HOW TO BE AN ANTIRACIST', 'HOLLYWOOD PARK', 'THE SPLENDID AND THE VILE', 'ME AND WHITE SUPREMACY', 'THE CHIFFON TRENCHES', 'FORTITUDE', 'BECOMING'}

Paperback Nonfiction, June 14:
 {'GRIT', 'JUST MERCY', 'BORN A CRIME', 'SAPIENS', 'OUTLIERS', 'THE COLOR OF LAW', 'THE BODY KEEPS THE SCORE', 'A WOMAN OF NO IMPORTANCE', 'UNORTHODOX', 'WHITE FRAGILITY', 'THE GREAT INFLUENZA', 'SO YOU WANT TO TALK ABOUT RACE', 'BRAIDING SWEETGRASS', 'THINKING, FAST AND SLOW', 'THE NEW JIM CROW'}

Paperback Fiction, June 14:
 {'THE TATTOOIST OF AUSCHWITZ', 'CALL ME BY YOUR NAME', 'CITY OF GIRLS', 'LITTLE FIRES EVERYWHERE', 'THE OVERSTORY', 'THEN SHE WAS GONE', 'BEACH READ', 'CIRCE', 'THE NIGHTINGALE', 'THE WOMAN IN THE WINDOW', 'A GENTLEMAN IN MOSCOW', 'THIS TENDER LAND', 'BEFORE WE WERE YOURS', 'THE BO

### Set Operations

Now that we have declared multiple sets of books, let's make use of set operations to get insights about the bestsellers.

In [11]:
# create an intersection function to find books that show up in both sets
def intersection(A , B): 
    inter = set(A) & set(B)
    print('A & B\nFollowing books match your criteria:\n{}\n'.format(inter))

# call the function
# show paperback nonfiction titles that were on both this week's and last week's bestseller lists
intersection(pb_nonfiction_jun21, pb_nonfiction_jun14)

A & B
Following books match your criteria:
{'JUST MERCY', 'BORN A CRIME', 'THE COLOR OF LAW', 'THE BODY KEEPS THE SCORE', 'WHITE FRAGILITY', 'THE GREAT INFLUENZA', 'SO YOU WANT TO TALK ABOUT RACE', 'THE NEW JIM CROW'}



In [12]:
# create a difference function
def difference(A , B): 
    diff = set(A) - set(B)
    print('A - B\nFollowing books match your criteria:\n{}\n'.format(diff))

# call the function
# show this week's newcomers in the paperback nonfiction category
difference(pb_nonfiction_jun21, pb_nonfiction_jun14)

A - B
Following books match your criteria:
{'ELOQUENT RAGE', 'WHITE RAGE', "MY GRANDMOTHER'S HANDS", 'WAKING UP WHITE', 'WHY ARE ALL THE BLACK KIDS SITTING TOGETHER IN THE CAFETERIA?', 'STAMPED FROM THE BEGINNING', 'RAISING WHITE KIDS'}
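
Note that the difference is not symmetric: swapping the arguments answers a different question. The sketch below uses a shortened, hand-copied excerpt of the two paperback nonfiction lists above (four titles each rather than the full 15) to show the titles that dropped off the list, and the symmetric difference `^`, which gives everything that changed in either direction.

```python
# Shortened excerpts of the two lists above, for illustration only.
this_week = {'JUST MERCY', 'BORN A CRIME', 'WHITE RAGE', 'ELOQUENT RAGE'}
last_week = {'JUST MERCY', 'BORN A CRIME', 'GRIT', 'SAPIENS'}

newcomers = this_week - last_week   # on this week's list only
dropped = last_week - this_week     # on last week's list only
changed = this_week ^ last_week     # on exactly one of the two lists

print(sorted(newcomers))  # ['ELOQUENT RAGE', 'WHITE RAGE']
print(sorted(dropped))    # ['GRIT', 'SAPIENS']
print(changed == (newcomers | dropped))  # True
```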



In [13]:
# create a union function to show two categories combined
def union(A , B): 
    combined = set(A) | set(B)
    print('A | B\nFollowing books match your criteria:\n{}\n'.format(combined))

# call the function
# show paperback nonfiction titles of this and last week combined
union(pb_nonfiction_jun21, pb_nonfiction_jun14)

A | B
Following books match your criteria:
{'GRIT', 'JUST MERCY', 'BORN A CRIME', 'THE COLOR OF LAW', 'UNORTHODOX', 'WHITE FRAGILITY', 'A WOMAN OF NO IMPORTANCE', 'THE GREAT INFLUENZA', 'RAISING WHITE KIDS', 'BRAIDING SWEETGRASS', 'THE NEW JIM CROW', 'ELOQUENT RAGE', 'WHITE RAGE', 'SAPIENS', "MY GRANDMOTHER'S HANDS", 'OUTLIERS', 'THE BODY KEEPS THE SCORE', 'SO YOU WANT TO TALK ABOUT RACE', 'WAKING UP WHITE', 'STAMPED FROM THE BEGINNING', 'WHY ARE ALL THE BLACK KIDS SITTING TOGETHER IN THE CAFETERIA?', 'THINKING, FAST AND SLOW'}
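
A quick way to sanity-check a union is to count: the combined set above has 22 titles, which is the 15 titles from each week minus the 8 titles that appear on both lists and would otherwise be counted twice. A small sketch of that relationship, using placeholder titles rather than the real lists:

```python
# Placeholder titles for illustration only.
this_week = {'BOOK B', 'BOOK C', 'BOOK D'}
last_week = {'BOOK A', 'BOOK B', 'BOOK C'}

both = this_week & last_week
combined = this_week | last_week

# Titles on both lists appear only once in the union.
print(len(this_week), len(last_week), len(both), len(combined))  # 3 3 2 4
print(len(combined) == len(this_week) + len(last_week) - len(both))  # True
```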



In [14]:
# Show ALL nonfiction bestsellers, current and last week combined
all_nonfiction = pb_nonfiction_jun21 | pb_nonfiction_jun14 | hc_nonfiction_jun21 | hc_nonfiction_jun14
print(all_nonfiction)

{'PLAGUE OF CORRUPTION', 'UNITED STATES OF SOCIALISM', 'JUST MERCY', 'UNTAMED', 'AMERICAN CRUSADE', 'THE COLOR OF LAW', "THE DEVIANT'S WAR", 'HOW TO BE AN ANTIRACIST', 'HUMANKIND', 'RAISING WHITE KIDS', 'BRAIDING SWEETGRASS', 'FORTITUDE', 'BECOMING', 'THE MAMBA MENTALITY', 'HIDDEN VALLEY ROAD', 'SAPIENS', 'WAKING UP WHITE', "MY GRANDMOTHER'S HANDS", 'OUTLIERS', 'THE BODY KEEPS THE SCORE', 'STAMPED FROM THE BEGINNING', 'UNORTHODOX', 'WHY ARE ALL THE BLACK KIDS SITTING TOGETHER IN THE CAFETERIA?', 'GRIT', 'EDUCATED', 'MY VANISHING COUNTRY', 'BORN A CRIME', 'THE CHIFFON TRENCHES', 'WHITE FRAGILITY', 'A WOMAN OF NO IMPORTANCE', 'THE GREAT INFLUENZA', 'THE SPLENDID AND THE VILE', 'THE NEW JIM CROW', 'ELOQUENT RAGE', 'WHITE RAGE', 'BREATH', 'HOLLYWOOD PARK', 'TALKING TO STRANGERS', 'ME AND WHITE SUPREMACY', 'SO YOU WANT TO TALK ABOUT RACE', 'BETWEEN THE WORLD AND ME', 'THINKING, FAST AND SLOW'}
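
Chaining `|` works, but it gets unwieldy as the number of sets grows. As a sketch, `set().union(*sets)` combines any number of sets held in a list; the placeholder sets below stand in for the four nonfiction sets used above.

```python
# Placeholder sets standing in for the four nonfiction sets above.
nonfiction_sets = [
    {'BOOK A', 'BOOK B'},
    {'BOOK B', 'BOOK C'},
    {'BOOK C', 'BOOK D'},
    {'BOOK A', 'BOOK D'},
]

# set().union(...) accepts any number of iterables and returns a new set.
all_titles = set().union(*nonfiction_sets)
print(sorted(all_titles))  # ['BOOK A', 'BOOK B', 'BOOK C', 'BOOK D']
```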
