## Live Code: Python Sets

In this example we'll work with call results from the [New York Times Books API](https://developer.nytimes.com/docs/books-product/1/overview) to demonstrate the use of __set operations__ (stay tuned for week 6 to learn more about APIs). The Books API provides information about The New York Times bestseller lists. We'll look at the following categories, which are updated weekly:
* Hardcover Fiction
* Hardcover Nonfiction
* Paperback Trade Fiction
* Paperback Nonfiction

In the first part of this code we'll create sets of titles for each category and week. In the second part we'll use set operations to get insights about the bestsellers: Which books have stayed in the top 15 compared to the previous week? Which titles are newcomers?

### Generating Sets

In [None]:
# import requests and json libraries - needed to make an API call
import requests
import json

# this function will make requests to the nytimes Books API
def generateSets(date):
    
    authorized_key = "QftZeSssSfBqTSFet3RBaTE9inc3iWAw" # if you want to play around with the API, please make your own key at https://developer.nytimes.com/
    # to make a request with this API, we need to choose the specific bestseller lists we want to access 
    categories = ['hardcover-fiction', 'hardcover-nonfiction', 'paperback-nonfiction', 'trade-fiction-paperback']
    
    # our goal is to create a set for each of these categories, containing the respective books' titles and authors
    
    # step 1: 
    # set two global variables
    global bestseller_titles # this one will be populated with information from the 'title' key 
    global bestseller_authors # this one will be populated with information from the 'author' key 
    # initialize them as nested lists (one inner list for each category)
    bestseller_titles = [[],[],[],[]] 
    bestseller_authors = [[],[],[],[]] 
    
    # step 2: 
    # populate those lists in a while loop:
    n = 0 # create variable 'n' to count
    while n < len(bestseller_titles): # while 'n' is smaller than number of empty lists in 'bestseller_titles'
        # call the API-url
        # use str.format() to insert the date (e.g. 'current'), the category (at index 'n'), and the API key
        api_url = "https://api.nytimes.com/svc/books/v3/lists/{}/{}.json?api-key={}".format(date, categories[n], authorized_key)

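        # call the API with requests
        response = requests.get(api_url)
        # stop with a clear error if the request failed (e.g. invalid key or rate limit)
        response.raise_for_status()
        # create a variable called 'data' to hold the JSON-formatted result
        data = response.json()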
        
        """ This is an excerpt of the data structure the API returns:      
{(...)
 (...)
 'results': {(...)
     (...)
     'books': [{(...)
         (...)
         'title': 'LITTLE FIRES EVERYWHERE',
         'contributor': 'by Celeste Ng',
        
        """

        # we want to access the information stored in the key, 'results'
        # 'results' maps to a dictionary that contains the 'books' list
        # define the 'path'
        books = data['results']['books']
        
        # then collect the 'title' and 'contributor' of each entry in 'books':
        j = 0 # create variable 'j' to count
        # while 'j' is smaller than the number of books
        while j < len(books):
            # add the title to the 'nth' list in 'bestseller_titles'
            bestseller_titles[n].append(books[j]['title'])
            # add the contributor to the 'nth' list in 'bestseller_authors'
            bestseller_authors[n].append(books[j]['contributor'])
            j += 1 # count +1

        n += 1 # count +1
    
    # step 3:
    # print the populated lists as a sanity check
    print(bestseller_titles)
    print('\n', bestseller_authors)

In [None]:
# call the generateSets() function 
# with 'date' = 'current' to receive this week's bestseller list
generateSets('current')

In [None]:
# create a set from each nested list
# concatenate each title with its author, using a list comprehension and the zip() function
# -> set([template for item1, item2 in zip(list1, list2)])
hc_fiction_jun21 = set([i + ' ' + j for i, j in zip(bestseller_titles[0], bestseller_authors[0])]) 
hc_nonfiction_jun21 = set([i + ' ' + j for i, j in zip(bestseller_titles[1], bestseller_authors[1])])
pb_nonfiction_jun21 = set([i + ' ' + j for i, j in zip(bestseller_titles[2], bestseller_authors[2])])
pb_fiction_jun21 = set([i + ' ' + j for i, j in zip(bestseller_titles[3], bestseller_authors[3])])

print('Hardcover Fiction, June 21:\n', hc_fiction_jun21)
print('\nHardcover Nonfiction, June 21:\n', hc_nonfiction_jun21)
print('\nPaperback Nonfiction, June 21:\n', pb_nonfiction_jun21)
print('\nPaperback Fiction, June 21:\n', pb_fiction_jun21)
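
The list comprehension wrapped in `set()` above can also be written as a set comprehension, which skips the intermediate list. A minimal sketch, assuming made-up placeholder data (`titles`, `authors`, and `entries` are hypothetical names, not actual API results):

```python
# Hypothetical sample data standing in for one category's API results
titles = ['BOOK A', 'BOOK B', 'BOOK A']
authors = ['by Author One', 'by Author Two', 'by Author One']

# zip() pairs each title with the contributor at the same index;
# the curly braces build a set directly, so duplicates collapse
entries = {title + ' ' + author for title, author in zip(titles, authors)}

print(len(entries))     # 2
print(sorted(entries))  # ['BOOK A by Author One', 'BOOK B by Author Two']
```

Because a set keeps only unique elements, the repeated 'BOOK A' entry appears once in the result.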

In [None]:
# call the generateSets() function again
# with 'date' = '2020-06-14' to receive last week's bestseller list
generateSets('2020-06-14')

In [None]:
# create a set from each nested list
# concatenate each title with its author, using a list comprehension and the zip() function
# -> set([template for item1, item2 in zip(list1, list2)])
hc_fiction_jun14 = set([i + ' ' + j for i, j in zip(bestseller_titles[0], bestseller_authors[0])]) 
hc_nonfiction_jun14 = set([i + ' ' + j for i, j in zip(bestseller_titles[1], bestseller_authors[1])])
pb_nonfiction_jun14 = set([i + ' ' + j for i, j in zip(bestseller_titles[2], bestseller_authors[2])])
pb_fiction_jun14 = set([i + ' ' + j for i, j in zip(bestseller_titles[3], bestseller_authors[3])])

print('Hardcover Fiction, June 14:\n', hc_fiction_jun14)
print('\nHardcover Nonfiction, June 14:\n', hc_nonfiction_jun14)
print('\nPaperback Nonfiction, June 14:\n', pb_nonfiction_jun14)
print('\nPaperback Fiction, June 14:\n', pb_fiction_jun14)

### Set Operations

Now that we have declared multiple sets of books, let's make use of set operations to get insights about the bestsellers.

In [None]:
# create an intersection function to test if a book shows up in two categories
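
One possible way to fill in this cell, sketched with made-up placeholder entries rather than real bestseller data. The helper name `in_both` and the `*_sample` sets are assumptions for illustration only:

```python
def in_both(set_a, set_b):
    """Return the entries that appear in both sets (hypothetical helper)."""
    return set_a & set_b  # equivalent to set_a.intersection(set_b)

# Made-up placeholder entries, not real bestseller data
hc_fiction_sample = {'BOOK A by Author One', 'BOOK B by Author Two'}
pb_fiction_sample = {'BOOK B by Author Two', 'BOOK C by Author Three'}

shared = in_both(hc_fiction_sample, pb_fiction_sample)
print(shared)  # {'BOOK B by Author Two'}
```

With the sets created earlier, a call such as `in_both(hc_fiction_jun21, pb_fiction_jun21)` would list books that appear in both fiction categories, and `in_both(hc_fiction_jun21, hc_fiction_jun14)` would list the books that stayed on the list from one week to the next.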

In [None]:
# create a difference function
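
A difference answers the "newcomers" question from the introduction: which entries are in one set but not in the other. A minimal sketch, assuming made-up placeholder entries (`only_in_first` and the `*_sample` sets are hypothetical names):

```python
def only_in_first(set_a, set_b):
    """Return the entries of set_a that are missing from set_b (hypothetical helper)."""
    return set_a - set_b  # equivalent to set_a.difference(set_b)

# Made-up placeholder entries, not real bestseller data
this_week_sample = {'BOOK A by Author One', 'BOOK B by Author Two', 'BOOK D by Author Four'}
last_week_sample = {'BOOK A by Author One', 'BOOK B by Author Two', 'BOOK C by Author Three'}

# Order matters: swapping the arguments answers a different question
newcomers = only_in_first(this_week_sample, last_week_sample)
dropped_out = only_in_first(last_week_sample, this_week_sample)

print(newcomers)    # {'BOOK D by Author Four'}
print(dropped_out)  # {'BOOK C by Author Three'}
```

Unlike intersection and union, difference is not symmetric, so `only_in_first(hc_fiction_jun21, hc_fiction_jun14)` would give this week's newcomers, while the reversed call would give the titles that left the list.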

In [None]:
# create a union function to show two categories combined
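
A union merges two sets and keeps each entry only once. The following is a sketch under the assumption of made-up placeholder entries; `combined` and the `*_sample` sets are hypothetical names:

```python
def combined(set_a, set_b):
    """Return every entry that appears in at least one of the sets (hypothetical helper)."""
    return set_a | set_b  # equivalent to set_a.union(set_b)

# Made-up placeholder entries, not real bestseller data
hc_nonfiction_sample = {'BOOK A by Author One', 'BOOK B by Author Two'}
pb_nonfiction_sample = {'BOOK B by Author Two', 'BOOK C by Author Three'}

all_nonfiction_sample = combined(hc_nonfiction_sample, pb_nonfiction_sample)

print(len(all_nonfiction_sample))  # 3
print(sorted(all_nonfiction_sample))
```

The entry shared by both sample sets is counted once, which is why the result has three elements rather than four.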

In [None]:
# Show ALL nonfiction bestsellers, current and last week combined
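
Combining more than two sets works the same way, because `union()` accepts any number of arguments. A minimal sketch with made-up placeholder entries standing in for the four nonfiction sets (all `*_sample` names are hypothetical):

```python
# Made-up placeholder entries, not real bestseller data
hc_nonfiction_now_sample = {'BOOK A by Author One', 'BOOK B by Author Two'}
pb_nonfiction_now_sample = {'BOOK C by Author Three'}
hc_nonfiction_prev_sample = {'BOOK A by Author One', 'BOOK D by Author Four'}
pb_nonfiction_prev_sample = {'BOOK C by Author Three', 'BOOK E by Author Five'}

# union() takes several sets at once; repeated entries are kept only once
all_nonfiction = hc_nonfiction_now_sample.union(
    pb_nonfiction_now_sample,
    hc_nonfiction_prev_sample,
    pb_nonfiction_prev_sample,
)

print(len(all_nonfiction))  # 5
for entry in sorted(all_nonfiction):
    print(entry)
```

With the real data, the same call on `hc_nonfiction_jun21`, `pb_nonfiction_jun21`, `hc_nonfiction_jun14`, and `pb_nonfiction_jun14` would show every nonfiction bestseller from both weeks.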