# CSV + API

In this reboot, we are going to use:

- The [Goodreads books](https://www.kaggle.com/jealousleopard/goodreadsbooks) dataset from Kaggle.
- The [Open Library Books API](https://openlibrary.org/dev/docs/api/books)

The goal of this reboot is to load the data from a CSV, then loop over the rows to enrich each one with information such as:

- List of subjects (Science, Humor, Travel, etc.)
- The cover URL of the book
- Other information you'd find useful in the JSON API

First, download the CSV into the local folder:

In [1]:
!curl -L https://gist.githubusercontent.com/ssaunier/351b17f5a7a009808b60aeacd1f4a036/raw/books.csv > books.csv

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1509k  100 1509k    0     0  2779k      0 --:--:-- --:--:-- --:--:-- 2836k


In [2]:
!ls -lh

total 3056
-rw-r--r--  1 viana.abreu  staff   579B Oct  7 09:59 README.md
-rw-r--r--  1 viana.abreu  staff   2.8K Oct  7 09:59 Recap.ipynb
-rw-r--r--  1 viana.abreu  staff   1.5M Oct 29 14:11 books.csv
drwx------  5 viana.abreu  staff   160B Oct 29 14:11 solution_02-Data-Toolkit_02-Data-Sourcing_Recap
-rw-r--r--@ 1 viana.abreu  staff   6.2K Oct 29 14:10 solution_02-Data-Toolkit_02-Data-Sourcing_Recap.zip


Then import the usual suspects!

In [3]:
import requests
import pandas as pd
import numpy as np

## Load books from CSV

In [80]:
# YOUR CODE HERE
books_df = pd.read_csv('books.csv')[['isbn13','title','authors']]
books_df.head()

Unnamed: 0,isbn13,title,authors
0,9780439785969,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré
1,9780439358071,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré
2,9780439554930,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré
3,9780439554893,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling
4,9780439655484,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré


Let's add a new, empty column to hold the cover URL:

In [26]:
# YOUR CODE HERE
books_df['cover_url'] = ''
books_df.head()

Unnamed: 0,isbn,title,authors,cover_url
0,0439785960,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,
1,0439358078,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,
2,0439554934,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,
3,0439554896,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,
4,043965548X,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,
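
The ISBN-10 values above start with `0`, and one ends in `X`, so they are safest stored as strings. A minimal sketch, assuming the CSV has an `isbn` column as in the table above; the two-row CSV below is hand-written sample data, not the real file:

```python
import io

import pandas as pd

# Hand-written sample standing in for books.csv (illustration only)
sample_csv = io.StringIO("isbn,title\n0439785960,Book A\n0439358078,Book B\n")

# dtype=str keeps the leading zero; default type inference would parse
# these all-digit values as integers and drop it.
df = pd.read_csv(sample_csv, dtype={"isbn": str})
print(df["isbn"].tolist())
```

The same `dtype` argument could be passed when reading `books.csv`, if the ISBN-10 column is the one being used as the lookup key.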


## API - Open Library

In [14]:
# YOUR CODE HERE
# http://openlibrary.org/api/books?

url = 'http://openlibrary.org/api/books?'
params = {
    'bibkeys':'ISBN:0451526538',
    'format':'json',
    'jscmd':'data'
}

response = requests.get(url,params=params)
response.status_code

200

In [36]:
# Same lookup as response.json()['ISBN:0451526538']['cover']['small'],
# but each .get falls back to a default when a key is missing
response.json().get('ISBN:0451526538',{}).get('cover',{}).get('small','')

''
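
The chained `.get` calls above fall back to a default at each level, so a missing key yields `''` rather than raising a `KeyError`. A minimal sketch on hand-written dictionaries (illustrative sample data, not real API responses; `small_cover` is a hypothetical helper name):

```python
def small_cover(payload, bibkey):
    """Return the small cover URL for bibkey, or '' when any level is missing."""
    return payload.get(bibkey, {}).get("cover", {}).get("small", "")


# Hand-written sample data for illustration only
with_cover = {"ISBN:123": {"cover": {"small": "https://example.com/s.jpg"}}}
without_cover = {"ISBN:123": {"title": "No cover here"}}

print(small_cover(with_cover, "ISBN:123"))     # the URL
print(small_cover(without_cover, "ISBN:123"))  # empty string: no 'cover' key
print(small_cover({}, "ISBN:123"))             # empty string: no entry at all
```

An empty result therefore means either that the ISBN is unknown to the API or that the entry has no cover of that size.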

In [56]:
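def get_cover(isbn_):
    """Return the small cover URL for one ISBN, or '' if the request fails or no cover exists."""
    str_isbn = f'ISBN:{isbn_}'
    url = 'http://openlibrary.org/api/books?'
    params = {
        'bibkeys': str_isbn,
        'format': 'json',
        'jscmd': 'data'
    }
    response = requests.get(url, params=params)
    if response.status_code != 200:
        return ''
    # Each .get falls back to a default, so a missing key yields '' instead of a KeyError
    return response.json().get(str_isbn, {}).get('cover', {}).get('small', '')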

In [57]:
get_cover('0439785960')

'https://covers.openlibrary.org/b/id/9326654-S.jpg'

## Calling the API with multiple ISBNs at a time

In [75]:
# YOUR CODE HERE
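def get_multiple_cover(isbn_list):
    """Fetch Open Library data for several ISBNs in one request; return {} on failure."""
    # Build the comma-separated bibkeys string, e.g. 'ISBN:111,ISBN:222'
    str_isbn = ','.join(f'ISBN:{isbn_}' for isbn_ in isbn_list)
    print(str_isbn)
    url = 'http://openlibrary.org/api/books?'
    params = {
        'bibkeys': str_isbn,
        'format': 'json',
        'jscmd': 'data'
    }
    response = requests.get(url, params=params)
    if response.status_code != 200:
        # Return an empty dict so callers can still iterate with .items()
        return {}
    # The JSON maps each 'ISBN:...' key to that book's data (cover, subjects, etc.)
    return response.json()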

In [73]:
my_list = ['0439785960','0439358078','0439554896']
print(get_multiple_cover(my_list))

{'ISBN:0439785960': {'url': 'http://openlibrary.org/books/OL24280830M/Harry_Potter_and_the_Half-Blood_Prince', 'key': '/books/OL24280830M', 'title': 'Harry Potter and the Half-Blood Prince', 'authors': [{'url': 'http://openlibrary.org/authors/OL23919A/J._K._Rowling', 'name': 'J. K. Rowling'}], 'identifiers': {'amazon': ['0439785960'], 'goodreads': ['53178655'], 'isbn_10': ['0439785960'], 'isbn_13': ['9780439785969'], 'oclc': ['70666878', '819153929'], 'openlibrary': ['OL24280830M']}, 'classifications': {'lc_classifications': ['PZ7.R79835Halc 2005']}, 'publishers': [{'name': 'Scholastic'}], 'publish_places': [{'name': 'New York, USA'}], 'publish_date': '2006-09', 'subjects': [{'name': 'orphans', 'url': 'https://openlibrary.org/subjects/orphans'}, {'name': 'foster homes', 'url': 'https://openlibrary.org/subjects/foster_homes'}, {'name': 'romans', 'url': 'https://openlibrary.org/subjects/romans'}, {'name': 'magie', 'url': 'https://openlibrary.org/subjects/magie'}, {'name': 'adolescence', 

In [81]:
books_df.set_index('isbn13', inplace=True)

In [82]:
books_df.head()

isbn13,title,authors
9780439785969,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré
9780439358071,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré
9780439554930,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré
9780439554893,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling
9780439655484,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré


In [83]:
get_multiple_cover(list(books_df.head(5).index))

ISBN:9780439785969,ISBN:9780439358071,ISBN:9780439554930,ISBN:9780439554893,ISBN:9780439655484


{'ISBN:9780439785969': {'url': 'http://openlibrary.org/books/OL24280830M/Harry_Potter_and_the_Half-Blood_Prince',
  'key': '/books/OL24280830M',
  'title': 'Harry Potter and the Half-Blood Prince',
  'authors': [{'url': 'http://openlibrary.org/authors/OL23919A/J._K._Rowling',
    'name': 'J. K. Rowling'}],
  'identifiers': {'amazon': ['0439785960'],
   'goodreads': ['53178655'],
   'isbn_10': ['0439785960'],
   'isbn_13': ['9780439785969'],
   'oclc': ['70666878', '819153929'],
   'openlibrary': ['OL24280830M']},
  'classifications': {'lc_classifications': ['PZ7.R79835Halc 2005']},
  'publishers': [{'name': 'Scholastic'}],
  'publish_places': [{'name': 'New York, USA'}],
  'publish_date': '2006-09',
  'subjects': [{'name': 'orphans',
    'url': 'https://openlibrary.org/subjects/orphans'},
   {'name': 'foster homes',
    'url': 'https://openlibrary.org/subjects/foster_homes'},
   {'name': 'romans', 'url': 'https://openlibrary.org/subjects/romans'},
   {'name': 'magie', 'url': 'https://o
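
The `subjects` field in the output above is a list of dictionaries, each with a `name` and a `url`. A minimal sketch for flattening it into a plain list of names, assuming every book follows that shape (`subject_names` is a hypothetical helper, and the sample dictionary is a hand-trimmed copy of the response above):

```python
def subject_names(book):
    """Return the subject names of one book entry, or [] when none are listed."""
    return [s["name"] for s in book.get("subjects", []) if "name" in s]


# Hand-trimmed from the response shown above (illustration only)
sample_book = {
    "title": "Harry Potter and the Half-Blood Prince",
    "subjects": [
        {"name": "orphans", "url": "https://openlibrary.org/subjects/orphans"},
        {"name": "foster homes", "url": "https://openlibrary.org/subjects/foster_homes"},
    ],
}

print(subject_names(sample_book))
print(subject_names({"title": "No subjects"}))
```

The resulting list could be joined into a single string before being stored in a DataFrame column, if a flat value is preferred.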

In [84]:
books = get_multiple_cover(list(books_df.head(20).index))
for isbn_code, book in books.items():
    # Drop the 'ISBN:' prefix; str.strip removes a set of characters, not a prefix
    isbn = int(isbn_code.split(':', 1)[1])
    books_df.loc[isbn, "cover_url"] = book.get("cover", {}).get("large", "")

ISBN:9780439785969,ISBN:9780439358071,ISBN:9780439554930,ISBN:9780439554893,ISBN:9780439655484,ISBN:9780439682589,ISBN:9780976540601,ISBN:9780439827607,ISBN:9780517226957,ISBN:9780345453747,ISBN:9781400052929,ISBN:9780739322208,ISBN:9780517149256,ISBN:9780767908184,ISBN:9780767915069,ISBN:9780767910439,ISBN:9780767903868,ISBN:9780767903820,ISBN:9780060920081,ISBN:9780380713806


In [85]:
books_df

isbn13,title,authors,cover_url
9780439785969,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,https://covers.openlibrary.org/b/id/9326654-L.jpg
9780439358071,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,https://covers.openlibrary.org/b/id/12025650-L...
9780439554930,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,https://covers.openlibrary.org/b/id/7572543-L.jpg
9780439554893,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,https://covers.openlibrary.org/b/id/10301720-L...
9780439655484,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,https://covers.openlibrary.org/b/id/10580458-L...
...,...,...,...
9780061186424,M Is for Magic,Neil Gaiman-Teddy Kristiansen,
9780930289553,Black Orchid,Neil Gaiman-Dave McKean,
9780061238963,InterWorld (InterWorld #1),Neil Gaiman-Michael Reaves,
9780743201117,The Faeries' Oracle,Brian Froud-Jessica Macbeth,
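
Only the first 20 rows were enriched above, which is why the last rows have no cover URL. To cover more rows, one option is to split the index into fixed-size batches and call `get_multiple_cover` once per batch. A minimal sketch of the batching step only; `chunked` is a hypothetical helper, and the batch size is an arbitrary choice for the demo, not a documented API limit:

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


# First five ISBN-13 values from the table above
isbns = [9780439785969, 9780439358071, 9780439554930, 9780439554893, 9780439655484]
for batch in chunked(isbns, 2):
    print(batch)
```

In the notebook, each batch would be passed to `get_multiple_cover`, ideally with a short pause between requests to stay polite to the API.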
