# CSV + API

In this reboot, we are going to use:

- The [Goodreads books](https://www.kaggle.com/jealousleopard/goodreadsbooks) dataset from Kaggle.
- The [Open Library Books API](https://openlibrary.org/dev/docs/api/books)

The goal of this livecode is to load the data from a CSV file, then loop over its rows to enrich each one with information such as:

- List of subjects (Science, Humor, Travel, etc.)
- The cover URL of the book
- Other information you'd find useful in the JSON API
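To make the enrichment targets above concrete, here is a minimal sketch that pulls the subject names and the cover URL out of a single book record. The sample record is hand-written, and its shape (a `subjects` list of objects with a `name` key, and a `cover` object with a `large` key) is an assumption about what the API returns with `jscmd=data`. `extract_enrichment` is a hypothetical helper, not part of the livecode.

```python
def extract_enrichment(book):
    """Return the fields we want to add to a row, given one book record (a dict)."""
    # Assumed shape: "subjects" is a list of {"name": ...} dicts
    subjects = [s.get("name") for s in book.get("subjects", []) if s.get("name")]
    # Assumed shape: "cover" is a dict with "small" / "medium" / "large" keys
    cover_url = book.get("cover", {}).get("large")
    return {"subjects": subjects, "cover_url": cover_url}


# Hand-written sample record, for illustration only
sample_book = {
    "title": "Example Book",
    "subjects": [{"name": "Science"}, {"name": "Humor"}],
    "cover": {"large": "https://example.org/cover-L.jpg"},
}

enriched = extract_enrichment(sample_book)
print(enriched)
```

Using `.get()` with defaults at each level means a record without subjects or without a cover does not raise a `KeyError`.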

First, download the CSV into the local folder:

In [1]:
!curl -L https://gist.githubusercontent.com/ssaunier/351b17f5a7a009808b60aeacd1f4a036/raw/books.csv > books.csv

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1509k  100 1509k    0     0  2784k      0 --:--:-- --:--:-- --:--:-- 2779k


In [2]:
!ls -lh

total 4712
-rw-r--r--  1 luciengeorge  staff   580B Oct 13 10:59 README.md
-rw-r--r--  1 luciengeorge  staff   1.5M Oct 13 18:19 books.csv
-rw-r--r--  1 luciengeorge  staff    58K Oct 13 18:16 reboot.ipynb
-rw-r--r--  1 luciengeorge  staff   179K Oct 13 12:48 reboot_solution.ipynb


Then import the usual suspects!

In [3]:
import requests
import pandas as pd
import numpy as np

### Load the CSV into a DataFrame

In [4]:
books_df = pd.read_csv('books.csv')

In [5]:
books_df.head(15)

,bookID,title,authors,average_rating,isbn,isbn13,language_code,# num_pages,ratings_count,text_reviews_count
0,1,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,4.56,0439785960,9780439785969,eng,652,1944099,26249
1,2,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,4.49,0439358078,9780439358071,eng,870,1996446,27613
2,3,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,4.47,0439554934,9780439554930,eng,320,5629932,70390
3,4,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,4.41,0439554896,9780439554893,eng,352,6267,272
4,5,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,4.55,043965548X,9780439655484,eng,435,2149872,33964
5,8,Harry Potter Boxed Set Books 1-5 (Harry Potte...,J.K. Rowling-Mary GrandPré,4.78,0439682584,9780439682589,eng,2690,38872,154
6,9,"Unauthorized Harry Potter Book Seven News: ""Ha...",W. Frederick Zimmerman,3.69,0976540606,9780976540601,en-US,152,18,1
7,10,Harry Potter Collection (Harry Potter #1-6),J.K. Rowling,4.73,0439827604,9780439827607,eng,3342,27410,820
8,12,The Ultimate Hitchhiker's Guide: Five Complete...,Douglas Adams,4.38,0517226952,9780517226957,eng,815,3602,258
9,13,The Ultimate Hitchhiker's Guide to the Galaxy,Douglas Adams,4.38,0345453743,9780345453747,eng,815,240189,3954


In [6]:
books_df.shape

(13719, 10)

In [7]:
books_df.dtypes

bookID                  int64
title                  object
authors                object
average_rating        float64
isbn                   object
isbn13                  int64
language_code          object
# num_pages             int64
ratings_count           int64
text_reviews_count      int64
dtype: object
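The dtypes above show `isbn` parsed as `object` (presumably because some ISBN-10 values end in `X`) and `isbn13` as `int64`. Identifiers are often safer kept as strings, since an integer parse drops leading zeros. A small sketch of the difference, using a made-up two-row inline CSV rather than `books.csv`:

```python
import io

import pandas as pd

# Inline sample data, for illustration only
csv_text = (
    "title,isbn,isbn13\n"
    "Book A,0439785960,9780439785969\n"
    "Book B,0517226952,9780517226957\n"
)

# Default parsing: all-digit columns become integers, so leading zeros are lost
default_df = pd.read_csv(io.StringIO(csv_text))

# Explicit dtype: the identifiers are kept exactly as written in the file
string_df = pd.read_csv(io.StringIO(csv_text), dtype={"isbn": str, "isbn13": str})

print(default_df["isbn"].tolist())
print(string_df["isbn"].tolist())
```

The rest of this notebook relies on `isbn13` being an integer (it is later used as the index and looked up with `int(isbn)`), so switching to strings would mean adapting those cells too.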

### Fetch `cover_url` from the Open Library Books API

In [8]:
def fetch_book_cover_url(isbn):
    """Return the large cover URL for an ISBN, or 'N/A' if there is none."""
    url = "https://openlibrary.org/api/books"
    response = requests.get(url, params={
        'bibkeys': f"ISBN:{isbn}",
        'format': 'json',
        'jscmd': 'data'
    })
    data = response.json()
    # The book is stored under the "ISBN:<isbn>" key; fall back to an empty
    # dict when the key is missing so the chained .get() calls stay safe
    book = data.get(f"ISBN:{isbn}", {})
    return book.get('cover', {}).get('large', 'N/A')
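The parsing step of `fetch_book_cover_url` can be separated from the HTTP call so that it can be checked without network access. The sketch below does that with a hypothetical helper, `extract_cover_url`, and a hand-written payload that mimics the assumed response shape (a dict keyed by `ISBN:<isbn>`).

```python
def extract_cover_url(data, isbn, size="large", default="N/A"):
    """Look up the cover URL for `isbn` in an already-decoded JSON payload."""
    book = data.get(f"ISBN:{isbn}", {})
    return book.get("cover", {}).get(size, default)


# Hand-written payload, for illustration only
sample = {
    "ISBN:9780439785969": {
        "cover": {"large": "https://covers.openlibrary.org/b/id/9326654-L.jpg"}
    }
}

found = extract_cover_url(sample, 9780439785969)
missing_isbn = extract_cover_url(sample, 9780439785970)
empty_payload = extract_cover_url({}, 9780439785969)
print(found, missing_isbn, empty_payload)
```

An ISBN that is absent from the payload and an empty payload both fall through to the default value.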

In [9]:
isbn = 9780439785970 #books_df.loc[0, 'isbn13']
print(fetch_book_cover_url(isbn))

N/A


In [10]:
books_df['cover_url'] = None

In [11]:
%%time
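# Idempotent: rows that already have a cover_url are skipped,
# so this cell can be re-run without fetching the same cover twice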
for index, row in books_df.head(20).iterrows():
    if row['cover_url'] is None:
        print(f"fetching cover for {row['title']}")
        cover_url = fetch_book_cover_url(row['isbn13'])
        books_df.loc[index, 'cover_url'] = cover_url

fetching cover for Harry Potter and the Half-Blood Prince (Harry Potter  #6)
fetching cover for Harry Potter and the Order of the Phoenix (Harry Potter  #5)
fetching cover for Harry Potter and the Sorcerer's Stone (Harry Potter  #1)
fetching cover for Harry Potter and the Chamber of Secrets (Harry Potter  #2)
fetching cover for Harry Potter and the Prisoner of Azkaban (Harry Potter  #3)
fetching cover for Harry Potter Boxed Set  Books 1-5 (Harry Potter  #1-5)
fetching cover for Unauthorized Harry Potter Book Seven News: "Half-Blood Prince" Analysis and Speculation
fetching cover for Harry Potter Collection (Harry Potter  #1-6)
fetching cover for The Ultimate Hitchhiker's Guide: Five Complete Novels and One Story (Hitchhiker's Guide to the Galaxy  #1-5)
fetching cover for The Ultimate Hitchhiker's Guide to the Galaxy
fetching cover for The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy  #1)
fetching cover for The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide t
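The "only fetch what is missing" idea can also be written with a boolean mask instead of testing each row inside the loop. This is a minimal sketch on a made-up three-row frame. `fake_fetch_cover` is a stand-in for the real API call, and it counts its calls so the effect of re-running is visible.

```python
import pandas as pd


def fake_fetch_cover(isbn13):
    """Stand-in for the API call: builds a dummy URL and counts invocations."""
    fake_fetch_cover.calls += 1
    return f"https://example.org/{isbn13}-L.jpg"


fake_fetch_cover.calls = 0

# Made-up data: the second row already has a cover
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "isbn13": [1, 2, 3],
    "cover_url": [None, "https://example.org/already.jpg", None],
})


def fill_missing_covers(frame, fetch):
    """Fetch a cover only for rows whose cover_url is still missing."""
    missing = frame["cover_url"].isna()
    for index in frame.index[missing]:
        frame.loc[index, "cover_url"] = fetch(frame.loc[index, "isbn13"])


fill_missing_covers(df, fake_fetch_cover)
fill_missing_covers(df, fake_fetch_cover)  # second run finds nothing to do
print(fake_fetch_cover.calls)
```

Running the function twice triggers only two fetches in total, one per row that was initially empty.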

In [12]:
books_df[['title', 'cover_url']].head(15)

,title,cover_url
0,Harry Potter and the Half-Blood Prince (Harry ...,https://covers.openlibrary.org/b/id/9326654-L.jpg
1,Harry Potter and the Order of the Phoenix (Har...,https://covers.openlibrary.org/b/id/9326212-L.jpg
2,Harry Potter and the Sorcerer's Stone (Harry P...,https://covers.openlibrary.org/b/id/7572543-L.jpg
3,Harry Potter and the Chamber of Secrets (Harry...,https://covers.openlibrary.org/b/id/10301720-L...
4,Harry Potter and the Prisoner of Azkaban (Harr...,https://covers.openlibrary.org/b/id/8778528-L.jpg
5,Harry Potter Boxed Set Books 1-5 (Harry Potte...,https://covers.openlibrary.org/b/id/278981-L.jpg
6,"Unauthorized Harry Potter Book Seven News: ""Ha...",https://covers.openlibrary.org/b/id/742235-L.jpg
7,Harry Potter Collection (Harry Potter #1-6),https://covers.openlibrary.org/b/id/279436-L.jpg
8,The Ultimate Hitchhiker's Guide: Five Complete...,https://covers.openlibrary.org/b/id/321859-L.jpg
9,The Ultimate Hitchhiker's Guide to the Galaxy,


In [13]:
def fetch_books(isbns):
    url = "https://openlibrary.org/api/books"
    response = requests.get(url, params={
        'bibkeys': ','.join([f"ISBN:{isbn}" for isbn in isbns]),
        'format': 'json',
        'jscmd': 'data'
    })
    return response.json()

In [14]:
isbns = [9780439785969, 9780439358071, 9780439554930]
data = fetch_books(isbns)

In [15]:
len(data)

3

In [16]:
books_df['cover_url'] = None
books_df.set_index('isbn13', inplace=True)
# books_df = books_df.set_index('isbn13')

In [17]:
books_df.head()

isbn13,bookID,title,authors,average_rating,isbn,language_code,# num_pages,ratings_count,text_reviews_count,cover_url
9780439785969,1,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,4.56,0439785960,eng,652,1944099,26249,
9780439358071,2,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,4.49,0439358078,eng,870,1996446,27613,
9780439554930,3,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,4.47,0439554934,eng,320,5629932,70390,
9780439554893,4,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,4.41,0439554896,eng,352,6267,272,
9780439655484,5,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,4.55,043965548X,eng,435,2149872,33964,


In [18]:
list(np.array_split(books_df.head(12), 3)[0].index)

[9780439785969, 9780439358071, 9780439554930, 9780439554893]
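`np.array_split` takes a number of groups. When a fixed batch size is easier to reason about, the list of ISBNs can be sliced directly. The sketch below uses a hypothetical `chunked` helper. It works on plain lists, which also avoids passing a DataFrame to NumPy (something recent pandas releases may warn about).

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


isbns = [9780439785969, 9780439358071, 9780439554930, 9780439554893, 9780439655484]
batches = list(chunked(isbns, 2))
print(batches)
```

With five ISBNs and a batch size of two, the last batch holds the single leftover element.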

In [19]:
books_df['cover_url'] = None

In [20]:
%%time
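for group in np.array_split(books_df.head(100), 10):
    book_isbns = list(group.index)
    print(f"fetching covers for {book_isbns}")
    books = fetch_books(book_isbns)  # one request per group of ISBNs
    for isbn_code, book in books.items():
        # Keys look like "ISBN:9780439785969". str.strip() removes a set of
        # characters rather than a prefix, so drop the prefix explicitly.
        isbn = isbn_code.replace('ISBN:', '')
        books_df.loc[int(isbn), 'cover_url'] = book.get('cover', {}).get('large', 'N/A')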

fetching covers for [9780439785969, 9780439358071, 9780439554930, 9780439554893, 9780439655484, 9780439682589, 9780976540601, 9780439827607, 9780517226957, 9780345453747]
fetching covers for [9781400052929, 9780739322208, 9780517149256, 9780767908184, 9780767915069, 9780767910439, 9780767903868, 9780767903820, 9780060920081, 9780380713806]
fetching covers for [9780380727506, 9780380715435, 9780345538376, 9780618517657, 9780618346240, 9780618346257, 9780618260584, 9780618391004, 9780618510825, 9780618153978]
fetching covers for [9781933372013, 9780976694007, 9780689840920, 9781557344496, 9780385326506, 9781575606248, 9781595580276, 9781595962805, 9780670059676, 9780141312620]
fetching covers for [9780595321803, 9781590301944, 9780449146972, 9780061159176, 9780060762735, 9780060749910, 9780273704744, 9781932386103, 9780965136716, 9780374517199]
fetching covers for [9780374280390, 9780374519742, 9780374522599, 9780374518738, 9780374522872, 9780374519322, 9780374516000, 9780374520656, 9780
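Each key of the batch response has to be mapped back to the integer `isbn13` used as the DataFrame index. Here is a minimal sketch of that mapping on a hand-written response. `isbn_from_key` is a hypothetical helper, and the `ISBN:<number>` key format is taken from the request built in `fetch_books`.

```python
def isbn_from_key(key):
    """Turn a key such as "ISBN:9780439785969" into the integer 9780439785969."""
    prefix = "ISBN:"
    # Slice the prefix off explicitly: str.strip() would remove any of the
    # characters I, S, B, N and ":" from both ends, which is not a prefix removal
    if key.startswith(prefix):
        key = key[len(prefix):]
    return int(key)


# Hand-written response, for illustration only
response = {
    "ISBN:9780439785969": {"cover": {"large": "https://example.org/a-L.jpg"}},
    "ISBN:9780439358071": {},
}

covers = {
    isbn_from_key(key): book.get("cover", {}).get("large", "N/A")
    for key, book in response.items()
}
print(covers)
```

ISBNs that are requested but missing from the response never appear in `covers`, so their rows keep whatever value `cover_url` had before.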

In [21]:
books_df[['title', 'cover_url']]

isbn13,title,cover_url
9780439785969,Harry Potter and the Half-Blood Prince (Harry ...,https://covers.openlibrary.org/b/id/9326654-L.jpg
9780439358071,Harry Potter and the Order of the Phoenix (Har...,https://covers.openlibrary.org/b/id/9326212-L.jpg
9780439554930,Harry Potter and the Sorcerer's Stone (Harry P...,https://covers.openlibrary.org/b/id/7572543-L.jpg
9780439554893,Harry Potter and the Chamber of Secrets (Harry...,https://covers.openlibrary.org/b/id/10301720-L...
9780439655484,Harry Potter and the Prisoner of Azkaban (Harr...,https://covers.openlibrary.org/b/id/8778528-L.jpg
9780439682589,Harry Potter Boxed Set Books 1-5 (Harry Potte...,https://covers.openlibrary.org/b/id/278981-L.jpg
9780976540601,"Unauthorized Harry Potter Book Seven News: ""Ha...",https://covers.openlibrary.org/b/id/742235-L.jpg
9780439827607,Harry Potter Collection (Harry Potter #1-6),https://covers.openlibrary.org/b/id/279436-L.jpg
9780517226957,The Ultimate Hitchhiker's Guide: Five Complete...,https://covers.openlibrary.org/b/id/321859-L.jpg
9780345453747,The Ultimate Hitchhiker's Guide to the Galaxy,
