# CSV + API

In this reboot, we are going to use:

- The [Goodreads books](https://www.kaggle.com/jealousleopard/goodreadsbooks) dataset from Kaggle.
- The [Open Library Books API](https://openlibrary.org/dev/docs/api/books)

The goal of this livecode is to load the data from a CSV file, then loop over the rows to enrich each one with information such as:

- List of subjects (Science, Humor, Travel, etc.)
- The cover URL of the book
- Other information you'd find useful in the JSON API
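
As a rough sketch of what "enriching a row" means, the snippet below pulls a list of subject names and a cover URL out of a dictionary shaped like one Open Library record. The `sample_record` literal and the `extract_enrichment` helper are illustrative assumptions, not part of the livecode; the `subjects` and `cover` keys mirror the ones used later in this notebook.

```python
# Hypothetical, hand-written record shaped like one Open Library "data" record.
sample_record = {
    "title": "Example Book",
    "subjects": [
        {"name": "Ghosts", "url": "https://openlibrary.org/subjects/ghosts"},
        {"name": "Witches", "url": "https://openlibrary.org/subjects/witches"},
    ],
    "cover": {"large": "https://covers.openlibrary.org/b/id/7572543-L.jpg"},
}


def extract_enrichment(record):
    """Return the subject names and large cover URL, tolerating missing keys."""
    subjects = [subject["name"] for subject in record.get("subjects", [])]
    cover_url = record.get("cover", {}).get("large", "")
    return {"subjects": subjects, "cover_url": cover_url}


print(extract_enrichment(sample_record))
print(extract_enrichment({}))  # a record with no subjects and no cover
```

Using `.get()` with a default at each level means a record without a cover or subjects yields empty values instead of raising a `KeyError`.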

First, download the CSV into the local folder:

In [8]:
!curl -L https://gist.githubusercontent.com/ssaunier/351b17f5a7a009808b60aeacd1f4a036/raw/books.csv > books.csv

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1509k  100 1509k    0     0  11.0M      0 --:--:-- --:--:-- --:--:-- 11.8M


In [9]:
!ls -lh

total 4888
-rw-r--r--  1 bingobango  staff   579B Nov 29 11:58 README.md
-rw-r--r--  1 bingobango  staff   162K Apr 25 15:42 RECAP - LIVE CODE.ipynb
-rw-r--r--  1 bingobango  staff   161K Apr 25 12:58 Recap.ipynb
-rw-r--r--  1 bingobango  staff   1.5M Apr 25 15:44 books.csv


Then import the usual suspects!

In [10]:
import requests
import pandas as pd
import numpy as np

## Load books from CSV

In [11]:
books_df = pd.read_csv('books.csv')
books_df

Unnamed: 0,bookID,title,authors,average_rating,isbn,isbn13,language_code,# num_pages,ratings_count,text_reviews_count
0,1,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,4.56,0439785960,9780439785969,eng,652,1944099,26249
1,2,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,4.49,0439358078,9780439358071,eng,870,1996446,27613
2,3,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,4.47,0439554934,9780439554930,eng,320,5629932,70390
3,4,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,4.41,0439554896,9780439554893,eng,352,6267,272
4,5,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,4.55,043965548X,9780439655484,eng,435,2149872,33964
...,...,...,...,...,...,...,...,...,...,...
13714,47699,M Is for Magic,Neil Gaiman-Teddy Kristiansen,3.82,0061186422,9780061186424,eng,260,11317,1060
13715,47700,Black Orchid,Neil Gaiman-Dave McKean,3.72,0930289552,9780930289553,eng,160,8710,361
13716,47701,InterWorld (InterWorld #1),Neil Gaiman-Michael Reaves,3.53,0061238961,9780061238963,en-US,239,14334,1485
13717,47708,The Faeries' Oracle,Brian Froud-Jessica Macbeth,4.43,0743201116,9780743201117,eng,224,1550,38


In [12]:
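books_df = pd.read_csv('books.csv')
# Keep only the columns we need; .copy() makes an independent DataFrame,
# which avoids a possible SettingWithCopyWarning when adding columns later
books_df = books_df[['title', 'authors', 'isbn13', '# num_pages']].copy()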

In [13]:
books_df

Unnamed: 0,title,authors,isbn13,# num_pages
0,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,9780439785969,652
1,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,9780439358071,870
2,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,9780439554930,320
3,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,9780439554893,352
4,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,9780439655484,435
...,...,...,...,...
13714,M Is for Magic,Neil Gaiman-Teddy Kristiansen,9780061186424,260
13715,Black Orchid,Neil Gaiman-Dave McKean,9780930289553,160
13716,InterWorld (InterWorld #1),Neil Gaiman-Michael Reaves,9780061238963,239
13717,The Faeries' Oracle,Brian Froud-Jessica Macbeth,9780743201117,224


Let's check the column types, then add a new, empty `cover_url` column:

In [14]:
books_df.dtypes

title          object
authors        object
isbn13          int64
# num_pages     int64
dtype: object

In [15]:
books_df['cover_url'] = None
books_df

Unnamed: 0,title,authors,isbn13,# num_pages,cover_url
0,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,9780439785969,652,
1,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,9780439358071,870,
2,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,9780439554930,320,
3,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,9780439554893,352,
4,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,9780439655484,435,
...,...,...,...,...,...
13714,M Is for Magic,Neil Gaiman-Teddy Kristiansen,9780061186424,260,
13715,Black Orchid,Neil Gaiman-Dave McKean,9780930289553,160,
13716,InterWorld (InterWorld #1),Neil Gaiman-Michael Reaves,9780061238963,239,
13717,The Faeries' Oracle,Brian Froud-Jessica Macbeth,9780743201117,224,


## API - Open Library

In [16]:
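def fetch_book(isbn):
    """Return the Open Library record for one ISBN, or '' if there is no match."""
    url = "https://openlibrary.org/api/books"
    key = f'ISBN:{isbn}'

    params = {
        'bibkeys': key,      # which book to look up
        'format': 'json',    # ask for JSON output
        'jscmd': 'data'      # ask for the detailed book data
    }

    response = requests.get(url, params=params).json()
    # The response is a dict keyed by bibkey; a missing key means no match
    return response.get(key, '')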
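
To see roughly what request this produces, the sketch below builds the same query string with the standard library instead of `requests`. `build_books_url` is a hypothetical helper for illustration only; `requests` may encode the parameters slightly differently, but the resulting query should be equivalent.

```python
from urllib.parse import urlencode

BASE_URL = "https://openlibrary.org/api/books"


def build_books_url(isbn):
    """Build the lookup URL for a single ISBN (illustrative helper)."""
    params = {
        "bibkeys": f"ISBN:{isbn}",
        "format": "json",
        "jscmd": "data",
    }
    # urlencode percent-encodes the colon in "ISBN:..." as %3A
    return f"{BASE_URL}?{urlencode(params)}"


print(build_books_url(9780439554930))
```

Pasting the printed URL into a browser is a quick way to inspect the raw JSON before writing any parsing code.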

In [17]:
fetch_book(9780439554930)

{'url': 'https://openlibrary.org/books/OL26018592M/Harry_Potter_and_the_Sorcerers_Stone',
 'key': '/books/OL26018592M',
 'title': 'Harry Potter and the Sorcerers Stone',
 'authors': [{'url': 'https://openlibrary.org/authors/OL23919A/J._K._Rowling',
   'name': 'J. K. Rowling'}],
 'identifiers': {'alibris_id': ['9780439554930'],
  'amazon': ['0439554934'],
  'goodreads': ['3'],
  'isbn_10': ['0439554934'],
  'isbn_13': ['9780439554930'],
  'oclc': ['54367447'],
  'openlibrary': ['OL26018592M']},
 'publishers': [{'name': 'Arthur A. Levine Books'}],
 'publish_date': '2003',
 'subjects': [{'name': 'Ghosts',
   'url': 'https://openlibrary.org/subjects/ghosts'},
  {'name': 'Monsters', 'url': 'https://openlibrary.org/subjects/monsters'},
  {'name': 'Vampires', 'url': 'https://openlibrary.org/subjects/vampires'},
  {'name': 'Witches', 'url': 'https://openlibrary.org/subjects/witches'},
  {'name': 'Challenges and Overcoming Obstacles',
   'url': 'https://openlibrary.org/subjects/challenges_and_o

In [18]:
%%time

for index, row in books_df.head(15).iterrows():
    
    if row['cover_url'] is None:
        isbn = row['isbn13']
        print(f"fetching cover for {row['title']}")
        
        book = fetch_book(isbn)
        
        if book:
            cover_url = book.get('cover', {}).get('large', '')
            books_df.loc[index, 'cover_url'] = cover_url
        else:
            books_df.loc[index, 'cover_url'] = ''

fetching cover for Harry Potter and the Half-Blood Prince (Harry Potter  #6)
fetching cover for Harry Potter and the Order of the Phoenix (Harry Potter  #5)
fetching cover for Harry Potter and the Sorcerer's Stone (Harry Potter  #1)
fetching cover for Harry Potter and the Chamber of Secrets (Harry Potter  #2)
fetching cover for Harry Potter and the Prisoner of Azkaban (Harry Potter  #3)
fetching cover for Harry Potter Boxed Set  Books 1-5 (Harry Potter  #1-5)
fetching cover for Unauthorized Harry Potter Book Seven News: "Half-Blood Prince" Analysis and Speculation
fetching cover for Harry Potter Collection (Harry Potter  #1-6)
fetching cover for The Ultimate Hitchhiker's Guide: Five Complete Novels and One Story (Hitchhiker's Guide to the Galaxy  #1-5)
fetching cover for The Ultimate Hitchhiker's Guide to the Galaxy
fetching cover for The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy  #1)
fetching cover for The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide t

In [19]:
books_df.head(20)

Unnamed: 0,title,authors,isbn13,# num_pages,cover_url
0,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,9780439785969,652,https://covers.openlibrary.org/b/id/9326654-L.jpg
1,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,9780439358071,870,https://covers.openlibrary.org/b/id/12025650-L...
2,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,9780439554930,320,https://covers.openlibrary.org/b/id/7572543-L.jpg
3,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,9780439554893,352,https://covers.openlibrary.org/b/id/10301720-L...
4,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,9780439655484,435,https://covers.openlibrary.org/b/id/10580458-L...
5,Harry Potter Boxed Set Books 1-5 (Harry Potte...,J.K. Rowling-Mary GrandPré,9780439682589,2690,https://covers.openlibrary.org/b/id/278981-L.jpg
6,"Unauthorized Harry Potter Book Seven News: ""Ha...",W. Frederick Zimmerman,9780976540601,152,https://covers.openlibrary.org/b/id/742235-L.jpg
7,Harry Potter Collection (Harry Potter #1-6),J.K. Rowling,9780439827607,3342,https://covers.openlibrary.org/b/id/279436-L.jpg
8,The Ultimate Hitchhiker's Guide: Five Complete...,Douglas Adams,9780517226957,815,https://covers.openlibrary.org/b/id/12617870-L...
9,The Ultimate Hitchhiker's Guide to the Galaxy,Douglas Adams,9780345453747,815,


## Calling the API with multiple ISBNs at a time

In [20]:
isbns = [9780439785969, 9780439358071, 9780439554930]

In [21]:
[f"ISBN:{isbn}" for isbn in isbns]

['ISBN:9780439785969', 'ISBN:9780439358071', 'ISBN:9780439554930']

In [22]:
result = []

for isbn in isbns:
    result.append(f"ISBN:{isbn}")

In [23]:
result

['ISBN:9780439785969', 'ISBN:9780439358071', 'ISBN:9780439554930']

In [24]:
",".join([f"ISBN:{isbn}" for isbn in isbns])

'ISBN:9780439785969,ISBN:9780439358071,ISBN:9780439554930'

In [25]:
def fetch_books(isbns):
    """Fetch several books in one request; the result is a dict keyed by 'ISBN:<isbn>'."""
    url = "https://openlibrary.org/api/books"
    # The API accepts a comma-separated list of bibkeys
    bibkeys = ",".join(f"ISBN:{isbn}" for isbn in isbns)

    params = {
        'bibkeys': bibkeys,
        'format': 'json',
        'jscmd': 'data'
    }

    return requests.get(url, params=params).json()
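
With more than 13,000 rows in the DataFrame, sending every ISBN in a single request is unlikely to be practical, so one option is to split the list into batches and call `fetch_books` once per batch. The `chunked` helper and the batch size of 2 are assumptions chosen for illustration; the source does not state how many keys the API accepts per call.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


isbns = [9780439785969, 9780439358071, 9780439554930, 9780439554893, 9780439655484]

for batch in chunked(isbns, 2):
    bibkeys = ",".join(f"ISBN:{isbn}" for isbn in batch)
    print(bibkeys)  # each string could be sent as the `bibkeys` parameter
```

The last batch is simply shorter when the list length is not a multiple of the batch size.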

In [26]:
books_df['cover_url'] = None

In [27]:
books_df

Unnamed: 0,title,authors,isbn13,# num_pages,cover_url
0,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,9780439785969,652,
1,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,9780439358071,870,
2,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,9780439554930,320,
3,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,9780439554893,352,
4,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,9780439655484,435,
...,...,...,...,...,...
13714,M Is for Magic,Neil Gaiman-Teddy Kristiansen,9780061186424,260,
13715,Black Orchid,Neil Gaiman-Dave McKean,9780930289553,160,
13716,InterWorld (InterWorld #1),Neil Gaiman-Michael Reaves,9780061238963,239,
13717,The Faeries' Oracle,Brian Froud-Jessica Macbeth,9780743201117,224,


In [28]:
books_df.head().index

RangeIndex(start=0, stop=5, step=1)

In [29]:
list(books_df.head().index)

[0, 1, 2, 3, 4]

In [30]:
%%time

# Note: this passes the row labels (0 to 4), not the values of the isbn13 column
books = fetch_books(list(books_df.head().index))

CPU times: user 17.2 ms, sys: 2.68 ms, total: 19.9 ms
Wall time: 921 ms


In [35]:
books.keys()

dict_keys(['ISBN:0', 'ISBN:1', 'ISBN:2', 'ISBN:3', 'ISBN:4'])

In [33]:
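for isbn_code, book in books.items():
    # Keys look like 'ISBN:<n>'; keep what follows the colon
    # (str.strip("ISBN:") removes a set of characters, not a prefix)
    isbn = int(isbn_code.split(":", 1)[1])
    # .loc looks up by row label, so `isbn` must match a label in the index
    books_df.loc[isbn, "cover_url"] = book.get("cover", {}).get("large", "")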

In [34]:
books_df

Unnamed: 0,title,authors,isbn13,# num_pages,cover_url
0,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,9780439785969,652,
1,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,9780439358071,870,
2,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,9780439554930,320,
3,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,9780439554893,352,https://covers.openlibrary.org/b/id/7348653-L.jpg
4,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,9780439655484,435,https://covers.openlibrary.org/b/id/7880051-L.jpg
...,...,...,...,...,...
13714,M Is for Magic,Neil Gaiman-Teddy Kristiansen,9780061186424,260,
13715,Black Orchid,Neil Gaiman-Dave McKean,9780930289553,160,
13716,InterWorld (InterWorld #1),Neil Gaiman-Michael Reaves,9780061238963,239,
13717,The Faeries' Oracle,Brian Froud-Jessica Macbeth,9780743201117,224,
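
Because the response is keyed by `ISBN:<number>`, one way to write covers back to the right rows is to turn it into an ISBN-to-URL lookup and match on the `isbn13` column rather than on the row index. The sketch below shows only the lookup step on a hand-written response; `covers_by_isbn` and `fake_response` are hypothetical, and the cover URL is a placeholder copied from earlier output.

```python
def covers_by_isbn(response):
    """Map each integer ISBN in an API response to its large cover URL ('' if absent)."""
    covers = {}
    for bibkey, book in response.items():
        # Split on the first colon: "ISBN:9780439554930" -> "9780439554930"
        isbn = int(bibkey.split(":", 1)[1])
        covers[isbn] = book.get("cover", {}).get("large", "")
    return covers


# Hand-written stand-in for what fetch_books(...) might return.
fake_response = {
    "ISBN:9780439785969": {
        "cover": {"large": "https://covers.openlibrary.org/b/id/9326654-L.jpg"}
    },
    "ISBN:9780439554930": {"title": "No cover in this record"},
}

lookup = covers_by_isbn(fake_response)
print(lookup)
```

With such a lookup, something like `books_df['isbn13'].map(lookup)` should align covers by ISBN; rows whose ISBN is missing from the lookup would come back as `NaN`.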
