# CSV + API

In this reboot, we are going to use:

- The [Goodreads books](https://www.kaggle.com/jealousleopard/goodreadsbooks) dataset from Kaggle.
- The [Open Library Books API](https://openlibrary.org/dev/docs/api/books)

The goal of this livecode is to load the data from a CSV, then loop over its rows to enrich each one with information such as:

- List of subjects (Science, Humor, Travel, etc.)
- The cover URL of the book
- Other information you'd find useful in the JSON API

First, download the CSV into the local folder:

In [1]:
!curl -L https://gist.githubusercontent.com/ssaunier/351b17f5a7a009808b60aeacd1f4a036/raw/books.csv > books.csv

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1509k  100 1509k    0     0  2194k      0 --:--:-- --:--:-- --:--:-- 2193k


In [2]:
!ls -lh

total 1.5M
-rw-r--r-- 1 emirhankumus emirhankumus  651 Feb 20 00:34 README.md
-rw-r--r-- 1 emirhankumus emirhankumus 2.8K Feb 20 00:34 Recap.ipynb
-rw-r--r-- 1 emirhankumus emirhankumus 1.5M Feb 20 00:36 books.csv


Then import the usual suspects!

In [3]:
import requests
import pandas as pd
import numpy as np

## Load books from CSV

In [4]:
books = pd.read_csv("books.csv")

books.head()

Unnamed: 0,bookID,title,authors,average_rating,isbn,isbn13,language_code,# num_pages,ratings_count,text_reviews_count
0,1,Harry Potter and the Half-Blood Prince (Harry ...,J.K. Rowling-Mary GrandPré,4.56,0439785960,9780439785969,eng,652,1944099,26249
1,2,Harry Potter and the Order of the Phoenix (Har...,J.K. Rowling-Mary GrandPré,4.49,0439358078,9780439358071,eng,870,1996446,27613
2,3,Harry Potter and the Sorcerer's Stone (Harry P...,J.K. Rowling-Mary GrandPré,4.47,0439554934,9780439554930,eng,320,5629932,70390
3,4,Harry Potter and the Chamber of Secrets (Harry...,J.K. Rowling,4.41,0439554896,9780439554893,eng,352,6267,272
4,5,Harry Potter and the Prisoner of Azkaban (Harr...,J.K. Rowling-Mary GrandPré,4.55,043965548X,9780439655484,eng,435,2149872,33964
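
ISBNs are identifiers, not numbers: they can start with `0`, and an ISBN-10 can end in `X`. In this dataset pandas keeps the `isbn` column as text, probably because values such as `043965548X` cannot be parsed as integers, but that is a side effect rather than a guarantee. A minimal sketch of making the choice explicit with the `dtype` argument, using a tiny inline CSV as a stand-in for `books.csv`:

```python
import io

import pandas as pd

# Tiny inline stand-in for books.csv (two rows adapted from the table above).
sample_csv = io.StringIO(
    "bookID,title,isbn,isbn13\n"
    "1,Harry Potter and the Half-Blood Prince,0439785960,9780439785969\n"
    "5,Harry Potter and the Prisoner of Azkaban,043965548X,9780439655484\n"
)

# Without dtype, an all-digit ISBN column would be read as integers and lose
# its leading zeros; forcing str keeps the identifiers intact.
sample = pd.read_csv(sample_csv, dtype={"isbn": str, "isbn13": str})

print(sample["isbn"].tolist())
```

The same `dtype` mapping could be passed to `pd.read_csv("books.csv", ...)`.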


Before adding new columns, let's check the shape of the DataFrame and list its columns to find the ISBN:

In [5]:
# How many rows and columns are there?
print(books.shape)

# List the column names to find the ISBN column
print(books.columns.tolist())

(13719, 10)
['bookID', 'title', 'authors', 'average_rating', 'isbn', 'isbn13', 'language_code', '# num_pages', 'ratings_count', 'text_reviews_count']


## API - Open Library

In [7]:
test_isbn = books["isbn"].iloc[0]
print(f"Test ISBN: {test_isbn}")

url = f"https://openlibrary.org/api/books?bibkeys=ISBN:{test_isbn}&format=json&jscmd=data"

response = requests.get(url)

data = response.json()
print(data)

Test ISBN: 0439785960
{'ISBN:0439785960': {'url': 'https://openlibrary.org/books/OL24280830M/Harry_Potter_and_the_Half-Blood_Prince', 'key': '/books/OL24280830M', 'title': 'Harry Potter and the Half-Blood Prince', 'authors': [{'url': 'https://openlibrary.org/authors/OL23919A/J._K._Rowling', 'name': 'J. K. Rowling'}], 'number_of_pages': 652, 'pagination': '652p', 'identifiers': {'amazon': ['0439785960'], 'goodreads': ['53178655'], 'isbn_10': ['0439785960'], 'isbn_13': ['9780439785969'], 'oclc': ['70666878', '819153929'], 'openlibrary': ['OL24280830M']}, 'classifications': {'lc_classifications': ['PZ7.R79835Halc 2005']}, 'publishers': [{'name': 'Scholastic'}], 'publish_places': [{'name': 'New York, USA'}], 'publish_date': '2006-09', 'subjects': [{'name': 'orphans', 'url': 'https://openlibrary.org/subjects/orphans'}, {'name': 'foster homes', 'url': 'https://openlibrary.org/subjects/foster_homes'}, {'name': 'romans', 'url': 'https://openlibrary.org/subjects/romans'}, {'name': 'magie', 'url
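
The query string above is assembled by hand inside an f-string. As a sketch of an alternative, the standard library's `urlencode` can build it from a dict, which keeps each parameter visible and escapes special characters. The helper name `build_books_url` is made up for this example; passing `params=...` to `requests.get` would do a similar job at call time.

```python
from urllib.parse import urlencode

BASE_URL = "https://openlibrary.org/api/books"


def build_books_url(isbn):
    """Build the Books API URL for one ISBN (hypothetical helper)."""
    params = {
        "bibkeys": f"ISBN:{isbn}",  # which book to look up
        "format": "json",           # ask for a JSON response
        "jscmd": "data",            # same option as in the cell above
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_books_url("0439785960"))
```

Note that `urlencode` percent-encodes the colon as `%3A`; that should be equivalent to the literal `:` once the server decodes the query string.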

In [8]:
# Grab the top-level key (dynamic: it changes for each ISBN)
book_key = f"ISBN:{test_isbn}"
book_data = data[book_key]

# 1. Subjects: keep only the names
subjects = [s['name'] for s in book_data.get('subjects', [])]
print("Subjects:", subjects[:5])  # Show the first 5

# 2. Cover URL: medium size
cover_url = book_data.get('cover', {}).get('medium', None)
print("Cover URL:", cover_url)

# 3. Publisher: first one listed (`or [{}]` also guards against an empty list)
publisher = (book_data.get('publishers') or [{}])[0].get('name', None)
print("Publisher:", publisher)

# 4. Publish date
publish_date = book_data.get('publish_date', None)
print("Publish Date:", publish_date)

Subjects: ['orphans', 'foster homes', 'romans', 'magie', 'adolescence']
Cover URL: https://covers.openlibrary.org/b/id/15156081-M.jpg
Publisher: Scholastic
Publish Date: 2006-09
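
The cover URL above ends in `-M.jpg`, where `M` is the size code. Assuming the Open Library Covers API convention of `S`, `M` and `L` suffixes, a small helper (the name `cover_url_for_size` is hypothetical) can derive the other sizes from the stored URL without another API call:

```python
import re


def cover_url_for_size(cover_url, size="L"):
    """Swap the size code (S, M or L) at the end of a cover URL."""
    if cover_url is None:
        return None  # book without a cover
    if size not in {"S", "M", "L"}:
        raise ValueError(f"Unknown cover size: {size!r}")
    # Replace only the final "-S.jpg" / "-M.jpg" / "-L.jpg" suffix.
    return re.sub(r"-[SML]\.jpg$", f"-{size}.jpg", cover_url)


medium = "https://covers.openlibrary.org/b/id/15156081-M.jpg"
print(cover_url_for_size(medium, "L"))
print(cover_url_for_size(medium, "S"))
```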


## Calling the API with multiple ISBNs at a time
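
The cells in this section send one request per ISBN. According to its documentation, the Books API also accepts several comma-separated keys in `bibkeys`, so several ISBNs can share one request. The sketch below covers only the offline parts: splitting ISBNs into chunks, building the URL, and mapping a response back to ISBNs. The names `chunked`, `build_batch_url` and `subjects_by_isbn`, the chunk size, and the `fake_response` dict are all assumptions made for illustration; no request is sent.

```python
from urllib.parse import urlencode


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_url(isbns):
    """Build one Books API URL covering several ISBNs."""
    bibkeys = ",".join(f"ISBN:{isbn}" for isbn in isbns)
    query = urlencode({"bibkeys": bibkeys, "format": "json", "jscmd": "data"})
    return f"https://openlibrary.org/api/books?{query}"


def subjects_by_isbn(isbns, data):
    """Map each requested ISBN to its subject names, or None if missing."""
    result = {}
    for isbn in isbns:
        book_data = data.get(f"ISBN:{isbn}")
        if book_data is None:
            result[isbn] = None
        else:
            result[isbn] = [s["name"] for s in book_data.get("subjects", [])]
    return result


isbns = ["0439785960", "0439358078", "0439554934"]
batches = list(chunked(isbns, 2))
print(batches)
print(build_batch_url(batches[0]))

# Hand-written stand-in for response.json(); the second ISBN is absent on purpose.
fake_response = {"ISBN:0439785960": {"subjects": [{"name": "orphans"}]}}
print(subjects_by_isbn(batches[0], fake_response))
```

With the JSON of a real request in place of `fake_response`, batches of 50 would cover the 13,719 books in 275 requests. How many keys a single request can carry is not stated here, so the chunk size is worth testing on a small sample first.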

In [9]:
def get_book_info(isbn):
    """Fetch information for one ISBN from the Open Library API."""
    try:
        url = f"https://openlibrary.org/api/books?bibkeys=ISBN:{isbn}&format=json&jscmd=data"
        response = requests.get(url, timeout=10)
        data = response.json()

        book_key = f"ISBN:{isbn}"

        # Return empty values if the API does not know this book
        if book_key not in data:
            return None, None, None, None

        book_data = data[book_key]

        # List of subjects: names only
        subjects = [s['name'] for s in book_data.get('subjects', [])]

        # Cover URL (medium size)
        cover_url = book_data.get('cover', {}).get('medium', None)

        # Publisher: first one listed (`or [{}]` also guards against an empty list)
        publisher = (book_data.get('publishers') or [{}])[0].get('name', None)

        # Publish date
        publish_date = book_data.get('publish_date', None)

        return subjects, cover_url, publisher, publish_date

    except Exception as e:
        # Network or parsing problem: report it and return empty values
        print(f"Error for ISBN {isbn}: {e}")
        return None, None, None, None


# --- FIRST, TEST WITH THE FIRST 5 ROWS ---
books_sample = books.head(5).copy()

# Create the new columns
books_sample['subjects']     = None
books_sample['cover_url']    = None
books_sample['publisher']    = None
books_sample['publish_date'] = None

# Loop over the rows
for index, row in books_sample.iterrows():
    isbn = row['isbn']
    print(f"Processing: {row['title'][:40]} | ISBN: {isbn}")

    subjects, cover_url, publisher, publish_date = get_book_info(isbn)

    # The list is stored as a string; a book that was not found becomes the string 'None'
    books_sample.at[index, 'subjects']     = str(subjects)
    books_sample.at[index, 'cover_url']    = cover_url
    books_sample.at[index, 'publisher']    = publisher
    books_sample.at[index, 'publish_date'] = publish_date

# Show the result
books_sample[['title', 'subjects', 'cover_url', 'publisher', 'publish_date']]

Processing: Harry Potter and the Half-Blood Prince ( | ISBN: 0439785960
Processing: Harry Potter and the Order of the Phoeni | ISBN: 0439358078
Processing: Harry Potter and the Sorcerer's Stone (H | ISBN: 0439554934
Processing: Harry Potter and the Chamber of Secrets  | ISBN: 0439554896
Processing: Harry Potter and the Prisoner of Azkaban | ISBN: 043965548X


Unnamed: 0,title,subjects,cover_url,publisher,publish_date
0,Harry Potter and the Half-Blood Prince (Harry ...,"['orphans', 'foster homes', 'romans', 'magie',...",https://covers.openlibrary.org/b/id/15156081-M...,Scholastic,2006-09
1,Harry Potter and the Order of the Phoenix (Har...,"[""Children's Books/Ages 9-12 Fiction"", 'Witche...",https://covers.openlibrary.org/b/id/12025650-M...,Scholastic Inc.,2004-09
2,Harry Potter and the Sorcerer's Stone (Harry P...,"['series:Harry_Potter', 'Ghosts', 'Monsters', ...",https://covers.openlibrary.org/b/id/7572543-M.jpg,Arthur A. Levine Books,2003
3,Harry Potter and the Chamber of Secrets (Harry...,"['series:Harry_Potter', 'Fantasy fiction', 'sc...",https://covers.openlibrary.org/b/id/10301720-M...,Arthur A. Levine Books,"Nov 01, 2003"
4,Harry Potter and the Prisoner of Azkaban (Harr...,"['Fantasy fiction', 'orphans', 'foster homes',...",https://covers.openlibrary.org/b/id/8778528-M.jpg,Scholastic,"May 01, 2004"
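
The `publish_date` values above come in several shapes (`2006-09`, `2003`, `Nov 01, 2003`). If only the year matters, one option is to pull the first four-digit run out of each string. The helper name `extract_year` is an assumption for this sketch, and it ignores months and days entirely:

```python
import re


def extract_year(publish_date):
    """Return the first 4-digit year found in the string, or None."""
    if not isinstance(publish_date, str):
        return None  # covers None and NaN from books that were not found
    match = re.search(r"\b(\d{4})\b", publish_date)
    return int(match.group(1)) if match else None


for raw in ["2006-09", "2003", "Nov 01, 2003", "May 01, 2004", None]:
    print(repr(raw), "->", extract_year(raw))
```

It could be applied to the sample with `books_sample['publish_date'].map(extract_year)`.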


In [10]:
import time

# Work on a copy of the original DataFrame
books_enriched = books.copy()

# Add the new columns
books_enriched['subjects']     = None
books_enriched['cover_url']    = None
books_enriched['publisher']    = None
books_enriched['publish_date'] = None

total = len(books_enriched)

for index, row in books_enriched.iterrows():
    isbn = row['isbn']

    subjects, cover_url, publisher, publish_date = get_book_info(isbn)

    # The list is stored as a string; a book that was not found becomes the string 'None'
    books_enriched.at[index, 'subjects']     = str(subjects)
    books_enriched.at[index, 'cover_url']    = cover_url
    books_enriched.at[index, 'publisher']    = publisher
    books_enriched.at[index, 'publish_date'] = publish_date

    # Show progress every 100 books
    if index % 100 == 0:
        print(f"✅ {index}/{total} books processed...")

    # Be kind to the API: pause between requests
    time.sleep(0.1)

print("🎉 All books processed!")

# Preview the result
books_enriched[['title', 'subjects', 'cover_url', 'publisher']].head(10)

✅ 0/13719 books processed...
✅ 100/13719 books processed...
✅ 200/13719 books processed...
✅ 300/13719 books processed...
Error for ISBN 0671823493: HTTPSConnectionPool(host='openlibrary.org', port=443): Max retries exceeded with url: /api/books?bibkeys=ISBN:0671823493&format=json&jscmd=data (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7b465c8a1100>: Failed to resolve 'openlibrary.org' ([Errno -3] Temporary failure in name resolution)"))
✅ 400/13719 books processed...
✅ 500/13719 books processed...
✅ 600/13719 books processed...
✅ 700/13719 books processed...
Error for ISBN 0553103741: HTTPSConnectionPool(host='openlibrary.org', port=443): Max retries exceeded with url: /api/books?bibkeys=ISBN:0553103741&format=json&jscmd=data (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7b465c8da780>: Failed to resolve 'openlibrary.org' ([Errno -3] Temporary failure in name resolution)"))
✅ 800/13719 books processed...
✅ 90

KeyboardInterrupt: 
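
The `Temporary failure in name resolution` errors in the log above are transient network problems, and the loop simply records `None` for those books. One possible refinement is to retry a failed call a few times with a growing delay. The helper `call_with_retries` and its parameters are hypothetical, and the `sleep` argument exists only so the example can run without waiting:

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries have failed."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:
            if attempt == attempts:
                raise  # out of tries: let the caller see the last error
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            print(f"Attempt {attempt} failed ({error}); retrying in {delay}s")
            sleep(delay)


# Stand-in for a flaky network call: fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, sleep=lambda seconds: None)
print(result, "after", calls["count"], "calls")
```

Because `get_book_info` catches every exception itself, the retry would have to wrap the `requests.get` call inside it rather than the function as a whole.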

In [11]:
# How many rows are there, and how many were enriched?
print(f"Total books: {len(books_enriched)}")
print(f"Cover URL filled: {books_enriched['cover_url'].notna().sum()}/{len(books_enriched)}")
print(f"Subjects filled: {books_enriched['subjects'].notna().sum()}/{len(books_enriched)}")

# Save as CSV
books_enriched.to_csv("books_enriched.csv", index=False)
print("✅ books_enriched.csv saved!")

Total books: 13719
Cover URL filled: 2675/13719
Subjects filled: 3020/13719
✅ books_enriched.csv saved!
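
Because the loop stores `str(subjects)`, the `subjects` column in `books_enriched.csv` holds text such as `"['orphans', 'foster homes']"` rather than real lists, and books that were not found hold the string `'None'`. A minimal sketch of turning those strings back into Python objects with `ast.literal_eval`; the helper name `parse_subjects` is an assumption:

```python
import ast


def parse_subjects(value):
    """Convert a stringified list back into a list; anything else becomes None."""
    if not isinstance(value, str):
        return None  # NaN or None: the row was never processed
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return None  # not a valid Python literal
    return parsed if isinstance(parsed, list) else None


print(parse_subjects("['orphans', 'foster homes']"))
print(parse_subjects("None"))
print(parse_subjects(float("nan")))
```

This is likely part of why the subjects count above is higher than the cover count: `notna()` treats the string `'None'` as a value.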


In [12]:
books_enriched[['title', 'subjects', 'cover_url', 'publisher', 'publish_date']].head(10)

Unnamed: 0,title,subjects,cover_url,publisher,publish_date
0,Harry Potter and the Half-Blood Prince (Harry ...,"['orphans', 'foster homes', 'romans', 'magie',...",https://covers.openlibrary.org/b/id/15156081-M...,Scholastic,2006-09
1,Harry Potter and the Order of the Phoenix (Har...,"[""Children's Books/Ages 9-12 Fiction"", 'Witche...",https://covers.openlibrary.org/b/id/12025650-M...,Scholastic Inc.,2004-09
2,Harry Potter and the Sorcerer's Stone (Harry P...,"['series:Harry_Potter', 'Ghosts', 'Monsters', ...",https://covers.openlibrary.org/b/id/7572543-M.jpg,Arthur A. Levine Books,2003
3,Harry Potter and the Chamber of Secrets (Harry...,"['series:Harry_Potter', 'Fantasy fiction', 'sc...",https://covers.openlibrary.org/b/id/10301720-M...,Arthur A. Levine Books,"Nov 01, 2003"
4,Harry Potter and the Prisoner of Azkaban (Harr...,"['Fantasy fiction', 'orphans', 'foster homes',...",https://covers.openlibrary.org/b/id/8778528-M.jpg,Scholastic,"May 01, 2004"
5,Harry Potter Boxed Set Books 1-5 (Harry Potte...,"['Potter, harry (fictitious character), fictio...",https://covers.openlibrary.org/b/id/278981-M.jpg,Scholastic Inc.,"October 1, 2004"
6,"Unauthorized Harry Potter Book Seven News: ""Ha...","['Characters', 'Harry Potter', ""Children's sto...",https://covers.openlibrary.org/b/id/742235-M.jpg,Nimble Books,"April 26, 2005"
7,Harry Potter Collection (Harry Potter #1-6),"['England, fiction', 'Fantasy fiction', 'Magic...",https://covers.openlibrary.org/b/id/279436-M.jpg,Arthur A. Levine Books,"October 1, 2005"
8,The Ultimate Hitchhiker's Guide: Five Complete...,"['comic science fiction', 'Vogons', 'Humorous ...",https://covers.openlibrary.org/b/id/12617870-M...,Gramercy Books,2005
9,The Ultimate Hitchhiker's Guide to the Galaxy,"['comic science fiction', 'Vogons', 'Humorous ...",https://covers.openlibrary.org/b/id/14656530-M...,Del Rey,1996
