# Graduate Challenge

We would like you to showcase your Python (or other coding language e.g. Java, Golang etc.)
skills by tackling a data wrangling challenge that involves the open library public data API.

**Deliverables:**
● A well-commented script / notebook demonstrating the tasks mentioned above.

**Bonus Points:**
● Implement error handling in your script to gracefully handle any issues encountered by either the API or subsequently collected data

**Evaluation Criteria:**
● Functionality (completing all tasks)
● Code clarity and structure
● Efficiency and error handling
● Creativity and approach to high-level dataset exploration

This challenge will assess your ability to interact with APIs, parse data, and perform basic data
exploration using your language of choice. Good luck!

## Parse Available Datasets
- Write a Python script that retrieves a list of all books with the title “lord of the rings” from the below API (https://openlibrary.org/dev/docs/api/search)
- Parse the response from the API and write the names of the books to a dataset.
- Add 4 other columns showing data from the response

In [2]:
# Write a Python script that retrieves a list of all books 
# with the title “lord of the rings” from the below API 
# (https://openlibrary.org/dev/docs/api/search)

import requests
import pandas as pd

url = "https://openlibrary.org/search.json"
params= {
    'title': 'lord of the rings',
    'limit': 1000
}

api_data = requests.get(url, params=params).json()
print(f"{api_data['numFound']} works with title 'lord of the rings'")

437 works with title 'lord of the rings'


In [3]:
#Parse the response from the API and write the names of the books to a dataset.
#BOOKS format = paperback OR hardcover

api_books = []

# for each result (dictionary within list)
for doc_index, doc_result in enumerate(api_data['docs']):
    # check if 'format' key is present
    try:
        #print(f"{
        api_data['docs'][doc_index]['format']
        #}")
    except KeyError: # If there is no 'format' key - add to books list
        #print(f"KeyError : {api_data['docs'][doc_index]['title']} at index {doc_index} does not have a format")
        api_books.append(api_data['docs'][doc_index])
    except :
        print("Other Error")
    else : # there is a 'format' key
    # check if one of the formats listed is a BOOK, if so add it to list of BOOKS
        for format_index, format_type in enumerate(api_data['docs'][doc_index]['format']):
            if ('paperback' or 'hardcover' or 'e-book') in api_data['docs'][doc_index]['format'][format_index].lower():
                #print(f"{api_book_check[doc_index]['title']} is a {api_book_check[doc_index]['format'][format_index]}")
                api_books.append(api_data['docs'][doc_index])
                break

print(f"{len(api_books)} of the {api_data['numFound']} are books")

363 of the 437 are books


In [5]:
# Parse the response from the API and write the names of the books to a dataset.
# Add 4 other columns showing data from the response 
lotr_dataset = pd.DataFrame.from_records(api_books)[['title', 'author_name', 'first_publish_year','publisher']]
lotr_dataset

Unnamed: 0,title,author_name,first_publish_year,publisher
0,The Lord of the Rings,[J.R.R. Tolkien],1954.0,"[Houghton Mifflin Harcourt Publishing Company,..."
1,Novels (Hobbit / Lord of the Rings),[J.R.R. Tolkien],1979.0,"[Highbridge Audio, Mariner Books, HarperCollin..."
2,The Lord of the Ring,"[Phil Anderson, Philip A. Anderson]",2006.0,"[Regal Books, Muddy Pearl, Kingsway Publications]"
3,Lord of the Rings,[Cedco Publishing],2001.0,[Cedco Publishing Company]
4,The lord of the rings,[Jude Fisher],2001.0,"[HarperCollins, Houghton Mifflin]"
...,...,...,...,...
358,Secrets of the Alchemist Star Lord of Wars in ...,[Dan Plouff],2020.0,[Independently Published]
359,Lord of &Tau;he Rings - the BEST of - Coloring...,[Jason Morin],2022.0,[Independently Published]
360,&Tau;he Lord of &Tau;he Rings _ the BEST of _ ...,[Mark Ross],2022.0,[Independently Published]
361,&Tau;olkien's World Coloring Book - EXCLUSIVE ...,[Sam Streeter],2022.0,[Independently Published]
