### Use the BookOps WorldCat wrapper to pull brief bibliographic data about other editions based on a list of OCLC numbers

##### Import libraries
This section loads the packages needed to work with data, send API requests, and use the BookOps wrapper with OCLC's WorldCat API.

In [None]:
import pandas as pd
import requests
from bookops_worldcat import WorldcatAccessToken
import time
import re
import json

##### Configure access token
This section holds the credentials required by the WorldCat API. Replace 'mykey' and 'mysecret' with your own WSKey and secret before running the notebook.

In [None]:
#Configure access token
WORLDCAT_KEY = 'mykey'
WORLDCAT_SECRET = 'mysecret'
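#The scope must be one your WSKey is authorized for and must match the API queried below.
#If requests fail with 401 or 403, check whether the Search API scope ('wcapi') is needed instead.
SCOPES = 'WorldCatMetadataAPI'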
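
Hard-coded credentials are easy to share by accident when a notebook is passed around. As an alternative, the sketch below reads them from environment variables. The variable names WORLDCAT_KEY and WORLDCAT_SECRET and the helper name load_credentials are assumptions made for illustration; they are not part of the BookOps wrapper.

```python
import os

def load_credentials(env=None):
    """Return (key, secret) read from environment variables, or raise if either is missing."""
    # env defaults to the real environment; a plain dict can be passed in for testing
    env = os.environ if env is None else env
    key = env.get("WORLDCAT_KEY", "").strip()
    secret = env.get("WORLDCAT_SECRET", "").strip()
    if not key or not secret:
        raise RuntimeError("Set WORLDCAT_KEY and WORLDCAT_SECRET before running the notebook.")
    return key, secret
```

With this in place, the configuration cell could read `WORLDCAT_KEY, WORLDCAT_SECRET = load_credentials()` instead of assigning literal strings.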

##### Configure files
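This section sets the path and name of the Excel file that will be read (INPUT_FILE) and of the Excel file where results will be saved (OUTPUT_FILE). INPUT_FILE is read with pandas' read_excel, so include the file extension (for example, .xlsx).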

In [None]:
#Configure files
INPUT_FILE = 'FILENAME'
OUTPUT_FILE = 'FILENAME.xlsx'

##### Generate an access token
The **get_token** function uses the API credentials specified above to request a token for the WorldCat API. Tokens expire after twenty minutes; the **main** function checks for expiry before each query and requests a new token when needed.

In [None]:
#Generate an access token
def get_token():
    return WorldcatAccessToken(
        key=WORLDCAT_KEY,
        secret=WORLDCAT_SECRET,
        scopes=SCOPES
    )
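
The refresh behavior described above can be sketched as a small helper. This is a minimal sketch, not part of the wrapper: ensure_token is a hypothetical name, and it assumes that is_expired may be exposed either as a method or as a property depending on the installed bookops-worldcat version, so it accepts both forms. Check the behavior of your installed version before relying on it.

```python
def ensure_token(token, make_token):
    """Return a usable token, calling make_token() when token is missing or expired."""
    if token is None:
        return make_token()
    expired = token.is_expired
    # Assumption: is_expired may be a method (call it) or a property (already a bool)
    if callable(expired):
        expired = expired()
    return make_token() if expired else token
```

Inside a loop over OCLC numbers, this would be called as `token = ensure_token(token, get_token)` before each request.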

##### Get OCLC numbers

The **get_other_editions** function takes an OCLC number from INPUT_FILE, queries the WorldCat API, and returns one row for each related OCLC number (other edition). Each row carries the RECORD_ID of the input record, the related OCLC number, and three further fields:
1. Other ISBNs: All ISBNs associated with the related OCLC number, pipe-delimited in a single field
2. Format: Archv, ArtChapter, AudioBook, Book, CompFile, Encyc, Game, Image, IntMM, Jrnl, Kit, Map, MsScr, Music, News, Object, Snd, Toy, Video, Vis, Web
3. Specific format: 2D, Artcl, Bluray, Braille, Cassette, CD, Chptr, Continuing, Digital, DVD, Encyc, Film, LargePrint, LP, Mic, mss, PrintBook, rec, Thsis, VHS

The format fields are included to allow for optional later filtering to include digital or print books only.
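
To make the shape of the output concrete, the sketch below flattens a made-up response into rows without calling the API, then keeps only print books. The sample payload and the helper name flatten_brief_records are illustrative assumptions; only the key names (briefRecords, oclcNumber, isbns, generalFormat, specificFormat) are taken from the function in the next cell.

```python
def flatten_brief_records(data, record_id):
    """Turn a brief-bibs style payload into one row per related record."""
    rows = []
    for rec in data.get("briefRecords", []):
        isbns = rec.get("isbns", [])
        rows.append({
            "OTHER_OCLCS": str(rec.get("oclcNumber", "None")).strip(),
            # Pipe-delimit the ISBNs so they fit in a single field
            "OTHER_ISBNS": " | ".join(isbns) if isinstance(isbns, list) else str(isbns),
            "FORMAT": str(rec.get("generalFormat", "None")),
            "SPECIFIC_FORMAT": str(rec.get("specificFormat", "None")),
            "RECORD_ID": record_id,
        })
    return rows

# Invented sample data, for illustration only
sample = {
    "briefRecords": [
        {"oclcNumber": "1111", "isbns": ["9780000000001", "9780000000002"],
         "generalFormat": "Book", "specificFormat": "PrintBook"},
        {"oclcNumber": "2222", "isbns": [],
         "generalFormat": "Book", "specificFormat": "Digital"},
    ]
}

rows = flatten_brief_records(sample, record_id="REC-1")

# Optional later filtering: keep print books only
print_only = [r for r in rows if r["SPECIFIC_FORMAT"] == "PrintBook"]
```

Here rows holds two entries that share the same RECORD_ID, and print_only keeps just the PrintBook entry.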

In [None]:
#Get other editions - OCLC numbers
def get_other_editions(oclc_number, token, record_id):
    try:
        url = f'https://americas.discovery.api.oclc.org/worldcat/search/v2/brief-bibs/{oclc_number}/other-editions'
        headers = {
            'Authorization': f'Bearer {token.token_str}',
            'Accept': 'application/json'
        }

        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
        data = response.json()

        rows = []

        for rec in data.get("briefRecords", []):
            rec_oclcs = rec.get("oclcNumber", [])
            isbns = rec.get("isbns", [])
            general_format = rec.get("generalFormat", "None")
            specific_format = rec.get("specificFormat", "None")

            #Pipe-delimit the ISBNs so they fit in a single field
            isbn_str = " | ".join(isbns) if isinstance(isbns, list) else str(isbns)

            #oclcNumber may be a single value or a list; emit one row per number
            oclc_values = rec_oclcs if isinstance(rec_oclcs, list) and rec_oclcs else [rec_oclcs]
            for oclc in oclc_values:
                rows.append({
                    "OTHER_OCLCS": str(oclc).strip(),
                    "OTHER_ISBNS": isbn_str.strip(),
                    "FORMAT": str(general_format),
                    "SPECIFIC_FORMAT": str(specific_format),
                    "RECORD_ID": record_id
                })

        #Return a placeholder row when the API reports no other editions
        if not rows:
            rows.append({
                "OTHER_OCLCS": "None",
                "OTHER_ISBNS": "None",
                "FORMAT": "None",
                "SPECIFIC_FORMAT": "None",
                "RECORD_ID": record_id
            })

        return rows

    except Exception as e:
        print(f"[ERROR] {oclc_number}: {e}")
        return [{"OTHER_OCLCS": "None", "OTHER_ISBNS": "None", "FORMAT": "None", "SPECIFIC_FORMAT": "None", "RECORD_ID": record_id}]


##### Run the workflow

The **main** function performs the following steps:
1. Reads INPUT_FILE
2. Creates an API token
3. Runs through each OCLC number and sends a query to get all related OCLC numbers and their corresponding data
4. Collects the results of the queries and merges them back with the original data
5. Exports the complete dataset as an Excel file
6. Filters the results to include print books only
7. Exports the print-only dataset as a separate Excel file, or prints a message if no print book matches were found

Depending on the structure of INPUT_FILE, field names may need to be updated. The script below expects columns named "RECORD_ID" and "OCLC_NUMBER". If a file uses "network_number" instead of "OCLC_NUMBER", either update the script to use "network_number" or rename the column in INPUT_FILE to "OCLC_NUMBER".
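
One way to tolerate differing field names without editing the file is to look up the column from a list of accepted names. The sketch below is a hypothetical helper (resolve_column is not part of pandas or the BookOps wrapper), and the candidate names are assumptions based on the example above.

```python
def resolve_column(columns, candidates):
    """Return the first name in candidates that appears in columns (case-insensitive)."""
    # Map normalized header names back to their original spelling
    lookup = {str(col).strip().lower(): col for col in columns}
    for name in candidates:
        match = lookup.get(name.strip().lower())
        if match is not None:
            return match
    raise KeyError(f"None of {candidates} found in columns: {list(columns)}")

# Example header row from a hypothetical export
columns = ["RECORD_ID", "network_number", "TITLE"]
oclc_column = resolve_column(columns, ["OCLC_NUMBER", "network_number"])
```

In the script, the same call could take the DataFrame's columns as its first argument, and the result could replace the literal 'OCLC_NUMBER' when reading each row. The dtype mapping passed to read_excel names the column as well, so it would need the same treatment.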

In [None]:
#Run query and export results
def main():
    oclclist_df = pd.read_excel(INPUT_FILE, dtype={'RECORD_ID': str, 'OCLC_NUMBER': str})
    all_results = []
    token = get_token()

    for _, row in oclclist_df.iterrows():
        oclc_number = row['OCLC_NUMBER']
        record_id = row['RECORD_ID']
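        #Skip rows with a missing or blank OCLC number (empty cells are read as NaN, which is truthy)
        if pd.isna(oclc_number) or not str(oclc_number).strip():
            continue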

        if token.is_expired():
            print("Refreshing token!")
            token = get_token()

        rows = get_other_editions(oclc_number, token, record_id)
        all_results.extend(rows)

        print(f"{oclc_number}: {len(rows)} rows returned")
        time.sleep(0.2)

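    #Name the columns explicitly so the merge below still works if no rows were collected
    results_df = pd.DataFrame(all_results, columns=["OTHER_OCLCS", "OTHER_ISBNS", "FORMAT", "SPECIFIC_FORMAT", "RECORD_ID"])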

    final_df = oclclist_df.merge(results_df, on="RECORD_ID", how="left")
    final_df.to_excel(OUTPUT_FILE, index=False)
    print(f"Other editions data exported to {OUTPUT_FILE}.")

    #Optional filter to export separate file with only rows matching parameters, update or comment out as needed
    filtered_df = final_df[final_df['SPECIFIC_FORMAT'] == 'PrintBook']
    if not filtered_df.empty:
        filtered_file = OUTPUT_FILE.replace(".xlsx", "_PrintOnly.xlsx")
        filtered_df.to_excel(filtered_file, index=False)
        print(f"PrintBook data exported to {filtered_file}")
    else:
        print("No rows with SPECIFIC_FORMAT = 'PrintBook' found.")

if __name__ == "__main__":
    main()
