This notebook works through adding W3ID redirect URLs to every item in the NI 43-101 Technical Reports library. These URLs are permanent in the sense that we can maintain the redirection rule over time. A basic GET request with no special Accept header sends the user to the web page view of an item. Sending an "application/json" Accept header returns the API response with the JSON structure for the item.
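The two request styles described above can be sketched with the standard library. This is only an illustration and not part of the update workflow: the helper name `build_w3id_request` and the item key `ABCD1234` are hypothetical, and the requests are constructed but never sent, so nothing here demonstrates what the redirect actually returns.

```python
from urllib.request import Request

W3ID_BASE = "https://w3id.org/usgs/z"


def build_w3id_request(library_id, item_key, want_json=False):
    """Prepare (but do not send) a GET request for an item's W3ID URL."""
    request = Request(f"{W3ID_BASE}/{library_id}/{item_key}", method="GET")
    if want_json:
        # Ask for the JSON structure instead of the web page view
        request.add_header("Accept", "application/json")
    return request


# "ABCD1234" is a made-up item key used only for illustration
page_request = build_w3id_request("4530692", "ABCD1234")
json_request = build_w3id_request("4530692", "ABCD1234", want_json=True)

print(page_request.full_url)
print(json_request.get_header("Accept"))
```

Passing either object to `urllib.request.urlopen()` would send the request and follow the redirect.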

The process here is something we might run for certain specialized GeoArchive collections in Zotero, where all items, or all items matching a specific search parameter, will have the W3ID URLs added. It uses basic pyzotero methods: we send only the key, the version, and the field we want modified (url) to the update_items() method. We build the URL with a simple rule that follows what we set up in the W3ID redirect for "/usgs/z/".

We can only send 50 items at a time, so that is the limit we set in the search for items we want to update. I used a lazy method of iterating over the entire library. It's possible that the Zotero API would accept multiple requests in parallel if I pulled all data first and then split it into chunks across workers, but I've been booted off in the past when trying that. So it's safest to work within the limits Zotero imposes and let the job complete over a period of time.
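For the alternative mentioned above (pulling all data first and splitting it up afterwards), the 50-item limit could be respected with a small chunking helper. This is a minimal sketch and not what the notebook does: `chunk_items` and `BATCH_SIZE` are hypothetical names, and the list of placeholder payloads stands in for real Zotero items.

```python
BATCH_SIZE = 50  # the per-request limit described above


def chunk_items(items, size=BATCH_SIZE):
    """Yield consecutive slices of `items`, each no longer than `size`."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in data: 120 placeholder payloads instead of real Zotero items
fake_updates = [{"key": f"KEY{n:05d}", "version": 1} for n in range(120)]
batch_sizes = [len(batch) for batch in chunk_items(fake_updates)]
print(batch_sizes)  # [50, 50, 20]
```

Each yielded slice could then be handed to a single update call, one after another, which keeps the request rate low.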

In [1]:
import os
from pyzotero import zotero

In [2]:
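def add_w3id_url(item):
    """Build the minimal update payload that sets an item's url to its W3ID URL."""
    library_id = item["library"]["id"]
    item_key = item["key"]
    # Follows the "/usgs/z/" rule set up in the W3ID redirect
    w3id_url = f"https://w3id.org/usgs/z/{library_id}/{item_key}"

    # Only the key, the version, and the field being modified are sent
    item_update = {
        "key": item_key,
        "version": item["version"],
        "url": w3id_url
    }

    return item_update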

# We could incorporate additional logic here to handle adding URLs in cases outside the control of query criteria
def process_batch(batch):
    updates = [add_w3id_url(i) for i in batch]
    return updates
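One form the additional logic mentioned above might take is skipping items whose URL is already the W3ID URL. The sketch below is an assumption-laden illustration: `expected_w3id_url` and `needs_w3id_url` are hypothetical helpers, the item keys are made up, and it assumes the item's current URL is available under `item["data"]["url"]`.

```python
def expected_w3id_url(item):
    """Rebuild the W3ID URL using the same rule as add_w3id_url()."""
    return f"https://w3id.org/usgs/z/{item['library']['id']}/{item['key']}"


def needs_w3id_url(item):
    """Return True when the item's current url differs from its W3ID URL."""
    # Assumption: the current value lives under item["data"]["url"]
    current_url = item.get("data", {}).get("url", "")
    return current_url != expected_w3id_url(item)


# Two hand-made items with hypothetical keys, shaped like the fields used above
sample_batch = [
    {"key": "AAAA1111", "version": 3, "library": {"id": 4530692},
     "data": {"url": "https://w3id.org/usgs/z/4530692/AAAA1111"}},
    {"key": "BBBB2222", "version": 7, "library": {"id": 4530692},
     "data": {"url": ""}},
]

pending = [item["key"] for item in sample_batch if needs_w3id_url(item)]
print(pending)  # ['BBBB2222']
```

A filter like this could be applied inside process_batch() before building the update payloads, so that items already carrying the W3ID URL are left untouched.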

In [3]:
# Could be modified to make the library ID a variable
z_ni43101 = zotero.Zotero(
    library_id="4530692",
    library_type="group",
    api_key=os.environ["Z_API_KEY"]
)

In [4]:
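# Could include additional criteria here if needed
# limit=50 matches the number of items we send per update
report_items = z_ni43101.items(itemType="report", limit=50)
# Generator that yields each following page of results for the request above
report_items_iterator = z_ni43101.iterfollow()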

In [5]:
total_items = len(report_items)

initial_batch_updates = process_batch(report_items)
results_update = z_ni43101.update_items(initial_batch_updates)
print("PROCESSED INITIAL BATCH", len(initial_batch_updates), results_update)

for item_batch in report_items_iterator:
    batch_updates = process_batch(item_batch)
    results_update = z_ni43101.update_items(batch_updates)
    print("PROCESSED NEXT BATCH", len(batch_updates), results_update)
    total_items += len(item_batch)

print("TOTAL ITEMS PROCESSED", total_items)

PROCESSED INITIAL BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED NEXT BATCH 50 True
PROCESSED N