In this notebook, I address the problem I raised in a GitHub issue: how to keep things straight between a shared Zotero library and the processing that a) establishes linkages to the xDD Digital Library and b) extracts data and information from the associated xDD documents. I reworked the original process, which operated against an export from the Zotero library, into one that works against the Zotero API via the pyzotero package.

I'm breaking this up into a two-part exploration. This first part simply links each library item, by DOI, to the corresponding xDD document identifier and records that link as a link attachment on the Zotero library item. That establishes a base of operations for subsequent data extraction processes, keeping everything on established Zotero library items where continued curation can occur via web or desktop clients.

To run this code, you will need to set the ZOTERO_API_KEY environment variable in your Python environment. That key is tied to a user account that has read/write permissions to the WLCI Library Group (identified by the group ID parameter).
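As a minimal sketch of that setup step, the helper below reads the key and fails early with a readable message when it is missing. The function name `get_zotero_api_key` is hypothetical and not part of pyzotero; only the `ZOTERO_API_KEY` variable name comes from this notebook.

```python
import os


def get_zotero_api_key(env=os.environ):
    """Return the Zotero API key, or raise a clear error if it is not set.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to try without touching real settings.
    """
    key = env.get("ZOTERO_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "Set the ZOTERO_API_KEY environment variable before running this notebook."
        )
    return key
```

Calling `get_zotero_api_key()` before constructing the pyzotero client gives a clearer failure than the bare `KeyError` that `os.environ["ZOTERO_API_KEY"]` raises.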

The same idea should apply to any other Zotero library where we want to link a library reference to its xDD digital NLP representation. In the second part of this experiment, I will pick up from the established "xDD Document Link" identifier, pull back a list of associated species scientific names, and put those into a child attachment of the xDD Document Link attachment. This begins to lay out structured annotation, retrieved via algorithms, directly in the Zotero library items. It adds value to those items and uses the Zotero library as a living cloud-based repository with both human-generated and software-generated value.

In [1]:
from pyzotero import zotero
import os
import requests
from datetime import datetime
from IPython.display import display

In [2]:
wlci_library_group_id = "2341914"

In [3]:
wlci_lib = zotero.Zotero(wlci_library_group_id, "group", os.environ["ZOTERO_API_KEY"])

In [4]:
wlci_lib_items = wlci_lib.everything(wlci_lib.top())

In [5]:
print(len(wlci_lib_items))

299


In [6]:
set([i["data"]["itemType"] for i in wlci_lib_items])

{'book',
 'document',
 'journalArticle',
 'presentation',
 'report',
 'thesis',
 'videoRecording',
 'webpage'}

In our current work with the xDD library, we are only working on those articles for which we were able to establish a DOI in our Zotero library. DOIs either came from the original article metadata imported into the library and are in the DOI property, or they were added as a note and will show up in the extra property for now. (Need to revisit how we manage this in the library.) To work these over, I assemble a list of Zotero IDs and DOIs.
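Because the extra field is free text, a DOI stored there may carry a label or sit alongside other notes. The sketch below is one possible way to normalize both sources into a bare DOI; it is not what the cell that follows does. The helper name `doi_from_item_data` and the regular expression are assumptions, and the pattern is a loose approximation of DOI syntax rather than a full validator.

```python
import re

# Loose DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix
# with no whitespace. This is an approximation, not a complete DOI grammar.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def doi_from_item_data(data):
    """Return a DOI from a Zotero item's data dict, or None if none is found.

    Prefers the dedicated "DOI" field and falls back to searching "extra".
    """
    doi = (data.get("DOI") or "").strip()
    if doi:
        return doi
    match = DOI_PATTERN.search(data.get("extra") or "")
    return match.group(0) if match else None
```

With a helper like this, an OCLC number in the extra field simply yields `None` because it does not match the DOI shape.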

In [7]:
# Items with a populated DOI field.
lookup_doi_list = [
    (i["data"]["key"], i["data"]["DOI"])
    for i in wlci_lib_items
    if i["data"].get("DOI")
]
# Item types without a DOI field: use "extra" instead, skipping OCLC numbers.
lookup_doi_list.extend([
    (i["data"]["key"], i["data"]["extra"])
    for i in wlci_lib_items
    if "DOI" not in i["data"]
    and i["data"].get("extra")
    and i["data"]["extra"].split(":")[0] != "OCLC"
])

We will eventually pull these functions out into our Python package for this work. I built on what Daniel started here, taking a somewhat different approach to querying the xDD API in xdd_api. I also added a helper function, xdd_link_attachment, that fills the Zotero attachment template with the necessary information.

I had to fork the pyzotero package and create a [branch](https://github.com/skybristol/pyzotero/tree/attachment-template-type) with an adjustment to the item_template function, which was failing on the "attachment" template type and needed an additional parameter against the Zotero REST API. I'll post this back to the project as a pull request for consideration.

In [12]:
def xdd_api(route, params):
    """Query an xDD API route and return the records in the response.

    See https://geodeepdive.org/api for more detail.

    Parameters
    ----------
    route : str
        Name of an available xDD API route, such as 'articles'.
    params : str
        Query string of parameter=value pairs separated by &.

    Returns
    -------
    The 'data' content of a successful response, otherwise None.
    """
    base_url = 'https://geodeepdive.org/api'
    search = base_url + '/' + route + '?' + str(params)
    r = requests.get(search)
    # Treat any non-200 status as "no result".
    if r.status_code != 200:
        return None
    json_r = r.json()
    # Responses without a 'success' key carry no usable data.
    if 'success' not in json_r:
        return None
    return json_r['success']['data']

    
def xdd_link_attachment(parent_id, xdd_record, template):
    template["parentItem"] = parent_id
    template["title"] = "xDD Document Link"
    template["url"] = f"https://geodeepdive.org/api/articles?docid={xdd_record['_gddid']}"
    template["accessDate"] = "CURRENT_TIMESTAMP"
    template["note"] = "Link to xDD document established through search algorithm"
    return template
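DOIs can contain characters such as `/`, `(`, or `;` that have meaning inside a URL, so one option is to build the query string from a dictionary and let the standard library handle escaping. This is a sketch of an alternative to the string concatenation in xdd_api, not a change to it. `build_xdd_url` is a hypothetical helper, and I am assuming the xDD API decodes percent-encoded parameter values in the usual way.

```python
from urllib.parse import urlencode


def build_xdd_url(route, params, base_url="https://geodeepdive.org/api"):
    """Build an xDD request URL with percent-encoded query parameters.

    `params` is a dict such as {"max": 1, "doi": "10.1002/jwmg.123"}.
    """
    return f"{base_url}/{route}?{urlencode(params)}"
```

The same effect is available by passing the dictionary to `requests.get(url, params=...)`, which encodes the values before sending the request.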

Here I run through the list of DOIs and Zotero IDs to check xDD for a link. This could be rearranged in a variety of ways and will need to be explored further for production use. I do check to make sure we don't already have a link of the appropriate "type" (based on assigning a particular title to the attachment).
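The duplicate check described above can be sketched in isolation from the API calls. The helper name `find_child_by_title` is hypothetical; it works on a list of child item dicts shaped like the ones pyzotero returns. It reads the title with `.get()` because, as far as I know, child notes have no title field in their data.

```python
def find_child_by_title(child_items, title="xDD Document Link"):
    """Return the first child item whose data title equals `title`, else None."""
    return next(
        (c for c in child_items if c.get("data", {}).get("title") == title),
        None,
    )


# Illustrative child items (made-up keys), one note and one link attachment.
example_children = [
    {"key": "NOTE0001", "data": {"itemType": "note", "note": "<p>todo</p>"}},
    {"key": "LINK0001", "data": {"itemType": "attachment", "title": "xDD Document Link"}},
]
existing = find_child_by_title(example_children)
```

Matching on the title keeps the process safe to rerun: an item that already has a link is skipped rather than given a second one.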

In [14]:
for ref in lookup_doi_list:
    print(ref[0], ref[1])
    xdd_data = xdd_api(
        'articles', 
        'max=1&doi='+str(ref[1])
    )

    # Skip references with no xDD match (None or an empty result list).
    if xdd_data:
        child_items = wlci_lib.children(ref[0])
        # Use .get() so children without a "title" field (such as notes) do not raise a KeyError.
        current_xdd_doc_attachment = next((i for i in child_items if i["data"].get("title") == "xDD Document Link"), None)
        
        if current_xdd_doc_attachment is None:
            # No link yet: fill in the attachment template and create the child item.
            xdd_doc_attachment = xdd_link_attachment(ref[0], xdd_data[0], wlci_lib.item_template("attachment", linkmode="linked_url"))
            create_response = wlci_lib.create_items([xdd_doc_attachment])
            if not create_response["successful"]:
                display(create_response)
        else:
            # Link already present: show it rather than create a duplicate.
            display(current_xdd_doc_attachment)


NNI3PJ5Z 10.1111/1365-2664.12513


{'key': '2QVF3BJH',
 'version': 3112,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/2QVF3BJH',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/2QVF3BJH',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/NNI3PJ5Z',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': '2QVF3BJH',
  'version': 3112,
  'parentItem': 'NNI3PJ5Z',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:42:09Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5897848bcf58f1ac

88PSMFIC 10.1016/j.biocon.2015.02.009


{'key': 'JV8EH3GN',
 'version': 3114,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/JV8EH3GN',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/JV8EH3GN',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/88PSMFIC',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': 'JV8EH3GN',
  'version': 3114,
  'parentItem': '88PSMFIC',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:44:07Z',
  'url': 'https://geodeepdive.org/api/articles?docid=57a84443cf58f170

G9L9FU5W 10.1016/j.ecolmodel.2017.05.017


{'key': 'GDDCHMSE',
 'version': 3116,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/GDDCHMSE',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/GDDCHMSE',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/G9L9FU5W',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': 'GDDCHMSE',
  'version': 3116,
  'parentItem': 'G9L9FU5W',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:48:19Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5acfde1ecf58f17c

W5K4UAZ8 10.1002/jwmg.21179


{'key': '2S6BI6XI',
 'version': 3117,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/2S6BI6XI',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/2S6BI6XI',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/W5K4UAZ8',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': '2S6BI6XI',
  'version': 3117,
  'parentItem': 'W5K4UAZ8',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:48:20Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5c2c22901faed655

2S5E6PXT 10.1111/fme.12303


{'key': 'WQCJ2VNR',
 'version': 3118,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/WQCJ2VNR',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/WQCJ2VNR',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/2S5E6PXT',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': 'WQCJ2VNR',
  'version': 3118,
  'parentItem': '2S5E6PXT',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:53:58Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5d3c21570b45c76c

EDVXJFQK 10.1002/jwmg.123


{'key': '43A79WUT',
 'version': 3119,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/43A79WUT',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/43A79WUT',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/EDVXJFQK',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': '43A79WUT',
  'version': 3119,
  'parentItem': 'EDVXJFQK',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:54:00Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5d4384980b45c76c

EC835SP7 10.3996/022014-JFWM-016
ARXBE7P4 10.1002/ecs2.1817


{'key': '5TGURDT2',
 'version': 3120,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/5TGURDT2',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/5TGURDT2',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/ARXBE7P4',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': '5TGURDT2',
  'version': 3120,
  'parentItem': 'ARXBE7P4',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-28T17:57:46Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5d1ed9d20b45c76c

ZURD2T85 10.1007/s10666-017-9559-1
A2WG577D 10.1002/wmon.1014
K8PT4QFH 10.1007/s00442-010-1768-0
4TCBL9S3 10.1002/jwmg.337
JFV5GDKM 10.1002/jwmg.155
R3NSCC9R 10.1002/jwmg.21386


{'key': '9ETANTY8',
 'version': 3105,
 'library': {'type': 'group',
  'id': 2341914,
  'name': 'WLCI',
  'links': {'alternate': {'href': 'https://www.zotero.org/groups/wlci',
    'type': 'text/html'}}},
 'links': {'self': {'href': 'https://api.zotero.org/groups/2341914/items/9ETANTY8',
   'type': 'application/json'},
  'alternate': {'href': 'https://www.zotero.org/groups/wlci/items/9ETANTY8',
   'type': 'text/html'},
  'up': {'href': 'https://api.zotero.org/groups/2341914/items/R3NSCC9R',
   'type': 'application/json'}},
 'meta': {'createdByUser': {'id': 1119084,
   'username': 'skybristol',
   'name': 'Sky Bristol',
   'links': {'alternate': {'href': 'https://www.zotero.org/skybristol',
     'type': 'text/html'}}}},
 'data': {'key': '9ETANTY8',
  'version': 3105,
  'parentItem': 'R3NSCC9R',
  'itemType': 'attachment',
  'linkMode': 'linked_url',
  'title': 'xDD Document Link',
  'accessDate': '2019-09-27T21:17:21Z',
  'url': 'https://geodeepdive.org/api/articles?docid=5c2c30a41faed655

M7QX7I8U 10.1002/jwmg.21560
HJ28LVB4 10.1016/j.jhydrol.2015.02.020
RUATTVSK 10.1016/j.foreco.2016.01.017
X3I66DJ8 10.1080/2150704X.2015.1072289
BH3VH7QL 10.1111/ele.12772
MSU75H97 10.2747/1548-1603.49.3.378
TRECRZAA 10.1080/01431161.2011.605085
5GTCJ27A 10.1002/ecs2.2113
D78REKE6 10.1002/ieam.4118
K7EW3YQE 10.1890/08-2034.1
F4N9CWU6 10.2193/2008-478
7243RZFL 10.1111/1365-2664.12013
Q9VEKFZX 10.1111/j.1365-2656.2011.01845.x
K4YNMGZX 10.1016/j.biocon.2018.10.020
JK87YUFS 10.1002/ece3.3607
4W5FVZZM 10.1016/j.jag.2014.01.008
9N87KUPL 10.1080/15420353.2014.885925
XYW5M83E 10.1002/jwmg.1050
Q5TCVPYX 10.1002/eap.1512
A3PMD7DU 10.2111/REM-D-12-00056.1
U3MKG3Y5 10.1306/10011212090
822S7HY8 10.1016/j.ecolind.2017.12.033
Z7TMPV6Q 10.1016/j.ecolind.2015.03.002
JHHSGSZ3 10.1117/1.JRS.7.073508
5PXF3TPC 10.1016/j.jag.2011.09.012
LSJ875PV 10.1111/gcb.12852
YK64M5GP 10.1080/17445647.2012.745381
DNANIMHC 10.3133/cir1407
ZP43FL59 10.3133/sir20125025
KVSUPQDN 10.3133/wlci7
FLZKZ5W8 10.3133/ds800
9H4QL3RG 