# OpenSky Web Services + Zotero Integration Experiment
### Keith E. Maull<sup>1</sup>, Michaeleen Trimarchi<sup>2</sup> and Mike Wright<sup>3</sup>

##### September 20, 2016

#### NCAR Library, National Center for Atmospheric Research
1. kmaull@ucar.edu 
2. trimarchi@ucar.edu
3. mwright@ucar.edu

In [1]:
import requests
import json
import sys

### CONFIGURATION FILE

To run the code as demonstrated here, you will need a JSON file named `config.json` in the same directory as this notebook, with the following format and information:

``` json
    {
        "zoterokey":  "<your Zotero key to get WRITE access to your group>",
        "zoterogroup": "<the group link of the form groups/XXXXX>"
    }
```
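
Before running the rest of the notebook, it can help to confirm that `config.json` contains both keys. The following is a minimal sketch and not part of the original notebook: the helper name `load_config` is hypothetical, and the `groups/` prefix check is an assumption based on the format described above.

```python
import json

REQUIRED_KEYS = ("zoterokey", "zoterogroup")


def load_config(path="config.json"):
    """Load the notebook configuration and check the expected keys (hypothetical helper)."""
    with open(path) as config_file:
        config = json.load(config_file)

    # Report every missing or empty key at once.
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError("config is missing: {}".format(", ".join(missing)))

    # Assumption: the group link has the form groups/XXXXX, as described above.
    if not config["zoterogroup"].startswith("groups/"):
        raise ValueError("zoterogroup should look like groups/XXXXX")

    return config
```

Failing early here gives a clearer message than a rejected request later in the notebook.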

In [2]:
ZOTERO_API_WS = "https://api.zotero.org"
OSWS_API_WS   = "http://cypressvm.dls.ucar.edu:8788/osws"

# read the configuration and close the file handle when done
with open('config.json') as config_file:
    CONFIG = json.load(config_file)

NAR_GROUP     = CONFIG['zoterogroup']
ZAPI_KEY      = CONFIG['zoterokey']

### ZOTERO ARTICLE METADATA OBJECT
To get data into the Zotero group library, we first need metadata for each article.  In this case, we use [Crossref's content negotiation](http://www.crosscite.org/cn/) to get it.  While the metadata _does exist in the OSWS metadata_, we fetch it from Crossref here so we can show another service interaction.

In [3]:
def make_zotero_article_metadata(doi):
    article = {
        "itemType": "journalArticle",
        "DOI": doi,
        "URL": "http://dx.doi.org/{}".format(doi)
    }

    headers = {'Accept': 'application/vnd.citationstyles.csl+json'}
    # ask the DOI resolver for CSL JSON via content negotiation
    r = requests.get("http://dx.doi.org/{}".format(doi), headers=headers)
    if r.status_code == 200:
        try:
            resp_payload = r.json()
            article['title'] = resp_payload['title']
        except ValueError as e:
            # raised when the response body is not valid JSON
            print("[DOI:{}] / {}".format(doi, e))

    return article
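
The function above copies only the title. As a sketch of how more of the CSL JSON response could be carried over, the hypothetical helper below maps a few common CSL JSON fields (`container-title`, `issued`, `author`) onto Zotero `journalArticle` fields. It is an illustration under those assumptions, not the notebook's method, and the sample record in the usage note is made up.

```python
def csl_to_zotero_fields(csl):
    """Map a few CSL JSON fields onto Zotero journalArticle fields (illustrative only)."""
    fields = {}

    if csl.get("title"):
        fields["title"] = csl["title"]

    # CSL JSON names the journal "container-title"; Zotero calls it "publicationTitle".
    if csl.get("container-title"):
        fields["publicationTitle"] = csl["container-title"]

    # CSL dates are nested as {"date-parts": [[year, month, day]]}.
    parts_list = csl.get("issued", {}).get("date-parts") or [[]]
    date_parts = [part for part in parts_list[0] if part is not None]
    if date_parts:
        fields["date"] = "-".join("{:02d}".format(int(part)) for part in date_parts)

    # Only personal names with a family name are mapped in this sketch.
    creators = []
    for author in csl.get("author", []):
        if "family" in author:
            creators.append({
                "creatorType": "author",
                "firstName": author.get("given", ""),
                "lastName": author["family"],
            })
    if creators:
        fields["creators"] = creators

    return fields
```

Inside `make_zotero_article_metadata`, the result could be merged with `article.update(csl_to_zotero_fields(resp_payload))` in place of the single title assignment.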

### OSWS SEARCH PAYLOAD
The previous code extracts the title from Crossref, but in reality we already have it, since OSWS returns the **full** metadata record for each article.  The code could therefore easily be adapted to take the title, as well as tags, from the OSWS metadata, as will be shown later. Learn more about the OSWS requests and data payloads [here](https://docs.google.com/document/d/1gRNhtWkCYFd4Ho9R4X_1J0z_YUEIBhw33e_oRlZ4YYE).

An example payload will look something like this:

```json
{
    "OpenSkyWebService": {
        "Search": {
            "results": {
                "result": [
                    {
                        "head": {
                            "nldrCitableUrl": "http://nldr.library.ucar.edu/repository/collections/OSGC-000-000-022-305",
                            "PID": "articles:18270",
                            "keyDateYMD": "2016-02-01",
                            "upid": [
                                "5394",                                
             ...
}
```
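
To make the nesting concrete, here is a small sketch that walks a payload of this shape and pulls a few `head` fields from each record. The helper name `summarize_osws_results` is hypothetical, it uses only keys that appear in this notebook (`PID`, `keyDateYMD`, `doi`), and the DOI in the usage note is a made-up placeholder.

```python
def summarize_osws_results(payload):
    """Return a (PID, keyDateYMD, doi) tuple for each record in an OSWS search payload."""
    results = payload["OpenSkyWebService"]["Search"]["results"]["result"]

    rows = []
    for result in results:
        head = result.get("head", {})
        # .get() returns None for any field a record does not carry.
        rows.append((head.get("PID"), head.get("keyDateYMD"), head.get("doi")))
    return rows
```

For a payload holding one record with `"PID": "articles:18270"`, `"keyDateYMD": "2016-02-01"` and a placeholder `"doi": "10.0000/example"`, the function returns a single tuple with those three values.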

In [4]:
def get_osws_doi_list(year=2016):
    doi_list = []

    # search for all peer-reviewed articles dated within the year provided
    OSWS_QRY = \
        '''{base}/search/v1?q=(date:[{year}-01-01T00:00:00.000Z TO {year}-12-31T23:59:59.999Z] AND collection:"articles")&output=json'''\
        .format(base=OSWS_API_WS, year=year)

    r = requests.get(OSWS_QRY)
    if r.status_code == 200:
        resp_payload = r.json()
        results = resp_payload["OpenSkyWebService"]["Search"]["results"]["result"]

        # extract just the DOI for now, skipping records that have none
        for result in results:
            doi = result['head'].get('doi')
            if doi:
                doi_list.append(doi)

    return doi_list
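
The search URL in `get_osws_doi_list` is assembled as one long string. As a sketch of an alternative, the hypothetical helper `build_osws_query` below builds the same kind of URL for an arbitrary year and URL-encodes the query with the standard library. It is written for Python 3, and it assumes the OSWS endpoint accepts an encoded `q` parameter, which has not been verified here.

```python
from urllib.parse import urlencode

OSWS_API_WS = "http://cypressvm.dls.ucar.edu:8788/osws"


def build_osws_query(year, base=OSWS_API_WS):
    """Build an OSWS search URL for all articles dated within the given year."""
    date_range = "date:[{0}-01-01T00:00:00.000Z TO {0}-12-31T23:59:59.999Z]".format(year)
    query = '({} AND collection:"articles")'.format(date_range)

    # urlencode escapes spaces, quotes, and brackets in the query string.
    return "{}/search/v1?{}".format(base, urlencode({"q": query, "output": "json"}))
```

Keeping URL construction in its own function makes it possible to check the query for a given year without sending a request.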

### INSERTING DATA INTO THE ZOTERO GROUP
We will use the Zotero API to push data to the group using the `items` (POST) endpoint.  Read the [full documentation](https://www.zotero.org/support/dev/web_api/v3/start) for details of other methods that may suit your needs.

In [5]:
def push_to_zotero_group(group_id, group_api_key, article):
    # POST the item to the group library; the API expects a JSON array of items
    # print(json.dumps(article))

    headers = {'Content-Type': 'application/json',
               'Authorization': 'Bearer {}'.format(group_api_key)}

    r = requests.post('{}/{}/items'.format(ZOTERO_API_WS, group_id), headers=headers, data=json.dumps([article]))
    if r.status_code == 200:
        print("[INFO] Push to Zotero successful.")
    else:
        print("[WARN] Push to Zotero NOT successful. ({})".format(r.status_code))
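
`push_to_zotero_group` sends one item per request. Since the endpoint takes a JSON array, several items can go into a single POST; the Zotero write documentation describes a limit of 50 objects per request, which is the assumption behind the default below. The helper name `batch_payloads` is hypothetical, and this is a sketch rather than the notebook's method.

```python
import json


def batch_payloads(articles, batch_size=50):
    """Yield JSON request bodies holding at most batch_size items each."""
    for start in range(0, len(articles), batch_size):
        yield json.dumps(articles[start:start + batch_size])
```

Each yielded string could be passed as the `data` argument of the `requests.post` call above, replacing `json.dumps([article])`.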

### LET'S TRY IT OUT!

We're going to grab data from OSWS, get metadata from Crossref and push data to Zotero as shown in the diagram below:

<img src="./assets/overview.png"/>

In [6]:
doi_list = get_osws_doi_list()

# only the first DOI is pushed here; remove the [:1] slice to process the full list
for doi in doi_list[:1]:
    try:
        article_md = make_zotero_article_metadata(doi)
        push_to_zotero_group(NAR_GROUP, ZAPI_KEY, article_md)
    except Exception:
        print("[ERROR]: {} / \n\t{}".format(doi, sys.exc_info()))

[INFO] Push to Zotero successful.
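
A 200 status reports that the write request completed, but per-item outcomes are carried in the response body. The sketch below counts them, assuming the layout described in the Zotero Web API v3 write documentation: `success`, `unchanged`, and `failed` objects keyed by the index of each submitted item. The helper name `summarize_write_response` is hypothetical.

```python
def summarize_write_response(body):
    """Count per-item outcomes in a Zotero multi-object write response (assumed layout)."""
    failed = body.get("failed", {})
    return {
        "success": len(body.get("success", {})),
        "unchanged": len(body.get("unchanged", {})),
        "failed": len(failed),
        # Collect the error messages in submission order for logging.
        "messages": [failed[index].get("message", "") for index in sorted(failed)],
    }
```

In `push_to_zotero_group`, this could be called as `summarize_write_response(r.json())` before reporting success.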


### SUCCESS!

After executing this on the full set of 2016 publications, title and DOI metadata are now in Zotero (some 464 publications in total).  These are all part of the [OpenSky articles collection](http://opensky.ucar.edu/islandora/object/research%3Aarticles), and we are able to see them all in the Zotero group they were uploaded to.

<img src="./assets/zotero_screenshot.png"/>
<img src="./assets/zotero_screenshot2.png"/>