# Query Data by DOI

This notebook shows how to fetch information about a specific publication identified by its DOI, and handle API errors.

[Download Notebook](https://github.com/researchgraph/augment-api-beta/blob/main/docs/notebooks/doi.ipynb)

Related Notebooks:  
- [orcid notebook](./orcid.ipynb): Query researcher and co-author relationships by ORCID.
- [publications notebook](./publications.ipynb): List a researcher's publications in BibTeX format and visualise the data with a bar plot and word cloud.
- [affiliations notebook](./affiliations.ipynb): Query researchers and affiliations by ORCID, map affiliation data on a world map, and visualise researcher-organisation relationships.


In [1]:
import sys
sys.path.append('../')

# Packages to use API
import requests
import json

# packages to read API_KEY
import os
from os.path import join, dirname
from dotenv import load_dotenv
load_dotenv();
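
Every request in this notebook depends on `API_KEY` being present in the environment, so it can help to fail early when it is missing. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper introduced here for illustration and is not part of the Augment API or python-dotenv.

```python
import os


def require_api_key(env=None, name="API_KEY"):
    """Return the API key from the environment, or raise if it is missing or blank.

    Hypothetical helper for illustration only.
    """
    # Fall back to the real process environment when no mapping is supplied
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; check your .env file.")
    return key


# Example with an explicit mapping instead of the real environment
print(require_api_key({"API_KEY": "my-key"}))
```

Calling `require_api_key()` with no arguments reads from `os.environ`, which is where `load_dotenv()` places the values it finds.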

## API Errors  
To call the API, we load the API_KEY and the DOI to search for into variables and insert them into the URL string. The Python `requests` package then sends those values to the API and returns the data. This section shows two common errors you may encounter when using the Augment API: the DOI passed is invalid, or the API_KEY was not loaded successfully from your environment file.
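
As a rough illustration of how the URL string is assembled, the sketch below builds it with the standard library so that unusual characters in a DOI or key are percent-encoded. `build_doi_url` and `BASE_URL` are hypothetical names chosen for this example; only the endpoint and the `subscription-key` parameter come from the cells in this notebook.

```python
from urllib.parse import quote, urlencode

# Endpoint used by the cells in this notebook
BASE_URL = "https://augmentapi.researchgraph.com/v1/doi"


def build_doi_url(doi, api_key):
    """Return the request URL for a DOI lookup (hypothetical helper)."""
    # Keep the "/" inside the DOI, percent-encode anything else that needs it
    path = quote(doi, safe="/")
    query = urlencode({"subscription-key": api_key})
    return f"{BASE_URL}/{path}?{query}"


# "my-key" is a placeholder, not a real subscription key
print(build_doi_url("10.1038/sdata.2018.99", "my-key"))
```

The resulting string can be passed to `requests.get` in the same way as the f-string URLs used in the cells of this notebook.
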
### DOI Not Found  
Here we assign an invalid value to the DOI variable. When an error occurs, `requests.get()` returns a response object whose status code indicates the type of error, along with an error message that explains it.

In [2]:
# DOI does not exist
API_KEY = os.environ.get("API_KEY")
DOI = "10.1038/XXXX"

url = f'https://augmentapi.researchgraph.com/v1/doi/{DOI}?subscription-key={API_KEY}'
r = requests.get(url)

# print a short confirmation on completion
print('Augment API query complete ', r.status_code)

if r.status_code == 400:
    print(r.json()[0]["error"])

Augment API query complete  400
We have failed to identify this DOI (10.1038/XXXX). If it is a new identifier, it might take a few days to appear on our server.


### Missing API_KEY  
You will receive an authentication error if the API_KEY is not valid.

In [3]:
# Missing API_KEY
API_KEY = ''
DOI = "10.1038/sdata.2018.99"

url = f'https://augmentapi.researchgraph.com/v1/doi/{DOI}?subscription-key={API_KEY}'
r = requests.get(url)

# print a short confirmation on completion
print('Augment API query complete ', r.status_code)

if r.status_code == 401:
    print('Authentication error.', r.json()['message'])

Augment API query complete  401
Authentication error. Access denied due to invalid subscription key. Make sure to provide a valid key for an active subscription.
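
The two error cases above return differently shaped JSON bodies: a list containing a dictionary with an `error` key for status 400, and a dictionary with a `message` key for status 401. A minimal sketch of handling both in one place follows. `describe_response` is a hypothetical helper, and any behaviour beyond the two responses shown in this notebook is an assumption.

```python
def describe_response(status_code, payload):
    """Map a status code and decoded JSON body to a short message.

    Hypothetical helper; the payload shapes mirror the two error
    responses shown in this notebook and are otherwise assumptions.
    """
    if status_code == 200:
        return "OK"
    # 400: a list whose first item carries an "error" message
    if status_code == 400 and isinstance(payload, list) and payload:
        return payload[0].get("error", "Bad request")
    # 401: a dictionary carrying a "message"
    if status_code == 401 and isinstance(payload, dict):
        return "Authentication error. " + payload.get("message", "")
    return f"Unexpected response (status {status_code})"


# Hand-written payloads imitating the shapes above, not real API output
print(describe_response(400, [{"error": "DOI not identified"}]))
print(describe_response(401, {"message": "Access denied"}))
```

With a real response object, this would be called as `describe_response(r.status_code, r.json())`.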


### DOI does exist  
For a valid DOI, the first record in the response is a nested dictionary containing all the data connected to the requested DOI. Its first level has 3 keys, as shown in the output below.

In [4]:
# DOI does exist
API_KEY = os.environ.get("API_KEY")
DOI = "10.1038/sdata.2018.99"

url = f'https://augmentapi.researchgraph.com/v1/doi/{DOI}?subscription-key={API_KEY}'
r = requests.get(url)

# print a short confirmation on completion
print('Augment API query complete ', r.status_code)
# Show the top-level fields of the returned record
print('The data returned has below fields: ',r.json()[0].keys())


Augment API query complete  200
The data returned has below fields:  dict_keys(['nodes', 'relationships', 'stats'])


In the `nodes` dictionary, data is stored under 5 labels, following the Research Graph schema:

In [5]:
r.json()[0]["nodes"].keys()

dict_keys(['datasets', 'grants', 'organisations', 'publications', 'researchers'])
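
To get a quick feel for this structure, the sketch below counts the entries stored under each label. It runs on a small hand-written record, not real API output, and `count_nodes` is a hypothetical helper. Whether these counts match the values in `stats` is not verified here.

```python
def count_nodes(record):
    """Return the number of entries stored under each label in record["nodes"]."""
    nodes = record.get("nodes", {})
    return {label: len(items) for label, items in nodes.items()}


# Illustrative record with the five labels listed above (made-up contents)
sample_record = {
    "nodes": {
        "datasets": [],
        "grants": [],
        "organisations": [{}],
        "publications": [{"doi": "10.1000/example"}],
        "researchers": [{}, {}],
    }
}

print(count_nodes(sample_record))
```

With a real response, the same function could be applied to `r.json()[0]`.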

Each label above holds a list of dictionaries, one per record. To extract the publication we need, iterate through the `publications` list and check for the DOI.

In [6]:
publication = None
if r.status_code == 200:
    # Parse the response once and reuse the first record
    record = r.json()[0]
    # Find the publication whose DOI matches the one we queried
    for p in record["nodes"]["publications"]:
        if p["doi"] == DOI:
            publication = p
            break

if publication is not None:
    print()
    print(f'DOI: {publication["doi"]}')
    print(f'Authors: {publication["authors_list"]}')
    print(f'Title: {publication["title"]}')
    print(f'Publication year: {publication["publication_year"]}')
    print()
    print(f'The publication "{publication["title"]}" is connected to {record["stats"]}.')
else:
    # Avoid a NameError when the request failed or no record matched
    print(f'No publication found for DOI {DOI}.')


DOI: 10.1038/sdata.2018.99
Authors: Amir Aryani, Marta Poblet, Kathryn Unsworth, Jingbo Wang, Ben Evans, Anusuriya Devaraju, Brigitte Hausstein, Claus-Peter Klas, Benjamin Zapilko, Samuele Kaplun
Title: A Research Graph dataset for connecting research data repositories using RD-Switchboard
Publication year: 2018

The publication "A Research Graph dataset for connecting research data repositories using RD-Switchboard" is connected to {'datasets': 28, 'grants': 10, 'organisations': 131, 'publications': 134, 'researchers': 92}.
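
DOI names are case-insensitive, so an exact string comparison can miss a record whose DOI is stored with different capitalisation. The sketch below shows one way to make the lookup tolerant of that. `find_publication` is a hypothetical helper and the sample list is hand-written, not real API output.

```python
def find_publication(publications, doi):
    """Return the first publication whose DOI matches, ignoring case, or None."""
    wanted = doi.strip().lower()
    return next(
        (p for p in publications
         if str(p.get("doi", "")).strip().lower() == wanted),
        None,
    )


# Made-up entries using the "doi" and "title" fields shown above
sample_publications = [
    {"doi": "10.1000/first", "title": "First example"},
    {"doi": "10.1000/Second", "title": "Second example"},
]

match = find_publication(sample_publications, "10.1000/SECOND")
print(match["title"] if match else "No match")
```

With a real response, the first argument would be `r.json()[0]["nodes"]["publications"]`.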
