### Query OpenAire for publications authored by a person
This notebook queries the [OpenAire API](https://graph.openaire.eu/develop/api.html) via its `/publications` endpoint for publications authored by a person. It takes an ORCID iD as input and uses it to filter for publications in which one of the creators' `orcid` fields matches that ORCID iD. For each publication in the result that has a DOI, we print the DOI and the main title.

In [1]:
# Prerequisites:
import requests                    # dependency for making HTTP calls
from benedict import benedict      # dependency for dealing with json

The input for this notebook is an ORCID iD, e.g. '`0000-0003-2499-7741`'.

In [2]:
# input parameter
example_orcid="0000-0003-2499-7741"

We use it to query the OpenAire API for publications whose metadata contain the ORCID iD in the `orcid` field of one of the creators. Since the API paginates its results, we need to loop through all pages to get the complete result set.

In [3]:
# OpenAire endpoint to query for publications
OPENAIRE_API_PUBLICATIONS = "https://api.openaire.eu/search/publications"

# query all publications that are connected to orcid
def query_openaire_for_person2publications(orcid_id):
    page = 1
    max_page = 1

    while page <= max_page:
        params = {'orcid': orcid_id, 'page': page, 'format': "json"}
        response = requests.get(url=OPENAIRE_API_PUBLICATIONS,
                                params=params)
        response.raise_for_status()
        result=response.json()

        # calculate the max page number once, from the first page's header
        if page == 1:
            max_page = determine_max_page(result)
        page = page + 1
        yield result

# calculate max number of result pages
def determine_max_page(response_data):
    response_dict = benedict.from_json(response_data)
    items_total = response_dict.get('response.header.total.$')
    items_per_page = response_dict.get('response.header.size.$')
    max_page_ceil = items_total // items_per_page + bool(items_total % items_per_page)
    return max_page_ceil


# ---- example execution
# note: the function is a generator, so no request is sent until it is iterated
list_of_pages=query_openaire_for_person2publications(example_orcid)
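
The pagination logic above can be tried out without network access. The following is a minimal sketch, not the notebook's actual code path: `page_count`, `iterate_pages`, `fake_fetch` and the flat keys `total`, `size` and `items` are hypothetical names chosen for illustration and do not mirror the real OpenAire response layout.

```python
def page_count(items_total, items_per_page):
    """Ceiling division: number of pages needed to hold all items."""
    if items_per_page <= 0:
        raise ValueError("items_per_page must be positive")
    return -(-items_total // items_per_page)


def iterate_pages(fetch_page):
    """Yield page 1, then pages 2..N, where N is derived from the first page."""
    page, max_page = 1, 1
    while page <= max_page:
        result = fetch_page(page)
        if page == 1:
            # the header of the first page tells us how many pages exist
            max_page = page_count(result["total"], result["size"])
        yield result
        page += 1


def fake_fetch(page):
    """Hypothetical stand-in for the HTTP call: 25 items, 10 per page."""
    items = list(range(25))
    size = 10
    start = (page - 1) * size
    return {"total": len(items), "size": size, "items": items[start:start + size]}


pages = list(iterate_pages(fake_fetch))
print(len(pages))          # number of pages fetched
print(pages[-1]["items"])  # items on the last, partially filled page
```

Because a generator can be consumed only once, wrapping it in `list(...)` as shown here is one way to keep the pages around if they are needed more than once.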

From the resulting list of publications, we extract and print each DOI and title.

*Note: publications that do not have a DOI assigned will not be printed.*

In [4]:
# from the result pages, extract the data about each publication
def extract_publications_from_page(page):
    return [pub for pub in benedict.from_json(page).get('response.results.result') or []]

# extract the DOI and the main title from a publication
def extract_doi(pub):
    oaf_result_dict=benedict.from_json(pub).get('metadata.oaf:entity.oaf:result')

    # unfortunately the json data is inconsistent:
    # if there is one pid/title for a publication, it is a json object
    # if there are multiple pids/titles for a publication, they form a json list
    pids=oaf_result_dict.get('pid') or []
    if isinstance(pids, list):
        dois=[pid['$'] for pid in pids if pid.get('@classid')=="doi"]
    else:
        dois= [pids['$']] if pids.get('@classid')=="doi" else []
    doi=dois[0] if dois else None # pick the first one
    
    titles=oaf_result_dict.get('title') or []
    if isinstance(titles, list):
        main_titles=[title['$'] for title in titles if title.get('@classid')=="main title"]
    else:
        main_titles=[titles['$']] if titles.get('@classid')=="main title" else []
    title=main_titles[0] if main_titles else None  # pick the first one

    return doi, title


#--- example execution
for page in list_of_pages or []:
    publications=extract_publications_from_page(page)
    for pub in publications:
        doi,title = extract_doi(pub)
        if doi:
            print(f"{doi}, {title}")

10.15488/11463, Roadmap to FAIR Research Information in Open Infrastructures
10.1515/bd.2006.40.4.466, Informationsvermittlung: Personalisiertes Lernen in der Bibliothek: das Düsseldorfer Online-Tutorial (DOT) Informationskompetenz
10.1080/00048623.2006.10755322, Teaching Information Literacy with the Lerninformationssystem
10.3389/frma.2021.694307, Enhancing Knowledge Graph Extraction and Validation From Scholarly Publications Using Bibliographic Metadata
10.3897/rio.7.e66264, OPTIMETA – Strengthening the Open Access publishing system through open citations and spatiotemporal metadata
10.1016/j.procs.2019.01.074, The Research Core Dataset (KDSF) in the Linked Data context
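
The object-versus-list inconsistency handled in `extract_doi` could also be dealt with by normalizing the value first. This is a minimal sketch under that assumption: `as_list` and `first_value` are hypothetical helpers that are not part of the notebook, and the sample `pid` entries below are made up for illustration.

```python
def as_list(value):
    """Map None to [], wrap a single JSON object in a list, pass lists through."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def first_value(entries, classid):
    """Return the '$' text of the first entry with a matching '@classid', else None."""
    for entry in as_list(entries):
        if isinstance(entry, dict) and entry.get("@classid") == classid:
            return entry.get("$")
    return None


# made-up sample data: one pid as an object, several pids as a list
single_pid = {"@classid": "doi", "$": "10.1234/example"}
many_pids = [
    {"@classid": "pmid", "$": "12345"},
    {"@classid": "doi", "$": "10.1234/example"},
]

print(first_value(single_pid, "doi"))
print(first_value(many_pids, "doi"))
print(first_value(None, "doi"))
```

With such helpers, the DOI and title lookups would share one code path instead of two `isinstance` branches each.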
