### Query the FREYA PID Graph with pagination for all people affiliated with an organization

This notebook queries DataCite's GraphQL API for the FREYA PID Graph to retrieve all people affiliated with an organization. It takes a ROR URL as input and outputs the ORCID iDs of all people affiliated with that organization.

In [1]:
# needed dependency to make HTTP calls
import requests

The input for the computation is a ROR ID or ROR URL.

In [2]:
# input parameter for all further computations
example_ror="https://ror.org/021k10z87"
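Since the input may be given either as a bare ROR ID or as a full ROR URL, it can help to normalize it to a single form before querying. The following is a minimal sketch, not part of the original notebook: `normalize_ror` and `ROR_PATTERN` are hypothetical names, and the pattern assumes a ROR ID consists of a leading `0`, six lowercase base32-style characters (no `i`, `l`, `o`, `u`) and two check digits.

```python
import re

# Assumption: ROR IDs look like "0" + 6 base32-style characters + 2 digits.
ROR_PATTERN = re.compile(r"^0[0-9a-hj-km-np-tv-z]{6}[0-9]{2}$")


def normalize_ror(value):
    """Return the https://ror.org/... URL for a bare ROR ID or a ROR URL."""
    candidate = value.strip().lower()
    # Drop an optional scheme and the ror.org host so only the ID remains.
    candidate = re.sub(r"^(https?://)?ror\.org/", "", candidate)
    if not ROR_PATTERN.match(candidate):
        raise ValueError(f"not a valid ROR identifier: {value!r}")
    return f"https://ror.org/{candidate}"


print(normalize_ror("021k10z87"))
print(normalize_ror("https://ror.org/021k10z87"))
```

Both calls print the same URL, so the rest of the notebook can rely on one input form.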

We use it to query DataCite's GraphQL API for the organization and all people connected to it. The query requests up to 1000 people per page and follows the returned cursor until no further page exists.

In [3]:
# DataCite's GraphQL endpoint for the FREYA PID Graph
DATACITE_GRAPHQL_API = "https://api.datacite.org/graphql"
# Query to retrieve an organization and all its affiliated people
QUERY_ORGA2PEOPLE = """query organization($ror: ID!, $after: String) {
  organization(id: $ror) {
    people(first: 1000, after: $after) {
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
      nodes {
        id
        name
        givenName
      }
    }
  }
}"""

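# query all people that are connected to the given ROR, one page per iteration
def download_data(ror):
    continue_paginating = True
    cursor = ""
    while continue_paginating:
        # named 'variables' to avoid shadowing the built-in vars()
        variables = {'ror': ror, 'after': cursor}
        response = requests.post(url=DATACITE_GRAPHQL_API,
                                 json={'query': QUERY_ORGA2PEOPLE, 'variables': variables},
                                 headers={'Content-Type': 'application/json'},
                                 timeout=60)  # arbitrary limit so a stalled request cannot hang forever
        # fail early on HTTP errors instead of parsing an error body
        response.raise_for_status()
        result = response.json()

        # check if next page exists and set cursor to next page
        continue_paginating = has_next_page(result)
        cursor = next_cursor(result)
        yield result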

# check if there is another page with results to query
def has_next_page(response_data):
    return response_data['data']['organization']['people']['pageInfo']['hasNextPage']

# return the cursor that points to the next page
def next_cursor(response_data):
    return response_data['data']['organization']['people']['pageInfo']['endCursor']


# example execution
list_of_pages=download_data(example_ror)
print(f"Found {len(list(list_of_pages))} page(s) for ROR {example_ror}")


Found 1 page(s) for ROR https://ror.org/021k10z87
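Because the example organization fits on a single page, the run above never exercises the cursor hand-over. The sketch below replays the same loop against two hypothetical in-memory pages instead of the live API; `FAKE_PAGES`, `fetch_page` and `paginate` are illustrative names, and the cursor value `"c1"` is made up.

```python
# Hypothetical pages keyed by the cursor that requests them; the shape
# mirrors the 'people' object returned by the query above.
FAKE_PAGES = {
    "": {"nodes": [{"id": "A"}, {"id": "B"}],
         "pageInfo": {"endCursor": "c1", "hasNextPage": True}},
    "c1": {"nodes": [{"id": "C"}],
           "pageInfo": {"endCursor": None, "hasNextPage": False}},
}


def fetch_page(cursor):
    """Stand-in for the HTTP call: wrap a stored page in the response envelope."""
    return {"data": {"organization": {"people": FAKE_PAGES[cursor]}}}


def paginate(fetch):
    """Yield pages until pageInfo reports that no further page exists."""
    cursor = ""
    while True:
        result = fetch(cursor)
        yield result
        page_info = result["data"]["organization"]["people"]["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        cursor = page_info["endCursor"]


pages = list(paginate(fetch_page))
ids = [node["id"]
       for page in pages
       for node in page["data"]["organization"]["people"]["nodes"]]
print(len(pages), ids)
```

Passing the fetch function as a parameter keeps the loop testable without network access; swapping in a function that performs the real request would give the same control flow as `download_data`.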


From the result we extract each person's metadata and output their respective ORCID and name:

In [4]:
# extract all ORCIDs from the result
def extract_orcids(data):
    for person in data['data']['organization']['people']['nodes']:
        orcid = person['id'].replace("https://orcid.org/", "")
        name = person['name']
        yield orcid, name


# example execution
list_of_pages=download_data(example_ror)
for page in list_of_pages:
    for orcid,name in extract_orcids(page):
        print(f"{orcid}, {name}")

0000-0002-3783-6130, Irene Weipert-Fenner
0000-0002-5452-0488, Hans-Joachim Spanger
0000-0001-6746-1248, Anton Peez
0000-0001-6731-5304, Julia Eckert
0000-0003-1575-9688, Hendrik Simon
0000-0002-1712-2624, Julian Junk
0000-0003-0035-5840, Raphael Oidtmann
0000-0002-8739-2486, Elvira Rosert
0000-0002-5925-043X, Ariadne Natal
0000-0002-7012-6739, Peter Kreuzer
0000-0001-7843-4480, Dirk Peters
0000-0003-0039-9827, Eldad Ben Aharon
0000-0001-6823-6819, Janna Lisa Chalmovsky
0000-0002-4259-6071, Felix S. Bethke
0000-0001-7286-3575, Paul Chambers
0000-0003-1940-8877, Mikhail Polianskii
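For further processing, the printed pairs can instead be collected into CSV text. The following is a rough sketch under a few assumptions: `people_to_csv` is a hypothetical helper, the sample page reuses two entries from the output above, and skipping repeated ORCID iDs is a precaution rather than documented API behavior.

```python
import csv
import io


def people_to_csv(pages):
    """Collect ORCID/name pairs from all pages into one CSV string."""
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["orcid", "name"])
    for page in pages:
        for person in page["data"]["organization"]["people"]["nodes"]:
            orcid = person["id"].replace("https://orcid.org/", "")
            if orcid in seen:
                continue  # skip a person already written
            seen.add(orcid)
            writer.writerow([orcid, person["name"]])
    return buffer.getvalue()


# Sample page in the response shape, with one deliberate duplicate.
sample_page = {"data": {"organization": {"people": {"nodes": [
    {"id": "https://orcid.org/0000-0001-6746-1248", "name": "Anton Peez"},
    {"id": "https://orcid.org/0000-0002-1712-2624", "name": "Julian Junk"},
    {"id": "https://orcid.org/0000-0001-6746-1248", "name": "Anton Peez"},
]}}}}

csv_text = people_to_csv([sample_page])
print(csv_text)
```

In the notebook itself, `people_to_csv(download_data(example_ror))` would apply the same helper to the live pages.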
