## Search for Experts from a list of ORCID IDs

This notebook queries the [OpenAlex API](https://docs.openalex.org/api) via its `/works` endpoint for works authored by a person. It then uses the metadata in the concepts field (the research fields assigned to each work) to calculate a score for each concept. These scores are used to print a list of experts for the given research fields, together with their overall scores.
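
The scoring idea can be illustrated on its own. The following is a minimal sketch, assuming that a person's score for a concept is the mean of that concept's scores over all of the person's works that carry it. The helper name `mean_concept_scores` and the sample records are made up for illustration and are not part of the notebook.

```python
from collections import defaultdict

def mean_concept_scores(works):
    """Average each concept's score over the works that carry that concept."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for work in works:
        # A work may have no concepts at all, hence the "or []"
        for concept in work.get("concepts") or []:
            name = concept["display_name"]
            totals[name] += float(concept["score"])
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}

# Made-up sample data, shaped like the 'concepts' field of an OpenAlex work
sample_works = [
    {"concepts": [{"display_name": "Computer science", "score": 0.5},
                  {"display_name": "Physics", "score": 0.25}]},
    {"concepts": [{"display_name": "Computer science", "score": 0.75}]},
]
scores = mean_concept_scores(sample_works)
print(scores)
```

Here "Computer science" appears in two works (0.5 and 0.75) and "Physics" in one, so the averages are taken over two values and one value respectively.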

In [1]:
# Prerequisites:
import requests         # dependency to make HTTP calls

#### Subject List

This list contains all 19 root concepts (level 0) from OpenAlex. These are the broadest concepts; OpenAlex contains over 60000 different concepts, which can be found here:
The search works with any concept and can search multiple concepts at the same time.

#### Orthography

If you want to use concepts other than the level 0 concepts, you have to use their display names. Each concept has to be given in the form 'concept'; multiple concepts have to be separated by commas.

In [2]:
# list of root concepts
root_concepts=['Political science',
'Philosophy',
'Economics',
'Business',
'Psychology',
'Mathematics',
'Medicine',
'Biology',
'Computer science',
'Geology',
'Chemistry',
'Art',
'Sociology',
'Engineering',
'Geography',
'History',
'Materials science',
'Physics',
'Environmental science']

#### Input parameters

There are three input parameters: a list of ORCID IDs, a list of subjects, and a confidence score.

In [3]:
# Search Subjects
Concept_search=['Computer science','Physics']

Note: ORCID IDs have to be in the form "0000-0001-5380-4449" and have to be separated by commas.
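
The required form can be checked before any request is sent. The following is a minimal sketch, assuming the usual ORCID layout of four hyphen-separated groups of four characters, where the last character is a check digit computed with ISO 7064 MOD 11-2. The helper names `orcid_check_digit` and `is_valid_orcid` are hypothetical and not part of the notebook.

```python
import re

# Four groups of four digits; the final character may be 'X'
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")

def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_valid_orcid(orcid):
    """Return True if the string has the ORCID form and a matching check digit."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]

print(is_valid_orcid("0000-0001-5380-4449"))
print(is_valid_orcid("0000000153804449"))
```

A check like this could be used to filter `list_of_ids` and report malformed entries instead of sending them to the API.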

In [4]:
# Orcid list
list_of_ids=["0000-0002-3416-2652",
"0000-0001-6604-6253",
"0000-0003-4331-8695",
"0000-0003-4939-1666",
"0000-0002-5861-8896",]

Note: The score has to be between 0 and 1. A higher value sets a higher threshold for counting as an expert.
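
To make the effect of the threshold concrete, here is a minimal sketch, assuming the per-person concept scores are already available as a dictionary. The function `select_experts`, the keys `person-A` and `person-B`, and the scores are all made up for illustration; real input would use ORCID IDs as keys.

```python
def select_experts(scores_by_person, subjects, threshold):
    """Return (person, subject, score) tuples whose score exceeds the threshold."""
    experts = []
    for person, scores in scores_by_person.items():
        for subject in subjects:
            score = scores.get(subject)
            # Strict comparison: a score equal to the threshold does not count
            if score is not None and score > threshold:
                experts.append((person, subject, score))
    return experts

# Made-up scores keyed by placeholder identifiers
sample_scores = {
    "person-A": {"Computer science": 0.62, "Physics": 0.31},
    "person-B": {"Physics": 0.40},
}
experts = select_experts(sample_scores, ["Computer science", "Physics"], 0.4)
print(experts)
```

With a threshold of 0.4, only `person-A` qualifies, and only for "Computer science"; a score of exactly 0.40 is not above the threshold.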

In [5]:
# Confidence scores
confidence_score=0.4

We use each ORCID ID to query the OpenAlex API for works that list this ORCID in their metadata field `authorships.author.orcid`.
Since the API uses [pagination](https://docs.openalex.org/api/get-lists-of-entities#pagination), we need to loop through all pages to get the complete result set.
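
The paging loop can be tried without network access. The following is a minimal sketch, assuming each response carries `meta.count` and `meta.per_page` as used in this notebook. `fetch_page` is a hypothetical callable standing in for the HTTP request, and `fake_fetch` serves made-up items.

```python
def iter_result_pages(fetch_page):
    """Yield each result page in turn.

    fetch_page(page_number) is a hypothetical callable that returns the
    parsed JSON of one page, including meta.count and meta.per_page.
    """
    page = 1
    while True:
        data = fetch_page(page)
        yield data
        meta = data["meta"]
        # Ceiling division: number of pages needed for meta.count items
        total_pages = -(-meta["count"] // meta["per_page"])
        if page >= total_pages:
            break
        page += 1

# Stand-in for the HTTP request: five made-up items, served two per page
def fake_fetch(page, items=("a", "b", "c", "d", "e"), per_page=2):
    start = (page - 1) * per_page
    return {"meta": {"count": len(items), "per_page": per_page},
            "results": list(items[start:start + per_page])}

pages = list(iter_result_pages(fake_fetch))
print(len(pages))
```

Five items at two per page need three requests; the last page holds the single remaining item.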

In [6]:
# OpenAlex endpoint to query for works
OPENALEX_API_WORKS = "https://api.openalex.org/works"

# query all works that are connected to orcid
def query_openalex_for_person2works(orcid):
    page = 1
    max_page = 1
    
    while page <= max_page:
        params = {'filter': 'authorships.author.orcid:'+orcid, 'page': page}
        response = requests.get(url=OPENALEX_API_WORKS,
                                params=params,
                                headers= {'Accept': 'application/json'})
        response.raise_for_status()
        result=response.json()

        # calculate max page number in first loop
        if max_page == 1:
            max_page = determine_max_page(result)
        page = page + 1
        yield result

# calculate max number of result pages
def determine_max_page(response_data):
    item_count = response_data['meta']['count']
    items_per_page = response_data['meta']['per_page']
    max_page_ceil = item_count // items_per_page + bool(item_count % items_per_page)
    return max_page_ceil

From the resulting list of works we extract the DOI, the title, and the concepts of each work.

*Note: works that do not have a DOI assigned are skipped.*
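
As an illustration of the extraction step, the following minimal sketch runs on a single hand-written record. It assumes the fields used in this notebook (`ids.doi`, `display_name`, and `concepts` with `display_name`, `level`, and `score`). The function `summarize_work`, the DOI, and the title are made up and do not refer to a real work.

```python
def summarize_work(work):
    """Return (doi, title, [(concept name, level, score), ...]) for one work."""
    doi_url = (work.get("ids") or {}).get("doi") or ""
    # Strip the resolver prefix so that only the bare DOI remains
    doi = doi_url.replace("https://doi.org/", "")
    title = work.get("display_name") or ""
    concepts = [(c["display_name"], c["level"], c["score"])
                for c in work.get("concepts") or []]
    return doi, title, concepts

# Made-up record shaped like an OpenAlex work
sample_work = {
    "ids": {"doi": "https://doi.org/10.1234/example"},
    "display_name": "An example title",
    "concepts": [{"display_name": "Computer science", "level": 0, "score": 0.7}],
}
print(summarize_work(sample_work))
```

A record without a DOI yields an empty string in the first position, which is how such works can be recognized and skipped.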

In [7]:
# from the result pages we get from the OpenAlex API, extract the data about works
def extract_works_from_page(page):
    return [work for work in page.get('results') or []]

# extract DOI from work
def extract_doi(work):
    doi=work.get('ids', {}).get('doi') or ""
    doi_id=doi.replace("https://doi.org/", "") if doi else doi
    title=work.get('display_name', "")
    concept=work.get('concepts')
    return doi_id, title, concept

def main_search(orcid):
    global Error_count
    # Query for DOI list
    result_doi=[]
    count_doi=0
    list_of_pages=query_openalex_for_person2works(orcid)
    for page in list_of_pages or []:
        works=extract_works_from_page(page)
        for work in works or []:
            doi,title,concept=extract_doi(work)
            if doi:
                add=[]
                add.append(orcid)
                add.append(doi)
                add.append(title)
                add_concept=[]
                for item in concept or []:  # 'concepts' may be missing (None)
                    all_concepts=[item['display_name'],'Level:'+str(item['level']),item['score']]
                    add_concept.append(all_concepts)
                add.append(add_concept)
                result_doi.append(add)
    # Start of the expert search
    dict_gesamt={}
    dict_gesamt.update({'ID':orcid})
    dict_gesamt.update({'Count DOI:':count_doi})
    add=[]
    dedub_add=[]
    # Build a list of all concepts connected to the respective ORCID
    for item in result_doi:
        if orcid in item:
            count_doi=count_doi+1
            for item2 in item[3]:
                new=item2[0]
                add.append(new) 
        dict_gesamt.update({'Count DOI:':count_doi})
    # Deduplication
    for item in add:
        if item not in dedub_add:
            dedub_add.append(item)
    # Score for each concept
    for single_concept in dedub_add:
        score_concept=0
        concept_count=0
        for item in result_doi:
            for item2 in item[3]:
                if single_concept == item2[0]:  # exact match; 'in' would also match substrings such as 'Art' in 'Artificial intelligence'
                    score_concept=score_concept+float(item2[2])
                    concept_count=concept_count+1
            if concept_count>0:
                final_score=score_concept/concept_count
                dict_gesamt.update({single_concept:final_score}) 
    # error search
    dict_error=dict_gesamt.copy()
    del dict_error['ID']
    del dict_error['Count DOI:']
    error_check=dict_error.values()
    for item in error_check:
        if item >1:
            Error_count=Error_count+1
            print('############Error#############')
    # Expert search 
    check=0
    expert=['Orcid:', dict_gesamt['ID']]
    for item in Concept_search:
        if item in dict_gesamt.keys() and dict_gesamt[item]>confidence_score:
            check=1
            add=['Subject:'+item,'Score:', dict_gesamt[item]]
            expert.append(add)
    if check ==1:
        list_experts.append(expert)

In [8]:
# main program:
# module-level names are already global, so no 'global' statement is needed here
Error_count=0
list_experts=[]
for item in list_of_ids:
    main_search(item) 

In [9]:
# Results
print('Error Count:',Error_count)
print('Count of Experts:', len(list_experts))
if len(list_experts) ==0:
    print('no experts found')
for exp in list_experts:
    print (exp)

Error Count: 0
Count of Experts: 2
['Orcid:', '0000-0002-3416-2652', ['Subject:Computer science', 'Score:', 0.5338947615263159]]
['Orcid:', '0000-0002-5861-8896', ['Subject:Computer science', 'Score:', 0.6944728807438019]]
