<img width="8%" alt="OpenAlex.jpg" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/OpenAlex.jpg" style="border-radius: 15%">

# OpenAlex - Get lists of authors
<a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=OpenAlex+-+Get+lists+of+authors:+Error+short+description">Bug report</a>

**Tags:** #openalex #api #entities #authors #get #lists

**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel)

**Last update:** 2023-07-27 (Created: 2023-07-27)

**Description:** This notebook shows how to get lists of authors from the OpenAlex API.

**References:**
- [OpenAlex API - Get lists of authors](https://docs.openalex.org/api-entities/authors/get-lists-of-authors)
- [OpenAlex API - Author object](https://docs.openalex.org/api-entities/authors/author-object)

## Input

### Import libraries

In [None]:
import requests
import pandas as pd

### Setup variables
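- `endpoint`: name of the OpenAlex API endpoint to query (here, `authors`)
- `limit`: maximum number of records to return; use `-1` to keep paging until the API returns no more results. OpenAlex allows up to 100,000 API requests per user per day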

In [None]:
endpoint = "authors"
limit = 100

## Model

### Get lists of authors
This function pages through the given OpenAlex API endpoint and returns the collected records as a list of dictionaries.

In [None]:
def get_data(endpoint, limit=-1):
    # Init
    page = 1
    per_page = 100
    data = []
    url = f"https://api.openalex.org/{endpoint}"

    # Loop on pages until the limit is reached (limit=-1 means no limit)
    while limit == -1 or len(data) < limit:
        # Params: per_page stays constant, because changing it between
        # pages shifts the page offset and returns duplicate records
        params = {
            "page": page,
            "per_page": per_page,
        }

        # Requests
        res = requests.get(url, params=params)

        # Stop on any non-200 response
        if res.status_code != 200:
            break

        # Results: stop when the API returns an empty page
        results = res.json().get("results") or []
        if len(results) == 0:
            break
        data.extend(results)
        page += 1

    # Trim the last page down to the requested limit
    return data if limit == -1 else data[:limit]
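
The paging logic can be checked offline, without calling the API. The sketch below is only an illustration: `fake_page` and `collect` are hypothetical helpers (not part of OpenAlex or `requests`), and it assumes page-based paging where the offset is `(page - 1) * per_page`. Under that assumption, keeping `per_page` fixed and trimming the final list returns exactly `limit` distinct records.

```python
def fake_page(records, page, per_page):
    """Mimic page-based paging (assumed offset: (page - 1) * per_page)."""
    start = (page - 1) * per_page
    return records[start:start + per_page]


def collect(records, limit=-1, per_page=100):
    """Collect up to `limit` records while keeping per_page constant."""
    page, data = 1, []
    while limit == -1 or len(data) < limit:
        results = fake_page(records, page, per_page)
        if not results:
            # An empty page means there is nothing left to fetch
            break
        data.extend(results)
        page += 1
    # Trim the last page down to the requested limit
    return data if limit == -1 else data[:limit]


records = list(range(1000))  # stand-in for API records
subset = collect(records, limit=150)
print(len(subset), len(set(subset)))
```

Shrinking `per_page` on the last request instead (for example, page 2 with `per_page=50`) would, under the same offset assumption, re-read records 51 to 100 rather than fetch records 101 to 150.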

## Output

### Display result

In [None]:
data = get_data(endpoint, limit=limit)
print("Results fetched:", len(data))
print("Example:")
# Show the first record, or None if nothing was returned
data[0] if data else None
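
Since `pandas` is imported in this notebook, one possible next step is to load the records into a DataFrame. The sketch below is a minimal example under stated assumptions: `sample` is made-up data shaped like a small subset of the OpenAlex author object (the IDs, names, and numbers are not real records), and `pd.json_normalize` is used to flatten nested fields into dotted column names.

```python
import pandas as pd

# Hypothetical records: made-up values, shaped like a few author fields
sample = [
    {
        "id": "https://openalex.org/A0000000001",
        "display_name": "Jane Doe",
        "works_count": 12,
        "cited_by_count": 340,
        "summary_stats": {"h_index": 5},
    },
    {
        "id": "https://openalex.org/A0000000002",
        "display_name": "John Smith",
        "works_count": 3,
        "cited_by_count": 18,
        "summary_stats": {"h_index": 2},
    },
]

# Flatten nested dictionaries, e.g. summary_stats -> summary_stats.h_index
df = pd.json_normalize(sample)
print(df[["display_name", "works_count", "summary_stats.h_index"]])
```

With real results, `sample` would be replaced by the list returned from `get_data`.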