<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# OpenAlex - Get lists of publishers
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/OpenAlex/OpenAlex_Get_lists_of_publishers.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=OpenAlex+-+Get+lists+of+publishers:+Error+short+description">Bug report</a>

**Tags:** #openalex #api #entities #publishers #get #lists

**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel)

**Last update:** 2023-07-27 (Created: 2023-07-27)

**Description:** This notebook shows how to get lists of publishers from the OpenAlex API.

**References:**
- [OpenAlex API - Get lists of publishers](https://docs.openalex.org/api-entities/publishers/get-lists-of-publishers)
- [OpenAlex API - Publisher object](https://docs.openalex.org/api-entities/publishers/publisher-object)

## Input

### Import libraries

In [1]:
import requests
import pandas as pd

### Setup variables
- `endpoint`: API endpoint
- `limit`: maximum number of records to return. Use `-1` to keep fetching until the API returns no more results. The daily limit for API calls is 100,000 requests per user per day.

In [2]:
endpoint = "publishers"
limit = 100
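
As a rough guide to how `limit` relates to the daily request quota, the number of paged requests is the limit divided by the page size, rounded up. The sketch below assumes a page size of 100, the value used in the Model section of this notebook. `requests_needed` is a hypothetical helper name, not part of the notebook or of the OpenAlex API.

```python
import math


def requests_needed(limit, per_page=100):
    """Estimate how many paged requests are needed to fetch `limit` records."""
    if limit <= 0:
        return 0
    # One request per full or partial page
    return math.ceil(limit / per_page)


print(requests_needed(100))  # 1
print(requests_needed(250))  # 3
```

With the default `limit = 100`, a single request should be enough.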

## Model

### Get lists of publishers
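This function requests publishers from the OpenAlex API page by page and returns them as a list of dictionaries. It stops when `limit` is reached, when a page comes back empty, or when a request does not return status code 200.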

In [3]:
def get_data(endpoint, limit=-1):
    # Init
    page = 1
    per_page = 100
    data = []
    url = f"https://api.openalex.org/{endpoint}"

    # Loop on pages until the limit is reached (limit=-1 means no limit)
    while limit == -1 or len(data) < limit:
        # Params: per_page stays constant so that page offsets do not overlap
        params = {
            "page": page,
            "per_page": per_page,
        }

        # Requests
        res = requests.get(url, params=params)

        # Results: stop on an HTTP error or an empty page
        if res.status_code != 200:
            break
        results = res.json().get("results") or []
        if len(results) == 0:
            break
        data.extend(results)
        page += 1

    # Trim the last page so that at most `limit` records are returned
    return data if limit == -1 else data[:limit]
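
The paging logic can be exercised offline by swapping the HTTP call for a stub. The sketch below is not part of the notebook: `paginate` and `fake_fetch` are hypothetical names, and the stub slices an in-memory list the way a `page`/`per_page` API is assumed to behave. It keeps the page size fixed and trims the final result to `limit`, which avoids overlapping pages when the limit is not a multiple of the page size.

```python
def fake_fetch(page, per_page, total=230):
    """Stand-in for an HTTP call: return one page of an in-memory dataset."""
    records = [{"id": i} for i in range(total)]
    start = (page - 1) * per_page
    return records[start:start + per_page]


def paginate(fetch_page, limit=-1, per_page=100):
    """Collect records page by page until `limit` is reached or a page is empty."""
    data = []
    page = 1
    while limit == -1 or len(data) < limit:
        results = fetch_page(page, per_page)
        if not results:
            break
        data.extend(results)
        page += 1
    # Trim the overshoot from the last page
    return data if limit == -1 else data[:limit]


rows = paginate(fake_fetch, limit=150)
print(len(rows))                     # 150
print(len({r["id"] for r in rows}))  # 150, so no duplicates
print(len(paginate(fake_fetch)))     # 230
```

Passing the fetch function as an argument is only a testing convenience here. In the notebook the request is made directly with `requests.get`.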

## Output

### Display result

In [4]:
data = get_data(endpoint, limit=limit)
print("Results fetched:", len(data))
print("Example:")
data[0]
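
Each result is a nested dictionary, so it can help to keep only a few top-level fields before further analysis. The sketch below is an illustration rather than part of the notebook: `select_fields` is a hypothetical helper, the sample record is made up, and the field names (`id`, `display_name`, `works_count`, `cited_by_count`) are assumed from the Publisher object reference linked at the top.

```python
def select_fields(records, fields):
    """Return one flat dict per record, keeping only the requested keys."""
    # Missing keys become None instead of raising KeyError
    return [{field: record.get(field) for field in fields} for record in records]


# Made-up record for illustration only; real records come from get_data()
sample = [
    {
        "id": "https://openalex.org/P0000000000",
        "display_name": "Example Publisher",
        "works_count": 10,
        "cited_by_count": 25,
        "country_codes": ["US"],
    }
]

rows = select_fields(sample, ["id", "display_name", "works_count", "cited_by_count"])
print(rows[0]["display_name"])  # Example Publisher
```

Since the notebook already imports pandas, `pd.DataFrame(rows)` would turn these rows into a table.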