<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# OpenAlex - Get lists of sources

**Tags:** #openalex #api #entities #sources #get #lists

**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel)

**Last update:** 2023-07-27 (Created: 2023-07-27)

**Description:** This notebook shows how to get lists of sources from the OpenAlex API.

**References:**
- [OpenAlex API - Get lists of sources](https://docs.openalex.org/api-entities/sources/get-lists-of-sources)
- [OpenAlex API - Source object](https://docs.openalex.org/api-entities/sources/source-object)

## Input

### Import libraries

In [1]:
import requests
import pandas as pd

### Setup variables
- `endpoint`: API endpoint
- `limit`: maximum number of records to return (`-1` returns every record available). Note that OpenAlex allows at most 100,000 API requests per user per day.

In [2]:
endpoint = "sources"
limit = 100
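The `limit` variable counts records, while the daily quota counts requests, so the two are related by the page size. As a rough sketch, assuming the page size of 100 used by `get_data` in the Model section (the helper name `requests_needed` is hypothetical and not part of OpenAlex or this notebook), the number of requests for a given `limit` can be estimated like this:

```python
import math


def requests_needed(limit, per_page=100):
    """Estimate how many page requests are needed to fetch `limit` records."""
    return math.ceil(limit / per_page)


print(requests_needed(100))   # 1
print(requests_needed(2500))  # 25
```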

## Model

### Get lists of sources
This function gets lists of sources from the OpenAlex API, page by page, until `limit` records are collected or no more results are returned.

In [3]:
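def get_data(endpoint, limit=-1):
    # Init
    page = 1
    per_page = 100
    data = []
    url = f"https://api.openalex.org/{endpoint}"

    # Keep the page size constant across requests: changing it between
    # pages would shift the page offsets and return overlapping results
    if limit != -1:
        per_page = min(per_page, limit)

    # Loop on page until the limit is reached (limit=-1 means no limit)
    while limit == -1 or len(data) < limit:
        # Params
        params = {
            "page": page,
            "per_page": per_page,
        }

        # Requests
        res = requests.get(url, params=params)

        # Results: stop on any error or when a page comes back empty
        if res.status_code != 200:
            break
        results = res.json().get("results")
        if not results:
            break
        data.extend(results)
        page += 1

    # Trim the last page so that at most `limit` records are returned
    return data if limit == -1 else data[:limit]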
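The page-by-page loop can be checked without calling the API by swapping the HTTP request for an in-memory stand-in. The sketch below is only an illustration under that assumption: `fake_fetch` and `collect` are hypothetical names, and the fake dataset of 250 records is made up for the example.

```python
def fake_fetch(page, per_page, total=250):
    """Hypothetical stand-in for one API call: return one page of fake records."""
    start = (page - 1) * per_page
    end = min(start + per_page, total)
    return [{"id": i} for i in range(start, end)]


def collect(fetch, limit=-1, per_page=100):
    """Collect up to `limit` records page by page (-1 means no limit)."""
    # Fix the page size once so that page offsets stay consistent
    if limit != -1:
        per_page = min(per_page, limit)
    data, page = [], 1
    while limit == -1 or len(data) < limit:
        results = fetch(page, per_page)
        if not results:
            break
        data.extend(results)
        page += 1
    # Trim the overshoot from the last page
    return data if limit == -1 else data[:limit]


records = collect(fake_fetch, limit=130)
print(len(records))                     # 130
print(len({r["id"] for r in records}))  # 130, so no record appears twice
```

Passing the fetch step in as an argument keeps the paging logic separate from the network call, which makes edge cases such as a `limit` that is not a multiple of the page size easy to try out.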

## Output

### Display result

In [4]:
data = get_data(endpoint, limit=limit)
print("Results fetched:", len(data))
print("Example:")
data[0]

Results fetched: 100
Example:


{'id': 'https://openalex.org/S4306525036',
 'issn_l': None,
 'issn': None,
 'display_name': 'PubMed',
 'host_organization': 'https://openalex.org/I1299303238',
 'host_organization_name': 'National Institutes of Health',
 'host_organization_lineage': ['https://openalex.org/I1299303238'],
 'works_count': 33077501,
 'cited_by_count': 918278952,
 'summary_stats': {'2yr_mean_citedness': 0.0, 'h_index': 2, 'i10_index': 1},
 'is_oa': False,
 'is_in_doaj': False,
 'ids': {'openalex': 'https://openalex.org/S4306525036',
  'wikidata': 'https://www.wikidata.org/entity/Q180686'},
 'homepage_url': 'https://pubmed.ncbi.nlm.nih.gov',
 'apc_prices': None,
 'apc_usd': None,
 'country_code': 'US',
 'societies': [],
 'alternate_titles': [],
 'abbreviated_title': None,
 'type': 'repository',
 'x_concepts': [],
 'counts_by_year': [{'year': 2023,
   'works_count': 0,
   'cited_by_count': 37204499},
  {'year': 2022, 'works_count': 0, 'cited_by_count': 63672165},
  {'year': 2021, 'works_count': 0, 'cited_by_c