[![GitHub Repository](https://img.shields.io/badge/GitHub-Repository-181717?style=for-the-badge&logo=GitHub&link=https://github.com/Mearman/openalex-docs)](https://github.com/Mearman/openalex-docs)[![Open in GitHub](https://img.shields.io/badge/Open%20in-GitHub-181717?style=for-the-badge&logo=github&link=https://github.com/Mearman/openalex-docs/blob/main/api-entities/works/get-lists-of-works.ipynb)](https://github.com/Mearman/openalex-docs/blob/main/api-entities/works/get-lists-of-works.ipynb)[![Open in Colab](https://img.shields.io/badge/Open%20in-Colab-F9AB00?style=for-the-badge&logo=Google%20Colab&link=https://colab.research.google.com/github/Mearman/openalex-docs/blob/main/api-entities/works/get-lists-of-works.ipynb)](https://colab.research.google.com/github/Mearman/openalex-docs/blob/main/api-entities/works/get-lists-of-works.ipynb)

In [None]:
%pip install --upgrade "git+https://github.com/Mearman/openalex-python-pydantic-v1.git"
%pip install pandasai

In [None]:
import json
import pandas as pd
import numpy as np
from openalex_api import Configuration, ApiClient, AutocompleteApi, AuthorsApi, ConceptsApi, FundersApi, InstitutionsApi, PublishersApi, SourcesApi, WorksApi

configuration = Configuration(host="https://api.openalex.org")
autocomplete_api = AutocompleteApi(ApiClient(configuration))
authors_api = AuthorsApi(ApiClient(configuration))
concepts_api = ConceptsApi(ApiClient(configuration))
funders_api = FundersApi(ApiClient(configuration))
institutions_api = InstitutionsApi(ApiClient(configuration))
publishers_api = PublishersApi(ApiClient(configuration))
sources_api = SourcesApi(ApiClient(configuration))
works_api = WorksApi(ApiClient(configuration))

from pandasai import SmartDataframe
from pandasai.llm import OpenAI

In [None]:
# @title  { run: "auto", display-mode: "form" }
openapi_token = "" # @param {type:"string"}

# Get lists of works

You can get lists of works:

* Get _all_ of the works in OpenAlex\
  [https://api.openalex.org/works](https://api.openalex.org/works)

In [None]:
# @title { run: "auto", vertical-output: false }
# https://api.openalex.org/works


response = works_api.get_works()

df = pd.DataFrame(response.results)
display(df)

In [None]:
numeric_df = df[['id', 'display_name'] +
	[col for col in df.columns if df[col].dtype in ['int64', 'float64'] and col != 'relevance_score']]
display(numeric_df)

try:
	llm = OpenAI(api_token = openapi_token)
	sdf = SmartDataframe(numeric_df, config = { "llm": llm })
	sdf.chat("Plot a chart of this data")
except Exception:
	if not openapi_token:
		print("Error: openapi_token not set")
	else:
		print("Error when creating SmartDataframe")

The request to `https://api.openalex.org/works` returns a response like this:

```json
{
    "meta": {
        "count": 245684392,
        "db_response_time_ms": 929,
        "page": 1,
        "per_page": 25
    },
    "results": [
        {
            "id": "https://openalex.org/W1775749144",
            "doi": "https://doi.org/10.1016/s0021-9258(19)52451-6",
            "title": "PROTEIN MEASUREMENT WITH THE FOLIN PHENOL REAGENT",
            // more fields (removed to save space)
        },
        {
            "id": "https://openalex.org/W2100837269",
            "doi": "https://doi.org/10.1038/227680a0",
            "title": "Cleavage of Structural Proteins during the Assembly of the Head of Bacteriophage T4",
            // more fields (removed to save space)
        },
        // more results (removed to save space)
    ],
    "group_by": []
}
```
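
The `meta` block is enough to work out how many pages a result set spans. The sketch below is a minimal illustration using the `count` and `per_page` values from the sample response above; the helper name `total_pages` is hypothetical and not part of the OpenAlex client.

```python
import math

# Values copied from the "meta" block of the sample response above.
meta = {"count": 245684392, "db_response_time_ms": 929, "page": 1, "per_page": 25}


def total_pages(meta: dict) -> int:
    """Number of pages needed to list every result at the current page size.

    Hypothetical helper for illustration only.
    """
    return math.ceil(meta["count"] / meta["per_page"])


print(total_pages(meta))
```

This is plain arithmetic on the metadata; it does not say how far the API will let you page.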

## Page and sort works

You can [page through](./../../how-to-use-the-api/get-lists-of-entities/paging.ipynb) works and change the default number of results returned with the `page` and `per-page` parameters:

* Get a second page of results with 50 results per page\
  [https://api.openalex.org/works?per-page=50\&page=2](https://api.openalex.org/works?per-page=50\&page=2)
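
Note that the URL spells the parameter `per-page` (with a hyphen), while the Python client used in this notebook takes `per_page`. As a minimal sketch, assuming you want to build the raw URL yourself with only the standard library, something like the following should work; `works_page_url` is a hypothetical helper name, not part of any OpenAlex package.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def works_page_url(page: int, per_page: int) -> str:
    """Build a works list URL for one page of results (hypothetical helper)."""
    # The query parameter is spelled "per-page" in the URL, not "per_page".
    query = urlencode({"per-page": per_page, "page": page})
    return f"{BASE_URL}?{query}"


print(works_page_url(page=2, per_page=50))
```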

In [None]:
# @title { run: "auto", vertical-output: false }
# https://api.openalex.org/works?per-page=50&page=2
per_page="50" # @param "50" {type: "string"}
page="2" # @param "2" {type: "string"}

response = works_api.get_works(
	per_page=per_page,
	page=page
)

df = pd.DataFrame(response.results)
display(df)

In [None]:
numeric_df = df[['id', 'display_name'] +
	[col for col in df.columns if df[col].dtype in ['int64', 'float64'] and col != 'relevance_score']]
display(numeric_df)

try:
	llm = OpenAI(api_token = openapi_token)
	sdf = SmartDataframe(numeric_df, config = { "llm": llm })
	sdf.chat("Plot a chart of this data")
except Exception:
	if not openapi_token:
		print("Error: openapi_token not set")
	else:
		print("Error when creating SmartDataframe")

You can [sort results](./../../how-to-use-the-api/get-lists-of-entities/sort-entity-lists.ipynb) with the `sort` parameter:

* Sort works by publication year\
  [https://api.openalex.org/works?sort=publication\_year](https://api.openalex.org/works?sort=publication\_year)

In [None]:
# @title { run: "auto", vertical-output: false }
# https://api.openalex.org/works?sort=publication_year
sort="publication_year" # @param "publication_year" {type: "string"}

response = works_api.get_works(
	sort=sort
)

df = pd.DataFrame(response.results)
display(df)

In [None]:
numeric_df = df[['id', 'display_name'] +
	[col for col in df.columns if df[col].dtype in ['int64', 'float64'] and col != 'relevance_score']]
display(numeric_df)

try:
	llm = OpenAI(api_token = openapi_token)
	sdf = SmartDataframe(numeric_df, config = { "llm": llm })
	sdf.chat("Plot a chart of this data")
except Exception:
	if not openapi_token:
		print("Error: openapi_token not set")
	else:
		print("Error when creating SmartDataframe")

Continue on to learn how you can [filter](./filter-works.ipynb) and [search](./search-works.ipynb) lists of works.

## Sample works

You can use `sample` to get a random batch of works. Read more about sampling and how to add a `seed` value [here](./../../how-to-use-the-api/get-lists-of-entities/sample-entity-lists.ipynb).

* Get 20 random works\
  [https://api.openalex.org/works?sample=20](https://api.openalex.org/works?sample=20)
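
Because a plain `sample` request returns a different batch each time, a `seed` is useful when a result needs to be reproduced. The sketch below builds both forms of the URL with the standard library. It assumes the seed is passed as a `seed` query parameter, as the text above suggests; `sample_works_url` is a hypothetical helper and the seed value `42` is arbitrary.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def sample_works_url(sample: int, seed: Optional[int] = None) -> str:
    """Build a URL requesting a random sample of works (hypothetical helper)."""
    params = {"sample": sample}
    # Assumption: the seed is passed as a "seed" query parameter.
    if seed is not None:
        params["seed"] = seed
    return f"{BASE_URL}?{urlencode(params)}"


print(sample_works_url(20))
print(sample_works_url(20, seed=42))
```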

In [None]:
# @title { run: "auto", vertical-output: false }
# https://api.openalex.org/works?sample=20
sample="20" # @param "20" {type: "string"}

response = works_api.get_works(
	sample=sample
)

df = pd.DataFrame(response.results)
display(df)

In [None]:
numeric_df = df[['id', 'display_name'] +
	[col for col in df.columns if df[col].dtype in ['int64', 'float64'] and col != 'relevance_score']]
display(numeric_df)

try:
	llm = OpenAI(api_token = openapi_token)
	sdf = SmartDataframe(numeric_df, config = { "llm": llm })
	sdf.chat("Plot a chart of this data")
except Exception:
	if not openapi_token:
		print("Error: openapi_token not set")
	else:
		print("Error when creating SmartDataframe")

## Select fields

You can use `select` to limit the fields that are returned in a list of works. More details are [here](./../../how-to-use-the-api/get-lists-of-entities/select-fields.ipynb).

* Display only the `id` and `display_name` within works results\
  [https://api.openalex.org/works?select=id,display\_name](https://api.openalex.org/works?select=id,display\_name)
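
The `select` value is a single comma-separated string, so it can be convenient to keep the field names in a list and join them when building the request. The following is a minimal sketch using only the standard library; `select_works_url` is a hypothetical helper name.

```python
from typing import Sequence
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def select_works_url(fields: Sequence[str]) -> str:
    """Build a works list URL that returns only the given fields (hypothetical helper)."""
    # safe="," keeps the commas literal instead of percent-encoding them as %2C.
    query = urlencode({"select": ",".join(fields)}, safe=",")
    return f"{BASE_URL}?{query}"


print(select_works_url(["id", "display_name"]))
```

The same joined string can be passed as the `select` argument of `works_api.get_works`, which is how the cell below calls it.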

In [None]:
# @title { run: "auto", vertical-output: false }
# https://api.openalex.org/works?select=id,display_name
select="id,display_name" # @param "id,display_name" {type: "string"}

response = works_api.get_works(
	select=select
)

df = pd.DataFrame(response.results)
display(df)

In [None]:
numeric_df = df[['id', 'display_name'] +
	[col for col in df.columns if df[col].dtype in ['int64', 'float64'] and col != 'relevance_score']]
display(numeric_df)

try:
	llm = OpenAI(api_token = openapi_token)
	sdf = SmartDataframe(numeric_df, config = { "llm": llm })
	sdf.chat("Plot a chart of this data")
except Exception:
	if not openapi_token:
		print("Error: openapi_token not set")
	else:
		print("Error when creating SmartDataframe")