[![GitHub Repository](https://img.shields.io/badge/GitHub-Repository-181717?style=for-the-badge&logo=GitHub&link=https://github.com/Mearman/openalex-docs)](https://github.com/Mearman/openalex-docs)[![Open in GitHub](https://img.shields.io/badge/Open%20in-GitHub-181717?style=for-the-badge&logo=github&link=https://github.com/Mearman/openalex-docs/blob/main/api-entities/works/group-works.ipynb)](https://github.com/Mearman/openalex-docs/blob/main/api-entities/works/group-works.ipynb)[![Open in Colab](https://img.shields.io/badge/Open%20in-Colab-F9AB00?style=for-the-badge&logo=Google%20Colab&link=https://colab.research.google.com/github/Mearman/openalex-docs/blob/main/api-entities/works/group-works.ipynb)](https://colab.research.google.com/github/Mearman/openalex-docs/blob/main/api-entities/works/group-works.ipynb)

In [None]:
%pip install --upgrade "git+https://github.com/Mearman/openalex-python-pydantic-v1.git"
%pip install pandasai

In [None]:
import json
import pandas as pd
import numpy as np
from openalex_api import (
	ApiClient,
	AuthorsApi,
	ConceptsApi,
	Configuration,
	FundersApi,
	InstitutionsApi,
	PublishersApi,
	SourcesApi,
	WorksApi,
)

configuration = Configuration(host="https://api.openalex.org")
authors_api = AuthorsApi(ApiClient(configuration))
concepts_api = ConceptsApi(ApiClient(configuration))
funders_api = FundersApi(ApiClient(configuration))
institutions_api = InstitutionsApi(ApiClient(configuration))
publishers_api = PublishersApi(ApiClient(configuration))
sources_api = SourcesApi(ApiClient(configuration))
works_api = WorksApi(ApiClient(configuration))

In [None]:
from pandasai import SmartDataframe
from pandasai.llm import OpenAI

In [None]:
openapi_token = "" # @param {type:"string"}

# Group works

You can group works with the `group_by` parameter:

* Get counts of works by Open Access status:\
  [`https://api.openalex.org/works?group_by=oa_status`](https://api.openalex.org/works?group\_by=oa\_status)

In [None]:
response = works_api.get_works(
	group_by="oa_status"
)

display(pd.DataFrame(response.results))

In [None]:
# Fail early when the token is undefined or empty, without printing the secret.
if not globals().get("openapi_token"):
	raise ValueError("Please provide an openapi_token")

df = pd.DataFrame(response.results)
numeric_df = df.select_dtypes(include=[np.number])
display(numeric_df)

llm = OpenAI(api_token=openapi_token)
sdf = SmartDataframe(numeric_df, config={"llm": llm})
sdf.chat("Plot a chart of this data")

Or you can group using one of the attributes below.

{% hint style="info" %}
It's best to [read about group by](./../../how-to-use-the-api/get-groups-of-entities.ipynb) before trying these out. That page shows how results are formatted, how many results are returned, and how to sort them.
{% endhint %}
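
As a rough illustration of working with a grouped response, the sketch below parses a hand-written payload into per-group counts and shares. The field names `group_by`, `key`, `key_display_name`, and `count` are assumed from the OpenAlex group-by response format; the helper names `group_counts` and `group_shares` are hypothetical, and the counts are placeholders rather than real API output.

```python
import json

# Hand-written sample payload. The counts are placeholders, not real API output,
# and the field names are assumed from the OpenAlex group-by response format.
SAMPLE_PAYLOAD = json.loads("""
{
  "meta": {"count": 6, "groups_count": 3},
  "group_by": [
    {"key": "closed", "key_display_name": "closed", "count": 3},
    {"key": "gold", "key_display_name": "gold", "count": 2},
    {"key": "green", "key_display_name": "green", "count": 1}
  ]
}
""")


def group_counts(payload):
    """Return {key: count} for each group in a group_by response."""
    return {group["key"]: group["count"] for group in payload.get("group_by", [])}


def group_shares(payload):
    """Return each group's share of the summed group counts."""
    counts = group_counts(payload)
    total = sum(counts.values())
    return {key: (count / total if total else 0.0) for key, count in counts.items()}


counts = group_counts(SAMPLE_PAYLOAD)
shares = group_shares(SAMPLE_PAYLOAD)
print(counts)
print(shares)
```

The same two helpers should work on a real response body once it has been decoded from JSON, provided the payload uses the field names assumed here.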

### `/works` group\_by attributes

{% hint style="danger" %}
The `host_venue` and `alternate_host_venues` properties have been deprecated in favor of [`primary_location`](./work-object/README.md#primary\_location) and [`locations`](./work-object/README.md#locations). The attributes `host_venue` and `alternate_host_venues` are no longer available in the Work object, and trying to access them in filters or group-bys will return an error.
{% endhint %}

* [`authors_count`](./filter-works.md#authors\_count)
* [`authorships.author.id`](./work-object/README.md#author) (alias `author.id`)
* [`authorships.author.orcid`](./work-object/README.md#author) (alias `author.orcid`)
* [`authorships.countries`](./work-object/authorship-object.md#countries)
* [`authorships.institutions.country_code`](./work-object/README.md#institutions) (alias `institutions.country_code`)
* [`authorships.institutions.continent`](./filter-works.md#authorships.institutions.continent-alias-institutions.continent) (alias `institutions.continent`)
* [`authorships.institutions.is_global_south`](./filter-works.md#authorships.institutions.is\_global\_south-alias-institutions.is\_global\_south) (alias `institutions.is_global_south`)
* [`authorships.institutions.id`](./work-object/README.md#institutions) (alias `institutions.id`)
* [`authorships.institutions.lineage`](./work-object/authorship-object.md#institutions)
* [`authorships.institutions.ror`](./work-object/README.md#institutions) (alias `institutions.ror`)
* [`authorships.institutions.type`](./work-object/README.md#institutions) (alias `institutions.type`)
* [`authorships.is_corresponding`](./work-object/authorship-object.md#is\_corresponding) (alias `is_corresponding`): marks whether we have corresponding-author information for a given work
* [`apc_list.value`](./work-object/README.md#apc\_list)
* [`apc_list.currency`](./work-object/README.md#apc\_list)
* [`apc_list.provenance`](./work-object/README.md#apc\_list)
* [`apc_list.value_usd`](./work-object/README.md#apc\_list)
* [`apc_paid.value`](./work-object/README.md#apc\_paid)
* [`apc_paid.currency`](./work-object/README.md#apc\_paid)
* [`apc_paid.provenance`](./work-object/README.md#apc\_paid)
* [`apc_paid.value_usd`](./work-object/README.md#apc\_paid)
* [`best_oa_location.is_accepted`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.is_published`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.license`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.source.host_organization`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.source.id`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.source.is_in_doaj`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.source.issn`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.source.type`](./work-object/README.md#best\_oa\_location)
* [`best_oa_location.version`](./work-object/README.md#best\_oa\_location)
* [`best_open_version`](./filter-works.md#best\_open\_version)
* [`cited_by_count`](./work-object/README.md#cited\_by\_count)
* [`cites`](./filter-works.md#cites)
* [`concepts_count`](./filter-works.md#concepts\_count)
* [`concepts.id`](./work-object/README.md#concepts)
* [`concepts.wikidata`](./work-object/README.md#concepts)
* [`corresponding_author_ids`](./work-object/README.md#corresponding\_author\_ids)
* [`corresponding_institution_ids`](./work-object/README.md#corresponding\_institution\_ids)
* [`countries_distinct_count`](./work-object/README.md#countries_distinct_count)
* [`fulltext_origin`](./work-object/README.md#fulltext_origin)
* [`grants.award_id`](./work-object/README.md#grants)
* [`grants.funder`](./work-object/README.md#grants)
* [`has_abstract`](./filter-works.md#has\_abstract)
* [`has_doi`](./filter-works.md#has\_doi)
* [`has_fulltext`](./work-object/README.md#has_fulltext)
* [`has_orcid`](./filter-works.md#has\_orcid)
* [`has_pmid`](./filter-works.md#has\_pmid)
* [`has_pmcid`](./filter-works.md#has\_pmcid)
* [`has_ngrams`](./filter-works.md#has\_ngrams) (DEPRECATED)
* [`has_references`](./filter-works.md#has\_references)
* [`is_retracted`](./work-object/README.md#is\_retracted)
* [`is_paratext`](./work-object/README.md#is\_paratext)
* [`journal`](./filter-works.md#journal)
* [`keywords.keyword`](./work-object/README.md#keywords)
* [`language`](./work-object/README.md#language)
* [`locations.is_accepted`](./work-object/README.md#locations)
* [`locations.is_published`](./work-object/README.md#locations)
* [`locations.source.host_institution_lineage`](./filter-works.md#locations.source.host\_institution\_lineage)
* [`locations.source.is_in_doaj`](./work-object/README.md#locations)
* [`locations.source.publisher_lineage`](./filter-works.md#locations.source.publisher\_lineage)
* [`locations_count`](./work-object/README.md#locations\_count)
* [`open_access.any_repository_has_fulltext`](./work-object/README.md#open\_access)
* [`open_access.is_oa`](./work-object/README.md#is\_oa-1) (alias `is_oa`)
* [`open_access.oa_status`](./work-object/README.md#oa\_status) (alias `oa_status`)
* [`primary_location.is_accepted`](./work-object/README.md#primary\_location)
* [`primary_location.is_oa`](./work-object/README.md#primary\_location)
* [`primary_location.is_published`](./work-object/README.md#primary\_location)
* [`primary_location.license`](./work-object/README.md#primary\_location)
* [`primary_location.source.has_issn`](./work-object/README.md#primary\_location)
* [`primary_location.source.host_organization`](./work-object/README.md#primary\_location)
* [`primary_location.source.id`](./work-object/README.md#primary\_location)
* [`primary_location.source.is_in_doaj`](./work-object/README.md#primary\_location)
* [`primary_location.source.issn`](./work-object/README.md#primary\_location)
* [`primary_location.source.publisher_lineage`](./filter-works.md#primary_location.source.publisher\_lineage)
* [`primary_location.source.type`](./work-object/README.md#primary\_location)
* [`primary_location.version`](./work-object/README.md#primary\_location)
* [`publication_year`](./work-object/README.md#publication\_year)
* [`repository`](./filter-works.md#repository)
* [`sustainable_development_goals.id`](./work-object/README.md#sustainable_development_goals)
* [`type`](./work-object/README.md#type)
* [`type_crossref`](./work-object/README.md#type_crossref)
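
Any attribute in the list above can be passed as the `group_by` value. The following is a minimal sketch, using only the standard library, of building such a request URL. The helper `build_group_by_url`, the `ALIASES` table (a small subset of the aliases listed above), and the `DEPRECATED` set are hypothetical names introduced for illustration; the sketch makes no network request.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"

# A small subset of the aliases listed above (short name -> full attribute).
ALIASES = {
    "oa_status": "open_access.oa_status",
    "is_oa": "open_access.is_oa",
    "author.id": "authorships.author.id",
    "institutions.country_code": "authorships.institutions.country_code",
}

# Properties the documentation marks as deprecated for filters and group-bys.
DEPRECATED = {"host_venue", "alternate_host_venues"}


def build_group_by_url(attribute, expand_alias=False):
    """Build a /works URL that groups by the given attribute.

    Raises ValueError for attributes rooted in a deprecated property.
    """
    if attribute.split(".")[0] in DEPRECATED:
        raise ValueError(f"'{attribute}' is deprecated; use primary_location or locations")
    if expand_alias:
        attribute = ALIASES.get(attribute, attribute)
    return f"{BASE_URL}?{urlencode({'group_by': attribute})}"


print(build_group_by_url("publication_year"))
print(build_group_by_url("oa_status", expand_alias=True))
```

Rejecting deprecated attributes locally is a design choice for this sketch: it surfaces the problem before a request is sent, rather than relying on the error the API returns.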