chelsea-lin
Contributor

@chelsea-lin chelsea-lin commented May 6, 2024

This change introduces the bigframes.bigquery.array_agg method for SeriesGroupBy and DataFrameGroupBy. By default, aggregated arrays are ordered by the underlying sorting columns. Additionally, array_agg is the inverse operation of (Series|DataFrame).explode().
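
To make the inverse relationship concrete, here is a minimal pure-Python sketch of the semantics described above. It is an illustration only, not the bigframes implementation: the helper names array_agg and explode, the row-dict representation, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def array_agg(rows, key, value):
    """Group rows by `key`, collecting `value` into a list in input order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    # dicts preserve insertion order, so groups appear in first-seen order.
    return [{key: k, value: v} for k, v in groups.items()]


def explode(rows, column):
    """Emit one output row per element of the list stored in `column`."""
    return [{**row, column: item} for row in rows for item in row[column]]


# Hypothetical sample data, already sorted by the grouping key.
rows = [
    {"k": "a", "v": 1},
    {"k": "a", "v": 2},
    {"k": "b", "v": 3},
]

aggregated = array_agg(rows, "k", "v")
round_trip = explode(aggregated, "v")
```

In this sketch the round trip reproduces the input exactly only because the rows are already contiguous per key; in general, exploding an aggregated result restores the same rows but grouped by key.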

Fixes internal bug: 338232748🦕

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels May 6, 2024
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from 6e92dd6 to c30464f Compare May 8, 2024 16:57
@chelsea-lin chelsea-lin marked this pull request as ready for review May 8, 2024 17:03
@chelsea-lin chelsea-lin requested review from a team as code owners May 8, 2024 17:03
@chelsea-lin chelsea-lin requested a review from shobsi May 8, 2024 17:03
@chelsea-lin chelsea-lin requested review from TrevorBergeron and removed request for shobsi May 8, 2024 17:03
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from f703d45 to 333d0ac Compare May 8, 2024 17:37
@chelsea-lin chelsea-lin requested a review from tswast May 8, 2024 22:12
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from 1c44cef to e2b0854 Compare May 8, 2024 23:28
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from 824c27b to c5cc12e Compare May 10, 2024 19:59
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from c5cc12e to 3e071ad Compare May 14, 2024 18:36
Collaborator

@tswast tswast left a comment

A nit regarding the helper function naming, but otherwise looks good!

bigframes.dtypes.ibis_dtype_to_bigframes_dtype(ibis_type),
)

def _aggregate_helper(
Collaborator

Could we get a better name for this method, please? "helper" is very generic, so it's hard to understand what this method is doing.

Contributor Author

Renamed it to _aggregate_base to match the object name BaseIbisIR.

@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from a66d01b to 55c9d4f Compare May 14, 2024 22:51
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from 0115c73 to ea3614e Compare May 16, 2024 01:25
@chelsea-lin chelsea-lin added the automerge Merge the pull request once unit tests and other checks pass. label May 16, 2024
Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.

@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label May 16, 2024
@chelsea-lin chelsea-lin force-pushed the main_chelsealin_arrayagg branch from ea3614e to 5596095 Compare May 16, 2024 17:35
@chelsea-lin
Contributor Author

The end-to-end test failures are not caused by this change.

@chelsea-lin chelsea-lin merged commit 412f28b into main May 16, 2024
@chelsea-lin chelsea-lin deleted the main_chelsealin_arrayagg branch May 16, 2024 20:33
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.
4 participants