Support summarization of empty data in Pandas backend #2908

Merged

emilyreff7
Contributor

Proposed Change

Handle empty input data in summarization.

Currently, when a summarization is passed empty data in the Pandas backend, it fails with a confusing error:

UnboundLocalError: local variable 'grouped_col_name' referenced before assignment

This PR adds empty-data checks in a couple of places so that summarization gracefully returns an empty DataFrame instead.
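As a rough illustration of the kind of guard described above, here is a minimal sketch in plain pandas. The helper name `coerce_to_dataframe` and its signature are hypothetical and are not taken from the ibis code base; the sketch only shows how an early return on empty input avoids indexing into data that has no rows.

```python
import pandas as pd


def coerce_to_dataframe(data: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: split a Series of tuples into one column per element."""
    if not len(data):
        # Empty input: there is no first row to inspect, so return an
        # empty frame right away instead of failing further down.
        return data.to_frame()
    # Non-empty input: the first row tells us how many columns to build.
    num_cols = len(data.iloc[0])
    # Bind i as a default argument so each lambda keeps its own index.
    series = [data.apply(lambda t, i=i: t[i]) for i in range(num_cols)]
    return pd.concat(series, axis=1)


empty = pd.Series([], dtype=object)
filled = pd.Series([(1, 2.0), (3, 4.0)])

empty_result = coerce_to_dataframe(empty)
filled_result = coerce_to_dataframe(filled)
```

With this guard in place, `empty_result` is an empty DataFrame rather than an exception, and `filled_result` has one column per tuple element.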

Tests

Added a new test for this in test_vectorized_udf.py.

Contributor

datapythonista left a comment

lgtm

series = [data.apply(lambda t: t[i]) for i in range(num_cols)]
result = pd.concat(series, axis=1)
if not len(data):
    result = pd.concat([data], axis=1)
Contributor

Why not result = data.to_frame()?

Contributor Author

Good point, updated to use to_frame instead.
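For readers following the suggestion, a small sketch comparing the two spellings on an empty Series. The Series name `grouped` is made up for illustration; the point is only that wrapping a single Series with `pd.concat(..., axis=1)` and calling `Series.to_frame()` both yield a one-column DataFrame, with `to_frame()` stating the intent more directly.

```python
import pandas as pd

# An empty, named Series standing in for the empty input data.
data = pd.Series([], dtype=object, name="grouped")

# Original spelling: concatenate a one-element list along the columns.
via_concat = pd.concat([data], axis=1)

# Suggested spelling: convert the Series to a DataFrame directly.
via_to_frame = data.to_frame()
```

Both results are empty DataFrames; `to_frame()` avoids building a throwaway list just to promote a single Series.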

Contributor

icexelloss left a comment

LGTM

datapythonista added the dask (The Dask backend), pandas (The pandas backend), and bug (Incorrect behavior inside of ibis) labels Aug 24, 2021
datapythonista merged commit c2343d8 into ibis-project:master Aug 24, 2021
@datapythonista
Contributor

Thanks @emilyreff7

4 participants