Allow naive concatenation of sorted dataframes #4725

Merged
mrocklin merged 1 commit into dask:master from mrocklin:relax-interleave_partitions on Apr 25, 2019

@mrocklin (Member) commented Apr 22, 2019

Previously we would raise a somewhat confusing error when a user tried
to concatenate dataframes that were sorted but whose divisions could not
be concatenated in order, pointing them towards
`interleave_partitions=True`. Now we allow this behavior without
complaint, producing an unsorted dataframe (one with unknown divisions).

This stops guiding users towards correct behavior, but also reduces the
confusion for novice users.

Fixes #4693

  • Tests added / passed
  • Passes `flake8 dask`
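
To make the change concrete, here is a minimal stand-alone sketch of how the output divisions are chosen on the default `interleave_partitions=False` path. It is a simplified model, not the dask source: `concat_divisions` is a hypothetical helper, and divisions are passed in as plain tuples rather than read from dataframes.

```python
def concat_divisions(all_divisions):
    """Pick output divisions when stacking frames row-wise.

    `all_divisions` holds one tuple of division boundaries per input frame;
    a frame with n partitions has n + 1 boundaries. A boundary of None
    means that frame's divisions are unknown.
    """
    npartitions = [len(d) - 1 for d in all_divisions]
    known = all(None not in d for d in all_divisions)
    ordered = known and all(
        left[-1] < right[0]
        for left, right in zip(all_divisions, all_divisions[1:])
    )
    if ordered:
        # Frames line up end to end, so the result stays sorted: drop each
        # frame's last boundary and continue with the next frame's boundaries.
        out = []
        for d in all_divisions[:-1]:
            out.extend(d[:-1])
        out.extend(all_divisions[-1])
        return out
    # Overlapping or unknown divisions: no error is raised, the result
    # simply has unknown divisions, one boundary per partition plus one.
    return [None] * (sum(npartitions) + 1)


in_order = concat_divisions([(0, 5, 9), (10, 15, 19)])
overlapping = concat_divisions([(0, 5, 9), (3, 8, 12)])
print(in_order)      # [0, 5, 10, 15, 19]
print(overlapping)   # [None, None, None, None, None]
```

The second call is the case that used to raise `ValueError`. Under this PR it returns all-`None` divisions instead, so the partitions are stacked in the order given and the index is no longer known to be sorted.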
@mrocklin (Member, Author) commented Apr 22, 2019

raise ValueError('All inputs have known divisions which '
'cannot be concatenated in order. Specify '
'interleave_partitions=True to ignore order')
divisions = [None] * (sum([df.npartitions for df in dfs]) + 1)

@mrocklin (Author, Member) commented Apr 22, 2019

Alternatively, we could warn here, but naive users don't have a clear way to turn this warning off.
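
For context, a rough sketch of what the warning alternative might look like, and what a user would have to write to silence it. The function name and message are hypothetical and are not part of dask; only the standard-library `warnings` calls are real.

```python
import warnings


def stack_with_warning(npartitions):
    """Hypothetical variant that warns instead of raising or staying silent."""
    warnings.warn(
        "Inputs have known divisions that cannot be concatenated in order; "
        "the result will have unknown divisions.",
        UserWarning,
    )
    # Same fallback as in the diff above: one None per partition, plus one.
    return [None] * (sum(npartitions) + 1)


# To turn the warning off, the user needs to know about warning filters:
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    divisions = stack_with_warning([2, 2])
```

The silencing step requires familiarity with `warnings.catch_warnings` and filter rules, which is the burden on novice users that the comment above refers to.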

@mrocklin (Member, Author) commented Apr 24, 2019

Merging tomorrow if there are no comments

@mrocklin mrocklin merged commit 9f21623 into dask:master Apr 25, 2019

2 checks passed

  • continuous-integration/appveyor/pr: AppVeyor build succeeded
  • continuous-integration/travis-ci/pr: The Travis CI build passed

@mrocklin mrocklin deleted the mrocklin:relax-interleave_partitions branch Apr 25, 2019

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this pull request May 14, 2019

Allow naive concatenation of sorted dataframes (dask#4725)

Thomas-Z added a commit to Thomas-Z/dask that referenced this pull request May 17, 2019

Allow naive concatenation of sorted dataframes (dask#4725)