Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate fastparquet #10743

Merged
merged 2 commits into from Jan 4, 2024
Merged

Deprecate fastparquet #10743

merged 2 commits into from Jan 4, 2024

Conversation

crusaderky
Copy link
Collaborator

@crusaderky crusaderky commented Jan 2, 2024

Changes to to_parquet and read_parquet:

}
marks = {(w, r): [skip_marks[w], skip_marks[r]] for w in backends for r in backends}

# Custom marks
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unused.

@crusaderky crusaderky marked this pull request as ready for review January 3, 2024 17:09
@hendrikmakait hendrikmakait self-requested a review January 3, 2024 17:13
Copy link
Collaborator

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

feel free to merge @hendrikmakait if you want to take a look as well, otherwise I'll merge tomorrow

Copy link
Member

@hendrikmakait hendrikmakait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @crusaderky!

Parquet library to use. Defaults to 'auto', which uses ``pyarrow`` if
it is installed, and falls back to ``fastparquet`` otherwise.
it is installed, and falls back to the deprecated ``fastparquet`` otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should we at least mention somewhere that fastparquet does not support all functionality?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: which functionality does it not support? You could have asked?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martindurant: I've merely referred to the comments about functionality not supported within Dask's read parquet that have been removed in this PR. See the removed lines above this comment for more information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I understand. You mean that fastparquet as a backend in dask does not any more support all dask functionality (although fastparquet itself does).

I am disappointed in how this issue (the whole PR) was handled without any notice to me, perhaps my comment above was not the right place for it.

Nevertheless, let me state for the record: that fastparquet has remained remarkably stable, small and performant for a long time, while keeping up with all upstream changes in pandas. I am sorry to see it dropped like this.

@phofl
Copy link
Collaborator

phofl commented Jan 4, 2024

Merging this one, can do the doc change in a follow up if we want to

@phofl phofl merged commit f2d448e into dask:main Jan 4, 2024
25 checks passed
@phofl
Copy link
Collaborator

phofl commented Jan 4, 2024

thx @crusaderky

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate fastparquet engine for read_parquet to enable switch to dask-expr
4 participants