[Python] RowGroup filtering on file level #17793
Comments
Wes McKinney / @wesm: |
Uwe Korn / @xhochy: |
Uwe Korn / @xhochy: |
Robbie Gruener / @rgruener: |
Wes McKinney / @wesm: |
Joris Van den Bossche / @jorisvandenbossche: (we can have a separate one about actually using this in |
Wes McKinney / @wesm: |
We can build upon the API defined in fastparquet for defining RowGroup filters: https://github.com/dask/fastparquet/blob/master/fastparquet/api.py#L296-L300 and translate them into the C++ enums we will define in https://issues.apache.org/jira/browse/PARQUET-1158. This should enable us to provide the user with a simple predicate pushdown API that we can extend in the background from RowGroup to Page level later on.

Reporter: Uwe Korn / @xhochy
Assignee: Joris Van den Bossche / @jorisvandenbossche
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-1796. Please see the migration documentation for further details.
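
For illustration, the RowGroup-level predicate pushdown described in the issue could look roughly like the sketch below. It assumes filters are given as a list of `(column, op, value)` tuples, which is how I understand the fastparquet convention, and that each RowGroup exposes per-column min/max statistics. `ColumnStats`, `may_match`, and `select_row_groups` are hypothetical names made up for this example; none of them is an existing pyarrow or fastparquet API, and the real implementation would map the operators onto the C++ enums instead.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# Assumed filter shape: (column name, operator, value), e.g. ("x", ">", 5).
Filter = Tuple[str, str, Any]


@dataclass
class ColumnStats:
    """Hypothetical stand-in for the min/max statistics of one column chunk."""
    min: Any
    max: Any


def may_match(stats: ColumnStats, op: str, value: Any) -> bool:
    """Return False only if the statistics prove that no row can match."""
    if op in ("=", "=="):
        return stats.min <= value <= stats.max
    if op == "!=":
        # Only skippable when every value in the chunk equals `value`.
        return not (stats.min == stats.max == value)
    if op == "<":
        return stats.min < value
    if op == "<=":
        return stats.min <= value
    if op == ">":
        return stats.max > value
    if op == ">=":
        return stats.max >= value
    if op == "in":
        return any(stats.min <= v <= stats.max for v in value)
    raise ValueError(f"unsupported operator: {op!r}")


def select_row_groups(
    row_groups: List[Dict[str, ColumnStats]],
    filters: List[Filter],
) -> List[int]:
    """Return the indices of RowGroups that may contain matching rows.

    All filters are combined with AND. A RowGroup without statistics for a
    filtered column is kept, because it cannot be ruled out.
    """
    keep = []
    for index, stats_by_column in enumerate(row_groups):
        if all(
            column not in stats_by_column
            or may_match(stats_by_column[column], op, value)
            for column, op, value in filters
        ):
            keep.append(index)
    return keep


# Three RowGroups holding x in [0, 9], [10, 19], and [20, 29].
row_groups = [
    {"x": ColumnStats(0, 9)},
    {"x": ColumnStats(10, 19)},
    {"x": ColumnStats(20, 29)},
]
selected = select_row_groups(row_groups, [("x", ">", 12)])
```

With this data, `selected` contains the indices of the second and third RowGroup only, so the first one would never be read. Statistics can only exclude RowGroups; rows inside the selected groups would still need to be filtered after reading. The same check could later be applied to Page-level statistics without changing the user-facing filter format.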