Add GenericColumnReader::skip_records Missing OffsetIndex Fallback #2433
Labels
enhancement
Any new improvement worthy of an entry in the changelog
parquet
Changes to the parquet crate
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
`GenericColumnReader::skip_records` always calls `PageReader::peek_next_page`. If the file doesn't have an OffsetIndex, or the index hasn't been loaded, this results in an error.
Describe the solution you'd like
I would like some mechanism for `GenericColumnReader` to fall back to decoding the level data from the page and using it to skip decoding the values. This may require adding a `PageReader::has_page_metadata() -> bool` or something similar to detect this situation.
Not only would this avoid needing to know ahead of time whether index information is available before pushing down a `RowFilter`, etc., but it would also allow these APIs to work for files without a PageIndex, since decoding just the levels is still significantly faster than also decoding the values.
Describe alternatives you've considered
Additional context
FYI @Ted-Jiang
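To make the proposed fallback concrete, here is a minimal, standalone sketch of how already-decoded level data could be used to work out how much to skip. It does not use the parquet crate: `SkipResult` and `skip_records_by_levels` are hypothetical names, levels are assumed to be decoded into `i16` slices, and the buffer is assumed to start on a record boundary with no record spanning into the next page.

```rust
/// How far a skip got, measured in records, levels and encoded values.
/// Hypothetical type for illustration only, not part of the parquet crate.
#[derive(Debug, PartialEq, Eq)]
struct SkipResult {
    records_skipped: usize,
    levels_skipped: usize,
    values_skipped: usize,
}

/// Work out how many levels and values make up the first `to_skip` records
/// of a page, using only its decoded level data.
///
/// Assumptions of this sketch: the level buffers start on a record boundary,
/// records do not continue onto the next page, and `num_levels` is the number
/// of levels remaining in the page.
fn skip_records_by_levels(
    rep_levels: Option<&[i16]>,
    def_levels: Option<&[i16]>,
    max_def_level: i16,
    num_levels: usize,
    to_skip: usize,
) -> SkipResult {
    let (records_skipped, levels_skipped) = match rep_levels {
        // No repetition levels: every level is exactly one record.
        None => {
            let n = to_skip.min(num_levels);
            (n, n)
        }
        // With repetition levels, a record starts wherever the level is 0.
        Some(rep) => {
            let mut starts_seen = 0usize;
            // If the page runs out first, everything in it is skipped.
            let mut end = rep.len().min(num_levels);
            for (i, &r) in rep.iter().take(num_levels).enumerate() {
                if r == 0 {
                    if starts_seen == to_skip {
                        // This is the first record to keep: stop before it.
                        end = i;
                        break;
                    }
                    starts_seen += 1;
                }
            }
            (starts_seen, end)
        }
    };

    // Only levels equal to the maximum definition level have an encoded value.
    let values_skipped = match def_levels {
        None => levels_skipped,
        Some(def) => def
            .iter()
            .take(levels_skipped)
            .filter(|&&d| d == max_def_level)
            .count(),
    };

    SkipResult {
        records_skipped,
        levels_skipped,
        values_skipped,
    }
}
```

In this sketch the value decoder would then only need to skip `values_skipped` entries rather than decode them, which is where the saving over a full decode would come from. A real implementation would also have to handle records that continue across page boundaries.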