Custom Predicates for ParquetExec and Parquet row indexes #9341

Answered by alamb
lightjacket asked this question in Q&A
So my question is whether there is a way from a RecordBatch to load what the original record indexes in the parquet file were.

I don't think this is possible today in DataFusion (or in the Rust parquet reader).

As @Ted-Jiang mentioned, RowSelection can describe this concept, but I believe the row selections are per row group (not for the file as a whole). Also, I don't think this is exposed in a way that lets you provide a row selection to pass into the underlying reader.
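To illustrate the per-row-group versus whole-file distinction, here is a minimal sketch in plain Rust. It does not use the parquet crate: `Run`, `selected_indexes`, and `file_level_indexes` are hypothetical names invented for this example, only loosely modeled on the idea of a selection as alternating select/skip runs. It assumes you already know each row group's row count and have one selection per row group, which is not something DataFusion hands you today.

```rust
/// Hypothetical stand-in for one run of a row selection: keep or skip
/// `row_count` consecutive rows. This is NOT a type from the parquet crate.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Run {
    pub row_count: usize,
    pub skip: bool,
}

/// Expand a list of runs into the zero-based indexes of the selected rows,
/// counting from `first_row`.
pub fn selected_indexes(runs: &[Run], first_row: usize) -> Vec<usize> {
    let mut out = Vec::new();
    let mut next = first_row;
    for run in runs {
        if !run.skip {
            // Every row in a "select" run is kept.
            out.extend(next..next + run.row_count);
        }
        // Both kinds of run advance the position within the row group.
        next += run.row_count;
    }
    out
}

/// Turn per-row-group selections into file-level row indexes by offsetting
/// each row group by the total row count of the row groups before it.
pub fn file_level_indexes(row_group_sizes: &[usize], selections: &[Vec<Run>]) -> Vec<usize> {
    let mut out = Vec::new();
    let mut offset = 0;
    for (size, runs) in row_group_sizes.iter().zip(selections) {
        out.extend(selected_indexes(runs, offset));
        offset += size;
    }
    out
}
```

For example, with row groups of 5 and 4 rows, selecting rows 2 and 3 of the first group and row 0 of the second gives file-level indexes 2, 3, and 5. The point of the sketch is only that a per-row-group selection needs the preceding row counts before it says anything about positions in the file.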

Answer selected by lightjacket