Fast way to read a particular row and specific columns with uneven entries (which file format?!, saving/loading options, etc.) #1785

Answered by jpivarski
thoglu asked this question in Q&A
I need to follow up on this when I have time to look things up, but I can provide some pointers in the meantime. There's another function, ak.metadata_from_parquet, which reads the (small) metadata of a Parquet file but not the (large) data. In this metadata, there are fields for num_entries, num_row_groups, and also row-group by row-group information about exactly which entries (rows) are in each row group.

If you have a specific entry/row to read, or a specific range, entry_start:entry_stop, this can be expanded to row_group_start:row_group_stop by rounding down the start index and rounding up the stop index. (There is no way to read one entry; row groups are the smallest granularity th…
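The rounding described above can be sketched in plain Python. This is only an illustration, not code from the answer: the function name `row_groups_for_entries` and its `row_group_sizes` argument (a list with the number of entries in each row group, as taken from the file's metadata) are hypothetical names introduced here.

```python
from bisect import bisect_right
from itertools import accumulate


def row_groups_for_entries(row_group_sizes, entry_start, entry_stop):
    """Expand entry_start:entry_stop to the row groups that cover it.

    Hypothetical helper: row_group_sizes is assumed to list the number
    of entries (rows) in each row group, in file order.

    Returns (row_group_start, row_group_stop, local_start, local_stop),
    where local_start:local_stop selects the requested entries from the
    concatenation of the row groups that were read.
    """
    # offsets[i] is the global index of the first entry in row group i
    offsets = [0, *accumulate(row_group_sizes)]
    if not 0 <= entry_start < entry_stop <= offsets[-1]:
        raise ValueError("entry range is empty or outside the file")

    # Round the start down: the row group that contains entry_start.
    row_group_start = bisect_right(offsets, entry_start) - 1
    # Round the stop up: one past the row group containing the last entry.
    row_group_stop = bisect_right(offsets, entry_stop - 1)

    # Position of the requested entries inside the row groups read.
    local_start = entry_start - offsets[row_group_start]
    local_stop = entry_stop - offsets[row_group_start]
    return row_group_start, row_group_stop, local_start, local_stop


# Three row groups holding 100, 100, and 50 entries.
sizes = [100, 100, 50]
single = row_groups_for_entries(sizes, 150, 151)   # one entry
spanning = row_groups_for_entries(sizes, 90, 210)  # crosses all groups
```

With the row-group range in hand, one would presumably read only those row groups (for example through the `row_groups` and `columns` arguments of `ak.from_parquet`) and then slice the result with `local_start:local_stop` to get the requested entries.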

Answer selected by thoglu