Please correct me if this is possible already. I looked through the source code and the documentation and did not find a clear way to do this: basically, I want to read a FeatherV2 file, but not mmap every single column. I already know which columns I need and I'd like to tell Arrow.Table the subset of columns I want read into memory.
Hey @CarlColglazier, thanks for opening an issue. We could probably support keyword arguments like select and drop, but note that it wouldn't change how much memory is "mmapped". Arrow tables are stored in a single memory blob and there isn't really a way to only mmap a few columns. You still have to read the header/metadata to figure out the offsets of specific columns into the data.
So, happy to support select/drop, since it can be convenient to only get back the columns you really need, but I just want to point out that I wouldn't expect there to be any real effect on memory/performance.
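Until such keywords exist, the same convenience can be approximated after opening the file. The sketch below is only an illustration: `select_columns` is a hypothetical helper, not part of Arrow.jl, and the example file and column names are made up. It relies on `Arrow.Table`, `Arrow.write`, and `Tables.getcolumn`, and, as noted above, it narrows what is returned rather than what is memory-mapped.

```julia
using Arrow, Tables

# Hypothetical helper (not part of Arrow.jl): open the file as usual, then
# keep only the requested columns as a NamedTuple of column vectors.
function select_columns(path::AbstractString, cols)
    tbl = Arrow.Table(path)            # maps the file and parses its metadata
    names = Tuple(Symbol.(cols))       # column names as a tuple of Symbols
    return NamedTuple{names}(map(c -> Tables.getcolumn(tbl, c), names))
end

# Usage: write a small example table, then read back two of its three columns.
path = tempname() * ".arrow"
Arrow.write(path, (a = [1, 2, 3], b = ["x", "y", "z"], c = [1.0, 2.0, 3.0]))
subset = select_columns(path, [:a, :c])
```

The result is a plain NamedTuple of vectors, which Tables.jl treats as a column table, so it can be passed on to other table consumers. The selected columns are still views into the mapped file; calling `copy` on a column should materialize it as an ordinary in-memory vector if that is what is needed.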
This is similar to this issue on Feather.jl.
This seems to be possible in the R arrow package using col_select.