Use select_column if possible when read_hdf #20673

Open
@dzubo

If pd.read_hdf(..., columns=['col_name'], ...) is called with only one column, it would be much faster to use store.select_column() instead of store.select(), which is what read_hdf currently calls:

return store.select(key, auto_close=auto_close, **kwargs)

Any caveats on implementing this?
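To make the proposal concrete, here is a rough sketch of the dispatch I have in mind. The helper name select_for_read_hdf is hypothetical and not part of pandas; store is assumed to behave like an open HDFStore, and auto_close handling is left out for brevity. It only takes the fast path when nothing but start/stop is passed, since select_column does not accept a where selector.

```python
def select_for_read_hdf(store, key, columns=None, **kwargs):
    """Hypothetical dispatch helper; not part of pandas.

    `store` is assumed to behave like an open pandas.HDFStore.
    """
    # select_column only accepts start/stop, so anything else
    # (e.g. where=...) has to go through the general select path.
    only_start_stop = set(kwargs) <= {"start", "stop"}
    single_column = columns is not None and len(columns) == 1

    if single_column and only_start_stop:
        # select_column returns a Series for one column; to_frame()
        # turns it back into a one-column DataFrame.
        return store.select_column(key, columns[0], **kwargs).to_frame()

    # Fallback: the existing behaviour.
    return store.select(key, columns=columns, **kwargs)
```

Two caveats I am aware of: select_column only works on table-format stores and only for indexable or data columns, so a real implementation would need to fall back to select() in the other cases. Also, as far as I can tell, the Series it returns does not carry the stored index, so the result would not be identical to what select() gives today.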

Labels: IO HDF5 (read_hdf, HDFStore), Performance (Memory or execution speed performance)
