Note: This is the archived repository of an unmaintained package. I found this library useful for a while, but I have archived it because I moved to Polars, whose rich I/O connectors cover the typical use cases.
Have a look at the Polars User Guide if you need a library to read and write dataframes.
Read and write dataframes from and to any storage.
- Documentation: https://chr1st1ank.github.io/dataframe-io/
- License: Apache-2.0
- Status: Initial development
Supported dataframe types:
- pandas DataFrame
- Python dictionary
Supported storage backends:
- Parquet files
- PostgreSQL database
More backends will come. Open an issue if you are interested in a particular backend.
Implementation status for reading data:
Storage | Select columns | Filter rows | Max rows | Sampling | Drop duplicates |
---|---|---|---|---|---|
Parquet files | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ ¹ |
PostgreSQL | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
¹ only for pandas DataFrames
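To make the column headers of the table above concrete, here is a minimal sketch of what the five read options mean when applied to a Python dictionary of column lists. It is not the dframeio API: the function `read_sketch`, its parameter names, and the order in which the options are applied are assumptions made for illustration only.

```python
import random
from typing import Any, Callable, Dict, List, Optional

Table = Dict[str, List[Any]]  # column name -> list of values


def read_sketch(
    table: Table,
    columns: Optional[List[str]] = None,
    row_filter: Optional[Callable[[Dict[str, Any]], bool]] = None,
    limit: Optional[int] = None,
    sample: Optional[int] = None,
    drop_duplicates: bool = False,
    seed: int = 0,
) -> Table:
    """Hypothetical helper, not part of dframeio."""
    names = list(columns) if columns is not None else list(table)
    n_rows = len(next(iter(table.values()), []))
    rows = [{name: table[name][i] for name in table} for i in range(n_rows)]

    # Filter rows: the predicate sees all columns of a row.
    if row_filter is not None:
        rows = [row for row in rows if row_filter(row)]

    # Select columns: keep only the requested ones, in the requested order.
    projected = [tuple(row[name] for name in names) for row in rows]

    # Drop duplicates: keep the first occurrence (values must be hashable).
    if drop_duplicates:
        projected = list(dict.fromkeys(projected))

    # Sampling: pick `sample` random rows, preserving their original order.
    if sample is not None and sample < len(projected):
        rng = random.Random(seed)
        keep = sorted(rng.sample(range(len(projected)), sample))
        projected = [projected[i] for i in keep]

    # Max rows: truncate the result.
    if limit is not None:
        projected = projected[:limit]

    return {name: [row[j] for row in projected] for j, name in enumerate(names)}


data = {
    "id": [1, 2, 2, 3, 4],
    "city": ["Bonn", "Kiel", "Kiel", "Ulm", "Bonn"],
    "temp": [12, 9, 9, 15, 11],
}
result = read_sketch(
    data,
    columns=["id", "city"],
    row_filter=lambda row: row["temp"] < 13,
    drop_duplicates=True,
    limit=2,
)
print(result)  # {'id': [1, 2], 'city': ['Bonn', 'Kiel']}
```

A real backend would push as many of these steps as possible down to the storage layer (for example into the SQL query for PostgreSQL) rather than materialising all rows first.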
Implementation status for writing data:
Storage | write append | write replace |
---|---|---|
Parquet files | ✔️ | ✔️ |
PostgreSQL | ✔️ | ✔️ |
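The difference between the two write modes in the table above can be sketched with an in-memory dictionary standing in for the storage. The function names `sketch_write_replace` and `sketch_write_append` are hypothetical and only illustrate the semantics; they are not the library's API.

```python
from typing import Any, Dict, List

Table = Dict[str, List[Any]]  # column name -> list of values


def sketch_write_replace(storage: Dict[str, Table], target: str, table: Table) -> None:
    """Overwrite whatever is stored under `target` with a copy of `table`."""
    storage[target] = {column: list(values) for column, values in table.items()}


def sketch_write_append(storage: Dict[str, Table], target: str, table: Table) -> None:
    """Add the rows of `table` to the data already stored under `target`."""
    existing = storage.get(target)
    if existing is None:
        # Nothing stored yet: appending behaves like an initial write.
        sketch_write_replace(storage, target, table)
        return
    if set(existing) != set(table):
        raise ValueError("appended data must have the same columns")
    for column, values in table.items():
        existing[column].extend(values)


storage: Dict[str, Table] = {}
sketch_write_append(storage, "measurements", {"id": [1, 2], "temp": [12, 9]})
sketch_write_append(storage, "measurements", {"id": [3], "temp": [15]})
after_append = {column: list(values) for column, values in storage["measurements"].items()}
print(after_append)  # {'id': [1, 2, 3], 'temp': [12, 9, 15]}

sketch_write_replace(storage, "measurements", {"id": [9], "temp": [0]})
print(storage["measurements"])  # {'id': [9], 'temp': [0]}
```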
pip install dframeio
# Including pyarrow to read/write parquet files:
pip install dframeio[parquet]
# Including PostgreSQL support:
pip install dframeio[postgres]
Show installed backends:
>>> import dframeio
>>> dframeio.backends
[<class 'dframeio.parquet.ParquetBackend'>]
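Which backends show up depends on which optional extras were installed. As a rough sketch of that idea, and not a description of how dframeio detects its backends, the snippet below checks whether the driver modules can be imported. The mapping is an assumption: `pyarrow` is named in the install comments above, while `psycopg2` as the PostgreSQL driver is a guess.

```python
from importlib.util import find_spec
from typing import Dict, List

# Assumed mapping from backend name to the top-level module it needs.
ASSUMED_DRIVERS: Dict[str, str] = {
    "parquet": "pyarrow",
    "postgres": "psycopg2",  # assumption, not confirmed by this README
}


def available_backends(drivers: Dict[str, str] = ASSUMED_DRIVERS) -> List[str]:
    """Return the backend names whose driver module is importable."""
    return [name for name, module in drivers.items() if find_spec(module) is not None]


print(available_backends())
```

`find_spec` only locates the module without importing it, so the check stays cheap even when a driver is large.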