This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Reading Parquet file written by pyarrow with lz4 compression fails with OutOfSpec("Thrift out of range") #940

Closed
ritchie46 opened this issue Apr 12, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@ritchie46
Collaborator

import polars as pl
import io

f = io.BytesIO()
df = pl.DataFrame({
    "a": [1, 2, 3]
})

df.write_parquet(f, use_pyarrow=True, compression="lz4")
f.seek(0)
read = pl.read_parquet(f, use_pyarrow=False)

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: OutOfSpec("Thrift out of range")', /github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet2-0.10.3/src/metadata/column_chunk_metadata.rs:83:49
stack backtrace:
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

@jorgecarleitao
Owner

Thanks!

This has been fixed in parquet2 at jorgecarleitao/parquet2#118 by @dantengsky. We need a release there, which I expect to be within the next few days.

@jorgecarleitao jorgecarleitao added the bug Something isn't working label Apr 14, 2022
@jorgecarleitao
Owner

Closed by #923.
