Describe the bug
Reading parquet files with corrupted data leads to a panic due to this assertion:
assert!(end <= remainder.len());
To Reproduce
I couldn't write a minimal PoC, but here is the stack trace:
thread 'xx' (53) panicked at /home/ubuntu/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/parquet-57.3.0/src/file/metadata/reader.rs:535:17:
assertion failed: end <= remainder.len()
stack backtrace:
0: __rustc::rust_begin_unwind
1: core::panicking::panic_fmt
2: core::panicking::panic
3: <datafusion_datasource_parquet::reader::CachedParquetFileReader as parquet::arrow::async_reader::AsyncFileReader>::get_metadata::{{closure}}
4: <datafusion_datasource_parquet::opener::ParquetOpener as datafusion_datasource::file_stream::FileOpener>::open::{{closure}}
5: <datafusion_datasource::file_stream::FileStream as futures_core::stream::Stream>::poll_next
6: <datafusion_physical_plan::coop::CooperativeStream<T> as futures_core::stream::Stream>::poll_next
7: <datafusion_physical_plan::stream::BatchSplitStream as futures_core::stream::Stream>::poll_next
We had a Delta Lake table, and I corrupted one of its parquet files with:
python3 -c "
data = open('/tmp/original.parquet','rb').read()
total = len(data)
# Keep first 30% of data + last 1000 bytes (footer)
head = data[:int(total * 0.3)]
foot = data[-1000:] # footer is small, 846 bytes per metadata output
open('/tmp/corrupt.parquet','wb').write(head + foot)
print(f'Original: {total} -> Corrupt: {len(head) + len(foot)} bytes')
"
That led to the crash.
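To confirm that a file corrupted this way still ends with a readable footer while its body is truncated, the trailer can be inspected directly. This is a minimal sketch, assuming the standard unencrypted Parquet layout (a 4-byte little-endian footer length followed by the PAR1 magic at the very end of the file). The helper name read_footer_info is hypothetical and not part of any library.

```python
import struct

MAGIC = b"PAR1"  # trailing magic of an unencrypted Parquet file


def read_footer_info(data: bytes) -> dict:
    """Decode the 8-byte trailer: footer length and where the footer starts.

    Hypothetical helper for illustration; it only looks at the trailer and
    does not parse the Thrift-encoded metadata itself.
    """
    if len(data) < 8:
        raise ValueError("file too small to contain a Parquet trailer")
    # "<I4s": little-endian u32 footer length, then the 4-byte magic
    footer_len, magic = struct.unpack("<I4s", data[-8:])
    if magic != MAGIC:
        raise ValueError(f"unexpected trailing magic: {magic!r}")
    footer_start = len(data) - 8 - footer_len
    if footer_start < 0:
        raise ValueError("footer length exceeds file size")
    return {"footer_len": footer_len, "footer_start": footer_start}
```

Running read_footer_info(open('/tmp/corrupt.parquet','rb').read()) should succeed if the kept tail covers the whole footer, as it does with 1000 bytes kept for an 846-byte footer. The offsets stored inside that footer still describe the original, longer file, which is presumably how an out-of-range end reaches the assertion.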
Expected behavior
It should return a ParquetError instead of panicking.
Additional context
version: parquet-57.3.0 (latest would fail as well)
assertion location: arrow-rs/parquet/src/file/metadata/reader.rs, line 533 at commit 88b7fca