Fix Parquet reader crash when min_bytes_for_seek=0 #88784
al13n321 merged 5 commits into ClickHouse:master from
- Fix prefetch loop to always fetch at least one row group
- Add defensive null check before using record_batch_reader
- Prevents segfault when input_format_parquet_local_file_min_bytes_for_seek=0
I used this script to check if the bug has been fixed.
@al13n321 Could you please review this. Thanks.
Please add a test in
Reproduces with just
insert into function file('t.parquet') select 1 as x settings engine_file_truncate_on_insert=1;
select * from file('t.parquet') settings input_format_parquet_local_file_min_bytes_for_seek=0, input_format_parquet_use_native_reader_v3=0, max_parsing_threads=1;
(but in the test please put the file in
Without these changes the test gives this output
With these changes
Hey @al13n321 thanks for the comments.
Workflow [PR], commit [812fa0b] Summary: ❌
@@ -0,0 +1,2 @@
1

The test fails because of this empty line, I think: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=88784&sha=f37701f26abe71ec4744fbda0ce7f12a9ec1ee03&name_0=PR
Remove trailing blank line from reference file.
Fixed. Thanks.
31127f2
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fixed a segmentation fault in the Parquet reader when input_format_parquet_local_file_min_bytes_for_seek is set to 0. Resolves #78456
Details
Fixes a segmentation fault in the Parquet reader that occurs when input_format_parquet_local_file_min_bytes_for_seek is set to 0.
Root Cause
When input_format_parquet_local_file_min_bytes_for_seek = 0, the prefetch iterator's loop condition fails on the first check, so the loop never executes. As a result, prefetched_row_groups remains empty, nextRowGroupReader() returns nullptr, and fetchBatch() crashes on chassert(row_group_batch.record_batch_reader).
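The failure mode can be illustrated with a minimal, self-contained sketch. This is not the ClickHouse source: `RowGroup`, `prefetchBuggy`, and `prefetchFixed` are hypothetical names, and the loop shape is an assumption based only on the description above (a byte budget checked before the first row group is taken).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for row group metadata; not ClickHouse's real type.
struct RowGroup
{
    size_t compressed_bytes;
};

// Assumed buggy shape: the byte budget is checked before the first fetch,
// so a budget of 0 means the loop body never runs and nothing is prefetched.
std::vector<size_t> prefetchBuggy(const std::vector<RowGroup> & groups, size_t min_bytes_for_seek)
{
    std::vector<size_t> prefetched;
    size_t bytes = 0;
    for (size_t i = 0; i < groups.size() && bytes < min_bytes_for_seek; ++i)
    {
        prefetched.push_back(i);
        bytes += groups[i].compressed_bytes;
    }
    return prefetched;
}

// Assumed fixed shape: always take at least one row group (if any exist),
// then keep going only while the accumulated size is under the budget.
std::vector<size_t> prefetchFixed(const std::vector<RowGroup> & groups, size_t min_bytes_for_seek)
{
    std::vector<size_t> prefetched;
    size_t bytes = 0;
    for (size_t i = 0; i < groups.size() && (prefetched.empty() || bytes < min_bytes_for_seek); ++i)
    {
        prefetched.push_back(i);
        bytes += groups[i].compressed_bytes;
    }
    return prefetched;
}
```

With a budget of 0, the first function returns an empty list, which corresponds to the empty prefetched_row_groups in the description, while the second still returns one row group. A caller should still check for an empty result (for example, a file with no row groups) before dereferencing a reader, which is the role of the defensive null check.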
Changes
Reproduction
Before fix: Segmentation fault
After fix: Works correctly
Resolves #78456