Fix reading count() from cache in case of partitioned delta lake #85704

Merged
kssenii merged 4 commits into master from delta-kernel-fix-not-found-column on Aug 19, 2025

@kssenii kssenii commented Aug 15, 2025

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix reading count from cache for delta lake.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Member Author

There is a bit of extra diff in this file because black formatting was applied.

clickhouse-gh bot commented Aug 15, 2025

Workflow [PR], commit [2dc95ef]

Summary:

job_name / test_name: status

Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel): failure
  • 00712_prewhere_with_alias_bug_2: FAIL
  • 01038_array_of_unnamed_tuples: FAIL
  • Exception in test runner: FAIL
  • Killed by signal (in clickhouse-server.log or clickhouse-server.err.log): FAIL
  • Fatal messages (in clickhouse-server.log or clickhouse-server.err.log): FAIL

Stateless tests (amd_debug, distributed plan, s3 storage, parallel): failure
  • 02443_detach_attach_partition: FAIL

Stateless tests (amd_tsan, s3 storage, parallel): failure
  • 02443_detach_attach_partition: FAIL

Integration tests (amd_asan, old analyzer, 3/6): failure
  • test_storage_kafka/test_batch_fast.py::test_kafka_no_holes_when_write_suffix_failed[generate_old_create_table_query]: FAIL

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Aug 15, 2025
  ColumnWithTypeAndName count_column(column_type->createColumn(), column_type, column_name);
  builder.init(Pipe(std::make_shared<ConstChunkGenerator>(
-     std::make_shared<const Block>(read_from_format_info.format_header), *num_rows_from_cache, max_block_size)));
+     std::make_shared<const Block>(ColumnsWithTypeAndName{count_column}), *num_rows_from_cache, max_block_size)));
Member Author

The problem here was that format_header was used. Partition columns are read not from the file but from the file path, so if a partition column was chosen to execute count, the format header would be empty, and further below we would get a "not found column in block" error.
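
The failure mode described in this comment can be sketched outside ClickHouse with a toy model. Everything in the block below is hypothetical and only illustrates the idea: `HeaderSketch`, `makeFormatHeader` and `makeCountHeader` are made-up names, a header is reduced to a list of column names, and the assumption that the format header keeps only columns physically stored in the data files is taken from the comment, not from the actual ClickHouse code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a block header: an ordered list of column names.
struct HeaderSketch
{
    std::vector<std::string> column_names;

    // By-name lookup that throws when the column is absent,
    // loosely modelled on the "not found column in block" error.
    std::size_t getPositionByName(const std::string & name) const
    {
        for (std::size_t i = 0; i < column_names.size(); ++i)
            if (column_names[i] == name)
                return i;
        throw std::runtime_error("Not found column " + name + " in block");
    }
};

// Assumption: the format header lists only columns read from the data files,
// so partition columns (derived from the file path) are filtered out.
HeaderSketch makeFormatHeader(
    const std::vector<std::string> & requested_columns,
    const std::vector<std::string> & partition_columns)
{
    HeaderSketch header;
    for (const auto & name : requested_columns)
    {
        const bool is_partition_column
            = std::find(partition_columns.begin(), partition_columns.end(), name) != partition_columns.end();
        if (!is_partition_column)
            header.column_names.push_back(name);
    }
    return header;
}

// The approach taken by the fix, in miniature: build the header from the
// column chosen for count itself, so it is present wherever its values come from.
HeaderSketch makeCountHeader(const std::string & count_column)
{
    assert(!count_column.empty());
    HeaderSketch header;
    header.column_names.push_back(count_column);
    return header;
}
```

In this sketch, asking for the partition column `year` through `makeFormatHeader({"year"}, {"year"})` yields an empty header, and a later `getPositionByName("year")` throws, whereas `makeCountHeader("year")` always contains the column that the count is executed on.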

@kssenii kssenii requested a review from Avogar August 15, 2025 11:37
@Avogar Avogar self-assigned this Aug 15, 2025
kssenii commented Aug 19, 2025

Fatal messages (in clickhouse-server.log or clickhouse-server.err.log)

#85861

@kssenii kssenii enabled auto-merge August 19, 2025 21:11
@kssenii kssenii added this pull request to the merge queue Aug 19, 2025
Merged via the queue into master with commit 23b4e0e Aug 19, 2025
119 of 122 checks passed
@kssenii kssenii deleted the delta-kernel-fix-not-found-column branch August 19, 2025 21:26
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Aug 19, 2025