Skip to content

fix rewrite disk object storage transaction#81787

Merged
CheSema merged 23 commits intomasterfrom
chesema-disk-object-rewrite-file
Sep 22, 2025
Merged

fix rewrite disk object storage transaction#81787
CheSema merged 23 commits intomasterfrom
chesema-disk-object-rewrite-file

Conversation

@CheSema
Copy link
Copy Markdown
Member

@CheSema CheSema commented Jun 13, 2025

#65914

Here is an attempt to remove blob from previous file version

object_storage_tx->object_storage.removeObjectsIfExist(object_storage_tx->metadata_storage.getStorageObjects(path));

That deletion is made even before metadata transaction is tried to cimmit.
When metadata transaction is not commited or is rolled back the blobs are lost.
See the test TEST_F(DiskObjectStorageTest, RewriteFileTxUndo) with repro.

The fix is straight forward: truncation operation should be done before rewrite operation. Truncation removes remote blobs correctly, it removes them after successful metadata transaction commit.

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

rewrite disk object storage transaction removes previous remote blobs if metadata transaction is committed.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jun 13, 2025

Workflow [PR], commit [84600cf]

@clickhouse-gh clickhouse-gh bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-cloud labels Jun 13, 2025
@CheSema CheSema removed pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-cloud labels Jun 13, 2025
@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Jun 13, 2025

proof

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=81787&sha=2ec35a60576a7240b91fca93f74b660a114983dd&name_0=PR&name_1=Unit%20tests%20%28asan%29

[ RUN ] DiskObjectStorageTest.RewriteFileTxUndo
create_metadata_callback path WriteFileTxCommit_file key_ local_blob_storage_dir/ymfhclghzlwvafrtuouahbywoquimrys
create_metadata_callback on execute
MetadataStorageFromDiskTransaction createMetadataFile path WriteFileTxCommit_file object_key local_blob_storage_dir/ymfhclghzlwvafrtuouahbywoquimrys size 30
DiskObjectStorageTransaction writeFile WriteFileTxUndo_file mode Rewrite auto commit false
writeObject WriteFileTxUndo_file->(local_blob_storage_dir/jhkxweadxqajyifeyhyetixxcplhoncl):0
create_metadata_callback path WriteFileTxUndo_file key_ local_blob_storage_dir/jhkxweadxqajyifeyhyetixxcplhoncl
removeObject WriteFileTxUndo_file->(local_blob_storage_dir/jhkxweadxqajyifeyhyetixxcplhoncl):0
DiskObjectStorageTransaction writeFile RewriteFileTxUndo_file mode Rewrite auto commit true
writeObject RewriteFileTxUndo_file->(local_blob_storage_dir/ouuzffdwumcsvsghcsawcqqufeojjatj):0
MetadataStorageFromDiskTransaction createMetadataFile path RewriteFileTxUndo_file object_key local_blob_storage_dir/ouuzffdwumcsvsghcsawcqqufeojjatj size 30
DiskObjectStorageTransaction writeFile RewriteFileTxUndo_file mode Rewrite auto commit false
writeObject RewriteFileTxUndo_file->(local_blob_storage_dir/whcwaihfztixjkxadfuoergvzremlpwx):0
create_metadata_callback path RewriteFileTxUndo_file key_ local_blob_storage_dir/whcwaihfztixjkxadfuoergvzremlpwx
create_metadata_callback on execute
object_storage.removeObjectsIfExist x.size RewriteFileTxUndo_file->(local_blob_storage_dir/ouuzffdwumcsvsghcsawcqqufeojjatj)
removeObject RewriteFileTxUndo_file->(local_blob_storage_dir/ouuzffdwumcsvsghcsawcqqufeojjatj):30
MetadataStorageFromDiskTransaction createMetadataFile path RewriteFileTxUndo_file object_key local_blob_storage_dir/whcwaihfztixjkxadfuoergvzremlpwx size 40
removeObject RewriteFileTxUndo_file->(local_blob_storage_dir/whcwaihfztixjkxadfuoergvzremlpwx):0
unknown file: Failure
C++ exception with description "Cannot open file local_blob_storage_dir/ouuzffdwumcsvsghcsawcqqufeojjatj: , errno: 2, strerror: No such file or directory" thrown in the test body.

[ FAILED ] DiskObjectStorageTest.RewriteFileTxUndo (1 ms)

@CheSema CheSema force-pushed the chesema-disk-object-rewrite-file branch 2 times, most recently from 749361c to eb12ea6 Compare June 13, 2025 14:27
@CheSema CheSema changed the title add unit test when data is lost on rewrite file operation fix rewrite disk object storage transaction Jun 13, 2025
@CheSema CheSema marked this pull request as ready for review June 13, 2025 14:49
@CheSema CheSema marked this pull request as draft June 13, 2025 15:14
@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Jun 13, 2025

I need to check fast tests.
But the main part is completed here.

String local_path; /// or equivalent "metadata_path"

uint64_t bytes_size = 0;
uint64_t bytes_size = std::numeric_limits<uint64_t>::max();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be std::optional is better?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has a wide usage range as just uint64_t. Usually it is well defined because this struct represent the context of disk metadata files.
Making it optional makes this case better but the rest would be less comprehensive.

std::shared_ptr<StoredObject> object_,
ObjectStorageKey remote_key_,
WriteMode mode_,
bool do_not_write_empty_blob_)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The negation is less convenient to use, may be just like this (with reversing all current values of course):

Suggested change
bool do_not_write_empty_blob_)
bool create_blob_if_empty_)

wb->finalize();
}

DB::FailPointInjection::enableFailPoint("disk_object_storage_fail_commit_metadata_transaction");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do a similar test with failpoint but as stateless?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could.
One of my aims was make ut-test as more flexible.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having both is better, isn't it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would take a time.
Also I suspect it hits not in the public but in private functionality.

@CheSema CheSema force-pushed the chesema-disk-object-rewrite-file branch from 84600cf to 262c03a Compare June 18, 2025 14:20
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jun 18, 2025

Workflow [PR], commit [1c3db61]

Summary:

@CheSema CheSema force-pushed the chesema-disk-object-rewrite-file branch from 262c03a to 6a12319 Compare June 18, 2025 16:21
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jul 29, 2025

Dear @kssenii, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

@CheSema CheSema force-pushed the chesema-disk-object-rewrite-file branch 3 times, most recently from 06aa12b to 02de0d2 Compare August 1, 2025 23:02
@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Aug 5, 2025

This is related to public version as well.

2025.08.04 22:17:17.707686 [ 2228 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Fatal> TruncateFileObjectStorageOperation: Truncate file operation for path 'store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin' resulted in removing objects: store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin -> s3/hyq/tdajrxbdaedswveaxtezzhmmckcup

The file ParsedParams.size0.bin is rewrited several times during merge.
The column has a type ParsedParams Nested(Key1 String, Key2 String, Key3 String, Key4 String, Key5 String, ValueDouble Float64)
ParsedParams.size0.bin is rewritten each time when next field of a type is written.

2025.08.04 22:17:16.762093 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin mode Rewrite
2025.08.04 22:17:16.762128 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.cmrk2 mode Rewrite
--
2025.08.04 22:17:16.808844 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin mode Rewrite
2025.08.04 22:17:16.808874 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.cmrk2 mode Rewrite
--
2025.08.04 22:17:16.846982 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin mode Rewrite
2025.08.04 22:17:16.847018 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.cmrk2 mode Rewrite
--
2025.08.04 22:17:17.083530 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin mode Rewrite
2025.08.04 22:17:17.083563 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.cmrk2 mode Rewrite
--
2025.08.04 22:17:17.129045 [ 2218 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin mode Rewrite
2025.08.04 22:17:17.129080 [ 2218 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.cmrk2 mode Rewrite
--
2025.08.04 22:17:17.189681 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.bin mode Rewrite
2025.08.04 22:17:17.189708 [ 2225 ] {fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad::201403_1_6_1} <Information> DiskObjectStorageTransaction: DiskObjectStorageTransaction writeFile store/fcd/fcdf9f3e-cd71-4e0f-8372-182cbf57e3ad/tmp_merge_201403_1_6_1/ParsedParams.size0.cmrk2 mode Rewrite

@CheSema CheSema force-pushed the chesema-disk-object-rewrite-file branch 2 times, most recently from d6ac829 to 6d4992f Compare August 14, 2025 14:06
@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Sep 9, 2025
@CheSema CheSema requested a review from kssenii September 9, 2025 18:59
@kssenii kssenii self-assigned this Sep 10, 2025
@CheSema CheSema added this pull request to the merge queue Sep 22, 2025
Merged via the queue into master with commit b2d92b9 Sep 22, 2025
123 checks passed
@CheSema CheSema deleted the chesema-disk-object-rewrite-file branch September 22, 2025 16:55
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Sep 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants