
Fix `The range specified is invalid for the current size of the resource` when reading from azure disk with plain_rewritable and encryption #86400

Merged
jkartseva merged 12 commits into ClickHouse:master from jkartseva:fix-tde-azure
Sep 4, 2025
Conversation

@jkartseva (Member) commented Aug 29, 2025

Do not write empty blobs to Azure blob storage, which happens when creating an encrypted disk with an empty path.

When the disk is reloaded (e.g., after a pod restart), reading from an empty blob fails with `416 The range specified is invalid for the current size of the resource`:
https://pastila.nl/?000448db/8f3c31d2c5a6c3dffd32af6d6cbc547b#yLMbaEXWJBYP6pz3CmHKrg==

Additionally, skip reading from empty blobs for objects that already exist.

The issue did not reproduce with Azurite, so I added a test that uses openbucketforpublicci Azure bucket.
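The fix described above boils down to two guards, sketched here with hypothetical function names (a simplified model, not the actual ClickHouse implementation):

```cpp
#include <cstddef>
#include <string>

// Hedged sketch of the two guards; names are hypothetical and do not
// match the real code in ClickHouse's disk layer.

// Write side: creating an encrypted disk with an empty path produces an
// empty normalized directory path. Writing a marker blob for it would
// create a zero-byte object in Azure blob storage.
bool shouldWriteDirectoryMarker(const std::string & normalized_path)
{
    return !normalized_path.empty();
}

// Read side: a ranged GET on a zero-byte blob fails with HTTP 416, so a
// zero-size object must be skipped instead of read.
bool shouldReadBlob(std::size_t size_bytes)
{
    return size_bytes != 0;
}
```

With both guards in place, no empty blob is written on disk creation, and any empty blob left over from an older version no longer triggers the 416 on reload.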

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix `416 The range specified is invalid for the current size of the resource` when reading empty blobs from Azure blob storage for plain_rewritable disk.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@jkartseva jkartseva added the can be tested Allows running workflows for external contributors label Aug 29, 2025
@clickhouse-gh bot (Contributor) commented Aug 29, 2025

Workflow [PR], commit [1be042e]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Aug 29, 2025
auto normalized_path = normalizeDirectoryPath(path);
if (normalized_path.empty())
{
    LOG_TRACE(getLogger("MetadataStorageFromPlainObjectStorageTransaction"), "Skipping creation of a directory with an empty path");
}
Member:

Should we throw an exception instead?

And can we print the original path, so we know if something is wrong with normalizeDirectoryPath?

Member Author:

I essentially moved the check to src/Disks/DiskEncrypted.cpp

Member Author:

> Should we throw an exception instead?
>
> I essentially moved the check to src/Disks/DiskEncrypted.cpp

Reverting that, because it has some undesired side effects, e.g.:

tests/queries/0_stateless/03362_merge_tree_with_background_refresh.sh

[ip-172-31-8-86] 2025.08.30 03:18:34.873339 [ 1335461 ] {56d32d72-96e6-48eb-b4dd-7a0f6fd04002} <Error> executeQuery: Code: 281. DB::Exception: Directory path '' is empty. (DICTIONARY_IS_EMPTY) (version 25.9.1.1) (from [::ffff:127.0.0.1]:61872) (query 1, line 2) (in query: CREATE TABLE writer (`s` String) ORDER BY tuple() SETTINGS table_disk = true, disk = disk(name = `03362_writer_test`, type = object_storage, object_storage_type = '[HIDDEN]', metadata_type = '[HIDDEN]', path = '[HIDDEN]')), Stack trace (when copying this message, always include the lines below):

0. ./contrib/llvm-project/libcxx/include/__exception/exception.h:113: Poco::Exception::Exception(String const&, int) @ 0x000000001f7bce32
1. ./build.release/./src/Common/Exception.cpp:128: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000fad999e
2. DB::Exception::Exception(String&&, int, String, bool) @ 0x000000000904a38e
3. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x0000000009049f80
4. DB::Exception::Exception<String const&>(int, FormatStringHelperImpl<std::type_identity<String const&>::type>, String const&) @ 0x000000000a11162b
5. ./build.release/./src/Disks/ObjectStorages/MetadataStorageFromPlainObjectStorage.cpp:297: DB::MetadataStorageFromPlainObjectStorageTransaction::createDirectory(String const&) @ 0x0000000015052943
6. ./build.release/./src/Disks/ObjectStorages/DiskObjectStorageTransaction.cpp:708: void std::__function::__policy_invoker<void (std::shared_ptr<DB::IMetadataTransaction>)>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<DB::DiskObjectStorageTransaction::createDirectories(String const&)::$_0, void (std::shared_ptr<DB::IMetadataTransaction>)>>(std::__function::__policy_storage const*, std::shared_ptr<DB::IMetadataTransaction>&&) @ 0x0000000015018de7
7. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x0000000015018c2f
8. ./build.release/./src/Disks/ObjectStorages/DiskObjectStorageTransaction.cpp:1067: DB::DiskObjectStorageTransaction::commit(std::variant<std::monostate, DB::MetaInKeeperCommitOptions<zkutil::ZooKeeper>, DB::MetaInKeeperCommitOptions<DB::ZooKeeperWithFaultInjection>> const&) @ 0x0000000015017e8f
9. ./build.release/./src/Disks/ObjectStorages/DiskObjectStorage.cpp:389: DB::DiskObjectStorage::createDirectories(String const&) @ 0x0000000015005775
10. ./build.release/./src/Storages/MergeTree/MergeTreeData.cpp:409: DB::MergeTreeData::initializeDirectoriesAndFormatVersion(String const&, bool, String const&, bool) @ 0x000000001903f500
11. ./build.release/./src/Storages/StorageMergeTree.cpp:170: DB::StorageMergeTree::StorageMergeTree(DB::StorageID const&, String const&, DB::StorageInMemoryMetadata const&, DB::LoadingStrictnessLevel, std::shared_ptr<DB::Context>, String const&, DB::MergeTreeData::MergingParams const&, std::unique_ptr<DB::MergeTreeSettings, std::default_delete<DB::MergeTreeSettings>>) @ 0x0000000019000a9c
12. ./contrib/llvm-project/libcxx/include/__memory/construct_at.h:41: std::shared_ptr<DB::StorageMergeTree> std::allocate_shared[abi:ne190107]<DB::StorageMergeTree, std::allocator<DB::StorageMergeTree>, DB::StorageID const&, String const&, DB::StorageInMemoryMetadata&, DB::LoadingStrictnessLevel const&, std::shared_ptr<DB::Context>&, String&, DB::MergeTreeData::MergingParams&, std::unique_ptr<DB::MergeTreeSettings, std::default_delete<DB::MergeTreeSettings>>, 0>(std::allocator<DB::StorageMergeTree> const&, DB::StorageID const&, String const&, DB::StorageInMemoryMetadata&, DB::LoadingStrictnessLevel const&, std::shared_ptr<DB::Context>&, String&, DB::MergeTreeData::MergingParams&, std::unique_ptr<DB::MergeTreeSettings, std::default_delete<DB::MergeTreeSettings>>&&) @ 0x000000001941a5f6
13. ./contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:851: DB::create(DB::StorageFactory::Arguments const&) @ 0x000000001941720d
14. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x000000001897a6fb
15. ./build.release/./src/Interpreters/InterpreterCreateQuery.cpp:1951: DB::InterpreterCreateQuery::doCreateTable(DB::ASTCreateQuery&, DB::InterpreterCreateQuery::TableProperties const&, std::unique_ptr<DB::DDLGuard, std::default_delete<DB::DDLG26. start_thread @ 0x0000000000094ac3
27. __GI___clone3 @ 0x0000000000126850
 (version 25.9.1.1)

Leaving it as a no-op.
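A self-contained sketch of that no-op branch, extended to log the original path as suggested in the review; `normalizeDirectoryPath` below is a stand-in stub for illustration, not the real ClickHouse function:

```cpp
#include <cstdio>
#include <string>

// Stub standing in for ClickHouse's real path normalization,
// for illustration only.
std::string normalizeDirectoryPath(const std::string & path)
{
    return path.empty() ? path : path + "/";
}

// Hypothetical shape of the guard: a no-op (rather than an exception,
// which broke table creation as shown in the stack trace above) that
// also logs the original path to help diagnose normalizeDirectoryPath
// issues.
bool createDirectoryMarkerIfNonEmpty(const std::string & path)
{
    auto normalized_path = normalizeDirectoryPath(path);
    if (normalized_path.empty())
    {
        std::printf("Skipping creation of a directory with an empty path (original path: '%s')\n", path.c_str());
        return false;
    }
    // ... write the directory marker object here ...
    return true;
}
```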

auto read_buf = object_storage->readObject(object, settings);
readStringUntilEOF(local_path, *read_buf);
if (metadata->size_bytes == 0)
    LOG_TRACE(log, "The object with the key '{}' has size 0, skipping the read", remote_metadata_path);
Member:

If we're skipping the read, should we return here?

Member Author:

In theory (though unlikely), top-level objects can be created for an empty directory. No return ensures that they are read and added to the in-memory map.
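The point about not returning early can be sketched as follows (a simplified model with hypothetical names): even when the body read is skipped for a zero-size blob, the key still has to land in the in-memory map.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Simplified model of the load path; names are hypothetical.
std::map<std::string, std::string> path_map;

void loadObject(const std::string & key, std::size_t size_bytes, const std::string & body)
{
    std::string contents;
    if (size_bytes != 0)
        contents = body;  // stands in for the actual blob read
    // No early return: the key is registered in the in-memory map even
    // when the blob is empty, so top-level objects created for empty
    // directories are still tracked.
    path_map[key] = contents;
}
```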

@jkartseva (Member Author):

Cache test failures are related.

@tuanpach (Member) commented Sep 1, 2025

There is something wrong with the upgrade check

Started setup_minio.sh asynchronously with PID 28
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
mc: <ERROR> Unable to list folder. Requested path `/repo/clickminio` not found
ERROR: command failed, exit code: 1, retry: 0/1
Run command: [/mc ls clickminio/test | grep -q .]
+ clickhouse start --user root
WARNING:root:Timeout exceeded. Send SIGTERM to process 2220, timeout 3600
WARNING:root:Wait the process 2220 to terminate
Run action failed for: [Upgrade check (amd_asan)] with exit code [-15]
Job timed out: [Upgrade check (amd_asan)] exit code [-15]
ERROR: Run action failed with timeout and did not generate JobReport - update dummy report with execution time
Command `python3 /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/tests/ci/upgrade_check.py` has failed, timeout 3600s is exceeded
Run action done for: [Upgrade check (amd_asan)]
ERROR: Job was killed - generate evidence
INFO:botocore.credentials:Found credentials from IAM Role: ec2_admin
Posting slack message, dry_run [False]
{}
ERROR: Run failed with exit cod

@jkartseva (Member Author):

The tests/config/install.sh changes should not have an impact; merging master to see if it reproduces.

@jkartseva (Member Author):

It looks like the issue this PR addresses reproduced for the 25.8 release because of the new tests/config/config.d/storage_conf_03602.xml config, which is why the upgrade checks failed.

INFO:root:Will download v25.8.1.5101-lts
INFO:build_download_helper:Downloading from https://github.com/ClickHouse/ClickHouse/releases/download/v25.8.1.5101-lts/clickhouse-client_25.8.1.5101_amd64.deb to temp path previous_release_package_folder/clickhouse-client_25.8.1.5101_amd64.deb
2025.09.01 20:56:28.202644 [ 84360 ] {} <Information> FileCache(disk_cache_03517): Loading filesystem cache with 16 threads from /var/lib/clickhouse/caches/disks/cache_03517/
2025.09.01 20:56:28.203444 [ 84360 ] {} <Information> DiskCache: Registered cached disk (`disk_cache_03517`) with structure: DiskObjectStorage-disk_s3_plain_rewritable_03517(CachedObjectStorage-disk_cache_03517(PlainRewritableS3ObjectStorage))
2025.09.01 20:56:28.212478 [ 84360 ] {} <Debug> deleteFileFromS3: Object with path common/kmhnpqjtnxhfugtqjxtivxeubhfntaen/clickhouse_access_check_4b2148cd-bcac-4516-8f2d-95c2f88d043d was removed from S3
2025.09.01 20:56:28.216584 [ 84360 ] {} <Debug> MetadataStorageFromPlainObjectStorage: Loading metadata
2025.09.01 20:56:28.243529 [ 85044 ] {} <Debug> ReadBufferFromAzureBlobStorage: Exception caught during Azure Download for file __meta/awzdrmwzlaacnlbzfxuclhjmvormzwlb/prefix.path at offset 0 at attempt 1/4: The range specified is invalid for the current size of the resource.
RequestId:a028bd9a-701e-00c1-1672-1bb7f1000000
Time:2025-09-01T18:56:28.2413059Z
2025.09.01 20:56:28.251484 [ 85044 ] {} <Debug> ReadBuffer: ReadBuffer is canceled by the exception: std::exception. Code: 1001, type: Azure::Storage::StorageException, e.what() = 416 The range specified is invalid for the current size of the resource.
The range specified is invalid for the current size of the resource.
RequestId:a028bd9a-701e-00c1-1672-1bb7f1000000
Time:2025-09-01T18:56:28.2413059Z
Request ID: a028bd9a-701e-00c1-1672-1bb7f1000000, Stack trace (when copying this message, always include the lines below):

0. ./contrib/llvm-project/libcxx/include/__memory/compressed_pair.h:49: Azure::Storage::StorageException::CreateFromResponse(std::unique_ptr<Azure::Core::Http::RawResponse, std::default_delete<Azure::Core::Http::RawResponse>>) @ 0x000000001f8d7aed
1. ./ci/tmp/build/./contrib/azure/sdk/storage/azure-storage-blobs/src/rest_client.cpp:3710: Azure::Storage::Blobs::_detail::BlobClient::Download(Azure::Core::Http::_internal::HttpPipeline&, Azure::Core::Url const&, Azure::Storage::Blobs::_detail::BlobClient::DownloadBlobOptions const&, Azure::Core::Context const&) @ 0x000000001f8b66bd
2. ./ci/tmp/build/./contrib/azure/sdk/storage/azure-storage-blobs/src/blob_client.cpp:213: Azure::Storage::Blobs::BlobClient::Download(Azure::Storage::Blobs::DownloadBlobOptions const&, Azure::Core::Context const&) const @ 0x000000001f880996
3. ./ci/tmp/build/./src/Disks/IO/ReadBufferFromAzureBlobStorage.cpp:254: DB::ReadBufferFromAzureBlobStorage::initialize(unsigned long) @ 0x0000000016381873
4. ./ci/tmp/build/./src/Disks/IO/ReadBufferFromAzureBlobStorage.cpp:99: DB::ReadBufferFromAzureBlobStorage::nextImpl() @ 0x000000001637fdd9
5. ./ci/tmp/build/./src/IO/ReadBuffer.cpp:96: DB::ReadBuffer::next() @ 0x000000001348cb16
6. ./src/IO/ReadBuffer.h:81: DB::readStringUntilEOF(String&, DB::ReadBuffer&) @ 0x00000000134a18b7
7. ./ci/tmp/build/./src/Disks/ObjectStorages/MetadataStorageFromPlainRewritableObjectStorage.cpp:162: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<DB::MetadataStorageFromPlainRewritableObjectStorage::load(bool)::$_0, void ()>>(std::__function::__policy_storage const*) @ 0x00000000172b6ec2
8. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x0000000014ce2891
9. ./src/Common/threadPoolCallbackRunner.h:183: DB::ThreadPoolCallbackRunnerLocal<void, ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>, std::function<void ()>>::operator()(std::function<void ()>&&, Priority, std::optional<unsigned long>)::'lambda'()::operator()() @ 0x0000000014ce265f
10. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x00000000134fcf6b
11. ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*>(void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x00000000135042e6
12. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x00000000134f9f52
13. ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void* std::__thread_proxy[abi:ne190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x0000000013501a1a
14. ? @ 0x0000000000094ac3
15. ? @ 0x0000000000126850
 (version 25.8.1.5101 (official build))

@clickhouse-gh clickhouse-gh bot added pr-bugfix Pull request with bugfix, not backported by default and removed pr-not-for-changelog This PR should not be mentioned in the changelog labels Sep 2, 2025
@jkartseva (Member Author):

I'll add the config in a separate PR after the fix is merged to 25.8.

@jkartseva jkartseva added this pull request to the merge queue Sep 4, 2025
Merged via the queue into ClickHouse:master with commit 52d46b3 Sep 4, 2025
239 of 240 checks passed
@jkartseva jkartseva deleted the fix-tde-azure branch September 4, 2025 01:10
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Sep 4, 2025
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Sep 4, 2025
robot-ch-test-poll1 added a commit that referenced this pull request Sep 4, 2025
Cherry pick #86400 to 25.8: Fix `The range specified is invalid for the current size of the resource` when reading from azure disk with plain_rewritable and encryption
robot-clickhouse added a commit that referenced this pull request Sep 4, 2025
…current size of the resource` when reading from azure disk with plain_rewritable and encryption
@robot-clickhouse robot-clickhouse added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Sep 4, 2025
clickhouse-gh bot added a commit that referenced this pull request Sep 4, 2025
Backport #86400 to 25.8: Fix `The range specified is invalid for the current size of the resource` when reading from azure disk with plain_rewritable and encryption
@jkartseva jkartseva added v25.6-must-backport and removed pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Sep 8, 2025
@robot-ch-test-poll2 robot-ch-test-poll2 added pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Sep 9, 2025
robot-ch-test-poll added a commit that referenced this pull request Sep 9, 2025
Cherry pick #86400 to 25.6: Fix `The range specified is invalid for the current size of the resource` when reading from azure disk with plain_rewritable and encryption
robot-clickhouse added a commit that referenced this pull request Sep 9, 2025
…current size of the resource` when reading from azure disk with plain_rewritable and encryption
@jkartseva jkartseva removed pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-backports-created-cloud deprecated label, NOOP labels Sep 9, 2025
@robot-ch-test-poll1 robot-ch-test-poll1 added pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Sep 9, 2025
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Sep 9, 2025
jkartseva added a commit that referenced this pull request Sep 10, 2025
Backport #86400 to 25.6: Fix `The range specified is invalid for the current size of the resource` when reading from azure disk with plain_rewritable and encryption

Labels

  • can be tested: Allows running workflows for external contributors
  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-backports-created-cloud: deprecated label, NOOP
  • pr-bugfix: Pull request with bugfix, not backported by default
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo
  • v25.6-must-backport
  • v25.8-must-backport
