Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the new Azure SDK for C++. #3910

Merged
merged 40 commits into from
Apr 23, 2023
Merged

Use the new Azure SDK for C++. #3910

merged 40 commits into from
Apr 23, 2023

Conversation

teo-tsirpanis
Copy link
Member

@teo-tsirpanis teo-tsirpanis commented Feb 17, 2023

Note This PR is not ready to merge; the acquisition of the new SDK is very clumsy and works only on Windows. But it is ready for review.

To communicate with Azure Storage we have been using the Azure C++ Storage Client Lite which is no longer in active development. This PR switches the Azure VFS to use the new Azure SDK for C++. The dependency to the old SDK is removed.

The migration is mostly a straightforward change from the old SDK calls to the new ones. The new SDK does not use futures, simplifying the code. As a bonus, we now pass more detailed error messages. Some stuff around configuration will now work differently, and I added relevant warnings.

Azure unit tests succeed on my machine. Let's see what CI thinks.

SC-21996
SC-25039
SC-25045
SC-25046


TYPE: IMPROVEMENT
DESC: Use the new Azure SDK for C++.

@teo-tsirpanis teo-tsirpanis changed the title [NO MERGE] Use the maintained Azure SDK for C++. [NO MERGE] Use the new Azure SDK for C++. Feb 17, 2023
@teo-tsirpanis
Copy link
Member Author

I think I found the problem. For some reason the BlobServiceClient's endpoint URI gets reset (or we create one with unset parameters).

@jabbera
Copy link

jabbera commented Mar 7, 2023

Does this mean the azure backend will now be supported on Windows?

@ihnorton
Copy link
Member

ihnorton commented Mar 7, 2023

@jabbera, yes - @teo-tsirpanis develops primarily on Windows, so it will definitely be supported there.

@teo-tsirpanis teo-tsirpanis changed the title [NO MERGE] Use the new Azure SDK for C++. Use the new Azure SDK for C++. Mar 8, 2023
@teo-tsirpanis
Copy link
Member Author

Well, on Windows CI vcpkg does not run at all. 🥲

@teo-tsirpanis
Copy link
Member Author

CI on Linux seems to fail because the Azurite version we use is old.

@teo-tsirpanis
Copy link
Member Author

I think I know why vcpkg doesn't get imported on Windows; we are setting CMAKE_TOOLCHAIN_FILE in the workflow through an environment variable.

@ihnorton
Copy link
Member

ihnorton commented Mar 8, 2023

I think I know why vcpkg doesn't get imported on Windows; we are setting CMAKE_TOOLCHAIN_FILE in the workflow through an environment variable.

Ugh. Good catch. That was used to work around this problem. Fix: #3949

@teo-tsirpanis
Copy link
Member Author

teo-tsirpanis commented Mar 22, 2023

The [azure] unit tests succeed on a premium block blob storage account on the real Azure cloud.

I kept a branch with the older Windows-only way of acquiring the Azure SDK that works on my machine, and tested it from there.

The storage browser of the TileDB test azure bucket. It shows some files existing, that were created from the unit tests.

@ihnorton ihnorton requested a review from shaunrd0 March 22, 2023 15:45
Copy link
Contributor

@shaunrd0 shaunrd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! 👍 I was able to run the UTs locally against Azurite and all of them passed. I ran into some problems with a debug build, and noticed the Azure tests take a long time to retry requests up to failure.

tiledb/sm/cpp_api/config.h Outdated Show resolved Hide resolved
cmake/TileDB-Superbuild.cmake Show resolved Hide resolved
client_ = make_shared<azure::storage_lite::blob_client>(
HERE(), account, thread_pool_->concurrency_level(), cert_file);
::Azure::Storage::Blobs::BlobClientOptions options;
options.Retry.MaxRetries = constants::azure_max_attempts;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the following options here, after trying to run UTs and hanging for some time..

  options.Retry.RetryDelay = std::chrono::milliseconds(100);
  options.Retry.MaxRetryDelay = std::chrono::milliseconds(1000);

Azure SDK will increase the RetryDelay up to MaxRetryDelay for-each retry.. The default options seem to allow for up to 120s of hang time for a request retry. With the retry count set to 10, failing this request 10 times using these defaults could take quite a while. The C++ docs offer very little information on this 😕 so the other link here is from JS docs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shaunrd0 and I decided to use the default Azure SDK retry policy (three retries starting from 800ms and growing exponentially).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make these config options with defaults? We should also standardize the timing and retry count with what we have for AWS if we can, will just make it simplier to document and explain to users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added config options. Standardizing across all cloud providers is hard because each SDK allows configuring the retry policies differently and with varying degrees of freedom.

tiledb/sm/filesystem/azure.h Show resolved Hide resolved
tiledb/sm/filesystem/azure.cc Outdated Show resolved Hide resolved
@teo-tsirpanis
Copy link
Member Author

@davisp do I have to copy the new Azure SDK ports to the ports folder? What is the procedure for this? A simple copy-paste caused build errors.

@davisp
Copy link
Contributor

davisp commented Apr 4, 2023

@teo-tsirpanis Can you point me at the version you were trying to use? It shouldn't be much more than a copy/paste but the version resolution ends up being mostly manual because I've used super old port versions to match our existing Find${project}_EP.cmake configuration. There might be quite a bit of finagling on what new azure might force us to upgrade.

@teo-tsirpanis

This comment was marked as off-topic.

@ihnorton
Copy link
Member

CI fixes in #4055

@ihnorton ihnorton merged commit 969dd1c into TileDB-Inc:dev Apr 23, 2023
45 of 47 checks passed
@teo-tsirpanis teo-tsirpanis deleted the azure branch April 23, 2023 13:26
ihnorton pushed a commit that referenced this pull request Jun 16, 2023
A follow-up to #3910, this PR fixes typos, improves a message, and infers the blob endpoint from `vfs.azure.storage_account_name` if `vfs.azure.blob_endpoint` was not specified, matching the behavior of the previous SDK.

I also added `CONFIG` changelog entries that I forgot to do in #3910.

---
TYPE: IMPROVEMENT
DESC: Improvements to the Azure VFS.

TYPE: CONFIG
DESC: Add vfs.azure.max_retries, vfs.azure.retry_delay_ms and vfs.azure.max_retry_delay_ms options to control the retry policy when connecting to Azure.

TYPE: CONFIG
DESC: Remove the vfs.azure.use_https option; connecting to Azure with HTTP will require specifying the scheme in vfs.azure.blob_endpoint instead.

===

* Remove unnecessary headers.

* Fix typos and improve a message.

* Construct the blob endpoint from the account name if not specified.

* Bring back `vfs.azure.storage_sas_token`.

* Append the SAS token to the blob endpoint if specified.

This keeps backwards compatibility with the previous Azure SDK.

* Update documentation of config values.

* Fix typo.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants