Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloud_storage: add Azure Blob Storage Support for Tiered Storage #8086

Merged
merged 22 commits into from
Jan 19, 2023

Conversation

VladLazar
Copy link
Contributor

@VladLazar VladLazar commented Jan 6, 2023

Cover Letter

Fixes https://github.com/redpanda-data/core-internal/issues/127
Fixes #7556

High Level Description

This set of patches adds Azure Blob Storage support to Redpanda Tiered Storage. In theory, everything that works with S3 should also be supported in ABS as all RPCs used during normal operation are implemented.

Configuration

ABS can be configured via the three cluster level configuration properties listed below. If cloud_storage_azure_storage_account is set, then all of them need to be set (otherwise Redpanda will refuse to start).
Note that if you wish to disable TLS for testing via cloud_storage_disable_tls you need to override the port via cloud_storage_api_endpoint_port. I'll attach an example config file to this PR.

  • cloud_storage_azure_storage_account: the name of your storage account
  • cloud_storage_azure_container: the name of the container
  • cloud_storage_azure_shared_key: Shared Key provided by Azure

Implementation Details

This PR can be split in a few high level logical stages:

  1. Preliminaries: small tidy-ups, adjusting the client probe, adding ABS error codes
  2. ABS client implementation
  3. Configuration options and hooking the ABS client in
  4. Changes in the test tree to get ABS tests running in CI against Azurite

The general idea is that an instance of the cloud_storage_clients::client_configuration variant is created at start-up
and threaded into the relevant sharded service. It eventually reaches cloud_storage_clients::client_pool where it drives the type of client that's created.

The ABS client itself is designed in a similar way to the S3 client. It handles most exceptions locally and returns a recommendation to the remote on how to proceed (retry, slow down, give up, etc.).

Testing

The idea is to leverage the existing tiered storage tests and run them against ABS. This PR enables a number
of tiered storage tests to run with ABS in CI, but the list is not exhaustive. I've also been testing this locally and
it seems pretty solid thus far.

We also need to:

  • Run tests against real Azure in CDT
  • Find the write throughput limit from which we start getting throttled.

Backports Required

  • none - not a bug fix
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

UX Changes

Release Notes

Features

  • Redpanda gains support for using Azure Blob Storage with Tiered Storage. In order to point Redpanda at your ABS container, set the following new cluster configs: cloud_storage_azure_storage_account, cloud_storage_azure_container, cloud_storage_azure_shared_key.

Improvements

  • The existing vectorized_s3_client.* metrics have been renamed to vectorized_cloud_client.* in order to match the naming on the public_metrics endpoint.

@VladLazar VladLazar force-pushed the abs-part-2 branch 3 times, most recently from 8a38a7c to 9086235 Compare January 11, 2023 17:35
@VladLazar VladLazar marked this pull request as ready for review January 11, 2023 17:42
@VladLazar VladLazar requested a review from a team as a code owner January 11, 2023 17:42
@VladLazar VladLazar requested review from ivotron, jcsp and Lazin and removed request for a team, ivotron and jcsp January 11, 2023 17:42
@VladLazar VladLazar added the area/cloud-storage Shadow indexing subsystem label Jan 11, 2023
from typing import Optional


def key_to_topic(self, key: str) -> Optional[str]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer: I would've liked to move this to rptest.utils.si_utils, but it would have introduced a circular dependency between the clients and said module.

src/v/cloud_storage_clients/abs_client.h Show resolved Hide resolved

namespace cloud_storage_clients {

// NOLINTNEXTLINE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for links, this would make sense as a comment in the code too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I've added a comment too.

.get();
client.stop().get();
vlog(test_log.info, "done");
ss::sleep(std::chrono::seconds(1)).get();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewer: the destructor for seastar's tls::session detaches a background thread to end the connection nicely. The sleep gives that thread a chance to join. See https://github.com/scylladb/seastar/blob/master/src/net/tls.cc#L1472.

Comment on lines +301 to +313
self.cloud_storage_azure_shared_key = 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=='
self.cloud_storage_azure_storage_account = 'devstoreaccount1'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewer: By default, Azurite uses devstoreaccount1 with the specified shared key. Both are readily available in the docs.

@VladLazar VladLazar force-pushed the abs-part-2 branch 3 times, most recently from 7ad3571 to b8dace1 Compare January 12, 2023 13:53
@@ -344,7 +344,6 @@ ss::future<upload_result> remote::upload_segment(
if (res) {
_probe.successful_upload();
_probe.register_upload_size(content_length);
co_await reader_handle->close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reviewers you need to look up above further in the file to see the redundancy

Comment on lines 261 to 310
const bucket_name& bucket,
const object_key& key) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer to pass values to coroutines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This is my biggest gripe with the interface for the S3 client. If we changed to pass-by-value for these two strings, we'd end up making quite a few copies (we can't always move these args in).

As far as I understand this is safe in two scenarios. I'll outline them below, but please keep me honest here:

  1. The client functions get called from a chain of co_await(s) and all the args used in the chain are local to the callsite, have a longer lifetime than the duration of the whole chain, or are created in the call itself. This is the case in the existing non-test code. cloud_storage::remote is the only caller of the clients (outside of the test code) and all calls to it are co_await(ed).

For example:

constexpr bucket_name = "bucket";

foo() {
auto path = gen_path(...);
co_await _remote.upload_segment(
  bucket_name,
  path,
  created_here{...},
  ...);
}

remote::upload_segment(const auto& bucket, const auto& key, ...) {
  co_await abs_client.put_object(bucket, key, ...); 
}
  1. In our test code, we make calls on the remote without using co_await. For example, the code below will end up calling into the client.
remote.delete_object(bucket, object_key(path), ...).get();

I think what keeps this safe is a combination const-lifetime extension to keep the temporaries alive until a co_await is reached, seastar threads, and seastar's chaining of continuations. I've not analysed these as well, so I'm a bit fuzzy here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct about these cases. the nice thing about coroutines, i think, is that if that's all that is used, then everything is pretty safe (although cpp core guideline does say to only pass values to coroutines). in redpanda, however, we have a mixture of coroutines and continuation passing style. the problem then becomes: all of the call sites need to be examined to tell if a coroutine is used safely, because mixing those styles is error prone.

If we changed to pass-by-value for these two strings, we'd end up making quite a few copies

what does "quite a few copies" mean? a client that is ostensibly making encrypted http network requests, it seems that copying some strings would be totally lost in the noise.


anyway, nothing to do here in this PR related to this topic, but something for us to consider. like coroutine lambdas, i'd expect that if we start to see lifetime problems that aren't caught in review, that we'll think about getting rid of the pattern.

@emaxerrno
Copy link
Contributor

@VladLazar & @jcsp -

How come we have one config per cloud

cloud_storage_azure_storage_account: the name of your storage account
cloud_storage_azure_container: the name of the container
cloud_storage_azure_shared_key: Shared Key provided by Azure

why not use

cloud_storage_storage_account: the name of your storage account
cloud_storage_container: the name of the container
cloud_storage_shared_key: Shared Key provided by Azure

or a more general scheme. is it because you have multiple schemes ? if so is fine, just trying to understand.

Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

watch out for coroutines that take references. it's a use-after-free footgun that is hard to reivew (all call paths have to be evaluated), and usually ends up causing segfault when it happens in release.

@VladLazar
Copy link
Contributor Author

VladLazar commented Jan 13, 2023

@VladLazar & @jcsp -

How come we have one config per cloud

cloud_storage_azure_storage_account: the name of your storage account
cloud_storage_azure_container: the name of the container
cloud_storage_azure_shared_key: Shared Key provided by Azure

why not use

cloud_storage_storage_account: the name of your storage account
cloud_storage_container: the name of the container
cloud_storage_shared_key: Shared Key provided by Azure

or a more general scheme. is it because you have multiple schemes ? if so is fine, just trying to understand.

One of the requests in the PRD is to have Azure "friendly" configuration properties (i.e. not overload the existing cluster configuration properties used for S3). Some of the existing S3 cluster configs don't map well to the Azure domain (e.g. it would be confusing to use cloud_storage_bucket to specify the ABS container). With this in mind, I thought using separate configs and prefixing them with azure will be clearest for the user. Also, when adding Active Directory auth for ABS, we'll have to add more cluster configs that don't map to anything in the S3 world.

In any case, I think Tiered Storage configuration is pretty complex for the end user. It would be very nice to have an RPK guided flow for this. RPK could ask which cloud storage solution you wish to use and then guide you through providing all the required configs on the terminal.

Vlad Lazar added 9 commits January 18, 2023 15:25
`ss::lowres_clock::duration` is an 8 byte integer, so there's no benefit
to passing it by reference. The `body` should always be moved in and
it's safer to have it as a value for use in coroutines.
This commit updates the 'util::drain_response_stream' to check the
response header in order to figure out if the response is chunked, and,
in that case, drain it accordingly.

The logic for draining chunked HTTP responses has also been lifted from
the s3 client.
This commit prepares the generic cloud storage client probe to be used
with the upcoming ABS client. The existing "vectorized_s3_client.*"
metrics have also been renamed to "vectorized_cloud_client.*".
This commit extends the client probe, which is shared by both the S3 and
ABS clients, with a method that can register ABS RPC failures.
This commit introduces a new cloud storage client configuration for abs:
'cloud_storage_clients::abs_configuration'. Note that the type is not
wired into the 'cloud_storage_clients::client_configuration' variant
until later on.
This commit adds a new implementation of 'cloud_storage_client::client'
for Azure Blob Storage. It currently supports the following requests:
Get Blob, Put Blob, Get Blob Metadata, and List Container.

The design is similar to the existing S3 client. Client methods that
send RPCs can only 'abort_requested_exception' and handle all other
exceptions internally (see 'abs_client::send_request').
@VladLazar
Copy link
Contributor Author

Changes in force-push:

  • Removed config for storage api version and lifted the hardcoded value to a more proeminent place. See fde265e.
  • Added a couple of metrics I promised in the RFC. See 23436cd.

@VladLazar VladLazar requested review from Lazin and jcsp January 18, 2023 15:29
Vlad Lazar added 5 commits January 18, 2023 17:12
This commit introduces a few cluster configs for pointing Redpanda to an
Azure container. Note that changing 'cloud_storage_azure_shared_key'
requires a restart at the moment. This should be revisited to allow for
key rotation.
This commit changes the cloud storage client initialisation to create
different client types based on the cluster configuration. Namely, when
'cloud_storage_azure_storage_account' is provided with a value, Redpanda
will try to use Azure Blob Storage instead of S3.

The two most significant code changes in this commit are:
1. Updates to how the configuration objects that drive tiered storage
to make them aware of different clouds ('cloud_storage::configuration'
and 'archival::configuration')
2. Handling of ABS client configuration
('cloud_storage_clients::abs_configuration') in the remote and, most
importantly, in the client pool to create the ABS client.
This commit updates the configuration check performed when cloud storage
is enabled. It now takes the posibility of configuring ABS into account
and returns custom error messages.
* Rename S3ObjectMetadata to ObjectMetadata and convert members to
  snake_case
* Remove 'list_buckets' method from S3 client and RedpandaService
  as the later is not used anywhere.
* Extract _key_to_topic into a separate module to reuse in the ABS
  python client
Vlad Lazar added 7 commits January 19, 2023 14:26
This commit modifies 'RedpandaService' to use different cloud storage
providers based on the 'cloud_storage_type' parameter injected in the
test context.

The logical starting point for this commit is the change to SISettings.
It now acceps a 'TestContext' as it's first argument and inspects in
order to determine if it should use S3 or ABS for the current test.
A new service is added to the Docker environment used for ducktape
testing. It's called azurite and it runs a storage emulator for Azure
Blob Storage.
This commit introduces a python client for ABS that will be used within
ducktape tests. It provides the same functionality as the pre-existing
'S3Client' via duck typing.
This commit parametrises a number of tests that use shadow indexing to
run against both S3 and ABS. This is not an exhaoustive list, but it
should be a good start.
This commit introduces two new metrics related to requests backing off
after being instructed to do so by the cloud storage provider.

New metrics:
* redpanda_cloud_client_upload_backoff:
    * Description: Total number of upload requests that backed off
    * S3 labels: redpanda_endpoint, redpanda_region
    * ABS labels: redpanda_endpoint, redpanda_storage_account
* redpanda_cloud_client_download_backoff:
    * Description: Total number of download requests that backed off
    * S3 labels: redpanda_endpoint, redpanda_region
    * ABS labels: redpanda_endpoint, redpanda_storage_account

Renamed metric:
redpanda_cloud_client_throttled -> redpanda_cloud_client_backoff
This commit marks CDT tests that are running against the "wrong cloud"
as OFAIL. By "wrong cloud" we mean that the globals are configured for
AWS, but the test is parametrised for ABS (and the other way around).
@VladLazar
Copy link
Contributor Author

Change in force-push:

  • Make python ABS client work in CDT contexts
  • Mark tests running against the wrong cloud as OFAIL.

@jcsp jcsp merged commit da29ac9 into redpanda-data:dev Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cloud-storage Shadow indexing subsystem area/redpanda
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Azure Blob Storage via Azurite in CI runs
5 participants