Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORE-427: Azure AKS OIDC federation / Azure VM user-assigned managed identity support #17157

Merged
merged 15 commits into from
Mar 27, 2024

Conversation

andijcr
Copy link
Contributor

@andijcr andijcr commented Mar 18, 2024

AKS

ASK managed identities are implemented with cloud_role/refresh_credentials implementations
azure_aks_refresh_impl.cc

  • new credentials implementation apply_abs_oauth_credentials.cc

azure_aks_refresh_impl.cc reads these environment variables

AZURE_CLIENT_ID
AZURE_TENANT_ID
AZURE_FEDERATED_TOKEN_FILE
AZURE_AUTHORITY_HOST

to compose a POST request to retrieve an authorization token that azure_aks_refresh_impl.cc will use to authorize calls to ABS.

The env variables are set up directly by AKS.

Azure VM

User-assigned managed identity via IMDS on a azure VM is implemented in azure_vm_refresh_impl
it requires setting cloud_storage_azure_managed_identity_id in configs to the user-assigned managed identity client_id.
The implementation will perform a GET request to the instance metadata service to retrieve an authorization token

Fixes https://github.com/redpanda-data/core-internal/issues/1125
Fixes https://github.com/redpanda-data/core-internal/issues/1126

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

Features

  • AKS OIDC Federation support
  • Azure VM user-assigned managed identity support

@andijcr andijcr changed the title Azure AKS OIDC Federation support Azure AKS OIDC federation / Azure VM user-assigned managed identity support Mar 18, 2024
@andijcr andijcr force-pushed the feat/aks_oidc_support branch 2 times, most recently from 2390890 to d11ce73 Compare March 18, 2024 22:37
@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Mar 19, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/46403#018e53e6-7781-4791-9788-c5539a80b1d2:

"rptest.tests.e2e_iam_role_test.ASKRoleFetchTests.test_write"

new failures in https://buildkite.com/redpanda/redpanda/builds/46403#018e540d-e5d6-4582-be55-33ad5036bac0:

"rptest.tests.e2e_iam_role_test.ASKRoleFetchTests.test_write"

new failures in https://buildkite.com/redpanda/redpanda/builds/46435#018e567e-7793-426c-96d4-1343812ca47a:

"rptest.tests.e2e_iam_role_test.ASKRoleFetchTests.test_write"

new failures in https://buildkite.com/redpanda/redpanda/builds/46435#018e5690-3c51-4ff3-8163-8a71a3864772:

"rptest.tests.e2e_iam_role_test.ASKRoleFetchTests.test_write"

new failures in https://buildkite.com/redpanda/redpanda/builds/46733#018e77cf-c715-4835-87a5-2fe6002e31d3:

"rptest.tests.partition_move_interruption_test.PartitionMoveInterruption.test_cancelling_partition_move.replication_factor=3.unclean_abort=False.recovery=no_recovery.compacted=False"

Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good.

  • should we avoid the trailing return type?
  • pointed out a few things that looks like they need unit tests
  • do you have a good sense for how well the code coverage is for the tests you've added?
  • is there a test plan written down somewhere for testing on azure?

src/v/cloud_roles/types.cc Outdated Show resolved Hide resolved
src/v/cloud_storage/remote.cc Outdated Show resolved Hide resolved
src/v/cloud_roles/request_response_helpers.cc Outdated Show resolved Hide resolved
src/v/cloud_roles/request_response_helpers.h Outdated Show resolved Hide resolved
src/v/cloud_roles/types.h Outdated Show resolved Hide resolved
src/v/redpanda/admin/server.cc Outdated Show resolved Hide resolved
src/v/redpanda/admin/server.cc Outdated Show resolved Hide resolved
src/v/cloud_storage/remote.cc Show resolved Hide resolved
src/v/cloud_storage/remote.cc Outdated Show resolved Hide resolved
src/v/cloud_storage/types.cc Show resolved Hide resolved
Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks really good. just a few questions and things to maybe clean up.

src/v/cloud_storage/remote.cc Show resolved Hide resolved
src/v/cloud_roles/request_response_helpers.cc Show resolved Hide resolved
src/v/cloud_roles/tests/schema_parsing_tests.cc Outdated Show resolved Hide resolved
src/v/cloud_roles/types.h Outdated Show resolved Hide resolved
src/v/config/configuration.cc Outdated Show resolved Hide resolved
src/v/redpanda/admin/server.cc Show resolved Hide resolved
src/v/cloud_storage/remote.cc Show resolved Hide resolved
src/v/cloud_storage/types.cc Show resolved Hide resolved
src/v/cloud_roles/signature.cc Outdated Show resolved Hide resolved
@andijcr
Copy link
Contributor Author

andijcr commented Mar 25, 2024

as discussed, followup with ducktape testing is moved here #17374

not an actual problem, but calling `<< credentials{abs_credentials{}}`

would cause infinite recursion, because the implementation for
credential would not find an operator<< abs_credential, and instead
would call itself since credentials is constructible from
abs_credentials.

the fix is to use a concept to ensure that the actual calling type is
credential and not an implicit conversion
parse_json_response_and_validate acceps a schema string_view, performs
schema validation on the json::Document read from the iobuf parameters,
and returns the document or an error in case of validation failures.
requests urls are saved as is, url params are not removed from the url,
failing simple queries via has_calls and get_latests_request

to not break existing behavior, these calls accepts ignore_url_params to
perform a search only on the part before '?'
oauth on ABS requires an access token and the header x-ms-version.

this commit incroduces apply_abs_oauth_credentials
AKS OpenID Credential federations places a jwt in the pod that can be
used to get an access token with a limited lifespan.
this commit introduces an implmentation of refresh credentials for this
usecase
azure_vm_refresh_impl calls the vm IMDS endpoint to retrieve an
access_token, and converts it to an abs_oauth_credentials

configuration: cloud_storage_azure_managed_identity_id

to be used with cloud_credentials_source::azure_vm_instance_metadata
in config_multi_property_validation

small refactor for the next commit, when support is added for new
cloud_credentials_source
extended validation to ensure that
cloud_storage_azure_managed_identity_id is set if
cloud_storage_credentials_source is azure_vm_instance_metadata
for readability, less lines and clutter
adapted get_abs_config to work with credential_source equal to either
config_file, azure_aks_oidc_federation, azure_vm_instance_metadata.

removed the strict requirement to have a shared_key. for managed
idendities the autorization is done with an access_token.
@andijcr
Copy link
Contributor Author

andijcr commented Mar 26, 2024

fixed merge conflict, addressed comments, trailing return type fix. previous ci failure was #16324 (it's failing a lot in dev)

@andijcr andijcr requested a review from dotnwat March 26, 2024 20:44

return ss::visit(
maybe_jresp,
[](auto err_resp) -> api_response_parse_result {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why doesn't this auto case match everything? will the more specific case (e.g. Document&) be chosen when possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye it's the rule of template overloading having lower precedence than exact type match

"&resource=https%3A%2F%2Fstorage.azure.com%2F"
"&client_id={}",
client_id_opt.value()));
req.set("Metadata", "true");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is Metadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=linux#security-and-authentication
it's part of a mitigation to prevent redirection of requests.
since it's a get request, this header (required) make sure that the request originated from the app and it's not unintended/malicious

// to the vm-local IMDS
auto req = http::client::request_header{};
req.method(boost::beast::http::verb::get);
// TODO fix deps and use boost::url
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need a follow issue for this? the comment isn't clear what is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think so, it's a comment i forgot to remove. boost::url has a nice interface to compose get requests, but it's not part of our dependencies. probably not really worth it.

Comment on lines +113 to +114
return std::visit(
ss::make_visitor(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you use ss::visit previous, but std::visit here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just missed it

@dotnwat dotnwat merged commit 22776d1 into redpanda-data:dev Mar 27, 2024
17 checks passed
@dotnwat dotnwat changed the title Azure AKS OIDC federation / Azure VM user-assigned managed identity support CORE-427: Azure AKS OIDC federation / Azure VM user-assigned managed identity support Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants