Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbac_migrator: suppress error on role_exists #17800

Merged

Conversation

pgellert
Copy link
Contributor

@pgellert pgellert commented Apr 11, 2024

If the leader running the feature migration loses leadership after
the role is created but before the feature migration is successfully
completed, the next leader will redo the feature migration. In that
case, we will get a role_exists error, which we can safely ignore.

This was discovered through the following CI failure: https://ci-artifacts.dev.vectorized.cloud/redpanda/47611/018ec8a8-1fa5-49b2-8d82-1577c14d20bf/vbuild/ducktape/results/final/report.html

Fixes https://redpandadata.atlassian.net/browse/CORE-2287
Fixes #17736

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

See the comment in the code change for why this is safe.
@@ -42,7 +42,17 @@ ss::future<> rbac_migrator::do_mutate() {
auto err = co_await _controller.get_security_frontend().local().create_role(
role_name, std::move(role), model::timeout_clock::now() + 5s);

if (err) {
if (err == cluster::errc::role_exists) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that in-between more users were created? Are migrations guaranteed to run before any other operations on the cluster can happen?

We should probably document explicitly (if not done already) that migrations must be idempotent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that in-between more users were created?

Yes, and that's fine. In that case, the migration will appear to have happened at the original time (before the users were added). This role is created as a one-time convenience to showcase roles and it won't be kept up-to-date with users added after the migration.

Are migrations guaranteed to run before any other operations on the cluster can happen?

The logical version of the cluster is only incremented after the migration and the RBAC feature is only activated after the migration, so other RBAC operations guarded by the RBAC feature flag won't happen. But otherwise, operations aren't blocked by the migration as far as I can tell.

We should probably document explicitly (if not done already) that migrations must be idempotent.

I now noticed that there was already a hint about migrations executing at least once (here), but I've now added one more comment to state that do_mutate should be idempotent: 33d857f

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it won't be kept up-to-date with users added after the migration.

wondering why are we bothering to do it at all then :) wouldn't it just confuse users more?

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Apr 11, 2024

Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vbotbuildovich
Copy link
Collaborator

new failures in https://buildkite.com/redpanda/redpanda/builds/47670#018ecddc-1925-4895-b3e9-21c5d85ad0f5:

"rptest.tests.cloud_storage_timing_stress_test.CloudStorageTimingStressTest.test_cloud_storage_with_partition_moves.cleanup_policy=delete"

@pgellert pgellert merged commit 3e01a27 into redpanda-data:dev Apr 12, 2024
18 checks passed
@pgellert pgellert deleted the rbac/fix-rbac-migration-idempotence branch April 12, 2024 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants