Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw/sync-policy: Support disabling per-bucket replication #49803

Merged
merged 1 commit into from Apr 6, 2023

Conversation

soumyakoduri
Copy link
Contributor

@soumyakoduri soumyakoduri commented Jan 20, 2023

When the zones replicate, allow disabling replication for specific buckets using sync policy.

These are the semantics to be followed while resolving the policy conflicts -

zonegroup bucket Result
enabled enabled enabled
enabled allowed enabled
enabled forbidden disabled
allowed enabled enabled
allowed allowed disabled
allowed forbidden disabled
forbidden enabled disabled
forbidden allowed disabled
forbidden forbidden disabled

In case multiple group policies are set to reflect for any sync pair (<source-zone,source-bucket>, <dest-zone,dest-bucket>), the following rules are applied in the order -

  1. Even if one policy status is FORBIDDEN, the sync will be disabled
  2. Atleast one policy should be ENABLED for the sync to be allowed.

Fixes: https://tracker.ceph.com/issues/58518
Signed-off-by: Soumya Koduri skoduri@redhat.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@soumyakoduri
Copy link
Contributor Author

@yehudasa @cbodley @mattbenjamin .. please comment if the rule semantics mentioned in this PR seem correct to be followed to resolve the sync policies.

The changes in this commit seem to work to disable bucket replication. But I will test all other cases too and confirm.

@mattbenjamin
Copy link
Contributor

@soumyakoduri the truth table appears to make sense; does this address the actual issue Daniel Parkes reported, or is it a fix for issues you found while debugging that?

@yehudasa
Copy link
Member

@soumyakoduri the table makes sense. Is it working at the <zone, bucket> granularity? E.g., can disable sync to/from a bucket at a specific zone. Also, is it possible to disable bucket as source, but not as target (and vice versa)? Will need to test all these cases.

@soumyakoduri
Copy link
Contributor Author

@soumyakoduri the table makes sense. Is it working at the <zone, bucket> granularity? E.g., can disable sync to/from a bucket at a specific zone. Also, is it possible to disable bucket as source, but not as target (and vice versa)? Will need to test all these cases.

Thanks a lot Yehuda. yes..I will test these cases too and update . Hopefully will try to include them as part of #38238 as well.

@soumyakoduri
Copy link
Contributor Author

@soumyakoduri the truth table appears to make sense; does this address the actual issue Daniel Parkes reported, or is it a fix for issues you found while debugging that?

This PR is to specifically address https://bugzilla.redhat.com/show_bug.cgi?id=2132554 . But since the changes affect all the sync-policy use-cases, as discussed in the refactoring call, wanted to define and confirm the semantics for testing and any future reference.

@soumyakoduri soumyakoduri requested a review from a team as a code owner February 20, 2023 13:26
@soumyakoduri soumyakoduri removed the request for review from a team February 21, 2023 02:54
@soumyakoduri
Copy link
Contributor Author

soumyakoduri commented Feb 21, 2023

I have addressed few bugs and outlined the testcases I tested/plan to test here - https://docs.google.com/document/d/19oBQA-bYxLBR4BnekA2DTwJJaTFvjAfrqAk9G3RGU0I/edit#heading=h.4qac9dpc76m

Please check if that covers all the cases @yehudasa

@soumyakoduri
Copy link
Contributor Author

@yehudasa I found another strange behaviour with sync policy -

Out of 3 zones (zone1, zone2, zone3) within a zonegroup, if a policy is configured at zonegroup level to allow symmetrical sync between only zone1, zone2, I assume the sync shouldn't happen from zone1/zone2 <-> zone3. But that doesn't seem to be the case.

I see objects getting sync'ed from zone1, zone2 to zone3 but strangely not viceversa (i.e, from zone3 -> zone2,zone1 is restricted as expected but zone1, zone2 -> zone3 is not disabled).

Please provide your inputs on the same.

@soumyakoduri
Copy link
Contributor Author

@yehudasa I found another strange behaviour with sync policy -

Out of 3 zones (zone1, zone2, zone3) within a zonegroup, if a policy is configured at zonegroup level to allow symmetrical sync between only zone1, zone2, I assume the sync shouldn't happen from zone1/zone2 <-> zone3. But that doesn't seem to be the case.

I see objects getting sync'ed from zone1, zone2 to zone3 but strangely not viceversa (i.e, from zone3 -> zone2,zone1 is restricted as expected but zone1, zone2 -> zone3 is not disabled).

Please provide your inputs on the same.

I raised https://tracker.ceph.com/issues/58822 for the same.

@soumyakoduri
Copy link
Contributor Author

@yehudasa most of the cases listed (https://docs.google.com/document/d/19oBQA-bYxLBR4BnekA2DTwJJaTFvjAfrqAk9G3RGU0I/edit#) seem to be working at least for two-zones. With 3 zones, I am hitting the issue mentioned in https://tracker.ceph.com/issues/58822.

I will request QE also to test all the cases in detail (provided we have fix for #58822 as well).

Please review the latest changes and let me know your comments.

@soumyakoduri soumyakoduri changed the title [WIP] rgw/sync-policy: Support disabling per-bucket replication rgw/sync-policy: Support disabling per-bucket replication Feb 28, 2023
@soumyakoduri
Copy link
Contributor Author

@yehudasa kindly review the changes.

Copy link
Member

@yehudasa yehudasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumyakoduri see my comments

@@ -482,6 +488,30 @@ void RGWBucketSyncFlowManager::pipe_set::insert(const rgw_sync_bucket_pipe& pipe
handlers.insert(h);
}

void RGWBucketSyncFlowManager::pipe_set::remove_all() {
pipe_map.erase(pipe_map.begin(), pipe_map.end());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumyakoduri any reason why you're not calling pipe_map.clear() instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Addressed it. Also this routine is currently not used anywhere (in this fix), but just added as a helper routine if any need arise later.

@@ -494,14 +524,15 @@ bool RGWBucketSyncFlowManager::allowed_data_flow(const rgw_zone_id& source_zone,
bool check_activated) const
{
bool found = false;
bool found_activated = false;
bool found_activated = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would return true if flow_groups are empty. However, I can't find anywhere were it's even called with check_activated true, so maybe a moot point. Am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. It is not used anywhere at the moment with check_activated set to true.

But while working on this fix, I assumed it will be set (if ever in future) only for the child policies.. So in case if any child/bucket policy is missing the flow-group, it should inherit from parent policy and hence should return "true" in such case.

Since its not needed for this fix, will take it up later as it may need further discussion.

source_pipes->insert(pipe);
if (is_forbidden) {
ldpp_dout(dpp, 20) << __func__ << "(): flow manager (bucket=" << effective_bucket_key << "): removing source pipe: " << pipe << dendl;
source_pipes->disable(pipe);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumyakoduri do we have ordering issue, where we can enable stuff after we disabled that depending on the flow_group order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes..while testing I noticed it.

If there are two policies set, for eg.,

  1. policy 1 - FORBIDDEN - <src_pipe,dest_pipe>
  2. policy 2 - ENABLED - <src_pipe, dest_pipe>
    policy2 is processed last in the order and thus granted the access.

To avoid that, have included disabled_pipes to store the restricted <pipe_set>s if any and match with the ones being inserted.

@soumyakoduri
Copy link
Contributor Author

jenkins test make check arm64

@soumyakoduri
Copy link
Contributor Author

jenkins test windows

@soumyakoduri
Copy link
Contributor Author

@yehudasa I addressed your comments and an issue (found during testing) with removing an element in pipe_map . Kindly review the changes.

Copy link
Member

@yehudasa yehudasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

When the zones replicate, allow disabling replication
for specific buckets using sync policy.

These are the semantics to be followed while resolving the policy
conflicts -

==================================================
zonegroup		bucket		Result
==================================================
enabled			enabled		enabled
			allowed		enabled
			forbidden	disabled
allowed			enabled		enabled
			allowed		disabled
			forbidden	disabled
forbidden		enabled		disabled
			allowed		disabled
			forbidden	disabled

In case multiple group policies are set to reflect for any sync pair
(<source-zone,source-bucket>, <dest-zone,dest-bucket>), the following
rules are applied in the order -
1) Even if one policy status is FORBIDDEN, the sync will be disabled
2) Atleast one policy should be ENABLED for the sync to be allowed.

Various cases tested are outlined here -
https://docs.google.com/document/d/19oBQA-bYxLBR4BnekA2DTwJJaTFvjAfrqAk9G3RGU0I/edit#heading=h.4qac9dpc76m

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
@soumyakoduri
Copy link
Contributor Author

lgtm

Thanks Yehuda.

teuthology run: #49803

@soumyakoduri
Copy link
Contributor Author

jenkins test make check arm64

@cbodley cbodley merged commit 7d4c7b6 into ceph:main Apr 6, 2023
10 of 11 checks passed
This was referenced May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants