Performance Issue with Ceph Object Storage Multisite Replication #12272

Closed
polyedre opened this issue May 23, 2023 · 22 comments · Fixed by #12327

Comments

@polyedre
Contributor

polyedre commented May 23, 2023

Summary

We have encountered a performance issue with the Multisite replication feature of Ceph Object Storage while using Rook clusters. Scaling the number of rados gateways to 2 or more significantly increases the replication latency, causing delays of 40 seconds or more.

Context

We have been working with Rook clusters and investigating the performance of Ceph Object Storage Multisite replication. Initially, replication between two zones within a Ceph Realm showed excellent results, with 99% of files successfully replicated and latency consistently under 400 ms.

However, when we scaled the number of rados gateways to 2 or more using Rook, performance degraded substantially. The p99 replication latency rose to 40 seconds or more. The degradation seems to be caused by locking on the Ceph Object Storage bucket index logs.

Feature Request

To address this performance issue, I suggest that the operator distinguish between the first Rados gateway and the others.
The operator should create the first rados gateway with synchronization active, allowing it to handle replication tasks.
Subsequent rados gateways should be deployed without synchronization enabled, focusing on other purposes such as client access or data ingestion.
By implementing this approach, administrators could scale the number of gateways according to their specific needs while maintaining optimal replication performance.

Assistance Offered

We greatly appreciate the efforts of the Rook team in developing and maintaining this valuable tool. If you require any additional information or assistance to further understand the problem and its implications, we are more than willing to collaborate. We can provide any necessary details or perform additional testing to help resolve this performance issue and enhance the functionality of Rook.

Thank you for your attention to this matter.

Sincerely,

@parth-gr
Member

> The operator should create the first rados gateway with synchronization active, allowing it to handle replication tasks.
> Subsequent rados gateways should be deployed without synchronization enabled, focusing on other purposes such as client access or data ingestion.
> By implementing this approach, administrators could scale the number of gateways according to their specific needs while maintaining optimal replication performance.

Can you please elaborate on this feature?

If the subsequent rados gateways are deployed without synchronization enabled, how will replication be performed for them in the multisite setup?

parth-gr self-assigned this May 23, 2023
@BlaineEXE
Member

BlaineEXE commented May 23, 2023

Perhaps @smanjara or @alimaredia have some insights here.

It's my understanding that some Ceph users create multiple deployments of RGWs for a Zone where one deployment is for clients and one for multisite replication. Perhaps that would work for you. I would suggest trying to deploy one CephObjectStore to serve as the replication RGW and another CephObjectStore for client RGWs, both using the same CephObjectZone.

Addendum:
For Rook, I don't think it would be appropriate to "distinguish" the first RGW from others for this purpose. That may not be behavior all users want, and it would be complicated to codify in the CRD. Instead, I think suggesting advanced users try the multiple-CephObjectStore strategy is more appropriate. Additionally, if there is a performance issue with multisite, that would best be reported to Ceph's redmine bug tracking system.

@alimaredia
Contributor

@polyedre what version of Ceph are the Object Stores in Rook?

@polyedre
Contributor Author

Thank you all for your valuable insights and responses.

We are currently using version 1.9.13 and are in the process of upgrading our clusters to version 1.10.13.

We did not know that it was possible to deploy several CephObjectStores in a CephObjectZone. This approach will simplify the way we currently deploy the additional Rados Gateways. We agree that distinguishing the first RGW from others within Rook may not be a desired behavior for all users.

We will try the solution proposed by @BlaineEXE and report the performance problem with multi-site replication directly to Ceph's redmine bug tracking system.

Thank you.

@alimaredia
Contributor

@BlaineEXE @thotz @smanjara @cbodley So I just learned that there is a config variable for an RGW that you can set to disable that RGW from doing sync (https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread).

I wonder if that would be something we should set in the zone section of the object-store CRD, and then document to provide configuration of this distinction of RGWs in a zone for users.

@thotz
Contributor

thotz commented May 25, 2023

> @BlaineEXE @thotz @smanjara @cbodley So I just learned that there is a config variable for an RGW that you can set to disable that RGW from doing sync (https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread).
>
> I wonder if that would be something we should set in the zone section of the object-store CRD, and then document to provide configuration of this distinction of RGWs in a zone for users.

Makes sense. I am not quite sure which CRD needs to hold this info.

@alimaredia
Contributor

@thotz It would have to be the object-store CRD since the changes would only apply to the RGWs in that object-store. I was thinking of something like this:

zone:
    name: zone1
    runSyncTraffic: false # rgw_run_sync_thread would correspond to this value and this value would be set to true by default

I think there would have to be a check somewhere verifying that at least one object-store in the zone handles sync traffic.

@polyedre
Contributor Author

> @BlaineEXE @thotz @smanjara @cbodley So I just learned that there is a config variable for an RGW that you can set to disable that RGW from doing sync (https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread).

That is the configuration variable that we set in order to disable the synchronisation on our user-facing RGWs. Sorry for not having detailed this earlier.

We currently disable it directly from the Toolbox with a command like this:

kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph config set ${KEY_ID} rgw_run_sync_thread false

This is not ideal because we need to implement this into our IaC process. Having the possibility to configure this variable from the CephObjectStore would be great!

May I suggest adding the new option to spec.gateway instead? I don't know the implementation details, but it seems more logical to me to have the option here:

apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: user-facing-rgw
spec:
  gateway:
    instances: 1
    runSyncTraffic: false
  preservePoolsOnDelete: false
  zone:
    name: zone1

@BlaineEXE
Member

I think the gateway spec seems like a good place for that. I'm not sure runSyncTraffic is the best name. It would be nice to help users understand that "sync" is "multisite sync" by indicating that in the name. Also, it's generally recommended that an API field's default value match the default value of its data type, i.e., false for a boolean.

So maybe disableMultisiteSyncTraffic?

Unless "sync" traffic can be something other than multisite (in which case we should drop "Multisite"), I think that is a clear name for the API config that doesn't require reviewing docs to remember what the config does at a glance.

@BlaineEXE
Member

BlaineEXE commented May 31, 2023

@polyedre you can do it via IaC currently by setting the option in the rook-config-override configmap if desired. However, you will have to know the name of the RGW cluster beforehand. The name is deterministic based on the CephObjectStore name, but it's not as user friendly as adding the option proposed, which still seems like a valuable addition to me.
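For readers who want to try the ConfigMap route described above, a minimal sketch might look like the following. It assumes the cluster runs in the `rook-ceph` namespace (as in the toolbox command earlier in the thread) and that the CephObjectStore is named `user-facing-rgw`. The section name `client.rgw.user.facing.rgw.a` is an assumption about how the RGW daemon name is derived from the store name; check the real name in your cluster (for example in the output of `ceph auth ls`) before relying on it.

```yaml
# Sketch only: the [client.rgw...] section name is an assumption derived from a
# CephObjectStore called "user-facing-rgw"; verify the actual daemon name first.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-config-override
  namespace: rook-ceph
data:
  config: |
    [client.rgw.user.facing.rgw.a]
    rgw_run_sync_thread = false
```

Because the override is keyed by daemon name rather than by CephObjectStore, it has to be kept in step with the store names by hand, which is the friction a dedicated CRD field would remove.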

Update:

I don't think we need a design doc for anyone who wishes to implement this. Adding this configurable field is pretty simple and merely needs a good implementation that pays attention to setting the value to what is specified in the CRD (or false if unspecified) on every reconcile. It would be most ideal to have an e2e CI test for it, but that is not a requirement IMO as long as unit tests are good and the behavior is tested manually for the implementing PR. If this weren't a multi-site PR, I would be tempted to put a good-first-issue flag on it.

apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: user-facing-rgw
spec:
  gateway:
    instances: 1
    disableMultisiteSyncTraffic: <bool>
  preservePoolsOnDelete: false
  zone:
    name: zone1

@polyedre are you or your associates interested in contributing this to Rook?

@polyedre
Contributor Author

polyedre commented Jun 1, 2023

We would be happy to contribute this to Rook! The name disableMultisiteSyncTraffic seems to be a better fit for the configuration option.

I'll probably start working on this next week. Can you assign this to me?
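As an illustration of where the discussion has landed, the two-store layout with the proposed field could be sketched as below. The store names, the `rook-ceph` namespace, the `port` value, and the instance counts are hypothetical, and `disableMultisiteSyncTraffic` is the field name proposed in this thread, so the released CRD may differ.

```yaml
# Sketch only: names, port, and instance counts are illustrative assumptions.
# Store 1: a single RGW that keeps running the multisite sync threads.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: replication-rgw
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 1
  zone:
    name: zone1
---
# Store 2: client-facing RGWs in the same zone, with sync threads disabled
# through the field proposed in this issue.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: user-facing-rgw
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 3
    disableMultisiteSyncTraffic: true
  zone:
    name: zone1
```

Only the client-facing store is scaled horizontally here, so the number of RGWs taking part in replication stays at one.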

@galexrt
Member

galexrt commented Jun 1, 2023

@polyedre You might be able to assign the task to yourself by creating a new comment with just /assign in it, see https://rook.io/docs/rook/v1.11/Contributing/development-flow/?h=assign#self-assign-issue.

Should it not work, feel free to ping me.

@github-actions

github-actions bot commented Jun 1, 2023

The issue you are trying to assign to yourself is already assigned.

@cbodley

cbodley commented Jun 1, 2023

> @BlaineEXE @thotz @smanjara @cbodley So I just learned that there is a config variable for an RGW that you can set to disable that RGW from doing sync (https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread).

i would just point out that the rgw_run_sync_thread configurable only controls the client side of replication traffic. this is the side that will send http requests to fetch objects and replication logs from radosgws in other zones

if you also want to prevent a radosgw from serving these replication requests from other zones, you'd want to keep it out of the zone's "endpoints" list

@alimaredia
Contributor

> > @BlaineEXE @thotz @smanjara @cbodley So I just learned that there is a config variable for an RGW that you can set to disable that RGW from doing sync (https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread).
>
> i would just point out that the rgw_run_sync_thread configurable only controls the client side of replication traffic. this is the side that will send http requests to fetch objects and replication logs from radosgws in other zones
>
> if you also want to prevent a radosgw from serving these replication requests from other zones, you'd want to keep it out of the zone's "endpoints" list

We'll have to make sure to mention this in the object-zone docs since the zone endpoints are getting added to that CR manually.

@polyedre
Contributor Author

polyedre commented Jun 2, 2023

@cbodley @alimaredia Thanks for mentioning this. Is there a reason to prevent a radosgw dedicated to serving user traffic from serving replication requests from other zones?

We separate user-facing radosgws from replication radosgws because we encountered high replication latencies when scaling the number of replication radosgws horizontally. Allowing user-facing radosgws to serve replication requests might reduce the load on the replication radosgws. Or am I misunderstanding something here?

/assign

@github-actions

github-actions bot commented Jun 2, 2023

The issue you are trying to assign to yourself is already assigned.

galexrt assigned polyedre and unassigned BlaineEXE, alimaredia and parth-gr Jun 2, 2023
polyedre pushed a commit to polyedre/rook that referenced this issue Jun 5, 2023
Some users want to deploy two CephObjectStores for a single Zone. The first
configures RGWs to process the synchronization of the data, while the second
CephObjectStore configures the client RGWs.

Currently, this can be implemented by setting the RGW option
'rgw_run_sync_thread' in the 'rook-config-override' ConfigMap, though it is not
really user friendly.

Ref: https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread

This commit adds a new option in the CephObjectStore CRD as defined in issue
rook#12272. The new option
'disableMultisiteSyncTraffic' determines whether the operator should disable the
multisite sync threads for the RGWs.

If the option is set to 'false', or if the option is not specified, the operator
does nothing. This ensures that the multisite sync threads will not be enabled
for users who explicitly disabled the multisite sync threads either manually
or with the 'rook-config-override' ConfigMap.
@alimaredia
Contributor

@polyedre seeing high latencies as you scale the number of rgws solely dedicated to replication traffic is a Ceph issue that warrants further investigation. Could you open a Ceph RGW tracker issue (https://tracker.ceph.com/projects/rgw/issues/new) describing in more detail what you're seeing? This will give it more visibility with the broader RGW team, and if it's a matter of tuning more config we can add it to the docs or provide updates here.

@polyedre
Contributor Author

polyedre commented Jun 8, 2023

@alimaredia Thanks for reminding me to report this directly to Ceph. I will open up a Ceph RGW tracker issue as soon as my registration is approved.

@polyedre
Contributor Author

polyedre commented Jun 8, 2023

This issue is tracked at https://tracker.ceph.com/issues/61620.

polyedre added a commit to polyedre/rook that referenced this issue Jul 4, 2023
Some users want to deploy two CephObjectStores for a single Zone. The first
configures RGWs to process the synchronization of the data, while the second
CephObjectStore configures the client RGWs.

Currently, this can be implemented by setting the RGW option
'rgw_run_sync_thread' in the 'rook-config-override' ConfigMap, though it is not
really user friendly.

Ref: https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread

This commit adds a new option in the CephObjectStore CRD as defined in issue
rook#12272. The new option
'disableMultisiteSyncTraffic' determines whether the operator should disable the
multisite sync threads for the RGWs.

If the option is set to 'false', or if the option is not specified, the operator
does nothing. This ensures that the multisite sync threads will not be enabled
for users who explicitly disabled the multisite sync threads either manually
or with the 'rook-config-override' ConfigMap.

This commit also recommends using two object stores when scaling Ceph Object
Storage Multisite replication, with one object store configured with replication
traffic disabled.

Signed-off-by: Lucas Henry <polyedre@disroot.org>
@polyedre
Contributor Author

We have some news about this issue. Following the recommendations of some of Rook's maintainers, we deployed two CephObjectStores: one dedicated to replication with a single replica, and one that serves user requests with many replicas. Previously, we deployed only one CephObjectStore (the one dedicated to replication), while the user-facing rgws were deployed manually with a Deployment.

With the two CephObjectStores, synchronisation performance suddenly dropped.

We discovered that Rook creates a zone endpoint for each CephObjectStore of the zone. Although we didn't notice the performance drop at first, we found that we could restore performance only by removing the CephObjectStore that serves user requests from the zone endpoints.

I hope this new information will help debug the cause of the performance issue when using multiple replication rgws.

We tested:

  1. One rgw with rgw_run_sync_thread set to true and one zone endpoint pointing to this rgw -> performance OK
  2. One rgw with rgw_run_sync_thread set to true and one zone endpoint pointing to another rgw that does not run sync threads -> performance NOK/no sync
  3. One rgw with rgw_run_sync_thread set to true and two zone endpoints, one pointing to this rgw and one to another rgw that does not run sync threads -> performance NOK/no sync

If our conclusions are correct, it seems that we cannot merge #12327 as is: disabling the sync threads from the CephObjectStore should also remove the CephObjectStore from the zone endpoints.

@alimaredia
Contributor

That conclusion about ensuring the zone endpoints are only RGWs dedicated to sync traffic is correct. I'll make sure to add my review to #12327 as well.
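One way to express that conclusion declaratively is to list only the replication RGW as a zone endpoint. The sketch below assumes that your Rook version exposes a `customEndpoints` list on the CephObjectZone CR and that setting it controls the zone's endpoints; the zone group name, pool sizes, and service URL are hypothetical placeholders rather than values taken from this thread.

```yaml
# Sketch only: assumes CephObjectZone supports `customEndpoints` in your Rook
# version. The zone group, pool settings, and URL are illustrative assumptions.
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: zone1
  namespace: rook-ceph
spec:
  zoneGroup: zonegroup1
  metadataPool:
    replicated:
      size: 3
  dataPool:
    replicated:
      size: 3
  # List only the RGWs dedicated to sync traffic, not the user-facing ones.
  customEndpoints:
    - "http://rook-ceph-rgw-replication-rgw.rook-ceph.svc:80"
```

This corresponds to test case 1 above, where the single zone endpoint points at the rgw that runs the sync threads.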

mergify bot pushed a commit that referenced this issue Jul 17, 2023
(cherry picked from commit 393d093)