Performance Issue with Ceph Object Storage Multisite Replication #12272
Comments
Can you please elaborate on this feature, So if we keep the |
|
Perhaps [ @smanjara ] or @alimaredia have some insights here. It's my understanding that some Ceph users create multiple deployments of RGWs for a Zone where one deployment is for clients and one for multisite replication. Perhaps that would work for you. I would suggest trying to deploy a CephObjectStore to serve as the replication RGW and another CephObjectStore for client RGWs. [Both using the same CephObjectZone] [Addendum]: |
|
@polyedre what version of Ceph are the Object Stores in Rook? |
|
Thank you all for your valuable insights and responses. We are currently using version 1.9.13 and are in the process of upgrading our clusters to version 1.10.13. We did not know that it was possible to deploy several CephObjectStores in a CephObjectZone. This approach will simplify the way we currently deploy the additional Rados Gateways. We agree that distinguishing the first RGW from others within Rook may not be a desired behavior for all users. We will try the solution proposed by @BlaineEXE and report the performance problem with multi-site replication directly to Ceph's redmine bug tracking system. Thank you. |
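For readers following along, here is a minimal sketch of what two CephObjectStores sharing one zone could look like. The store names, namespace, port, and instance counts are illustrative assumptions, not values taken from this thread; only the zone name `zone1` appears later in the discussion.

```yaml
# Sketch only: two object stores attached to the same zone.
# Names, namespace, port and instance counts are hypothetical.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: replication-rgw      # hypothetical: RGWs intended for multisite sync
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 1
  zone:
    name: zone1              # same zone as the store below
---
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: user-facing-rgw      # hypothetical: RGWs intended for client traffic
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 3
  zone:
    name: zone1
```

On its own this only creates two sets of RGWs in the zone; it does not stop the client-facing RGWs from running sync threads, which is what the rest of the thread discusses.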
|
@BlaineEXE @thotz @smanjara @cbodley So I just learned that there is a config variable for an RGW that you can set to disable that RGW from doing sync (https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread). I wonder if that would be something we should set in the |
Makes sense. I am not quite sure which CRD needs to hold this info. |
|
@thotz It would have to be the object-store CRD since the changes would only apply to the RGWs in that object-store. I was thinking of something like this: I think there would have to be a check somewhere saying you need to have one object-store in the zone for sync traffic. |
That is the configuration variable that we set in order to disable the synchronisation on our user-facing RGWs. Sorry for not having detailed this earlier. We currently disable it directly from the Toolbox with a command like this:

    kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph config set ${KEY_ID} rgw_run_sync_thread false

This is not ideal because we need to implement this into our IaC process. Having the possibility to configure this variable from the CephObjectStore would be great! May I suggest to add the new option to

    apiVersion: ceph.rook.io/v1
    kind: CephObjectStore
    metadata:
      name: user-facing-rgw
    spec:
      gateway:
        instances: 1
        runSyncTraffic: false
      preservePoolsOnDelete: false
      zone:
        name: zone1
|
|
I think the gateway spec seems like a good place for that. I'm not sure So maybe Unless "sync" traffic can be something other than multisite (in which case we should drop "Multisite"), I think that is a clear name for the API config that doesn't require reviewing docs to remember what the config does at a glance. |
|
@polyedre you can do it via IaC currently by setting the option in the Update: I don't think we need a design doc for anyone who wishes to implement this. Adding this configurable field is pretty simple and merely needs a good implementation that pays attention to setting the value to what is specified in the CRD (or false if unspecified) on every reconcile. It would be most ideal to have an e2e CI test for it, but that is not a requirement IMO as long as unit tests are good and the behavior is tested manually for the implementing PR. If this weren't a multi-site PR, I would be tempted to put a good-first-issue flag on it. @polyedre are you or your associates interested in contributing this to Rook? |
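The commit message later in this thread names the 'rook-config-override' ConfigMap as the existing way to set 'rgw_run_sync_thread'. A rough sketch of that approach follows. The section name is an assumption about how the RGW daemon for a store called `user-facing-rgw` is identified; it should be checked against the running daemon (for example, the `${KEY_ID}` used with `ceph config set` earlier in the thread) before relying on it.

```yaml
# Sketch only: disable the sync thread for one store's RGWs via the override ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-config-override
  namespace: rook-ceph        # assumed operator/cluster namespace
data:
  config: |
    # ASSUMPTION: section name for the RGW daemon of a store named "user-facing-rgw".
    # Verify the actual client ID in your cluster before using this.
    [client.rgw.user.facing.rgw.a]
    rgw_run_sync_thread = false
```

Overrides of this kind are typically read when a daemon starts, so the affected RGW pods would likely need a restart for the change to take effect.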
|
We would be happy to contribute this to Rook ! The name I'll probably start working on this next week. Can you assign this to me? |
|
@polyedre You might be able to assign the task to you by creating a new comment with just Should it not work, feel free to ping me. |
|
The issue you are trying to assign to yourself is already assigned. |
i would just point out that the if you also want to prevent a radosgw from serving these replication requests from other zones, you'd want to keep it out of the zone's "endpoints" list |
We'll have to make sure to mention this in the |
|
@cbodley @alimaredia Thanks for mentioning this. Is there a reason to prevent radosgw dedicated to serve user traffic from serving replication requests from other zones? The reason why we dissociate user-facing radosgw from replication radosgw is that by scaling the number of replication radosgw horizontally we encountered high latencies in the replication. Allowing user-facing radosgw to serve replication requests might reduce the load on the replication radosgw. Or am I misunderstanding something here? /assign |
|
The issue you are trying to assign to yourself is already assigned. |
Some users want to deploy two CephObjectStores for a single Zone. The first configures RGWs to process the synchronization of the data, while the second CephObjectStore configures the client RGWs. Currently, this can be implemented by setting the RGW option 'rgw_run_sync_thread' in the 'rook-config-override' ConfigMap, though it is not really user friendly. Ref: https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread This commit adds a new option in the CephObjectStore CRD as defined in issue rook#12272. The new option 'disableMultisiteSyncTraffic' determines whether the operator should disable the multisite sync threads for the RGWs. If the option is set to 'false', or if the option is not specified, the operator does nothing. This ensures that the multisite sync threads will not be enabled for users who explicitly disabled the multisite sync threads either manually or with the 'rook-config-override' ConfigMap. Signed-off-by: Lucas Henry <polyedre@disroot.org>
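To make the commit message above concrete, here is a hedged sketch of a client-facing store using the 'disableMultisiteSyncTraffic' option. Placing it under `spec.gateway` follows the earlier discussion in this thread and is an assumption here, as are the store name, namespace, port, and instance count.

```yaml
# Sketch only: client-facing store whose RGWs should not run multisite sync threads.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: user-facing-rgw              # hypothetical name
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 3
    # Assumed placement under gateway, per the discussion above.
    # 'true' asks the operator to disable the sync threads for these RGWs;
    # 'false' or unset means the operator does nothing.
    disableMultisiteSyncTraffic: true
  zone:
    name: zone1
```

Because an unset or 'false' value leaves the existing configuration alone, switching the field back to 'false' would not by itself re-enable sync threads that were disabled manually or through 'rook-config-override'.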
|
@polyedre seeing high latencies as you scale the number of rgws solely dedicated to replication traffic is a Ceph issue that warrants further investigation. Could you open up a Ceph RGW tracker issue (https://tracker.ceph.com/projects/rgw/issues/new) describing in more detail what you're seeing? This will have more visibility for the broader RGW team, and if it's a matter of tuning more config we can add it to the docs or provide updates here. |
|
@alimaredia Thanks for reminding me to report this directly to Ceph. I will open up a Ceph RGW tracker issue as soon as my registration is approved. |
|
This issue is tracked at https://tracker.ceph.com/issues/61620. |
Some users want to deploy two CephObjectStores for a single Zone. The first configures RGWs to process the synchronization of the data, while the second CephObjectStore configures the client RGWs. Currently, this can be implemented by setting the RGW option 'rgw_run_sync_thread' in the 'rook-config-override' ConfigMap, though it is not really user friendly. Ref: https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread This commit adds a new option in the CephObjectStore CRD as defined in issue rook#12272. The new option 'disableMultisiteSyncTraffic' determines whether the operator should disable the multisite sync threads for the RGWs. If the option is set to 'false', or if the option is not specified, the operator does nothing. This ensures that the multisite sync threads will not be enabled for users who explicitly disabled the multisite sync threads either manually or with the 'rook-config-override' ConfigMap. This commit also recommends using two object stores when scaling Ceph object store multisite replication, with one object store configured with replication traffic disabled. Signed-off-by: Lucas Henry <polyedre@disroot.org>
|
We have some news about this issue. Following the recommendations of some of Rook's maintainers, we deployed two CephObjectStores: one dedicated to replication with a single replica, and one that serves user requests with many replicas. Before, we deployed only one CephObjectStore (the one dedicated to replication), while the user-facing RGWs were deployed manually with a Deployment. With the two CephObjectStores, the synchronisation performance dropped suddenly. We discovered that Rook creates a zone endpoint for each CephObjectStore of the zone. While we didn't notice the drop in performance at first, we discovered that we could restore the performance only by removing the CephObjectStore that serves user requests from the zone endpoints. I hope this new information will help debug the cause of the performance issue when using multiple replication RGWs. We tested :
If our conclusions are correct, it seems that we cannot merge #12327 as is: disabling the sync threads from the CephObjectStore should also remove the CephObjectStore from the zone endpoints. |
|
That conclusion about ensuring the zone endpoints are only RGWs dedicated to sync traffic is correct. I'll make sure to add my review to #12327 as well. |
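One way to pin the zone endpoints to the replication RGWs might be an explicit endpoint list on the zone resource. The sketch below assumes a Rook release whose CephObjectZone CRD offers a `customEndpoints` field; nobody in this thread confirms that field for the versions in use, and the zone group name, pool settings, and URL are hypothetical.

```yaml
# Sketch only: advertise just the replication RGWs as the zone's endpoints.
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: zone1
  namespace: rook-ceph
spec:
  zoneGroup: zonegroup1              # hypothetical zone group name
  metadataPool:
    replicated:
      size: 3                        # hypothetical pool settings
  dataPool:
    replicated:
      size: 3
  # ASSUMPTION: customEndpoints is available in the installed Rook version.
  # The URL is a placeholder for an address, reachable from peer zones,
  # that routes only to the replication RGWs.
  customEndpoints:
    - "http://replication-rgw.example.com:80"
```

Whatever mechanism is used, the resulting endpoint list can be inspected from the toolbox with `radosgw-admin zone get --rgw-zone=zone1`.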
Some users want to deploy two CephObjectStores for a single Zone. The first configures RGWs to process the synchronization of the data, while the second CephObjectStore configures the client RGWs. Currently, this can be implemented by setting the RGW option 'rgw_run_sync_thread' in the 'rook-config-override' ConfigMap, though it is not really user friendly. Ref: https://docs.ceph.com/en/latest/radosgw/config-ref/#confval-rgw_run_sync_thread This commit adds a new option in the CephObjectStore CRD as defined in issue #12272. The new option 'disableMultisiteSyncTraffic' determines whether the operator should disable the multisite sync threads for the RGWs. If the option is set to 'false', or if the option is not specified, the operator does nothing. This ensures that the multisite sync threads will not be enabled for users who explicitly disabled the multisite sync threads either manually or with the 'rook-config-override' ConfigMap. This commit also recommends using two object stores when scaling Ceph object store multisite replication, with one object store configured with replication traffic disabled. Signed-off-by: Lucas Henry <polyedre@disroot.org> (cherry picked from commit 393d093)
Summary
We have encountered a performance issue with the Multisite replication feature of Ceph Object Storage while using Rook clusters. Scaling the number of rados gateways to 2 or more significantly increases the replication latency, causing delays of 40 seconds or more.
Context
We have been working with Rook clusters and investigating the performance of Ceph Object Storage Multisite replication. Initially, replication between two zones within a Ceph Realm showed excellent results, with 99% of files successfully replicated and latency consistently under 400 ms.
However, when we scaled the number of rados gateways to 2 or more using Rook, we observed a substantial degradation in performance. The p99 replication latency rose to 40 seconds or more, indicating a severe issue. The latency degradation appears to be caused by locking on the Ceph Object Storage bucket index logs.
Feature Request
To address this performance issue, I suggest that the operator distinguish between the first Rados gateway and the others.
The operator should create the first rados gateway with synchronization active, allowing it to handle replication tasks.
Subsequent rados gateways should be deployed without synchronization enabled, focusing on other purposes such as client access or data ingestion.
By implementing this approach, administrators could scale the number of gateways according to their specific needs while maintaining optimal replication performance.
Assistance Offered
We greatly appreciate the efforts of the Rook team in developing and maintaining this valuable tool. If you require any additional information or assistance to further understand the problem and its implications, we are more than willing to collaborate. We can provide any necessary details or perform additional testing to help resolve this performance issue and enhance the functionality of Rook.
Thank you for your attention to this matter.
Sincerely,