mgr/rgw: adding support for rgw multisite #47708

Merged
merged 13 commits into from Dec 7, 2022

@rkachach rkachach commented Aug 19, 2022

Fixes: https://tracker.ceph.com/issues/57160

Related PRs:
#50237
#50020
#50275
#48552

This effort builds on top of #42710 to add support for easy RGW multisite deployment and configuration. Users should be able to create the corresponding RGW entities (realm, zonegroup, and zone) and control the deployment of the RGW daemons with a simple spec such as the following:

rgw_realm: myrealm
rgw_zonegroup: myzonegroup
rgw_zone: myzone
placement:
  hosts:
   - ceph-node-2
   - ceph-node-1
spec:
  rgw_frontend_port: 8080

In addition to specifying the realm/zonegroup/zone to be created, the user can control the details of the RGW daemon deployment, such as placement and port. The RGW spec format is the same one used by cephadm, so the user can provide advanced options already supported by the cephadm RGW configuration, such as SSL settings.

Once deployed, the user can get the tokens for the available realms by running the command ceph rgw realm tokens, which prints the corresponding token for each realm.

The following is an example of this command's output:

[ceph: root@ceph-node-0 /]# ceph rgw realm tokens | jq
[
  {
    "realm": "myrealm1",
    "token": "ewogICAgInJlYWxtX25hbWUiOiAibXlyZWFsbTEiLAogICAgInJlYWxtX2lkIjogImM5MmRmM2NiLTAxYmQtNDIwMS1hODVjLTIwYjYyMmM1ZDlmMSIsCiAgICAiZW5kcG9pbnQiOiAiaHR0cDovL2NlcGgtbm9kZS0xOjU1MDAiLAogICAgImFjY2Vzc19rZXkiOiAiSTZEOUxPRExHUlNGQU1JQzVZSUkiLAogICAgInNlY3JldCI6ICIxakZORmhEWEtrSUxaanhTc3JqWW5lMXNVV3ZCVjM2NFl3NHlBTFhoIgp9"
  },
  {
    "realm": "myrealm2",
    "token": "ewogICAgInJlYWxtX25hbWUiOiAibXlyZWFsbTIiLAogICAgInJlYWxtX2lkIjogIjE5YTgzZjJhLWUzNWItNDA1Ny1hNGIxLTFiYzQxYzc3MzY4YSIsCiAgICAiZW5kcG9pbnQiOiAiaHR0cDovL2NlcGgtbm9kZS0wOjY1MDAiLAogICAgImFjY2Vzc19rZXkiOiAiSzFRTTc0RDVPQ1M5QjRTMllDN0siLAogICAgInNlY3JldCI6ICJXMDdBVVVXanlMSjd0dmlMZUtCbHpwcmp3TkVmUE9mMDJRUU12ZDB0Igp9"
  }
]
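
The sample tokens above appear to be base64-encoded JSON carrying the realm name, realm ID, endpoint, access key, and secret, so they should be handled like credentials. The following is a minimal Python sketch for inspecting one, assuming that encoding holds. `decode_realm_token` and the sample values are hypothetical and are not part of the rgw module.

```python
import base64
import json


def decode_realm_token(token: str) -> dict:
    """Decode a realm token, assuming it is base64-encoded JSON."""
    # Re-add any '=' padding that may have been stripped when copying.
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.b64decode(padded))


# Hypothetical token built locally for the demo; the field names mirror
# what the sample tokens above appear to contain.
sample = {
    "realm_name": "myrealm1",
    "endpoint": "http://ceph-node-1:5500",
    "access_key": "EXAMPLEKEY",
    "secret": "EXAMPLESECRET",
}
token = base64.b64encode(json.dumps(sample, indent=4).encode()).decode()

info = decode_realm_token(token)
print(info["realm_name"], info["endpoint"])
```

Passing a token string copied from the `ceph rgw realm tokens` output to `decode_realm_token` should work the same way if the assumption about the encoding is correct.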

Using the token, the user can easily create and sync new zones on the secondary cluster by providing a spec such as the following (note the token included in the spec). Again, the format of this spec is the same as the one used by cephadm, so the user can use all the options already supported by the orchestrator.

rgw_realm: myrealm1
rgw_zonegroup: myzonegroup1
rgw_zone: secondary-zone-1
rgw_realm_token: ewogICAgInJlYWxtX25hbWUiOiAibXlyZWFsbTEiLAogICAgInJlYWxtX2lkIjogImM5MmRmM2NiLTAxYmQtNDIwMS1hODVjLTIwYjYyMmM1ZDlmMSIsCiAgICAiZW5kcG9pbnQiOiAiaHR0cDovL2NlcGgtbm9kZS0xOjU1MDAiLAogICAgImFjY2Vzc19rZXkiOiAiSTZEOUxPRExHUlNGQU1JQzVZSUkiLAogICAgInNlY3JldCI6ICIxakZORmhEWEtrSUxaanhTc3JqWW5lMXNVV3ZCVjM2NFl3NHlBTFhoIgp9
placement:
  hosts:
   - ceph2-node-2
   - ceph2-node-1
spec:
  rgw_frontend_port: 5500

This spec is used to create the secondary zone by running the command:

ceph rgw zone create -i <spec-file>.yaml

This should create the secondary zone using the provided token and start the RGW synchronization process. The following is an example of the sync status of the newly created zone:

[ceph: root@ceph2-node-0 /]# radosgw-admin sync status
          realm c92df3cb-01bd-4201-a85c-20b622c5d9f1 (myrealm1)
      zonegroup 55e0f59c-9375-4f3c-9e93-6a58c9eadfc7 (myzonegroup1)
           zone c4135333-239c-47a8-b0f9-932c7508a7db (secondary-zone-1)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 05fa4df5-812f-4863-b77c-8716822c69b1 (myzone1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
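
A script that waits for the new zone to finish syncing could look for the "caught up" lines in this output. The Python sketch below is one way to do that, assuming the text layout shown above. `is_caught_up` is a hypothetical helper, and the check is deliberately simple: it only compares the number of data sync sources with the number of "caught up" lines.

```python
SAMPLE = """\
          realm c92df3cb-01bd-4201-a85c-20b622c5d9f1 (myrealm1)
      zonegroup 55e0f59c-9375-4f3c-9e93-6a58c9eadfc7 (myzonegroup1)
           zone c4135333-239c-47a8-b0f9-932c7508a7db (secondary-zone-1)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 05fa4df5-812f-4863-b77c-8716822c69b1 (myzone1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
"""


def is_caught_up(status_text: str) -> bool:
    """Return True if metadata and every data sync source report caught up."""
    lines = [line.strip() for line in status_text.splitlines()]
    metadata_ok = "metadata is caught up with master" in lines
    # Each data sync source is expected to have its own "caught up" line.
    sources = sum(line.startswith("data sync source:") for line in lines)
    caught = lines.count("data is caught up with source")
    return metadata_ok and sources > 0 and caught == sources


print(is_caught_up(SAMPLE))
```

In practice the text would come from running `radosgw-admin sync status` on the secondary cluster rather than from a constant.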

Note:
Batch mode is supported for both specs, so the user can provide multiple specs in one file, separated by ---
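
A batch file can be assembled by joining individual specs with a `---` line. The Python sketch below shows one way to do it; `render_spec` and `render_batch` are hypothetical helpers, and the realm, zone, host, and port values are illustrative only.

```python
def render_spec(realm, zonegroup, zone, hosts, port):
    """Render one RGW spec in the layout used by the examples above."""
    lines = [
        f"rgw_realm: {realm}",
        f"rgw_zonegroup: {zonegroup}",
        f"rgw_zone: {zone}",
        "placement:",
        "  hosts:",
    ]
    lines += [f"   - {host}" for host in hosts]
    lines += ["spec:", f"  rgw_frontend_port: {port}"]
    return "\n".join(lines) + "\n"


def render_batch(specs):
    """Join several rendered specs with the '---' document separator."""
    return "---\n".join(specs)


first = render_spec("myrealm1", "myzonegroup1", "myzone1", ["ceph-node-1"], 5500)
second = render_spec("myrealm2", "myzonegroup2", "myzone2", ["ceph-node-0"], 6500)
batch = render_batch([first, second])
print(batch)
```

The resulting text can be saved to a file and passed with `-i`, as in the `ceph rgw zone create` example above.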


@rkachach rkachach changed the title from "[WIP] adding support for multisite on cephadm" to "[WIP] mgr/rgw: adding support for multisite on cephadm" Aug 22, 2022
@rkachach rkachach changed the title from "[WIP] mgr/rgw: adding support for multisite on cephadm" to "[WIP] mgr/rgw: adding support for multisite" Aug 22, 2022
@rkachach rkachach changed the title from "[WIP] mgr/rgw: adding support for multisite" to "[WIP] mgr/rgw: adding support for rgw multisite" Aug 22, 2022
@rkachach rkachach force-pushed the fix_issue_57160 branch 3 times, most recently from 4808d24 to fb334a5 Compare August 23, 2022 11:51
@github-actions github-actions bot added the mgr label Aug 23, 2022
@rkachach rkachach changed the title from "[WIP] mgr/rgw: adding support for rgw multisite" to "mgr/rgw: adding support for rgw multisite" Aug 26, 2022
@rkachach rkachach marked this pull request as ready for review August 26, 2022 09:40
@rkachach rkachach requested review from a team as code owners August 26, 2022 09:40
@epuertat epuertat requested review from a team, aaSharma14 and bryanmontalvan and removed request for a team August 29, 2022 11:00
@rkachach

jenkins retest this please

doc/mgr/rgw.rst (outdated, resolved)
src/pybind/mgr/rgw/module.py (outdated, resolved)
src/pybind/mgr/rgw/module.py (outdated, resolved)
@cbodley

cbodley commented Sep 7, 2022

cc @yehudasa

revisiting the description from #42710, i think we need to run the ceph rgw realm bootstrap part only once to create the original realm/zonegroup/zone

additional zones need to use the ceph rgw zone create part, providing the same realm_token from bootstrap

this workflow is necessary to support deployments that span ceph clusters

@rkachach rkachach force-pushed the fix_issue_57160 branch 3 times, most recently from a76c3d7 to ef19265 Compare September 27, 2022 07:31
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Signed-off-by: Redouane Kachach <rkachach@redhat.com>

@adk3798 adk3798 left a comment

looks like all of my comments have been addressed pending any potential continuing discussions on the endpoints. Will do another in depth look over around when we start testing this but it seems to generally look good.

src/python-common/ceph/rgw/rgwam_core.py (outdated, resolved)
src/python-common/ceph/rgw/types.py (resolved)
Adding logic to modify the master zonegroup endpoints
Do not call pull realm when modifying zone
Only update the endpoints if the modified zone is master
Adding support to set custom endpoints when creating realm or zone

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
@rkachach

jenkins test make check

@rkachach rkachach requested a review from adk3798 October 14, 2022 10:09
@rkachach

jenkins test make check arm64

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
@rkachach

jenkins test make check

@rkachach

jenkins test make check arm64

@rkachach

jenkins test windows

@rkachach

jenkins test make check arm64

@rkachach

jenkins test dashboard cephadm

doc/mgr/rgw.rst (resolved)
doc/mgr/rgw.rst (resolved)
src/pybind/mgr/cephadm/serve.py (outdated, resolved)
src/pybind/mgr/rgw/module.py (outdated, resolved)
src/python-common/ceph/rgw/rgwam_core.py (outdated, resolved)
src/python-common/ceph/rgw/rgwam_core.py (resolved)
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
@adk3798

adk3798 commented Dec 5, 2022

https://pulpito.ceph.com/adking-2022-12-02_21:56:44-orch:cephadm-wip-adk-testing-2022-12-02-0932-distro-default-smithi/

Failures tracked by:

It's a lot of failures, but other than the last one, they are all things we were aware of before the run (even the dead mgr-nfs-upgrade job had a mount.nfs: mounting smithi174:/fake failed, reason given by server: No such file or directory issue). The last one (https://tracker.ceph.com/issues/58173) seems to be a failure in a test that sets flags on a pool, and given that all the PRs in this run strictly touched mgr modules, I think it's unlikely any of the attached PRs caused the failures. Overall, I think the failures should only block merging of PRs related to nfs, because https://tracker.ceph.com/issues/58145 currently prevents us from properly running our nfs-related tests.

@adk3798

adk3798 commented Dec 5, 2022

jenkins retest this please

@adk3798

adk3798 commented Dec 6, 2022

jenkins test make check

@rkachach

rkachach commented Dec 7, 2022

jenkins test dashboard cephadm

@adk3798 adk3798 merged commit 08718ae into ceph:main Dec 7, 2022
9 of 12 checks passed
5 participants