
Reverting from a ceph radosgw multisite failover is not working as expected #13969

nkprince007 opened this issue Mar 22, 2024 · 2 comments

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

Ceph RGW multisite: reverting (failing back) from a failover is not working.

Expected behavior:

After failing back, the original zone on DC1 should become the master again and metadata/data sync should resume in both directions.

How to reproduce it (minimal and precise):

  • Create two new independent Ceph clusters using the Helm values provided below for DC1 and DC2
  • Apply the following Kubernetes manifests on DC1:
apiVersion: ceph.rook.io/v1
kind: CephObjectRealm
metadata:
  name: xshield-store
  namespace: rook-ceph
---
apiVersion: ceph.rook.io/v1
kind: CephObjectZoneGroup
metadata:
  name: xshield-store
  namespace: rook-ceph
spec:
  realm: xshield-store
---
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: xshield-store
  namespace: rook-ceph
spec:
  zoneGroup: xshield-store
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  preservePoolsOnDelete: false
  # recommended to set this value if ingress used for exposing rgw endpoints
  customEndpoints:
    - "https://replication-rook-onprem.colortokens.com"
    - "https://rook-onprem.colortokens.com"
---
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: xshield-store-replication
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 1
    disableMultisiteSyncTraffic: false
  zone:
    name: xshield-store
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/rewrite-target: /$1
  name: xshield-store-replication
  namespace: rook-ceph
spec:
  ingressClassName: kong-rook
  rules:
    - host: replication-rook-hdbfs.colortokens.com
      http:
        paths:
          - backend:
              service:
                name: rook-ceph-rgw-xshield-store-replication
                port:
                  number: 80
            path: /
            pathType: Prefix
  tls:
    - hosts:
        - replication-rook-hdbfs.colortokens.com
      secretName: ssl-cert
  • Copy the xshield-store-keys secret from the rook-ceph namespace on DC1 (a sketch for extracting it follows the DC2 manifests below)
  • Apply the following Kubernetes manifests on DC2:
apiVersion: v1
kind: Secret
data:
  access-key: TkY0NVZYUkpXRXhaT1U5WWIzaz0=
  secret-key: WTMxTUxXZEFMVnh3Ym05NVpUaEtiVkY3TVN0c0wxQmFZU3dwS1E9PQ==
metadata:
  name: xshield-store-keys
  namespace: rook-ceph
---
apiVersion: ceph.rook.io/v1
kind: CephObjectRealm
metadata:
  name: xshield-store
  namespace: rook-ceph
spec:
  pull:
    endpoint: https://rook-hdbfs.colortokens.com
---
apiVersion: ceph.rook.io/v1
kind: CephObjectZoneGroup
metadata:
  name: xshield-store
  namespace: rook-ceph
spec:
  realm: xshield-store
---
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: xshield-store-zone-b
  namespace: rook-ceph
spec:
  zoneGroup: xshield-store
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  preservePoolsOnDelete: false
  # recommended to set this value if ingress used for exposing rgw endpoints
  #customEndpoints:
  #  - "https://replication-rook-onprem.colortokens.com"
  #  - "https://rook-onprem.colortokens.com"
---
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: xshield-store-b
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 1
    disableMultisiteSyncTraffic: false
  zone:
    name: xshield-store-zone-b
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/rewrite-target: /$1
  name: xshield-store-b
  namespace: rook-ceph
spec:
  ingressClassName: kong-rook
  rules:
    - host: rook-hdbfs1.colortokens.com
      http:
        paths:
          - backend:
              service:
                name: rook-ceph-rgw-xshield-store-b
                port:
                  number: 80
            path: /
            pathType: Prefix
  tls:
    - hosts:
        - rook-hdbfs1.colortokens.com
      secretName: ssl-cert
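
The access-key and secret-key in the Secret above are the realm keys gathered from DC1 (the xshield-store-keys secret). A minimal sketch for copying them across, assuming the default Rook secret name from the step above and an illustrative kubectl context named dc2:

# On DC1: export the realm keys secret created by Rook for the CephObjectRealm
kubectl -n rook-ceph get secret xshield-store-keys -o yaml > xshield-store-keys.yaml

# Strip cluster-specific metadata (uid, resourceVersion, creationTimestamp, ownerReferences),
# then apply it on DC2 before creating the CephObjectRealm with the pull endpoint
kubectl --context dc2 -n rook-ceph apply -f xshield-store-keys.yaml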

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary

Helm chart values for rook-ceph-cluster v1.13.7 for DC1

operatorNamespace: rook-ceph
clusterName:
kubeVersion:
configOverride:

# Installs a debugging toolbox deployment
toolbox:
  image: quay.io/ceph/ceph:v18.2.2
  enabled: true
  resources:
    limits:
      cpu: "500m"
      memory: "1Gi"
    requests:
      cpu: "50m"
      memory: "64Mi"
monitoring:
  enabled: true
  createPrometheusRules: true
  rulesNamespaceOverride:
  prometheusRule:
    labels: {}
    annotations: {}
pspEnable: false
imagePullSecrets: []
cephClusterSpec:
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.2
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  waitTimeoutForHealthyOSDInMinutes: 10
  mon:
    count: 2
    allowMultiplePerNode: true
  mgr:
    count: 2
    allowMultiplePerNode: true
    modules:
      - name: pg_autoscaler
        enabled: true
  dashboard:
    enabled: true
    ssl: false
  network:
    connections:
      encryption:
        enabled: false
      compression:
        enabled: false
      requireMsgr2: false
  crashCollector:
    disable: false
  logCollector:
    enabled: true
    periodicity: daily
    maxLogSize: 500M
  cleanupPolicy:
    confirmation: ""
    sanitizeDisks:
      method: quick
      dataSource: zero
      iteration: 1
  resources:
    mgr:
      limits:
        cpu: "1000m"
        memory: "1Gi"
      requests:
        cpu: "100m"
        memory: "512Mi"
    mon:
      limits:
        cpu: "2000m"
        memory: "2Gi"
      requests:
        cpu: "100m"
        memory: "256Mi"
    osd:
      limits:
        cpu: "2000m"
        memory: "4Gi"
      requests:
        cpu: "100m"
        memory: "512Mi"
    prepareosd:
      requests:
        cpu: "500m"
        memory: "50Mi"
    mgr-sidecar:
      limits:
        cpu: "100m"
        memory: "100Mi"
      requests:
        cpu: "100m"
        memory: "40Mi"
    crashcollector:
      limits:
        cpu: "500m"
        memory: "60Mi"
      requests:
        cpu: "100m"
        memory: "60Mi"
    logcollector:
      limits:
        cpu: "500m"
        memory: "1Gi"
      requests:
        cpu: "100m"
        memory: "100Mi"
    cleanup:
      limits:
        cpu: "500m"
        memory: "1Gi"
      requests:
        cpu: "100m"
        memory: "100Mi"
  priorityClassNames:
    mon: system-node-critical
    osd: system-node-critical
    mgr: system-cluster-critical
    storageClassDeviceSets: []
  disruptionManagement:
    managePodBudgets: true
    pgHealthCheckTimeout: 0
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    # Change pod liveness probe, it works for all mon, mgr, and osd pods.
    livenessProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false
ingress:
  dashboard:
    enabled: false
cephBlockPools: []
cephFileSystems: []
cephFileSystemVolumeSnapshotClass: {}
cephBlockPoolsVolumeSnapshotClass: {}
cephObjectStores:
  - name: xshield-store
    spec:
      preservePoolsOnDelete: true
      gateway:
        port: 80
        resources:
          limits:
            cpu: "2000m"
            memory: "2Gi"
          requests:
            cpu: "250m"
            memory: "512Mi"
        # securePort: 443
        # sslCertificateRef:
        instances: 3
        priorityClassName: system-cluster-critical
        disableMultisiteSyncTraffic: true
      zone:
        name: xshield-store
    storageClass:
      enabled: true
      name: ceph-bucket
      reclaimPolicy: Delete
      volumeBindingMode: "Immediate"
      # see https://github.com/rook/rook/blob/master/Documentation/ceph-object-bucket-claim.md#storageclass for available configuration
      parameters:
        # note: objectStoreNamespace and objectStoreName are configured by the chart
        region: us-east-1
    ingress:
      enabled: true
      annotations:
        ingress.kubernetes.io/rewrite-target: /$1
      host:
        # name: rook-onprem.colortokens.com
        path: "/"
        name: rook-hdbfs.colortokens.com
      # tls:
      # - hosts:
      #   - rook-onprem.colortokens.com
      #   secretName: ssl-cert
      ingressClassName: kong-rook
      tls:
        - secretName: ssl-cert
          hosts:
            - rook-hdbfs.colortokens.com
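
A minimal sketch for inspecting the resulting multisite configuration from the DC1 toolbox (the toolbox deployment name and the zone/zonegroup names assume the values and manifests above):

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- radosgw-admin realm list
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- radosgw-admin zonegroup get --rgw-zonegroup=xshield-store
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- radosgw-admin period get   # shows the current master zone and endpoints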

Helm chart values for DC2

operatorNamespace: rook-ceph
clusterName:
kubeVersion:
configOverride:

# Installs a debugging toolbox deployment
toolbox:
  image: quay.io/ceph/ceph:v18.2.2
  enabled: true
  resources:
    limits:
      cpu: "500m"
      memory: "1Gi"
    requests:
      cpu: "50m"
      memory: "64Mi"
monitoring:
  enabled: true
  createPrometheusRules: true
  rulesNamespaceOverride:
  prometheusRule:
    labels: {}
    annotations: {}
pspEnable: false
imagePullSecrets: []
cephClusterSpec:
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.2
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  waitTimeoutForHealthyOSDInMinutes: 10
  mon:
    count: 2
    allowMultiplePerNode: true
  mgr:
    count: 2
    allowMultiplePerNode: true
    modules:
      - name: pg_autoscaler
        enabled: true
  dashboard:
    enabled: true
    ssl: false
  network:
    connections:
      encryption:
        enabled: false
      compression:
        enabled: false
      requireMsgr2: false
  crashCollector:
    disable: false
  logCollector:
    enabled: true
    periodicity: daily
    maxLogSize: 500M
  cleanupPolicy:
    confirmation: ""
    sanitizeDisks:
      method: quick
      dataSource: zero
      iteration: 1
  resources:
    mgr:
      limits:
        cpu: "1000m"
        memory: "1Gi"
      requests:
        cpu: "100m"
        memory: "512Mi"
    mon:
      limits:
        cpu: "2000m"
        memory: "2Gi"
      requests:
        cpu: "100m"
        memory: "256Mi"
    osd:
      limits:
        cpu: "2000m"
        memory: "4Gi"
      requests:
        cpu: "100m"
        memory: "512Mi"
    prepareosd:
      requests:
        cpu: "500m"
        memory: "50Mi"
    mgr-sidecar:
      limits:
        cpu: "100m"
        memory: "100Mi"
      requests:
        cpu: "100m"
        memory: "40Mi"
    crashcollector:
      limits:
        cpu: "500m"
        memory: "60Mi"
      requests:
        cpu: "100m"
        memory: "60Mi"
    logcollector:
      limits:
        cpu: "500m"
        memory: "1Gi"
      requests:
        cpu: "100m"
        memory: "100Mi"
    cleanup:
      limits:
        cpu: "500m"
        memory: "1Gi"
      requests:
        cpu: "100m"
        memory: "100Mi"
  priorityClassNames:
    mon: system-node-critical
    osd: system-node-critical
    mgr: system-cluster-critical
    storageClassDeviceSets: []
  disruptionManagement:
    managePodBudgets: true
    pgHealthCheckTimeout: 0
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    # Change pod liveness probe, it works for all mon, mgr, and osd pods.
    livenessProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false
ingress:
  dashboard:
    enabled: false
cephBlockPools: []
cephFileSystems: []
cephFileSystemVolumeSnapshotClass: {}
cephBlockPoolsVolumeSnapshotClass: {}
cephObjectStores: []

Logs to submit:

Failing back to DC1 is not working; the period commit fails:

[root@rook-ceph-tools-5cb7d9cf4d-sq52c /]# radosgw-admin period commit
failed to commit period: (2) No such file or directory
failed to commit period: (2) No such file or directory
2024-03-22T13:45:22.800+0000 7f97ec0b1a80  0 period failed to read sync status: (2) No such file or directory
2024-03-22T13:45:22.800+0000 7f97ec0b1a80  0 failed to update metadata sync status: (2) No such file or directory
[root@rook-ceph-tools-5cb7d9cf4d-sq52c /]# radosgw-admin period update --commit
failed to commit period: (2) No such file or directory
2024-03-22T13:46:50.865+0000 7f6c4c7f3a80  0 period failed to read sync status: (2) No such file or directory
2024-03-22T13:46:50.865+0000 7f6c4c7f3a80  0 failed to update metadata sync status: (2) No such file or directory
failed to commit period: (2) No such file or directory
[root@rook-ceph-tools-5cb7d9cf4d-sq52c /]# radosgw-admin period update --commit
failed to commit period: 2024-03-22T13:47:52.194+0000 7f8e252f6a80  0 period failed to read sync status: (2) No such file or directory
2024-03-22T13:47:52.194+0000 7f8e252f6a80  0 failed to update metadata sync status: (2) No such file or directory
(2) No such file or directory
failed to commit period: (2) No such file or directory
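
For reference, the failback sequence described in the linked Ceph multisite documentation is roughly the following, run from the toolbox on the recovering cluster (DC1). Zone names and the DC2 endpoint are taken from the manifests above, the realm keys are placeholders, and the restart uses Rook's standard rook-ceph-rgw app label; the failure above occurs at the period commit step.

# 1. Pull the latest realm configuration from the current master zone (DC2)
radosgw-admin realm pull --url=https://rook-hdbfs1.colortokens.com --access-key=<access-key> --secret=<secret-key>

# 2. Make the recovered zone on DC1 the master and default zone again
radosgw-admin zone modify --rgw-zone=xshield-store --master --default

# 3. Commit the updated period (the step failing above with "No such file or directory")
radosgw-admin period update --commit

# 4. Restart the RGW pods so they pick up the new period
kubectl -n rook-ceph rollout restart deploy -l app=rook-ceph-rgw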

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 20.04.6 LTS (Focal Fossa)
  • Kernel (e.g. uname -a): Linux kubespray 5.4.0-174-generic #193-Ubuntu SMP Thu Mar 7 14:29:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration: VMware vSphere
  • Rook version (use rook version inside of a Rook Pod): 1.13.7
  • Storage backend version (e.g. for ceph do ceph -v): ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
  • Kubernetes version (use kubectl version):
    Client Version: v1.27.7
    Kustomize Version: v5.0.1
    Server Version: v1.27.7
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
    Bare metal with kubespray
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
    HEALTH_OK on both sites
@nkprince007 (Author) commented:

I would also like to know the answers to the following questions.

  1. I performed a manual failover to DC2 (reference: https://docs.ceph.com/en/latest/radosgw/multisite/#setting-up-failover-to-the-secondary-zone), then created a bucket and pushed some data into it on the DC2 cluster. However, that data was not synced back to DC1. To double-check, I pushed some data into the former master zone on DC1 and, to my surprise, it was synced to DC2.

Is this the intended behavior?

  2. Running the radosgw-admin sync status command works only on DC2, not on DC1:
DC2 # radosgw-admin sync status
          realm c0e21ab6-57a2-4339-96b8-9aa4b7c5ea52 (xshield-store)
      zonegroup 63695c1b-316e-4836-a0a9-2021dc4c74b1 (xshield-store)
           zone 78a175ba-62f0-4ba1-8217-aea003125c1e (xshield-store-zone-b)
   current time 2024-03-22T15:52:46Z
zonegroup features enabled: resharding
                   disabled: compress-encrypted
  metadata sync no sync (zone is master)
      data sync source: f934eba3-d647-4543-aed1-216e317e31a2 (xshield-store)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
DC1 # radosgw-admin sync status
          realm c0e21ab6-57a2-4339-96b8-9aa4b7c5ea52 (xshield-store)
      zonegroup 63695c1b-316e-4836-a0a9-2021dc4c74b1 (xshield-store)
           zone f934eba3-d647-4543-aed1-216e317e31a2 (xshield-store)
   current time 2024-03-22T15:53:51Z
zonegroup features enabled: resharding
                   disabled: compress-encrypted
  metadata sync failed to read sync status: (2) No such file or directory
2024-03-22T15:53:52.688+0000 7f338ed56a80  0 ERROR: failed to fetch datalog info
      data sync source: 78a175ba-62f0-4ba1-8217-aea003125c1e (xshield-store-zone-b)
                        failed to retrieve sync info: (5) Input/output error
  3. What does "pull the latest realm configuration" mean - just a pull of the realm metadata, or the data too? Reference:

[Screenshot of the referenced documentation, 2024-03-22]
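
For context on question 3, the command the documentation refers to is radosgw-admin realm pull. A minimal sketch, with the endpoint taken from the DC2 CephObjectRealm manifest above and placeholder keys:

# Pulls only the realm and period configuration (zonegroups, zones, endpoints, master zone);
# bucket and object data are replicated separately by the RGW data sync process
radosgw-admin realm pull --url=https://rook-hdbfs.colortokens.com --access-key=<access-key> --secret=<secret-key>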

@thotz (Contributor) commented Apr 3, 2024:

Even though the Ceph part may work as per the documentation, in Rook we also have CRDs. We may need to update the CRDs as well, because from the Rook operator's point of view DC1 is still the master. @alimaredia any thoughts?
