
Cannot make use of RADOS namespace for external cluster with EC block pool #13633

Closed
bauerjs1 opened this issue Jan 26, 2024 · 27 comments · Fixed by #13769

bauerjs1 commented Jan 26, 2024

When using an erasure-coded data pool for RBDs, an additional replicated metadata pool must exist. Since Ceph does not support RADOS namespaces in EC pools, I created the namespace only in the metadata pool. When I pass the flags

--rbd-data-pool-name ceph-block
--rbd-metadata-ec-pool-name ceph-block-metadata # AFAIK the metadata pool can't be EC, so the flag name is a bit misleading
--rados-namespace staging

to create-external-cluster-resources.py, the script looks for namespace staging in the EC data pool ceph-block (where it cannot exist) instead of the metadata pool.

How can I use tenant isolation with RADOS namespaces for external clusters when the data pool for RBDs is erasure-coded?

Thanks in advance!

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:
The create-external-cluster-resources.py script fails with

The provided rados Namespace, 'staging', is not found in the pool 'ceph-block'

Expected behavior:
To my understanding, the script should look for the provided namespace in the metadata block pool if the data pool is erasure-coded (please correct me if I am wrong here)

How to reproduce it:
Apply below resources in source cluster and try to extract information for a Rook external consumer cluster with

python3 create-external-cluster-resources.py --rbd-data-pool-name ceph-block --rbd-metadata-ec-pool-name ceph-block-metadata --rados-namespace staging

File(s) to submit:
Resources created in the source cluster:

# metadata pool (for use in consumer StorageClass.parameters.pool)
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ceph-block-metadata
spec:
  replicated:
    size: 3
---
# data pool (for use in consumer StorageClass.parameters.dataPool)
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ceph-block
spec:
  erasureCoded:
    codingChunks: 1
    dataChunks: 2
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPoolRadosNamespace
metadata:
  name: staging
spec:
  blockPoolName: ceph-block-metadata
  name: staging

Environment:
Source cluster:

  • OS: Ubuntu 20.04
  • Kernel: 5.15.0
  • Rook version: 1.13.3
  • Storage backend version: 18.2.1
  • Kubernetes version: 1.25
  • Kubernetes cluster type: bare metal
bauerjs1 (Author) commented Feb 9, 2024

Any ideas? Otherwise, I would probably go with a replicated pool for block storage.

subhamkrai (Contributor) commented:

@bauerjs1 Something seems to be wrong; please wait for @parth-gr to respond, he will be back next week. Thanks

parth-gr (Member) commented Feb 12, 2024

@bauerjs1 We currently do not support RADOS namespaces with EC block pools; I think even Ceph does not.

To my understanding, the script should look for the provided namespace in the metadata block pool if the data pool is erasure-coded (please correct me if I am wrong here)

That can be done, but that is just a check in the Python script; we would need that support from the Ceph backend.

If you have any recommendations on how to use it, please let us know.

cc @travisn

bauerjs1 (Author) commented:

Yes, AFAIK Ceph does not support namespaces in EC block pools.

To my understanding, the script should look for the provided namespace in the metadata block pool if the data pool is erasure-coded (please correct me if I am wrong here)

That can be done, but that is just a check in the Python script; we would need that support from the Ceph backend.

If you have any recommendations on how to use it, please let us know.

I am not sure what you mean here.
Shouldn't it be sufficient that, if --rbd-metadata-ec-pool-name is provided, the script looks for the --rados-namespace there (instead of in the data pool)? This should be supported by Ceph; I was able to do that from the CLI.

However, I am unsure whether the construct of a namespaced metadata pool with a non-namespaced data pool makes sense at all. I am far from a Ceph expert; that approach just felt most intuitive to me, to be honest.
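
For reference, creating and checking the namespace in the replicated metadata pool from the CLI looked roughly like this (a sketch using the pool and namespace names from my setup):

# create the namespace in the replicated metadata pool (EC pools cannot hold RBD namespaces)
rbd namespace create --pool ceph-block-metadata --namespace staging

# verify that it exists
rbd namespace ls --pool ceph-block-metadata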

parth-gr (Member) commented:

I tried creating a RADOS namespace in an EC pool and it was successful. I then created an EC StorageClass (updating the clusterID through the RADOS namespace status) and created a PVC.

But I got the warning below on the PVC, and it is still in Pending state; not sure how it worked for you @bauerjs1

Type     Reason                Age                  From                                                                                                                Message
  ----     ------                ----                 ----                                                                                                                -------
  Normal   Provisioning          8s (x9 over 2m16s)   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-57488675f8-mvhh6_e21e5f28-22bf-46eb-9469-3fe1119588ee  External provisioner is provisioning volume for claim "openshift-storage/rbd-pvc-test"
  Warning  ProvisioningFailed    8s (x9 over 2m16s)   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-57488675f8-mvhh6_e21e5f28-22bf-46eb-9469-3fe1119588ee  failed to provision volume with StorageClass "rook-ceph-block-ec": rpc error: code = InvalidArgument desc = failed to fetch monitor list using clusterID (cb04b59bcd4b2535c5cb1b1ebec4c59b): missing configuration for cluster ID "cb04b59bcd4b2535c5cb1b1ebec4c59b"
  Normal   ExternalProvisioning  1s (x11 over 2m16s)  persistentvolume-controller                                                                                         Waiting for a volume to be created either by the external provisioner 'openshift-storage.rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
[rider@localhost rbd]$
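
For context, the EC StorageClass I created was roughly along these lines (illustrative sketch only; the pool and secret names are assumptions, and the clusterID is taken from the CephBlockPoolRadosNamespace status):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-ec
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: cb04b59bcd4b2535c5cb1b1ebec4c59b  # from the CephBlockPoolRadosNamespace status
  pool: replicated-metadata-pool               # replicated metadata pool (name assumed)
  dataPool: ec-pool                            # erasure-coded data pool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner        # secret names assumed
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
reclaimPolicy: Delete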

cc @Madhu-1 does this need a fix in CSI?

parth-gr (Member) commented:

[rider@localhost rbd]$ oc get CephBlockPoolRadosNamespace -o yaml
apiVersion: v1
items:
- apiVersion: ceph.rook.io/v1
  kind: CephBlockPoolRadosNamespace
  metadata:
    creationTimestamp: "2024-02-13T15:09:40Z"
    finalizers:
    - cephblockpoolradosnamespace.ceph.rook.io
    generation: 1
    managedFields:
    - apiVersion: ceph.rook.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:blockPoolName: {}
      manager: kubectl-create
      operation: Update
      time: "2024-02-13T15:09:40Z"
    - apiVersion: ceph.rook.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"cephblockpoolradosnamespace.ceph.rook.io": {}
      manager: rook
      operation: Update
      time: "2024-02-13T15:09:40Z"
    - apiVersion: ceph.rook.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:info:
            .: {}
            f:clusterID: {}
          f:phase: {}
      manager: rook
      operation: Update
      subresource: status
      time: "2024-02-13T15:10:58Z"
    name: namespace-a
    namespace: openshift-storage
    resourceVersion: "3688541"
    uid: 8fde04f9-1a40-4098-b552-a0978e6f4146
  spec:
    blockPoolName: ec-pool
  status:
    info:
      clusterID: cb04b59bcd4b2535c5cb1b1ebec4c59b
    phase: Failure
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

parth-gr (Member) commented:

Oh, I see now that even the RADOS namespace is in Failure state, so we can't create a RADOS namespace on an EC pool.

bauerjs1 (Author) commented:

Yes, AFAIK Ceph does not support namespaces in EC block pools.

Yes, unfortunately Ceph doesn't support this. So the idea was to create the namespaces only in the metadata pool (which is always replicated and not erasure-coded). I can confirm that Rook's RADOS namespace objects are successfully created on the metadata pool:

apiVersion: ceph.rook.io/v1
kind: CephBlockPoolRadosNamespace
metadata:
  name: staging
  namespace: storage
spec:
  blockPoolName: ceph-block-metadata
  name: staging
status:
  info:
    clusterID: 429b...
  phase: Ready

However, there is currently no way to pass this to the create-external-cluster-resources.py script.
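
To make the request concrete, the check I would expect the script to perform is roughly the following (an illustrative sketch only, not the actual create-external-cluster-resources.py code; it shells out to the rbd CLI):

import subprocess
import sys

def namespace_exists(pool, namespace):
    # 'rbd namespace ls' lists the RADOS namespaces that exist in the given pool
    out = subprocess.run(["rbd", "namespace", "ls", "--pool", pool],
                         capture_output=True, text=True, check=True).stdout
    return namespace in out.split()

def validate_rados_namespace(data_pool, metadata_ec_pool, namespace):
    # EC data pools cannot hold RADOS namespaces, so check the replicated
    # metadata pool whenever one is provided instead of the EC data pool
    pool_to_check = metadata_ec_pool or data_pool
    if not namespace_exists(pool_to_check, namespace):
        sys.exit(f"The provided rados Namespace, '{namespace}', "
                 f"is not found in the pool '{pool_to_check}'")

validate_rados_namespace("ceph-block", "ceph-block-metadata", "staging")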

parth-gr added a commit to parth-gr/rook that referenced this issue Feb 14, 2024
currently addded the support for rados namesapce
for rbd ec pools upstream

closes: rook#13633

Signed-off-by: parth-gr <partharora1010@gmail.com>
parth-gr (Member) commented Feb 14, 2024

@bauerjs1 Is this what you are looking for: #13769?
Can you try it now?

parth-gr (Member) commented:

@bauerjs1 Can you try and test the new script changes in #13769 and let me know if that is the fix you wanted, as the requirements were quite confusing?

bauerjs1 (Author) commented:

Sorry for the delay @parth-gr, I've been quite busy with other things over the past few weeks, but I'm on it again now. I will let you know as soon as I have tested it.

bauerjs1 (Author) commented Feb 22, 2024

I suppose it's a different problem, but I am now getting the error

Execution Failed: Only 'v2' address type is enabled, user should also enable 'v1' type as well

so I am not yet able to tell if this works now.

PR #8083 seems to be the origin of the error message, but I have no clue why this fails or how to resolve it. Since I enabled encryption, I passed the flag --v2-port-enable as explained in the docs.

Networking config:

  network:
    connections:
      compression:
        enabled: false
      encryption:
        enabled: true
      requireMsgr2: false
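
For completeness, the script invocation I used with encryption enabled looked roughly like this:

python3 create-external-cluster-resources.py \
  --rbd-data-pool-name ceph-block \
  --rbd-metadata-ec-pool-name ceph-block-metadata \
  --rados-namespace staging \
  --v2-port-enable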

parth-gr (Member) commented:

@bauerjs1 So were you able to test the original change for EC pools?

bauerjs1 (Author) commented Feb 22, 2024

Yes, I tried to test it, but I'm afraid I can't tell you whether the original issue is solved, because the script still fails as mentioned above.

parth-gr (Member) commented:

@bauerjs1 can you show the output of
ceph quorum_status

bauerjs1 (Author) commented:

This is the output:

{
  "election_epoch": 12,
  "quorum": [
    0,
    1,
    2
  ],
  "quorum_names": [
    "a",
    "b",
    "c"
  ],
  "quorum_leader_name": "a",
  "quorum_age": 505050,
  "features": {
    "quorum_con": "4540138322906710015",
    "quorum_mon": [
      "kraken",
      "luminous",
      "mimic",
      "osdmap-prune",
      "nautilus",
      "octopus",
      "pacific",
      "elector-pinging",
      "quincy",
      "reef"
    ]
  },
  "monmap": {
    "epoch": 3,
    "fsid": "49b5d4a6-a0bb-4c8f-9736-1f57ba3a5425",
    "modified": "2024-02-22T12:29:01.352672Z",
    "created": "2024-02-22T12:28:22.909053Z",
    "min_mon_release": 18,
    "min_mon_release_name": "reef",
    "election_strategy": 1,
    "disallowed_leaders: ": "",
    "stretch_mode": false,
    "tiebreaker_mon": "",
    "removed_ranks: ": "",
    "features": {
      "persistent": [
        "kraken",
        "luminous",
        "mimic",
        "osdmap-prune",
        "nautilus",
        "octopus",
        "pacific",
        "elector-pinging",
        "quincy",
        "reef"
      ],
      "optional": []
    },
    "mons": [
      {
        "rank": 0,
        "name": "a",
        "public_addrs": {
          "addrvec": [
            {
              "type": "v2",
              "addr": "10.233.5.247:3300",
              "nonce": 0
            }
          ]
        },
        "addr": "10.233.5.247:3300/0",
        "public_addr": "10.233.5.247:3300/0",
        "priority": 0,
        "weight": 0,
        "crush_location": "{}"
      },
      {
        "rank": 1,
        "name": "b",
        "public_addrs": {
          "addrvec": [
            {
              "type": "v2",
              "addr": "10.233.44.101:3300",
              "nonce": 0
            }
          ]
        },
        "addr": "10.233.44.101:3300/0",
        "public_addr": "10.233.44.101:3300/0",
        "priority": 0,
        "weight": 0,
        "crush_location": "{}"
      },
      {
        "rank": 2,
        "name": "c",
        "public_addrs": {
          "addrvec": [
            {
              "type": "v2",
              "addr": "10.233.24.101:3300",
              "nonce": 0
            }
          ]
        },
        "addr": "10.233.24.101:3300/0",
        "public_addr": "10.233.24.101:3300/0",
        "priority": 0,
        "weight": 0,
        "crush_location": "{}"
      }
    ]
  }
}

I see that the mons have only v2 ports enabled, but I did not find a way to change that. I thought requireMsgr2: false would do it; I bootstrapped a fresh cluster with this setting, but it didn't help.

parth-gr (Member) commented:

This is really a Ceph configuration issue, not related to the external script.

Currently the external script checks either that v1 exists or that both v1 and v2 exist.

@travisn should we support the case where only v2 exists?

bauerjs1 (Author) commented:

Since this cluster is set up by Rook, is there a way to enable the v1 ports, e.g. in CephCluster.spec?

travisn (Member) commented Feb 28, 2024

With the network encryption enabled, the v2 ports are enabled. If you change it to false, the v1 ports should be available.

    encryption:
        enabled: true
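
For example, a CephCluster network section that should keep the v1 ports available would look roughly like this (a sketch, mirroring the config shown above):

  network:
    connections:
      encryption:
        enabled: false  # with encryption disabled, the mons also listen on the v1 port (6789)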

parth-gr (Member) commented:

So this is for external mode, let's discuss.

bauerjs1 (Author) commented Feb 29, 2024

With the network encryption enabled, the v2 ports are enabled. If you change it to false, the v1 ports should be available.

    encryption:
        enabled: true

Hm, I still don't understand why v1 ports are required. Doesn't this render encryption unusable for external clusters?

According to the docs, I'd just have to use the --v2-port-enable flag when using encryption. I did that, but it does not seem to remove the dependency on v1 ports. Is the documentation wrong here?

parth-gr (Member) commented Mar 4, 2024

@bauerjs1 Yes, those are really the internal-mode settings.

With the fix in #13856 you can use just the v2 port.

parth-gr (Member) commented Mar 5, 2024

@bauerjs1 So #13856 is fixed; can you try out #13769?

bauerjs1 (Author) commented:

Sorry for the delay, I'll hopefully get to this in the next couple of days. Thanks in advance for the effort!

bauerjs1 (Author) commented Mar 21, 2024

Sorry for the late feedback. I've commented on both PRs.

I'm still having several issues and errors with the script. Since my source cluster is also created by Rook, I wonder if it is possible to do everything in a declarative way, which would be totally awesome! AFAIK, the script mainly creates new users and keys. Could this be done with Rook CRs like CephObjectStoreUser and CephClient, syncing the generated secrets into the consumer cluster? If that is possible, I'd also be happy to contribute to the ceph-cluster Helm chart to create the required resources for external access 🙂
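
To illustrate what I have in mind (a purely hypothetical sketch; I don't know which caps an external consumer actually needs), something like:

apiVersion: ceph.rook.io/v1
kind: CephClient
metadata:
  name: external-consumer
  namespace: storage
spec:
  caps:
    mon: 'profile rbd'
    osd: 'profile rbd pool=ceph-block-metadata, profile rbd pool=ceph-block'

with the generated client secret then synced into the consumer cluster.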

parth-gr (Member) commented:

No, we don't have admin privileges with the Rook client in external mode, so it can't create anything.

bauerjs1 (Author) commented Mar 22, 2024

Yeah, but the source cluster isn't running in external mode, only the consumer cluster is. Of course, I'd need to create the required CRs in the source cluster. Wouldn't that be possible? Is there any documentation on what the "consumer" Rook needs in the source cluster?

I can also open a new issue for that, if you want, so we don't mix up too many topics here.

mergify bot pushed a commit that referenced this issue Apr 22, 2024
currently addded the support for rados namesapce
for rbd ec pools upstream

closes: #13633

Signed-off-by: parth-gr <partharora1010@gmail.com>
(cherry picked from commit adf40b1)