Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon entries in csi configmap is not updated for subvolume group/rados namespace in external cluster mode #10126

Closed
Madhu-1 opened this issue Apr 21, 2022 · 12 comments · Fixed by #10135
Labels

Comments

@Madhu-1
Copy link
Member

Madhu-1 commented Apr 21, 2022

The mon IP are updated only for the default cluster ID in case of the external mode and more subvolume groups created, Below is the configmap output.

apiVersion: v1
data:
  csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.0.161.89:6789","10.0.162.250:6789","10.0.163.112:6789"]},{"clusterID":"301f300cac48f552a5cd6b837c49d355","monitors":["10.0.163.80:6789"],"cephFS":{"subvolumeGroup":"cephfilesystemsubvolumegroup-storageconsumer-82b1ab88-29c0-4878-b572-fb4dda0623aa"}}]'
kind: ConfigMap

For cluster-ID rook-ceph we have all 3 monitors Rook reconciled but for the cluster-ID 301f300cac48f552a5cd6b837c49d355 we have only one Mon. Rook should update all the clusterID belonging to one ceph cluster with required mon information

AS discussed with @travisn below could be two possible approaches

data:
  csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.0.161.89:6789","10.0.162.250:6789","10.0.163.112:6789"]},{"clusterID":"301f300cac48f552a5cd6b837c49d355","monitors":["10.0.163.80:6789"],"cephFS":{"subvolumeGroup":"cephfilesystemsubvolumegroup-storageconsumer-82b1ab88-29c0-4878-b572-fb4dda0623aa"}}]'
  clusterIDMappings:'[{"clusterID":"rook-ceph","cephFS":[{"subvolumeGroupName":"cephfilesystemsubvolumegroup-123"},{"subvolumeGroupName":"cephfilesystemsubvolumegroup-567"}],"rbd":[{"radosNamespace":"radosnamespace1234"},{"radosNamespace":"radosnamespace5678"}]}]'
kind: ConfigMap

or

apiVersion: v1
data:
  csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.0.161.89:6789","10.0.162.250:6789","10.0.163.112:6789"]},{"clusterID":"301f300cac48f552a5cd6b837c49d355","monitors":["10.0.163.80:6789"],"cephFS":{"subvolumeGroup":"cephfilesystemsubvolumegroup-storageconsumer-82b1ab88-29c0-4878-b572-fb4dda0623aa"}}]'
  clusterIDMappings:'[{"rook-ceph":["clusterID1","clusterID2","clusterID3"]},{"rook-ceph-external":["clusterID4","clusterID5","clusterID6"]}]'
kind: ConfigMap

with new empty in configmap clusterIDMappings we will have a mapping between clusterID's of subvolumegroups and rados namespaces to the namespace name of ceph cluster.

@Madhu-1 Madhu-1 added the bug label Apr 21, 2022
@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

cc @travisn

@parth-gr
Copy link
Member

Are entries working fine with the normal rook-ceph mode?

@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

Are entries working fine with the normal rook-ceph mode?

I don't think so if new mon's are added/deleted (still need to test it) but rook-operator pod update should fix it i think

@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

Are entries working fine with the normal rook-ceph mode?

I don't think so if new mon's are added/deleted (still need to test it) but rook-operator pod update should fix it i think

A restart will not work because of the predicate we added, as there is no change in CR it will not get reconciled again.

@travisn
Copy link
Member

travisn commented Apr 21, 2022

I'm imagining a format slightly different from the first option you mentioned.

  • No need for any subvolumegroup info in the csi-cluster-config-json (perhaps you unintentionally included it in the example?)
  • The subvolumegroups and namespaces can be a simple list
data:
  csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.0.161.89:6789","10.0.162.250:6789","10.0.163.112:6789"]}]'
  clusterIDMappings:'[{"clusterID":"rook-ceph","cephFS":["cephfilesystemsubvolumegroup-123","cephfilesystemsubvolumegroup-567"],"rbd":["radosnamespace1234","radosnamespace5678"]}]'
kind: ConfigMap

@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

This might not work for upgraded cluster we might need to consider that one as well.

@travisn
Copy link
Member

travisn commented Apr 21, 2022

This might not work for upgraded cluster we might need to consider that one as well.

What versions do we need to worry about? It seems like a new feature that we could assume starting fresh, no?

@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

Existing cluster which already have subvolumegroups created?

@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

Am fine with new clusters also.

@travisn
Copy link
Member

travisn commented Apr 21, 2022

This has been in v1.8 so I guess there could be some clusters. Perhaps it would be simple for rook to just convert those entries to the new format if it finds them. I'll look at that.

@Madhu-1
Copy link
Member Author

Madhu-1 commented Apr 21, 2022

This has been in v1.8 so I guess there could be some clusters. Perhaps it would be simple for rook to just convert those entries to the new format if it finds them. I'll look at that.

Sounds good. Thanks @travisn

@travisn
Copy link
Member

travisn commented Apr 22, 2022

Per offline discussion, #10131 isn't feasible since the unique clusterID for each subvolumegroup and rados namespace is key to the current design.

What I want to avoid is that the operator should not have to update the mon endpoints in multiple places if there are multiple subvolumegroups and rados namespaces. The list of mons should only be defined in a single place for each cluster. What if we just add a new field to the ClusterInfo struct such as monitorsRef, which is the namespace where the mons are defined. The monitorsRef would be set instead of the monitors for subvolume groups and rados namespaces. Then when it finds a ClusterInfo with no monitors set, but a monitorsRef, the csi driver just needs to look up in the configmap based on the clusterID defined by the monitorsRef.

@Madhu-1 Madhu-1 linked a pull request Apr 22, 2022 that will close this issue
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants