
csi: add/remove mon IP from csi config #10135

Merged: 1 commit merged into rook:master on Apr 22, 2022

Conversation

@Madhu-1 (Member) commented Apr 22, 2022

Description of your changes:

If a mon is added or removed in a Ceph cluster, all the CSI clusterIDs belonging to that cluster should also get updated.

Added a new cluster identifier to the CSI configmap, i.e., the Ceph cluster namespace, which helps identify which clusterIDs belong to that Ceph cluster.

Note: to support upgraded clusters, we can ask users to update the mon IPs for the problematic clusterIDs themselves, as those entries won't be updated by the operator.

Signed-off-by: Madhu Rajanna madhupr007@gmail.com

Which issue is resolved by this Pull Request:
Resolves #10126
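
For context, a minimal sketch of the config entry shape this change implies, inferred from the csi-cluster-config-json output quoted in the comments below (the struct and field names here are assumptions for illustration, not the verbatim Rook source):

type CsiClusterConfigEntry struct {
	ClusterID string   `json:"clusterID"`
	Monitors  []string `json:"monitors"`
	// New field: the namespace of the owning Ceph cluster, so the operator
	// can find every clusterID that needs a mon refresh. It shows up as
	// "clusterIdentifier" in the interim test output below and as
	// "namespace" in the final run.
	Namespace string `json:"namespace"`
}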

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@travisn this is the other approach as discussed. PTAL

@Madhu-1 (Member, Author) commented Apr 22, 2022

Manual test results

Without this PR

  • mon entries in configmap
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.100.33.92:6789"]},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.100.33.92:6789"],"cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.100.33.92:6789"],"radosNamespace":"namespace-a"}]'
  • Increase mon count
kuberc get po
NAME                                    READY   STATUS      RESTARTS   AGE
rook-ceph-mds-myfs-a-7f6d54b444-zs269   1/1     Running     0          6m22s
rook-ceph-mds-myfs-b-5c784676db-9x7rv   1/1     Running     0          6m21s
rook-ceph-mgr-a-575c855584-s25bn        1/1     Running     0          7m
rook-ceph-mon-a-656df96ff8-l9z2b        1/1     Running     0          7m26s
rook-ceph-mon-b-c4c6768c4-2gzwj         1/1     Running     0          38s
rook-ceph-mon-c-fbf5668b8-x66gc         1/1     Running     0          26s
rook-ceph-operator-8597df65d-m7sqt      1/1     Running     0          9m38s
rook-ceph-osd-0-b5d869559-8m7q2         1/1     Running     0          6m32s
rook-ceph-osd-prepare-minikube-hwfqv    0/1     Completed   0          8s
rook-ceph-tools-55b77678f-6q69b         1/1     Running     0          9m36s
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.104.122.171:6789","10.109.207.178:6789","10.100.33.92:6789"]},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.100.33.92:6789"],"cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.100.33.92:6789"],"radosNamespace":"namespace-a"}]'

Note: the mon IPs are not updated for clusterIDs other than rook-ceph.

With this PR

  • Create new subvolume and rados namespace
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.100.33.92:6789","10.104.122.171:6789","10.109.207.178:6789"],"clusterIdentifier":""},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.100.33.92:6789"],"clusterIdentifier":"","cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.100.33.92:6789"],"clusterIdentifier":"","radosNamespace":"namespace-a"},{"clusterID":"cb84d01f680eec8c88328f785c51221b","monitors":["10.100.33.92:6789","10.104.122.171:6789","10.109.207.178:6789"],"clusterIdentifier":"rook-ceph","radosNamespace":"namespace-new"},{"clusterID":"0fe5aea972cdb15acfd7891f5953f9e2","monitors":["10.100.33.92:6789","10.104.122.171:6789","10.109.207.178:6789"],"clusterIdentifier":"rook-ceph","cephFS":{"subvolumeGroup":"group-new"}}]'
  • Increase the mon count
[🎩︎]mrajanna@fedora examples $]kuberc get po
NAME                                    READY   STATUS      RESTARTS   AGE
rook-ceph-mds-myfs-a-7f6d54b444-zs269   1/1     Running     0          27m
rook-ceph-mds-myfs-b-5c784676db-9x7rv   1/1     Running     0          27m
rook-ceph-mgr-a-575c855584-s25bn        1/1     Running     0          27m
rook-ceph-mon-a-656df96ff8-l9z2b        1/1     Running     0          28m
rook-ceph-mon-b-c4c6768c4-2gzwj         1/1     Running     0          21m
rook-ceph-mon-c-fbf5668b8-x66gc         1/1     Running     0          21m
rook-ceph-mon-d-74f5886db6-xfkl8        1/1     Running     0          17m
rook-ceph-mon-e-5594766db4-dhjcq        1/1     Running     0          17m
rook-ceph-mon-f-644fc6ccdd-44hgs        1/1     Running     0          13m
rook-ceph-mon-g-56f889ffcb-k6bzq        1/1     Running     0          12m
rook-ceph-operator-8689858c86-n6859     1/1     Running     0          17s
rook-ceph-osd-0-56ffbc8497-czgmq        1/1     Running     0          12m
rook-ceph-osd-prepare-minikube-bk6dg    0/1     Completed   0          113s
rook-ceph-tools-55b77678f-6q69b         1/1     Running     0          30m
  • configmap updated for all entries carrying a clusterIdentifier, as well as the default clusterID entry
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.100.33.92:6789","10.104.122.171:6789","10.110.31.239:6789","10.104.116.14:6789","10.110.66.28:6789","10.97.202.54:6789","10.109.207.178:6789"],"clusterIdentifier":""},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.100.33.92:6789"],"clusterIdentifier":"","cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.100.33.92:6789"],"clusterIdentifier":"","radosNamespace":"namespace-a"},{"clusterID":"cb84d01f680eec8c88328f785c51221b","monitors":["10.100.33.92:6789","10.104.122.171:6789","10.110.31.239:6789","10.104.116.14:6789","10.110.66.28:6789","10.97.202.54:6789","10.109.207.178:6789"],"clusterIdentifier":"rook-ceph","radosNamespace":"namespace-new"},{"clusterID":"0fe5aea972cdb15acfd7891f5953f9e2","monitors":["10.100.33.92:6789","10.104.122.171:6789","10.110.31.239:6789","10.104.116.14:6789","10.110.66.28:6789","10.97.202.54:6789","10.109.207.178:6789"],"clusterIdentifier":"rook-ceph","cephFS":{"subvolumeGroup":"group-new"}}]'
  • Delete the subvolume group and rados namespace and check the entries
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.109.207.178:6789","10.100.33.92:6789","10.104.122.171:6789","10.110.31.239:6789","10.104.116.14:6789","10.110.66.28:6789","10.97.202.54:6789"],"clusterIdentifier":""},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.100.33.92:6789"],"clusterIdentifier":"","cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.100.33.92:6789"],"clusterIdentifier":"","radosNamespace":"namespace-a"}]'

pkg/operator/ceph/csi/cluster_config.go (outdated review thread)
@@ -112,6 +113,16 @@ func updateCsiClusterConfig(curr, clusterKey string, newCsiClusterConfigEntry *C
// a lock is acquired for the update operation. So concurrent updates (rare event) will block and
// wait for the other update to complete. Monitors and Subvolumegroup will be updated
// independently and won't collide.
for i, centry := range cc {
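
The comment in this hunk describes serializing configmap updates with a lock; a hedged sketch of that pattern (the names here are illustrative, not the merged code):

import "sync"

var configMutex sync.Mutex // illustrative; guards the read-modify-write

func withConfigLock(update func()) {
	// Concurrent updates (a rare event, per the comment above) block here
	// and wait for the in-flight write of csi-cluster-config-json to
	// finish, instead of racing with it.
	configMutex.Lock()
	defer configMutex.Unlock()
	update()
}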
Member commented:

I think we can avoid requiring the clusterInfo param if we loop in the following way:

if newCsiClusterConfigEntry != nil {
	for i, centry := range cc {
		if newCsiClusterConfigEntry.ClusterID == centry.Namespace {
			// update the mons
		}
	}
}

If newCsiClusterConfigEntry is an entry for a subvolume group or rados namespace, the ClusterID just won't match any namespaces. If it is for a cluster, it will match all the namespaces.

Member Author replied:

newCsiClusterConfigEntry is not sent with the clusterID, so newCsiClusterConfigEntry.ClusterID will remain empty everywhere; I'm not sure that check will work. To make it work, I would need to pass the clusterID in the other places where we call SaveClusterConfig. Let me know if that's fine and I'll make the changes; since I need to test it manually, it might take time.

Member replied:

Oh right, part of my PR had been to ensure ClusterID is always set. :) The way it is now, we can instead compare with clusterKey, right?

if clusterKey == centry.Namespace {

Member Author replied:

This should work with a couple of other changes. PTAL
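
Putting the thread together, a minimal sketch of the agreed approach under the assumptions above (illustrative helper, not the merged code; the merged version lives in updateCsiClusterConfig in cluster_config.go):

// updateMonsForCluster refreshes the mon list on the default entry for
// clusterKey and on every subvolume group / rados namespace entry whose
// Namespace matches it.
func updateMonsForCluster(cc []CsiClusterConfigEntry, clusterKey string, mons []string) {
	for i := range cc {
		if cc[i].ClusterID == clusterKey || cc[i].Namespace == clusterKey {
			cc[i].Monitors = mons
		}
	}
}

Entries created before the upgrade carry an empty Namespace and a hashed ClusterID, so they fall through the check, which matches the note above about asking users to update those clusterIDs by hand.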

@Madhu-1 Madhu-1 force-pushed the fix-mon-problem branch 2 times, most recently from 7a6ccac to e9818a4 on April 22, 2022 at 16:54
If a mon is added or removed in a Ceph
cluster, all the CSI clusterIDs belonging
to the Ceph cluster should also get updated.

Added a new cluster identifier to the CSI configmap,
i.e. the Ceph clusterNamespace, which helps identify
the clusterIDs belonging to that Ceph cluster.

Note: to support upgraded clusters, we
can ask users to update the mon IPs for the
problematic clusterIDs, as they won't be updated
by the operator.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
@Madhu-1 (Member, Author) commented Apr 22, 2022

  • Manual testing logs

  • Without mon change (3 mons)

csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789"],"namespace":""},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.97.202.54:6789","10.104.116.14:6789","10.110.66.28:6789"],"namespace":"","cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789"],"namespace":"","radosNamespace":"namespace-a"},{"clusterID":"0fe5aea972cdb15acfd7891f5953f9e2","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789"],"namespace":"rook-ceph","cephFS":{"subvolumeGroup":"group-new"}},{"clusterID":"cb84d01f680eec8c88328f785c51221b","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789"],"namespace":"rook-ceph","radosNamespace":"namespace-new"}]'
  • After adding 2 mons (total 5 mons)
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.104.116.14:6789","10.97.46.251:6789","10.100.119.29:6789","10.110.66.28:6789","10.97.202.54:6789"],"namespace":""},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789"],"namespace":"","cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789"],"namespace":"","radosNamespace":"namespace-a"},{"clusterID":"0fe5aea972cdb15acfd7891f5953f9e2","monitors":["10.104.116.14:6789","10.97.46.251:6789","10.100.119.29:6789","10.110.66.28:6789","10.97.202.54:6789"],"namespace":"rook-ceph","cephFS":{"subvolumeGroup":"group-new"}},{"clusterID":"cb84d01f680eec8c88328f785c51221b","monitors":["10.104.116.14:6789","10.97.46.251:6789","10.100.119.29:6789","10.110.66.28:6789","10.97.202.54:6789"],"namespace":"rook-ceph","radosNamespace":"namespace-new"}]'
  • After removing 2 mons (total 3 mons)
csi-cluster-config-json: '[{"clusterID":"rook-ceph","monitors":["10.97.202.54:6789","10.97.46.251:6789","10.100.119.29:6789"],"namespace":""},{"clusterID":"4df0bb6e1b7ddc0c1ea080cbe043dd1a","monitors":["10.110.66.28:6789","10.97.202.54:6789","10.104.116.14:6789","10.97.46.251:6789","10.100.119.29:6789"],"namespace":"","cephFS":{"subvolumeGroup":"group-a"}},{"clusterID":"80fc4f4bacc064be641633e6ed25ba7e","monitors":["10.97.202.54:6789","10.104.116.14:6789","10.97.46.251:6789","10.100.119.29:6789","10.110.66.28:6789"],"namespace":"","radosNamespace":"namespace-a"},{"clusterID":"0fe5aea972cdb15acfd7891f5953f9e2","monitors":["10.97.202.54:6789","10.97.46.251:6789","10.100.119.29:6789"],"namespace":"rook-ceph","cephFS":{"subvolumeGroup":"group-new"}},{"clusterID":"cb84d01f680eec8c88328f785c51221b","monitors":["10.97.202.54:6789","10.97.46.251:6789","10.100.119.29:6789"],"namespace":"rook-ceph","radosNamespace":"namespace-new"}]'

@travisn travisn merged commit ab4ae01 into rook:master Apr 22, 2022
mergify bot added a commit that referenced this pull request Apr 22, 2022
csi: add/remove mon IP from csi config (backport #10135)
Development

Successfully merging this pull request may close these issues.

mon entries in csi configmap is not updated for subvolume group/rados namespace in external cluster mode