Skip to content

Commit

Permalink
csi: only create CSI config configmap in CSI reconciler
Browse files Browse the repository at this point in the history
There have been some issues with non-CSI Rook controllers that are
creating the CSI config configmap (`rook-ceph-csi-config`). This causes
problems with the K8s OwnerReference. The primary CSI reconciler
(controller) creates the configmap with the correct owner reference,
which is supposed to be the operator deployment.

Other instances were creating the configmap with owner references set to
the CephCluster. This is a minor bug, but it can result in this
configmap being deleted along with the first CephCluster that initially
created it.

To fix this issue, remove all instances of `CreateCsiConfigMap()` except
the single usage which the CSI reconcile uses to initially create the
configmap. Other controllers that might have attempted to create this
configmap previously will return an error indicating that it is waiting
for the configmap to be created.

Signed-off-by: Blaine Gardner <blaine.gardner@ibm.com>
  • Loading branch information
BlaineEXE committed Apr 18, 2024
1 parent 7cc5bdf commit 8010a51
Show file tree
Hide file tree
Showing 3 changed files with 55 additions and 9 deletions.
49 changes: 49 additions & 0 deletions deploy/examples/cluster-test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,55 @@
# kubectl create -f crds.yaml -f common.yaml -f operator.yaml
# kubectl create -f cluster-test.yaml
#################################################################################################################

#
# Trying to repro a race condition wherein the CephFilesystemSubVolumeGroup controller is the first
# one to attempt to create the `rook-ceph-csi-config` configmap
# Create those resources first to see if that reproduces the issue
#

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
name: myfs
namespace: rook-ceph # namespace:cluster
spec:
metadataPool:
replicated:
size: 1
requireSafeReplicaSize: false
dataPools:
- name: replicated
failureDomain: osd
replicated:
size: 1
requireSafeReplicaSize: false
preserveFilesystemOnDelete: false
metadataServer:
activeCount: 1
activeStandby: true
---
# create default csi subvolume group
apiVersion: ceph.rook.io/v1
kind: CephFilesystemSubVolumeGroup
metadata:
name: myfs-csi # lets keep the svg crd name same as `filesystem name + csi` for the default csi svg
namespace: rook-ceph # namespace:cluster
spec:
# The name of the subvolume group. If not set, the default is the name of the subvolumeGroup CR.
name: csi
# filesystemName is the metadata name of the CephFilesystem CR where the subvolume group will be created
filesystemName: myfs
# reference https://docs.ceph.com/en/latest/cephfs/fs-volumes/#pinning-subvolumes-and-subvolume-groups
# only one out of (export, distributed, random) can be set at a time
# by default pinning is set with value: distributed=1
# for disabling default values set (distributed=0)
pinning:
distributed: 1 # distributed=<0, 1> (disabled=0)
# export: # export=<0-256> (disabled=-1)
# random: # random=[0.0, 1.0](disabled=0.0)

---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
Expand Down
10 changes: 5 additions & 5 deletions pkg/operator/ceph/cluster/cluster_external.go
Original file line number Diff line number Diff line change
Expand Up @@ -108,11 +108,11 @@ func (c *ClusterController) configureExternalCephCluster(cluster *cluster) error
}
}

// Create CSI config map
err = csi.CreateCsiConfigMap(c.OpManagerCtx, c.namespacedName.Namespace, c.context.Clientset, cluster.ownerInfo)
if err != nil {
return errors.Wrap(err, "failed to create csi config map")
}
// // Create CSI config map
// err = csi.CreateCsiConfigMap(c.OpManagerCtx, c.namespacedName.Namespace, c.context.Clientset, cluster.ownerInfo)
// if err != nil {
// return errors.Wrap(err, "failed to create csi config map")
// }

// update the msgr2 flag
for _, m := range cluster.ClusterInfo.Monitors {
Expand Down
5 changes: 1 addition & 4 deletions pkg/operator/ceph/csi/cluster_config.go
Original file line number Diff line number Diff line change
Expand Up @@ -292,10 +292,7 @@ func SaveClusterConfig(clientset kubernetes.Interface, clusterNamespace string,
configMap, err := clientset.CoreV1().ConfigMaps(csiNamespace).Get(clusterInfo.Context, ConfigName, metav1.GetOptions{})
if err != nil {
if k8serrors.IsNotFound(err) {
err = CreateCsiConfigMap(clusterInfo.Context, csiNamespace, clientset, clusterInfo.OwnerInfo)
if err != nil {
return errors.Wrap(err, "failed creating csi config map")
}
return errors.Wrap(err, "waiting for CSI config map to be created")
}
return errors.Wrap(err, "failed to fetch current csi config map")
}
Expand Down

0 comments on commit 8010a51

Please sign in to comment.