New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
operator: load cluster owner info in LoadClusterInfo #14079
operator: load cluster owner info in LoadClusterInfo #14079
Conversation
The CreateOrLoadClusterInfo (and therefore LoadClusterInfo) methods were not loading the ClusterInfo.OwnerInfo. This should help future CRD controllers get the full ClusterInfo struct without having missing information. Current controllers that need ClusterInfo fill the field themselves. During testing, I observed one corner case where an upgraded cluster was missing the `rook-ceph-csi-config` configmap. The cluster had a CephFilesystemSubVolumeGroup resource created, and the reconcile for that resource was attempting to create the missing CSI configmap and failing with a nil pointer exception due to the missing OwnerInfo field in ClusterInfo. This cluster condition hasn't been reproduced in healthy environments, and it is unknown how the CSI configmap came to be missing. However, the case did expose the missing loaded info as a potential for causing nil pointer exceptions during corner cases when code is otherwise correct. Signed-off-by: Blaine Gardner <blaine.gardner@ibm.com>
I don't anticipate this showing up in user environments because the corner case here was the missing But it seemed like something that would be a good fix anyway for the sake of developers. I've had nil pointer issues in the past that end up being caused by Rook only partially completing struct information during runtime. It's always tedious and annoying, and I think it's worth trying to make that less common for devs in the future. |
@@ -141,6 +141,9 @@ func CreateOrLoadClusterInfo(clusterdContext *clusterd.Context, context context. | |||
} else { | |||
return nil, maxMonID, monMapping, errors.New("failed to find either the cluster admin key or the username") | |||
} | |||
if len(secrets.OwnerReferences) > 0 { | |||
clusterInfo.OwnerInfo = k8sutil.NewOwnerInfoWithOwnerRef(&secrets.GetOwnerReferences()[0], namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems safe to assume there is always an ownerref on this secret, like you said we have had it there for a long time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I think it should be safe because of this code above:
secrets, err := clusterdContext.Clientset.CoreV1().Secrets(namespace).Get(context, AppName, metav1.GetOptions{})
if err != nil {
if !kerrors.IsNotFound(err) {
return nil, maxMonID, monMapping, errors.Wrap(err, "failed to get mon secrets")
}
if ownerInfo == nil {
return nil, maxMonID, monMapping, ClusterInfoNoClusterNoSecret
}
When a CephCluster is first created and the Secret is initially created and populated (by initClusterInfo()
/PopulateExternalClusterInfo()
), the secret shouldn't get created unless there is a CephCluster given to act as the owner.
operator: load cluster owner info in LoadClusterInfo (backport #14079)
The CreateOrLoadClusterInfo (and therefore LoadClusterInfo) methods were not loading the ClusterInfo.OwnerInfo. This should help future CRD controllers get the full ClusterInfo struct without having missing information. Current controllers that need ClusterInfo fill the field themselves.
During testing, I observed one corner case where an upgraded cluster was missing the
rook-ceph-csi-config
configmap. The cluster had a CephFilesystemSubVolumeGroup resource created, and the reconcile for that resource was attempting to create the missing CSI configmap and failing with a nil pointer exception due to the missing OwnerInfo field in ClusterInfo. This cluster condition hasn't been reproduced in healthy environments, and it is unknown how the CSI configmap came to be missing. However, the case did expose the missing loaded info as a potential for causing nil pointer exceptions during corner cases when code is otherwise correct.Checklist: