nfs: restart nfs servers when configmap is updated #9104
Conversation
svcs, err := r.context.Clientset.CoreV1().Services(ns).List(context.TODO(), metav1.ListOptions{})
assert.NoError(t, err)
// Each NFS server gets a service.
This is the current behavior, but I'm not sure this is right. Should this be the design?
This is part of the design doc, so I guess it seems good.
The operator creates a k8s service for each of the ganesha server pods to allow each of them to have a stable IP address.
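As a sketch of that design, each ganesha server pod could be fronted by its own Service selecting exactly that one pod. The names and labels here are illustrative placeholders, not necessarily what Rook generates:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-nfs-my-nfs-a   # one Service per ganesha server, e.g. server "a"
spec:
  selector:
    app: rook-ceph-nfs
    instance: a                  # matches exactly one server pod
  ports:
    - name: nfs
      port: 2049                 # standard NFS port
```

Because the Service's ClusterIP is stable across pod restarts, clients keep a fixed address even when the backing pod is replaced.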
When the configuration configmap is updated for a CephNFS server, the NFS application should restart to ensure it is running with the latest config.
Fixes rook#9028
Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
@@ -141,6 +141,12 @@ func (r *ReconcileCephNFS) makeDeployment(nfs *cephv1.CephNFS, cfg daemonConfig)
	ObjectMeta: metav1.ObjectMeta{
		Name:        resourceName,
		Labels:      getLabels(nfs, cfg.ID, true),
		Annotations: map[string]string{
You're seeing an update to the annotations cause a pod restart? I was thinking the annotations didn't affect the pod restart, but that must only be the case when the annotations are on the deployment instead of the podTemplateSpec.
I thought I had a similar comment in my initial review, but somehow I cannot find it anymore. Why not use the deployment annotation instead of the pod's?
I got distracted from my testing by other things. I'll start testing again today to verify that annotations can cause pods to restart. I'm pretty sure they can.
A changed annotation on the deployment won't cause the pod to restart, but a changed annotation on the pod should. Relatedly, that's why the ceph_version and rook_version labels are on deployments and not pods: they don't cause Pods to restart.
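The mechanism under discussion can be sketched with a hypothetical helper (not Rook's actual code): hash the configmap's data and store the digest as a pod-template annotation. Any change to the config data changes the digest, which changes the pod template, so the Deployment controller rolls the pods.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// configHash returns a deterministic digest of a configmap's Data field.
// Keys are sorted first so the result does not depend on Go's randomized
// map iteration order.
func configHash(data map[string]string) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%s=%s\n", k, data[k])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	cfg := map[string]string{"ganesha.conf": "NFS_CORE_PARAM { ... }"}
	// Placing this digest under spec.template.metadata.annotations (NOT the
	// deployment's own metadata.annotations) alters the pod template whenever
	// the config changes, which is what actually triggers a pod restart.
	fmt.Println("config-hash:", configHash(cfg))
}
```

Putting the same digest on the deployment's own annotations would update the Deployment object but leave the pod template untouched, so no rollout would occur.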
Confirmed that the annotation causes a pod restart as desired.
One more question: the configmap is updated during the reconcile, right? So regardless of whether the annotation is on the pod or the deployment, the pod is going to be restarted, right? If that is true, I don't understand why we don't place the annotation on the deployment instead.
Quickly glancing at the code, I doubt the CephNFS config should be changed or supports any changes apart from GaneshaServerSpec. The only CR spec that impacts the configmap is n.Spec.RADOS, especially n.Spec.RADOS.Namespace; is it safe to change it? As of recently, changing n.Spec.RADOS.Pool does not do anything, since it always defaults to .nfs.
If all of the above is true and changing n.Spec.RADOS.Namespace is risky, then I'm wondering why we would need to restart the pod when the config changes; perhaps if there is a misconfiguration in the CR? I just don't see where it could be.
Thanks.
DataPathMap     *config.DataPathMap // location to store data in container
ID              string              // letter ID of daemon (e.g., a, b, c, ...)
ConfigConfigMap string              // name of configmap holding config
How is this configmap related to the RADOS object where the config is stored? If the RADOS config is updated, do we get events for that to restart the daemon as well?
The RADOS config contains additional config. This config is needed in order for Ganesha to find the RADOS config(s).
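For context, the configmap typically holds only a small bootstrap Ganesha config whose job is to point at the RADOS object(s) holding the rest of the configuration. A sketch of the general shape, where the pool, namespace, and object names are placeholders:

```
RADOS_URLS {
    ceph_conf = "/etc/ceph/ceph.conf";
    userid = "nfs-ganesha.a";
}
# Pull the remainder of the config (exports, etc.) from a RADOS object.
%url "rados://<pool>/<namespace>/conf-nfs.<cluster>"
```

The %url directive is what lets Ganesha chase the bootstrap config into RADOS; without the configmap, the daemon would have no starting point.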
Is this something worth documenting? Or maybe an overhaul of the ceph nfs doc is better left for a separate PR.
This [configmap] isn't something that users should be touching, so I don't think it's worth documenting. It's what allows us to run the NFS servers with a generated config that doesn't get made in an init container that needs to run the Rook binary.
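As an illustration of that flow, the generated configmap can be mounted straight into the server pod with no init container involved. The volume name and mount path below are hypothetical:

```yaml
# Pod template fragment: mount the operator-generated config directly.
volumes:
  - name: nfs-config
    configMap:
      name: my-nfs-config        # placeholder for the ConfigConfigMap name
containers:
  - name: nfs-ganesha
    volumeMounts:
      - name: nfs-config
        mountPath: /etc/ganesha  # ganesha reads its bootstrap config here
```

Because the kubelet projects the configmap into the filesystem, no Rook binary has to run inside the pod to materialize the config.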
one more question... When is the configmap updated? When exports are added/removed?
Only when the CephNFS CR is updated AFAIK.
FWIW, this provides some idea of why config reloading is necessary, though it is from 2017: nfs-ganesha/nfs-ganesha#173
I still need some more clarification. Thanks
This is during reconcile, but if the pod template spec in the deployment doesn't change, then the pod doesn't get restarted, even if the deployment itself changes. That is why the annotation is needed on the pod.
This is all also correct. These updates are only useful on Ceph Octopus, where the config is still allowed to change. I was thinking that this would be desirable for migrating to the
Sorry, earlier I meant
Finally approving based on #9104 (comment). Thanks!
When the configuration configmap is updated for a CephNFS server, the
NFS application should restart to ensure it is running with the latest
config.
Fixes #9028
Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
Description of your changes:
Which issue is resolved by this Pull Request:
Resolves #
Checklist:
Code generation (make codegen) has been run to update object specifications, if necessary.