nfs: add liveness-probe to nfs-ganesha container #12845
Conversation
Force-pushed from 49f7c71 to da46467
Force-pushed from b0f160a to fc1e7c7
I'd like to get some input from the NFS-Ganesha devs here as well to make sure adding a liveness probe will have the right behavior for the whole system. I'll reach out to them to comment. We have recently been learning that liveness probes can be problematic for the MDS (#12789), and we have learned in the past that liveness probes are problematic for RGW. For RGW this is explained well here: rook/pkg/operator/ceph/object/spec.go, lines 440 to 448 in 0ba1578.
I got some offline advice from Kaleb. I think it may be good to change to a script-based probe that checks the PID instead, since that is the latest advice.
I will try to create another PR with the suggested approach.
Distillation of discussion today: Using a null RPC call to the NFS-Ganesha server will be the best check we can implement today. As long as the timeout on the null RPC call is sufficiently long, it should be extremely rare to get a false-failure status. It's not clear to me how to initiate a null RPC call, nor how long the timeout should be to make false-failures exceedingly rare. I'll reach out to NFS-Ganesha devs to clarify those 2 questions.
@ffilz suggested:
As another piece of info, at least one Golang NFS client library exists: https://github.com/Cyberax/go-nfs-client. It looks like …
I can't find an alternative to …

```console
[root@d575d9e94d8b /]# ls -alFh /usr/bin/rpcinfo
-rwxr-xr-x. 1 root root 69K Aug  6  2022 /usr/bin/rpcinfo*
[root@d575d9e94d8b /]# ls -alFh /usr/bin/rpcbind
-rwxr-xr-x. 1 root root 70K Aug  6  2022 /usr/bin/rpcbind*
```
I opened a PR to add `rpcinfo` to the Ceph container image. I think the best thing we can do for Rook users in this case is to add a liveness probe that uses the NULL RPC check outlined by Frank and disable it by default. For any Rook users that know they are using images that have `rpcinfo`, the probe can be enabled explicitly.

It could be possible for Rook to implement the probe conditionally (like below), but I believe that defeats some of the intent of the probe itself.

```sh
if which rpcinfo; then
  # do probe
else
  exit 0 # assume success if cannot check
fi
```
@BlaineEXE wrote:
not at all saying that Kaleb is wrong here, but maybe this could be refined by avoiding probing …
yes, a separate probe script would be great!
I agree that making the probe script so generic that it is not specific to a k8s deployment would be ideal! It may even be worth considering adding it to the ganesha codebase itself and having rook consume it once it has landed?
@synarete because we are proposing (at this point) to have the probe be disabled by default, we need to add an option to control the server liveness probe via the CephNFS CRD. There is an example of how/where that's done with CephFilesystem MDS here: rook/pkg/apis/ceph.rook.io/v1/types.go, lines 1130 to 1131 in d845107.

That would go in the CephNFS server spec, here: rook/pkg/apis/ceph.rook.io/v1/types.go, line 2063 in 9a5ea4c.

We are proposing to disable it by default today, but once Rook can assume that all builds of Ceph will have the `rpcinfo` tool, we can enable it by default. Because of the above, I would suggest something like the following for Rook's code:

```go
probeSpec := nfs.Spec.Server.LivenessProbe
livenessProbe := genDefaultLivenessProbe(/* ... */)
if probeSpec == nil && !cephVersion.IsAtLeast(cephver.CephVersion{Major: 18, Minor: 2, Extra: 1}) {
	// do not add a probe when the `rpcinfo` tool it relies on may not be present in the image
	// ref: https://github.com/ceph/ceph-container/pull/2156
	livenessProbe = emptyProbe()
} else if probeSpec != nil && probeSpec.Disabled {
	livenessProbe = emptyProbe() // disable the probe if explicitly disabled
}
// TODO: use GetProbeWithDefaults to apply `livenessProbe` to the pod spec
```

[update] In the script that runs …
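To ground the CRD discussion above, here is a minimal sketch of what such an option could look like; the field placement, names, and comments are illustrative assumptions for this thread, not the actual definitions in types.go:

```go
// Hypothetical sketch only; the real types live in
// pkg/apis/ceph.rook.io/v1/types.go and may differ in shape.
package v1

import corev1 "k8s.io/api/core/v1"

// GaneshaServerSpec (abridged) gains an optional probe override,
// mirroring how CephFilesystem MDS exposes its liveness probe.
type GaneshaServerSpec struct {
	// ... existing fields (Active, Placement, Resources, ...) ...

	// LivenessProbe allows changing the liveness probe of the Ganesha server.
	// +optional
	LivenessProbe *ProbeSpec `json:"livenessProbe,omitempty"`
}

// ProbeSpec wraps a Kubernetes probe with an explicit disable switch.
type ProbeSpec struct {
	// Disabled turns the probe off entirely.
	Disabled bool `json:"disabled,omitempty"`
	// Probe overrides the operator-generated default probe.
	// +optional
	Probe *corev1.Probe `json:"probe,omitempty"`
}
```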
@synarete the latest build of quay.io/ceph/daemon-base has the `rpcinfo` tool.
Ceph v18.2.0 is the latest release today, so we should be able to use v18.2.1 as the check for determining whether we can use `rpcinfo`.
@BlaineEXE When configuring nfs-ganesha without NFSv3, it seems `rpcinfo` is not relevant for the liveness check.
Why would rpcinfo not be relevant for liveness without NFSv3? rpcinfo is fully able to probe NFSv4. pynfs is a test case tool and awfully heavy for doing a liveness check. It also by default creates files on the chosen export.
@synarete this thread seems to show a similar issue using rpcbind with NFSv4 (I believe with the linux kernel nfs server), but they were able to resolve it by specifying a domain name. @ffilz do you think we could be seeing a similar issue here with nfs-ganesha? https://forums.freebsd.org/threads/nfs-doesnt-register-with-rpcbind.78203/
I guess one issue would be: if rpcbind isn't running on the host/VM/container nfs-ganesha is running on, then rpcinfo can't be used, except you should still be able to use:

```sh
rpcinfo -n 2049 -t {host} nfs 4
```

And then I believe it will not attempt to contact rpcbind. Also, unless the Ganesha root pseudofs has been restrictively configured, the following would work for a reasonable liveness test: a quick mount of the export under /tmp/$host followed by an unmount. Maybe put that in a shell script so it can make sure to clean up /tmp/$host even if the mount fails.
A bit of context: I deploy a simple rook+cephfs+nfs setup on OpenShift (4.13, AWS). All pods run OK. There is no rpcbind process running, and …

@ffilz Please let us know if anything is missing in the config file:
You can do:

```sh
rpcinfo -a 1.2.3.4.8.1 -T tcp nfs 4
```

Where 1.2.3.4 is the IP address of the server (or you can use an IPv6 address). The .8.1 at the end is port 2049 in the right format (the port encoded as two decimal octets: 8 × 256 + 1 = 2049). -n is supposed to work, but got broken...
@ffilz Many thanks for this hint. Using this notation I was able to successfully send and receive rpcinfo to the nfs-ganesha server, both from within the container itself and from an external pod.

@BlaineEXE I will now fix this PR accordingly.
Force-pushed from fc1e7c7 to af2fd12
rook/pkg/apis/ceph.rook.io/v1/types.go, lines 2785 to 2794 in 38c262c
Force-pushed from 0181c98 to bec8dc3
```go
// liveness-probe using K8s builtin TCP-socket action
return controller.GenerateLivenessProbeTcpPort(nfsPort, failureThreshold)
```
I like the idea of having a backup option when RPC isn't available!
Overall, this looks pretty great. Very concise and clear.
The only thing that I think needs updating is a slight adjustment to the user-defined behavior of the probe. You can look into how this is done in the RGW code for an example:
rook/pkg/operator/ceph/object/spec.go, line 425 in ccd961b:

```go
func configureReadinessProbe(container *v1.Container, healthCheck cephv1.ObjectHealthCheckSpec) {
```
Also, please let me know if you'd like me to take over the work for any reason, no questions asked. :)
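For illustration, here is a rough sketch of what an analogous helper for the NFS liveness probe might look like, following the RGW pattern referenced above; `configureLivenessProbe` and `mergeProbeDefaults` are made-up names for this sketch, not Rook's actual functions, and the `ProbeSpec.Probe` override field is assumed:

```go
// Sketch modeled on RGW's configureReadinessProbe; helper names here are
// illustrative placeholders, not Rook's actual API.
package nfs

import (
	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
	corev1 "k8s.io/api/core/v1"
)

func configureLivenessProbe(container *corev1.Container, probeSpec *cephv1.ProbeSpec) {
	if probeSpec == nil {
		return // keep the operator-generated default probe as-is
	}
	if probeSpec.Disabled {
		container.LivenessProbe = nil // user explicitly disabled the probe
		return
	}
	if probeSpec.Probe != nil {
		// start from the user-provided probe; unset fields fall back to the
		// defaults already generated onto the container
		container.LivenessProbe = mergeProbeDefaults(probeSpec.Probe, container.LivenessProbe)
	}
}

// mergeProbeDefaults stands in for whatever default-merging helper the
// operator ends up using (e.g. the GetProbeWithDefaults mentioned earlier).
func mergeProbeDefaults(user, def *corev1.Probe) *corev1.Probe {
	merged := user.DeepCopy()
	if def == nil {
		return merged
	}
	if merged.TimeoutSeconds == 0 {
		merged.TimeoutSeconds = def.TimeoutSeconds
	}
	if merged.PeriodSeconds == 0 {
		merged.PeriodSeconds = def.PeriodSeconds
	}
	if merged.FailureThreshold == 0 {
		merged.FailureThreshold = def.FailureThreshold
	}
	return merged
}
```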
pkg/operator/ceph/nfs/spec_test.go (outdated):

```go
LivenessProbe: &cephv1.ProbeSpec{
	Disabled: false,
},
```
Is it possible to also verify that the probe is present in the output?
Since this also adds the ability to override the probe, I think it would be good to override one of the properties here and check that the output has default values for the probe, except for the overridden value.
I added a dedicated test to check the liveness-probe settings. Naturally, we can only check the correctness of `rpcinfo` as a liveness probe against a real nfs-ganesha container.
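As a rough sketch of the kind of unit test being asked for, something along these lines could verify that the probe appears in the generated container and that an overridden property is honored while other defaults are kept; `makeGaneshaContainer` and the override field are placeholders for this illustration, not the real test helpers:

```go
// Sketch only: makeGaneshaContainer is a placeholder for however spec_test.go
// builds the nfs-ganesha container from a CephNFS object.
package nfs

import (
	"testing"

	"github.com/stretchr/testify/assert"
	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
	corev1 "k8s.io/api/core/v1"
)

func TestGaneshaLivenessProbeOverride(t *testing.T) {
	nfs := &cephv1.CephNFS{}
	nfs.Spec.Server.LivenessProbe = &cephv1.ProbeSpec{
		// override a single property; everything else should keep defaults
		Probe: &corev1.Probe{FailureThreshold: 20},
	}

	container := makeGaneshaContainer(nfs)

	// the probe must be present in the generated container spec
	assert.NotNil(t, container.LivenessProbe)
	// the overridden property is kept ...
	assert.Equal(t, int32(20), container.LivenessProbe.FailureThreshold)
	// ... while non-overridden properties keep their default values
	assert.NotZero(t, container.LivenessProbe.TimeoutSeconds)
}
```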
Force-pushed from bec8dc3 to ece4371
Force-pushed from 7c8be8e to 2a88ae4
There are ongoing NFS CI issues that I'm starting to look into. I think it's important to resolve those before moving forward here. Sorry for the added delay.
Force-pushed from 7e0eaeb to 1776528
Force-pushed from 1776528 to abd8510
Allow the end user to have full control over the NFS-Ganesha liveness probe:
- Disabled
- Enabled, using a user-provided definition
- Enabled, using Rook's default liveness probe for Ganesha

Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Use the K8s LivenessProbe mechanism to check the OK-status of the nfs-ganesha container. A user may define his own liveness probe, or use the default one, which expects NFS TCP port 2049 to be active, that is, willing to accept new connections: for Ceph >= 18.2.1, issue an 'rpcinfo' call on the local pod; otherwise, use the standard K8s TCP-socket liveness-probe mechanism. Define permissive values for the liveness probe to ensure that the NFS service is put into a failed state only when it has a non-recoverable error.

The current default definition of the LivenessProbe is expected to guard the nfs pod from at least the following two cases:

- Deadlocks: an nfs-ganesha server is running, but unable to serve new connections due to internal bad state.
- Resource exhaustion on the host node (e.g. OOM), which prevents the server from accepting new connections and replying to a NULL RPC request.

In both cases we expect K8s to reschedule the nfs pod, most likely on a different host node.

Refs rook issue rook#12719

Signed-off-by: Shachar Sharon <ssharon@redhat.com>
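To make the defaulting described in the commit message concrete, below is a minimal sketch of how the two probe variants could be built with the Kubernetes API types; the rpcinfo command line, the POD_IP environment variable (assumed to be injected via the downward API), and the permissive threshold values are illustrative assumptions rather than the exact defaults used by the PR:

```go
// Sketch only: values and the rpcinfo invocation are illustrative.
package nfs

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

const nfsPort = 2049

// defaultLivenessProbe returns an rpcinfo-based exec probe when the image is
// known to ship rpcinfo (Ceph >= 18.2.1) and a plain TCP-socket probe
// otherwise. Values are deliberately permissive so only a non-recoverable
// server ends up marked as failed.
func defaultLivenessProbe(hasRPCInfo bool) *corev1.Probe {
	probe := &corev1.Probe{
		InitialDelaySeconds: 10,
		PeriodSeconds:       10,
		TimeoutSeconds:      5,
		FailureThreshold:    10,
	}
	if hasRPCInfo {
		// NULL RPC check against the local server; ".8.1" encodes port 2049
		// in the rpcbind universal-address format (8*256 + 1 = 2049).
		// POD_IP is assumed to be injected into the container via the downward API.
		probe.ProbeHandler = corev1.ProbeHandler{
			Exec: &corev1.ExecAction{
				Command: []string{"bash", "-c", `rpcinfo -a "${POD_IP}.8.1" -T tcp nfs 4`},
			},
		}
		return probe
	}
	probe.ProbeHandler = corev1.ProbeHandler{
		TCPSocket: &corev1.TCPSocketAction{
			Port: intstr.FromInt(nfsPort),
		},
	}
	return probe
}
```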
Force-pushed from abd8510 to e05184d
Looks great! I did a manual test also, and the probe defaulting behavior works as expected. Thanks @synarete, and sorry for the delay in reviewing.
nfs: add liveness-probe to nfs-ganesha container (backport #12845)
Use the K8s LivenessProbe mechanism to check the OK-status of the nfs-ganesha container. Expects NFS TCP port 2049 to be active, that is, willing to accept new connections. Define permissive values for the liveness probe to ensure that the nfs service is put into a failed state only when it has a non-recoverable error.

The current definition of the LivenessProbe is expected to guard the nfs pod from at least the following two cases:

- Deadlocks: an nfs-ganesha server is running, but unable to serve new connections due to internal bad state.
- Resource exhaustion on the host node (e.g. OOM), which prevents the server from accepting new connections.

In both cases we expect K8s to reschedule the nfs pod, most likely on a different host node.
Refs rook issue #12719