Use NFS hard mount instead of soft mount to avoid RO VMs (or offer option)? #334
Comments
I think it might be interesting to put the question to the Citrix storage folks. We should create an XSO ticket to get their opinion, and maybe the reasons behind their current choice.
Perhaps I can suggest always using a unique fsid= export option for each exported path on the NFS server. This ought to be documented in the docs and wiki :)
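For illustration, here is what that could look like in /etc/exports on the NFS server (paths, network range and fsid values are only placeholders):

```
# /etc/exports: give each exported path its own stable fsid so clients keep a
# consistent file handle across NFS server failover
/export/vm-storage  192.0.2.0/24(rw,sync,no_subtree_check,fsid=1)
/export/iso         192.0.2.0/24(rw,sync,no_subtree_check,fsid=2)
```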
The thing is that if NFS is served by a cluster (for example, Pacemaker), a failover event will work flawlessly if NFS is mounted with the 'hard' option on the XenServer. Otherwise, VMs will experience a (short) disk loss, and the Linux ones will, by default, end up with a read-only filesystem.
This is an ugly workaround, but it allows the VMs to live, which is more important than the beauty of the hack.
I believe it is possible to add custom NFS mount options when adding a new SR through XOA. Have you tested this?
Doesn't work. The hard-coded 'soft' directive in nfs.py overrides it.
Yes, that's why it would require an XAPI modification. That's doable :) I think we should keep the default behavior, but allow an override: this will let people who want to test it, test it. In theory, we should:
That should be it. @ezaton do you want to contribute?
I am not sure I have the Python know-how, but I will make an effort during the next few days. This is a major thing I have been carrying with me since XS version 6.1 or so; those were my early NFS cluster days. Nowadays I have many NFS clusters in many locations. So, yeah, I want to contribute. I will see whether I can actually do it. Thanks!
Okay, so IIRC you might indeed check how the NFS version is passed down to the driver (from XAPI to the NFS Python file). It's a good way to start understanding how it works, and then do the same for the hard/soft mount thing :)
Edit: @Wescoeur knows a lot about SMAPIv1, so he can assist you on this (if you have questions).
I thought subsequent mount options override previous mount options. This is how we can add nfsvers=4.1, for example, isn't it? I haven't tried, but it might be worth trying.
This is a quote from 'man 5 nfs':
Look at the comment. I believe that hard should be the default, at least for regular SRs. An ISO SR is another matter.
Maybe increasing that value would be a less intrusive option, and one that could be supplied without being ignored?
These are meant to mitigate (some of) the problems caused by a soft mount, instead of just mounting 'hard'. Look, when your virtual machines live there, you do not want a momentary network disruption to kill them. The safety of your virtual machines is the key requirement, and a soft mount just doesn't provide it.
I have edited nfs.py and NFSSR.py and created a pull request here: xapi-project/sm#485
Thanks. I think you need to add context and explain why hard would be better than soft, and what tests you did, to have a chance of getting it merged.
I will add all these details in the pull request.
I just tried in XOA to create a new SR with the "hard" mount option. Seems to stick when looking at the output from
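As a generic way to double-check which options are actually in effect on the host (this command is illustrative, not necessarily the one referenced above):

```
# List active NFS mounts on the XCP-ng host along with the mount options in use
grep -E ' nfs4? ' /proc/mounts
```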
@Gatak if that's the case, it's even easier :D Can you double-check it's the correct
This is a change of behaviour from what I remember; however, I have just tested it, and it is true. It is consistent across reboots and across detach/reattach, so my patch is (partially) redundant.
Yes, based on the documentation provided, it does seem the safest option.
Yes, but you can't decide to make this change for everyone without a consensus. We'll talk more with the Citrix team to understand their original choice. What we can do in XO: expose a menu that selects "hard" by default. This will encourage
Does this sound reasonable to you?
Sounds good. Many use soft because you could not abort/unmount a hard-mounted NFS share, but that may be an old truth...
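(As an aside, a hard mount that is stuck on an unreachable server can usually still be torn down with a forced or lazy unmount; the mount point path below is only an example:)

```
umount -f /run/sr-mount/<sr-uuid>   # force: fail outstanding requests and unmount
umount -l /run/sr-mount/<sr-uuid>   # lazy: detach immediately, clean up once references drop
```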
I think it is important to mention that the NFS export should use the fsid= option.
What about NFS HA? (regarding fsid)
NFS HA maintains the fsid. If you set up an NFS cluster, you handle your fsid yourself, or else it doesn't work very well. For stand-alone systems, the fsid is derived from the device ID, but not for clusters.
I wrote some considerations on the forum thread about this issue and am reporting the most important one here. Using hard as the default is risky, in my opinion. I have to say that on servers I usually set hard,intr in order to protect poorly written application software from receiving I/O errors, while the intr option still lets me kill the process if I need to unmount the filesystem.
This is incorrect. All the Linux servers I have had the pleasure of working with (RHEL 5/6/7, CentOS, Oracle Linux, Ubuntu and more) mount by default with errors=remount-ro, so you have to explicitly change this behaviour for your Linux not to fail(!) when NFS performs a failover under a soft mount. XAPI and SM-related tasks are handled independently per SR; check the logs. I agree that the ISO SR should remain soft (although this can crash VMs, it is less of a problem because the ISO is read-only to begin with), so my patch (and the proposed change to the GUI) is to use the 'hard' mount option for VM data disks and 'soft' for the ISO SR.
According to https://linux.die.net/man/5/nfs the
I did one test yesterday with a Windows Server VM on a
This did not previously work when I had the
I made a test with Ubuntu Server 19.10, installed with default settings and without LVM. I tested with a script that updates a file every second on the VM. I repeated the test with retrans=360; I expected that the client wouldn't receive errors for a long time, but I was wrong: after about 5 minutes the root fs of the VM got remounted read-only. I investigated the disk timeout parameter, normally found in /sys/block/sd*/device/timeout. I still have to understand what really happens:
Some more tests. It turns out that one possible problem was how I conducted the test. I now tried null routing, as suggested on the forum (sketched below): ip route add <xcp host ip>/32 via 127.0.0.1 dev lo to block all traffic between the NFS server and the XCP-ng host, and then ip route del to roll back. I'm going to retest with timeo=100,retrans=360 to be sure it works and to verify how the TCP timeouts interact. I think this tells us 2 things:
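For reference, the test described above boils down to something like this (a sketch only; the IP and file path are placeholders, and the route commands are run on the NFS server to simulate an outage):

```
# On the test VM: write a timestamp every second so I/O errors or an RO remount show up quickly
while true; do date >> /root/nfs-write-test.log; sync; sleep 1; done

# On the NFS server: null-route the XCP-ng host to simulate an outage, then roll back
ip route add <xcp-host-ip>/32 via 127.0.0.1 dev lo
# ...wait for the timeout window under test...
ip route del <xcp-host-ip>/32
```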
Just a quick word to say that this discussion is very interesting, whatever the outcome will be. I'm following it closely.
I think you have other issues with your OMV setup, at least based on what's in the forum thread.
Potentially, but how do I establish that? Also, the forum issue is with NFS4, and the reason I started to look into NFS4 is this issue with my existing NFS3 share.
I assume you stopped all VMs before you remounted the VM storage? (It would not be surprising at all that the VMs went RO if you unmounted the storage and remounted it as hard while the VMs were running.) A suggestion:
Also, DON'T experiment further with all your VMs at risk. I'd use a test VM for that, until certain that everything is working robustly.
This is what I did:
This is where the bigger issue started. I was able to restart the VMs with the soft mount with no problem. Now, with it hard-mounted, they won't reboot and force shutdown won't work; after several toolstack restarts and force shutdowns, the VMs shut down, but then none of them would restart. I had to restart the entire cluster to start the VMs again.
I suggest keeping to the forum to understand what is going on, and coming back here with the conclusions, to keep this issue readable.
I want to clarify the 2 different issues that I am dealing with, which started with VMs going into an RO state:
I am happy to do more tests or provide logs if you want, but I would caution against using hard as the default, because I am not sure it gets us the desired results consistently.
What was interesting is the
OK, I understand better. I don't understand how it is technically possible for VMs to go RO over a hard-mounted share, but apparently it did happen in your case.
I can do more tests later today and try to reproduce the issue. Here is my hard-mounted NFS SR; let me know if I need to change anything with this:
Also, what other logs would you like to see?
From my understanding of NFS (and I have a little understanding of it), this should never happen. There is no way a hard NFS mount would result in VMs going RO. Tests here showed that you can kill your NFS server for a very long time and the VMs would hang, but not go RO.
I doubt the NFS server has a bad block (possible, but...); it consists of 12 mirrored vdevs using 24 new Intel enterprise NVMe drives under ZFS. But, as I said, potentially yes. I am providing the NFS export of the NAS server (OMV, which is just Debian 10 with a GUI to manage NAS settings) and what I see on the XCP-ng side:
NAS export:
XCP-ng Mount
What log files do you want me to get for you when I am able to reproduce the RO issue?
@geek-baba IMHO you should fix the root cause of the failed NFS4 mounts in your setup before continuing here, as it is difficult to know if those issues also affect this RO/hard mount problem for you. I can't explain why your NFS4 mount does not work, and I agree with @StreborStrebor that you should set up a test VM with an NFS server to test and rule things out. It won't take more than 10 minutes to set up.
Hello everyone. What's the status of this? Did we come to some conclusion on what the best choice is and whether choices should be made available in XO?
Not exactly. We know for sure that a soft mount does not cope well with NFS server disconnections, but we haven't established 100% that hard mounts would be the best solution in all cases.
I understand. I do think it is important to solve this, or at least to provide some guiding suggestions and information to the user before they create the SR.
Bumping this: another XCP-ng customer has been bitten by this with HA NFS storage.
I think we'll leave the default choice "as is" (too risky to change it), but we can help to configure
I disagree. I think that the 'soft' default means that VMs will(!) get I/O errors in case of a network glitch or any momentary disconnection, and that the danger to VMs in this case is much more severe than blocking I/O.
Then open a support ticket so we can look with you at how to simplify this for your entire infrastructure. "The default has to be hard" is easy to say when you don't have thousands of users all around the world. This is a kind of breaking change, so it's out of the question for XCP-ng 8.2 at least.
I am curious, in what way is this a breaking change?
It's changing a previous behavior for all our users. You can't fathom the potential consequences, at this scale, of changing how storage responds when mounted hard vs soft. Helping people make the change on their own is acceptable, but changing it by default during an LTS release is not.
Then, maybe provide the option
That's exactly what I'm saying.
This is perfectly fine from my point of view. Thanks.
@Fohdeesha how far are we on this?
@olivierlambert I've confirmed that adding
I suppose if you wanted to go further, XOA could add a hard/soft toggle button that automatically added this, with an explainer of the pros/cons of each.
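For the record, a rough sketch of doing the equivalent from the host CLI; the device-config:options key is my assumption of how the custom mount options reach the SM driver, and the server/path values are placeholders:

```
xe sr-create type=nfs shared=true name-label="NFS VM storage (hard)" \
  device-config:server=192.0.2.10 \
  device-config:serverpath=/export/vm-storage \
  device-config:options=hard   # assumed key for passing extra NFS mount options
```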
Okay, so let's do that in XO. Pinging @marcungeschikts so it's added to the XO backlog 👍
See the proposal and testimony from a user on the forum: https://xcp-ng.org/forum/post/21940
We may also consider changing the default timeout options.