Add sysctl flag to force nfsd to make all writes async even if client requests sync #208

bjornbouetsmith · 2019-10-09T08:55:39Z

Hi,

I have made a simple patch for the nfs server.

This is a more "sane" version of the patch that floats around on the internet which just disables sync writes entirely.

This is a sysctl setting which can be set to either 0 or 1.

If set to 1, then the ioflags will ignore the sync request from the client.

… request sync

ghost · 2019-10-09T16:28:48Z

Hi, thank you for your suggestion. Have you considered setting the ZFS property sync=disabled on the shared dataset instead?

bjornbouetsmith · 2019-10-09T16:53:48Z

Yes, but that would leave everything async, that is not the point of this changeset. I just want to make this an alternative patch to the "silly" patch that people are doing, where they remove the possibility to make sync writes via NFS at all.

Furthermore tests have shown that sync=disabled on the dataset does not yield same kind of performance as forcing NFS to do only async writes.

This will make it possible to do via a sysctl setting, and will only have effect on NFS, not the entire dataset that is exposed via NFS.

It would be nice to extend this, so it could be "paths" that could be set async only.

But I am not sure how that would work with sysctl.

I guess it could be done with a combination of sysctl and lines in the nfsd.conf where you specify which paths exported that should be treated as async only.

And since this patch is "opt-in" - It would help those that have a ups backed freenas and would like to run async all the way from the NFS server to the dataset, and would have no consequence for those that do not want to run async.

ghost · 2019-10-09T17:29:42Z

Furthermore tests have shown that sync=disabled on the dataset does not yield same kind of performance as forcing NFS to do only async writes.

Can you elaborate on this? If you have measurements I'd be interested to see them.

bjornbouetsmith · 2019-10-09T17:44:00Z

To be honest I haven't tested it myself. I have just always been unhappy with the NFS performance in FreeNAS.

And to be fully transparent, its mostly an issue when using NFS from esxi, since it forces all writes to be sync.

This guy has done testing of a similar patch:
https://www.ateamsystems.com/tech-blog/solved-performance-issues-with-freebsd-zfs-backed-esxi-storage-over-nfs/

I am guessing that it must be all the things that happens when you ask ZFS for a sync write, like cache flushing, etc.

I have experienced abmyssal performance on NFS on FreeNAS running a 4x striped mirror of Intel Datacenter SSD's and still not being able to do more than 350MB/s on a 40Gbit connection.

No matter what kind of tweaks/sysctl changes, nothing helped - not even setting sync=forced and adding a Intel Optane 900P as a ZIL. Sync on ZFS is just slow when being triggered via NFS.

ghost

Though it's an improvement from the original patch you referenced, it still disables sync for all NFS clients. Would it be possible to leverage the existing mechanism for disabling sync through *stable so that this can be configured on a per-share basis through the exports file?.

bjornbouetsmith · 2019-10-21T05:31:24Z

@freqlabs Yes it would be nice to have it per share basis for sure, but that would complicate the patch quite a bit, since the code I have changed do not have direct access to the configuration.

But if we can get the NFS Server to pass in whether or not the given share should respect sync writes or not via the stable parameter, then it would be nice.

Then this sysctl flag would not be needed, unless you wanted all your shares to be async forced.

But I do not see any integration in FreeNAS where you can set a NFS share as async only, so a change to this would require that that part be implemented first.

But when that has been implemented and FreeNAS passes in the correct flags when it creates the exports file, then it would work just out of the box I am guessing.

Unless the nfs server would prioritise what the client sends over how the share is configured.

i.e. client has defined its share with sync and server with async.

bjornbouetsmith · 2019-10-21T05:38:32Z

@freqlabs
An alternative to using the stable parameter if the nfs server overwrites the configuration on the share with what the client sends would to implement a sysctl flag for each share that would be created when the nfs server starts and exports the shares.

i.e. vfs.nfds..ignore_sync_writes

Then the "only" challenge would be to know what share the write request is part of, but that is propably relatively easy

so pseudo code:

//read share name from write request

if (vfs.nfds.[share].ignore_sync_writes == 1)
{
syncflags = IO_NODELOCKED;
}

But this kind of patch can be added as an extra patch easily, since it ties easily into the existing patch and would just be an alternative to doing it for all shares.

So I would really this my original patch merged and then I would be happy to work on a different patch that would enable a per share enabling/disabling of this

bjornbouetsmith · 2019-10-21T05:47:06Z

and to update on numbers:
I have just done some testing on my own FreeNAS box, where NFS from ESXi is compared to ISCSi.

I have a pool with 8 intel datacenter SSD's set up in a 4xmirror.-

When writing via NFS (ESXi) I can maximum write 350MB/s sequentially, no matter what tweaks I do on a 40Gb network.

When doing the same via ISCSi from a windows machine running as a VM on EXSi, I can write 3500MB/s sequentially.

Same settings on the dataset, compression enabled, sync=standard - and both writes 0's, so even if ESXi sends utter zeroes, NFS somehome trips ZFS.

And though I don't expect quite the same performance from a patched NFS I am sure it will be better than the existing and worth it.

Unfortunately I don't have a clue on how to set up a FreeNSD build environment where I can build this patch, otherwise I would be more than happy to test a patched NFS server on my production FreeNAS, just so we can get some numbers to compare with, i.e.

sync=disabled on dataset
sync=forced on dataset (with zil)
patch enabled

ghost · 2019-10-21T19:18:08Z

Thank you for your contribution. Among the people I discussed this with internally, there are none in favor of carrying the extra diff against upstream FreeBSD. We do not recommend ignoring sync, it is demanded for a good reason. If there is a performance bug in the NFS server, this is not the correct way to fix it.

If you are interested in setting up a FreeNAS build system for your personal development and testing, please see https://github.com/freenas/build (I recommend at least doubling the minimum hardware requirements it describes if you have the option).

bjornbouetsmith · 2019-10-21T19:32:52Z

Ok,

That was extremely dissapointing - and from my point of view a bit close minded of you.

You should not be the judge of how people treat their NAS.

If you run with battery backed SSD's, UPS, redundant power supplies, the risk is close to zero that anything bad would happen.

And in no way would it risk the underlying pool, it would only risk the files that was written from the network at the time in case the NFS Server or FreeBSD crashed fatally.

its a tiny opt-in feature, meaning it would have no impact unless people actively added the sysctl flag.

Furthermore it would literally take less than a minute to merge in case any conflict occured with the upstream FreeBSD.

So that means if I want this into FreeNAS, I wold have to get the patch into the FreeBSD source code and then it would automatically be merged into FreeNAS at some point?

How often do you merge changes from the FreeBSD version you have branched from?

ghost · 2019-10-21T21:33:47Z

That was extremely dissapointing

Sorry to disappoint you.

You should not be the judge of how people treat their NAS.
the risk is close to zero

I'm not vehemently opposed to the idea of there being some mechanism for ignoring sync if desired. However, the proposed solution is sloppy. You have improved upon the inflexibility of the original patch, but it still introduces a global setting that enables dangerous behavior. I would much prefer to see this implemented in a way that can be controlled from the export options so it is properly integrated with the system.

This feature is best discussed with the FreeBSD project, where hopefully you can get feedback from developers responsible for the NFS code in question.

it would literally take less than a minute to merge

With an entire operating system like FreeNAS, the little patches here and there all add up. Adding another one for something we don't use and don't recommend using is counter to our ongoing efforts to reduce our patch set. A drop in the bucket yes, but unjustified nonetheless.

How often do you merge changes from the FreeBSD version you have branched from?

@amotin can correct me if I'm wrong, but I don't believe we have a fixed merge schedule. Checking through the commit history I see sometimes merges happen a month apart, sometimes days apart.

Add sysctl flag to force nfsd to make all writes async even if client…

cc7c8ee

… request sync

dlavigne requested review from amotin and a user October 20, 2019 13:52

ghost reviewed Oct 21, 2019

View reviewed changes

ghost closed this Oct 21, 2019

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sysctl flag to force nfsd to make all writes async even if client requests sync #208

Add sysctl flag to force nfsd to make all writes async even if client requests sync #208

bjornbouetsmith commented Oct 9, 2019

ghost commented Oct 9, 2019 •

edited by ghost

bjornbouetsmith commented Oct 9, 2019 •

edited

ghost commented Oct 9, 2019

bjornbouetsmith commented Oct 9, 2019

ghost left a comment

bjornbouetsmith commented Oct 21, 2019 •

edited

bjornbouetsmith commented Oct 21, 2019 •

edited

bjornbouetsmith commented Oct 21, 2019 •

edited

ghost commented Oct 21, 2019

bjornbouetsmith commented Oct 21, 2019 •

edited

ghost commented Oct 21, 2019

Add sysctl flag to force nfsd to make all writes async even if client requests sync #208

Add sysctl flag to force nfsd to make all writes async even if client requests sync #208

Conversation

bjornbouetsmith commented Oct 9, 2019

ghost commented Oct 9, 2019 • edited by ghost

bjornbouetsmith commented Oct 9, 2019 • edited

ghost commented Oct 9, 2019

bjornbouetsmith commented Oct 9, 2019

ghost left a comment

Choose a reason for hiding this comment

bjornbouetsmith commented Oct 21, 2019 • edited

bjornbouetsmith commented Oct 21, 2019 • edited

bjornbouetsmith commented Oct 21, 2019 • edited

ghost commented Oct 21, 2019

bjornbouetsmith commented Oct 21, 2019 • edited

ghost commented Oct 21, 2019

ghost commented Oct 9, 2019 •

edited by ghost

bjornbouetsmith commented Oct 9, 2019 •

edited

bjornbouetsmith commented Oct 21, 2019 •

edited

bjornbouetsmith commented Oct 21, 2019 •

edited

bjornbouetsmith commented Oct 21, 2019 •

edited

bjornbouetsmith commented Oct 21, 2019 •

edited