disable 'fsync' by default, but force fsync() in more situations #8304
shada_write_file() is called on exit (:quit and friends), which can be very slow. Note: AFAICT Vim (do_viminfo()) does not appear to fsync() viminfo.
Vim has the 'swapsync' option which we removed in 62d137c. Instead let 'fsync' control swapfile-fsync.

These cases ALWAYS force fsync (ignoring 'fsync' option):
- Idle (CursorHold).
- Exit caused by deadly signal.
- SIGPWR signal.
- Explicit :preserve command.
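The rule in this commit message can be read as a small decision function. The following is only a sketch of that rule, not code from the Nvim source: the `SyncReason` enum, its member names, and `should_fsync()` are hypothetical names introduced for illustration, and `fsync_opt` stands in for the value of the 'fsync' option.

```c
#include <stdbool.h>

// Hypothetical reasons for a swapfile sync (illustrative names only,
// not taken from the Nvim source).
typedef enum {
  kSyncWrite,         // ordinary write or periodic swapfile update
  kSyncIdle,          // idle (CursorHold)
  kSyncDeadlySignal,  // exit caused by a deadly signal
  kSyncSigpwr,        // SIGPWR received
  kSyncPreserve,      // explicit :preserve command
} SyncReason;

// Returns true if fsync() should be called for this sync request.
// `fsync_opt` stands in for the 'fsync' option.
bool should_fsync(bool fsync_opt, SyncReason reason)
{
  switch (reason) {
  case kSyncIdle:
  case kSyncDeadlySignal:
  case kSyncSigpwr:
  case kSyncPreserve:
    return true;       // always forced, 'fsync' is ignored
  case kSyncWrite:
    return fsync_opt;  // otherwise the option decides
  }
  return fsync_opt;    // unknown reason: fall back to the option
}
```

With 'nofsync', `should_fsync(false, kSyncWrite)` is false while `should_fsync(false, kSyncIdle)` is still true, which is the trade-off the commit describes.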
ref neovim#6725

fsync() is very slow on some systems. And since the parent commit, Nvim is smarter about flushing files at certain times (e.g. CursorHold), regardless of whether 'fsync' is enabled. So it's less risky to disable 'fsync'.

Profiling showed slow (2-4s) :write and :quit caused by fsync():

:quit

    shada_write_file(NULL, false);

:write + fsync

    0 0x00007f72da567b2d in fsync () at ../sysdeps/unix/syscall-template.S:84
    1 0x0000000000638970 in uv__fs_fsync (req=<optimized out>) at /home/vagrant/neovim/.deps/build/src/libuv/src/unix/fs.c:150
    2 uv__fs_work (w=<optimized out>) at /home/vagrant/neovim/.deps/build/src/libuv/src/unix/fs.c:953
    3 0x0000000000639a70 in uv_fs_fsync (loop=<optimized out>, req=<optimized out>, file=41, cb=0x7f72da567b2d <fsync+45>) at /home/vagrant/neovim/.deps/build/src/libuv/src/unix/fs.c:1094
    4 0x0000000000573694 in os_fsync (fd=41) at ../src/nvim/os/fs.c:631
    5 0x00000000004ec9dc in buf_write (buf=<optimized out>, fname=<optimized out>, sfname=<optimized out>, start=1, end=1997, eap=0x7fffc864c570, append=<optimized out>, forceit=<optimized out>, reset_changed=<optimized out>, filtering=<optimized out>) at ../src/nvim/fileio.c:3387
    6 0x00000000004b44ff in do_write (eap=0x7fffc864c570) at ../src/nvim/ex_cmds.c:1745
    ...
:write + nofsync

    0 0x00007f72da567b2d in fsync () at ../sysdeps/unix/syscall-template.S:84
    1 0x0000000000638970 in uv__fs_fsync (req=<optimized out>) at /home/vagrant/neovim/.deps/build/src/libuv/src/unix/fs.c:150
    2 uv__fs_work (w=<optimized out>) at /home/vagrant/neovim/.deps/build/src/libuv/src/unix/fs.c:953
    3 0x0000000000639a70 in uv_fs_fsync (loop=<optimized out>, req=<optimized out>, file=36, cb=0x7f72da567b2d <fsync+45>) at /home/vagrant/neovim/.deps/build/src/libuv/src/unix/fs.c:1094
    4 0x0000000000573694 in os_fsync (fd=36) at ../src/nvim/os/fs.c:631
    5 0x0000000000528f5a in mf_sync (mfp=0x7f72d8968d00, flags=5) at ../src/nvim/memfile.c:466
    6 0x000000000052d569 in ml_preserve (buf=0x7f72d890f000, message=0) at ../src/nvim/memline.c:1659
    7 0x00000000004ebadf in buf_write (buf=<optimized out>, fname=<optimized out>, sfname=<optimized out>, start=1, end=1997, eap=0x7fffc864c570, append=<optimized out>, forceit=<optimized out>, reset_changed=<optimized out>, filtering=<optimized out>) at ../src/nvim/fileio.c:3071
    8 0x00000000004b44ff in do_write (eap=0x7fffc864c570) at ../src/nvim/ex_cmds.c:1745
    ...
Use it to verify fsync() behavior.
operations may sometimes take a few seconds.

Files are ALWAYS flushed ('fsync' is ignored) when:
- |CursorHold| event is triggered
CursorHold is based on 'updatetime' but it looks like the decision is being based on 'updatecount'.
No, it's 'updatetime'. before_blocking() calls updatescript(0) which is special-cased.
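To make the distinction in this exchange concrete, here is a simplified sketch of how a zero argument could be special-cased so that the idle path syncs regardless of 'updatecount'. It is an illustration under assumptions, not the Nvim implementation: `updatescript_sketch()`, `updatecount_opt`, `typed_count`, `syncs`, `last_forced`, and `sync_swapfiles()` are all hypothetical stand-ins.

```c
#include <stdbool.h>

// Assumed stand-ins, not the real Nvim globals.
static long updatecount_opt = 200;  // plays the role of 'updatecount'
static long typed_count = 0;        // characters typed since the last sync
static int syncs = 0;               // number of swapfile syncs performed
static bool last_forced = false;    // whether the last sync forced fsync()

// Stand-in for the swapfile sync; `force_fsync` mirrors the "ignore 'fsync'" rule.
static void sync_swapfiles(bool force_fsync)
{
  syncs++;
  last_forced = force_fsync;
}

// c == 0: called just before blocking for input (the idle path).
// c != 0: a typed character, counted against 'updatecount'.
void updatescript_sketch(int c)
{
  if (c == 0 || (updatecount_opt > 0 && ++typed_count >= updatecount_opt)) {
    sync_swapfiles(c == 0);  // only the idle path forces fsync() in this sketch
    typed_count = 0;
  }
}
```

In this sketch the `c == 0` branch never consults `updatecount_opt`, so the idle sync is tied to whatever timer triggers the call, while the counted branch syncs without forcing fsync().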
After reflection, I think it should also happen for 'updatecount'.
Just to clarify, this change means that when The save would cause
The file itself is not fsync'd, only its "memfile" is, and only if the buffer is

It would be nice to fsync the file asynchronously or after idle, but (I would guess) that involves managing file descriptor lifetimes. (And that would be my preference, so then 'fsync' can be enabled by default.)

My assumption is that swapfiles are enough to protect against data loss (within 'updatetime'). Also this suggests that fsync() by itself is not always effective (maybe libuv takes extra measures?).

I left 9139bf8 as a separate commit if we want to revert it. But again, the goal should be to fsync asynchronously. #6725 also needs to be fixed.

references:
Lines 3378 to 3387 in ad60927
Now the file being saved is not fsync'd on write and the memfile is only sync'd after a delay.
Redis does something like that and there was some noise recently about issues PostgreSQL saw related to that.
Or, as tytso suggested in your reference, do the write and fsync async (together).
Yes, if 'fsync' is not set. That's what I said, right? :)
Seems higher risk.
Different idea: what if we called
FEATURES:
3cc7ebf #7234 built-in VimL expression parser
6a7c904 #4419 implement <Cmd> key to invoke command in any mode
b836328 #7679 'startup: treat stdin as text instead of commands'
58b210e :digraphs : highlight with hl-SpecialKey #2690
7a13611 #8276 'startup: Let `-s -` read from stdin'
1e71978 events: VimSuspend, VimResume #8280
1e7d5e8 #6272 'stdpath()'
f96d99a #8247 server: introduce --listen
e8c39f7 #8226 insert-mode: interpret unmapped META as ESC
98e7112 msg: do not scroll entire screen (#8088)
f72630b #8055 let negative 'writedelay' show all redraws
5d2dd2e win: has("wsl") on Windows Subsystem for Linux #7330
a4f6cec cmdline: CmdlineEnter and CmdlineLeave autocommands (#7422)
207b7ca #6844 channels: support buffered output and bytes sockets/stdio

API:
f85cbea #7917 API: buffer updates
418abfc #6743 API: list information about all channels/jobs.
36b2e3f #8375 API: nvim_get_commands
273d2cd #8329 API: Make nvim_set_option() update `:verbose set …`
8d40b36 #8371 API: more reliable/descriptive VimL errors
ebb1acb #8353 API: nvim_call_dict_function
9f994bb #8004 API: nvim_list_uis
3405704 #7520 API/UI: forward option updates to UIs
911b1e4 #7821 API: improve nvim_command_output

WINDOWS OS:
9cefd83 #8084, #8516 build/win: support MSVC
ee4e1fd win: Fix reading content from stdin (#8267)

TUI:
ffb8904 #8309 TUI: add support for mouse release events in urxvt
8d5a46e #8081 TUI: implement "standout" attribute
6071637 TUI: support TERM=konsole-256color
67848c0 #7653 TUI: report TUI info with -V3 ('verbose' >= 3)
3d0ee17 TUI/rxvt: enable focus-reporting
d109f56 #7640 TUI: 'term' option: reflect effective terminal behavior

FIXES:
ed6a113 #8273 'job-control: avoid kill-timer race'
4e02f1a #8107 'jobs: separate process-group'
451c48a terminal: flush vterm output buffer on pty output #8486
5d6732f :checkhealth fixes #8335
53f11dc #8218 'Fix errors reported by PVS'
d05712f inccommand: pause :terminal redraws (#8307)
51af911 inccommand: do not execute trailing commands #8256
84359a4 terminal: resize to the max dimensions (#8249)
d49c1dd #8228 Make vim_fgets() return the same values as in Vim
60e96a4 screen: winhl=Normal:Background should not override syntax (#8093)
0c59ac1 #5908 'shada: Also save numbered marks'
ba87a2c cscope: ignore EINTR while reading the prompt (#8079)
b1412dc #7971 ':terminal Enter/Leave should not increment jumplist'
3a5721e TUI: libtermkey: force CSI driver for mouse input #7948
6ff13d7 #7720 TUI: faster startup
1c6e956 #7862 TUI: fix resize-related segfaults
a58c909 #7676 TUI: always hide cursor when flushing, never flush buffers during unibilium output
303e1df #7624 TUI: disable BCE almost always
249bdb0 #7761 mark: Make sure that jumplist item will not have zero lnum
6f41ce0 #7704 macOS: Set $LANG based on the system locale
a043899 #7633 'Retry fgets on EINTR'

CHANGES:
ad60927 #8304 default to 'nofsync'
f3f1970 #8035 defaults: 'fillchars'
a6052c7 #7984 defaults: sidescroll=1
b69fa86 #7888 defaults: enable cscopeverbose
7c4bb23 defaults: do :filetype stuff unless explicitly "off"
2aa308c #5658 'Apply :lmap in macros'
8ce6393 terminal: Leave 'relativenumber' alone (#8360)
e46534b #4486 refactor: Remove maxmem, maxmemtot options
131aad9 win: defaults: 'shellcmdflag', 'shellxquote' #7343
c57d315 #8031 jobwait(): return -2 on interrupt also with timeout
6452831 clipboard: macOS: fallback to tmux if pbcopy is broken #7940
300d365 #7919 Make 'langnoremap' apply directly after a map
ada1956 #7880 'lua/executor: Remove lightuserdata'

INTERNAL:
de0a954 #7806 internal statistics for list impl
dee78a4 #7708 rewrite internal list impl
@lewis6991 it depends on the filesystem. This PR isn't perfect, but in general we shouldn't need to eagerly fsync(), instead we should lazily fsync() at 'updatetime' (swapfiles at least, but ideally a queue of all writes).
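The "queue of all writes" idea could be sketched roughly as follows. This is a minimal illustration under assumptions, not something the PR implements: `PendingSyncs`, `pending_add()`, `pending_flush()`, and `PENDING_MAX` are hypothetical names, and the sketch assumes every queued file descriptor is still open when the idle flush runs (the file-descriptor-lifetime problem mentioned earlier in the thread is not solved here).

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>  // fsync()

#define PENDING_MAX 64  // arbitrary capacity chosen for the sketch

typedef int (*sync_fn)(int fd);  // e.g. fsync()

// Hypothetical queue of file descriptors with unsynced data.
typedef struct {
  int fds[PENDING_MAX];
  size_t len;
} PendingSyncs;

// Record that `fd` has unsynced data. Returns false if the queue is full,
// in which case the caller should sync eagerly instead.
bool pending_add(PendingSyncs *q, int fd)
{
  for (size_t i = 0; i < q->len; i++) {
    if (q->fds[i] == fd) {
      return true;  // already queued
    }
  }
  if (q->len == PENDING_MAX) {
    return false;
  }
  q->fds[q->len++] = fd;
  return true;
}

// Called when idle (e.g. after 'updatetime'): sync everything queued.
// Returns the number of descriptors whose sync failed.
size_t pending_flush(PendingSyncs *q, sync_fn do_sync)
{
  size_t failed = 0;
  for (size_t i = 0; i < q->len; i++) {
    if (do_sync(q->fds[i]) != 0) {
      failed++;
    }
  }
  q->len = 0;
  return failed;
}
```

A caller would use `pending_add(&q, fd)` after each write and `pending_flush(&q, fsync)` from the idle handler. Passing the sync function as a parameter keeps the queue testable without touching the disk.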
The default value of 'updatetime' is 4 seconds though? That seems quite a lot.
CI sometimes fails:

    FAILED test/functional/core/fileio_spec.lua @ 52: fileio fsync() codepaths neovim#8304
    test/functional/core/fileio_spec.lua:87: Expected objects to be the same.
    Passed in:
    (number) 3
    Expected:
    (number) 2
    stack traceback:
    test/functional/core/fileio_spec.lua:87: in function <test/functional/core/fileio_spec.lua:52>
Problem: CI sometimes fails:

    FAILED test/functional/core/fileio_spec.lua @ 52: fileio fsync() codepaths neovim#8304
    test/functional/core/fileio_spec.lua:87: Expected objects to be the same.
    Passed in:
    (number) 3
    Expected:
    (number) 2
    stack traceback:
    test/functional/core/fileio_spec.lua:87: in function <test/functional/core/fileio_spec.lua:52>

Solution: Something is triggering an extra fsync, possibly timing-related after the jobstart() call. To avoid the (speculated) race, rearrange the test so that jobstart() is done last.
Problem: CI sometimes fails:

    FAILED test/functional/core/fileio_spec.lua @ 52: fileio fsync() codepaths neovim#8304
    test/functional/core/fileio_spec.lua:87: Expected objects to be the same.
    Passed in:
    (number) 3
    Expected:
    (number) 2
    stack traceback:
    test/functional/core/fileio_spec.lua:87: in function <test/functional/core/fileio_spec.lua:52>

Solution: Something is triggering an extra fsync, possibly timing-related after the jobstart() call. Set 'updatecount' to a high value.
Problem: CI sometimes fails. Something is triggering an extra fsync().

    FAILED test/functional/core/fileio_spec.lua @ 52: fileio fsync() codepaths neovim#8304
    test/functional/core/fileio_spec.lua:87: Expected objects to be the same.
    Passed in:
    (number) 3
    Expected:
    (number) 2
    stack traceback:
    test/functional/core/fileio_spec.lua:87: in function <test/functional/core/fileio_spec.lua:52>

Solution: Relax the assertion to `fsync >= 2` instead of exactly 2. (Note this is not a behavior change: the next assertion has always checked `fsync == 4`, it's just that the intermediate 3rd fsync was never explicitly asserted.)
Followup to 27501d3.

Problem: CI sometimes fails. Something is triggering an extra fsync().

    FAILED test/functional/core/fileio_spec.lua @ 52: fileio fsync() with 'nofsync' neovim#8304
    test/functional/core/fileio_spec.lua:100: Expected objects to be the same.
    Passed in:
    (number) 5
    Expected:
    (number) 4

Solution: Relax the assertion.
fsync() is very slow on my system. It randomly causes :write and :quit to take 3+ seconds.

This PR does several things:
- fsync() (regardless of 'fsync' option) in these cases:

Also mitigates #6725.
ref ludovicchabant/vim-gutentags#167