Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call `fsync` on `flush` #128

Merged
merged 2 commits into from Jan 14, 2020
Merged

Call `fsync` on `flush` #128

merged 2 commits into from Jan 14, 2020

Conversation

@pascutto
Copy link
Contributor

pascutto commented Nov 14, 2019

Fixes #126.
I'm not sure why this should be an option though, I feel like calling flush we expect the content to be... flushed. Exposing some details of our internal caches doesn't feel right to me, and I can't think of a scenario where the use calls flush but agrees that some data is still not written to the disk, so this PR calls fsync on every flush.
Let me know if I got something wrong about it, maybe @samoht you have some insights about it?

@pascutto pascutto force-pushed the pascutto:with_fsync branch from 2380df9 to fdf065a Nov 14, 2019
@samoht

This comment has been minimized.

Copy link
Member

samoht commented Nov 14, 2019

There are various consistency and durability models :-) see lmdb and/or the doc for open. Sometimes it’s fine to just be consistent but not durable, so we should let users have that option

@edwintorok

This comment has been minimized.

Copy link

edwintorok commented Nov 17, 2019

It would be useful to do an fsync in rename too.
See https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ and https://www.kernel.org/doc/Documentation/filesystems/ext4.txt. Ext4 has some code to detect "broken applications" that do not use fsync before renaming, but not all FS may do:

Many broken applications don't use fsync() when
replacing existing files via patterns such as
fd = open("foo.new")/write(fd,..)/close(fd)/
rename("foo.new", "foo"), or worse yet,
fd = open("foo", O_TRUNC)/write(fd,..)/close(fd).
If auto_da_alloc is enabled, ext4 will detect
the replace-via-rename and replace-via-truncate
patterns and force that any delayed allocation
blocks a

More details with sample code: https://lwn.net/Articles/457667/

For just flushing the buffers fdatasync may be a lower overhead option (it writes out only the data, and not file metadata like updated mtime)

@pascutto pascutto force-pushed the pascutto:with_fsync branch from fdf065a to 8446844 Jan 13, 2020
pascutto added 2 commits Nov 14, 2019
@pascutto pascutto force-pushed the pascutto:with_fsync branch from 8446844 to 6580e0f Jan 13, 2020
@pascutto

This comment has been minimized.

Copy link
Contributor Author

pascutto commented Jan 13, 2020

Rebased, added fsync as an option only, and applied it before renaming, as well as on merge and close since those are not frequent and performing a fsync for those would require the user to manually specify the files, since you can't flush after close, and flush does not flush the data file..
Comments welcome on naming and documentation.

@samoht
samoht approved these changes Jan 14, 2020
@pascutto pascutto merged commit 386a82c into mirage:master Jan 14, 2020
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
ocaml-ci Passed
Details
@pascutto pascutto deleted the pascutto:with_fsync branch Jan 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants
You can’t perform that action at this time.