
fsync all the things #4030

Merged
merged 13 commits into from
Mar 22, 2017

Conversation

ethomson (Member)

Optionally fsync loose objects, packfiles and their indexes, loose references and packed reference files. This can be enabled by setting the option GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION, and it is not enabled by default.

Additionally, enable the do_fsync option to the loose odb backend to perform similar calls.

At present, we do not honor core.fsyncObjectFiles, though this would be a logical next step.

@ethomson (Member Author)

Fixed this up for windows; should be ready for review.

@ethomson (Member Author)

Fixed up the conflicts.

@ioquatix commented Feb 13, 2017

Are you calling fsync on the parent directory when creating new files?

@ethomson (Member Author)

Are you calling fsync on the parent directory when creating new files?

No - good call. On at least POSIX-type systems this is necessary when creating new files. As far as I can tell, you cannot FlushFileBuffers on a directory handle on win32.

@ioquatix

Once you commit this into master and expose it via rugged (Ruby) somehow, I will test it and let you know how it affects performance.

@ioquatix

No - good call. On at least POSIX-type systems this is necessary when creating new files.

One more case - after calling link to "move" data into place, should we be calling fsync on the parent directory to ensure the results of link are persisted to disk?

@ioquatix commented Feb 13, 2017

I'd like to see this PR get merged, but I'd also like to start an extended discussion on how to make this more efficient.

Of course, calling fsync after every write is not efficient. It will introduce significant latency. For many use cases, fsync only needs to be called before updating the target ref.

I think it would be nice to review how Apache Lucene implements durability: http://blog.mikemccandless.com/2014/04/testing-lucenes-index-durability-after.html

In particular, the way they batch changes and the way they test that fsync works seem like good ideas suitable for the odb backend.

I'm not suggesting we defer this PR at all - correct is more important than performance. But we should probably consider performance too. If we could make the fsync code path almost as fast as the alternative, making libgit2 durable at little cost, I'd humbly propose it become the default.

The data git manages is too critical for corruption and/or failures.

@ghost commented Feb 14, 2017

@ioquatix

I'd like to see this PR get merged, but I'd also like to start an extended discussion on how to make this more efficient.

This is why git doesn't call fsync; it kills performance. Databases solve this by having a single file for the undo/redo logs, which is truly append-only (SATA NCQ really likes to optimize writes for these logs).

I guess this is why git is not a concurrent database system, and never will be.

NOTE: do_fsync will always be default off, right?

@ghost commented Feb 15, 2017

@ethomson Don't use FlushFileBuffers, it can be extremely slow (by Microsoft's own admission in the CreateFile docs). Use FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING whenever you can instead.

@ethomson (Member Author)

The data git manages is too critical for corruption and/or failures.

I agree with @pureconfig here. This is why we have a pluggable odb backend. Our default is to mimic git; if they change the way they write object files, we will change to match. But if you want some guarantees that git doesn't offer, then I would recommend putting your objects in a proper database.

In particular, the way they batch changes, and test that fsync works seem like good ideas suitable for the odb backend.

I haven't read this paper, but my knee-jerk reaction is that any time we differ from git, we are incorrect. Generally speaking, we aim for bug-for-bug compatibility with git.

NOTE: do_fsync will always be default off, right?

Yes. If git changes this default, so will we, but I can't imagine them doing so.

@ethomson (Member Author)

Don't use FlushFileBuffers, it can be extremely slow (by Microsoft's own admission in the CreateFile docs). Use FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING whenever you can instead.

/cc @whoisj

👋 I had it in my head that your filesystem team explicitly recommended that we use FlushFileBuffers here. Do you happen to remember the details here?

If FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING is safe enough, then I can use that instead. Or we may be able to have a multistate setting that is unsafe and fast, reasonably safe and slow, and safest and punishing.

@ioquatix

I agree with @pureconfig here. This is why we have a pluggable odb backend. Our default is to mimic git; if they change the way they write object files, we will change to match. But if you want some guarantees that git doesn't offer, then I would recommend putting your objects in a proper database.

Absolutely, but the git ODB with fsync would be good enough for my use case. Even without fsync, it's good enough actually, but it's a nice feature to have to improve durability.

@ghost commented Feb 15, 2017

@ioquatix

Even without fsync, it's good enough actually, but it's a nice feature to have to improve durability

But you also said

correct is more important than performance.

So "nice feature" is an understatement. As mentioned by someone else, it sounds like you want guarantees implying the use of a proper database :)

@whoisj commented Feb 15, 2017

👋 I had it in my head that your filesystem team explicitly recommended that we use FlushFileBuffers here. Do you happen to remember the details here?

@ethomson this sounds correct, but it was about a year ago (or more?) that we had the discussion with them... and man, I'm getting old. My memory isn't what it used to be. So, on that note I've sent them a query re: this topic and hopefully will have a concrete answer for you in a day or so. 😄

@ioquatix

correct is more important than performance.

Yeah, I still believe this and I think libgit2 should aim for this by default, but I also appreciate the "bug-for-bug" compatibility goal.

Even without fsync, it's good enough actually, but it's a nice feature to have to improve durability

End to end durability isn't as simple as just calling fsync and being done with it, as I'm sure you know.

@ioquatix commented Feb 15, 2017

@ethomson Do you guys use libgit2 in Spokes (DGit)? How do you deal with durability issues there? According to the blog, you wait for two confirmed writes out of 3, but how do you validate those writes locally?

@ethomson (Member Author)

@ioquatix We use libgit2 and git both on our fileservers, which are all in Spokes. I don't want to spend a lot of time discussing GitHub's products in the libgit2 issue tracker, but the point of Spokes itself is to deal with problems like this. You can learn more at https://www.youtube.com/watch?v=f7ecUqHxD7o and https://www.youtube.com/watch?v=DY0yNRNkYb0

@whoisj commented Feb 17, 2017

@ethomson the file system experts advise against using FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING, as those flags will likely cause exactly the behavior you're trying to avoid by emulating fsync() on Windows. They asked me to share this blog post: https://blogs.msdn.microsoft.com/oldnewthing/20140306-00/?p=1583/ .

Per the experts, the extra time spent not buffering and forcing the write to block on I/O availability almost guarantees your data won't get to persistent media if there's a catastrophe. Instead, they recommend that the application flush data at appropriate times.

To quote them exactly:

What is the point in having the data durable on the disk without the metadata durable? So the bits are there, but after the power failure the file isn’t readable.

FILE_FLAG_NO_BUFFERING => Make it really slow to read the data later because it is not buffered.
FILE_FLAG_WRITE_THROUGH => Make it really slow to write the data now because the I/O can’t be scheduled efficiently.

If the developer wants a flush they should use the API for flush.

@ethomson (Member Author)

Thanks @whoisj for the definitive answer.

I've updated this to fsync parent directories in most cases (when a directory entry needs updating), except on Windows (where such semantics aren't permitted and don't make sense).

I've also changed the name of the option to GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION which makes more sense, I think.

Edward Thomson added 7 commits February 28, 2017 13:27
We've had an fsync option for a long time, but it was "ignored".
Stop ignoring it.
Allow users to enable `SYNCHRONIZED_OBJECT_CREATION` with a setting.
Introduce a simple counter that `p_fsync` implements.  This is useful
for ensuring that `p_fsync` is called when we expect it to be, for
example when we have enabled an odb backend to perform `fsync`s when
writing objects.
Honor `git_object__synchronized_writing` when creating a packfile and
corresponding index.
Add a custom `O_FSYNC` bit (if it's not been defined by the operating
system) so that `git_futils_writebuffer` can optionally do an `fsync`
when it's done writing.

We call `fsync` ourselves, even on systems that define `O_FSYNC` because
its definition is no guarantee of its actual support.  Mac, for
instance, defines it but doesn't support it in an `open(2)` call.
Edward Thomson added 5 commits February 28, 2017 13:28
Only use defaults for `git_futils_writebuffer` when flags == 0, lest
(1 << 31) be treated as the defaults.
When fsync'ing files, fsync the parent directory in the case where we
rename a file into place, or create a new file, to ensure that the
directory entry is flushed correctly.
Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` ->
`GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
@ethomson (Member Author)

I added support for core.fsyncObjectFiles; this involved a little refactoring around how odbs get created. I cribbed the pattern from git_index_set_caps, which I sort of loathe, but it's effective enough and an existing pattern.

This seems to be failing on Linux, though, after that. I'll take a look.

@pks-t (Member) commented Mar 1, 2017

The question is more like why did it work on Windows:

diff --git a/src/odb.c b/src/odb.c
index 7ca678fdc..0e0dc5256 100644
--- a/src/odb.c
+++ b/src/odb.c
@@ -632,7 +632,7 @@ int git_odb__set_caps(git_odb *odb, int caps)
 			return -1;
 		}
 
-		if (!git_repository__cvar(&val, repo, GIT_CVAR_IGNORECASE))
+		if (!git_repository__cvar(&val, repo, GIT_CVAR_FSYNCOBJECTFILES))
 			odb->do_fsync = !!val;
 	}
 
diff --git a/tests/odb/loose.c b/tests/odb/loose.c
index d389016eb..851e9083f 100644
--- a/tests/odb/loose.c
+++ b/tests/odb/loose.c
@@ -193,8 +193,8 @@ void test_odb_loose__fsync_obeys_repo_setting(void)
 	git_oid oid;
 
 	cl_git_pass(git_repository_init(&repo, "test-objects", 1));
-	cl_git_pass(git_repository_odb__weakptr(&odb, repo));
 	cl_repo_set_bool(repo, "core.fsyncObjectFiles", true);
+	cl_git_pass(git_repository_odb__weakptr(&odb, repo));
 	cl_git_pass(git_odb_write(&oid, odb, "Test data\n", 10, GIT_OBJ_BLOB));
 	cl_assert(p_fsync__cnt > 0);
 	git_repository_free(repo);

Two issues:

  1. This one is obvious, simply a typo
  2. You set "core.fsyncObjectFiles" to true after getting the ODB. As this initializes `odb->do_fsync` before you've changed the configuration, it will naturally be set to 0.

So this leaves me wondering why it did not fail on Windows.

@ethomson (Member Author) commented Mar 1, 2017

Haha; it succeeded on Mac and Windows explicitly because of the typo. Their ignoreCase-ness is configured long before we set up the odb.

Thanks for catching that, I'll fix it tonight.

@ethomson (Member Author) commented Mar 1, 2017

Yep; I did the same thing as your patch and will verify that the CI builds are happy before squashing it into the previous commit.

@ethomson (Member Author) commented Mar 1, 2017

If your question was why we didn't fail on Windows because it was fsync'ing when it shouldn't, the answer is that we didn't actually test that case. We ensured that we didn't fsync given an odb that isn't attached to a repo, but I never tested it given a repo, so we were simply always fsync'ing on platforms that were case insensitive.

Anyway, I also added a test to ensure that we don't do that, since that would be for reals bad to regress. 😛

@ioquatix commented Mar 1, 2017

core.fsyncObjectFiles is the config option, but ENABLE_SYNCHRONIZED_OBJECT_CREATION is the library option; is this inconsistent? Do we have a chance to review this? fsyncObjectFiles is sort of an implementation detail. What about core.durableObjectCreation or something more abstract? I'd suggest perhaps ENABLE_DURABLE_OBJECT_CREATION or something that directly reflects the goal of the feature, not its implementation.

@ethomson (Member Author) commented Mar 1, 2017

fsyncObjectFiles is sort of an implementation detail.

core.fsyncObjectFiles is the name of the configuration option that git uses. The explicit goal is for git and libgit2 to behave the same way on the same repository. It would not make sense for a user who sets core.fsyncObjectFiles, expecting their object files to be fsynced, to have them fsynced by git but not by Git Kraken.

@ethomson (Member Author)

This has been hanging out for a while without any objections and since it doesn't change any default behavior (and some people seem interested in it) I'm going to go ahead and merge it.

@ethomson ethomson merged commit 6fd6c67 into master Mar 22, 2017
@whoisj commented Mar 23, 2017

@ethomson this is rather a big deal, congrats on landing this and cross-platform no doubt. Well done sir. Well done indeed. 😄

@ioquatix

If I update rugged from git, will it pull this code, or do you need to make a new release? I'd like to measure performance.

@whoisj commented Mar 23, 2017

If I update rugged from git, will it pull this code, or do you need to make a new release? I'd like to measure performance.

It is open source, and its use of CMake makes it very flexible. Why not compile it and drop it into your existing framework to test?

@ioquatix

Because I use the rugged gem and my preference is that it just works without changing the existing build process.

@ethomson (Member Author)

@ioquatix Rugged is tied to an exact version of libgit2, so it won't be updated automatically. I will send a PR to Rugged shortly.

@ioquatix

@ethomson That's awesome, thanks. Then, I'll report back. It will be really interesting to see what the impact is.

@ioquatix commented Jul 31, 2017

Okay here is my report.

I have two tests: https://github.com/ioquatix/relaxo/blob/9db8348fdeec8b5e3572bc543ab02af1bf0055b2/spec/relaxo/performance_spec.rb#L57-L77

One test creates lots of objects for each commit, and the other creates lots of separate commits with one object.

Without fsync, on an oldish i7, running on a Samsung 840 PRO:

Relaxo Performance
Warming up --------------------------------------
              single   372.000  i/100ms
Calculating -------------------------------------
              single     25.267k (± 3.8%) i/s -    504.432k in  20.007610s
  single transaction should be fast
Warming up --------------------------------------
            multiple   373.000  i/100ms
Calculating -------------------------------------
            multiple      3.203k (± 4.8%) i/s -     64.156k in  20.091674s
  multiple transactions should be fast

With fsync enabled:

Relaxo Performance
Warming up --------------------------------------
              single    12.000  i/100ms
Calculating -------------------------------------
              single      1.432k (±18.6%) i/s -     26.712k in  20.018961s
  single transaction should be fast
Warming up --------------------------------------
            multiple     4.000  i/100ms
Calculating -------------------------------------
            multiple     40.816  (± 4.9%) i/s -    816.000  in  20.087377s
  multiple transactions should be fast

@ioquatix

My conclusion is that for real-world usage, fsync should make no appreciable difference for general day-to-day git use, but for high-performance services it might be an issue.

@ethomson ethomson deleted the ethomson/fsync branch January 9, 2019 10:18