
fsync all the things #4030

Merged
merged 13 commits into from
Mar 22, 2017

Conversation

ethomson (Member)

Optionally fsync loose objects, packfiles and their indexes, loose references and packed reference files. This can be enabled by setting the option GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION, and it is not enabled by default.

Additionally, enable the do_fsync option to the loose odb backend to perform similar calls.

At present, we do not honor core.fsyncObjectFiles, though this would be a logical next step.

@ethomson (Member Author)

Fixed this up for windows; should be ready for review.

@ethomson (Member Author)

Fixed up the conflicts.

@ioquatix commented Feb 13, 2017

Are you calling fsync on the parent directory when creating new files?

@ethomson (Member Author)

Are you calling fsync on the parent directory when creating new files?

No - good call. On at least POSIX-type systems this is necessary when creating new files. As far as I can tell, you cannot FlushFileBuffers on a directory handle on win32.

@ioquatix

Once you commit this into master and expose it via rugged (Ruby) somehow, I will test it and let you know how it affects performance.

@ioquatix

No - good call. On at least POSIX-type systems this is necessary when creating new files.

One more case - after calling link to "move" data into place, should we be calling fsync on the parent directory to ensure the results of link are persisted to disk?

@ioquatix commented Feb 13, 2017

I'd like to see this PR get merged, but I'd also like to start an extended discussion on how to make this more efficient.

Of course, calling fsync after every write is not efficient. It will introduce significant latency. For many use cases, fsync only needs to be called before updating the target ref.

I think it would be nice to review how Apache Lucene implements durability: http://blog.mikemccandless.com/2014/04/testing-lucenes-index-durability-after.html

In particular, the way they batch changes and the way they test that fsync works seem like good ideas suitable for the odb backend.

I'm not suggesting we defer this PR at all - correct is more important than performance. But we should probably consider performance too. If we could make the fsync code path almost as fast as the alternative, making libgit2 durable at little cost, I'd humbly propose it become the default.

The data git manages is too critical for corruption and/or failures.

@ghost commented Feb 14, 2017

@ioquatix

I'd like to see this PR get merged, but I'd also like to start an extended discussion on how to make this more efficient.

This is why git doesn't call fsync; it kills performance. Databases solve this by having a single file for the undo/redo logs, which is truly append-only (SATA NCQ really likes to optimize writes for these logs).

I guess this is why git is not a concurrent database system, and never will be.

NOTE: do_fsync will always be default off, right?

@ghost commented Feb 15, 2017

@ethomson Don't use FlushFileBuffers, it can be extremely slow (by Microsoft's own admission in the CreateFile docs). Use FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING whenever you can instead.

@ethomson (Member Author)

The data git manages is too critical for corruption and/or failures.

I agree with @pureconfig here. This is why we have a pluggable odb backend. Our default is to mimic git; if they change the way they write object files, we will change to match. But if you want some guarantees that git doesn't offer, then I would recommend putting your objects in a proper database.

In particular, the way they batch changes, and test that fsync works seem like good ideas suitable for the odb backend.

I haven't read this paper, but my knee-jerk reaction is that any time we differ from git, we are incorrect. Generally speaking, we aim for bug-for-bug compatibility with git.

NOTE: do_fsync will always be default off, right?

Yes. If git changes this default, so will we, but I can't imagine them doing so.

@ethomson (Member Author)

Don't use FlushFileBuffers, it can be extremely slow (by Microsoft's own admission in the CreateFile docs). Use FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING whenever you can instead.

/cc @whoisj

👋 I had it in my head that your filesystem team explicitly recommended that we use FlushFileBuffers here. Do you happen to remember the details here?

If FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING is safe enough, then I can use that instead. Or we may be able to have a multistate setting that is unsafe and fast, reasonably safe and slow, and safest and punishing.

@ioquatix

I agree with @pureconfig here. This is why we have a pluggable odb backend. Our default is to mimic git; if they change the way they write object files, we will change to match. But if you want some guarantees that git doesn't offer, then I would recommend putting your objects in a proper database.

Absolutely, but the git ODB with fsync would be good enough for my use case. Even without fsync, it's good enough actually, but it's a nice feature to have to improve durability.

@ghost commented Feb 15, 2017

@ioquatix

Even without fsync, it's good enough actually, but it's a nice feature to have to improve durability

But you also said

correct is more important than performance.

So "nice feature" is an understatement. As mentioned by someone else, it sounds like you want guarantees implying the use of a proper database :)

@whoisj commented Feb 15, 2017

👋 I had it in my head that your filesystem team explicitly recommended that we use FlushFileBuffers here. Do you happen to remember the details here?

@ethomson this sounds correct, but it was about a year ago (or more?) that we had the discussion with them... and man, I'm getting old. My memory isn't what it used to be. So, on that note I've sent them a query re: this topic and hopefully will have a concrete answer for you in a day or so. 😄

@ioquatix

correct is more important than performance.

Yeah, I still believe this and I think libgit2 should aim for this by default, but I also appreciate the "bug-for-bug" compatibility goal.

Even without fsync, it's good enough actually, but it's a nice feature to have to improve durability

End to end durability isn't as simple as just calling fsync and being done with it, as I'm sure you know.

@ioquatix commented Feb 15, 2017

@ethomson Do you guys use libgit2 in Spokes (DGit)? How do you deal with durability issues there? According to the blog, you wait for two confirmed writes out of 3, but how do you validate those writes locally?

@ethomson (Member Author)

@ioquatix We use libgit2 and git both on our fileservers, which are all in Spokes. I don't want to spend a lot of time discussing GitHub's products in the libgit2 issue tracker, but the point of Spokes itself is to deal with problems like this. You can learn more at https://www.youtube.com/watch?v=f7ecUqHxD7o and https://www.youtube.com/watch?v=DY0yNRNkYb0

@whoisj commented Feb 17, 2017

@ethomson the file system experts advise against using FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING, as those flags will likely cause exactly the behavior you're trying to avoid by emulating fsync() on Windows. They asked me to share this blog post: https://blogs.msdn.microsoft.com/oldnewthing/20140306-00/?p=1583/ .

Per the experts, the extra time spent not buffering and forcing the write to block on I/O availability almost guarantees your data won't get to persistent media if there's a catastrophe. Instead, they recommend that the application flush data at appropriate times.

To quote them exactly:

What is the point in having the data durable on the disk without the metadata durable? So the bits are there, but after the power failure the file isn’t readable.

FILE_FLAG_NO_BUFFERING => Make it really slow to read the data later because it is not buffered.
FILE_FLAG_WRITE_THROUGH => Make it really slow to write the data now because the I/O can’t be scheduled efficiently.

If the developer wants a flush they should use the API for flush.

@ethomson (Member Author)

Thanks @whoisj for the definitive answer.

I've updated this to fsync parent directories in most cases (when a directory entry needs updating), except on Windows (where such semantics aren't permitted and don't make sense).

I've also changed the name of the option to GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION which makes more sense, I think.

Edward Thomson added 7 commits February 28, 2017 13:27
We've had an fsync option for a long time, but it was "ignored".
Stop ignoring it.
Allow users to enable `SYNCHRONIZED_OBJECT_CREATION` with a setting.
Introduce a simple counter that `p_fsync` implements.  This is useful
for ensuring that `p_fsync` is called when we expect it to be, for
example when we have enabled an odb backend to perform `fsync`s when
writing objects.
Honor `git_object__synchronized_writing` when creating a packfile and
corresponding index.
Add a custom `O_FSYNC` bit (if it's not been defined by the operating
system) so that `git_futils_writebuffer` can optionally do an `fsync`
when it's done writing.

We call `fsync` ourselves, even on systems that define `O_FSYNC` because
its definition is no guarantee of its actual support.  Mac, for
instance, defines it but doesn't support it in an `open(2)` call.
Edward Thomson added 5 commits February 28, 2017 13:28
Only use defaults for `git_futils_writebuffer` when flags == 0, lest
(1 << 31) be treated as the defaults.
When fsync'ing files, fsync the parent directory in the case where we
rename a file into place, or create a new file, to ensure that the
directory entry is flushed correctly.
Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` ->
`GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
@ethomson (Member Author)

I added support for core.fsyncObjectFiles; this involved a little refactoring around how odbs get created. I cribbed the pattern from git_index_set_caps, which I sort of loathe, but it's effective enough and an existing pattern.

This seems to be failing on Linux, though, after that. I'll take a look.

@pks-t (Member) commented Mar 1, 2017

The question is more like why did it work on Windows:

diff --git a/src/odb.c b/src/odb.c
index 7ca678fdc..0e0dc5256 100644
--- a/src/odb.c
+++ b/src/odb.c
@@ -632,7 +632,7 @@ int git_odb__set_caps(git_odb *odb, int caps)
 			return -1;
 		}
 
-		if (!git_repository__cvar(&val, repo, GIT_CVAR_IGNORECASE))
+		if (!git_repository__cvar(&val, repo, GIT_CVAR_FSYNCOBJECTFILES))
 			odb->do_fsync = !!val;
 	}
 
diff --git a/tests/odb/loose.c b/tests/odb/loose.c
index d389016eb..851e9083f 100644
--- a/tests/odb/loose.c
+++ b/tests/odb/loose.c
@@ -193,8 +193,8 @@ void test_odb_loose__fsync_obeys_repo_setting(void)
 	git_oid oid;
 
 	cl_git_pass(git_repository_init(&repo, "test-objects", 1));
-	cl_git_pass(git_repository_odb__weakptr(&odb, repo));
 	cl_repo_set_bool(repo, "core.fsyncObjectFiles", true);
+	cl_git_pass(git_repository_odb__weakptr(&odb, repo));
 	cl_git_pass(git_odb_write(&oid, odb, "Test data\n", 10, GIT_OBJ_BLOB));
 	cl_assert(p_fsync__cnt > 0);
 	git_repository_free(repo);

Two issues:

  1. This one is obvious, simply a typo
  2. You set "core.fsyncObjectFiles" to true after getting the ODB. As this initializes `odb->do_fsync` before you've changed the configuration, it will naturally be set to 0.

So this leaves me wondering why it did not fail on Windows.

@ethomson (Member Author) commented Mar 1, 2017

Haha; it succeeded on Mac and Windows explicitly because of the typo. Their ignoreCase-ness is configured long before we set up the odb.

Thanks for catching that, I'll fix it tonight.

@ethomson (Member Author) commented Mar 1, 2017

Yep; I did the same thing as your patch and will verify that the CI builds are happy before squashing it into the previous commit.

@ethomson (Member Author) commented Mar 1, 2017

If your question was why we didn't fail on Windows because it was fsync'ing when it shouldn't, the answer is that we didn't actually test that case. We ensured that we didn't fsync given an odb that isn't attached to a repo, but I never tested it given a repo, so we were simply always fsync'ing on platforms that were case insensitive.

Anyway, I also added a test to ensure that we don't do that, since that would be for reals bad to regress. 😛

@ioquatix commented Mar 1, 2017

core.fsyncObjectFiles is the config option, but ENABLE_SYNCHRONIZED_OBJECT_CREATION is the library option; is this inconsistent? Do we have a chance to review this? fsyncObjectFiles is sort of an implementation detail. What about core.durableObjectCreation or something more abstract? I'd suggest perhaps ENABLE_DURABLE_OBJECT_CREATION or something that directly reflects the goal of the feature, not its implementation.

@ethomson (Member Author) commented Mar 1, 2017

fsyncObjectFiles is sort of an implementation detail.

core.fsyncObjectFiles is the name of the configuration option that git uses. The explicit goal is for git and libgit2 to behave the same way on the same repository. It would not make sense for a user who sets core.fsyncObjectFiles, expecting their object files to be fsynced, to have them fsynced by git but not by Git Kraken.

@ethomson (Member Author)

This has been hanging out for a while without any objections and since it doesn't change any default behavior (and some people seem interested in it) I'm going to go ahead and merge it.

@ethomson ethomson merged commit 6fd6c67 into master Mar 22, 2017
@whoisj commented Mar 23, 2017

@ethomson this is rather a big deal, congrats on landing this and cross-platform no doubt. Well done sir. Well done indeed. 😄

@ioquatix

If I update rugged from git, will it pull this code, or do you need to make a new release? I'd like to measure performance.

@whoisj commented Mar 23, 2017

If I update rugged from git, will it pull this code, or do you need to make a new release? I'd like to measure performance.

It is open source, and its use of CMake makes it very flexible. Why not compile it and drop it into your existing framework to test?

@ioquatix

Because I use the rugged gem and my preference is that it just works without changing the existing build process.

@ethomson (Member Author)

@ioquatix Rugged is tied to an exact version of libgit2, so it won't be updated automatically. I will send a PR to Rugged shortly.

@ioquatix

@ethomson That's awesome, thanks. Then, I'll report back. It will be really interesting to see what the impact is.

@ioquatix commented Jul 31, 2017

Okay here is my report.

I have two tests: https://github.com/ioquatix/relaxo/blob/9db8348fdeec8b5e3572bc543ab02af1bf0055b2/spec/relaxo/performance_spec.rb#L57-L77

One test creates lots of objects for each commit, and the other creates lots of separate commits with one object.

Without fsync, on an oldish i7, running on a Samsung 840 PRO:

Relaxo Performance
Warming up --------------------------------------
              single   372.000  i/100ms
Calculating -------------------------------------
              single     25.267k (± 3.8%) i/s -    504.432k in  20.007610s
  single transaction should be fast
Warming up --------------------------------------
            multiple   373.000  i/100ms
Calculating -------------------------------------
            multiple      3.203k (± 4.8%) i/s -     64.156k in  20.091674s
  multiple transactions should be fast

With fsync enabled:

Relaxo Performance
Warming up --------------------------------------
              single    12.000  i/100ms
Calculating -------------------------------------
              single      1.432k (±18.6%) i/s -     26.712k in  20.018961s
  single transaction should be fast
Warming up --------------------------------------
            multiple     4.000  i/100ms
Calculating -------------------------------------
            multiple     40.816  (± 4.9%) i/s -    816.000  in  20.087377s
  multiple transactions should be fast

@ioquatix

My conclusion is that for real-world usage, fsync should make no appreciable difference for general day-to-day git use, but for high-performance services it might be an issue.

@ethomson ethomson deleted the ethomson/fsync branch January 9, 2019 10:18