-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refdb_fs: fix loose/packed refs lookup racing with repacks #4984
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When repacking references, git.git will first update the packed refs and only afterwards delete any existing loose references that have now been moved to the new packed refs file. Due to this, there is a potential for racing if one first reads the packfile (which has not been updated yet) and only then trying to read the loose reference (which has just been deleted). In this case, one will incorrectly fail to lookup the reference and it will be reported as missing. Naturally, this is exactly what we've been doing in `refdb_fs_backend__exists`. Fix the race by reversing the lookup: we will now first check if the loose reference exists and only afterwards refresh the packed file.
Refactor the error handling in `refdb_fs_backend__iterator` to always return the correct error code returned by the failing function.
When creating a new iterator, we eagerly load loose refs but only lazily create a copy of packed refs. The lazy load only happens as soon as we have iterated over all loose refs, opening up a potentially wide window for races. This may lead to an inconsistent view e.g. when the caller decides to reload packed references somewhen between iterating the loose refs, which is unexpected. Fix the issue by eagerly copying the sorted cache. Note that right now, we are heavily dependent on ordering here: we first need to reload packed refs, then we have to load loose refs and only as a last step are we allowed to copy the cache. This is because loading loose refs has the side-effect of setting the `PACKED_SHADOWED` flag in the packed refs cache, which we require to avoid outputting packed refs that already exist as loose refs.
Right now, loading loose refs has the side-effect of setting the `PACKREF_SHADOWED` flag for references that exist both in the loose and the packed refs. Because of this, we are force do first look up packed refs and only afterwards loading the packed refs. This is susceptible to a race, though, when refs are being repacked: when first loading the packed cache, then it may not yet have the migrated loose ref. But when now trying to look up the loose reference afterwards, then it may already have been migrated. Thus, we would fail to find this reference in this scenario. Remove this ordering dependency to allow fixing the above race. Instead of setting the flag when loading loose refs, we will now instead set it lazily when iterating over the loose refs. This even has the added benefit of not requiring us to lock the packed refs cache, as we already have an owned copy of it.
Right now, we first load the packed refs cache and only afterwards load the loose references. This is susceptible to a race when the loose ref is being migrated to a packed cache by e.g. git-pack-refs(1): libgit2 git-pack-refs 1. We load the packed ref, which does not yet have the migrated reference. 2. git-pack-refs updates the packed ref file to have the migrated ref. 3. git-pack-refs deletes the old loose ref. 4. We look up the loose ref. So we now do not find the reference at all and will never iterate over it. Fix the issue by reversing the order: instead of first loading the packed refs, we will now look up the loose reference first. If it has already been deleted, then it must already be present in the packed-refs by definition, as git.git will only delete the reference after updating the packed refs file.
07e68ed
to
94743da
Compare
Force-pushed to provide proper commit messages and remove unrelated commits. |
Cool, thanks for doing this, and thanks for the report @peff. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a different approach to #4981 that should correctly fix the issues described by @peff.
There had been an ordering dependency between loading loose and packed refs in that loading loose refs set the PACKREF_SHADOWED flag in already loaded packed refs. Upon reloading packed refs, this flag gets reset. Due to that, we always had to load loose and packed refs in the wrong order. This PR fixes this dependency and thus allows us to first load the loose refs and only afterwards load the packed refs, effectively mitigating the race.
It also fixes the same race condition in the
exists
callback that has been pointed out by @peff, too.