-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix revwalk limiting regression #4809
Conversation
When porting, we overlooked that the difference between git's and our's time representation and copied their way of getting the max value. Unfortunately git was using unsigned integers, so `~0ll` does correspond to their max value, whereas for us it corresponds to `-1`. This means that we always consider the last date to be smaller than the current commit's and always think commits are interesting. Change the initial value to the macro that gives us the maximum value on each platform so we can accurately consider commits interesting or not.
…tamp This is not a big deal, but it does make us match git more closely by checking only the first. The lists are sorted already, so there should be no functional difference other than removing a possible check from every iteration in the loop.
I've just ran and timed the reproducers from #4428 and a trimmed version of the reproducer in #4740 (single iteration) on the linux kernel repo. The times reported for libgit2 built from master (bc34cb6):
For the current tip (12a1790)
Honestly, I thought I did something wrong when I performed the second measurement for the #4428 reproducer, but apparently this particular regression is not resolved. However, the performance is much better for #4740. Edit: As expected, the performance of the #4428 reproducer also greatly increases if the commits are not sorted by time:
|
#4428 doesn't actually seem like a regression, but a fix and a misunderstanding of what different flags ought to do. |
Ouch! Thanks for the fix, @carlosmn. |
@@ -405,7 +405,7 @@ static int still_interesting(git_commit_list *list, int64_t time, int slop) | |||
static int limit_list(git_commit_list **out, git_revwalk *walk, git_commit_list *commits) | |||
{ | |||
int error, slop = SLOP; | |||
int64_t time = ~0ll; | |||
int64_t time = INT64_MAX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised none of our tests catched this :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a performance regression, as all this does is make us think that we never went back far enough. If we had perf tests, maybe we would have caugh tit.
On Sat, Sep 22, 2018 at 12:32:52PM -0700, Carlos Martín Nieto wrote:
carlosmn commented on this pull request.
> @@ -405,7 +405,7 @@ static int still_interesting(git_commit_list *list, int64_t time, int slop)
static int limit_list(git_commit_list **out, git_revwalk *walk, git_commit_list *commits)
{
int error, slop = SLOP;
- int64_t time = ~0ll;
+ int64_t time = INT64_MAX;
It's just a performance regression, as all this does is make us think that we never went back far enough. If we had perf tests, maybe we would have caugh tit.
Yeah, I realized that at a later point, too. But thanks for
clarifying!
|
When porting, we overlooked that the difference between git's and our's time
representation and copied their way of getting the max value.
Unfortunately git was using unsigned integers, so
~0ll
does correspond totheir max value, whereas for us it corresponds to
-1
. This means that wealways consider the last date to be smaller than the current commit's and always
think commits are interesting.
Change the initial value to the macro that gives us the maximum value on each
platform so we can accurately consider commits interesting or not.
The second commit is mostly just to reduce the drift from git, the actual perf difference should be negligible.
This fixes #4740 and we should backport it as it fixes a regression.