git history can get borked #39
Our git-resource for cf-deployment is having a few problems.
First, it seems to fetch the wrong branch. The desired branch is
Second, even without the inclusion of this WIP branch, the remainder of the history is missing the most recent commits on
As you can see, ref
We're not sure what's causing this, or how to find out the root cause. (We were trying to delete a bunch of data from the
We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.
The current status is as follows:
This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.
I hit the same problem tonight. I'm leaving it in its broken state so I can gather information for you. The repository in question is open source, and I can provide any information needed to help diagnose this.
I'm running Concourse 1.6. I upgraded to this from 1.1 earlier today.
From the looks of it, the problem has to do with the latest commit on the target branch being a merge commit.
Let's say I have two branches in this example:
Concourse is set up to define a resource for the
It's almost as if the
So it doesn't appear to be some major glitch in the resource scripts.
I dug into the build's history and saw that this began happening yesterday. We have made no commits to the tree in a few months, so it wasn't that. However, I have been working on getting our pipeline wired up, and some of the new jobs referenced the same resource. It's a bit hard right now for me to tell when the first incorrect revision was grabbed, and of course I have no idea why it was the wrong one anyway.
I did look in the reflogs for the affected builds. It appears that the initial clone was done, and then the commit in question was checked out, so it's definitely not starting with that commit. Obviously, there's not a lot of information to work with here, as the reflog doesn't give the commands or any sort of context for this.
A few questions:
3:36 AM... Great time for debugging. I realized that, of course, the cache data is in the database.
So I have two entries that reference this SHA. One (ID 388, in my case) contains all the information on the commit (in
I checked, and the one with the commit information is the resource I expected (
The second one with the null metadata references my brand new resource meant to temporarily work around this problem:
So I looked to see what cached info I had for that resource.
That seems odd to me. It's an older entry, with a lower check order (I am making assumptions about what this means, and I'm probably wrong here), enabled, and no metadata. Now, I just created this resource earlier to test things. I literally added the new entry by:
I saw (and continue to see) the
I checked the tables for the
Is there any chance this is in part due to some migration that may have happened when upgrading to the latest release of Concourse? Should I just wipe a bunch of this state and pretend it never happened? I don't know the changes made to the product at any low level to say, so hopefully some of this information is useful to you.
Thanks for looking into this, @chipx86! I'll reopen the issue to keep tabs on it.
A few clarifications that may help:
I think it's unlikely for this to be related to migrations; if it's picking up commits that it shouldn't see at all, that'd be super weird and likely a bug in the resource. If the commits are in the wrong order, then maybe, but that still seems unlikely.
Thanks for the info! Very helpful. I'm brand new to Concourse, so still feeling my way through this.
The new resource definitely has the bad SHA, and shouldn't have. I'm going to try again tonight to make it happen with a third resource, and do a better job of checking state after each operation, so I can help narrow this down.
Okay, I ran a test here, and need to get some additional information.
In my test, I created one more copy of the resource (
I repeatedly queried
I left everything alone, and then continually queried again. A minute later, a new entry appeared with the wrong SHA:
Somehow, an entry with the wrong SHA is appearing. I can reproduce this any number of times. Every new resource ends up with the bad entry, almost exactly one minute apart.
So is this running
I wasn't able to reproduce getting the wrong SHA by hand in a clone, but am still messing around with it. However, I have a successful, reliable repro case, so there's that!
I think I see what's happening.
Which results in:
And that's where the bad SHA is coming from. How that ends up sometimes becoming the primary SHA used for pulling down the repository, I don't know, but at least now I know where that's coming from.
The computed log range is trying to find all ancestors of the given commit, including the commit itself. That is, "Give me all the log entries from the last ref to now, including that ref (using
I think ideally, this should be returning all the merge commits, but not the commits from the other branches being merged in, correct? If so, this can use
Here's how a tree may look without
There are lots of commits on the wrong branches that can end up in the resulting list of refs. But with
Now we only get what belongs to this branch.
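To illustrate the difference on a toy history, here is a minimal Python sketch. The commit names, the `PARENTS` table, and both helper functions are made up for illustration; this is not the resource's actual code. It assumes that "only what belongs to this branch" means following just the first parent of each commit, which is one way to get the behaviour described above.

```python
from typing import Dict, List, Set

# Hypothetical toy history: each commit maps to its parents,
# with the first parent (the branch being merged into) listed first.
PARENTS: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "W1": ["A"],       # work on a side branch
    "W2": ["W1"],
    "M": ["B", "W2"],  # merge of the side branch into the main line
}


def reachable(tip: str) -> Set[str]:
    """Return every commit reachable from tip through any parent."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(PARENTS[commit])
    return seen


def new_commits_all_parents(last: str, tip: str) -> Set[str]:
    """Commits reachable from tip but not from last, following every parent."""
    return reachable(tip) - reachable(last)


def new_commits_first_parent(last: str, tip: str) -> List[str]:
    """Commits from tip back to (not including) last, first parents only."""
    found: List[str] = []
    commit = tip
    while commit is not None and commit != last:
        found.append(commit)
        parents = PARENTS[commit]
        commit = parents[0] if parents else None
    return found


print(sorted(new_commits_all_parents("B", "M")))  # ['M', 'W1', 'W2']
print(new_commits_first_parent("B", "M"))         # ['M']
```

With the last known ref at `B` and the branch tip at `M`, following every parent also reports the side-branch commits `W1` and `W2`, while the first-parent walk reports only the merge commit `M`.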
This nicely solves the problem being hit in this bug, as the log call above becomes:
Just to follow up (sorry for the flooded inbox here!) I applied the changes to my
Gave it a few minutes for the checks to occur, and found that the bad SHA never ended up in the database.
Of course, my original resource is still stuck on the bad ref, but I suspect as soon as I push an update (or wipe the
@chipx86 Cheers for the sleuthing! I don't mind the spam at all.
I'm curious about one thing though. It should have at least had the correct commit saved as the latest one, given that it returned them in chronological order. Can you show the resource page listing the versions in order? It should have
So I unfortunately screwed up in an attempt to debug something and nuked the entries from
All the dummy resources I've created have listed the refs in the correct order. They just had the bad entry in there, which is what set me toward trying to diagnose that.
It is entirely possible I am just really dumb.
As I was working to restore the entries for the database (which I've done), I noticed that the correct ref was set as disabled. That obviously would do it... I don't remember ever turning this off, and I was pretty sure I checked that before, but the database says otherwise. Sure enough, disabling the good ref reproduces the problem, and re-enabling it fixes the problem. Which is as designed.
(I don't suppose there's some automatic way that this would have been turned off without it being my fault? ;)
Anyway, I think the git change above is still relevant, as my version history is littered with commits from wrong branches, and that can lead to confusion and potentially other problems. Disabling the one ref shouldn't ever result in a different branch being built (and if it hadn't, I probably wouldn't have gone down this rabbit hole).
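A rough sketch of how the two things interact, in Python. The `Version` record and `latest_enabled` function are hypothetical; the field names are borrowed loosely from the columns mentioned earlier (check order, enabled), and this is an assumption about the selection logic rather than Concourse's actual schema or query.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    sha: str          # hypothetical label standing in for a commit SHA
    check_order: int  # higher means discovered later
    enabled: bool


def latest_enabled(versions: List[Version]) -> Optional[Version]:
    """Pick the enabled version with the highest check order, if any."""
    candidates = [v for v in versions if v.enabled]
    return max(candidates, key=lambda v: v.check_order, default=None)


# The version list contains a commit from another branch alongside the
# expected one, as happens when side-branch commits are recorded.
versions = [
    Version(sha="side-branch-commit", check_order=1, enabled=True),
    Version(sha="expected-commit", check_order=2, enabled=True),
]

print(latest_enabled(versions).sha)  # expected-commit

# Disabling the expected version makes the side-branch commit the newest
# enabled one, so that is what gets picked.
versions[1].enabled = False
print(latest_enabled(versions).sha)  # side-branch-commit
```

If the version list only ever held commits belonging to the monitored branch, disabling one version would fall back to an earlier commit on the same branch instead.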
That may also still fix the original problem filed by @dsabeti. If you have a complex series of interwoven merge commits, it may walk the graph in such a way that commits appear in a different order from what you'd expect, which I think is what's happening above (but I don't have enough of that tree to be able to tell).
@chipx86 Resource versions are only ever manually enabled or disabled. :) But something must have gone weird for you to have ended up here, anyway. I doubt you randomly went and disabled a version. (I guess it could have been a mis-click.)
I'm still curious as to how these merges would show up on GitHub's UI. A commit being merged from branch B to A will/should make it show up in the history for
I'm thinking from a CI perspective, since it's monitoring one branch, it's inaccurate to track individual refs from other branches except for the commit merging them in, as there may have been conflicts. If it's a fast-forward merge things shouldn't really matter (there's no real way to tell anyway).
I probably did mis-click. It was late at night and who knows what happened. (That or gremlins. Probably gremlins, now that I think about it.)
A merge commit is a valid commit like any other, and an important part of the history of a branch that we'd want to include. It differs from any other commit just in that it has more than one parent, but it still represents all those changes from the branch being merged in, and should be able to be tested like any other.
A UI displaying a flat list of commits would start at
I agree that from a CI perspective, it's inaccurate to track individual refs from other branches. That's what Concourse is doing today whenever it finds new merge commits between the last stored ref and HEAD, and what