Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Action fails in pull on new data branch #33

Closed
flaix opened this issue Nov 15, 2021 · 10 comments
Closed

Action fails in pull on new data branch #33

flaix opened this issue Nov 15, 2021 · 10 comments

Comments

@flaix
Copy link
Contributor

flaix commented Nov 15, 2021

The action does not require the data branch to exist up front. This is good. If the data branch does not exist, it will be created.

+ git checkout gitblit
error: pathspec 'gitblit' did not match any file(s) known to git
+ git checkout -b gitblit

But this breaks later in the script when a default pull is executed. The pull will have no tracked remote ref to operate on, since the branch is new. This results in the following error.

+ git pull
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=origin/<branch> gitblit

I believe the PR #30 is there to address this problem, from the looks of it. I haven't tested it, though.

@jgehrcke
Copy link
Owner

Thank you @flaix and @fboemer!

@jgehrcke
Copy link
Owner

Patched via #30.

@flaix
Copy link
Contributor Author

flaix commented Nov 15, 2021

I am really sorry to disappoint, but I just tested the commit in the PR, and it doesn't solve this problem. Sorry, I should have actually tested it, instead of just assuming that the PR fixed this error. 😞

With the change, the pull still doesn't work.

fatal: couldn't find remote ref gitblit

@fboemer , was that actually the problem you wanted to solve and did your change fix it for you?

I am not sure that the problem can be solved with a pull because it effectively will try to merge with a non-existent ref (the remote branch). At least I can't think of any from the top of my head, I would have to research this.

So I guess (ugh, guesswork again) if you want to keep the feature that a data branch can be created by a run, a different approach would be needed.

@fboemer
Copy link
Contributor

fboemer commented Nov 15, 2021

I am really sorry to disappoint, but I just tested the commit in the PR, and it doesn't solve this problem. Sorry, I should have actually tested it, instead of just assuming that the PR fixed this error. 😞

With the change, the pull still doesn't work.

fatal: couldn't find remote ref gitblit

@fboemer , was that actually the problem you wanted to solve and did your change fix it for you?

I am not sure that the problem can be solved with a pull because it effectively will try to merge with a non-existent ref (the remote branch). At least I can't think of any from the top of my head, I would have to research this.

So I guess (ugh, guesswork again) if you want to keep the feature that a data branch can be created by a run, a different approach would be needed.

Thanks for the report; I don't quite recall the error I was seeing (the related Actions artifacts have since expired) and I'm unfortunately out of Actions minutes for this month, so I can't reproduce for another week or so. @ajagann, do you have any insights here?

@flaix
Copy link
Contributor Author

flaix commented Nov 15, 2021

@jgehrcke , you may want to reopen this issue. Sorry, I was too quick in assuming this problem was already fixed.

The script does a pull and then push in order to handle collisions with jobs running in parallel. What about reversing that (to how a lazy developer would work), pushing, and if that fails, pulling to merge in the other jobs changes? Would that fix the parallel use case as well? It would handle this issue as the push would go through for an new branch not existing on the remote yet.

flaix added a commit to flaix/github-repo-stats that referenced this issue Nov 15, 2021
In order to handle multiple jobs running in parallel, and thus
pushing changes in parallel, a loop running pull and then push
tries to get a job's changes pushed to the data repository.

The problem here is that if the data branch was created locally
as a new branch, then the pull before the push will fail and the
whole script will exit. See issue jgehrcke#33.

This is a very simple attempt to solve this issue by mimicing
a developer's workflow, who tried to push and got rejected because
of the remote having changes from another push. The lazy developer
woud then pull and try to push again.

This does look like overkill, to be honest, and is only a quick
and dirty attempt at a solution. It is not very robust because
it relies on the error output of `git push` which might change.
Unfortunately, git commands do not have distinctive error codes
that could be used instead of guessing from the error message.
@jgehrcke
Copy link
Owner

The script does a pull and then push in order to handle collisions with jobs running in parallel. What about reversing that

We could also keep the order and simply ignore git pull errors. The downside is that for programming errors we might instead of properly crashing the wrapper script end up staying in the pull-push loop.

@flaix
Copy link
Contributor Author

flaix commented Nov 17, 2021

Well, you defined a maximum time to stay in the loop, so eventually it would error out at some point. The question is, what other errors could the pull run into at this point. Merge conflicts is the only one I can think of.

@jgehrcke
Copy link
Owner

The question is, what other errors could the pull run into at this point.

Not so sure. What are all the expected failure modes, and which ones do we not want to 'ignore'?

I mean, for example for the error discussed here in this issue where pull fails with There is no tracking information for the current branch -- there it was a good idea to fail/crash fast.

I'll think about this a tiny more soon. I think we're getting close. Thanks for all the valuable thinking here here!

@flaix
Copy link
Contributor Author

flaix commented Nov 17, 2021

Yes and no, I would argue that in this case discussed the crash was not helpful as it would have remedied itself with the following (first) push.

I'm not sure there is an elegant way. I didn't find a way for an "atomic" "check-for-branch-and-push" with the porcelain git commands. Right now I do a push first, then the pull and another push if the first push failed. That works, but in a scenario where the pull/push is necessary, i.e. where concurrent pushes occur, the first push is bound to fail. It works for me as I have it set up with a max parallel = 1 setting.

jgehrcke added a commit that referenced this issue Nov 19, 2021
jgehrcke added a commit that referenced this issue Nov 19, 2021
pull/push loop: ignore pull errors (fix #33)
@jgehrcke
Copy link
Owner

I would argue that in this case discussed the crash was not helpful as it would have remedied itself with the following (first) push.

Accidental treatment of special cases is something I don't want to have in code :D But yes you're right, by now it's not accidental anymore. We're ignoring pull errors now: https://github.com/jgehrcke/github-repo-stats/pull/35/files#diff-6f9d41d046756f0ddc2fcee0626bdb50100d12b88f293734eff742818e03efa2L214 -- and I have noted the special case in the corresponding code comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants