I noticed that one of my builds failed because git timed out while cloning: http://travis-ci.org/#!/charliesome/twostroke/jobs/566176
If a git clone fails, it should retry rather than failing on the spot.
I agree with you. The problem is that if we retry straight away, the same failure might happen again and again.
We need to put the build in a queue so it is retried after 10 minutes, and fail it if the third retry also fails.
If you want to help out with this feature, which will involve changes in multiple places in the code, let us know :)
I might try and give this a crack tomorrow
I am curious if git clone exits with different codes for different failures. For example, if the repo does not exist, retrying makes no sense.
I am afraid it will never work very well but if we can separate clone failure detection from actual builders and make it possible to test it in isolation by passing in exit code and output, maybe we can try. Output streaming the way it is done today does not make that easy, though, and I am not sure I want to add special cases to the streaming code path.
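One way to make the detection testable in isolation, as suggested above, is a pure function that takes only the exit code and the captured output. This is a sketch under stated assumptions: the function name, the category names and the message fragments are all illustrative, and the fragments are not a complete or verified list of what git prints. Because git appears to use the same exit status for many different fatal errors, the sketch relies on the output text rather than on the code alone.

```python
# Illustrative fragments only; real git output varies by version and transport.
PERMANENT_HINTS = ("not found", "does not exist", "authentication failed")
TRANSIENT_HINTS = ("timed out", "could not resolve host", "connection reset")


def classify_clone_failure(exit_code, output):
    """Classify a finished `git clone` from its exit code and output.

    Returns "ok", "permanent" (retrying makes no sense),
    "transient" (worth retrying) or "unknown".
    """
    if exit_code == 0:
        return "ok"
    text = output.lower()
    # Check permanent causes first so a missing repo is never retried.
    if any(hint in text for hint in PERMANENT_HINTS):
        return "permanent"
    if any(hint in text for hint in TRANSIENT_HINTS):
        return "transient"
    return "unknown"
```

Since the function touches neither SSH nor the streaming code path, it can be unit tested with canned exit codes and output strings.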
While I haven't looked at the code, it seems like there's a single shell script that is run, with its output streamed back to the client (is this right?)
Why not break it up a little bit so the git clone is done and checked before running the rest?
@charliesome that is incorrect. Every operation in the build lifecycle is executed separately in a stateful SSH session and we stream output to our log collector as described in the Technical Overview.
So some problems are
Overall, I find this issue not worth the effort at this point. There are plenty of other things we can put our time into. Solving even the 80% case is challenging: it requires changes to 2 applications and likely introducing new message types.
The easiest solution I see is to add a new final state to all builds (something like "technical issues") and, instead of marking the build as failed, mark it as having technical issues. This way, even though there will be minor changes to all 3 apps and one new message type, we can easily make builds with technical issues not affect the build status image.
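A rough sketch of how such a state could be kept out of the status image. The `BuildState` enum and `status_image_state` function are hypothetical names, not Travis code, and the sketch assumes the image simply reflects the most recent finished build that did not hit technical issues.

```python
from enum import Enum


class BuildState(Enum):
    """Hypothetical final build states."""
    PASSED = "passed"
    FAILED = "failed"
    TECHNICAL_ISSUES = "technical issues"  # e.g. git clone timed out


def status_image_state(finished_builds):
    """Return the state the status image should show.

    `finished_builds` is a list of BuildState values, newest first.
    Builds with technical issues are skipped, so they never turn the
    image red. Returns None when no other finished build exists.
    """
    for state in finished_builds:
        if state is not BuildState.TECHNICAL_ISSUES:
            return state
    return None
```

With this rule, a clone timeout on the latest build leaves the image showing the result of the previous real build.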
Retries need to be designed first.
@michaelklishin That seems like a good solution. Most projects move fast enough that retrying a build isn't really worth it, but it's annoying that it shows up as a broken build if something like this happens.
What about allowing a user to manually have Travis retry the pull after some hard time limit (e.g. don't let the user re-pull the repository to Travis within 30 seconds)?
@sigmavirus24 you can trigger new builds using the Test Hook button on GitHub (for master).
Plus, VM snapshotting and rollbacks complicate what @sigmavirus24 suggests a great deal. So we are back to square one with retrying builds and how to detect when not to retry it. This will take some time to figure out and won't be trivial to implement.
@michaelklishin The problem for me is that I'm not testing on master; I'm testing on a development branch. Thanks for the reply though.
@sigmavirus24 I am positive about extending our API to allow retries. We are partially moving in this direction with pre-tested pull requests.
Is there any progress on this?
Related to #851
@joshk I believe this is implemented in travis-worker:sf-compile-sh?