@coqbot could report failures from GitLab CI and restart spurious ones #3
Comments
This is going to be way easier than I thought thanks to the job webhook, the build trace API endpoint and the build retry API endpoint (cf. bf62f4b and https://docs.gitlab.com/ee/api/jobs.html#retry-a-job).
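For illustration, here is a minimal sketch of the two API calls mentioned above, written in Python rather than in the bot's own code. It only builds the requests and does not send them. The base URL, the `PRIVATE-TOKEN` header and the function names are assumptions for the sake of the example; this is not what bf62f4b does.

```python
import urllib.request

# Assumption: the project is hosted on gitlab.com and uses API v4.
GITLAB_API = "https://gitlab.com/api/v4"


def trace_request(project_id: int, job_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the request that fetches a job's log (trace)."""
    url = f"{GITLAB_API}/projects/{project_id}/jobs/{job_id}/trace"
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token}, method="GET")


def retry_request(project_id: int, job_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the request that retries a job."""
    url = f"{GITLAB_API}/projects/{project_id}/jobs/{job_id}/retry"
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token}, method="POST")
```

The project and job identifiers would come from the job webhook; sending either request is then a matter of passing it to `urllib.request.urlopen`.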
Actually, something is missing from both the webhook payload and the trace: neither tells us whether the failure is due to a failing runner. Cf. https://gitlab.com/gitlab-org/gitlab-ee/issues/6408
A separate obstacle to putting this into practice: we would have to stop relying on Heroku's free dynos, because the GitLab job webhook generates far too many requests for the bot to get its 7 hours of statutory sleep.
Please disable the report functionality until the "stale build problem" is fixed as detailed in coq/coq#7871 (comment)
OK, this is fixed now.
Thanks!!!
This is basically implemented now and further enhancements can be treated in separate issues.
Thanks for this great work.
When a GitLab CI pipeline completes with failures, first check if the pipeline is up-to-date with respect to the PR.
If yes, check if some of the failures are spurious:
ERROR: Job failed (system failure): Cannot connect to the Docker daemon at tcp://10.142.0.123:2376. Is the docker daemon running?
If yes, restart the corresponding jobs.
(For the spurious failures we have control over, we should fix them instead.)
Otherwise, post a message in the PR thread with the last few lines of the failing job logs and direct links to these logs.
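The steps above could be sketched roughly as follows. This is a hypothetical Python outline, not the bot's actual implementation: the `FailedJob` type, the pattern list and the `triage` function are made up for illustration, and it takes one possible reading of the description, namely that spurious jobs are restarted while the remaining failures are reported.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption: spurious failures are recognised by matching the job log.
# The only pattern grounded in this issue is the Docker daemon error.
SPURIOUS_PATTERNS = [
    re.compile(r"Job failed \(system failure\): Cannot connect to the Docker daemon"),
]


@dataclass
class FailedJob:
    job_id: int
    url: str    # direct link to the job log
    trace: str  # full job log


def is_spurious(trace: str) -> bool:
    return any(p.search(trace) for p in SPURIOUS_PATTERNS)


def tail(trace: str, n: int = 10) -> str:
    """Return the last n lines of a job log."""
    return "\n".join(trace.splitlines()[-n:])


def triage(
    pipeline_sha: str, pr_head_sha: str, failed_jobs: List[FailedJob]
) -> Tuple[List[int], Optional[str]]:
    """Return (job ids to restart, message to post in the PR or None)."""
    # Step 1: ignore pipelines that are stale with respect to the PR.
    if pipeline_sha != pr_head_sha:
        return [], None
    # Step 2: restart the jobs whose failure looks spurious.
    to_retry = [j.job_id for j in failed_jobs if is_spurious(j.trace)]
    # Step 3: report the remaining failures with a log excerpt and a link.
    genuine = [j for j in failed_jobs if not is_spurious(j.trace)]
    if not genuine:
        return to_retry, None
    parts = []
    for j in genuine:
        excerpt = "\n".join("    " + line for line in tail(j.trace).splitlines())
        parts.append(f"Job {j.job_id} failed ({j.url}):\n{excerpt}")
    return to_retry, "\n\n".join(parts)
```

Keeping the decision logic free of network calls makes it easy to test against saved job logs before wiring it to the webhook.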