
some way for developer to force rebuild. #419

Closed

inferno-chromium opened this issue Feb 23, 2017 · 11 comments

@inferno-chromium (Collaborator)

No description provided.

@evverx (Contributor) commented Apr 4, 2020

Now that badges turn red, build failures are much more noticeable. It would be great if there were a way to trigger a new build manually, though it would probably be even better if sporadic failures like

FETCHSOURCE
BUILD
Starting Step #0
Step #0: Already have image (with digest): gcr.io/cloud-builders/git
Step #0: Cloning into 'oss-fuzz'...
Step #0: fatal: unable to access 'https://github.com/google/oss-fuzz.git/': Failed to connect to github.com port 443: Connection timed out
Finished Step #0
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/git" failed: step exited with non-zero status: 128

could be detected and rescheduled automatically. Apart from keeping badges green, that would prevent OSS-Fuzz from opening bug reports in the unlikely event of two infrastructure failures on two consecutive days.
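
A minimal sketch of how such detection might work, assuming the Cloud Build log is available as plain text; the substring patterns are taken from the log above, and is_transient_failure is a hypothetical helper, not an actual OSS-Fuzz function:

# Patterns seen in transient infra failures (assumed from the log above).
TRANSIENT_PATTERNS = (
    'Failed to connect to github.com',
    'Connection timed out',
)

def is_transient_failure(build_log):
    """Return True if the failure looks like a network blip worth retrying."""
    return any(pattern in build_log for pattern in TRANSIENT_PATTERNS)

A scheduler could then requeue the build instead of marking the project red when this returns True.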

@inferno-chromium (Collaborator, Author)

@Dor1s @oliverchang - thoughts here? I think retry logic for some obvious failures makes sense, or even retrying two or three times on the same day.

@Dor1s (Contributor) commented Apr 5, 2020

Retrying for infra failures sounds doable.

Regarding builds on demand, I'm not sure. Don't want people to abuse it for testing changes. Maybe we should leverage CIFuzz for that purpose, e.g. if the latest OSS-Fuzz build is broken but CIFuzz has just succeeded, we can trigger a re-run for that project.

But:

  1. CIFuzz doesn't build all the configs.
  2. The CIFuzz dashboard would need to trigger the builds in such a scenario.

Another idea is to enable per-project hooks based on the project repos (build on every commit pushed to master), but with some limit, e.g. not building more often than once per 6 hours (see the sketch after this comment). That would be beneficial in both cases:

  • if a project is being actively developed, we'll be doing builds more often
  • if a project is not being actively developed, we can build it even less often than the default

Blocking this on #3538
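
A minimal sketch of that throttle, assuming an in-memory record of last build times (real scheduling state would live elsewhere); should_build_on_push is a hypothetical helper:

import datetime

MIN_BUILD_INTERVAL = datetime.timedelta(hours=6)
_last_build = {}  # project name -> time of the last triggered build

def should_build_on_push(project):
    """Return True if a push-triggered build is allowed for this project."""
    now = datetime.datetime.utcnow()
    last = _last_build.get(project)
    if last is not None and now - last < MIN_BUILD_INTERVAL:
        return False  # already built within the last 6 hours; skip this push
    _last_build[project] = now
    return True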

@evverx (Contributor) commented Apr 5, 2020

> Regarding builds on demand, I'm not sure. Don't want people to abuse it for testing changes.

As long as it can't be automated easily, it should be hard to abuse a button that has to be pressed manually to trigger a build (assuming only the people listed in project.yaml are allowed to press it).

> if a project is not being actively developed, we can build it even less often than the default

I think it would make sense to keep building projects unconditionally every once in a while to allow for dependencies that can keep moving forward (or backwards :-)) regardless of how often projects using them get updated.

@oliverchang (Collaborator)

Any substantial improvement would require a rework of our build scheduling, as Max filed in #3538.

In any case, I'm not sure the complexity of retrying in a smart way is worth it. In the meantime we can just dumbly retry on every failure at most once. This should be easily doable by modifying the Jenkins build config.

@Dor1s (Contributor) commented Apr 6, 2020

> This should be easily doable by modifying the Jenkins build config.

I did some googling and didn't find any native way to do it. The two ways I found both require plugins:

  1. naginator: https://plugins.jenkins.io/naginator/
  2. pipeline: https://jenkins.io/doc/pipeline/steps/workflow-basic-steps/#retry-retry-the-body-up-to-n-times

GCB seems to have a retry API: https://cloud.google.com/cloud-build/docs/api/reference/rest/v1/projects.builds/retry

So I think we can patch

return status == 'SUCCESS'

(in wait_for_build.py) to call it in case of failure. That would also protect us from retrying very long builds, since the Jenkins job timeout wouldn't be reset after the first failure in that scenario. Let me give it a try.
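
For reference, a minimal sketch of that patch, assuming the discovery-based Google API Python client; the exact structure of wait_for_build.py may differ:

import time

from googleapiclient.discovery import build as get_api_client

def wait_for_build(project_id, build_id):
    """Poll a Cloud Build job; on failure, retry the build once via the API."""
    cloudbuild = get_api_client('cloudbuild', 'v1')
    retried = False
    while True:
        build_info = cloudbuild.projects().builds().get(
            projectId=project_id, id=build_id).execute()
        status = build_info['status']
        if status == 'SUCCESS':
            return True
        if status in ('FAILURE', 'TIMEOUT', 'INTERNAL_ERROR'):
            if retried:
                return False
            print('The build failed. Retrying the same build one more time.')
            # projects.builds.retry starts a fresh build with a new build ID.
            operation = cloudbuild.projects().builds().retry(
                projectId=project_id, id=build_id).execute()
            build_id = operation['metadata']['build']['id']
            retried = True
        time.sleep(15)

Retrying creates a new build with a new ID, which is consistent with the two-build output Dor1s shows later in this thread.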

@oliverchang (Collaborator)

Yep, sorry I should've clarified to save you some time -- this was exactly what I meant :)

Dor1s added a commit that referenced this issue Apr 7, 2020
* [infra] Add build retry logic inside wait_for_build.py (#419).

* typo

* address comments by Oliver

@Dor1s (Contributor) commented Apr 7, 2020

> Yep, sorry I should've clarified to save you some time -- this was exactly what I meant :)

Ah, no worries!

The retry logic is deployed (it went out while the OSS-Fuzz builder job was in the middle of running).

The other features discussed here will be tracked in #3538 (I've added point 4 to it).

Dor1s closed this as completed Apr 7, 2020

@evverx (Contributor) commented Apr 23, 2020

@Dor1s I'm wondering if coverage builds are covered as well. According to https://oss-fuzz-build-logs.storage.googleapis.com/log-b218ba4c-3aee-4d9b-959c-acc34acee954.txt, the last coverage build failed with

starting build "b218ba4c-3aee-4d9b-959c-acc34acee954"

FETCHSOURCE
BUILD
Starting Step #0
Step #0: Already have image (with digest): gcr.io/cloud-builders/git
Step #0: Cloning into 'oss-fuzz'...
Step #0: fatal: unable to access 'https://github.com/google/oss-fuzz.git/': The requested URL returned error: 503
Finished Step #0
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/git" failed: step exited with non-zero status: 128

and it's hard to tell whether it was restarted or not.

@Dor1s (Contributor) commented Apr 23, 2020

Yes, the retry logic covers coverage builds as well. Two consecutive builds have different IDs, so users like yourself won't see both, but admins can:

python wait_for_build.py 658f1f55-aaa2-4fd4-b7e4-cacc777c7980
2020-04-23 13:11:27.544998 QUEUED
2020-04-23 13:11:42.609716 WORKING
2020-04-23 13:15:13.395316 FAILURE
The build failed. Retrying the same build one more time.
2020-04-23 13:15:13.865580 QUEUED
2020-04-23 13:15:28.901593 WORKING
2020-04-23 13:17:29.305992 FAILURE
Build step 'Execute shell' marked build as failure
Finished: FAILURE

FWIW, there are plenty of failed coverage jobs; maybe GitHub was offline long enough to fail that many times :)

@evverx (Contributor) commented Apr 24, 2020

On the one hand, I'm kind of glad we got to the point where the OSS-Fuzz badge basically shows the status of GitHub :-) On the other hand, it might be better if the badge were less sensitive to issues like that. Would it be possible for it to turn red or yellow depending on whether an issue was opened on Monorail?
