Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ./mach filter-intermittents for catching intermittents on CI #14331

Merged
merged 4 commits into from Nov 29, 2016

Conversation

@Manishearth
Copy link
Member

Manishearth commented Nov 23, 2016

cc @metajack

The plan here is to run this on the error summary of the logs after each WPT/CSS run, as ./mach filter-intermittents wpt-errorsummary.log --output filtered-errorsummary.log --auth /path/to/authfile

The filtered-errorsummary.log file will be a buildbot artifact for this run, and can be used for updating test results.

We change WPT/CSS runs to not cause test failures on Buildbot; instead; the test failure will be caused by this job.

We should at some point add a separate highfive task which ccs the appropriate bugs and tracks consistent failures. (We can change mach filter-intermittents to output a second file of intermittents matched to bug numbers to make this easy)

A small issue with this is that this simple task might mask failures of the WPT harness itself. We have a separate test that tests if the test harness works though. (This test will fail if the log files it requires don't exist, perhaps that's enough to solve the problem)

r? @larsbergstrom


This change is Reviewable

@highfive
Copy link

highfive commented Nov 23, 2016

Heads up! This PR modifies the following files:

@Manishearth
Copy link
Member Author

Manishearth commented Nov 23, 2016

This needs auth because Github gives you 20 unauthenticated search API requests per minute or so, which I easily hit when testing this. Github also lets you use Oauth for the API, if you think that's easier to set up.

@wafflespeanut
Copy link
Member

wafflespeanut commented Nov 23, 2016

The filtered-errorsummary.log file will be a buildbot artifact for this run, and can be used for updating test results.

Nice!

We can change mach filter-intermittents to output a second file of intermittents matched to bug numbers

Yes, we should do that instead. Highfive can't search (at least, not until we go for Github integrations) because it lives on @jdm's token, and is already hitting the API limit.

Github also lets you use Oauth for the API,

Also, yes, it'd be nice to have a dedicated auth token for these search queries.

cc'ing @aneeshusa for buildbot ideas

@aneeshusa
Copy link
Member

aneeshusa commented Nov 25, 2016

With secret provisioning in saltfs but the steps in servo, I'd prefer to use environment variables than a file for passing authorization information.

As I mentioned in #10504 (comment), I still think it's easier to get Highfive to do all the log parsing and filtering at one time. Plus, since Highfive is continuously running, we can cache information about intermittents there, watch for issue events to update its cache when filing a new intermittent or closing/fixing one, etc. That would be much more API-friendly than searching for each failure for each build.

@Manishearth
Copy link
Member Author

Manishearth commented Nov 25, 2016

I still think it's easier to get Highfive to do all the log parsing and filtering at one time.

Yes, but then highfive has to trigger a rebuild (which means that the pain of intermittents doesn't quite go away). In this case the same build can still be accepted. We'd need to change homu otherwise.

@aneeshusa
Copy link
Member

aneeshusa commented Nov 25, 2016

Oh, I didn't think that far, that does make sense. Feel free to change etc/ci/buildbot_steps.yml in this PR too, then.

@jdm
Copy link
Member

jdm commented Nov 25, 2016

FYI, in its current incarnation highfive is not continuously running and stores no state between webhook invocations.

@larsbergstrom
Copy link
Contributor

larsbergstrom commented Nov 28, 2016

I'm OK with this, but are you not going to change buildbot_steps.yml to include running it in this PR?

@larsbergstrom
Copy link
Contributor

larsbergstrom commented Nov 28, 2016

I guess we need to:

  1. land this
  2. land changes in saltfs to add token-enhanced steps for running the changes that this will be added to
  3. deploy
  4. add steps to buildbot_steps.yml
    ?
@larsbergstrom
Copy link
Contributor

larsbergstrom commented Nov 28, 2016

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

📌 Commit 4b4421c has been approved by larsbergstrom

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

Testing commit 4b4421c with merge f188fd6...

bors-servo added a commit that referenced this pull request Nov 28, 2016
Add ./mach filter-intermittents for catching intermittents on CI

cc @metajack

The plan here is to run this on the error summary of the logs after each WPT/CSS run, as `./mach filter-intermittents wpt-errorsummary.log --output filtered-errorsummary.log --auth /path/to/authfile`

The `filtered-errorsummary.log` file will be a buildbot artifact for this run, and can be used for updating test results.

We change WPT/CSS runs to not cause test failures on Buildbot; instead; the test failure will be caused by this job.

We should at some point add a separate highfive task which ccs the appropriate bugs and tracks consistent failures. (We can change mach filter-intermittents to output a second file of intermittents matched to bug numbers to make this easy)

A small issue with this is that this simple task might mask failures of the WPT harness itself. We have a separate test that tests if the test harness works though. (This test will fail if the log files it requires don't exist, perhaps that's enough to solve the problem)

r? @larsbergstrom

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/14331)
<!-- Reviewable:end -->
@Manishearth
Copy link
Member Author

Manishearth commented Nov 28, 2016

Updated with authless buildbot steps.

The API limit is 20 requests every ~2 minutes, so this should be mostly fine. In cases where it isn't fine, this will just make us fall back to our old behavior of flunking builds on all wpt/css failures.

@larsbergstrom
Copy link
Contributor

larsbergstrom commented Nov 28, 2016

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

📌 Commit 4f021d3 has been approved by larsbergstrom

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

Testing commit 4f021d3 with merge e2bdb27...

bors-servo added a commit that referenced this pull request Nov 28, 2016
Add ./mach filter-intermittents for catching intermittents on CI

cc @metajack

The plan here is to run this on the error summary of the logs after each WPT/CSS run, as `./mach filter-intermittents wpt-errorsummary.log --output filtered-errorsummary.log --auth /path/to/authfile`

The `filtered-errorsummary.log` file will be a buildbot artifact for this run, and can be used for updating test results.

We change WPT/CSS runs to not cause test failures on Buildbot; instead; the test failure will be caused by this job.

We should at some point add a separate highfive task which ccs the appropriate bugs and tracks consistent failures. (We can change mach filter-intermittents to output a second file of intermittents matched to bug numbers to make this easy)

A small issue with this is that this simple task might mask failures of the WPT harness itself. We have a separate test that tests if the test harness works though. (This test will fail if the log files it requires don't exist, perhaps that's enough to solve the problem)

r? @larsbergstrom

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/14331)
<!-- Reviewable:end -->
@Manishearth
Copy link
Member Author

Manishearth commented Nov 28, 2016

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

@Manishearth
Copy link
Member Author

Manishearth commented Nov 28, 2016

@bors-servo r=larsbergstrom,jdm

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

📌 Commit c0c65ea has been approved by larsbergstrom,jdm

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

Testing commit c0c65ea with merge 0eb85ff...

bors-servo added a commit that referenced this pull request Nov 28, 2016
…rom,jdm

Add ./mach filter-intermittents for catching intermittents on CI

cc @metajack

The plan here is to run this on the error summary of the logs after each WPT/CSS run, as `./mach filter-intermittents wpt-errorsummary.log --output filtered-errorsummary.log --auth /path/to/authfile`

The `filtered-errorsummary.log` file will be a buildbot artifact for this run, and can be used for updating test results.

We change WPT/CSS runs to not cause test failures on Buildbot; instead; the test failure will be caused by this job.

We should at some point add a separate highfive task which ccs the appropriate bugs and tracks consistent failures. (We can change mach filter-intermittents to output a second file of intermittents matched to bug numbers to make this easy)

A small issue with this is that this simple task might mask failures of the WPT harness itself. We have a separate test that tests if the test harness works though. (This test will fail if the log files it requires don't exist, perhaps that's enough to solve the problem)

r? @larsbergstrom

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/14331)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

💔 Test failed - mac-rel-wpt1

… uploads as an artefact
@Manishearth
Copy link
Member Author

Manishearth commented Nov 28, 2016

@bors-servo r=larsbergstrom,jdm

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

📌 Commit f42ce6d has been approved by larsbergstrom,jdm

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2016

Testing commit f42ce6d with merge 4021d05...

bors-servo added a commit that referenced this pull request Nov 28, 2016
…rom,jdm

Add ./mach filter-intermittents for catching intermittents on CI

cc @metajack

The plan here is to run this on the error summary of the logs after each WPT/CSS run, as `./mach filter-intermittents wpt-errorsummary.log --output filtered-errorsummary.log --auth /path/to/authfile`

The `filtered-errorsummary.log` file will be a buildbot artifact for this run, and can be used for updating test results.

We change WPT/CSS runs to not cause test failures on Buildbot; instead; the test failure will be caused by this job.

We should at some point add a separate highfive task which ccs the appropriate bugs and tracks consistent failures. (We can change mach filter-intermittents to output a second file of intermittents matched to bug numbers to make this easy)

A small issue with this is that this simple task might mask failures of the WPT harness itself. We have a separate test that tests if the test harness works though. (This test will fail if the log files it requires don't exist, perhaps that's enough to solve the problem)

r? @larsbergstrom

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/14331)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Nov 29, 2016

@bors-servo bors-servo merged commit f42ce6d into servo:master Nov 29, 2016
2 of 3 checks passed
2 of 3 checks passed
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@Manishearth Manishearth deleted the Manishearth:filter-intermittents branch Nov 29, 2016
@Manishearth
Copy link
Member Author

Manishearth commented Nov 29, 2016

Looks like the builders have outdated certs. Doing a python GET on https://www.joshmatthews.net/intermittent-tracker/query.py?name=private gives an HTTP 403 error (http://build.servo.org/builders/linux-rel-css/builds/1186), or if using requests, requests.exceptions.SSLError: hostname 'www.joshmatthews.net' doesn't match '*.nfshost.com'

nfshost.com is NearlyFreeSpeech's certificate, which is probably the older one. It now uses Lets Encrypt, and it seems like Python on the builders doesn't know about this.

cc @larsbergstrom @edunham

@Manishearth
Copy link
Member Author

Manishearth commented Nov 29, 2016

Only on Mac I guess, since filter-intermittents worked in http://build.servo.org/builders/mac-rel-css/builds/4508

@metajack
Copy link
Contributor

metajack commented Nov 29, 2016

Can we host this new service on build.servo.org instead? That can be done in a followup of course.

@Manishearth
Copy link
Member Author

Manishearth commented Nov 29, 2016

We can and should; I planned to do this once we were sure that things were working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

8 participants
You can’t perform that action at this time.