Postmortem: wpt.fyi status checks not showing up on WPT repo #1661
Comments
Action item 1: filed a bug against Chrome Password Manager: https://crbug.com/1027556
Done via Stackdriver. Tentatively set the threshold to 0.02 5XX responses per second (~= 1 error per minute).
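For anyone checking the numbers on that threshold, here is a rough sketch of the arithmetic. It does not use Stackdriver's API; the function names and the sampling window are assumptions made up for illustration. Note that 0.02 per second works out to 1.2 errors per minute, so a steady 1 error per minute sits just under the threshold.

```python
# Hypothetical sketch of the alert threshold arithmetic; this is not
# Stackdriver's API, only the rate comparison behind the setting.
THRESHOLD_PER_SECOND = 0.02  # value quoted in the comment above


def per_minute(rate_per_second: float) -> float:
    """Convert a per-second rate to a per-minute rate."""
    return rate_per_second * 60


def exceeds_threshold(error_count: int, window_seconds: float,
                      threshold_per_second: float = THRESHOLD_PER_SECOND) -> bool:
    """Return True if the observed 5XX rate in the window is above the threshold."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return error_count / window_seconds > threshold_per_second
```

With a 60 second window (an assumed value), 2 errors would trip the alert and 1 would not.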
GitHub does not have a public issue tracker. Sent email to support@github.com with details of the problem.
I stumbled upon https://github.community/ just now. Not sure if this is their UserVoice or issue tracker or both.
From that page:
(The contact form is equivalent to emailing support@github.com afaik)
Follow-up: turns out Stackdriver cannot send emails to groups, so we have to put individual emails there.
Follow-up: I received an email from support saying they will pass it on to the engineering team.
@stephenmcgruer I think this warrants an action item as it would catch problems wherever in the chain they occur, and we could also monitor Taskcluster and Azure Pipelines on PRs with this.
Other than that, this postmortem is great, LGTM!
Added action item to write a design doc for monitoring 'expected' checks on WPT PRs.
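To make the 'expected' checks idea a bit more concrete, here is a minimal sketch of the comparison such a monitor might do. The check names and the function are hypothetical; the design doc may well settle on something different, and fetching the reported checks from GitHub is left out entirely.

```python
# Hypothetical sketch: compare the checks we expect on a WPT PR against
# the checks that actually reported. All names are illustrative assumptions.
from typing import Iterable, List

EXPECTED_CHECKS = ["wpt.fyi", "Taskcluster", "Azure Pipelines"]  # assumed names


def missing_checks(expected: Iterable[str], reported: Iterable[str]) -> List[str]:
    """Return the expected check names that never reported, in expected order."""
    reported_set = set(reported)
    return [name for name in expected if name not in reported_set]
```

A monitor along these lines could alert whenever `missing_checks` stays non-empty for a PR past some grace period, which would have flagged this incident regardless of where in the chain the failure occurred.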
In terms of the repair task:
At this point I don't believe we're going to do this (most of the affected PRs have landed anyway), so marking it as such.
We have two outstanding AIs here:
The former is on our 2020 OKRs and related to the productionization effort being led by @LukeZielinski, so I think we can expect that to happen this year. The latter is still on me; I'll try to get that done soon so we can close this out :)
Re-assigning to folks with action items
Owner: @stephenmcgruer
Postmortem Created: 2019-11-22 09:57 EST
Status: Published
Issue: #1660
Impact: Approximately 20.5 hours of PRs to WPT did not have status checks run on them to report the change in pass/fail rate of affected tests. 34 PRs were merged during this time. Any affected PR would not have Safari or Edge results uploaded to wpt.fyi.
Root Cause: The wpt.fyi app's secret was changed inadvertently. This is believed to have been caused by Chrome's password manager auto-filling the secret field when the app name was changed.
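For readers unfamiliar with why a changed secret breaks the webhook, here is a minimal sketch of HMAC payload signing as GitHub webhooks use it. This is not the wpt.fyi handler; the function names are made up, and the "sha256=" header format is an assumption about which signature header is being checked.

```python
# Hypothetical sketch of webhook signature checking; not the wpt.fyi code.
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Sign the raw request body with the shared secret (sender side)."""
    digest = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Recompute the signature with our copy of the secret and compare (receiver side)."""
    expected = sign_payload(secret, payload)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature_header)
```

If the secret stored on the GitHub side changes while the server keeps the old one, every delivery is signed with a key the server does not have, so verification fails for all requests until the two are brought back in sync.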
Timeline
/api/webhook/check hits a 500 error for "payload signature check failed". This goes unnoticed.
/api/webhook/check is logged on the server.
Lessons Learnt
Things that went well
/api/webhook/check was set in GitHub.
Things that went poorly
/api/webhook/check is undocumented, which meant the engineer debugging was not sure what even calls that endpoint.
Where we got lucky
Action Items
/api/webhook/check (type=mitigate, owner=@stephenmcgruer)