RFC 122: Remove browser specific failures graph #122

70 changes: 70 additions & 0 deletions rfcs/remove_bsf.md
@@ -0,0 +1,70 @@
# RFC 122: Remove browser specific failures from wpt.fyi

## Summary

[wpt.fyi](https://wpt.fyi) currently shows a chart of "browser
specific failures": a score based on tests that fail in only a
single browser. This RFC proposes removing that graph from wpt.fyi
entirely.

## Details

The original motivation for browser-specific-failures was to provide
browser vendors with insights into tests that might be causing interop
problems and therefore might be especially valuable to spend
engineering effort fixing. On the basis of this hypothesis, a graph
was added to wpt.fyi showing a browser-specific-failures "score" for
each browser engine, so that vendors could track their progress on
fixing these issues.

However, we have since identified a number of issues with
"browser specific failures" as a metric, for example:

* It's hard to correlate a change in the score to a change in browser
behaviour.

* The way the score is computed biases the score toward missing
features rather than interop failures in already shipping features
(since a missing feature usually causes a large number of failures).

* The metric doesn't provide any way of controlling for the user
impact of failures; browsers can get a "bad" score from a large
number of failures that in practice don't cause any observed
problems for authors.
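
The bias toward missing features can be illustrated with a minimal
sketch. It assumes a simplified score that just counts tests failing
in exactly one browser; this is not the actual wpt.fyi computation,
and the browser names, test paths and results are hypothetical.

```python
from collections import Counter


def browser_specific_failures(results):
    """Count, per browser, the tests that fail in that browser only.

    `results` maps a test name to a {browser: passed} dict.
    This is a simplified stand-in, not the real wpt.fyi score.
    """
    counts = Counter()
    for by_browser in results.values():
        failing = [b for b, passed in by_browser.items() if not passed]
        if len(failing) == 1:
            counts[failing[0]] += 1
    return counts


# Hypothetical results for three browsers "a", "b" and "c".
results = {}

# Browser "a" has not implemented a new feature: every test fails there.
for i in range(50):
    results[f"/new-feature/test-{i}.html"] = {"a": False, "b": True, "c": True}

# Browser "b" has one interop bug in a feature that all browsers ship.
results["/shipped-feature/layout.html"] = {"a": True, "b": False, "c": True}

# A test failing in two browsers is not browser-specific and is ignored.
results["/shipped-feature/other.html"] = {"a": False, "b": False, "c": True}

scores = browser_specific_failures(results)
for browser in ("a", "b", "c"):
    print(browser, scores[browser])
```

Under these assumptions the single unimplemented feature dominates
browser "a"'s count, while the interop bug in an already shipping
feature barely registers for browser "b".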

For these reasons and others, we haven't reached a critical mass of
adoption for browser specific failures as a metric to improve interop
on the web platform, and its original function has been largely
replaced by the Interop-20xx project.

Although browser specific failures isn't providing the value
initially hoped for, having it on wpt.fyi does create some work, as
it encourages people to try to understand the current scores or
changes in the score.

Removing the graph entirely seems like the simplest way to indicate
that we no longer consider this useful as a metric.

## Alternatives

* Keep the graph but move it to a less prominent page.

Although this would make the graph less obvious, it would still
imply some endorsement of browser specific failures as a useful
metric. Since Interop scores are a metric that browser vendors have
explicitly committed to, and one that solves many of the problems
with browser specific failures, it's better to clearly commit to one
public metric.

## Risks

* Browser engineers might be using browser specific failures as a way
to identify good tests to fix.

This doesn't depend on having a metric / graph of browser-specific
failures, and could be better solved by making it easier to see an
actual list of browser specific failures in a given feature. This is
already possible on wpt.fyi, but is quite complex to write as a
query. Other frontends built on wpt.fyi data, like
https://jgraham.github.io/wptdash/, provide an engineer-focused view
of this data.
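
As a rough sketch of what such a list could look like, the function
below filters in-memory results by a feature path prefix. The data
shape, browser names and test paths are assumptions made for
illustration; it does not use the wpt.fyi query syntax or API.

```python
def list_browser_specific_failures(results, browser, feature_prefix):
    """Return the tests under `feature_prefix` that fail only in `browser`.

    `results` maps a test name to a {browser: passed} dict
    (a hypothetical shape, not the wpt.fyi data format).
    """
    names = []
    for test, by_browser in sorted(results.items()):
        if not test.startswith(feature_prefix):
            continue
        failing = [b for b, passed in by_browser.items() if not passed]
        if failing == [browser]:
            names.append(test)
    return names


# Hypothetical results for three browsers "a", "b" and "c".
sample_results = {
    "/feature-x/one.html": {"a": False, "b": True, "c": True},
    "/feature-x/two.html": {"a": False, "b": False, "c": True},
    "/feature-x/three.html": {"a": True, "b": True, "c": True},
    "/feature-y/four.html": {"a": False, "b": True, "c": True},
}

failures_in_x = list_browser_specific_failures(sample_results, "a", "/feature-x/")
print(failures_in_x)
```

A list like this gives an engineer concrete tests to look at within
one feature, without needing an aggregate score.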