Remove harness status aggregation and display percentages on Interop-202X scores #2858
Conversation
@KyleJu The deployment CI run seems to be having resource issues. Everything deploys successfully except the new results processor. Have you seen this problem before?
Keeping these changes in a single PR for now and opening an RFC.
```diff
@@ -187,21 +187,45 @@ func prepareSearchResponse(filters *shared.QueryFilter, testRuns []shared.TestRun
// Dedup visited file names via a map of results.
resMap := make(map[string]shared.SearchResult)
for i, s := range summaries {
```
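The dedup-by-map pattern in this hunk can be sketched in isolation. This is a hedged illustration, not wpt.fyi's actual code: `searchResult` and `dedupByTest` are hypothetical stand-ins for `shared.SearchResult` and the loop above, assuming results from multiple run summaries are merged by test file name:

```go
package main

import "fmt"

// searchResult is a stand-in for wpt.fyi's shared.SearchResult.
type searchResult struct {
	Test   string
	Passes int
	Total  int
}

// dedupByTest keeps one entry per test file name, merging counts so
// summaries that mention the same file collapse into a single result.
func dedupByTest(results []searchResult) map[string]searchResult {
	resMap := make(map[string]searchResult)
	for _, r := range results {
		if prev, ok := resMap[r.Test]; ok {
			// Already visited: fold this summary into the existing entry.
			prev.Passes += r.Passes
			prev.Total += r.Total
			resMap[r.Test] = prev
			continue
		}
		resMap[r.Test] = r
	}
	return resMap
}

func main() {
	merged := dedupByTest([]searchResult{
		{"/css/a.html", 3, 4},
		{"/css/a.html", 4, 4},
		{"/css/b.html", 1, 2},
	})
	fmt.Println(len(merged), merged["/css/a.html"].Passes) // 2 7
}
```

Keying the map on the file name means a second occurrence of `/css/a.html` updates the existing entry instead of appending a duplicate row.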
Could you also double-check what other queries could be affected? See https://github.com/web-platform-tests/wpt.fyi/tree/main/api/query#readme. I will double-check as well.
Errors from staging deployment (I will take a look):
OK, I cleaned up some resources and got CI to build. I have proposed a solution for this issue in #2867.
The description in this PR says it fixes #2825, but I don't think that's accurate. The request in #2825 is for "the numbers in the table are the actual scores used for Interop2022." AFAICT, this PR shows the % in the cells but does not take Interop 2022 into account.
If combined with filtering by label, we get something that's similar. (If you see "Failed to fetch test runs", that's a resource issue on staging instances; you might have to imagine it working.) Even so, this doesn't do a perfect job of making it clear where the biggest wins are. Starting at the top level and drilling down, the score is 0-100% at each level, so a few levels down it will still require some math to know how much the Interop 2022 score would increase by fixing all the tests in view. The original suggestion of "43.95 / 90" in #2825 would give numbers that can be compared between subdirectories, since any 10 tests would contribute equally to the score. Although one still needs to compute
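The "43.95 / 90" idea above can be illustrated with a small sketch. Everything here is hypothetical (the function name, the point values, and the assumption that every test in a focus area is worth the same number of points); it only shows why "points earned / points available" composes across subdirectories while per-level percentages do not:

```go
package main

import "fmt"

// pointsView converts a subtree's raw pass counts into
// "points earned / points available", assuming every test in the
// enclosing focus area is worth the same share of that area's points.
// Hypothetical helper, not wpt.fyi code; numbers are illustrative.
func pointsView(passes, tests, areaTests int, areaPoints float64) (earned, available float64) {
	perTest := areaPoints / float64(areaTests)
	return float64(passes) * perTest, float64(tests) * perTest
}

func main() {
	// A subdirectory with 50 of its 90 tests passing, inside a
	// focus area of 900 tests worth 100 points in total:
	earned, available := pointsView(50, 90, 900, 100)
	fmt.Printf("%.2f / %.2f points\n", earned, available) // 5.56 / 10.00 points
}
```

Because the denominator is in the same units everywhere, two sibling subdirectories can be compared directly, unlike their local 0-100% scores.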
api/query/search_test.go (outdated)
```diff
@@ -1,3 +1,4 @@
//go:build small
```
Could you remove this line and the same one added to api/query/search.go (I meant api/query/query_test.go, sorry about that), please? This syntax is only valid for Go 1.17+: https://pkg.go.dev/go/build/constraint
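For context, a file head that works on both sides of the Go 1.17 boundary pairs the two syntaxes; `gofmt` keeps the pair in sync during a migration. A minimal sketch (the package name is illustrative):

```go
//go:build small
// +build small

// The first line is the Go 1.17+ constraint syntax; the second is the
// pre-1.17 form. Either restricts this file to builds run with the
// "small" build tag, e.g. `go test -tags=small ./api/query/...`.
package query
```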
Yes! This was automatically added by my editor - I'll remove it.
Description
Addresses #2825 and #1958
See RFC #114 for detailed proposal information.
This change adds a new display for test results on wpt.fyi's Interop-2022 label view. Instead of showing the raw number of passing tests over the test total, each cell now shows a percentage. Additionally, a new scoring method no longer counts the harness status toward the subtest count, and marks a test as a failure if its harness status is not "OK" or "PASS".
Here is a staging view of these changes in action. The summaries used on the runs in this link were generated with the new aggregation and display method.
Here is another staging view displaying similar Chrome results. The FIRST run shown on the left is displaying results with the new aggregation method. This is a way to visualize the effect on the totals that will display from older vs. newer aggregations.
Changes
NOTE: This will result in an overall drop of passing percentages in these scenarios:
Screenshots
Tests with harness errors display warnings with title text next to results
Interop-2022 results are viewed with an aggregation that is more indicative of actual interop-2022 scores
Harness status will no longer count toward subtest totals
The run in the left column is aggregated using the new method, compared to the old totals displayed in the right column. This change more accurately represents the scope of test failures.
![Screen Shot 2022-06-21 at 10 45 58 AM](https://user-images.githubusercontent.com/56164590/174865969-1385d984-c6bd-4e85-ac4b-d0faded3f63c.png)