Market research: would a mapping from APIs to tests be valuable? #77

Closed
foolip opened this issue Mar 8, 2018 · 11 comments

@foolip
Member

foolip commented Mar 8, 2018

Hi all, this seems like a fair place to catch spec editors who are keen on testing things.

@mdittmer and I are brainstorming ideas, and we have one that seems promising. Roughly:

  • Patch Blink's bindings generator to log which Web IDL methods/attributes/etc. are being hit at runtime (kinda like use counters for everything, but for local use)
  • Run each test in WPT against that build individually and save the results
  • Parse the Web IDL of all specs (soon here) to figure out which APIs belong to which specs
  • Combine to get one of:
    • Lookup of single API to which tests hit that API
    • List of all APIs in a spec and how many times they are hit when running all tests

We think that the former is neat but not important, and that the latter could be useful, just by eyeballing it, as a guide to what is totally untested and possibly undertested.
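
To make the "combine" step a bit more concrete, here is a rough Python sketch. Both input shapes (per-test sets of hit Web IDL identifiers, and per-spec sets of identifiers) are made up for illustration; nothing produces them yet.

```python
from collections import defaultdict

# Hypothetical inputs, for illustration only:
#   api_hits_by_test: {test_path: set of Web IDL identifiers hit while running that test}
#   apis_by_spec:     {spec_shortname: set of Web IDL identifiers defined by that spec}

def tests_for_api(api_hits_by_test):
    """Output 1: lookup from a single API to the tests that hit it."""
    tests_by_api = defaultdict(set)
    for test, apis in api_hits_by_test.items():
        for api in apis:
            tests_by_api[api].add(test)
    return tests_by_api

def coverage_for_spec(api_hits_by_test, apis_by_spec):
    """Output 2: per spec, how many tests hit each of its APIs (a proxy for raw hit
    counts; zero means the API was never exercised)."""
    hit_counts = defaultdict(int)
    for apis in api_hits_by_test.values():
        for api in apis:
            hit_counts[api] += 1
    return {
        spec: {api: hit_counts.get(api, 0) for api in apis}
        for spec, apis in apis_by_spec.items()
    }
```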

Would any of this be immediately useful for spec maintainers? For anyone else?

We could use the same stuff to figure out what tests are relevant to a page on MDN, but how to leverage that isn't obvious.

@tabatkins FYI that we're thinking about this, complementary to manual fine-grained test linkage using Bikeshed.

@annevk
Member

annevk commented Mar 8, 2018

Lookup of single API to which tests hit that API

This is actually quite useful, as whenever we want to change something we need to have this list. I typically resort to grep.

@domenic
Member

domenic commented Mar 8, 2018

The former is quite nice for knowing, when it's time to update the spec for something, which tests you should potentially update. I'm sure I've made spec changes that have broken existing tests I didn't know about.

Although, I guess this will be pretty useless for core DOM APIs that all tests use. E.g. tons of iframe APIs are going to be hit for all multi-global tests, even if they're testing multi-globals in the XHR API or something.

@foolip
Member Author

foolip commented Mar 8, 2018

Yep, some APIs will show up everywhere, but then those are probably not the kinds of APIs we want to change the most; document.body is going to stay as it is :)

For the "Lookup of single API to which tests hit that API" case, would this be for APIs with very grep-unfriendly names? I would definitely reach first for grep in a case like node.querySelectorAll(), but maybe useful for https://dom.spec.whatwg.org/#dom-elementcreationoptions-is and such?

How would the information have to be made available for it to be less work than using grep? Command line? Web property? Inline in the spec? A JSON file?

@rtoy

rtoy commented Mar 9, 2018

I like having counters for everything. But what do you mean by "for local use"? The counters don't work in the wild? I like the automatic part, though, because I've learned that some of the counters we added for WebAudio don't actually count what we thought they did.

It would be great for testing purposes; for existing attributes in WebAudio, we have discovered that some weren't tested. Sometimes we forget to test the values completely, and other such things. Having anything automatic to tell us we're missing things would be totally awesome.

@foolip
Member Author

foolip commented Mar 10, 2018

@rtoy, by "for local use", I mean that builds we use to collect this data would most likely be custom builds on a machine dedicated to collecting this data, not vanilla Chromium builds. This is because we would effectively have to add "use counters" for all APIs exposed using Web IDL, which would simply not fly for performance reasons.

So, very concretely, I think this would be:

  • An off-by-default build flag that causes the bindings generator to add some form of logging for every method, attribute, operation, dictionary member, and so on, very much like [Measure] but for everything.
  • Run each test in web-platform-tests individually with that build, to associate a single test with a list of code paths exercised.
  • Do something useful with the data. Whether it would be useful is the question of this issue.
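
As a rough sketch of what the second step could look like: run each test on its own against the instrumented build and save the identifiers it logged. The instrumented build and the BINDINGS_LOG environment variable are hypothetical here; only `./wpt run` and its `--binary` flag are real web-platform-tests bits.

```python
import json
import os
import subprocess

def collect(tests, browser_binary, out_path="api_hits_by_test.json"):
    """Run each test individually and record which Web IDL identifiers its run logged."""
    api_hits_by_test = {}
    for test in tests:
        log_path = "bindings_log.txt"
        if os.path.exists(log_path):
            os.remove(log_path)
        # Hypothetical hook: the instrumented build writes one identifier per line here.
        env = dict(os.environ, BINDINGS_LOG=log_path)
        subprocess.run(["./wpt", "run", "--binary", browser_binary, "chrome", test],
                       env=env, check=False)
        with open(log_path) as f:
            api_hits_by_test[test] = sorted({line.strip() for line in f if line.strip()})
    with open(out_path, "w") as f:
        json.dump(api_hits_by_test, f, indent=2)
```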

An older idea that's been floating around is to do full code coverage builds in the same way, which would be a superset of this: much more data, but an even bigger undertaking. (The idea being that we could go look at code we think is well covered to see if there are actually gaping holes; i.e., the value is in going from 30% to 90% coverage, not from 95% to 99%, which I don't think is a good return for the time spent.)

@mdittmer

mdittmer commented Mar 11, 2018 via email

@foolip
Member Author

foolip commented Mar 13, 2018

I think there should be ways to avoid restarting the browser by logging the test URL at the right points, but it'd take some work to convince ourselves that there's no chance of raciness, i.e., that all of the logging comes from a single thread and can't arrive out of order, which is generally not the case.

@mdittmer

mdittmer commented Mar 13, 2018 via email

@foolip
Member Author

foolip commented Mar 14, 2018

@mdittmer, I'm not sure exactly; the code paths to log would be spread across renderer processes because of OOPIF, and the current URL is ultimately something that changes in the browser process first and then reaches a new or reused renderer process, so having confidence in the non-raciness of any logging setup is hard. The most straightforward way is probably to start by running the tests one by one, and to validate any optimization by comparing to those results.
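
A tiny sketch of that validation step: diff the one-by-one baseline against whatever an optimized run produces. The file names and the {test: [api, ...]} JSON shape are assumptions.

```python
import json

def compare(baseline_path="one_by_one.json", optimized_path="batched.json"):
    """Report tests where the optimized run's API list differs from the baseline."""
    with open(baseline_path) as f:
        baseline = json.load(f)
    with open(optimized_path) as f:
        optimized = json.load(f)
    for test in sorted(set(baseline) | set(optimized)):
        base, opt = set(baseline.get(test, [])), set(optimized.get(test, []))
        if base != opt:
            print(test)
            print("  missing in optimized run:", sorted(base - opt))
            print("  extra in optimized run:  ", sorted(opt - base))
```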

@foolip
Member Author

foolip commented Mar 14, 2018

@plehegar and @tabatkins both told me that this might be most immediately useful to bootstrap spec→test linking, and that makes sense. It seems like it'd also easily reveal bits that are totally untested.
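
For the "totally untested" part, the per-spec coverage data sketched earlier would make that nearly a one-liner; a small illustration using the same assumed {spec: {api: hit_count}} shape:

```python
def untested_apis(spec_coverage):
    """Return, per spec, the Web IDL identifiers that no test hit at all."""
    return {
        spec: sorted(api for api, count in counts.items() if count == 0)
        for spec, counts in spec_coverage.items()
    }
```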

@domfarolino
Member

In general, I really like this idea; thanks for posting. It sounds really useful!

@annevk annevk closed this as completed Sep 26, 2019