Market research: would a mapping from APIs to tests be valuable? #77

Closed
foolip opened this issue Mar 8, 2018 · 11 comments

@foolip
Member

foolip commented Mar 8, 2018

Hi all, this seems like a fair place to catch spec editors who are keen on testing things.

@mdittmer and I are brainstorming ideas, and we have one that seems promising. Roughly:

  • Patch Blink's bindings generator to log which Web IDL methods/attributes/etc. are being hit at runtime (kinda like use counters for everything, but for local use)
  • Run each test in WPT against that build individually and save the results
  • Parse the Web IDL of all specs (soon here) to figure out which APIs belong to which specs
  • Combine to get one of:
    • Lookup of single API to which tests hit that API
    • List of all APIs in a spec and how many times they are hit when running all tests

We think that the former is neat but not important, and that the latter could be useful, just by eyeballing it, as a guide to what is totally untested and possibly undertested.
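
To make the "combine" step a bit more concrete, here is a rough Python sketch. Both input shapes (per-test sets of hit Web IDL identifiers, and per-spec sets of identifiers) are made up for illustration; nothing produces them yet.

```python
from collections import defaultdict

# Hypothetical inputs, for illustration only:
#   api_hits_by_test: {test_path: set of Web IDL identifiers hit while running that test}
#   apis_by_spec:     {spec_shortname: set of Web IDL identifiers defined by that spec}

def tests_for_api(api_hits_by_test):
    """Output 1: lookup from a single API to the tests that hit it."""
    tests_by_api = defaultdict(set)
    for test, apis in api_hits_by_test.items():
        for api in apis:
            tests_by_api[api].add(test)
    return tests_by_api

def coverage_for_spec(api_hits_by_test, apis_by_spec):
    """Output 2: per spec, how many tests hit each of its APIs (a proxy for raw hit
    counts; zero means the API was never exercised)."""
    hit_counts = defaultdict(int)
    for apis in api_hits_by_test.values():
        for api in apis:
            hit_counts[api] += 1
    return {
        spec: {api: hit_counts.get(api, 0) for api in apis}
        for spec, apis in apis_by_spec.items()
    }
```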

Would any of this be immediately useful for spec maintainers? For anyone else?

We could use the same stuff to figure out what tests are relevant to a page on MDN, but how to leverage that isn't obvious.

@tabatkins FYI that we're thinking about this, complementary to manual fine-grained test linkage using Bikeshed.

@annevk
Member

annevk commented Mar 8, 2018

Lookup of single API to which tests hit that API

This is actually quite useful, as whenever we want to change something we need to have this list. I typically resort to grep.

@domenic
Member

domenic commented Mar 8, 2018

The former is quite nice for knowing, when it's time to update the spec for something, which tests you should potentially update. I'm sure I've made spec changes that have broken existing tests I didn't know about.

Although, I guess this will be pretty useless for core DOM APIs that all tests use. E.g. tons of iframe APIs are going to be hit for all multi-global tests, even if they're testing multi-globals in the XHR API or something.

@foolip
Member Author

foolip commented Mar 8, 2018

Yep, some APIs will show up everywhere, but then those are probably not the kinds of APIs we want to change the most; document.body is going to stay as it is :)

For the "Lookup of single API to which tests hit that API" case, would this be for APIs with very grep-unfriendly names? I would definitely reach first for grep in a case like node.querySelectorAll(), but maybe useful for https://dom.spec.whatwg.org/#dom-elementcreationoptions-is and such?

How would the information have to be made available for it to be less work than using grep? Command line? Web property? Inline in the spec? A JSON file?

@rtoy

rtoy commented Mar 9, 2018

I like having counters for everything. But what do you mean by "for local use"? The counters don't work in the wild? I like the automatic part, though, because I've learned that some of the counters we added for WebAudio don't actually count what we thought they did.

It would be great for testing purposes; for existing attributes in WebAudio, we have discovered that some weren't tested. Sometimes we forget to test the values completely, and other such things. Having anything automatic to tell us we're missing things would be totally awesome.

@foolip
Member Author

foolip commented Mar 10, 2018

@rtoy, by "for local use", I mean that builds we use to collect this data would most likely be custom builds on a machine dedicated to collecting this data, not vanilla Chromium builds. This is because we would effectively have to add "use counters" for all APIs exposed using Web IDL, which would simply not fly for performance reasons.

So, very concretely, I think this would be:

  • An off-by-default build flag that causes the bindings generator to add some form of logging for every method, attribute, operation, dictionary member, and so on, very much like [Measure] but for everything.
  • Run each test in web-platform-tests individually with that build, to associate a single test with a list of code paths exercised.
  • Do something useful with the data. Whether it would be useful is the question of this issue.
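
As a rough sketch of what the second step could look like: run each test on its own against the instrumented build and save the identifiers it logged. The instrumented build and the BINDINGS_LOG environment variable are hypothetical here; only `./wpt run` and its `--binary` flag are real web-platform-tests bits.

```python
import json
import os
import subprocess

def collect(tests, browser_binary, out_path="api_hits_by_test.json"):
    """Run each test individually and record which Web IDL identifiers its run logged."""
    api_hits_by_test = {}
    for test in tests:
        log_path = "bindings_log.txt"
        if os.path.exists(log_path):
            os.remove(log_path)
        # Hypothetical hook: the instrumented build writes one identifier per line here.
        env = dict(os.environ, BINDINGS_LOG=log_path)
        subprocess.run(["./wpt", "run", "--binary", browser_binary, "chrome", test],
                       env=env, check=False)
        with open(log_path) as f:
            api_hits_by_test[test] = sorted({line.strip() for line in f if line.strip()})
    with open(out_path, "w") as f:
        json.dump(api_hits_by_test, f, indent=2)
```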

An older idea that's been floating around is to do full code coverage builds in the same way, which would be a superset of this: much more data, but an even bigger undertaking. (The idea being that we could go look at code we think is well covered to see if there are actually gaping holes; i.e., the value is in going from 30% to 90% coverage, not from 95% to 99%, which I don't think is a good return for the time spent.)

@mdittmer

mdittmer commented Mar 11, 2018 via email

@foolip
Member Author

foolip commented Mar 13, 2018

I think there should be ways to avoid restarting the browser by logging the test URL at the right points, but it'd take some work to convince ourselves that there's no chance of raciness, i.e., that all of the logging comes from a single thread and can't arrive out of order, which is generally not the case.

@mdittmer

mdittmer commented Mar 13, 2018 via email

@foolip
Member Author

foolip commented Mar 14, 2018

@mdittmer, I'm not sure exactly; the code paths to log would be spread across renderer processes because of OOPIF, and the current URL is ultimately something that changes in the browser process first and then reaches a new or reused renderer process, so having confidence in the non-raciness of any logging setup is hard. The most straightforward way is probably to start by running the tests one by one, and to validate any optimization by comparing to those results.
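
A tiny sketch of that validation step: diff the one-by-one baseline against whatever an optimized run produces. The file names and the {test: [api, ...]} JSON shape are assumptions.

```python
import json

def compare(baseline_path="one_by_one.json", optimized_path="batched.json"):
    """Report tests where the optimized run's API list differs from the baseline."""
    with open(baseline_path) as f:
        baseline = json.load(f)
    with open(optimized_path) as f:
        optimized = json.load(f)
    for test in sorted(set(baseline) | set(optimized)):
        base, opt = set(baseline.get(test, [])), set(optimized.get(test, []))
        if base != opt:
            print(test)
            print("  missing in optimized run:", sorted(base - opt))
            print("  extra in optimized run:  ", sorted(opt - base))
```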

@foolip
Member Author

foolip commented Mar 14, 2018

@plehegar and @tabatkins both told me that this might be most immediately useful to bootstrap spec→test linking, and that makes sense. It seems like it'd also easily reveal bits that are totally untested.
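
For the "totally untested" part, the per-spec coverage data sketched earlier would make that nearly a one-liner; a small illustration using the same assumed {spec: {api: hit_count}} shape:

```python
def untested_apis(spec_coverage):
    """Return, per spec, the Web IDL identifiers that no test hit at all."""
    return {
        spec: sorted(api for api, count in counts.items() if count == 0)
        for spec, counts in spec_coverage.items()
    }
```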

@domfarolino
Member

In general, I really like this idea; thanks for posting. It sounds really useful!

@annevk annevk closed this as completed Sep 26, 2019