Ignore from coverage more non-grammar statements in generated parsers #633
- Testing: Use CLI to ignore unneeded source for improving coverage of pegjs file
Hi, I don't entirely understand this issue. I also use … Are you trying to get actual coverage out of a generated parser? If so, why don't you just wrap calls to the parser, and get coverage of the wrappers?
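That wrapper idea can be sketched roughly as follows. This is a minimal illustration, not anyone's actual code: the `parser` object here is a stand-in for whatever module pegjs emits, and the wrapper name `parseTypeExpr` is invented for the example.

```javascript
// Stand-in for the generated parser (a real project would import the
// module pegjs emits; this object only imitates its parse() shape).
const parser = {
  parse(input, options = {}) {
    if (typeof input !== 'string') throw new TypeError('input must be a string');
    return { type: options.startRule || 'Start', raw: input };
  },
};

// One thin, hand-written wrapper per grammar entry point you care about.
// Coverage is then measured on these wrappers, not on the machine-generated
// parser internals, so the report only reflects code you actually wrote.
function parseTypeExpr(source) {
  return parser.parse(source, { startRule: 'TypeExpr' });
}

console.log(parseTypeExpr('Array.<string>').type); // → "TypeExpr"
```

The generated parser file itself would then be excluded from instrumentation entirely (e.g. via nyc's `exclude` config), with the wrappers carrying the coverage signal.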
Being honest, I don't think code coverage metrics of a generated parser make sense. Most of a parser is about "this isn't the grammar." A parser is machine-generated. Just test the grammar. You don't need coverage of the parser. You need coverage of the grammar. The easiest way to get there is with a stochastic tester like fast-check. Not only will you never actually get there by modifying …
@StoneCypher: Thank you for your thoughts and info on this.

Regarding much of the parse code being about "this isn't the grammar", that is why in #632 I have ignore statements generated on those parts which aren't about the grammar. While yes, it is coverage of the grammar that we want, with proper coverage exposure of the parser, one should in theory be able to get full coverage of the grammar.

Regarding peg entering rules it's not using, I believe it is generally enough for coverage to check the branches which test positive, and our PR in #632 is aimed at doing so. In performing such checks, that PR gave us an indication of coverage that was missing, helping ensure our project covered conditions it had not been covering, but without extra noise from paths not taken by the parser for a particular input, where the rule ultimately gets covered elsewhere.

I'm not interested in mere calls to the parser, but, as you say, the grammar. Thank you for bringing stochastic checkers to my attention. It would be an interesting notion if one could hook into the parser generation process, such that, instead of say supplying a …

Back to the topic of branches not being covered, to take a sample portion of one of our rules:
...the generated portion looks like:

```javascript
function peg$parseVariadicTypeExprOperand() {
  // ...
  s0 = peg$parseUnionTypeExpr();
  if (s0 === peg$FAILED) {
    s0 = peg$parseUnaryUnionTypeExpr();
    if (s0 === peg$FAILED) {
      s0 = peg$parseRecordTypeExpr();
      if (s0 === peg$FAILED) {
        // ...
```

For many projects, it would not be of interest to check that a specific … However, depending on how the parsed results were used, we might want to optionally avoid ignoring the else branches here, if a project wanted to ensure, e.g., that …
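The general shape of the hints discussed here can be sketched with istanbul's real ignore pragma, which nyc's instrumenter honors. Everything below is illustrative: `parseColor`, `parseNumber`, and `parseOperand` are invented names standing in for a generated alternative cascade, and whether #632 emits exactly this pragma is not confirmed here.

```javascript
// Hand-written analogue of a generated alternative cascade. The
// "istanbul ignore else" pragma tells nyc not to flag the implicit,
// untaken else path of the following if, so only genuinely
// unexercised alternatives show up as coverage misses.
function parseColor(s)  { return s === 'red' || s === 'blue' ? { color: s } : null; }
function parseNumber(s) { return /^\d+$/.test(s) ? { number: Number(s) } : null; }

function parseOperand(input) {
  let result = parseColor(input);
  /* istanbul ignore else */
  if (result === null) {
    result = parseNumber(input);
  }
  return result;
}

console.log(parseOperand('42')); // → { number: 42 }
```

With the pragma in place, a test suite that only ever exercises the first alternative would no longer be penalized for the untried fallback, which is the "noise" this issue is about.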
It's easy to get full coverage of the grammar. The problem is that coverage of a grammar is completely meaningless. Your response to my explanation seems to miss the point. I'll try again.

**This is why this can't work**

My largest grammar has around 800 things in it. I don't really know the formal way to say these things, so I kind of want to call them expressions, but I'm worried that's incorrect. It's around 3000 SLOC (5 kLOC, because I use a ton of whitespace). It's right around the edge of my ability to cope right now.

The grammar I'm going to tell you about now is much smaller. (The larger one isn't public; this one is.) And I sort of apologize: this grammar is probably poorly written, and it'll probably offend someone who's actually good at parsers. But also, it's 138 ... what do I call those, rules? 669 lines of code after you remove whitespace. Also it's the first non-trivial grammar I ever wrote. At the time I originally wrote it, this was brutally difficult for me.

The reason I know how to set up coverage for the parser is I already did it (by accident) on that grammar. The reason I know how to exclude the parser from coverage is later I realized that the coverage, while correct, was dangerously misleading, and as a result, useless.

I wanted coverage in general for the standard reason: I don't trust my own code, and so I rely heavily on testing. The underlying library has 100% coverage with about 80x median pathing. I'm pretty proud of that.
**Tooling changes**

Except later, when I switched to typescript, the way I included the generated file had to change, because typescript sets the build tree up according to the furthest file up the filesystem chain (which makes sense, but now makes me wonder what it does across disks on windows).

The reason that matters is that prior to moving the machine to typescript/rollup, it had previously been flow and browserify (it's old, sue me), and they set the root of the build according to the entrypoint, and exclude things above that from the build.

Either is fine, but I had given my users the same build tree for years and I didn't want to give that up in a tool change, so I chose to move the one generated file into the proper source tree, then delete it after the build. (I'm still not very comfortable with that decision.)

And so, pow, suddenly … And that's when I decided I wanted coverage of my peg. If a rule existed, it was code, and it should be part of my 100%, right?

**The results**

And you know, that sounded great at first. And sure enough, I generated a bunch of test cases that happened to get into the peg's corners.

The problem was that initially I didn't understand that it was getting into the peg's corners in ways that don't read correctly on a coverage test, and as such, my coverage, though technically correct, was observationally quite false.

It's sort of important to remember that by the time coverage came into this grammar, there were already probably 70-80 rules, and it was already probably 400 SLOC. The host environment also already had more than a thousand test cases, so when coverage came up by accident it was already at a little over 50%.

Also, it's NYC, so there's a little column at the right edge of the report that gives you three or four line numbers per file which don't have coverage yet, so I kind of just used those as a guide. Add a test that kills the line numbers I see, re-run, get new line numbers.
I was able to achieve 100% coverage, I thought, in about two weeks.

**The first shoe drops**

Then I add some random feature. I don't even remember what. I think it might have been stripes.

The thing this grammar is parsing is a nightmare rebake of the basic DOT grammar, for what I originally claimed was uniformity. It's a shorthand for finite state machines. And so you see sequences in it like …
which could also be written …

Or, they can be bi-directional: …

Or you can add "actions," which are (potentially shared) events which result in transitions: …

Or, on both sides, for bidirectional links: …
... or, those could share action titles, which might actually be useful ...
And the grammar is sufficiently complex that the parser that's built from the current 34k grammar is currently 1.4 MB, or 103k in small-but-slow (almost all of that is indentation, something I want to fix once the repo is ready for PRs again; I just fixed it locally yesterday once I found out about it).

And so you know, when I did this remaining 50% coverage by hand, I was pretty geeked, right? Because it's not just the situation as described above, but rather that written by someone who doesn't know what they're doing. It Ain't Pretty ™

**Your point?**

Then I try to introduce stripes, which rewrites roughly the above
into this shorter, more compact form: …
Might seem silly with an example that short, but as you get these really long stripes of states (hence the name,) they really shorten stuff. And then, the thing I was most excited about using them with immediately broke, namely …
Because that allows a form of applicable modularity on finite state machine states that is very similar to what Java calls a cross-cutting aspect, and which largely makes monstrosities like hierarchical FSMs unnecessary. To me, this is gold.

**Mmm hmm ok stop talking about your project**

except lol

I did implement that, and my parser supremo shits the bed extra-hard in the unit tests, even though the grammar is showing full coverage, full passing. And so I'm Mister Krabs as fuck, right? Literally no idea what's going on.

And then I realize "oh wait, maybe my unit tests are causing coverage on the parser in ways that it shouldn't be showing coverage." And at that point I can't even think of a why yet. So I just move all my unit tests into the wrong directory and run coverage again.

Now my parser coverage, which was at 100% from 51%, is at ... at ... 33%? what? 100 - max 51 isn't 49? where did the other 18% go? (Honestly I still haven't figured that out.)

Okay whatever fine shut up. So I just do the process again, climbing … But I had already learned: those coverage numbers did not actually reflect the parser. They reflected any external use of the parser at all. And that's not actually good enough for parser coverage, because that means anything using that gets the line, no matter what's actually on the line. So I changed my …

**And lo, brother: thine other shoe too shall drop, finally**

And so after I get coverage up, I find the thing where the list doesn't do the stuff with the junk, and I reorder some hojombo, and suddenly LEGIT. So I'm going to implement the other thing I have like stripes, called cycles, right? The notation is pretty much the same thing, but they don't stop at the end, they wrap around. So like, now you can do things like …
and actually some other funkier stuff too, like non-n steps; like if you play the card game Magic the Gathering, …
Write out a grammar chunk-a-chunk, put it right where stripes are 'cause it's almost the same thing, and, off we go to the ra-- hey why is everything on fire.

And I realize actually the way … And now I'm confused and frustrated, because I just implemented the coverage on that stuff.

Oh, no, though, I didn't. I went back, and I looked, because I had just earned 100% on coverage without unit tests, and th... the... wait, where is it?

And I'm looking, right? I'm finding the tests for ... and there's nothing for the actual number. When it wasn't a one, it was getting taken up as a color RGBA hexadecimal. And the unit test faults that, because that's horse hockey. (I map every feature with every non-range value and validate per-pairing that the right result comes out.)
No, dear fellow. You don't.

And so I actually traced the thing in a browser to figure out what was happening. Oh. Oh, I see.

To be a jerk, I usually add reversed test cases also. So, in addition to …
I will also add things like …
And that test caused the parser to try the path I thought it had tested, discard it as wrong, and move on. But the coverage engine didn't know that PEG had discarded it as wrong.

In general, in coverage testing, the assumption is that if a line has been gone over, it's been used correctly, and may be counted as validated. What I learned the hard way is that in anything involving backtracking and pattern matching at the same time, that assumption is not true. All coverage shows you at that point is that a line is adjacent to a used line, and that's not actually valuable.

Instead, I have found great success in the alternate practice of generating primitives in a stochastic testing library (I like fast-check) and making sure that every rule has an expressing primitive.
Three ways.
Having a stochastic generator is an incremental guarantee of total space coverage - not just of the primary space but also of every combination and expansion you can figure out how to express. It's less work, it's more powerful, and the results are trustworthy.
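The "one expressing generator per rule" practice can be sketched without a library, too. fast-check packages this pattern properly (with shrinking and failure reporting); the version below is a hand-rolled illustration, and every name in it — `makeRng`, the `gen` table, the rule names, and the `accepts` stand-in — is invented for the example.

```javascript
// Tiny deterministic PRNG (Lehmer-style) so runs are reproducible.
function makeRng(seed) {
  let s = seed % 2147483647;
  return () => (s = (s * 48271) % 2147483647) / 2147483647;
}

const rnd = makeRng(42);
const pick = (arr) => arr[Math.floor(rnd() * arr.length)];

// One generator per grammar rule: each emits strings the rule must accept.
// These express the *rule*, not line numbers in the generated parser.
const gen = {
  name: () => pick(['string', 'number', 'Foo']),
  unionTypeExpr: () => `(${gen.name()}|${gen.name()})`,
  variadicTypeExprOperand: () => pick([gen.name, gen.unionTypeExpr])(),
};

// Stand-in acceptance check; a real test would call the generated parser
// and assert it does not throw (or returns the expected AST shape).
const accepts = (s) => /^[\w().|]+$/.test(s);

for (let i = 0; i < 100; i++) {
  const sample = gen.variadicTypeExprOperand();
  if (!accepts(sample)) throw new Error(`rule rejected: ${sample}`);
}
console.log('100 generated samples accepted');
```

The payoff is the one described above: because the generators compose the same way the rules do, sampling them walks combinations of rules rather than lines, so a passing run says something about the grammar instead of about which lines the parser happened to touch.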
```javascript
function ArrayWrapOneNumberNeverThrow(n) {
  return (n === 4.2) ? new TypeError('oh noes!') : n;
}

test('lol', expect(ArrayWrapOneNumberNeverThrow(0)).toBe([0]));
```
If you annotate that with types, the contradiction becomes visible to the compiler:

```typescript
function ArrayWrapOneNumberNeverThrow(n: number): number[] {
  return (n === 4.2) ? new TypeError('oh noes!') : n;
}

test('lol', expect(ArrayWrapOneNumberNeverThrow(0)).toBe([0]));
```
```typescript
function is_illegal(n: number): boolean {
  return (n % 5) * (n / Math.log(n)) > (n / 4);
}

function ArrayWrapOneNumberNeverThrow(n: number): number[] {
  return is_illegal(n) ? new TypeError('oh noes!') : n;
}

test('lol', expect(ArrayWrapOneNumberNeverThrow(0)).toBe([0]));
```

Now you have to do some pretty heavy math to figure out whether that's an actual criterion, and whether that coverage matters, and how to trigger it. When you're talking about combinations of factors, in parsers, this kind of situation is actually pretty common. Here, TypeScript would be able to say "this might be an issue," but not whether it actually is.
It's a whole different class of testing knowledge. It's not "extra sanity checking." It's a radical revision to what it's possible to test. It's moving from C to C++. Give it a chance.
I believe I follow the bulk of what you are saying, but I think there is a misunderstanding here on the capabilities of nyc. Here is a test repo showing that nyc does show a failure in coverage for the …

Perhaps you had been using an older version of nyc? Such conditions are marked differently in nyc (in yellow rather than red), but they are still reported. Are you speaking about that?

Anyways, it can report missing branch coverage, and that is what the #632 PR relies on for indicating failing coverage.
"the coverage is not correct, here's why"

"but it can show coverage, maybe it was old nyc?"

i tried twice. oh well
If you really want to prove your point, then give a concrete example in JS where it actually fails. Your JS examples do not fail in nyc.
The entire point being that it looks like it's succeeding when it isn't.
Sorry, what I mean by "do not fail in nyc" is that they do not fail to show a lack of coverage. See my links.
**Issue type**

**Prerequisites**

**Description**

I would like the generated parsers to have additional `ignore` statements to prevent noise when checking coverage against one's grammars.

**Steps to Reproduce**

Run `nyc` against one's tests that use a pegjs parser.

Example code:

n/a

Expected behavior:

I'd like to only see reporting on genuine missing code coverage, e.g., tests missing for token types which are represented in the grammar.

Actual behavior:

One sees many lines of code in the parser flagged as breaking coverage.

**Software**