TAP version 14 - First Draft #36
`Bail out!` in the parent test.

If a Subtest TAP stream does not include a version number, it MUST be
interpreted as a TAP14 stream anyway. This is a backwards-compatible
The idea being that some future TAP14+n with different behaviour could have subtests that are interpreted using TAP14 semantics? And the context here is that TAP14 is a codification of how people have been using/extending TAP13 in practice?
Yes, the context of this spec is to codify how we're extending TAP13 in practice.
The wording of this is tricky. The issue here is that you don't want subtests to be interpreted as a TAP12 stream (since they'll likely have subtests and yaml diags), but including a `TAP version XX` line in a subtest upsets some existing parsers that don't require that the version start on column 0, so it must be acceptable to omit the version designator.
Perhaps it's better to say that, in the absence of a `TAP version XX` line in the subtest, it's assumed to be the same version as the parent test? (I.e., 14+)
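To make the proposed rule concrete, here is a minimal sketch of how a consumer might pick the version for a subtest. The function name `resolveSubtestVersion` and its inputs are hypothetical and not taken from node-tap or any other parser. It assumes the subtest's indentation has already been stripped, and that blank lines and comments before the version line can be skipped.

```javascript
// Hypothetical helper, not part of any real TAP parser.
// `lines`: the subtest's lines with the subtest indentation already removed.
// `parentVersion`: the version in effect for the enclosing stream, e.g. 14.
function resolveSubtestVersion(lines, parentVersion) {
  for (const line of lines) {
    // Assumption: blank lines and comments do not count as the first line.
    if (line.trim() === '' || line.startsWith('#')) continue;
    // Only the first significant line may carry a version designator.
    const match = /^TAP version (\d+)$/.exec(line);
    return match ? Number(match[1]) : parentVersion;
  }
  // Empty subtest: nothing overrides the parent's version.
  return parentVersion;
}
```

With this reading, a subtest that omits the version line is parsed with the parent's version, and an explicit version line is still honoured when a producer emits one.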
That's kinda what I was thinking. As useful as it would be to be able to have some older tests' output appear as subtests, it seems like having the facilities to produce them in a way that matches this spec is somewhat mutually exclusive with requiring this contingency.
Yeah, I don't think people are restreaming different versions very much, as I personally think the format doesn't really lend itself to that.
For stuff like `t.test('child test', function (t) { ... })`, yeah, you're going to definitely use the same version, because you're generating TAP with the same test framework. But for spawning child processes, or especially collating tests from users of a web service, it's possible to get more random stuff showing up.
Certainly, I just don't think TAP handles those test cases very well at all. With the fact that the debug isn't there for the process etc.
Allowing more than one TAP version in a stream sounds like an implementor's nightmare to me. I don't think it's worth the complication.
> Allowing more than one TAP version in a stream sounds like an implementor's nightmare to me.
Yeah, I agree, and in practice I'd probably just rely on TAP's relatively reliable backward compatibility, and parse it all as TAP14.
Since TAP14 is a superset of TAP12 (mostly; it's a bit stricter about yaml indentation, but in practice, everyone uses 2 spaces, not 1), and using other versions is unlikely anyway, it seems like it might just be reasonable to say that any embedded subtest SHOULD be TAP14?
Hi, sorry, I missed the discussion for this draft, so I'm not sure I'll be able to be of much help reviewing it.
@kinow There hasn't been all that much discussion, tbh. There have been a few conversations about making TAP14 specify what people are already doing to extend TAP13, and add some clarifications about edge cases, so I just wrote down what node-tap and Test::More seem to be doing today. If that's divergent from what your TAP programs do, I'd love to surface those differences.
ok
```

has five tests. The sixth is missing. For example, in Perl,
Can this be made language-independent, or can sections like this be clearly demarcated?
The W3C marks these as 'non-normative'.
Good idea.
No, I don't think so.
Sorry. I didn't intend to misrepresent
What about the case when multiple producers share the same output stream? E.g.:

    $ nosetests --with-tap --tap-stream server
    TAP version 13
    ... a stream of Python tests here ...
    1..42
    $ cd client
    $ ember test
    TAP version 13
    ... a stream of Ember tests here ...
    1..24

As TAP currently exists, any TAP CI plugin would be unable to aggregate that. The results would have to be sent to separate files or something to do it correctly.
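For illustration only, one heuristic a consumer could try today is to start a new logical stream whenever a `TAP version` line appears at column 0. The sketch below is hypothetical (`splitStreams` is an invented name), and it is not something TAP specifies: it assumes every producer emits a version line, so it breaks as soon as one producer omits it.

```javascript
// Hypothetical heuristic, not specified by TAP: treat each column-0
// `TAP version N` line as the start of a new stream.
function splitStreams(output) {
  const streams = [];
  let current = null;
  for (const line of output.split(/\r?\n/)) {
    if (/^TAP version \d+$/.test(line)) {
      current = [];          // a version line opens a new stream
      streams.push(current);
    }
    // Lines seen before the first version line are dropped.
    if (current) current.push(line);
  }
  return streams;
}
```

Each returned array can then be handed to an ordinary single-stream parser, much as a harness would handle separate files.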
That's exactly the type of thing that prove is doing: aggregating different TAP results.
For sure, To simulate, I made a fake stream that I stored in a file,
Then I ran
Sounds like what you want is a TAP multiplexing protocol. While I can imagine that would be useful in some scenarios, it also sounds like something that would complicate consumers (and probably producers too) that don't need such features.
This is the place and time to discuss changes that could affect consumer/producer behavior, right? Maybe you didn't intend it, but it feels very dismissive to state that some change would complicate consumers/producer when the reality is that any change can potentially complicate consumers/producers. For instance, subtests will definitely complicate consumer behavior. That doesn't mean that it was thrown out from consideration. Back to the subject, I think it would be possible to support multiple TAP streams in a single output stream without being a separate protocol. Some kind of "end of TAP" marker could signal a consumer to treat the stream as multiple streams. I think it would be similar to how consumers handle and aggregate multiple files. Does anyone see hidden gotchas in having an
I would prefer not to add support for backslash-escaping to TAP. It's a slippery slope to tests with I think the right answer here is to say that Node-tap does not support any kind of escaping in test names, I'm not sure about other harnesses or test frameworks.
I am new here. I see this is a 4-year-old draft and would like to know: is TAP version 14 still ongoing, or abandoned?
From the above closed issues, it kind of looks like you are merging subfeatures of this PR instead of everything all at once; that makes sense. Nevertheless, I am curious about the status of some of the features it introduces, like subtests; could you comment on when that will be added to the spec? We are already using the subtests feature in the Linux kernel, and it would be nice to see the current version of the spec reflect this. At this point subtests are something we depend on and I really don't want this to result in us forking your spec.
@bjh83 @3cp The TAP specification is somewhat in a weird state, I have to say. No one with the moral authority to dictate the spec has the time or inclination to do so. So, what's happened is that a bunch of implementers have just gone about implementing in a way that makes sense to them. For myself, as the maintainer of node-tap, yes, I've basically already "ratified" this specification years ago, and have implemented all of it. Test::More and a few other CPAN modules provided the original inspiration, and it was a specification of their observed behavior, so they're also in line. It might be incomplete, but as a description of current reality, it's fairly accurate. What it isn't is normative, because there is no governing body with authority to say what TAP is or should be. It's a peaceful anarchy, with the "specification" of the protocol only held in line by a shared desire to interoperate. It's a good protocol for its stated purpose, and anything that deviates too sharply would be something other than TAP, so it's probably safe to rely on, assuming you're somewhat strict in what you send and loose in what you accept. Someday when I have time, I might push forward with this, but for now, other things are higher priority.
The Plan MUST appear exactly once, EITHER:

- the first line of TAP output after the Version, or
- the last line of TAP output prior to the end of the TAP stream.
What about comments after a final plan?
Good question. In practice, we all put comments after the closing plan, so I think the assumption here is that comments don't count as a "line of TAP output". (Which is an understandable assumption, I think, but also weird and should be called out.)
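As a rough illustration of that reading, the sketch below checks the plan position while treating comments and blank lines as not counting as "lines of TAP output". The name `checkPlanPosition` is hypothetical, and the comment-skipping is the assumption discussed here rather than something the draft text states. It only looks at top-level (unindented) plan lines, so subtest plans are ignored.

```javascript
// Hypothetical validator for the "exactly once, first or last" plan rule.
// Assumption: comments (`# ...`) and blank lines are not "TAP output".
function checkPlanPosition(lines) {
  const significant = lines.filter(
    (l) => l.trim() !== '' && !l.startsWith('#')
  );
  const isVersion = (l) => /^TAP version \d+$/.test(l);
  const isPlan = (l) => /^1\.\.\d+/.test(l); // top-level plans only
  // Drop the version line, if present, before locating the plan.
  const body =
    significant.length && isVersion(significant[0])
      ? significant.slice(1)
      : significant;
  const planIndexes = body
    .map((l, i) => (isPlan(l) ? i : -1))
    .filter((i) => i !== -1);
  if (planIndexes.length !== 1) return false; // must appear exactly once
  return planIndexes[0] === 0 || planIndexes[0] === body.length - 1;
}
```

Under this sketch a trailing `# tests 2` style comment after the closing plan is accepted, which matches what producers already do in practice.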
Agreed.
Without a decision making process and clear requirements, the process is doomed.
I've done essentially the same in
This is to accommodate node-tap and tape, which allow for child tests to be associated directly with other assertion-holding tests (as opposed to having tests only contain assertions, and suites contain only tests and other suites). Ref #126.

It also allows for future compatibility with TAP 14, which currently has no concept of test groups or test suites, but is considering the addition of "sub tests". Ref TestAnything/testanything.github.io#36.

Also:
- Define "Adapter" and "Producer" terms.
- Refer mostly to producers and reporters, instead of frameworks, runners, or adapters.
- Remove mention that the spec is for reporting information about JavaScript test frameworks; it can report information about any kind of test that can be represented in its structure of JSON messages. Instead, do clarify that the spec defines a JavaScript-based API of producers and reporters.

Thought dump: In aggregation, simplify status to failed/passed only. If something has only todo or skipped children, don't propagate this like we did with suites, but cast it down to only failed/passed, as we did with "run" before. This is because, with the "suite" concept gone, we can't assume that test parents only contain other tests; they may have their own assertions. As such, a parent with only two skipped children doesn't mean the parent can therefore be marked as skipped; rather, it will be marked as passed, assuming no errors/failures reported. This affects the adapters for QUnit/Mocha/Jasmine, but when frameworks implement this themselves, they can of course know if an entire suite was explicitly skipped, in which case they can mark that accordingly.
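The aggregation rule in the thought dump can be sketched roughly as follows. This is a hypothetical helper, not the actual adapter code: the name `aggregateStatus`, its parameters, and the status strings `'passed'`, `'failed'`, `'skipped'` and `'todo'` are assumptions made for the example.

```javascript
// Hypothetical sketch of the failed/passed-only aggregation described above.
// `ownErrors`: errors reported directly on the parent test.
// `childStatuses`: statuses of its child tests (assumed string values).
function aggregateStatus(ownErrors, childStatuses) {
  const failed =
    ownErrors.length > 0 || childStatuses.includes('failed');
  // Skipped and todo children are not propagated: without a failure,
  // the parent is reported as passed.
  return failed ? 'failed' : 'passed';
}
```

For example, `aggregateStatus([], ['skipped', 'skipped'])` yields `'passed'`, since the parent may have had assertions of its own.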
== Status quo ==

The TAP 13 specification does not standardise a way of describing parent-child relationships between tests, nor does it standardise how to group tests. Yet, all major test frameworks have a way to group tests (e.g. QUnit module, and Mocha suite) and/or allow nesting tests inside of other tests (like tape, and node-tap).

While the CRI draft provided a way to group tests, it did not accommodate Tap. They would either need to flatten the tests with a separator symbol in the test name, or to create an implied "Suite" for every test that has non-zero children and then come up with an ad-hoc naming scheme for it.

Note that the TAP 13 reporter we ship, even after this change, still ends up flattening the tests by default using the greater than `>` symbol, but at least the event model itself recognises the relationships so that other output formats can make use of it, and in the future TAP 14 hopefully will recognise it as well, which we can then make use of. Ref TestAnything/testanything.github.io#36.

== Summary of changes ==

See the diff of `test/integration/reference-data.js` for the concrete changes this makes to the consumable events.

- Remove `suiteStart` and `suiteEnd` events. Instead, the spec now says that tests are permitted to have children. The link from child to parent remains the same as before, using the `fullName` field which is now a stack of test names. Previously, it was a stack of suite names with a test name at the end.
- Remove all "downward" links from parent to child. Tests don't describe their children upfront in detail, and neither does `runStart`. This information was very repetitive and tedious to satisfy for implementors, and encouraged or required inefficient use of memory.
  I do recognise that a common use case might be to generate a single output file or stream where real-time updates are not needed, in which case you may want a convenient tree that is ready to traverse without needing to listen for async events and put it together. For this purpose, I have added a built-in reporter that simply listens to the new events and outputs a "summary" event with an object that is similar to the old "runEnd" event object where the entire run is described in a single large object.
- New "SummaryReporter" for simple use cases of non-realtime traversing of a single structure after the test has completed.

== Caveats ==

- A test with the "failed" status is no longer expected to always have an error directly associated with it. Now that tests aggregate into other tests rather than into suites, this means tests that merely have other tests as children do still have to send a full testEnd event, and thus an `errors` and `assertions` array. I considered specifying that errors have to propagate but this seemed messy and could lead to duplicate diagnostic output in reporters, as well as ambiguity or uncertainty over where errors originated.
- A suite containing only "skipped" tests now aggregates as "passed" instead of "skipped". Given we can't know whether a suite is its own test with its own assertions, we also can't assume that if a test parent has only "skipped" children that the parent was also skipped. This applies to our built-in adapters, but individual frameworks, if they know that a suite was skipped in its entirety, can of course still set the status of parents however they see fit.
- Graphical reporters (such as QUnit and Mocha's HTML reporters) may no longer assume that a test parent has either assertions/errors or other tests. A test parent can now have both its own assertions/errors, as well as other tests beneath it. This restricts the freedom and possibilities for visualisation.
  My recommendation is that, if a visual reporter wants to keep using different visual shapes for "group of assertions" and "group of tests", they buffer the information internally such that they can first render all of the test's own assertions, and then render the children, even if they originally ran interleaved and/or the other way around. Ref #126.
- The "Console" reporter that comes with js-reporter now no longer uses `console.group()` for collapsing nested tests.

== Misc ==

- Add definitions for the "Adapter" and "Producer" terms.
- Use terms "producer" and "reporter" consistently, instead of "framework", "runner", or "adapter".
- Remove mention that the spec is for reporting information from "JavaScript test frameworks". CRI can be used to report information about any kind of test that can be represented in CRI's event model, including linting and end-to-end tests for JS programs, as well as non-JS programs. It describes a JS interface for reporters, but the information can come from anywhere. This further solidifies that CRI is not meant to be used for "hooking" into a framework, and sets no expectation about timing or run-time environment being shared with whatever is executing tests in some form or another. This was already the intent originally, since it could be used to report information from other processes or from a cloud-based test runner like BrowserStack, but this removes any remaining confusion or doubt there may have been.

Fixes #126.
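To illustrate how a consumer could rebuild a traversable tree from the flat events using only the `fullName` stack, here is a minimal sketch. It is not the SummaryReporter shipped with the library: `buildSummary` is a hypothetical name, and the event shape is reduced to the two fields the example needs (`fullName` as an array of names, and `status`). It also assumes sibling tests have unique names.

```javascript
// Hypothetical sketch: fold testEnd-like events into a tree keyed by name.
// Assumed event shape: { fullName: string[], status: string }.
function buildSummary(testEndEvents) {
  const root = { name: null, status: null, tests: new Map() };
  for (const event of testEndEvents) {
    let node = root;
    // Walk (and create, if needed) one node per entry of the name stack.
    for (const name of event.fullName) {
      if (!node.tests.has(name)) {
        node.tests.set(name, { name, status: null, tests: new Map() });
      }
      node = node.tests.get(name);
    }
    // The last entry of the stack is the test this event describes.
    node.status = event.status;
  }
  return root;
}
```

Because children are located by walking the parent's names, the order in which parent and child events arrive does not matter for this sketch.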
== Status quo == The TAP 13 specification does not standardise a way of describing parent-child relationships between tests, nor does it standardise how to group tests. Yet, all major test frameworks have a way to group tests (e.g. QUnit module, and Mocha suite) and/or allow nesting tests inside of other tests (like tape, and node-tap). While the CRI draft provided a way to group tests, it did not accomodate Tap. They would either need to flatten the tests with a separator symbol in the test name, or to create an implied "Suite" for every test that has non-zero children and then come up with an ad-hoc naming scheme for it. Note that the TAP 13 reporter we ship, even after this change, still ends up flattening the tests by defaut using the greater than `>` symbol, but at least the event model itself recognises the relationships so that other output formats can make use of it, and in the future TAP 14 hopefully will recognise it as well, which we can then make use of. Ref TestAnything/testanything.github.io#36. == Summary of changes == See the diff of `test/integration/reference-data.js` for the concrete changes this makes to the consumable events. - Remove `suiteStart` and `suiteEnd` events. Instead, the spec now says that tests are permitted to have children. The link from child to parent remains the same as before, using the `fullName` field which is now a stack of test names. Previously, it was a stack of suite names with a test name at the end. - Remove all "downward" links from parent to child. Tests don't describe their children upfront in detail, and neither does `runStart`. This was information was very repetitive and tedious to satisy for implementors, and encouraged or required inefficient use of memory. 
I do recognise that a common use case might be to generate a single output file or stream where real-time updates are not needed, in which case you may want a convenient tree that is ready to traverse without needing to listen for async events and put it together. For this purpose, I have added a built-in reporter that simply listens to the new events and outputs a "summary" event with an object that is similar to the old "runEnd" event object where the entire run is described in a single large object. - New "SummaryReporter" for simple use cases of non-realtime traversing of single structure after the test has completed. == Caveats == - A test with the "failed" status is no longer expected to always have an error directly associated with it. Now that tests aggregate into other tests rather than into suites, this means tests that merely have other tests as children do still have to send a full testEnd event, and thus an `errors` and `assertions` array. I considered specifying that errors have to propagate but this seemed messy and could lead to duplicate diagnostic output in reporters, as well ambiguity or uncertainty over where errors originated. - A suite containing only "skipped" tests now aggregates as "passed" instead of "skipped". Given we can't know whether a suite is its own test with its own assertions, we also can't assume that if a test parent has only "skipped" children that the parent was also skipped. This applies to our built-in adapters, but individual frameworks, if they know that a suite was skipped in its entirety, can of course still set the status of parents however they see fit. - Graphical reporters (such as QUnit and Mocha's HTML reporters) may no longer assume that a test parent has either assertions/errors or other tests. A test parente can now have both its own assertions/errors, as well as other tests beneath it. This restricts the freedom and possibilities for visualisation. 
My recommendation is that, if a visual reporter wants to keep using different visual shapes for "group of assertions" and "group of tests", that they buffer the information internally such that they can first render all the tests's own assertions, and then render the children, even if they originally ran interleaved and/or the other way around. Ref #126. - The "Console" reporter that comes with js-reporter now no longer uses `console.group()` for collapsing nested tests. == Misc == - Add definitions for the "Adapter" and "Producer" terms. - Use terms "producer" and "reporter" consistently, instead of "framework", "runner", or "adapter". - Remove mention that the spec is for reporting information from "JavaScript test frameworks". CRI can be used to report information about any kind of test that can be represented in CRI's event model, including linting and end-to-end tests for JS programs, as well as non-JS programs. It describes a JS interface for reporters, but the information can come from anywhere. This further solifies that CRI is not meant to be used for "hooking" into a framework, and sets no expectation about timing or run-time environment being shared with whatever is executing tests in some form or another. This was already the intent originally, since it could be used to report information from other processes or from a cloud-based test runner like BrowserStack, but this removes any remaining confusion or doubt there may have been. Fixes #126.
In light of the shift in direction per #133, I'm reverting (most of) cce0e4d so as to allow the next release to be more similar to the previous one, and to make upgrading easy, allowing most reporters to keep working with very minimal changes (if any). Instead, I'll focus on migrating consumers of js-reporters to use TAP tools directly where available, and to otherwise reduce use of js-reporters to purely the adapting and piping to TapReporter.

* Revert `RunStart.testCounts` > `RunStart.counts` (idem RunEnd).
* Revert `TestStart.suiteName` > `TestStart.parentName` (idem TestEnd).
* Revert Test allowing Test as child, restore Suite.

This un-fixes #126, which will be declined. Frameworks adapted to TAP by js-reporters will not support nested tests. Frameworks directly providing TAP 13 can use one of several strategies to express relationships in a backwards-compatible manner, e.g. like we do in js-reporters by flattening with the '>' symbol, or through indentation, or through other manners proposed in TestAnything/testanything.github.io#36. Refer to #133 for questions about how to support TAP.
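As a rough illustration of the flattening strategy mentioned above, a producer could join the stack of names with `>` when writing a TAP 13 test line. The helper below is hypothetical and is not the js-reporters TapReporter; the `fullName` array, the `" > "` spacing, and the handling of the "skipped" status are assumptions made for the sketch.

```javascript
// Hypothetical helper: formats one TAP 13 test line from a test result.
// Assumes `fullName` is an array of names from the outermost group to the
// test itself, and `status` is one of "passed", "failed", or "skipped".
function toTap13Line(index, testEnd) {
  const result = testEnd.status === 'failed' ? 'not ok' : 'ok';
  // Flatten the parent/child relationship into the description.
  const description = testEnd.fullName.join(' > ');
  const directive = testEnd.status === 'skipped' ? ' # SKIP' : '';
  return `${result} ${index} ${description}${directive}`;
}
```

A TAP 13 consumer sees only a flat list of tests, so the relationship survives in the description text alone.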
This is a first draft at a specification that seeks to ratify existing behavior of TAP harnesses and producers.

- YAML blocks standardized to 2 space indentation
- Subtests specified to behavior of `Test::More` and `node-tap`.
- Normative advice regarding exit code for harness programs
- Examples and usage comments made language-agnostic.
- Clarification of whitespace and hyphens in test lines.
- Clarification of handling of incorrect lines.
- Specification of Pragma lines

It'd be very helpful for implementors to point to a body of language-agnostic tests for compliant parsers. I've got a pretty nice start at this over at <https://github.com/substack/tap-parser/tree/master/test/fixtures>, but we may want to bikeshed the event/property names a bit, and ideally I'd like to get at least one other implementation passing those tests before we sign off on them.

Note that this does not go as far as a lot of people would probably like to see in a forward-looking TAP specification. No fancy new magic is added. However, before we start talking about brand new features, it seems wise to ratify the features we are already using.

With this change, node-tap and Test::More should be able to simply change their version number from 13 to 14 in order to be fully compliant. Feedback from other TAP producers and harnesses is necessary before adopting this officially.
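To make the YAML and subtest points concrete, here is a small sketch of a producer emitting an indented subtest with a 2-space YAML diagnostic block. The helper names are hypothetical, the 4-space subtest indent is an assumption based on what `Test::More` and `node-tap` emit, and the draft text itself remains the authority on the exact rules. `yamlDiagnostic` handles only flat scalar fields and is not a general YAML serializer.

```javascript
// Hypothetical helper: a YAML diagnostic block, indented 2 spaces relative
// to the test line it follows and delimited by "---" and "...".
// JSON.stringify is used for values because JSON scalars are valid YAML.
function yamlDiagnostic(fields) {
  const body = Object.entries(fields).map(
    ([key, value]) => `  ${key}: ${JSON.stringify(value)}`
  );
  return ['  ---', ...body, '  ...'];
}

// Hypothetical helper: indents a child TAP stream so that it nests under
// the parent. The default width of 4 is an assumption, see the lead-in.
function indentSubtest(lines, width = 4) {
  const pad = ' '.repeat(width);
  return lines.map((line) => pad + line);
}

const stream = [
  'TAP version 14',
  ...indentSubtest([
    '1..1',
    'not ok 1 - inner',
    ...yamlDiagnostic({ message: 'boom' }),
  ]),
  'not ok 1 - outer',
  '1..1',
];

console.log(stream.join('\n'));
```

The subtest omits its own version line, which matches the review discussion about parsers that do not require the version line to start in column 0.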
@isaacs has this been superseded by the work on the Specification repository? If so I think this can now be closed?
Ha, yes, superseded by TestAnything/Specification#25