
criteria.md: remove "code coverage" requirements. #1

Closed

Conversation


gregkh commented Aug 10, 2015

Code "coverage" is really just "test coverage" and is not a good metric for any sort of "assurance" other than someone wrote a bunch of tests. See http://martinfowler.com/bliki/TestCoverage.html for more details as to why I wouldn't even put this on a list of something to try to achieve toward.

Code "coverage" is really just "test coverage" and is not a good metric for any sort of "assurance" other than someone wrote a bunch of tests.  See http://martinfowler.com/bliki/TestCoverage.html for more details as to why I wouldn't even put this on a list of something to try to achieve toward.

dankohn commented Aug 11, 2015

Fowler says:

If you are testing thoughtfully and well, I would expect a coverage percentage in the upper 80s or 90s.

The criteria you're suggesting to delete are not for the initial badge, but for a potential future version with more stringent criteria. I think the point could be clarified that 100% test coverage probably isn't a good criterion even for a hypothetical platinum badge, for the reasons that Fowler describes.

But code coverage does have a reasonable correlation with the chance that an unrelated change will break basic functionality elsewhere in the app (i.e., it serves as a smoke test). While this isn't particularly relevant to security bugs, it does matter to the reliability of a project and its resilience to updates, especially for projects that don't have lots of eyeballs. For example, it is useful for gems and npm modules to have enough test coverage to tell whether a new pull request breaks any core functionality (which Travis verifies automatically and displays in the GitHub user interface). That lets a maintainer merge after a code review without having to manually QA (quality-assure) the library first.
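To make that concrete, here is a minimal sketch of the kind of smoke test being described; the file name, parse_entry(), and the build/report commands are illustrative assumptions, not anything from this project:

```c
/* smoke_test.c - a hypothetical minimal smoke test; parse_entry() and
 * its semantics are invented for this illustration.
 * Build with coverage instrumentation and report with gcov:
 *   gcc --coverage -c smoke_test.c
 *   gcc --coverage -o smoke_test smoke_test.o
 *   ./smoke_test && gcov smoke_test.c   # prints % of lines executed
 */
#include <assert.h>
#include <string.h>

/* Toy "core functionality" under test. */
static int parse_entry(const char *s)
{
    return (s != NULL && strlen(s) > 0) ? 0 : -1;
}

int main(void)
{
    /* Exercise the basic success and failure paths; a CI job running
     * this on every pull request gives the automatic check described
     * above. */
    assert(parse_entry("name=value") == 0);
    assert(parse_entry("") == -1);
    assert(parse_entry(NULL) == -1);
    return 0;
}
```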

(Oh, and thanks for making pull request #1!)


gregkh commented Aug 11, 2015

I agree this isn't for the "initial" badge; it's just that a blind "100%" or "greater than X%" threshold might not be all that relevant a metric on its own.

Feel free to keep it in the list, though; what we want is people talking about this stuff. And feel free to close this pull request if you don't want to change it.

david-a-wheeler commented

I think for the moment it's better to leave the coverage idea in. That way, we can get people talking. It's not a required criterion, and it may never be required for any badge.

There's debate about the value of coverage measures. SQLite, for example, strongly emphasizes the value of its 100% branch and 100% MC/DC coverage, per https://www.sqlite.org/testing.html (they do other things, but that is a key part of what they do). In contrast, the Linux kernel has to deal with a lot of weird hardware with undocumented behavior, and no one has all of it... frankly, it's not clear how the Linux kernel drivers (for example) could get very high coverage numbers, or what value those numbers would have.
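For readers unfamiliar with the distinction, here is a small invented example (not SQLite code) of why MC/DC is a stricter bar than plain branch coverage; the function and test values are made up for illustration:

```c
/* Branch coverage of the `if` below needs only two tests (the whole
 * condition true once, false once). MC/DC additionally requires tests
 * showing that each of a, b, c independently flips the outcome. */
#include <assert.h>

static int permit(int a, int b, int c)
{
    if (a && (b || c))   /* one decision, three conditions */
        return 1;
    return 0;
}

int main(void)
{
    /* Enough for 100% branch coverage: one true, one false. */
    assert(permit(1, 1, 0) == 1);
    assert(permit(0, 0, 0) == 0);

    /* Extra cases MC/DC demands: toggling a single condition
     * changes the result. */
    assert(permit(0, 1, 0) == 0);   /* vs (1,1,0): a matters */
    assert(permit(1, 0, 0) == 0);   /* vs (1,1,0): b matters */
    assert(permit(1, 0, 1) == 1);   /* vs (1,0,0): c matters */
    return 0;
}
```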

Even Fowler notes that "if you are testing thoughtfully and well, I would expect a coverage percentage in the upper 80s or 90s... Certainly low coverage numbers, say below half, are a sign of trouble." So while high coverage is not by itself a measure of good testing, a very low measure is a sign of bad testing. If there's a way we can determine whether testing is really good - perhaps by pairing coverage with other measures - I am all ears. Also, high coverage may not be good for finding current defects, but it can be really helpful when refactoring code. I find that having lots of tests (with good coverage) makes me much more willing to refactor code.

That said, this is pull request #1... and that is AWESOME. Thank you.

Please keep the comments coming; I'd love to hear them!


gregkh commented Aug 11, 2015

For the kernel, about 15 years ago, we did have people doing code-coverage analysis; I think the build option to enable it is still in there if you want to use it. It turns out no one ever cared - it was just a "number on a spreadsheet" that someone wanted to track. We do have options to 'fail' core calls (like memory allocation) a percentage of the time to exercise error paths, but again, no one seems to ever use them, because no one really cares.
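A rough user-space sketch of that fault-injection idea (the kernel's real mechanism is its fault-injection framework, e.g. CONFIG_FAULT_INJECTION/failslab; this standalone wrapper and its FAIL_PERCENT knob are invented for illustration):

```c
#include <stdio.h>
#include <stdlib.h>

#define FAIL_PERCENT 10  /* fail roughly 10% of allocations */

/* Wrapper that makes malloc() fail on purpose some of the time,
 * forcing callers' error paths to actually run under test. */
static void *failing_malloc(size_t size)
{
    if (rand() % 100 < FAIL_PERCENT)
        return NULL;             /* simulated allocation failure */
    return malloc(size);
}

int main(void)
{
    srand(42);                   /* fixed seed: reproducible test runs */
    int failures = 0;

    for (int i = 0; i < 1000; i++) {
        void *p = failing_malloc(64);
        if (!p) {
            failures++;          /* this is the error path we want hit */
            continue;
        }
        free(p);
    }
    printf("%d of 1000 allocations failed on purpose\n", failures);
    return 0;
}
```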

Perhaps this could be rewritten as something like:

Tests for major feature additions

since without some kind of test present, it's hard to verify whether something new even works. This is now the "unofficial" rule for the kernel, as we have been burned in the past by feature additions that never even worked.

I'll close this one out for now; keeping what you have is fine at the moment. Thanks.
