Conversation

ambrussimon
Contributor

Closes #873

Review Checklist

  • Tests were added to cover all code changes
  • Documentation was added / updated
  • Code and tests follow standards in CONTRIBUTING.md

@ambrussimon ambrussimon self-assigned this Aug 7, 2017
@kofalt
Contributor

kofalt commented Aug 7, 2017

Ambrus, I know this PR is experimental, so I want to thank you for trying this out before offering my feedback. My comments are on the project, not your work :)


I don't support line-level semantic comments, especially to this extent. It's clutter on the codebase and an abuse of the purpose of having test coverage in the first place. In my opinion attempting to artificially jump to 100, then measuring the drop, is worse in every meaningful way than simply measuring the drop from an honest percentage.

For what it's worth, I would be a bit more amenable to class- or file-level exclusion comments that isolate entire swathes of functionality. That's a bit more honest and clearly stated, as opposed to "it's inconvenient that this error-handling line is not covered".
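
To make the contrast concrete, here is a minimal sketch of the two styles, assuming the project uses coverage.py's default pragma syntax (the function and class names are invented for illustration):

```python
def load_config(path):
    try:
        with open(path) as f:
            return f.read()
    except IOError:  # pragma: no cover
        # Line-level exclusion: hides only this error-handling line
        # from the coverage report.
        raise


class LegacyCompatShim:  # pragma: no cover
    # Block-level exclusion: when the line introducing a block matches
    # an exclusion pattern, coverage.py excludes the entire block.
    def old_endpoint(self):
        return None
```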

It's very tempting to use a technical solution to a social problem. In the software engineering industry this is almost always a mistake. I assert that "we have more priorities than code coverage alone" is a social problem, and "sprinkle comments on hundreds of lines to bump a metric" is a technical solution.

I also don't think the stated purpose of cover 100 is pragmatic or realistic. Right now that's at 357 occurrences - a whopping 4% of the code base - which makes me think that "Not meant for further use - existing exclusions are considered TODOs" is unlikely to remain true.

Overall this gets a strong 👎 from me.

@ryansanford
Contributor

@ambrussimon-invenshure Thanks for the PR. 👍

@kofalt

I want a culture where tested code is the default. I want to lean on tools to predictably and consistently promote that culture. I want opting code out of coverage to be visible and explicit.

I view this PR as an effective means to achieve the above.

I don't support line-level semantic comments, especially to this extent. It's clutter on the codebase and an abuse of the purpose of having test coverage in the first place.

Could you help me understand the extent of your objection here? Is it specifically the semantic comment being on the same line as the code? Is there a more appealing approach to marking individual lines?

In my opinion attempting to artificially jump to 100, then measuring the drop, is worse in every meaningful way than simply measuring the drop from an honest percentage.

Measuring a drop in percentage means coverage rates for new development need to be just as good/bad as the old.

For what it's worth, I would be a bit more amenable to class- or file-level exclusion comments that isolate entire swathes of functionality. That's a bit more honest and clearly stated, as opposed to "it's inconvenient that this error-handling line is not covered".

Skipping existing files/modules does nothing to encourage future development within that file/module to be covered.

It's very tempting to use a technical solution to a social problem. In the software engineering industry this is almost always a mistake. I assert that "we have more priorities than code coverage alone" is a social problem, and "sprinkle comments on hundreds of lines to bump a metric" is a technical solution.

The goal is not to achieve 100% coverage for the sake of stating we have 100% coverage. The goal is to move forward with higher standards for new development.

I also don't think the stated purpose of cover 100 is pragmatic or realistic. Right now that's at 357 occurrences - a whopping 4% of the code base - which makes me think that "Not meant for further use - existing exclusions are considered TODOs" is unlikely to remain true.

I agree with the sentiment. We should be honest about which skipped lines we consider important enough to cover in the next 1-3 months, use a phrase more specific than cover 100 for those, and use no cover for the chaff.
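
For instance, a minimal .coveragerc sketch along these lines, assuming coverage.py's exclude_lines setting (the cover later marker name is hypothetical):

```ini
[report]
exclude_lines =
    # Re-state the default pragma, since setting exclude_lines replaces it
    pragma: no cover
    # Hypothetical marker for exclusions we intend to test within 1-3 months
    pragma: cover later
```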

@kofalt
Contributor

kofalt commented Aug 7, 2017

I want a culture where tested code is the default. I want to lean on tools to predictably and consistently promote that culture.

I think this speaks to the heart of the "solve a social problem with a technical solution" anti-pattern.

My guess is that a lot of projects have those goals, yet very few have adopted something like this strategy.
Offhand, I've never heard of someone doing this. It strikes me as very strange.

Could you help me understand the extent of your objection here?

My objection is with line-level exclusions. To my mind no coverage exclusion is great, but ranked in descending order of palatability: file, class, branch, line. My link above shows one example of my reasoning.

This information is already exposed by better tools that do not require re-implementing a highlighter by hand.

Measuring a drop in percentage means coverage rates for new development need to be just as good/bad as the old.

Agreed. This is true for both strategies, if I'm understanding you correctly.

Skipping existing files/modules does nothing to encourage future development within that file/module to be covered.

Agreed! This is why I don't like coverage exclusions. Exclusions should be a "throw up our hands, this will never be tested because of effort level / state of the universe" last resort. Not a #TODO.

I think it's noteworthy that a few test frameworks I've run across don't support exclusions at all, by design. To quote the internet: "Code coverage tells you what you definitely haven't tested, not what you have."

The goal is to move forward with higher standards for new development.

I don't see how this change helps us do that.

I submit that there is no useful advantage to be gained in moving from having X% coverage and hoping to obtain X+Y, versus having fake-100% coverage and hoping to obtain < 357 exclusion comments.

@ambrussimon
Contributor Author

I completely agree that this solution is far from elegant, but this is a grey area in my opinion. The question, I think, is whether

a) there is a nicer (but still feasible!) solution at all or
b) the approach is bad enough to offset the intended benefits*

The goal is to move forward with higher standards for new development.

I don't see how this change helps us do that.

I think the goal here (the intended benefit) is enabling enforcement of adding some tests to any new code. Note that file-level exclusions leave far more freedom for avoiding this requirement.

Pertaining to a) above, I was also considering the incremental comparison approach, as I agree that is a "more correct" solution. However, there are some drawbacks there, too:

  1. any refactoring (removing code that's better covered than the avg) may lead to decreased coverage
  2. needs more effort* to implement properly (getting previous cov%, local vs CI)

(Regarding 2., this discussion gave me the idea of always comparing to the latest master cov% from Coveralls, effectively leaving only 1., but I'm sure there could be edge cases/problems with that too that I just haven't thought of yet.)
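
As a rough illustration of that idea, a CI step could fetch the latest master percentage and fail if the local run falls below it. A minimal sketch, assuming Coveralls exposes the repo's latest coverage as JSON (the endpoint and field name here are assumptions, not a verified API):

```python
import json
import sys
import urllib.request

# Hypothetical Coveralls endpoint for the latest master build's coverage.
REPO_URL = "https://coveralls.io/github/OWNER/REPO.json"


def master_coverage():
    with urllib.request.urlopen(REPO_URL) as resp:
        return float(json.load(resp)["covered_percent"])  # assumed field name


def check(local_percent):
    baseline = master_coverage()
    if local_percent < baseline:
        sys.exit("Coverage dropped: %.2f%% < %.2f%%" % (local_percent, baseline))


if __name__ == "__main__":
    # e.g. pass in the total percentage printed by `coverage report`
    check(float(sys.argv[1]))
```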

Again, I'm neither for nor against; I'm just trying to help by sharing my two cents.

@kofalt
Contributor

kofalt commented Aug 11, 2017

I think the goal here (the intended benefit) is enabling enforcement of adding some tests to any new code.

I agree that this PR might help accomplish that goal, but I disagree with the approach. I think we understand each other there :)

However, there are some drawbacks there, too

I agree with your drawbacks - these are some of the reasons our brief usage of Coveralls' "mark the PR red if % change is > X" feature was annoying and useless. This, to me, clearly marks a need for the human reviewer to make a human decision using the information at hand.

This is precisely the same as if I had added 500 lines of untested code with 500 # pragma: no cover comments: a human would subjectively review the PR and respond.

Contributor

@kofalt kofalt left a comment


I think we now understand both sides of the discussion, but I'm not convinced my position is going to change. 👎

@nagem
Contributor

nagem commented Aug 22, 2017

We discussed my opinion in a meeting, but I'll summarize here for transparency.

Ambrus, I appreciate your work - thank you for taking the time to investigate this option for us and give us a real, tangible solution to discuss. I'm glad we looked into this, but I don't believe the reality of the solution is something I would like to move forward with. In an effort to promote higher testing standards for our work, we are sacrificing code cleanliness, transparency of actual coverage, and maintainability. But I believe the biggest problem with this solution is that it doesn't directly address the problem of gaining more insight into how a PR changes the coverage.

In the meeting, we discussed using a different tool that will give us that insight. It looks like Codecov does a much better job of relaying how a PR would change the existing coverage, most importantly the coverage of the diff itself. I suggest we investigate using this tool and retire the artificial 100% coverage approach.
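
If we go that route, the relevant knob appears to be Codecov's patch status. A minimal codecov.yml sketch, assuming Codecov's documented status configuration (the 80% target is only an example):

```yaml
coverage:
  status:
    project:
      default:
        target: auto   # compare overall coverage against the base commit
    patch:
      default:
        target: 80%    # require this much of the PR diff to be covered
```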
