Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed RFC Suggestion Unit Tests and Code Coverage #7

Open
smurly opened this issue Sep 3, 2021 · 3 comments
Open

Proposed RFC Suggestion Unit Tests and Code Coverage #7

smurly opened this issue Sep 3, 2021 · 3 comments
Labels
rfc-suggestion Request for Comments for a Suggestion

Comments

@smurly
Copy link
Contributor

smurly commented Sep 3, 2021

Summary:

The objective of unit testing is the robust detection of regression early, allowing correction before issues integrate with other systems. If we allow regressions to persist, consumers of our code may inadvertently code to them, causing a negative outcome to the quality of the product.

What is the relevance of this feature?

Automation supports higher confidence in the quality of our product. Code coverage is a measurable proxy for confidence in code quality. While code coverage does not equate to code quality, the level of coverage correlates to perceived confidence in quality. Our atom editor unit test code coverage is below 42% with some individual dll's being below 12%; the common industry objective being greater than 80%. All software has defects; given that we are not covering our code, it is possible that defects exist in the uncovered areas.

Increased unit test code coverage

Unit tests are designed to be quick to run and can provide rapid verification of changes by any contributor to O3DE both as a local development verification and as a gated AR element. This represents the lowest cost solution in both implementation and ongoing maintenance. Emphasis should be placed on both increasing coverage and designing meaningful tests.
Investigating regressions found by unit tests is a much easier process than that of a regression failure in full stack integration tests which load editor. Unit tests tell you exactly where the failure has occurred and what the expectation that was not met is. This makes the return on investment from unit tests much higher than costly to implement full stack integration tests. We still need integration tests, however, where possible we should have unit tests.

How to accomplish increased unit test coverage:

  • Unit test creation training hosted as a SIG special meeting with a curriculum created by SIG-Testing and experienced developers
  • Create O3DE community documentation of how to and expectations of inclusion with contributions
  • When fixing a defect, the fix pull request should include a unit test that would detect a regression of the defect if possible.
  • Developers should critically review pull requests for inclusion of unit tests.

Code coverage collection

Code coverage is a helpful indication for confidence in quality, however, unit tests do not tell you if the function of your code is correct from an end user perspective only if the units of code are as asserted. It is possible to have 95% coverage and yet still have significant bug count once units of code integrate with other systems. Conversely, we can conclude that with low coverage numbers we cannot have confidence that the units of code are correct beyond compilation, it is an unknown quality state.
Code Coverage obtained manually is a critical source of gap analysis to identify where to add tests. Using coverage line information you can critically identify useful areas that can benefit from additional unit tests.
This proposal does not stipulate a minimum bar for coverage. Some areas of code are not practical to unit test since they would require mocking significant 3rd party dependencies or have other complexities that don't easily lend themselves to unit tests.

How to accomplish code coverage collection:

  • Developers should consider installing OpenCPPCoverage visual studio plugin and check line coverage to identify opportunities to add unit tests and cover uncovered code.
  • Developers should include information about code coverage in pull request descriptions.
  • Pull requests that add code, should demonstrably increase code coverage or contribute to coverage of the included code.

https://marketplace.visualstudio.com/items?itemName=OpenCppCoverage.OpenCppCoveragePlugin

@smurly smurly added the rfc-suggestion Request for Comments for a Suggestion label Sep 3, 2021
@Kadino
Copy link

Kadino commented Sep 3, 2021

"SDET team" should be "SIG-Testing"

OpenCPPCoverage is notably windows-only and, worringly, licensed under GPLv3. While it should not create an issue for a developer to use locally, any central use of it or tools which depend on it will need to be cleared with the Linux Foundation legal team. That said, SIG-Testing is currently trying to schedule an RFC for coverage analysis in AR.

Frankly, the best way to assure that code coverage stays the same or better is to have an automated tool in the pipeline that rejects changes that reduce the overall coverage percentage. These tools often have some type of override mechanism for critical changes, but can help correct death-by-papercuts that erodes test coverage. A recommendation to contributors is always good, but if there's no requirement before merging a PR then there's little to .

I want to encourage any 80% goal to point to only the code modified in a pull request. This can help coverage trend up over time as code gets modified and deprecated, without focusing on the discouraging mountain of tech debt. I'm quite passionate about creating a tool for this type of test-velocity report, and hope to be able to work on it.

Additionally, coverage is only one facet of writing a good test. Coverage helps show where there are no tests, and ask questions about how to add a test for a line of code. However it doesn't help identify things such as customer (programmer) use cases for a function, mathematical edge cases that can occur, or configuring buggy interactions around a changed interface.

@santorac
Copy link
Contributor

Discussion should include Atom Sample Viewer screenshot tests for consideration.
We should have guidelines as to what kind of code should have unit tests (easy to break, lots of dependencies, clear isolated interface, not integration-heavy, etc).

@jromnoa
Copy link

jromnoa commented Sep 15, 2021

Another consideration is the ROI on tests. Too many front GUI or full stack tests do not have a good ROI and are against the industry standard for most automation setups in a CI/CD world. It sounds like we just need to identify which portions of code are worth having unit tests and which are not. I am pointing this out because relying 100% on front end GUI / full stack tests can really bog down a pipeline due to how expensive they are to run, the amount of time they take to run, as well as the time it takes to debug them. We do the best we can to make this easier or to save time (like our upcoming parallel and batched test approach for front end Editor.exe tests) but ultimately unit tests are cheapest, fastest, and most concise to the error observed which increases the turnaround on fixing and updating the issue.

The unit tests aren't meant to be comprehensive or cover everything, but are useful for things like parsing functions, or functions that have a wide variance of accepted data types for its parameters, or other objects that can be easily mocked for unit tests. Anything that can't be easily mocked (such as the code that interacts with graphics drivers or code that has highly complex algorithms) are probably NOT good candidates to add unit tests for. Functions or objects that have a simple set of expected values are best for having unit tests coverage though and we ought to add them in those cases (and be sure to register them in CMakeLists.txt so AR runs them).

We just need to be careful about not piling too many front end GUI / full stack tests into the pipeline because it is antithesis to automation industry standards and will greatly slow down our merge updates in code. This is a valid concern and greatly affects ROI - CD/CI traditionally tries to avoid/limit/consolidate these front end test types to keep the pipeline running fast and efficient while also detecting the errors we want to detect. For more reading about this, see the Test Automation Pyramid philosophy: https://martinfowler.com/articles/practical-test-pyramid.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfc-suggestion Request for Comments for a Suggestion
Projects
None yet
Development

No branches or pull requests

4 participants