Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require contributors to write unit tests for pull requests #1586

Open
CitrusWire opened this issue Sep 30, 2020 · 24 comments
Open

Require contributors to write unit tests for pull requests #1586

CitrusWire opened this issue Sep 30, 2020 · 24 comments

Comments

@CitrusWire
Copy link

Describe the project you are working on:
n/a

Describe the problem or limitation you are having in your project:

I am currently evaluating Godot for use in my project. Part of this evaluation includes looking at the development environment around Godot; specifically, does it follow software-development best practices such as having requiring Unit Tests with PR's.
Unit Tests are required for infrastructure (which is what Godot is), to ensure there are no regressions between versions. Regressions between versions mean things that were working now break in an undocumented way.

A single regression can potentially waste thousands of person-hours across all Godot users:

  • Countless developers having to debug their games, including the entire QA process if they're an organisation.
  • Lots of community time spent discussing the issue and helping those debuggings.
  • Engine/library bugs are the hardest to discover because the End User has to consider that any bugs are in their own software first, so has to eliminate all possibilities that's the case before finally concluding it's a problem with the engine/library.

This is also evidenced in the research - https://www.researchgate.net/figure/IBM-System-Science-Institute-Relative-Cost-of-Fixing-Defects_fig1_255965523 - the cost to fix a bug in a Maintenance deployment is 100 times as expensive as fixing it in Design, as compared to just 6.5 times the cost at Implementation.

At the time of writing Godot has about 1000 (Open and Closed) issues with the word "regression" either in the text or using that flag; and this is likely an underestimate as not all regressions will be marked. This seems like a very large number of regressions for a project that prides itself on being such a small codebase.

If you want a stable, reliable game engine that is suitable for non-hobby projects, comprehensive test coverage is a must.

TL;DR: The problem is that I do not trust Godot to not introduce frequent regressions into my projects.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:
GoDot PR's should require Unit Tests.

By doing this, over time Godot will see a reduced number of regressions introduced to the codebase. This in turn will free up limited Godot dev resources to contribute to other things in the project, reduce the number of issues raised, and provide end-users with peace-of-mind that the core is stable.

Ideally over time Unit Tests should also be created for historical code.

Research has indicated that Unit Tests (at least when applied via Test Driven Development) can lead to "a significant increase in quality of the code (greater than two times) for projects developed using TDD" - https://dl.acm.org/doi/abs/10.1145/1159733.1159787 - "Additionally, the unit tests have served as auto documentation".

The downside is that "The projects also took at least 15% extra upfront time for writing the tests.".

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:
This is for the project to decide. This is a process/procedure/workflow suggestion, not a technical suggestion.
A code coverage counter could also be added to the Github project (you already have code helpers, lgtm alerts etc) - https://github.com/marketplace/codecov (free for OS).

If this enhancement will not be used often, can it be worked around with a few lines of script?:
n/a

Is there a reason why this should be core and not an add-on in the asset library?:
Due to its very nature this is about the development process behind the Godot Core.

@Calinou Calinou changed the title Godot Procedes - Require Unit Tests for PRs Require contributors to write unit tests for pull requests Sep 30, 2020
@Calinou
Copy link
Member

Calinou commented Sep 30, 2020

My opinion is that we should encourage writing tests (especially for veteran contributors), but not require them. Many of the occasional contributors we have aren't professinal software engineers. Expecting them to learn how to write C++ unit tests might deter them from contributing further, especially if it ends up with their pull requests not getting merged.

A code coverage counter could also be added to the Github project (you already have code helpers, lgtm alerts etc) - https://github.com/marketplace/codecov (free for OS).

See Supported languages in the Codecov documentation. It seems doing this with a C++ project not using CMake requires using gcov. I've tried it once and the results were not very meaningful due to the GDCLASS() macros artificially inflating the test coverage.

@KoBeWi
Copy link
Member

KoBeWi commented Sep 30, 2020

By doing this, over time Godot will see a reduced number of regressions introduced to the codebase.

And also smaller number of contributors, because making a PR will be more difficult. Also reviewing PRs will take more time, because you have to review tests too, so our backlog will increase even more.

@CitrusWire
Copy link
Author

And also smaller number of contributors, because making a PR will be more difficult. Also reviewing PRs will take more time, because you have to review tests too, so our backlog will increase even more.

Isn't this self-contradictory? If you have fewer contributors you'll get fewer PR's so your backlog won't increase as fast in the first place.

@bruvzg
Copy link
Member

bruvzg commented Sep 30, 2020

My opinion is that we should encourage writing tests (especially for veteran contributors), but not require them.

We should at least add checklist to the PR template, something along these lines (generated by https://www.talater.com/open-source-templates/#/), even if will be ignored by most people and not enforced in any way:

## Description
<!--- Describe your changes in detail -->

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran to -->
<!--- see how your change affects other areas of the code, etc. -->

## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [ ] I have read the **CONTRIBUTING** document.
- [ ] I have updated the documentation accordingly.
- [ ] I have added tests to cover my changes.
...

@Xrayez
Copy link
Contributor

Xrayez commented Sep 30, 2020

The problem is that I do not trust Godot to not introduce frequent regressions into my projects.

I hate to say this, but that's my actual experience. Unless you have your own unit/integration tests in your own project while developing with Godot and able to report problems yourself that way, regressions are very likely to happen in any long term project you'd like to pursue (frequent with master, not-so-rare with stable), although I have little experience with other open-source projects development to assert whether that's a problem with Godot development specifically or a problem with open-source projects at large.

Moreover, the project is developed actively by hundreds of contributors from all over the world, and having the ability to write tests gives some confidence boost for new contributors, and tests actually document how a particular feature can be used, even without necessarily writing documentation for the feature (which is an unspoken requirement currently, according to my observations).

Perhaps "require" is a strong word for this proposal. But I'd certainly require writing unit test for core data structures, especially by core developers themselves. People just have to be aware of this system that it actually exists and it's possible to write tests while submitting PRs.

Suggest that "writing tests increase the chances of something being merged", and you'll get a surge of PRs with tests. 🙂

@Calinou
Copy link
Member

Calinou commented Sep 30, 2020

even without necessarily writing documentation for the feature (which is an unspoken requirement currently, according to my observations).

Documenting your changes is a written requirement now. Compared to writing unit tests, it has more benefits for end users and the barrier to entry for new contributors is lower (but still present).

@bruvzg
Copy link
Member

bruvzg commented Sep 30, 2020

Documenting your changes is a written requirement now.

Yet almost no one is running --doctool before making PR, and even fewer people actually write the docs.

@YuriSizov
Copy link
Contributor

Well, I guess the questions is "What should be done with a good PR missing some essentials for the overall project health". What if the original author couldn't be bothered to make a sound PR with the docs and the tests? Should it be completely rejected? Should it be marked as salvageable in hopes that yet another contributor would pick that up? I can only imagine all this being a hard rule for the core team, as in people who are getting paid to do this and everyone in the organization to some extend. But I am not sure there is a good way to keep them improvements coming with random contributors not hitting the mark.

@Calinou
Copy link
Member

Calinou commented Sep 30, 2020

Yet almost no one is running --doctool before making PR, and even fewer people actually write the docs.

We're considering running --doctool in CI and fail if there's any difference, but we need to fix headless support first.

Edit: Done in godotengine/godot#48410.

@Jummit
Copy link

Jummit commented Sep 30, 2020

Note that Unit Tests have to be maintained as well. It's not a write once, keep forever thing.

At the time of writing Godot has about 1000 (Open and Closed) issues with the word "regression" either in the text or using that flag; and this is likely an underestimate as not all regressions will be marked. This seems like a very large number of regressions for a project that prides itself on being such a small codebase.

There are 460 issues with the tag, which I don't think is a very large number. But that's beside the point.

@Xrayez
Copy link
Contributor

Xrayez commented Sep 30, 2020

Note that Unit Tests have to be maintained as well. It's not a write once, keep forever thing.

Tests can be updated, rewritten, and even individual test cases can disabled without affecting the CI results and the production code, so it shouldn't be a problem in my opinion. If the test code is too old and doesn't work, it may be even worth to remove those kind of tests at some point.

That said, my development philosophy is "Some tests is better than no tests". 🙂

@CitrusWire
Copy link
Author

Note that Unit Tests have to be maintained as well. It's not a write once, keep forever thing.

Yes, but that cost is on the Godot developers, not the end users. This is as it should be - the people who should "pay" the cost of their code is the developers. (I am a developer, but not a Godot dev).

At the time of writing Godot has about 1000 (Open and Closed) issues with the word "regression" either in the text or using that flag; and this is likely an underestimate as not all regressions will be marked. This seems like a very large number of regressions for a project that prides itself on being such a small codebase.

There are 460 issues with the tag, which I don't think is a very large number. But that's beside the point.

Indeed, but you'll note I didn't just say using the flag, I explicitly stated "issues with the word 'regression'". Use the word regression and you get 735:
https://github.com/godotengine/godot/issues?q=is%3Aissue+regression
There's some overlap and no easy way to do an "or" search that I can see, so i added the two numbers up (1195) and said "about" 1000.

@dsnopek
Copy link

dsnopek commented Sep 30, 2020

Unit tests are good, and I'm super happy that there's more and more automated testing being added to the Godot project.

However, having a requirement for all PRs to have unit tests seems like it'd be pretty problematic, at least until Godot's test coverage gets a lot closer to 100%, because there are many parts of a general purpose game engine that are hard to test, especially one that runs on as many platforms that Godot runs on.

For example, I'm working on adding WebXR support to Godot (in godotengine/godot#42397).

This is code that can only run in HTML5, which is trickier to unit test. If someone had already fully solved, "how do we run tests for Godot on HTML5?" then that PR could add to that, but solving that problem is a pretty massive project in its own right. You could say the same thing for other tricky platforms like Android and iOS.

Also, the WebXR code is essentially integration code (connecting the WebXR APIs with Godot) making unit tests tricky (and not super valuable). What we'd really want is a functional test that makes sure that the result is that WebXR works as expected, which can't be just a unit test verifying that function inputs led to the right return value. But how do you do a functional test to determine an VR or AR device is working correctly? That's another big problem that would need to be solved. And you could say the same for other tricky to test features like physics and particle effects (which are non-deterministic), editor UI features (which would need some kind of functional desktop automation), etc, etc.

So, in summary: tests = good! But, IMO, requiring tests will only make sense once Godot has almost full test coverage, which is probably a long, long way off.

I think a better effort would be trying to get full (or nearly full) test coverage of the parts of the engine that are easy to test and where tests are high value. And then once that's achieved, require test coverage for all new PRs to those parts of the engine. Then, start progressively figuring out how to add tests to the hard-to-test parts of the engine (like Android, iOS, AR/VR, physics, etc) and only requiring tests for those parts once they have good test coverage as a base.

@lawnjelly
Copy link
Member

"I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail."
https://en.wikipedia.org/wiki/Law_of_the_instrument

Unit testing can potentially be helpful in some code areas (low level library functions for example), but not necessarily in others - it is quite controversial.

In games programming in particular there's a lot of areas where attempting unit testing could potentially be more of a hindrance rather than a help. Believing that introducing unit testing to every PR will significantly reduce regressions is a hypothesis - it may not turn out to be true.

You have experience of applying unit testing throughout a large-ish game engine?

TL;DR: The problem is that I do not trust Godot to not introduce frequent regressions into my projects.

Surely the simpler solution is for you to use another engine that uses unit testing throughout, and never introduces regressions?

@CitrusWire
Copy link
Author

TL;DR: The problem is that I do not trust Godot to not introduce frequent regressions into my projects.

Surely the simpler solution is for you to use another engine that uses unit testing throughout, and never introduces regressions?

Why would you not want Godot to be that engine? Surely the project and the engine should aspire to be something that its users can trust? Why would you want the project to say (to paraphrase) "go and use something else that's better tested" to those of us who are concerned at the number of regressions we can see in the tracker.

but not necessarily in others - it is quite controversial.

Please provide evidence to back up this claim. Anyone can say something is "controversial". Heck, Earth being a oblate spheroid (kind of) is somehow "controversial" despite it being 2020. I've provided my evidence which included peer reviewed research done at one of the largest software development companies in the world (the irony of it being Microsoft is not however lost on me ;-) ).

You have experience of applying unit testing throughout a large-ish game engine?

How is this pertinent? Or are you suggesting the only people who are allowed to comment on a particular area of Godot are people who have expertise in that area? That doesn't seem very inclusive.

The nature and scope of this proposal is for the project to decide, as it states explicitly states: "This is for the project to decide.". I merely raised it because I noticed a dearth of tests.

@lawnjelly
Copy link
Member

Apologies I did not want to come across as overly snarky, but the point is essentially this:

  • Yes we all agree that less bugs / regressions is good - this is a common theme in all software
  • Some people, making particular software, report beneficial results from unit testing

However you appear to extrapolate this to suggest that unit testing is the answer to all (or most) of the regressions in Godot:

Unit Tests are required for infrastructure (which is what Godot is), to ensure there are no regressions between versions. Regressions between versions mean things that were working now break in an undocumented way.

Unit tests DO NOT ensure there are no regressions between versions. For simple functional code you can verify the behaviour of the system. For complex systems, there are a huge number of inputs and an exponential number of possible states, which means that in a practical sense unit tests are often limited to dealing with specific types of bugs.

For instance, I work mostly on the rendering. Godot is designed to run on a large number of different devices - different CPUs, platforms, GPUs, driver and OS versions, and different configurations, and multithreaded, giving rise to race conditions. Many of the drivers will have bugs. For the vast majority of these we don't even have access to anything like the hardware that it is expected to run on (we don't even have Macs for testing Mac builds). We don't have reference rasterizers.

Currently, there is no practical way to perform rigorous testing on all these platforms combinations. As an open source project we don't have access to a vast array of test hardware. What we have to rely on instead is community testing. We release betas, and hope the public try them. If they don't test them enough, we get bugged releases.

Unit testing can help in some areas, but to somehow suggest it is the solution to all the bugs in a game engine is misguided.

@Calinou
Copy link
Member

Calinou commented Oct 11, 2020

We don't have reference rasterizers.

Software OpenGL/Vulkan implementations are sometimes considered reference rasterizers, but even those have bugs 🙂

@lawnjelly
Copy link
Member

We don't have reference rasterizers.

Software OpenGL/Vulkan implementations are sometimes considered reference rasterizers, but even those have bugs

Yes I'm hopeful this situation will improve with Vulkan, they are meant to be stricter with conformance testing. DirectX has been better in this regard, OpenGL is like the wild west. 😀

@CitrusWire
Copy link
Author

Apologies I did not want to come across as overly snarky

That's fine, it's a habit I share. :-)

However you appear to extrapolate this to suggest that unit testing is the answer to all (or most) of the regressions in Godot
...
Unit tests DO NOT ensure there are no regressions between versions.
...
which means that in a practical sense unit tests are often limited to dealing with specific types of bugs.
...
Currently, there is no practical way to perform rigorous testing on all these platforms combinations.
...
As an open source project we don't have access to a vast array of test hardware.
...
but to somehow suggest it is the solution to all the bugs in a game engine is misguided.

At no point have I said that unit tests would solve all of godot's problems. Nor am I saying godot needs perfect coverage, certainly not tomorrow or even within a year anyway. I am saying getting into the habit of committing unit tests with PR's is a good thing.

Unit tests are one tool in the arsenal of software development that provably help to produce better software. They don't solve all problems, but they do solve some, reduce the chance of some regressions, and as a bonus they can even help produce better software in the first place.

Some people, making particular software, report beneficial results from unit testing

Here you are trying to cast doubt on the study because it's not relating to game development.

So here's a paper written directly about game development. This is a game that used Unity, it's not a game engine itself, but it has useful insights nonetheless:
http://web-ext.u-aizu.ac.jp/~mozgovoy/homepage/papers/mozgovoy18c.pdf

It's worth a read, only 3 pages. The crux in relation to automated testing:

Literature on agile development speaks in favor of unit
testing, but one should note that the associated costs are
distributed unevenly. Sanderson [7] identifies two types of
code with high cost of unit testing: complex code with many
dependencies, and trivial code with many dependences
(“coordinators” between other code units). According to
Sanderson, complex code with many dependencies should be
refactored to separate algorithms from coordination.

Our experience shows that a game project has a large
proportion of both types of high-cost unit testing code. I
believe the primary reason for it is that the most cost-
efficient type of code (“complex code with few
dependencies” in Sanderson’s scheme) belongs to the game
engine such as Unity and 3rd-party libraries.

Or put another way: game engines are exactly where the low-cost (read: easy) unit tests go in a game.

There are plenty of other papers out there on the topic, including covering game programming, and from my reading around, they all seem to say some variant of: Tests work.

Or this paper:
https://dl.acm.org/doi/abs/10.1145/3239235.3268923 (full paper: https://sci-hub.scihubtw.tw/10.1145/3239235.3268923)
"Computer games are serious business and so is their quality: particularities of software testing in game development from the perspective of practitioners"

Which has this gem in its abstract:

In software testing, for instance, studies demonstrated the hesitance of professionals to use automated testing techniques with games...

So exactly what we're seeing in this thread. :-D

Amusing on-the-nose observations aside, that paper also highlights that all of the problems with unit testing on games are specific to games, not game engines:

In general, games differ from regular software due to specific
traits related to the complexity of human interactions characteristic
in this type of software. Therefore, there are metrics that are
difficult to observe and test, such as user behaviors, entertainment
and fun...

Again, testing doesn't solve everything and I'm not saying it will. But there's a considerable amount of literature out there that shows it almost certainly will make Godot quite a bit better in various ways.

</Sunday Science> ;-)

@clayjohn
Copy link
Member

@CitrusWire I don't think anyone is suggesting that we dont unit test. Godot has unit tests and many contributers are in the midst of increasing our testing coverage!

I think the disagreement here is on whether to make tests mandatory for all PRs (which is what this proposal suggests).

Put another way, we have two distinct propositions:

  1. Unit tests are a valuable tool for large projects (including game engines)
  2. All PRs must have accompanying unit tests

You can see that 1 does not necessarily imply 2, further the arguments you raise all support 1. I think we are all in agreement about 1. But I don't see any substantial support for 2.

What we could consider is making them mandatory for sections of code that already have full coverage, but I am not certain if any areas have full coverage yet. @Calinou, @RevoluPowered and @Xrayez will know best.

@WilliamTambellini
Copy link

WilliamTambellini commented Oct 13, 2020

My vote in favor of this proposal:

  • I also do not trust Godot to not introduce frequent regressions into my projects
  • As today, on a total of 4726 github issues, 3085 are flagged "bug", meaning 63% of bugs. As a comparison, TF has total 3700 opened issues with 'only' 1200 bugs, meaning 32%.

A 1st step forward would be to target at least the cpp core engine, taking into account the (legit) comment from @dsnopek
As today, could anyone confirm that the only cpp unit tests are the ones in:
https://github.com/godotengine/godot/tree/master/tests
?

@Xrayez
Copy link
Contributor

Xrayez commented Oct 13, 2020

As today, could anyone confirm that the only cpp unit tests are the ones in:
https://github.com/godotengine/godot/tree/master/tests
?

Mostly yes, except for the modules/*/tests which may contain test suites as well.

So, in summary: tests = good! But, IMO, requiring tests will only make sense once Godot has almost full test coverage, which is probably a long, long way off.

Also, I think it's not enough to write tests. One has to actually merge those test PRs without a fear of breaking anything (tests do not touch production code at all). For instance, godotengine/godot#42136 was written almost a month ago, I don't see how we can achieve good coverage quickly enough at this pace. 😛

I personally don't feel motivated to write tests due to this.

@Xrayez
Copy link
Contributor

Xrayez commented Oct 13, 2020

Also, PRs like godotengine/godot#42270 should likely have accompanying tests written, core methods are usually not really that documented (as they're not exposed) and I've seen complaints about this from users, so having those tests there could improve the documentation aspect. We use "doctest" framework. 🤔

But we should at least pre-populate empty test suites for each those core/ classes in tests/ before even attempting to require tests to be written in the first place...

@Calinou
Copy link
Member

Calinou commented Oct 13, 2020

As today, on a total of 4726 github issues, 3085 are flagged "bug", meaning 63% of bugs. As a comparison, TF has total 3700 opened issues with 'only' 1200 bugs, meaning 32%.

Remember that the main Godot repository's issue tracker is now meant to be used for bug reports only. This will naturally increase the percentage of issues labeled bug over time.

Also, TensorFlow is a framework with a smaller surface for bugs compared to a full-blown game engine with a GUI editor.

MarcusElg pushed a commit to MarcusElg/godot that referenced this issue Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests