
Add Unit Testing to SU2 #698

Closed · clarkpede opened this issue Jun 4, 2019 · 15 comments

@clarkpede (Contributor) commented Jun 4, 2019

I propose adding a unit-testing framework and unit-tests to SU2. After chatting with @economon, I've decided to move the discussion here to get additional input.

What is unit testing?

For those not familiar with it, unit testing exercises small pieces of behavior, ideally using isolated pieces of code. It is not intended to replace validation testing or formal verification tests; instead, it serves a unique purpose. Consider the three following use cases:

  • You're developing a new feature, and you want to test whether it works. You could run a full simulation, but that takes a lot of time and computing power. You want to check that the new code behaves as you expect before you throw a lot of resources at it.
  • You submit a PR and discover that one of the regression tests has failed. But... why? You know that something is broken, but it's hard to track down what broke. You want more granular test coverage that can demonstrate what broke.
  • You are fixing a very small bug. You know that you should prove that your bug fix worked, but it doesn't seem sensible to dedicate an entire validation case to one small bug fix. You want to write a small test for a small fix.

In all of these cases, unit testing fills a unique role. Unit testing increases time spent in development, but decreases the time spent on bug fixing and maintenance.

For more information, see this relevant Stack Exchange question.

What do I propose?

Our research group at UT Austin has implemented a unit testing framework on our branch, which we're happy with. Some choices were arbitrary, and some choices were made based on our development environment. Those choices may be different for other groups. Here's what we have done:

The unit testing framework is compiled and run using autotools. For more information on the autotools setup, see its documentation. Since autotools is the build system for SU2, this involves minimal changes.

With automake, building and running the unit tests becomes:

    ./bootstrap
    ./configure
    make
    make check

We use Boost's unit testing framework. This provides a convenient set of macros for instantiating tests, grouping tests into suites, and running checks. This choice was based on what is available in our development setup.
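To give a flavor of the macros, here is a minimal sketch using the header-only variant (the suite name, test, and values are illustrative, not actual SU2 code):

    #define BOOST_TEST_MODULE SU2UnitTests
    #include <boost/test/included/unit_test.hpp>  // header-only variant

    BOOST_AUTO_TEST_SUITE(GeometryToolbox)

    // Orthogonal vectors should have a vanishing dot product.
    BOOST_AUTO_TEST_CASE(DotProductOfOrthogonalVectors) {
      const double a[3] = {1.0, 0.0, 0.0};
      const double b[3] = {0.0, 1.0, 0.0};
      const double dot = a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
      BOOST_CHECK_SMALL(dot, 1e-15);
    }

    BOOST_AUTO_TEST_SUITE_END()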

We have integrated our unit tests into our Travis CI regression testing. Every time we push commits or submit a pull request, the unit tests are run and checked.

What is my vision for unit testing in SU2?

I am not proposing that we start trying to get 100% code coverage with pre-existing code. That would not provide a good return on investment.

Instead, I see people adding unit tests as they write new code and as they find bugs. For each new behavior added to SU2, tests are first added to document the related existing behavior. These tests serve to check that the existing behavior isn't damaged by the new code. Then new tests are added to prove that the new behavior is working correctly. For bug fixes, the process is simpler. A test is added to confirm that something is not behaving as expected. Then the code is fixed to make the test pass.
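As a concrete sketch of that workflow (the ComputeLimiter helper is hypothetical, standing in for real SU2 code):

    #define BOOST_TEST_MODULE WorkflowSketch
    #include <boost/test/included/unit_test.hpp>

    // Hypothetical helper standing in for real SU2 code: a limiter that
    // should clip negative ratios to zero and cap smooth regions at one.
    static double ComputeLimiter(double r) {
      if (r <= 0.0) return 0.0;
      return (r < 1.0) ? r : 1.0;
    }

    // Step 1: pin down the existing behavior, so new code cannot
    // silently break it.
    BOOST_AUTO_TEST_CASE(LimiterIsOneForSmoothSolutions) {
      BOOST_CHECK_CLOSE_FRACTION(ComputeLimiter(1.0), 1.0, 1e-12);
    }

    // Step 2: add a test for the new (or fixed) behavior; it fails
    // until the code is corrected, then passes.
    BOOST_AUTO_TEST_CASE(LimiterClipsNegativeRatios) {
      BOOST_CHECK_EQUAL(ComputeLimiter(-0.5), 0.0);
    }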

What frameworks are available?

For a unit testing framework, here are the most popular options, with the following pros and cons:

Roll-your-own

  • Requires no external dependencies
  • The most flexible option
  • Involves the most work to set up
  • Will lack some of the more advanced features of mature unit-testing frameworks

Boost Test

  • Can be header only, statically linked, or dynamically linked
  • If statically or dynamically linked, then Boost is not very lightweight
  • Easy to add if you're already using Boost

Google Test

  • Most common unit-testing framework
  • Can be easily combined with Google's powerful GMock mocking library
  • Compiling and linking can be somewhat painful

Catch2

  • Used by FEniCS
  • Makes unit tests easily readable with lots of syntactic sugar.
  • Has a very simple syntax (see the sketch after this list)
  • Is header-only
  • Requires C++11 for the full feature set, but offers a C++03 branch
  • Not as feature rich as Google Test or Boost Test
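To illustrate the syntax, here is a minimal Catch2 (v2, header-only) sketch; the Pressure helper is a stand-in, not actual SU2 code:

    #define CATCH_CONFIG_MAIN  // Let Catch2 generate main().
    #include "catch.hpp"

    // Hypothetical stand-in for SU2 code: ideal-gas pressure of a fluid
    // at rest, p = (gamma - 1) * rho_E.
    static double Pressure(double gamma, double rho_E) {
      return (gamma - 1.0) * rho_E;
    }

    TEST_CASE("Pressure is recovered from total energy", "[fluid]") {
      REQUIRE(Pressure(1.4, 2.5) == Approx(1.0));
    }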

Questions

  • How do developers feel about adding unit tests to SU2?
  • If a unit-testing framework were added to SU2, would you actually use it?
  • Do developers have a preference for (or experience with) any of the unit testing frameworks?
  • Should unit tests be expected when submitting PRs?
@economon (Member) commented Jun 5, 2019

An important, but currently missing, component of our testing/quality assurance procedures, in my opinion.

I would use it. For example, the output of the ComputeResidual() functions in each of the numerics classes is an obvious candidate for this. I can think of many other "units" throughout the code, but the scope is another open discussion. @clarkpede could you give a couple of examples of how you selected the units in your use cases?

No experience w/ the other frameworks. As we now include some Boost for Tecio anyway, could be another opportunity to consolidate.

As for PRs, I am open on this; we discussed the +/- of requiring docs and tests in PRs at the developers meeting. There are pros and cons, to be sure.

Would like to hear what others think too.

@juanjosealonso (Member) commented Jun 5, 2019 via email

@juanjosealonso (Member) commented Jun 5, 2019 via email

@clarkpede (Contributor, Author)

As requested, here's an example of a unit test that I made.

For context: there are a couple of different modes for the Roe-low-dissipation convective blending. If one of the "DUCROS" modes is selected, then the Ducros sensor values are used; otherwise, they're ignored. Before commit ac8b3bf, the SetRoe_Dissipation function checked whether the sensor values were valid regardless of the type of blending selected. Commit ac8b3bf changed the behavior to check the sensor values only if they will be used.

The unit test sets the convective blending to NTS, feeds invalid sensor values into SetRoe_Dissipation and checks the output.

// Headers needed to make this snippet self-contained (the SU2 headers
// declaring CConfig and CNumerics are omitted here):
#include <cmath>    // NAN
#include <cstdio>   // std::remove
#include <fstream>
#include <limits>
#include <string>

// Used to set the Roe-low-dissipation option
void WriteCfgFile(unsigned short nDim, const char* filename,
                  std::string blending) {
  std::ofstream cfg_file;

  cfg_file.open(filename, std::ios::out);
  cfg_file << "PHYSICAL_PROBLEM= NAVIER_STOKES" << std::endl;
  cfg_file << "ROE_LOW_DISSIPATION= " << blending << std::endl;

  cfg_file.close();
}

BOOST_AUTO_TEST_CASE(BadSensorsAllowedForNTS) {

  /*--- Setup ---*/

  const unsigned short nDim = 3;

  /*--- Set up the config class for the test ---*/
  char cfg_filename[100] = "convective_blending_test.cfg";
  WriteCfgFile(nDim, cfg_filename, "NTS");
  CConfig* config = new CConfig(cfg_filename, SU2_CFD, 0, 1, 2, VERB_NONE);
  std::remove(cfg_filename);

  /*--- Inputs ---*/
  const su2double dissipation_i = 0.4;
  const su2double dissipation_j = 0.6;
  const su2double sensor_i = NAN;   // Intentionally unphysical
  const su2double sensor_j = NAN;   // Intentionally unphysical

  /*--- Outputs ---*/
  su2double dissipation;

  /*--- Test ---*/

  CNumerics numerics;
  numerics.SetRoe_Dissipation(dissipation_i, dissipation_j,
                              sensor_i, sensor_j,
                              dissipation, config);

  const su2double tolerance = std::numeric_limits<su2double>::epsilon();
  BOOST_CHECK_CLOSE_FRACTION(dissipation, 0.5, tolerance);

  /*--- Teardown ---*/
  delete config;
}

There are a couple of problems I would fix if I had more time. Ideally, I would write the cfg file to an in-memory stream rather than to a file. And realistically, I shouldn't need a config file at all for a simple test like this.
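For reference, the in-memory version might look something like this sketch; it assumes a CConfig overload that accepts a std::istream, which doesn't exist yet:

    #include <sstream>
    #include <string>

    // Hypothetical: build the config text in memory instead of on disk.
    std::stringstream MakeCfgStream(const std::string& blending) {
      std::stringstream cfg;
      cfg << "PHYSICAL_PROBLEM= NAVIER_STOKES\n";
      cfg << "ROE_LOW_DISSIPATION= " << blending << "\n";
      return cfg;
    }

    // Usage (with the hypothetical stream-based constructor):
    //   std::stringstream cfg = MakeCfgStream("NTS");
    //   CConfig config(cfg, SU2_CFD, VERB_NONE);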

@talbring (Member) commented Jun 9, 2019

Thanks @clarkpede for taking the initiative on this topic. I think unit tests are a useful thing and we should think about having them in addition to the regression tests. Regarding the framework, I am actually a little bit hesitant to use Boost. Although we are already using it for tecio, in that case it is used in a part of the code which does not change frequently, so it is fine if we just ship it. However, if we start introducing it into the actual development process, people may want to use more and more features of Boost, and we will have a hard time maintaining versions, compatibilities, and so on. And in my opinion we should keep it as simple and lightweight as possible (one of our biggest strengths is the simple compilation/installation, which actually attracts a lot of users). So in that regard Catch2 looks like a better candidate to me. But I am happy to hear more opinions on that.

@clarkpede (Contributor, Author)

@talbring I agree with your assessment of Boost. I think it's a heavyweight solution to a lightweight use-case. We could always include just the unit-testing header (they offer a header-only version), but "people may want to use more and more features of boost," as you point out. If we as developers want to add Boost as a formal dependency for SU2, then that seems like a fine route. But I have the feeling that many developers do not want to add a Boost dependency.

Honestly, Boost UTF doesn't offer anything that we can't get from Google Test.

Catch2 is definitely the simplest and easiest of the unit-testing frameworks I listed. The only sticking point is that the full-feature version requires C++11; only a reduced-feature branch supports C++03.

@clarkpede (Contributor, Author)

I just found a blog post on the future directions of Catch2. There are a couple of important points for our discussion. The developer plans to adopt a hybrid approach, with:

  1. A stripped-down, header-only version.
  2. A full-feature, typical library (i.e. it must be compiled and linked)

This approach is very similar to Boost's setup. Google Test does not offer a header-only version.

Additionally, the developer plans to drop C++11 support and move to C++14. A simpler branch will still support C++03, but it's not clear which features that variant supports and which it doesn't. Google Test is also moving to support only C++11 in its next release, but its current release fully supports pre-C++11.

All of this discussion raises the question: Do we want to require C++11 for unit tests?

@economon (Member)

We already require C++11 for some more advanced features, but it is always nice in my opinion to keep backward compatibility when possible.

However, I don't think this is a deal breaker, as most developers who want to add their own unit tests should have no problem using C++11. If we can make it an optional dependency, to make sure the basic build still works simply, I think it could be OK.

stale bot commented Oct 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 30, 2019
@clarkpede (Contributor, Author)

This issue has been paused until after v7.0.0

@stale stale bot removed the stale label Oct 30, 2019
stale bot commented Dec 29, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is still a relevant issue please comment on it to restart the discussion. Thank you for your contributions.

@stale stale bot added the stale label Dec 29, 2019
@stale stale bot closed this as completed Jan 5, 2020
General Maintenance automation moved this from To do to Done Jan 5, 2020
@clarkpede (Contributor, Author)

No updates, but work will begin on this shortly.

@clarkpede clarkpede reopened this Jan 6, 2020
General Maintenance automation moved this from Done to In progress Jan 6, 2020
@stale stale bot removed the stale label Jan 6, 2020
stale bot commented Mar 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is still a relevant issue please comment on it to restart the discussion. Thank you for your contributions.

@stale stale bot added the stale label Mar 6, 2020
@clarkpede (Contributor, Author)

Updates were just pushed to the related PR.

@stale stale bot removed the stale label Mar 6, 2020
stale bot commented May 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is still a relevant issue please comment on it to restart the discussion. Thank you for your contributions.

@stale stale bot added the stale label May 5, 2020
@stale stale bot closed this as completed May 13, 2020
General Maintenance automation moved this from In progress to Done May 13, 2020