Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Measure test coverage and report to codecov.io, take 2 #202

Closed
wants to merge 47 commits into from
Closed

Measure test coverage and report to codecov.io, take 2 #202

wants to merge 47 commits into from

Conversation

ghost
Copy link

@ghost ghost commented May 17, 2018

Test-coverage reporting with gcov and result publishing to codecov

This pull request introduces coverage metrics for Icepack (e.g. #194), and submits them to codecov.io. See an example for Icepack here. The coverage metrics show what lines of source code are "hit" by the test suite, and which are not. The directory links on the bottom show the coverage metrics for the files in columnphysics and the driver.

Differences from PR #193

In #193 the code coverage was measured as part of the Travis-CI testing. It was decided that the coverage reporting is not necessary with every test build, and it would be preferable to manually invoke the reporting as needed.

I have added a flag --coverage to icepack.setup, which appends the required compiler and linker flags to the Macros.<environment> file in the case scripts directory. After the tests have succeeded, the new script configuration/scripts/tests/report_codecov.csh is invoked to copy the coverage results to the source folder and upload the coverage results to codecov.io, one test at a time. This step requires curl and bash. Codecov.io merges the results so the metrics will reflect the cumulative coverage across all tests.

Since the flags are appended directly from icepack.setup, it is not necessary to modify the Macros in order to get the coverage reporting working in a new host environment. Also, the --coverage option should work regardless of the chosen test suite.

Steps required for CICE-Consortium

Currently, the reporting is done to the codecov.io account linked to my github user (@anders-dc). I need somebody with administrative rights to @CICE-Consortium to go to the codecov.io/Icepack settings page for CICE-consortium, and provide me with the Repository Upload Token. After I get it, I will update the token in the new file configuration/scripts/tests/report_codecov.csh.

Disclaimer

I have only tested the --coverage option with gcc/gfortran.

Other changes in this PR

I transitioned from clang to GCC for linking purposes on Travis-CI (like I did with CICE). This should have no practical implications.


Developer(s): Anders Damsgaard, Princeton/NOAA-GFDL (github.com/anders-dc, adamsgaard.dk)

Are the code changes bit for bit, different at roundoff level, or more substantial? There are no changes to the underlying code.

Is the documentation being updated with this PR? (Y/N) No.

If not, does the documentation need to be updated separately? (Y/N) No.

@ghost
Copy link
Author

ghost commented May 17, 2018

I forgot to add:

Contrary to our discussion in #193 I chose not to implement an ICE_CODECOV environment variable, since the codecov state doesn't need to be communicated outside of the icepack.settings script.

@ghost
Copy link
Author

ghost commented May 18, 2018

Is there a way to have the code coverage test exclude any line with the string 'warn' in it? A lot of the misses are printed warnings that are not supposed to be accessed if the code is working correctly.

It is technically possible, but the solution I can think of would not be elegant as it would involve stripping the lines from the source code before compiling. A perfect test suite tests if non-working parameter sets actually raise warnings or errors, so I think it makes sense to leave them in. I.e. in cases you want to make sure that failures are handled correctly. However, I would not worry about getting to a coverage of 100%.

Also, which version of the code did you run this on? It looks like the code isn't completely up to date, e.g. line 652 of ice_atmo.F90
https://codecov.io/gh/CICE-Consortium/Icepack/src/master/columnphysics/icepack_atmo.F90
has the bug that was fixed in PR #196 (see https://github.com/CICE-Consortium/Icepack/pull/196/files).

The anders-dc/Icepack:codecov branch on Github contained in this PR is even with CICE-Consortium/Icepack:master, see the compare. The report you link to refers to an old coverage report on CICE-Consortium/Icepack:master, probably from the previous PR #193 that I put in. The newest report is available under the CICE-Consortium/Icepack:codecov branch on codecov.io. See the new report for icepack_atmo.F90 here. Codecov divides its reports into branches, with the same structure and naming as Github. When (and if) this PR is merged into CICE-Consortium/Icepack:master, the coverage report for CICE-Consortium/Icepack:master will be updated on codecov as well.

Copy link
Contributor

@apcraig apcraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great feature. Thanks @anders-dc for continuing to push this forward! A few comments/thoughts

Is curl part of the standard shell? Is it a package that needs to be installed? Could requirement of curl be a problem?

I like the idea of appending --coverage Macros directly from the icepack.setup script, but will those value depend on the compiler? For instance, it works with gnu as is. If --coverage can be done with intel or another compiler, would the Macros file changes be different. If so, then I think maybe the --coverage flags should be explicitly added to the Macros files. But only if we need that support for multiple compilers. If we want to say we only do codecov testing with gnu, I'm not opposed to that. But then we might want to add a check in the icepack.setup script that checks --coverage is only set when the compiler is gnu. Just to take this farther. Ultimately, it might be nice to NOT have a Macros file for each compiler on each machine. It might be nice to be able to generate Macros files for machines and compilers via a single script that reduces redundancy. We'd need to figure out how to best do that and that would make adding features like --coverage easier than changing every Macros file, and that's the trick that's being done here, but in the icepack.setup script. But that's a separate issue we should not tackle here.

Finally, just a quick question. What suite and what machine was the suite run on for this coverage test? Is it easy to tell on the codecov website or otherwise?

@ghost
Copy link
Author

ghost commented May 18, 2018

Thanks!

Is curl part of the standard shell? Is it a package that needs to be installed? Could requirement of curl be a problem?

The curl library is required for git (at least on debian (source)), so I would assume it would be available on most systems. However, I have in ddee782 made the report_codecov.csh script try to use wget if curl is not available. All systems I have encountered have one or the other installed. I also added a check to icepack.setup to make sure curl or wget is available before starting the runs. Alternatively, the codecov bash script could be downloaded beforehand and packaged with this repository (it's Apache licensed). Let me know what you think.

I like the idea of appending --coverage Macros directly from the icepack.setup script, but will those value depend on the compiler? For instance, it works with gnu as is. If --coverage can be done with intel or another compiler, would the Macros file changes be different. If so, then I think maybe the --coverage flags should be explicitly added to the Macros files. But only if we need that support for multiple compilers. If we want to say we only do codecov testing with gnu, I'm not opposed to that. But then we might want to add a check in the icepack.setup script that checks --coverage is only set when the compiler is gnu.

Good point. I chose to append the flags since it avoids multiple identical changes to the Macros files. However, with 47c9b2a I now explicitly check if --env gnu is set when --coverage is specified, and exits before the tests begin if this is not the case. For Intel compilers, it looks like there is a flag called -prof-gen=srcpos (Intel docs) which generates coverage files, but I have no experience with this and no idea if it would work with codecov or not.

However, is there anything in the source code that is compiler dependent? I.e. does the coverage for a given test suite depend on the environment? I understand that a user might want to use the --coverage flag with the same compiler as they normally use, but if results are the same with gnu it might not matter?

Just to take this farther. Ultimately, it might be nice to NOT have a Macros file for each compiler on each machine. It might be nice to be able to generate Macros files for machines and compilers via a single script that reduces redundancy. We'd need to figure out how to best do that and that would make adding features like --coverage easier than changing every Macros file, and that's the trick that's being done here, but in the icepack.setup script. But that's a separate issue we should not tackle here.

I think it would make sense to centralize the configuration and generate it "on the fly" to the host environment. Third party users might wonder why some environments are supported directly from this repository, while their favorite cluster and compiler is not. I guess the configurations in configuration/scripts/machines are more examples than anything else, but this might not be clear to the end user. Just my 2c.

Finally, just a quick question. What suite and what machine was the suite run on for this coverage test? Is it easy to tell on the codecov website or otherwise?

This was on travis with the gnu environment. I will look into if this information can be reported together with the results.

@ghost
Copy link
Author

ghost commented May 18, 2018

Alright, the git hash, the git branch, the hostname, and the test suite and id are now reported with the coverage report, and can be inspected under the Build tab:
https://codecov.io/gh/CICE-Consortium/Icepack/commit/c592730e803ac06dc4702d9956e6d372964202f5/build

EDIT: I improved the naming scheme in c592730, and updated the link above.

@ghost
Copy link
Author

ghost commented May 18, 2018

I suggest we leave the code coverage reporting enabled on Travis since the runtime increases are small. The total runtime is ~13 mins without --codecov and ~14 mins with --codecov enabled. This keeps the codecov page up to date without significantly affecting the feedback during development. For the upcoming implementation in CICE we will probably want to keep it turned off.

@apcraig
Copy link
Contributor

apcraig commented May 24, 2018

I'm still spinning up on how all this works. I'm fine keeping this "on" with travis-ci as long as it's cheap.

When I go here, https://codecov.io/gh/CICE-Consortium/Icepack/commit/c592730e803ac06dc4702d9956e6d372964202f5/build, what am I looking at? Is each large box "eea871b..." a new run of the suite. I assume newer results are in the top boxes? And the name in that box, " eea871b:HEAD on travis-job-cice-consortiu-icepack-380740950.travisci.net travisCI.travisCItest " is where the information is about where and what was tested? (By the way, there may be a typo somewhere, see "-consortiu-" above missing an "m"?).

Is this ready to test on another machine with a bigger suite? If so, I just need to add --coverage and make sure I'm compiling with gnu? I'd like to verify it works outside travis-ci.

We will need to add some words to the documentation.

@ghost
Copy link
Author

ghost commented May 25, 2018

When I go here, https://codecov.io/gh/CICE-Consortium/Icepack/commit/c592730e803ac06dc4702d9956e6d372964202f5/build, what am I looking at? Is each large box "eea871b..." a new run of the suite. I assume newer results are in the top boxes?

The page you link to contains the individual coverage reports for a single run of travis_suite. Each test in the test suite submits its own report, which is combined for the presented results.

And the name in that box, " eea871b:HEAD on travis-job-cice-consortiu-icepack-380740950.travisci.net travisCI.travisCItest " is where the information is about where and what was tested?

I define the name in the following format (sources: 1, 2, 3):

"`git rev-parse --short HEAD`:`git rev-parse --abbrev-ref HEAD` on `hostname` `basename '${dir}' | sed 's/_.*\././'`"
# which produces:
# "<short git commit hash>:<current git branch name> on <hostname> <test_suite_id>"

I guess it would make sense to add the name of the test itself to the name.

(By the way, there may be a typo somewhere, see "-consortiu-" above missing an "m"?).

This comes from the hostname command. Apparently, Travis (where this test report was generated) names its virtual machines after the Github user, but truncating the username to 14 characters.

Is this ready to test on another machine with a bigger suite? If so, I just need to add --coverage and make sure I'm compiling with gnu? I'd like to verify it works outside travis-ci.

It should be ready to test on a bigger suite. Just add --coverage. If the environment is something else than gnu, icepack.setup will throw an error (source).

We will need to add some words to the documentation.

Good point, I suggest we get to that when we are happy with the form of the implementation.

@apcraig
Copy link
Contributor

apcraig commented Jun 13, 2018

Sorry, I would like to try this on a machine before we commit. Will try to do that next week. I'm holding this up at this point.

@ghost
Copy link
Author

ghost commented Jun 13, 2018

No problem, let me know if you have any further questions or encounter any issues.

@apcraig
Copy link
Contributor

apcraig commented Jun 23, 2018

I finally got a chance to give this a try. Have several questions.

First, there are a couple issues in the script. I think we want to be able to run report_codecov.csh manually. A couple variables are not defined when running manually, ICE_SANDBOX and ICE_MACHINE_WKDIR. We need to either set those variables in the script or not use them. We could have the icepack.setup create the report_codecov.csh file on the fly, for instance. This is similar to what's done for results.csh in icepack.settings.

Second, this line, "set testdirs=ls -d ${ICE_MACHINE_WKDIR}/*" is too general. It picks up any cases ever run in that directory. At least, we need "set testdirs=ls -d ${ICE_MACHINE_WKDIR}/*.$testid but we might also want to manually generate the list of cases if we generate report_codecov.csh on the fly.

Finally, I am not seeing any results on the codecov.io website, https://codecov.io/gh/CICE-Consortium/Icepack for my sample test. I ran the quicksuite on gordon_gnu which created 3 tests. It definitely looks like some errors were generated at the end. I will send Anders the output from report_coverage.csh in email.

@ghost
Copy link
Author

ghost commented Sep 13, 2018

@apcraig I regret to let you know that I have no further time to work on this. Feel free to close this PR or take over the effort.

@apcraig
Copy link
Contributor

apcraig commented Sep 13, 2018

@anders-dc I understand. Thank you for all your help! It was already on my task list to take this on. Good luck, and I hope we'll meet again.

@apcraig
Copy link
Contributor

apcraig commented Jul 27, 2019

Updated as #264, this will be closed.

@apcraig apcraig closed this Jul 27, 2019
lettie-roach pushed a commit to lettie-roach/Icepack that referenced this pull request Oct 18, 2022
* Update the restart test script to also handle binary restart files.  Also add a test to compare the diagnostic output in the log for the final restart file

* Modify the log comparison test for restart cases to work for binary i/o cases as well

* Add new comparebfb.csh script that performs the restart file and log file comparison (bit-for-bit)

* Update the restart test script to use the new comparebfb.csh script

* Have cice.setup copy the new comparebfb.csh script to the casescripts directory

* Update comparebfb.csh script to allow specific files to be passed.  If comparing log files, and a difference is encountered, the difference is printed to the log.

* Remove the log comparison from the restart test script

* Modify the BFBCOMP code to now call the comparebfb.csh script

* Add a brief workaround to handle asterisks in the log file in comparebfb.csh
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants