
Conversation


@ZQyou ZQyou commented Feb 17, 2020

Fixes #1161.


pep8speaks commented Feb 17, 2020

Hello @ZQyou, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request! Do see the ReFrame Coding Style Guide.

Comment last updated at 2020-03-18 16:20:06 UTC

@jenkins-cscs

Can I test this patch?


codecov-io commented Feb 17, 2020

Codecov Report

Merging #1176 into master will decrease coverage by 0.26%.
The diff coverage is 95.83%.


@@            Coverage Diff             @@
##           master    #1176      +/-   ##
==========================================
- Coverage   91.92%   91.66%   -0.27%     
==========================================
  Files          84       84              
  Lines       12209    11850     -359     
==========================================
- Hits        11223    10862     -361     
- Misses        986      988       +2     
Impacted Files Coverage Δ
reframe/frontend/statistics.py 91.03% <94.73%> (+1.31%) ⬆️
reframe/frontend/cli.py 81.45% <100.00%> (+0.17%) ⬆️
unittests/test_cli.py 93.97% <100.00%> (+0.10%) ⬆️
reframe/core/modules.py 60.66% <0.00%> (-1.93%) ⬇️
reframe/core/deferrable.py 92.94% <0.00%> (-1.79%) ⬇️
reframe/core/systems.py 87.96% <0.00%> (-1.71%) ⬇️
unittests/test_launchers.py 92.72% <0.00%> (-1.48%) ⬇️
unittests/test_environments.py 72.07% <0.00%> (-1.22%) ⬇️
reframe/core/environments.py 88.28% <0.00%> (-1.09%) ⬇️
reframe/core/exceptions.py 84.09% <0.00%> (-0.81%) ⬇️
... and 32 more

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ece0880...4f13a7e. Read the comment docs.

@vkarak vkarak self-requested a review February 17, 2020 18:59
@vkarak vkarak requested a review from teojgo February 17, 2020 19:00
@vkarak vkarak added this to the ReFrame sprint 20.03 milestone Feb 17, 2020
@vkarak vkarak changed the title [wip][feat] Print summary table and run options for failures [feat] Print summary table and run options for failures Feb 27, 2020

@vkarak vkarak left a comment


Hi @ZQyou, thanks for your PR. I have some general comments first.

  • I think that the description column in the summary table is redundant; it just explains what each phase is.
  • I would call this part "Failure Statistics" and move it to a separate function from the standard summary report. A command-line option could control whether the statistics are printed or not.
  • Instead of printing the command-line options to rerun each test individually, I think it would be better to provide different groupings in the statistics (e.g., failures by programming environment, by system partition, by phase) and to list the failed tests as a |-separated list that could be passed to the -n command-line option. Something like the following:
Total number of tests: 6
Total number of failures: 5

Phase       #     Failing tests
----------- ----- ------------------------------------------------------------
setup       2     BadSetupCheck|BadSetupCheckEarly
sanity      1     SanityFailureCheck
performance 1     PerformanceFailureCheck
cleanup     1     CleanupFailTest

You could produce similar summaries per programming environment and per system partition as well.
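The grouping suggested above can be sketched in a few lines of Python. This is a hypothetical illustration, not ReFrame's actual implementation; the `failures` records are made up for the example:

```python
from collections import defaultdict

# Hypothetical failure records: (test name, phase in which it failed).
failures = [
    ('BadSetupCheck', 'setup'),
    ('BadSetupCheckEarly', 'setup'),
    ('SanityFailureCheck', 'sanity'),
    ('PerformanceFailureCheck', 'performance'),
    ('CleanupFailTest', 'cleanup'),
]

# Group failing test names by phase.
by_phase = defaultdict(list)
for name, phase in failures:
    by_phase[phase].append(name)

# Print the summary table; the |-separated list could be fed back to `-n`.
print('Total number of failures:', len(failures), '\n')
print(f"{'Phase':<11} {'#':<5} Failing tests")
print(f"{'-' * 11} {'-' * 5} {'-' * 60}")
for phase, names in by_phase.items():
    print(f"{phase:<11} {len(names):<5} {'|'.join(names)}")
```

The same grouping loop works for any key (programming environment, system partition) by changing what the records are keyed on.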

* Add failure_stats function to print failure statistics
* Add "Rerun as:" in failure report for run options

ZQyou commented Mar 4, 2020

@vkarak I have made changes according to your suggestions.

In my tests with release 2.21 and this branch, I found:

  1. The failure statistics are followed by one "reason" message from one of the failures.
  2. The situation is worse with 2.21: sometimes I cannot get correct statistics.

I think one of these issues has been fixed in the pre-release, but it needs more work?

Here is the last part of a log file from working directory:

[2020-03-04T02:23:41] info: reframe: ==============================================================================
FAILURE STATISTICS

Total number of tests: 668
Total number of failures: 35

Phase       #     Failing tests                                               
----------- ----- ------------------------------------------------------------
run         22    cp2k_version|cp2k_version|hdf5_default|hdf5_default|lammps_version|lammps_version|lammps_version|lammps_version|namd_jac10000_mpi_test|netcdf_default|netcdf_default|netcdf_ncdump_test|netcdf_ncdump_test|netcdf_version|netcdf_version|openmpi_default_pgi|openmpi_default_pgi|openmpi_module_path|openmpi_module_path|openmpi_test_mpi_placement|totalview_version|totalview_version
compile     10    hdf5_test|hdf5_test|netcdf_test|netcdf_test|netcdf_mpi_test|openmpi_test_pingpong|openmpi_test_hybrid_placement|parmetis_test|pnetcdf_test|pnetcdf_test
performance 1     mvapich2_rgr_omb_bw_18_0_3_2_3                              
sanity      2     singularity_instance_test|sratoolkit_test                   
______________________________________________________________________________
[2020-03-04T02:23:41] error: reframe: /users/PZS0710/zyou/test/reframe/dev/current/reframe.py: : could not show module 'lammps': command '['/usr/local/lmod/lmod/libexec/lmod', 'python', 'show', 'lammps']' failed with exit code 1:
=== STDOUT ===


vkarak commented Mar 4, 2020

@ZQyou Thanks for updating the PR. I've read your comments, but I haven't had the time to get back to you. I'll review your PR and get back to you asap.


vkarak commented Mar 5, 2020

The failure statistics is followed by one "reason" message from one of failures.

I haven't seen that in some runs using this branch. Perhaps it's a more general issue, and it could deserve a separate issue if you have a reproducer.

The situation is worse with 2.21. Sometimes I cannot get correct statistics.

That could also be a problem. If you have a reproducer it would help.

I think one of these issues has been fixed in pre-release but need more work?

I haven't seen anything related being merged since 2.21, so the bug, if it is one, may still be around...


vkarak commented Mar 5, 2020

As a final step, we will also need a unit test in test_cli.py that will exercise this option. You can simply invoke reframe on the unittests/resources/checks/frontend_check.py and check that key aspects of the relevant output are correct.
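The assertion part of such a unit test could look roughly like the sketch below. The sample output and helper name are made up for illustration; test_cli.py has its own machinery for invoking reframe and capturing its output:

```python
# Hypothetical sample of the output the test would capture from reframe
# when run with --failure-stats on frontend_check.py.
SAMPLE_OUTPUT = """\
FAILURE STATISTICS

Total number of tests: 6
Total number of failures: 5

Phase       #     Failing tests
----------- ----- ------------------------------------------------------------
setup       2     BadSetupCheck|BadSetupCheckEarly
"""


def assert_failure_stats(output):
    # Check the key aspects of the failure statistics output.
    assert 'FAILURE STATISTICS' in output
    assert 'Total number of failures' in output
    assert 'Failing tests' in output
    assert 'BadSetupCheck|BadSetupCheckEarly' in output


assert_failure_stats(SAMPLE_OUTPUT)
```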

* Print test cases in the table
* Add unit test for option `--failure-stats`

@vkarak vkarak left a comment


Thanks @ZQyou for updating the PR. Looks good now. I have just a couple of minor comments, which I could also address if you don't have time.


@vkarak vkarak left a comment


Unit tests are failing. I will do the small fix necessary.


@vkarak vkarak left a comment


LGTM now. Thanks @ZQyou for the PR.

@vkarak vkarak changed the title [feat] Print summary table and run options for failures [feat] Print failure statistics table as well as the run options for failures Mar 18, 2020

vkarak commented Mar 18, 2020

ok to test

@vkarak vkarak merged commit 007bc0b into reframe-hpc:master Mar 18, 2020


Development

Successfully merging this pull request may close these issues.

Print summary table and run options (optionally) for failures

5 participants