[feat] Abbreviate node lists in `FAILURE INFO` reports #1912

jgphpc · 2021-04-06T11:18:40Z

EDIT vkarak: This PR adds an algorithm for compressing node lists and returning a condensed string representation of them. This abbreviated form of node lists is used in FAILURE INFO reports, but not in the full JSON run report. The reason is that the JSON report is meant as raw report info that other tools can process, so it makes more sense imho not to abbreviate the node lists there.

codecov-io · 2021-04-06T11:20:44Z

Codecov Report

Merging #1912 (0bafe2f) into master (725c78f) will decrease coverage by 0.01%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##           master    #1912      +/-   ##
==========================================
- Coverage   87.90%   87.89%   -0.02%     
==========================================
  Files          49       49              
  Lines        8451     8459       +8     
==========================================
+ Hits         7429     7435       +6     
- Misses       1022     1024       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| reframe/core/schedulers/slurm.py | `52.15% <0.00%> (-0.29%)` | ⬇️ |
| reframe/core/schedulers/__init__.py | `98.40% <100.00%> (+0.03%)` | ⬆️ |
| reframe/frontend/cli.py | `76.03% <100.00%> (+0.04%)` | ⬆️ |
| reframe/frontend/statistics.py | `95.47% <100.00%> (+0.02%)` | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 725c78f...0bafe2f.

pep8speaks · 2021-04-07T04:43:36Z

Hello @jgphpc, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request! Do see the ReFrame Coding Style Guide.

Comment last updated at 2021-04-16 15:10:34 UTC

jgphpc · 2021-04-08T10:00:09Z

Note: we could prefer the hostlist format over nodelist, and/or switch to hostlist when `len(nodelist) > 1000`?

reframe/core/schedulers/__init__.py

vkarak

This solution is entirely Slurm-specific. I would prefer that we convert any host list to an abbreviated sequence ourselves. Algorithmically, it should not be difficult once we sort the nodes: it's practically a run-length encoding of the node IDs. As for the node name format, we can assume a generic pattern that ends in a sequence of consecutive numbers.
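To make the idea concrete, here is a minimal sketch of such a run-length encoding. It is not the code merged in this PR; the function name `abbreviate_nodes` and the output format `prefix[first-last]` are assumptions made for illustration. It assumes node names consist of an arbitrary prefix followed by a numeric suffix, and it only merges IDs that share the same prefix and the same zero-padded width.

```python
import re
from itertools import groupby


def abbreviate_nodes(nodes):
    """Collapse e.g. nid001, nid002, nid003 into nid[001-003].

    Hypothetical helper for illustration only.
    """
    parsed = []   # (prefix, digit width, numeric id)
    others = []   # names without a numeric suffix are kept verbatim
    for name in set(nodes):          # set() drops duplicate nodes
        m = re.fullmatch(r'(.*?)(\d+)', name)
        if m:
            prefix, digits = m.groups()
            parsed.append((prefix, len(digits), int(digits)))
        else:
            others.append(name)

    parsed.sort()
    chunks = []
    for (prefix, width), group in groupby(parsed, key=lambda t: t[:2]):
        ids = [t[2] for t in group]
        start = prev = ids[0]
        # The trailing None is a sentinel that flushes the last run
        for n in ids[1:] + [None]:
            if n is not None and n == prev + 1:
                prev = n
                continue
            if start == prev:
                chunks.append(f'{prefix}{start:0{width}d}')
            else:
                chunks.append(
                    f'{prefix}[{start:0{width}d}-{prev:0{width}d}]'
                )
            start = prev = n

    return ','.join(chunks + sorted(others))
```

For example, `abbreviate_nodes(['nid003', 'nid001', 'nid002', 'nid007'])` yields `nid[001-003],nid007`. Grouping by digit width keeps the zero padding unambiguous, at the cost of not merging, say, `nid9` with `nid10`.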

Also, we don't need command line options etc. for this. We only need a configuration parameter and an associated environment variable `RFM_ABBREV_NODELIST=<n>`. If `<n>` is zero, we don't abbreviate; otherwise we abbreviate any node list with size >= n.
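The threshold rule proposed here could look roughly like the sketch below. The helper names `abbrev_threshold` and `should_abbreviate` are hypothetical, and defaulting to 0 when the variable is unset is an assumption, not something stated in the thread.

```python
import os


def abbrev_threshold(environ=None):
    """Read <n> from RFM_ABBREV_NODELIST (assumed default: 0)."""
    environ = os.environ if environ is None else environ
    return int(environ.get('RFM_ABBREV_NODELIST', '0'))


def should_abbreviate(nodes, threshold):
    """Zero disables abbreviation; otherwise require len(nodes) >= n."""
    return threshold > 0 and len(nodes) >= threshold
```

With `RFM_ABBREV_NODELIST=2`, a two-node list would be abbreviated and a single-node list would not.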

reframe/core/schedulers/__init__.py

vkarak · 2021-04-15T11:36:35Z

@jgphpc I took care of the algorithm and it works nicely now. Can you address the rest of the comments?

Also, we don't need command line options etc. for this. We only need a configuration parameter and an associated environment variable `RFM_ABBREV_NODELIST=<n>`. If `<n>` is zero, we don't abbreviate; otherwise we abbreviate any node list with size >= n.

Also, I don't think that this conversion should be done in the backends, so all the changes in the scheduler backends should be reverted. This conversion is purely a presentation matter, so it has to go into the frontend, where we generate the final report.
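A rough sketch of that separation, under stated assumptions: the record keys `name` and `nodelist`, the output layout, and both function names are hypothetical, and the abbreviation routine is passed in as a parameter rather than taken from ReFrame.

```python
import json


def json_report(records):
    """Raw report: the node list is stored untouched for other tools."""
    return json.dumps(records, indent=2)


def failure_info(record, abbreviate):
    """Human-readable report: abbreviate only when rendering text."""
    nodes = abbreviate(record['nodelist'])
    return f"  * Test: {record['name']}\n  * Node list: {nodes}"
```

The data handed over by the scheduler backends stays a plain list; only the text renderer decides how to display it.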

unittests/test_utility.py

jjotero

Looks good!

unittests/test_utility.py

- Node lists are always abbreviated in the `FAILURE INFO` output but not in the JSON report.

victorusu · 2021-04-16T14:39:46Z

@vkarak, if I understand the last changes correctly, they imply that we will always use the abbreviated node list. Is that what we want?

vkarak · 2021-04-16T14:43:20Z

@victorusu Check the modified description of the PR. Yes, in FAILURE INFO there is no reason to use the non-abbreviated form. Conversely, the JSON report contains the full node list.

victorusu

LGTM. I just have one question: is the `r['nodelist']` array always populated? There used to be a check for whether it was empty, and there are some assertions requiring that it has at least one entry, so it got me a bit confused.

reframe/frontend/statistics.py

Abbreviate nodelist using hostlist format

7a1dbb1

jgphpc requested a review from vkarak April 6, 2021 11:18

jgphpc self-assigned this Apr 6, 2021

Merge branch 'master' into nodelist

0bafe2f

Merge branch 'master' into nodelist

769a0ad

jgphpc changed the title ~~[wip][feat] Abbreviate nodelist using hostlist format~~ [feat] Abbreviate nodelist using hostlist format Apr 8, 2021

vkarak added enhancement prio: normal labels Apr 8, 2021

vkarak added this to the ReFrame sprint 21.04.1 milestone Apr 8, 2021

victorusu reviewed Apr 12, 2021

reframe/core/schedulers/__init__.py (outdated)

vkarak requested changes Apr 12, 2021

reframe/core/schedulers/__init__.py (outdated)

vkarak self-assigned this Apr 13, 2021

Add utility function for abbreviating node lists

3934a7d

vkarak requested a review from victorusu April 15, 2021 11:37

Vasileios Karakasis added 3 commits April 15, 2021 13:41

Remove dead code and fix regex string

146582d

More comments and PEP8 fixes

21a6db9

Merge branch 'master' into nodelist

c28856c

vkarak requested a review from jjotero April 15, 2021 15:47

vkarak reviewed Apr 15, 2021

unittests/test_utility.py

jjotero approved these changes Apr 16, 2021

unittests/test_utility.py

Vasileios Karakasis added 2 commits April 16, 2021 16:36

Add unit test for node duplicates

f13e166

Revert changes related to configuration

5b69e45

- Node lists are always abbreviated in the `FAILURE INFO` output but not in the JSON report.

vkarak changed the title ~~[feat] Abbreviate nodelist using hostlist format~~ [feat] Abbreviate node lists in FAILURE INFO reports Apr 16, 2021

vkarak approved these changes Apr 16, 2021

victorusu approved these changes Apr 16, 2021

reframe/frontend/statistics.py

Merge branch 'master' into nodelist

f0e8d50

vkarak merged commit 91f481f into reframe-hpc:master Apr 16, 2021

jgphpc deleted the nodelist branch April 18, 2021 06:40
