Conversation


@jgphpc jgphpc commented Nov 20, 2020

thanks @ekouts


codecov-io commented Nov 20, 2020

Codecov Report

Merging #1613 (5ec66bc) into master (26da6e4) will decrease coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1613      +/-   ##
==========================================
- Coverage   87.54%   87.54%   -0.01%     
==========================================
  Files          44       44              
  Lines        7292     7288       -4     
==========================================
- Hits         6384     6380       -4     
  Misses        908      908              
Impacted Files            Coverage Δ
reframe/core/config.py    90.90% <0.00%> (-0.10%) ⬇️
reframe/frontend/cli.py   74.94% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 26da6e4...5ec66bc. Read the comment docs.

ekouts
ekouts previously requested changes Nov 23, 2020

@ekouts ekouts left a comment


Even though none of the tests fails, there is an error printed at the end:

[  PASSED  ] Ran 15 test case(s) from 10 check(s) (0 failure(s))
[==========] Finished on Fri Nov 20 15:11:26 2020 
./bin/reframe: rfm_MemoryMpiCheck_job.out: No such file or directory

It comes from these lines, but I am not sure how the test should be written to avoid it:

        regex_mem = r'^Currently avail memory: (\d+)'
        self.reference_meminfo = \
            sn.extractsingle(regex_mem, self.stdout, 1,
                             conv=lambda x: int(int(x) / 1024**3))

Also, this part doesn't need to be in a hook, since you are using a deferrable expression, right?
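To illustrate the point about deferrable expressions: in ReFrame, `sn.extractsingle()` returns an expression that is only evaluated later, when sanity/performance checking runs and the job output file exists; evaluating it eagerly (e.g. inside a hook that runs before the job) produces exactly a "No such file or directory" error. The following is a toy sketch of that lazy-evaluation idea, not ReFrame's actual implementation; the function name `extractsingle_deferred`, the file `job.out`, and the sample memory value are all my own illustrative assumptions.

```python
# Toy sketch of deferred evaluation (NOT ReFrame's real sn.extractsingle).
import re

def extractsingle_deferred(pattern, filename, group, conv=int):
    """Return a zero-argument callable that performs the extraction lazily,
    mimicking the spirit of a ReFrame deferrable expression."""
    def evaluate():
        # This open() fails with FileNotFoundError if called too early,
        # i.e. before the job has produced its output file.
        with open(filename) as fp:
            match = re.search(pattern, fp.read(), re.MULTILINE)
        return conv(match.group(group))
    return evaluate

# Hypothetical job output, written only after the job has run:
with open('job.out', 'w') as fp:
    fp.write('Currently avail memory: 63350956032\n')

expr = extractsingle_deferred(r'^Currently avail memory: (\d+)', 'job.out', 1,
                              conv=lambda x: int(int(x) / 1024**3))
print(expr())   # evaluated only now, when job.out exists -> 59
```

Building `expr` never touches the file; only calling `expr()` does, which is why such expressions belong in sanity/performance checking rather than in an early hook.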


jgphpc commented Nov 23, 2020

👍 I moved it out of the hook.
I still get: rfm_MemoryMpiCheck_job.out: No such file or directory

@jgphpc jgphpc changed the title [test] eatmem_mpi [test] Parallel version of the eat memory check Nov 23, 2020
@vkarak vkarak added this to the ReFrame sprint 20.18 milestone Nov 24, 2020

@vkarak vkarak left a comment


What I cannot quite see is the value of this test. Why wouldn't running the single-node eat memory test on multiple nodes do the same?


jgphpc commented Dec 2, 2020

> value of this test

It's an extension of the MemoryOverconsumptionCheck:

- it tests a Slurm flag with a fixed value:
  https://github.com/eth-cscs/reframe/blob/master/cscs-checks/system/slurm/slurm.py#L189
- I am trying to test values closer to the node limit, which I think is more flexible.
- Also, running on all cores is faster (not a huge difference), plus there is some performance reporting.


vkarak commented Dec 7, 2020

@jgphpc I did some minor enhancements to your test, but on Dom it fails to meet some of the performance references. Can you check? Then we can merge.

@vkarak vkarak changed the title [test] Parallel version of the eat memory check [test] Add MPI version of the eat memory check Dec 7, 2020
@vkarak vkarak requested a review from ekouts December 7, 2020 11:40
@vkarak vkarak dismissed ekouts’s stale review December 7, 2020 12:27

Changes done and we need to merge it today.

@vkarak vkarak merged commit 39aa59a into reframe-hpc:master Dec 7, 2020
@jgphpc jgphpc deleted the eatmem_mpi branch December 7, 2020 14:21
