@lucamar
Contributor

@lucamar lucamar commented May 8, 2020

I propose running Cp2kGpuCheck with 6 tasks per node and 2 CPUs per task, as this is slightly more efficient than the current Slurm setup.
I would also like to test the CP2K modules recently provided with the latest stable release, 7.1; however, the new modules are not yet the default.
Should we directly change the modules loaded in the existing checks, or should we rather create a new set of checks for the new versions with a new variant (e.g. dev) and keep the existing ones for the default modules only?
The latter would probably be the better solution for tracking the performance of the new vs. old release of applications: the new variant would result in checks named Cp2kCpuCheck_small_dev, etc.

Please let me know your opinion and I will adjust the checks accordingly.
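
For reference, the proposed layout can be sketched as a small helper that derives the corresponding Slurm options. This is only an illustration: the helper name `slurm_layout`, the node count of 6, and the choice of setting `OMP_NUM_THREADS` to the CPUs per task are assumptions, not taken from the check itself.

```python
def slurm_layout(num_nodes, tasks_per_node, cpus_per_task):
    """Derive sbatch options and totals for a hybrid MPI+OpenMP job.

    Hypothetical helper for illustration; it is not part of ReFrame.
    """
    return {
        'options': [
            f'--nodes={num_nodes}',
            f'--ntasks={num_nodes * tasks_per_node}',
            f'--ntasks-per-node={tasks_per_node}',
            f'--cpus-per-task={cpus_per_task}',
        ],
        # Cores occupied on each node by this layout
        'cores_per_node': tasks_per_node * cpus_per_task,
        # Assumption: one OpenMP thread per CPU assigned to the task
        'env': {'OMP_NUM_THREADS': str(cpus_per_task)},
    }


# Proposed setup: 6 tasks per node, 2 CPUs per task (node count is assumed)
layout = slurm_layout(num_nodes=6, tasks_per_node=6, cpus_per_task=2)
for opt in layout['options']:
    print(f'#SBATCH {opt}')
```

With these values each node runs 6 MPI ranks with 2 threads each, i.e. 12 cores per node are in use.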

@pep8speaks
pep8speaks commented May 8, 2020

Hello @lucamar, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request! Do see the ReFrame Coding Style Guide.

Comment last updated at 2020-06-09 21:12:30 UTC

@lucamar
Contributor Author

lucamar commented May 8, 2020

@jenkins-cscs retry dom

@lucamar
Contributor Author

lucamar commented May 11, 2020

The check on dom:mc failed since the nodes were not available:

Reason: job blocked error: [jobid=1015576] job cancelled because it was blocked due to a perhaps non-recoverable reason: 
ReqNodeNotAvail,  UnavailableNodes:nid00[002,038-040,043,404-405,418-419]

@lucamar
Contributor Author

lucamar commented May 12, 2020

@jenkins-cscs retry dom

@lucamar
Contributor Author

lucamar commented May 12, 2020

No chance to have 6 mc nodes on Dom:

dom101:~$ scontrol show nodes nid00[000-011,032-043,052-055,404-407,416-419,401-402] | grep -A 5 ActiveFeatures=mc | grep State
   State=MAINT+DRAIN ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=MAINT+DRAIN ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=RESERVED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   State=MAINT ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A

@vkarak
Contributor

vkarak commented May 14, 2020

Hi @lucamar, is this setup also valid for the default CP2K version? If yes, I would suggest keeping the default module and invoking reframe with -M to map it to the new version.

@lucamar
Contributor Author

lucamar commented May 14, 2020

@vkarak Thanks for your comment: yes, the new setup is also valid for the current default module.
Of course we can use the module mapping to test the new module, as I have been doing so far: the only disadvantage is that you need to issue two different commands for the gpu and mc checks, since the modules to be mapped are different.
I thought that it would be interesting to keep track of the performance of the new modules of supported applications before changing the defaults: maybe we can find a different way to do that...
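
To illustrate the two-command workflow described above, here is a minimal sketch that builds one `reframe -M` invocation per check. The module names are placeholders (the actual default and 7.1 module names are not given in this thread), and the helper `mapping_command` is hypothetical; `-M` takes a mapping of the form `old: new`, while `-n` and `-r` select and run the named check.

```python
import shlex

# Placeholder module names: the real names on the system are not stated here.
MODULE_MAPPINGS = {
    'Cp2kGpuCheck': ('CP2K', 'CP2K/7.1-gpu'),
    'Cp2kCpuCheck': ('CP2K', 'CP2K/7.1-mc'),
}


def mapping_command(check, old_module, new_module):
    """Build a shell-quoted reframe command mapping old_module to new_module."""
    args = ['reframe', '-M', f'{old_module}: {new_module}', '-n', check, '-r']
    return shlex.join(args)


# One command per check, since each one needs a different mapping
commands = [mapping_command(check, old, new)
            for check, (old, new) in MODULE_MAPPINGS.items()]
for cmd in commands:
    print(cmd)
```

Because the gpu and mc checks map the same default module to different targets, a single invocation cannot cover both, which is the inconvenience mentioned above.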

Reverting back to default CP2K modulefile for gpu and mc checks
@vkarak
Contributor

vkarak commented May 14, 2020

@lucamar What if we parametrized the test on the module name?
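
A framework-independent sketch of what parametrizing on the module name could produce is shown below. It does not use the ReFrame API: `CheckVariant`, `expand_variants`, the variant label `prod`, and the module names are all assumptions made for illustration. Only the naming pattern `Cp2kCpuCheck_small_dev` comes from the discussion above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CheckVariant:
    """One generated check: its name and the modules it would load."""
    name: str
    modules: tuple


def expand_variants(base, scales, modules_by_variant):
    """Create one check variant per (scale, variant label) combination."""
    return [
        CheckVariant(name=f'{base}_{scale}_{label}', modules=(module,))
        for scale, (label, module) in product(scales,
                                              modules_by_variant.items())
    ]


# Hypothetical labels and module names: 'prod' for the default module,
# 'dev' for the new release.
variants = expand_variants(
    'Cp2kCpuCheck',
    scales=['small', 'large'],
    modules_by_variant={'prod': 'CP2K', 'dev': 'CP2K/7.1'},
)
for v in variants:
    print(v.name, v.modules)
```

Each module then gets its own check name, so performance of the default and new releases could be tracked side by side from a single invocation.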

@codecov-commenter
Codecov Report

Merging #1304 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1304   +/-   ##
=======================================
  Coverage   91.66%   91.66%           
=======================================
  Files          83       83           
  Lines       12673    12673           
=======================================
  Hits        11617    11617           
  Misses       1056     1056           

Powered by Codecov. Last update dc1d118...c302a4b. Read the comment docs.

Contributor

@vkarak vkarak left a comment

lgtm

@vkarak vkarak changed the title from "[test] New Slurm setup for Cp2kGpuCheck and new modulefiles" to "[test] New Slurm setup for Cp2kGpuCheck" Jun 10, 2020
@vkarak vkarak merged commit c18cde8 into reframe-hpc:master Jun 10, 2020
@lucamar lucamar deleted the cp2k branch June 12, 2020 07:58
4 participants