Contributor

lucamar commented Aug 28, 2019

I have tested the MCH checks using the Cray Programming Environment 17.06 (module PE/17.06) and the MPI library openmpi/4.0.1-pgi-18.5-gcc-5.4.0-2.26-cuda-8.0.61 for PrgEnv-pgi/18.5.
The latter is not yet in production, since it is part of the pull request eth-cscs/production#1271; therefore, I had to use my sandbox to test it.
The Fieldextra checks cannot work at present, since the module fieldextra/12.7.5-gmvolf-17.02 is not yet in production.

Contributor Author

lucamar commented Aug 30, 2019

I have updated the reference value of CudaStressTest, since it was failing occasionally.
The change is from 2.12769 to 2.25, an increase of ~ 5.7%.
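
The size of that change can be double-checked with a couple of lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and do not come from the check itself.

```python
# Relative increase of the updated CudaStressTest reference value.
old_ref = 2.12769   # previous reference, as quoted above
new_ref = 2.25      # updated reference, as quoted above

rel_increase = (new_ref - old_ref) / old_ref
print(f"relative increase: {rel_increase:.2%}")
```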

Contributor Author

lucamar commented Aug 30, 2019

The failure of the checks HaloExchangeTest_nocomp and HaloExchangeTest_default on 144 MPI tasks (9 nodes) is unusual...

lucamar changed the title from "[test] MCH checks on the upgraded Kesch (RH7.5)" to "[test] MCH checks with CrayPE 17.06 on Kesch (RH7.5)" on Aug 30, 2019
vkarak requested review from teojgo and vkarak on September 2, 2019 13:28
vkarak assigned vkarak and lucamar and unassigned vkarak on Sep 2, 2019
vkarak added this to the ReFrame sprint 2019w35 milestone on Sep 2, 2019
Contributor

vkarak commented Sep 2, 2019

@lucamar I see very bad performance for the HaloExchangeTest_default test. Is this expected?

Reason: performance error: failed to meet reference: elapsed_time=30.3839, expected 5.53777 (l=-inf, u=6.3684354999999995)
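
To make the numbers in that message easier to read: the upper bound `u` is consistent with a 15% tolerance above the expected value (5.53777 × 1.15 ≈ 6.36844), and `l=-inf` means there is no lower bound. The 15% figure is inferred from the message, not stated anywhere in this thread, and the helper functions below are hypothetical illustrations rather than ReFrame's actual implementation.

```python
import math

def reference_bounds(ref, lower_frac=None, upper_frac=None):
    """Turn fractional tolerances into absolute bounds around a positive reference.

    A tolerance of None means "unbounded" on that side (hypothetical helper).
    lower_frac is expected to be negative, e.g. -0.1 for 10% below the reference.
    """
    lower = -math.inf if lower_frac is None else ref * (1 + lower_frac)
    upper = math.inf if upper_frac is None else ref * (1 + upper_frac)
    return lower, upper

def meets_reference(value, ref, lower_frac=None, upper_frac=None):
    """Return True if value lies within the bounds derived from ref."""
    lower, upper = reference_bounds(ref, lower_frac, upper_frac)
    return lower <= value <= upper

# Values taken from the error message; the 0.15 tolerance is an assumption.
expected = 5.53777
measured = 30.3839
lower, upper = reference_bounds(expected, None, 0.15)
ok = meets_reference(measured, expected, None, 0.15)
slowdown = measured / expected

print(f"bounds: ({lower}, {upper:.5f}), within bounds: {ok}, slowdown: {slowdown:.1f}x")
```

Under these assumptions the measured time is roughly five and a half times the expected value, far outside any plausible tolerance, which points to an outlier run rather than a reference that is slightly too tight.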

Contributor Author

lucamar commented Sep 3, 2019

> @lucamar I see very bad performance for the HaloExchangeTest_default test. Is this expected?
>
> Reason: performance error: failed to meet reference: elapsed_time=30.3839, expected 5.53777 (l=-inf, u=6.3684354999999995)

This did not happen so often in the past, unfortunately; we will need to monitor the check once we merge it.

Contributor

vkarak left a comment

lgtm. Thanks @lucamar

vkarak changed the title from "[test] MCH checks with CrayPE 17.06 on Kesch (RH7.5)" to "[test] Adapt MCH checks to CrayPE 17.06 on Kesch (RH7.5)" on Sep 3, 2019
vkarak merged commit 892af98 into reframe-hpc:master on Sep 3, 2019
lucamar deleted the kesch branch on September 3, 2019 17:04

5 participants