@lucamar
Contributor

@lucamar lucamar commented Mar 8, 2021

I have updated the checks defined on Eiger to run on Pilatus as well. Please note the following:

  • NCO_CDOModuleCompatibilityTest is expected to fail, since CDO is not currently available on Pilatus due to a Cray bug that should be fixed in PE 21.04.
  • I have not adapted affinity_check.py. @jjotero, could you please adapt it to Pilatus in a separate pull request?
  • I get the following message when I start the checks. @ajocksch and @sebkelle1, could you please review the corresponding files?
skipping incompatible test defined in class: HaloCellExchangeTest
skipping incompatible test defined in class: StridedBandwidthTest
skipping incompatible test defined in class: DGEMMTest
Restoring modules from user's PrgEnv-intel
Lmod Warning: One or more modules in your PrgEnv-intel collection have
changed: "intel".
To see the contents of this collection execute:
  $ module describe PrgEnv-intel
To rebuild the collection, do a module reset, then load the modules you wish,
then execute:
  $ module save PrgEnv-intel
If you no longer want this module collection execute:
  $ rm ~/.lmod.d/PrgEnv-intel

For more information execute 'module help' or see http://lmod.readthedocs.org/
No change in modules loaded.

It seems that the PrgEnv-intel collection cannot be restored correctly, so the check fails afterwards.

@pep8speaks

pep8speaks commented Mar 8, 2021

Hello @lucamar, Thank you for updating!

Line 89:45: W292 no newline at end of file

Line 72:61: W292 no newline at end of file

Line 66:11: W292 no newline at end of file

Line 108:20: W292 no newline at end of file

Line 35:14: E111 indentation is not a multiple of four
Line 35:14: E117 over-indented
Line 36:10: E901 IndentationError: unindent does not match any outer indentation level

Line 144:73: W292 no newline at end of file

Line 145:18: W292 no newline at end of file

Line 302:57: W292 no newline at end of file

Line 228:10: W292 no newline at end of file

Line 121:34: W292 no newline at end of file

Do see the ReFrame Coding Style Guide

Comment last updated at 2021-03-17 12:06:13 UTC

@jjotero
Contributor

jjotero commented Mar 8, 2021

@lucamar I've updated the affinity test in PR #1834

@vkarak vkarak added this to the ReFrame sprint 21.03.1 milestone Mar 8, 2021
@lucamar
Contributor Author

lucamar commented Mar 9, 2021

> @lucamar I've updated the affinity test in PR #1834

Thanks @jjotero. Unfortunately, Pilatus was not yet included in the ReFrame CI, and the check fails with PrgEnv-intel, like other checks (see my comment above).

self.valid_systems += ['eiger:mc', 'pilatus:mc']

# PrgEnv-intel on Pilatus does not feature cray-hdf5 as of PE 21.02
if self.current_system.name == 'pilatus':
Contributor

What about eiger? And PrgEnv-intel?

Contributor Author

The exception is only for Pilatus since at present PrgEnv-intel does not feature cray-hdf5 in PE 21.02.
There is no PrgEnv-intel on Eiger at the moment, so this is not defined in the config.py for the system.
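
The exception described above could be sketched roughly as follows. This is a standalone illustration, not the code of the check: the helper `supported_environs` and the table `MISSING_HDF5` are hypothetical names, and only the system and environment names are taken from this thread.

```python
# Hypothetical table: environments that lack cray-hdf5 on a given system.
# The single entry reflects what is reported in this thread for PE 21.02.
MISSING_HDF5 = {'pilatus': {'PrgEnv-intel'}}


def supported_environs(system_name, environs):
    """Return the environments that can run the HDF5 check on system_name."""
    excluded = MISSING_HDF5.get(system_name, set())
    return [env for env in environs if env not in excluded]


environs = ['PrgEnv-aocc', 'PrgEnv-cray', 'PrgEnv-gnu', 'PrgEnv-intel']
print(supported_environs('pilatus', environs))
print(supported_environs('eiger', environs))
```

Systems without an entry in the table keep the full list, so an environment that is simply not configured for a system (as with PrgEnv-intel on Eiger) needs no special case here.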

Contributor Author

You can check for yourself that the Intel compiler is missing from the output of the command module spider cray-hdf5/1.12.0.3:

uan04:~$ module spider cray-hdf5/1.12.0.3
----------------------------------------------------------------------------------------------------------------------------------
  cray-hdf5: cray-hdf5/1.12.0.3
----------------------------------------------------------------------------------------------------------------------------------
    You will need to load all module(s) on any one of the lines below before the "cray-hdf5/1.12.0.3" module is available to load.
      aocc/2.2.0.1
      cce/10.0.4
      cce/11.0.0
      cce/11.0.1
      cce/11.0.2
      cce/11.0.3
      gcc/10.1.0
      gcc/10.2.0
      gcc/9.3.0

@jgphpc
Contributor

jgphpc commented Mar 10, 2021

Note:


cscs-checks/apps/paraview/paraview_check.py        self.maintainers = ['JF', 'TM']
cscs-checks/libraries/boost/boost_python_check.py        self.maintainers = ['TM', 'AJ']
cscs-checks/libraries/io/hdf5_compile_run.py        self.maintainers = ['SO', 'RS']
cscs-checks/libraries/io/netcdf_compile_run.py        self.maintainers = ['AJ', 'SO']
cscs-checks/libraries/math/scalapack_compile_run.py        self.maintainers = ['CB', 'LM']
cscs-checks/microbenchmarks/cpu/alloc_speed/alloc_speed.py        self.maintainers = ['AK', 'VH']
cscs-checks/microbenchmarks/cpu/dgemm/dgemm.py        self.maintainers = ['AJ', 'VH']
cscs-checks/microbenchmarks/cpu/simd/vc.py        self.maintainers = ['JG']
cscs-checks/microbenchmarks/cpu/strided_bandwidth/strides.py        self.maintainers = ['SK']
cscs-checks/microbenchmarks/mpi/halo_exchange/halo_cell_exchange.py        self.maintainers = ['AJ']
cscs-checks/microbenchmarks/mpi/osu/osu_tests.py        self.maintainers = ['RS', 'AJ']
cscs-checks/prgenv/environ_check.py        self.maintainers = ['TM', 'CB', 'EK']
cscs-checks/prgenv/helloworld.py        self.maintainers = ['VH', 'EK']
cscs-checks/prgenv/mpi.py        self.maintainers = ['JG', 'AJ', 'RS']
cscs-checks/prgenv/mpi_t.py        self.maintainers = ['JG']
cscs-checks/prgenv/ulimit_check.py        self.maintainers = ['RS', 'CB']
cscs-checks/system/io/ior_check.py        self.maintainers = ['SO', 'GLR']
cscs-checks/system/slurm/slurm.py       self.maintainers = ['RS', 'VH', 'JG']
cscs-checks/tools/io/nco.py        self.maintainers = ['SO', 'CB']
cscs-checks/tools/profiling_and_debugging/notool.py        self.maintainers = ['JG', 'MKr']

'PrgEnv-intel', 'PrgEnv-gnu', 'PrgEnv-pgi',
self.valid_prog_environs = ['PrgEnv-aocc', 'PrgEnv-cray',
'PrgEnv-cray_classic', 'PrgEnv-gnu',
'PrgEnv-intel', 'PrgEnv-pgi',
Contributor

The goal of this check is to test default PrgEnv-xxx but I believe it may fail on eiger/pilatus:

Our config is:

            'name': 'PrgEnv-gnu',
            'target_systems': [
                'eiger', 'pilatus'
            ],
            'modules': [
                {'name': 'PrgEnv-gnu', 'collection': True}

Since module restore PrgEnv-xxx loads the PE from ~/.lmod.d/PrgEnv-xxx and $HOME is shared between the systems, the saved collection may differ from what the current system provides, in which case the restore will fail.
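
To make the difference concrete, here is a rough sketch of how module entries like the one in the config above would translate into Lmod commands. It is an illustration only, not ReFrame's implementation; `module_commands` is a hypothetical helper, and the only assumption is that an entry with 'collection': True is restored rather than loaded.

```python
def module_commands(modules):
    """Map module entries to the Lmod commands that would set them up."""
    commands = []
    for mod in modules:
        if mod.get('collection', False):
            # A collection is restored from the user's ~/.lmod.d directory,
            # which is why a shared $HOME matters here.
            commands.append(f"module restore {mod['name']}")
        else:
            # A plain module is resolved from the system's module path.
            commands.append(f"module load {mod['name']}")
    return commands


print(module_commands([{'name': 'PrgEnv-gnu', 'collection': True}]))
print(module_commands([{'name': 'PrgEnv-gnu'}]))
```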

Contributor Author

@lucamar lucamar Mar 10, 2021

For the moment the collections work on Eiger and Pilatus, with the exception of PrgEnv-intel only (failing on Pilatus).
Since module collections for PrgEnv-xxx modules will be dropped by Cray as of PE 21.04, it won't be an issue.
Anyway, Lmod provides the environment variable $LMOD_SYSTEM_NAME that can be uniquely defined for each system to avoid conflicts: see https://lmod.readthedocs.io/en/latest/010_user.html#user-collections-on-shared-home-file-systems
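
A small sketch of why this avoids the conflict, assuming Lmod stores a collection as `<name>.<LMOD_SYSTEM_NAME>` under ~/.lmod.d when the variable is set (see the linked page for the authoritative behaviour). The helper `collection_path` and the home directory used in the example are hypothetical.

```python
import posixpath


def collection_path(name, home, system_name=None):
    """Return the file assumed to back a user collection.

    When system_name is given, it is appended as a suffix, so each
    system sharing the same $HOME gets its own collection file.
    """
    suffix = f'.{system_name}' if system_name else ''
    return posixpath.join(home, '.lmod.d', name + suffix)


home = '/users/example'  # hypothetical shared home directory
print(collection_path('PrgEnv-intel', home))
print(collection_path('PrgEnv-intel', home, 'eiger'))
print(collection_path('PrgEnv-intel', home, 'pilatus'))
```

Without a system name both systems resolve to the same file; with it, each system reads and writes a distinct one.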

Contributor

self.valid_prog_environs = ['PrgEnv-cray', 'PrgEnv-gnu', 'PrgEnv-pgi',
'PrgEnv-intel', 'PrgEnv-aocc']
'eiger:mc', 'pilatus:mc']
self.valid_prog_environs = ['PrgEnv-aocc', 'PrgEnv-cray', 'PrgEnv-gnu',
Contributor

        self.valid_prog_environs = ['PrgEnv-aocc', 'PrgEnv-cray', 'PrgEnv-gnu',
                                    'PrgEnv-intel', 'PrgEnv-pgi',
                                    'cpeAMD', 'cpeCray', 'cpeGNU', 'cpeIntel']

@jgphpc
Contributor

jgphpc commented Mar 11, 2021

I have reviewed the checks where I am maintainer, lgtm.
I will let the others review their checks.

@codecov-io

Codecov Report

Merging #1839 (868801c) into master (ed40b22) will increase coverage by 0.21%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1839      +/-   ##
==========================================
+ Coverage   87.61%   87.83%   +0.21%     
==========================================
  Files          49       49              
  Lines        8141     8395     +254     
==========================================
+ Hits         7133     7374     +241     
- Misses       1008     1021      +13     

Impacted Files               Coverage Δ
reframe/core/variables.py    94.83% <0.00%> (-1.64%) ⬇️
reframe/core/pipeline.py     91.96% <0.00%> (-0.02%) ⬇️
reframe/core/meta.py         100.00% <0.00%> (ø)
reframe/core/parameters.py   98.50% <0.00%> (+0.04%) ⬆️
reframe/core/namespaces.py   95.00% <0.00%> (+1.89%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ed40b22...868801c.

self.variables = {
'OMP_NUM_THREADS': str(self.num_cpus_per_task)
}
}
Contributor

line 117: this PrgEnf-pgi looks like a typo

Contributor

line 31: you may want to remove PrgEnv-cray_classic as it is no longer part of config/cscs.py
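
Both this and the typo noted above could be caught automatically by comparing the names used in a check against the configured environments. The sketch below is one possible way to do that; `unknown_environs` is a hypothetical helper, and `KNOWN_ENVIRONS` is an assumed list for illustration, not the actual contents of config/cscs.py.

```python
import difflib

# Assumed set of configured environments, for illustration only.
KNOWN_ENVIRONS = ['PrgEnv-aocc', 'PrgEnv-cray', 'PrgEnv-gnu',
                  'PrgEnv-intel', 'PrgEnv-pgi']


def unknown_environs(environs, known=KNOWN_ENVIRONS):
    """Map each name missing from known to its closest known name, if any."""
    report = {}
    for env in environs:
        if env not in known:
            close = difflib.get_close_matches(env, known, n=1)
            report[env] = close[0] if close else None
    return report


report = unknown_environs(['PrgEnv-gnu', 'PrgEnf-pgi', 'PrgEnv-cray_classic'])
for name, suggestion in report.items():
    print(f'{name}: not configured, closest match: {suggestion}')
```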

@lucamar
Contributor Author

lucamar commented Apr 26, 2021

I'm closing since this pull request is now outdated and replaced by #1950.

@lucamar lucamar closed this Apr 26, 2021
@lucamar lucamar deleted the addons-pilatus branch April 26, 2021 17:54