Merged
36 commits
1f24f72
Add utility for processor information
ekouts Mar 22, 2021
0ce02f4
Fix bug
ekouts Mar 22, 2021
73e8af3
Fix crashing when files or modules are not available
ekouts Mar 23, 2021
ed5efa3
Merge branch 'master' of https://github.com/eth-cscs/reframe into fea…
ekouts Mar 23, 2021
4a2a963
Fix archspec import
ekouts Mar 23, 2021
e2529c1
Add archspec in reframe requirements
ekouts Mar 23, 2021
6d9c8c7
Add experimental CLI option for accessing now the auto-config
May 22, 2021
5ae266e
Merge branch 'master' into feat/cpu-autodetect
May 22, 2021
c367c26
Merge branch 'master' into feat/cpu-autodetect
May 26, 2021
12f11b3
Add the command-line option
May 26, 2021
9bec400
Auto-detect topology
May 26, 2021
055d2b7
Add log calls
May 29, 2021
b33bfdd
Use parallel launcher
May 29, 2021
b6c329f
Fix doc build + wheel creation
May 29, 2021
72c6164
Remove unused imports
May 29, 2021
45f86f4
Add unit tests
May 30, 2021
8f1500b
Fine tune implementation
May 31, 2021
43adf18
Merge branch 'master' into feat/cpu-autodetect
Jun 1, 2021
6522852
Load device metadata files
Jun 1, 2021
c030e3a
More unit tests for topology auto-detection
Jun 2, 2021
071e545
Remove unused imports
Jun 2, 2021
7ea6115
Add documentation
Jun 6, 2021
dbb5823
Merge branch 'master' into feat/cpu-autodetect
Jun 9, 2021
9c10cbe
Address PR comments
Jun 29, 2021
0953adc
WIP: Address PR comments
Jul 1, 2021
0f87547
Fix UnboundLocalError
Jul 1, 2021
9617a4b
Fix PEP8 issues
Jul 2, 2021
e22f28d
Fix remote detection
Jul 2, 2021
ac53c8d
Temporarily change pull repo and branch
Jul 2, 2021
059ebe0
Fix reframe executable
Jul 2, 2021
71c2b26
Improve remote processor detection
Jul 5, 2021
d7b4133
Fix remote detection
Jul 6, 2021
267d9e0
Re-add the temp dir removal
Jul 6, 2021
d942a14
Update documentation
Jul 8, 2021
a919de0
Merge branch 'master' into feat/cpu-autodetect
Jul 9, 2021
6e5a086
Lock meta-config processor info file for reading/writing
Jul 9, 2021
26 changes: 26 additions & 0 deletions docs/config_reference.rst
@@ -341,9 +341,13 @@ System Partition Configuration
    :default: ``{}``

    Processor information for this partition stored in a `processor info object <#processor-info>`__.
+   If not set, ReFrame will try to auto-detect this information (see :ref:`proc-autodetection` for more information).

    .. versionadded:: 3.5.0

+   .. versionchanged:: 3.7.0
+      ReFrame is now able to detect the processor information automatically.
+

 .. js:attribute:: .systems[].partitions[].devices

@@ -1201,6 +1205,28 @@ General Configuration
    The command-line option sets the configuration option to ``false``.


+.. js:attribute:: .general[].remote_detect
+
+   :required: No
+   :default: ``false``
+
+   Try to auto-detect processor information of remote partitions as well.
+   This may slow down the initialization of the framework, since it involves submitting auto-detection jobs to the remote partitions.
+   For more information on how ReFrame auto-detects processor information, you may refer to :ref:`proc-autodetection`.
+
+   .. versionadded:: 3.7.0
+
+
+.. js:attribute:: .general[].remote_workdir
+
+   :required: No
+   :default: ``"."``
+
+   The temporary directory prefix that will be used to create a fresh ReFrame clone, in order to auto-detect the processor information of a remote partition.
+
+   .. versionadded:: 3.7.0
+
+
 .. js:attribute:: .general[].ignore_check_conflicts

    :required: No
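As a rough illustration of the two options documented in this hunk, a site configuration file could enable remote detection along the following lines. This is only a sketch: the other configuration sections are omitted, and the `/scratch/tmp` prefix is a made-up path, not something taken from this PR.

```python
# Hypothetical excerpt of a ReFrame Python configuration file.
site_configuration = {
    # 'systems', 'environments' and 'logging' sections omitted for brevity
    'general': [
        {
            # Also submit auto-detection jobs to remote partitions
            # (documented default: false)
            'remote_detect': True,
            # Prefix under which the temporary ReFrame clone is created
            # (hypothetical path; documented default: '.')
            'remote_workdir': '/scratch/tmp',
        }
    ],
}
```

Leaving both keys out keeps the documented defaults, in which case only local partitions are auto-detected.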
40 changes: 40 additions & 0 deletions docs/configure.rst
@@ -397,3 +397,43 @@ Let's see some concrete examples:
    "CC"

 If you explicitly query a configuration value which is not defined in the configuration file, ReFrame will print its default value.
+
+
+.. _proc-autodetection:
+
+Auto-detecting processor information
+------------------------------------
+
+.. versionadded:: 3.7.0
+
+.. |devices| replace:: :attr:`devices`
+.. _devices: config_reference.html#.systems[].partitions[].devices
+.. |processor| replace:: :attr:`processor`
+.. _processor: config_reference.html#.systems[].partitions[].processor
+.. |remote_detect| replace:: :attr:`remote_detect`
+.. _remote_detect: config_reference.html#.general[].remote_detect
+
+ReFrame is able to detect the processor topology of both local and remote partitions automatically.
+The processor and device information are made available to the tests through the corresponding attributes of the :attr:`~reframe.core.pipeline.RegressionTest.current_partition`, allowing a test to modify its behavior accordingly.
+Currently, ReFrame supports auto-detection of the local or remote processor information only.
+It does not support auto-detection of devices, in which case users should explicitly specify this information using the |devices|_ configuration option.
+The processor information auto-detection works as follows:
+
+#. If the |processor|_ configuration option is defined, then no auto-detection is attempted.
+
+#. If the |processor|_ configuration option is not defined, ReFrame will look for a processor configuration metadata file in ``{configdir}/_meta/{system}-{part}/processor.json`` or in ``~/.reframe/topology/{system}-{part}/processor.json`` in case of the builtin configuration file.
+   If the file is found, the topology information is loaded from there.
+   These files are generated automatically by ReFrame from previous runs.
+
+#. If the corresponding metadata files are not found, the processor information will be auto-detected.
+   If the system partition is local (i.e., ``local`` scheduler + ``local`` launcher), the processor information is auto-detected unconditionally and stored in the corresponding metadata file for this partition.
+   If the partition is remote, ReFrame will not try to auto-detect it unless the :envvar:`RFM_REMOTE_DETECT` environment variable or the |remote_detect|_ configuration option is set.
+   In that case, the steps to auto-detect the remote processor information are the following:
+
+   a. ReFrame creates a fresh clone of itself in a temporary directory created under ``.`` by default.
+      This temporary directory prefix can be changed by setting the :envvar:`RFM_REMOTE_WORKDIR` environment variable or the :js:attr:`remote_workdir` general configuration option.
+   b. ReFrame changes to that directory and launches a job that will first bootstrap the fresh clone and then run that clone with ``{launcher} ./bin/reframe --detect-host-topology=topo.json``.
+      The :option:`--detect-host-topology` option causes ReFrame to detect the topology of the current host,
+      which in this case would be the remote compute nodes.
+
+In case of errors during auto-detection, ReFrame will simply issue a warning and continue.
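The lookup order described in this section can be sketched in plain Python as follows. This is not the code from the PR: the function name, its parameters and the `detect` callable are all hypothetical, and writing the metadata file for remote partitions as well is an assumption (the text only states it explicitly for local ones).

```python
import json
import os


def resolve_processor_info(part_config, metadata_file, is_local,
                           remote_detect, detect):
    '''Return ``(processor_info, source)`` following the documented order.

    All names here are illustrative. ``detect`` is any callable that
    returns a dict with the detected processor information.
    '''
    # 1. An explicit `processor` entry disables auto-detection
    if part_config.get('processor'):
        return part_config['processor'], 'config'

    # 2. Reuse the metadata file written by a previous run
    if os.path.exists(metadata_file):
        with open(metadata_file) as fp:
            return json.load(fp), 'metadata'

    # 3. Detect: always for local partitions, opt-in for remote ones
    if not is_local and not remote_detect:
        return {}, 'none'

    try:
        info = detect()
    except Exception as err:
        # Errors are not fatal: warn and continue without the information
        print(f'WARNING: auto-detection failed: {err}')
        return {}, 'none'

    # Cache the result for subsequent runs (assumed for remote partitions)
    os.makedirs(os.path.dirname(metadata_file) or '.', exist_ok=True)
    with open(metadata_file, 'w') as fp:
        json.dump(info, fp, indent=2)

    return info, 'detected'
```

A second call with the same `metadata_file` takes the second branch and never invokes `detect`, which is why remote detection only slows down the first run.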
42 changes: 41 additions & 1 deletion docs/manpage.rst
@@ -576,6 +576,16 @@ Miscellaneous options

    This option can also be set using the :envvar:`RFM_SYSTEM` environment variable.

+.. _--detect-host-topology:
+
+.. option:: --detect-host-topology[=FILE]
+
+   Detect the local host processor topology, store it to ``FILE`` and exit.
+   If no ``FILE`` is specified, the standard output will be used.
+
+   .. versionadded:: 3.7.0
+
+
 .. option:: --failure-stats

    Print failure statistics at the end of the run.
@@ -698,6 +708,36 @@ Here is an alphabetical list of the environment variables recognized by ReFrame:
       ================================== ==================


+.. envvar:: RFM_REMOTE_DETECT
+
+   Auto-detect processor information of remote partitions as well.
+
+   .. table::
+      :align: left
+
+      ================================== ==================
+      Associated command line option     N/A
+      Associated configuration parameter :js:attr:`remote_detect` general configuration parameter
+      ================================== ==================
+
+   .. versionadded:: 3.7.0
+
+
+.. envvar:: RFM_REMOTE_WORKDIR
+
+   The temporary directory prefix that will be used to create a fresh ReFrame clone, in order to auto-detect the processor information of a remote partition.
+
+   .. table::
+      :align: left
+
+      ================================== ==================
+      Associated command line option     N/A
+      Associated configuration parameter :js:attr:`remote_workdir` general configuration parameter
+      ================================== ==================
+
+   .. versionadded:: 3.7.0
+
+
 .. envvar:: RFM_GRAYLOG_ADDRESS

    The address of the Graylog server to send performance logs.
@@ -920,7 +960,7 @@ Here is an alphabetical list of the environment variables recognized by ReFrame:
       :align: left

       ================================== ==================
-      Associated command line option     n/a
+      Associated command line option     N/A
       Associated configuration parameter :js:attr:`resolve_module_conflicts` general configuration parameter
       ================================== ==================

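For scripting around the new `--detect-host-topology[=FILE]` option, something like the following wrapper might be used. It is a sketch under assumptions: the helper names are invented, the `./bin/reframe` path is taken from the command shown in `docs/configure.rst`, and the output is assumed to be a single JSON document (suggested by the `topo.json` and `processor.json` file names, but not stated explicitly in this diff).

```python
import json
import subprocess


def build_detect_command(reframe='./bin/reframe', outfile=None):
    '''Build the argument list for a topology detection run.'''
    option = '--detect-host-topology'
    if outfile is not None:
        # The FILE argument is optional and attached with `=`
        option += f'={outfile}'

    return [reframe, option]


def detect_host_topology(reframe='./bin/reframe', outfile=None):
    '''Run ReFrame and return the detected topology as a dict.

    Assumes the option emits plain JSON, either to `outfile` or,
    when no file is given, to the standard output.
    '''
    cmd = build_detect_command(reframe, outfile)
    proc = subprocess.run(cmd, check=True, capture_output=True, text=True)
    if outfile is None:
        return json.loads(proc.stdout)

    with open(outfile) as fp:
        return json.load(fp)
```

Keeping the command construction separate from the `subprocess.run` call makes the argument handling easy to check without a ReFrame installation.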
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -1,3 +1,4 @@
+archspec==0.1.2
 docutils==0.16 # https://github.com/sphinx-doc/sphinx/issues/9001
 jsonschema==3.2.0
 semver==2.13.0
5 changes: 5 additions & 0 deletions reframe/core/config.py
@@ -101,6 +101,11 @@ def __getitem__(self, key):
     def __getattr__(self, attr):
         return getattr(self._pick_config(), attr)

+    @property
+    def schema(self):
+        '''Configuration schema'''
+        return self._schema
+
     def add_sticky_option(self, option, value):
         self._sticky_options[option] = value

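The new `schema` accessor follows the usual read-only property pattern. The stand-in class below is a simplified illustration, not the real class from `reframe/core/config.py`; only the property body is copied from the hunk above, and the schema dict is a placeholder.

```python
class SiteConfigSketch:
    '''Simplified stand-in for the configuration class (hypothetical).'''

    def __init__(self, schema):
        self._schema = schema

    @property
    def schema(self):
        '''Configuration schema'''
        return self._schema


cfg = SiteConfigSketch({'type': 'object'})   # placeholder schema

try:
    # No setter is defined, so the attribute cannot be rebound
    cfg.schema = {}
except AttributeError:
    read_only = True
else:
    read_only = False
```

Exposing the schema this way lets other modules inspect it without reaching into the private `_schema` attribute.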
14 changes: 7 additions & 7 deletions reframe/core/systems.py
@@ -5,12 +5,12 @@

 import json

-import reframe.utility as utility
+import reframe.utility as util
 import reframe.utility.jsonext as jsonext
 from reframe.core.backends import (getlauncher, getscheduler)
+from reframe.core.environments import (Environment, ProgEnvironment)
 from reframe.core.logging import getlogger
 from reframe.core.modules import ModulesSystem
-from reframe.core.environments import (Environment, ProgEnvironment)


 class ProcessorType(jsonext.JSONSerializable):
@@ -232,7 +232,7 @@ def access(self):

         :type: :class:`List[str]`
         '''
-        return utility.SequenceView(self._access)
+        return util.SequenceView(self._access)

     @property
     def descr(self):
@@ -249,7 +249,7 @@ def environs(self):
         :type: :class:`List[ProgEnvironment]`
         '''

-        return utility.SequenceView(self._environs)
+        return util.SequenceView(self._environs)

     @property
     def container_environs(self):
@@ -258,7 +258,7 @@ def container_environs(self):
         :type: :class:`Dict[str, Environment]`
         '''

-        return utility.MappingView(self._container_environs)
+        return util.MappingView(self._container_environs)

     @property
     def fullname(self):
@@ -315,7 +315,7 @@ def resources(self):

         '''

-        return utility.MappingView(self._resources)
+        return util.MappingView(self._resources)

     @property
     def scheduler(self):
@@ -661,7 +661,7 @@ def partitions(self):

         :type: :class:`List[SystemPartition]`
         '''
-        return utility.SequenceView(self._partitions)
+        return util.SequenceView(self._partitions)

     def __eq__(self, other):
         if not isinstance(other, type(self)):