[feat] Improve naming scheme of tests #1848

vkarak · 2021-03-09T11:04:08Z

This PR enhances the naming scheme of tests by introducing the following concepts:

Each test is now associated with a unique id, which is a SHA256 hash. This hash is calculated based on the test name and its parameters, if it is a parameterized test.
The unique hash is available through the hash_long and hash_short properties.
A test gets a unique_name, which is essentially the test class name with the test hash appended, if the test is parameterized, i.e., ParamTest_<hash_short>. We could always append the test hash, but we choose not to do so, in order to maintain compatibility with the current naming scheme of non-parameterized tests.
The test's name becomes a human-readable description of the test. If the test is not parameterized, this is simply the qualified class name of the test. If it is parameterized, the parameter names and their values are appended as follows: MyTest%param1=value1%param2=value2
Modifying the test's name is now deprecated.
Test listings have been updated to include the test hashes.
Tests can also be selected by hash using the -n option as follows: -n /<hash>.
The test's name is no longer used in paths. The unique_name, which is read-only and has no special characters, is used instead.
The test's default executable uses the unique_name instead of name.
The names of the generated job and build scripts no longer contain the test name. They are simply _rfm_build.{sh,out,err} and _rfm_run.{sh,out,err}. The test name was superfluous here, since it is already part of the path.

Implementation details

Parameterized tests defined using the @parameterized_test decorator are now implemented on top of the parameter built-in machinery. We essentially extend the test's parameter space with the parameters passed to the decorator. By inspecting the test's __init__() function, we can also assign the right names to the parameters.

However, one of the key differences between the @parameterized_test decorator and the parameter built-in is the way multiple parameters are treated. The parameter() directive uses the Cartesian product of the parameters to create new tests, whereas the @parameterized_test decorator simply zips the parameters and passes them to the test's __init__() function. Supporting this requires adding the ability to change the iteration scheme of the parameterization space.

Another difference between the built-in and the decorator is that the parameterization space is inherited in the case of the built-in, such that a @simple_test that derives from a parameterized test is also parameterized. This is not the case, however, when using the decorator. To support this with the built-in machinery, we had to mark tests that are parameterized in the old way and disallow inheritance of the parameterization space for them. In the future, we should consider deprecating the @parameterized_test decorator, since it offers nothing over the more powerful parameter built-in.
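
The difference between the two iteration schemes can be illustrated with a small standalone sketch. The expand() helper and the params dictionary are hypothetical and only serve to contrast itertools.product with zip; they are not part of the framework.

```python
import itertools

params = {'p': [1, 2], 'q': ['a', 'b']}


def expand(params, iter_fn):
    '''Expand the parameter space using the given iteration scheme.'''
    names = list(params)
    return [dict(zip(names, point)) for point in iter_fn(*params.values())]


# Scheme of the parameter() built-in: every combination becomes a test
product_points = expand(params, itertools.product)

# Scheme of @parameterized_test: values are paired up position by position
zipped_points = expand(params, zip)
```

Here the product scheme yields four points and the zip scheme only two, which is why the iteration function of the parameter space has to be configurable.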

Other implementation changes:

The _rfm_init() function is renamed to __rfm_init__() so as to be in sync with other special framework functions.
The hacks in the series of sleep checks used in test_policies are now removed and a parameterized test is used.
The framework's module import logic is extended to allow force reloading of a module. This is quite useful in unit tests, where we load the same tests multiple times and want to start fresh each time with the creation of their variable and parameter spaces. Adding this logic to the basic module load utility functions allowed us to simplify some of the test loading unit test fixtures.
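
A forced reload of this kind could look roughly like the sketch below, which uses only the standard importlib machinery. The function name load_module_from_file, its force argument and the module name rfm_demo_checks are assumptions made for illustration and may differ from the framework's own utility functions.

```python
import importlib.util
import os
import sys
import tempfile


def load_module_from_file(path, name, force=False):
    '''Load a module from a file, optionally discarding any cached copy.'''
    if not force and name in sys.modules:
        return sys.modules[name]

    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'checks.py')
    with open(path, 'w') as fp:
        fp.write('registry = []\n')

    first = load_module_from_file(path, 'rfm_demo_checks')
    first.registry.append('MyTest')

    # A cached load returns the same module object, state included
    cached = load_module_from_file(path, 'rfm_demo_checks')

    # A forced load re-executes the module, so its state starts fresh
    fresh = load_module_from_file(path, 'rfm_demo_checks', force=True)

# Clean up the demo module from the import cache
sys.modules.pop('rfm_demo_checks', None)
```

The forced load returns a new module object whose module-level state has been rebuilt from scratch, which is the behaviour a unit test wants when it loads the same test file repeatedly.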

Todos:

Update documentation.
Actually deprecate setting the test's name.
Remove stale comments.
Implement and test the have_not_hash() filter.

Fixes #1794.

pep8speaks · 2021-03-09T11:04:17Z

Hello @vkarak, Thank you for submitting the Pull Request!

In the file reframe/core/parameters.py:

Line 40:80: E501 line too long (80 > 79 characters)

In the file reframe/core/pipeline.py:

Line 1151:80: E501 line too long (87 > 79 characters)
Line 1932:80: E501 line too long (87 > 79 characters)

Do see the ReFrame Coding Style Guide

jjotero · 2021-03-10T11:09:58Z

After I run a dummy test Test with two parameters (p and q), I get:

[----------] started processing Test%p=1%q=a (Test%p=1%q=a)
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-cray
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-gnu
[----------] finished processing Test%p=1%q=a (Test%p=1%q=a)

I think there is no need to have the full test name 6 times in here. How about something like the snippet below?

[----------] started processing Test (p=1, q=a)
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-cray
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-gnu
[----------] finished processing Test (p=1, q=a)

vkarak · 2021-03-10T11:38:56Z

I think there is no need to have the full test name 6 times in here.

Why not? "Repetitio est mater studiorum" 😭

victorusu · 2021-04-12T08:42:19Z

@vkarak, is the CI failure spurious?

vkarak · 2021-04-12T08:44:07Z

@victorusu This PR is outdated and needs to be synced with the master. I don't think that the CI failure comes from the PR, since it happens from time to time.

jjotero

I think there're a few things that need to be addressed before discussing the naming scheme, and perhaps it's a good idea to tackle these in separate PRs:

Drop @parameterized_test decorator. I saw there're a few workarounds for that and I think we should not have this decorator in the 4.0 version. I think it's knife time :)
Rather than relying on the _rfm_use_params constructor argument, this argument must be changed to an index. This will give us control over which point of the parameter space we create with each class instantiation. This is even more important when thinking of fixtures (that'll add a second index for the fixture space). An added benefit of this is that several sections of the code no longer need to rely on the order of instantiation, and we'll be able to query the parameter space "by index" (e.g. what's the name for parameter variant 6, and so on). This will also potentially simplify the implementation of the getdeps.

Now, regarding the naming scheme:

name, _unique_id and so on must not be variables. The variable built-in should only be used for class attributes that the user can interact with (edit). These are instead better expressed as properties, I think.
name should refer to the internal & unique name for the test and not the human-readable one. The human-readable name is only used for the console output (fancy_name?).
We need a @classmethod to query the name of any test variant by test variant ID. The name could be a property that binds this function to the test's variant index.

jjotero · 2021-04-22T15:38:52Z

reframe/core/decorators.py

      This decorator does not instantiate any test.  It only registers them.
      The actual instantiation happens during the loading phase of the test.
    '''
+    def _extend_param_space(cls):


Should we just remove the parameterized test here already? I think its time has arrived 🪚

jjotero · 2021-04-22T15:51:59Z

reframe/core/namespaces.py

        assert isinstance(getattr(cls, self.local_namespace_name),
                          LocalNamespace)

+    def allow_inheritance(self, cls, base):


This should also go away when we drop the parameterized_test, right?

jjotero · 2021-04-23T07:45:02Z

reframe/core/parameters.py

        # Internal parameter space usage tracker
-        self.__unique_iter = iter(self)
+        self.set_iter_fn(itertools.product)
+        # self.__unique_iter = iter(self)


With fixtures (and also the callbacks from below) in mind, I think it's a lot better if we enable index access to the parameter points. That would give us all the control when instantiating a class on choosing which point in the parameter space we want, rather than relying on the consumption of the parameter space. What I have in mind is something like the following:

self.__rand_access_iter = list(iter(self))

and then we could even override the __getitem__ method to achieve some list-like access as

def __getitem__(self, idx): return self.__rand_access_iter[idx]

Then we can inject the parameters from the point in the parameter space that the reframe test wants (see the inject function below).
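
The index-access idea from this comment can be sketched in isolation as follows. The ParamSpace class here is a toy stand-in written for illustration, not the framework's parameter space class, and its attribute names are assumptions.

```python
import itertools


class ParamSpace:
    '''Toy parameter space with list-like access to its points.'''

    def __init__(self, params):
        self.params = dict(params)
        # Materialize all points once, so that any of them can be looked
        # up by index instead of being consumed from an iterator.
        self._points = list(itertools.product(*self.params.values()))

    def __len__(self):
        return len(self._points)

    def __getitem__(self, idx):
        return self._points[idx]


space = ParamSpace({'p': [1, 2], 'q': ['a', 'b']})
```

Because the points are stored in a list, space[3] can be requested without first consuming points 0 to 2, so the result no longer depends on the order of instantiation.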

jjotero · 2021-04-23T07:50:01Z

reframe/core/parameters.py

+        injected = []
        if use_params and self.params:
            try:
                # Consume the parameter space iterator
                param_values = next(self.unique_iter)
                for index, key in enumerate(self.params):
                    setattr(obj, key, param_values[index])
+                    format_fn = self.__format_fn.get(key)
+                    injected.append(
+                        (key, param_values[index], format_fn)
+                    )

+                obj.params_inserted(injected)


Suggested change

-        injected = []
-        if use_params and self.params:
-            try:
-                # Consume the parameter space iterator
-                param_values = next(self.unique_iter)
-                for index, key in enumerate(self.params):
-                    setattr(obj, key, param_values[index])
-                    format_fn = self.__format_fn.get(key)
-                    injected.append(
-                        (key, param_values[index], format_fn)
-                    )
-
-                obj.params_inserted(injected)
+        if self.params and params_index is not None:
+            try:
+                # Look up the requested point of the parameter space
+                param_values = self.__random_access_iter[params_index]
+                for index, key in enumerate(self.params):
+                    setattr(obj, key, param_values[index])
+            except IndexError as no_params:
+                raise RuntimeError(
+                    f'parameter space index out of range for '
+                    f'{obj.__class__.__qualname__}'
+                ) from None

jjotero · 2021-04-23T07:59:20Z

reframe/core/parameters.py

            parameter values defined in the parameter space.

        '''
+


Now in this function, instead of having use_params being a bool, this argument can be changed to an int argument such as params_index, and then this will inject whichever point in the parameter space the reframe check requests.

Also, since the reframe test has the index of the parameter space, the obj.params_inserted callback could just be made a member function of this parameter space class instead. Then this function would just return that tuple you're after for the naming. Then the pipeline could just call that function whenever needed to set the name of the test.
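
Both points of this comment, injecting by index and moving the naming callback into the parameter space, might look roughly like the sketch below. It is a self-contained toy: the ParamSpace class, the named_values() method name and the format_fns argument are hypothetical and only mirror the shape of the tuples built in the diff above.

```python
import itertools


class ParamSpace:
    '''Toy parameter space that injects a point selected by index.'''

    def __init__(self, params, format_fns=None):
        self.params = dict(params)
        self._format_fns = format_fns or {}
        self._points = list(itertools.product(*self.params.values()))

    def inject(self, obj, params_index=None):
        '''Set the parameters of point ``params_index`` on ``obj``.'''
        if not self.params or params_index is None:
            return

        try:
            values = self._points[params_index]
        except IndexError:
            raise RuntimeError(
                f'parameter space index out of range for '
                f'{type(obj).__qualname__}'
            ) from None

        for key, value in zip(self.params, values):
            setattr(obj, key, value)

    def named_values(self, params_index):
        '''Return (name, value, format_fn) tuples for the given point.'''
        values = self._points[params_index]
        return [(key, value, self._format_fns.get(key))
                for key, value in zip(self.params, values)]


class Dummy:
    pass


space = ParamSpace({'p': [1, 2], 'q': ['a', 'b']})
obj = Dummy()
space.inject(obj, 2)
```

In this arrangement the test object needs no callback: whoever knows the index can ask the parameter space for the tuples required to build the name.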

jjotero · 2021-04-23T08:35:58Z

reframe/core/pipeline.py

    #:
    #: :type: string that can contain any character except ``/``
-    name = variable(typ.Str[r'[^\/]+'])
+    name = variable(str)


To me, name feels now more like a property, no? If I understand correctly, the users are not supposed to touch this, right? Or are they still allowed to fiddle with the name, and reframe just relies on the unique_id instead?
In any case, I think that name should just be the unique_name, and the human-readable name should be called something else. As far as I can see, the human-readable one is only used for the console output.

jjotero · 2021-04-23T08:36:22Z

reframe/core/pipeline.py


+    # The unique ID of the test: a SHA256 hash of the class name and the
+    # test's parameters if any
+    _unique_id = variable(bytes, type(None), value=None)


This certainly feels even more of a @property.

jjotero · 2021-04-23T08:41:46Z

reframe/core/pipeline.py

    def __new__(cls, *args, _rfm_use_params=False, **kwargs):
        obj = super().__new__(cls)
+        obj.name = cls.__qualname__

        # Insert the var & param spaces
        cls._rfm_var_space.inject(obj, cls)
        cls._rfm_param_space.inject(obj, cls, _rfm_use_params)


The constructor would now take the index of the parameter variant and feed that to the inject function of the parameter space. This index can be used later to get the formatted name of the parameters and so on.

Suggested change

-    def __new__(cls, *args, _rfm_use_params=False, **kwargs):
-        obj = super().__new__(cls)
-        obj.name = cls.__qualname__
-
-        # Insert the var & param spaces
-        cls._rfm_var_space.inject(obj, cls)
-        cls._rfm_param_space.inject(obj, cls, _rfm_use_params)
+    def __new__(cls, *args, _rfm_param_variant=None, **kwargs):
+        obj = super().__new__(cls)
+        obj.name = cls.__qualname__
+
+        # Insert the var & param spaces
+        cls._rfm_var_space.inject(obj, cls)
+        cls._rfm_param_space.inject(obj, cls, _rfm_param_variant)

jjotero · 2021-04-23T08:58:49Z

reframe/core/pipeline.py

            self._cdt_environ = env.Environment('__rfm_cdt_environ')

-    # Export read-only views to interesting fields
+    def _set_unique_id(self, params):


If the constructor gets changed as suggested above, this gets massively simplified and the unique_id would just be the class name + the parameter index.

jjotero · 2021-04-23T10:52:13Z

reframe/core/pipeline.py

+        h.update(bytes(jsonext.dumps(obj), encoding='utf-8'))
+        self._unique_id = h.digest()
+
+    def params_inserted(self, params):


I'd change this into something like:

    @classmethod
    def name_id(cls, id):
        '''Method that returns the name of a class for a given variant id.'''
        ...

    @property
    def name(self):
        '''Bind the test's param_id with the name_id function.'''
        return self.name_id(self._rfm_param_variant)

The name_id method is the one that calls the function from the parameter space that returns that list of tuples to build the name and so on. The name property just binds the ID to the current test's ID.
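
A runnable toy version of this proposal is sketched below. The MyTest class, its hard-coded _param_names and _param_points attributes and the %name=value formatting are assumptions standing in for the real parameter space; only the name_id/name split comes from the comment above.

```python
class MyTest:
    # Hypothetical stand-in for the class-level parameter space
    _param_names = ('p', 'q')
    _param_points = [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

    def __init__(self, variant=None):
        self._rfm_param_variant = variant

    @classmethod
    def name_id(cls, variant):
        '''Return the name of the test for a given variant id.'''
        if variant is None:
            return cls.__name__

        values = cls._param_points[variant]
        suffix = ''.join(f'%{k}={v}'
                         for k, v in zip(cls._param_names, values))
        return cls.__name__ + suffix

    @property
    def name(self):
        '''Bind the test's own variant index to name_id().'''
        return self.name_id(self._rfm_param_variant)
```

Since name_id() is a class method, the name of any variant can be queried without instantiating the test, and name becomes read-only because the property defines no setter.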

jjotero · 2021-09-08T15:12:24Z

To be redone after the fixtures #2166

Vasileios Karakasis added 9 commits February 27, 2021 14:01

Rename _rfm_init()

418fabf

Revisit naming scheme of tests

98c418c

Remove unused imports

f5c7ce9

Update default executable name

87ad77b

Show parameter hash in test listings

5854ea5

WIP: Hash the whole test name

25879fe

Fix unit tests

23321f5

Select tests by hash

44bd2db

Merge branch 'master' into feat/improved-test-naming

adec202

vkarak added prio: normal enhancement labels Mar 9, 2021

vkarak added this to the ReFrame sprint 21.03.1 milestone Mar 9, 2021

vkarak requested review from ekouts, jjotero and victorusu March 9, 2021 11:04

vkarak self-assigned this Mar 9, 2021

vkarak added the 4.x label Mar 12, 2021

vkarak modified the milestones: ReFrame sprint 21.03.1, ReFrame sprint 21.03.2 Mar 18, 2021

vkarak modified the milestones: ReFrame sprint 21.03.2, ReFrame sprint 21.04.1 Mar 30, 2021

vkarak mentioned this pull request Apr 17, 2021

[feat] Deprecate the use of the @parameterized_test decorator #1934

Merged

jjotero suggested changes Apr 23, 2021

jjotero added this to the ReFrame Sprint 21.08.2 milestone Aug 24, 2021

jjotero closed this Sep 8, 2021

vkarak deleted the feat/improved-test-naming branch September 29, 2022 21:54
