Skip to content

Conversation

@vkarak
Copy link
Contributor

@vkarak vkarak commented Mar 9, 2021

This PR enhances the naming scheme of tests by introducing the following concepts:

  1. Each test is now associated with a unique id, which is a SHA256 hash. This hash is calculated based on the test name and its parameters, if it is a parameterized test.
  2. The unique hash is available through the hash_long and hash_short properties.
  3. A test gets a unique_name, which is essentially the test class name with the test hash appended, if the test is parameterized, i.e., ParamTest_<hash_short>. We could always append the test hash, but we choose not to do so, in order to maintain compatibility with the current naming scheme of non-parameterized tests.
  4. The test's name becomes a human-readable version of the test's name. If the test is not parameterized, this is simply the qualified class name of the test. If it is parameterized, the parameter names and their values are appended as follows: MyTest%param1=value1%param2=value2
  5. Modifying the test's name is now deprecated.
  6. Test listings have been updated to include the test hashes.
  7. Tests can also be selected by hash using the -n option as follows: -n /<hash>.
  8. The test's name is no more used in paths. The unique_name, which is read-only and has no special characters, is used instead.
  9. The test's default executable uses the unique_name instead of name.
  10. The name of the generated job and build scripts do not contain the test name any more. They are simply _rfm_build.{sh,out,err}and _rfm_run.{sh,out,err}. The test name was superfluous here, since it is already part of the path.

Implementation details

The parameterized tests defined using the @parameterized_test decorator are now implemented using the parameter built-in machinery. We essentially extend the test's parameter space with the parameters passed in the decorators. By inspecting the test's __init__() function we can also assign the right names to the parameters. However, one of the key differences of the @parameterized_test decorator compared to the parameter built-in is the way multiple parameters are treated. The parameter() directive uses the external product of the parameters to create new tests, whereas the @parameterized_test decorator simply zips the parameters and passes them to the test's __init__() function. Supporting this requires to add the possibility to change the iteration scheme of the parameterization space. Another difference between the built-in and the decorator is that the parameterization space is inherited in the case of the built-in, such that a @simple_test that derives from a parameterized test is also parameterized. This is not the case, however, when using the decorator. To be able to support this using the built-in machinery, we had to mark a test that is parameterized in the old way, and do not allow inheritance of the parameterization space in this case. In the future, we should consider deprecating the @parameterized_test decorator, since it does not offer anything compared to the more powerful parameter built-in.

Other implementation changes:

  • The _rfm_init() function is renamed to __rfm_init__() so as to be in sync with other special framework functions.
  • The hacks in the series of sleep checks used in test_policies are now removed and a parameterized test is used.
  • The framework's module import logic is extended so as to allow force reloading of a module. This is quite useful in unit tests, where we load multiple times the same tests and we would like every time to start fresh with the creation of its variable and parameter spaces. The addition of this logic to the basic module load utility functions allowed us to simplify the logic of some test loading unit test fixtures.

Todos:

  • Update documentation.
  • Actually deprecate setting the test's name.
  • Remove stale comments.
  • Implement and test the have_not_hash() filter.

Fixes #1794.

@vkarak vkarak added this to the ReFrame sprint 21.03.1 milestone Mar 9, 2021
@vkarak vkarak requested review from ekouts, jjotero and victorusu March 9, 2021 11:04
@vkarak vkarak self-assigned this Mar 9, 2021
@pep8speaks
Copy link

Hello @vkarak, Thank you for submitting the Pull Request!

Line 40:80: E501 line too long (80 > 79 characters)

Line 1151:80: E501 line too long (87 > 79 characters)
Line 1932:80: E501 line too long (87 > 79 characters)

Do see the ReFrame Coding Style Guide

@jjotero
Copy link
Contributor

jjotero commented Mar 10, 2021

After I run a dummy test Test with two parameters (p and q), I get:

[----------] started processing Test%p=1%q=a (Test%p=1%q=a)
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-cray
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-gnu
[----------] finished processing Test%p=1%q=a (Test%p=1%q=a)

I think there is no need to have the full test name 6 times in here. How about something like the snippet below?

[----------] started processing Test (p=1, q=a)
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-cray
[ RUN      ] Test%p=1%q=a on eiger:login using PrgEnv-gnu
[----------] finished processing Test (p=1, q=a)

@vkarak
Copy link
Contributor Author

vkarak commented Mar 10, 2021

I think there is no need to have the full test name 6 times in here.

Why not? "Repetitio est mater studiorum" 😭

@victorusu
Copy link
Contributor

@vkarak, is the CI failure spurious?

@vkarak
Copy link
Contributor Author

vkarak commented Apr 12, 2021

@victorusu This PR is outdated and needs to be synced with the master. I don't think that the CI failure comes from the PR, since it happens from time to time.

Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there're a few things that need to be addressed before discussing the naming scheme, and perhaps it's a good idea to tackle these in separate PRs:

  • Drop @parameterized_test decorator. I saw there're a few workarounds for that and I think we should not have this decorator in the 4.0 version. I think it's knife time :)
  • Rather than relying on the _rfm_use_params constructor argument, this argument must be changed to an index. This will give us control over which point of the parameter space we create with each class instantiation. This is even more important when thinking of fixtures (that'll add a second index for the fixture space). An added benefit of this is that several sections of the code no-longer need to rely on the order of instantiation, and we'll be able to query things into the parameter space "by index" (e.g. what's the name for parameter variant 6, and so on). This will also potentially simplify the implementation of the getdeps.

Now, regarding the naming scheme:

  • name, _unique_id and so on must not be variables. The varable built-in should only be used for class attributes that the user can interact (edit) with. These are instead better expressed as properties, I think.
  • name should refer to the internal & unique name for the test and not the human-readable one. The human-readable name is only used for the console output (fancy_name?).
  • We need a @classmethod to query the name of any test variant by test variant ID. The name could be a property that binds this function to the test's variant index.

This decorator does not instantiate any test. It only registers them.
The actual instantiation happens during the loading phase of the test.
'''
def _extend_param_space(cls):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just remove the parameterized test here already? I think its time has arrived 🪚

assert isinstance(getattr(cls, self.local_namespace_name),
LocalNamespace)

def allow_inheritance(self, cls, base):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also go away when we drop the parameterized_test, right?

Comment on lines 98 to +100
# Internal parameter space usage tracker
self.__unique_iter = iter(self)
self.set_iter_fn(itertools.product)
# self.__unique_iter = iter(self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With fixtures (and also the callbacks from below) in mind, I think it's a lot better if we enable index access to the parameter points. That would give us all the control when instantiating a class on choosing which point in the parameter space we want, rather than relying on the consumption of the parameter space. What I have in mind is something like the following:

self.__rand_access_iter = list(iter(self))

and then we could even override the __getatitem__ method to achieve some list-like access as

def __getitem__(self, idx):
    return self.__rand_access_iter[idx]

Then we can inject the parameters from the point in the parameter space that the reframe test wants (see the inject function below).

Comment on lines +181 to +193
injected = []
if use_params and self.params:
try:
# Consume the parameter space iterator
param_values = next(self.unique_iter)
for index, key in enumerate(self.params):
setattr(obj, key, param_values[index])
format_fn = self.__format_fn.get(key)
injected.append(
(key, param_values[index], format_fn)
)

obj.params_inserted(injected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
injected = []
if use_params and self.params:
try:
# Consume the parameter space iterator
param_values = next(self.unique_iter)
for index, key in enumerate(self.params):
setattr(obj, key, param_values[index])
format_fn = self.__format_fn.get(key)
injected.append(
(key, param_values[index], format_fn)
)
obj.params_inserted(injected)
if self.params and not params_index is None:
try:
# Consume the parameter space iterator
param_values = self.__random_access_iter[params_index]
for index, key in enumerate(self.params):
setattr(obj, key, param_values[index])
except IndexError as no_params:
raise RuntimeError(
f'parameter space index out of range for '
f'{obj.__class__.__qualname__}'
) from None

parameter values defined in the parameter space.
'''

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now in this function, instead of having use_params being a bool, this argument can be changed to an int argument such as params_index, and then this will inject whichever point in the parameter space the reframe check requests.

Also, since the reframe test has the index of the parameter space, the obj.params_inserted callback could just be made a member function of this parameter space class instead. Then this function would just return that tuple you're after for the naming. Then the pipeline could just call that function whenever needed to set the name of the test.

#:
#: :type: string that can contain any character except ``/``
name = variable(typ.Str[r'[^\/]+'])
name = variable(str)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, name feels now more like a property, no? If I understand correctly, the users are not supposed to touch this, right? Or are they still allowed to fiddle with the name, and reframe just relies on the unique_id instead?
In any case, I think that name should just be the unique_name and then have the human-readable name called something else. As far as I see, the human readable one it's only used for the console output.


# The unique ID of the test: a SHA256 hash of the class name and the
# test's parameters if any
_unique_id = variable(bytes, type(None), value=None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This certainly feels even more of a @property.

Comment on lines 749 to 755
def __new__(cls, *args, _rfm_use_params=False, **kwargs):
obj = super().__new__(cls)
obj.name = cls.__qualname__

# Insert the var & param spaces
cls._rfm_var_space.inject(obj, cls)
cls._rfm_param_space.inject(obj, cls, _rfm_use_params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constructor would now take the index of the parameter variant and feed that to the inject function of the parameter space. This index can be used later to get the formatted name of the parameters and so on.

Suggested change
def __new__(cls, *args, _rfm_use_params=False, **kwargs):
obj = super().__new__(cls)
obj.name = cls.__qualname__
# Insert the var & param spaces
cls._rfm_var_space.inject(obj, cls)
cls._rfm_param_space.inject(obj, cls, _rfm_use_params)
def __new__(cls, *args, _rfm_param_variant=None, **kwargs):
obj = super().__new__(cls)
obj.name = cls.__qualname__
# Insert the var & param spaces
cls._rfm_var_space.inject(obj, cls)
cls._rfm_param_space.inject(obj, cls, _rfm_param_variant)

self._cdt_environ = env.Environment('__rfm_cdt_environ')

# Export read-only views to interesting fields
def _set_unique_id(self, params):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the constructor gets changed as suggested above, this gets massively simplified and the unique_id would just be the class name + the parameter index.

h.update(bytes(jsonext.dumps(obj), encoding='utf-8'))
self._unique_id = h.digest()

def params_inserted(self, params):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change this into something like:

@classmethod
def name_id(id):
    '''Method that returns the name of a class for a given variant id.'''
    ...

@property
def name(self):
    '''Bind the test's param_id with the name_id function.'''
    return self.name_id(self._rfm_param_variant)

The name_id method is the one that calls the function from the parameter space that returns that list of tuples to build the name and so on. The name property just binds the ID to the current test's ID.

@jjotero jjotero added this to the ReFrame Sprint 21.08.2 milestone Aug 24, 2021
@jjotero
Copy link
Contributor

jjotero commented Sep 8, 2021

To be redone after the fixtures #2166

@jjotero jjotero closed this Sep 8, 2021
@vkarak vkarak deleted the feat/improved-test-naming branch September 29, 2022 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants