Skip to content

Conversation

@vkarak
Copy link
Contributor

@vkarak vkarak commented Mar 15, 2021

A directive makes available a method defined in a class to the execution of the class body during its creation.

A directive simply captures the arguments passed to it and all directives are stored in a registry inside the class. When the final object is created, they will be applied to that instance by calling the target method on the freshly created object. As soon as they are applied, they set to point to the actual object method, since they have served their purpose. This allows us to export functions such as depends_on() at the class level. This is implemented by aliasing the directive name in RegressionTest.__getattribute__() to the actual object method that implements it.

Currently, only depends_on() is exposed as a directive, but in the future the skip test functions would also be.

This builds on top of #1865.

The following methods are exposed as directives: depends_on(), skip() and skip_if().

Implementation details

The directive methods are defined in RegressionTest and are prefixed with _D_.
For consistency, I changed the prefix of pipeline hook methods to _P_.

Todos

  • Update documentation

Fixes #1864.

@vkarak vkarak requested review from ekouts and jjotero March 15, 2021 13:06
@vkarak vkarak self-assigned this Mar 15, 2021
@vkarak vkarak marked this pull request as draft March 15, 2021 13:07
@codecov-io
Copy link

Codecov Report

Merging #1868 (3d20133) into master (ed40b22) will increase coverage by 0.06%.
The diff coverage is 95.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1868      +/-   ##
==========================================
+ Coverage   87.61%   87.68%   +0.06%     
==========================================
  Files          49       50       +1     
  Lines        8141     8207      +66     
==========================================
+ Hits         7133     7196      +63     
- Misses       1008     1011       +3     
Impacted Files Coverage Δ
reframe/core/deferrable.py 98.04% <ø> (ø)
reframe/core/directives.py 95.00% <95.00%> (ø)
reframe/core/meta.py 99.15% <95.00%> (-0.85%) ⬇️
reframe/core/decorators.py 97.14% <100.00%> (+0.05%) ⬆️
reframe/core/namespaces.py 93.22% <100.00%> (+0.11%) ⬆️
reframe/core/pipeline.py 92.03% <100.00%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed40b22...3d20133. Read the comment docs.

Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a different suggestion for the syntax. What do you think of something like

class Foo(rfm.RegressionTest):
  t0 = dependency(MyOtherTest, by=...)

where t0 is the instance of a class that manages the dependencies. For example, the test Foo from above could have a hook that requires the dependency and one would just do

@rfm.run_before('compile')
def a_random_hook(self):
    my_dependency = self.t0.get()

    # Do some stuff with the dependency
    ...

This would be more consistent with what we have right now for the other built-ins and it would avoid the name mangling in the pipeline (which I would try to avoid at all costs). I do get that having that general method to export any member function as a directive is far more flexible, but I feel like it's a bit of an overkill for just one directive.

@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

I do get that having that general method to export any member function as a directive is far more flexible, but I feel like it's a bit of an overkill for just one directive.

Currently, it's just for one.. The plan is to use this mechanism for the skip function functionality, with skip() and/or skip_if() function.

@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

Regarding the syntax for dependencies, that would require a complete redesign of the dependencies. I see your point, but I'm not sure it's worth the effort. The fact also that you name the dependencies, I find it a bit confusing as a user. It fits better for an internal implementation rather than a user API, imho. I prefer the depends_on() syntax.

@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

This would be more consistent with what we have right now for the other built-ins and it would avoid the name mangling in the pipeline (which I would try to avoid at all costs).

The name mangling does not interfere with the pipeline. We're not mangling any pipeline method.

@jjotero
Copy link
Contributor

jjotero commented Mar 15, 2021

Regarding the syntax for dependencies, that would require a complete redesign of the dependencies. I see your point, but I'm not sure it's worth the effort. The fact also that you name the dependencies, I find it a bit confusing as a user. It fits better for an internal implementation rather than a user API, imho. I prefer the depends_on() syntax.

Perhaps that's something for the getdeps PR then 🙃

What about inheritance? As far as I understand from this, a test dependency would not be inherited along by any child class. Is this right? I feel like that we're blurring the line between the test class and the test instance here. My view is that things which get declared at the class level MUST be inherited by any child class; whereas things that get specified at the instance level are just "private" to that instance (which is what happens for both variables and parameters). This is why I was thinking of different functions for specifying the dependencies at the class level and the instance level. The way I see this, a class may define dependencies for its tests, but the class itself does not depend on these dependencies; the tests themselves do. This is why I suggested the name dependency for the directive, because you're just creating a new dependency here, but the class itself does not depend_on that dependency.

The name mangling does not interfere with the pipeline. We're not mangling any pipeline method.

I meant having depends_on being an alias to _D_depends_on.

@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

Dependencies are inherited properly. Check this out:

import reframe as rfm
import reframe.utility.sanity as sn


class Base(rfm.RunOnlyRegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    executable = 'echo'

    @rfm.run_before('run')
    def set_exec_opts(self):
        self.executable_opts = [self.name]

    @rfm.run_before('sanity')
    def set_sanity_check(self):
        self.sanity_patterns = sn.assert_found(rf'{self.name}', self.stdout)


@rfm.simple_test
class T0(Base):
    pass


class TX(Base):
    depends_on('T0')


@rfm.simple_test
class T1(TX):
    pass


@rfm.simple_test
class T2(TX):
    pass

This will make T1 and T2 depend on T0:

- T2:
   <...>
    Dependencies (conceptual):
      T0

    Dependencies (actual):
      - ('T2', 'generic:default', 'builtin') -> ('T0', 'generic:default', 'builtin')

- Τ1:
   <...>
    Dependencies (conceptual):
      T0

    Dependencies (actual):
      - ('T1', 'generic:default', 'builtin') -> ('T0', 'generic:default', 'builtin')

But you spotted a very subtle detail! This works nicely because none of T1, T2 defines __init__() so they end up invoking the augmented __init__() of TX, which applies the directives and therefore they also get the dependencies. If I define T1 as follows:

@rfm.simple_test
class T1(TX):
    def __init__(self):
        pass

Then T1 is independent. I would have to call super().__init__() to get it right. Technically, this is no different than currently: if TX was doing self.depends_on('T0') and I forgot to do super().__init__() in T1 I would again miss the dependency.

This PR has a meaning if we want to eliminate the use of __init__() and this was its original assumption. In that case, as I showed, inheritance works nicely. I do believe that we don't need __init__(), and we can replace its functionality, which is literally very-very specific to test filtering only, with a post-init hook as in #1865. All the rest of the test functionality can either be defined in the class body or in the various pipeline hooks. Nothing is needed in the __init__() except if you want to change the tags or the valid_prog_environs, and for this you can have a hook. Then we, as framework, gain control of the __init__() and we can do the directive magic. I would argue that mixing __init__() with directives or __init__() with post-init hooks is not something users should do, unless they know what they are doing. There are two alternatives in writing tests, therefore: (a) the classic way: __init__() without directives and without post-init hooks, (b) the new way without __init__(), but with directives and post-init hooks if necessary.

@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

Instead of augmenting the __init__(), we could possibly apply the directives in _rfm_init() and avoid this artifact here.

@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

Indeed, augmenting _rfm_init() instead of __init__() works nicely avoiding the side effects described above when users define __init__() and inheritance works nicely in all cases. It will also simplify a lot the part in the metaclass that applies the directives, since _rfm_init is always there and there is no need to specialize for the hooks.

This avoids side effects when users override `__init__()` without calling
`super()`. Also renamed `_rfm_init()` to `__rfm_init__()` and added a couple
more test cases in the unit tests.
@vkarak
Copy link
Contributor Author

vkarak commented Mar 15, 2021

I meant having depends_on being an alias to _D_depends_on.

I don't see this as a problem, users will not ever call the esoteric function. Also, after the initialization depends_on() behaves exactly as before, when it wasn't a directive, so that the syntax self.depends_on() is still legitimate. The only problem I see is with the documentation :-) We will have to move it to the rst and add an explicit entry for the depends_on class method.

@vkarak vkarak added this to the ReFrame sprint 21.04.1 milestone Apr 6, 2021
@vkarak vkarak marked this pull request as ready for review April 9, 2021 22:06
Copy link
Contributor

@victorusu victorusu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Can we merge it? Or is it for 4.0?

Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is super clean!

However, I have the feeling that having directives is no-longer required now that the post-init hook is in. I struggle to see why any of the three directives would be used in the class body and not as a post-init hook. In fact, using these methods as directives only gives you an incomplete functionality respect to using them in a post-init hook.

I think the following example shows why these methods belong to the instance and not the class:

class Test(rfm.RunOnlyRegressionTest):
  '''A base test'''

  valid_systems = ['daint:login']
  valid_prog_environs = ['PrgEnv-gnu']
  executable = f'echo bananas'

  @rfm.run_before('sanity')
  def set_sanity(self):
    self.sanity_patterns = sn.assert_found('bananas', self.stdout)


@rfm.simple_test
class A(Test):
    p = parameter(range(10))

    # Why would anyone skip a test in the class body?
    skip('If I do not want this test I might as well not register it.')

    # Skipping a test based on parameters, prgenv, or system will never work
    # in the class body.
    # For example, the following will break, because the value of the
    # parameter `p` belongs to the instance, not the class.

    # skip_if(p > 5, 'skipping test A from the class body') # Error: Can't access a parameter from the class body.

    @rfm.run_after('init')
    def skip_tests(self):
        '''This can only be done in a post-init hook.

        I'm using parameters here, but the same applies to prog_environs and systems.
        '''

        self.skip_if(self.p > 5, 'skipping A from a post-init hook')

    @rfm.run_after('init')
    def serialize_parameterized_tests(self):
        '''This can only be done in a post-init hook

        I'm using parameters here, but the same applies to prog_environs and systems.
        '''

        for params in self._rfm_param_space:
            testname = 'A'
            for val in params:
                testname += f'_{val}'
          
            if testname == self.name:
                break

            self.depends_on(testname)

As a user, I would certainly be confused that I am given the options to use these methods in different places, but their behaviour is not the same in both places.


Directives are special functions that are called at the class level but will be applied to the newly created test.
Directives can also be invoked as normal test methods once the test has been created.
Using directives and `builtins <#builtins>`__ together, it is possible to completely get rid of the :func:`__init__` method in the tests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already true with the post-init hook, right? I think we should just avoid mentioning
__init__ in the new docs and just refer to the recommended syntax.

@vkarak
Copy link
Contributor Author

vkarak commented Apr 12, 2021

@jjotero For parametrised tests, it is true that you have to do it with hooks, but why should we restrict the depends_on() to only define it in an init hook? Usually, the dependency information is static; if not, then yes, it should go in a hook. Skip functions may make little sense to be directives, but I can see use cases there as well. Currently, if my tests have dependencies, there is no way to write it without dependencies, whereas you can write the sanity patterns in the class body if you redirect the standard output/error of your executable to a file. But if you have dependencies, you will be obliged to have a hook and a post-init hook particularly, nothing else: if you add a dependency at a later stage, it will be simply ignored. Isn't this also counter-intuitive?

@jjotero
Copy link
Contributor

jjotero commented Apr 12, 2021

why should we restrict the depends_on() to only define it in an init hook?

Because that locks a class to a specific dependency and it will prevent you from deriving and extending the dependency if you want to. For example:

@rfm.simple_test
class A(rfm.RegressionTest):
   ...

@rfm.simple_test
class B(rfm.RegressionTest):
    depends_on('A')
    ...

All good so far, but say that someone imports the above into another file and does

import the.other.module as m
@rfm.simple_test
class A1(m.A):
    '''Extend the functionality of A'''

@rfm.simple_test
class B1(m.B):
    '''This class now depends on both A and A1, which doesn't make much sense.'''
    depends_on('A1')

Here if you want to run only B1, you're still forced to register both A and A1 even though you don't need A at all. Otherwise you simply get a could not resolve dependency error. However, if you have the dependencies in an init hook, you can just override them if you want to. This is why I still think that test dependencies belong to the instance and not the class.

With fixtures in mind, fixtures MUST be defined in the class body because a fixture is a building block for the test itself (the test requires of the fixture for it to be complete). But this is not the case with dependencies, where a test may or may not need the dependency to run.

So in short, I think that if it belongs in the class body, it's because it is a fixture, else, it's a dependency and should be attached as a hook.

whereas you can write the sanity patterns in the class body if you redirect the standard output/error of your executable to a file.

Sanity patterns are different because they do not depend on anything external to the class where they're defined. It doesn't matter what you do with it, a derived class can always override it.

if you add a dependency at a later stage, it will be simply ignored.

The same applies to every other test variable that is used at some point along the pipeline, right? I think this is more than fine as long as we state clearly that dependencies and test skipping conditions must not be defined beyond the post-init stage.

@vkarak
Copy link
Contributor Author

vkarak commented Apr 12, 2021

Ok, it agains boils down to the dependencies vs. fixtures thing. I think you are probably right. Dependencies should be something more dynamic, although this is not the actual case, since they are used as (and are shown to be used as) fixtures. I am a bit disappointed to have to close this PR, because I feel like throwing quite some effort out of the window... Anyway, I will keep my branch for future reference in case we need to implement directives.

@vkarak vkarak closed this Apr 12, 2021
@vkarak vkarak deleted the feat/directives branch September 29, 2022 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a depends_on directive

4 participants