Skip to content

Conversation

@vkarak
Copy link
Contributor

@vkarak vkarak commented Aug 10, 2021

This PR adds the possibility to set test variables from the command line. More specifically, users can now pass -S foo=X from the command line to set the variable foo to X in all loaded tests. To set multiple variables, users should pass the -S option multiple times.

Behind the scenes the framework sets the default value of a variable to whatever was passed from the command line. This happen at the variable injection time, just before the final test is constructed and returned to the frontend. This has some implications:

  • The body of the class has executed before the variables are set from the command line. This means that in the following example num_tasks will be 1 regardless of the value passed from the command-line.
class my_test(rfm.RegressionTest):
    foo = variable(int, value=1)
    num_tasks = foo
  • The variables are set before any filtering in the tests is performed. This gives us some nice flexibility, since the --skip-systme-check and --skip-prgenv-check option could be simply implemented by overriding the valid_systems and valid_prog_environs using this mechanism. We could even support dynamic tags that would be set from the command line and the test could act upon them. The downside is that the variable change will happen to all the loaded tests; there is no way to limit that, unless we support a syntax like this: -S my_test.foo=3, which would set foo only for test my_test.
  • Users can prefix a variable with a test name, such as -S my_test.foo=3, in which case foo will only be set for my_test. Notice that wildcards are not accepted in such a syntax.
  • Since the variables are set before the test is instantiated, this mechanism will have no effect if the variable is (re)set anywhere in the pipeline, including the init phase. This is, however, consistent with the default values behaviour, which is exactly what this mechanism leverages.

If a variable that is asked to be set from the command line is not declared in a test, it will be silently ignored.

One key aspect of this PR is to handle transparently the conversion from the string passed to the command line to the type of the variable. For the moment, the default conversion function is pretty simple. If the field is a TypedField it tries to convert to each one of the supported types doing typ(s) until it succeeds. None are handled specially and if the string 'None' is passed and the accepted type is NoneType it will be converted to None. Users when declaring variables are given the option to pass a custom conversion function, in which case it will be used instead of the default. If a conversion is not possible a TypeError will be raised and the test will be skipped.
The type conversion happens in the TypedField by passing the command-line value wrapped in a special object. When a TypedField is attempted to be set by such an object, the TypedField will try to implicitly convert the value to any of its supported types. If no conversion is successful, only then a TypeError will be thrown. The implicit type conversions happen by calling typ(s), where typ is the type we want to convert to and s is the source string. All built-in types accept this type of conversion, but we have extended the type system used by the TypedFields to allow even conversion of iterables, as well as any user type using the same convention. If the field type is List[int] and the user passes the string 1,2 from the command line, this will be converted to the list [1, 2] automatically. Behind the scenes, this works by inspecting the call to the type constructor and delegating that to a user-defined cast method for the source type. In the case of strings, this method is __rfm_cast_str__(), but str can be replaced with any other type name. Users that want to use this type of conversion with their types in TypedFields, will have to use the ConvertibleType metaclass for their classes and define the __rfm_cast_str(s) method as a class method. For the container types, the type metaclasses that implement the type system define this method to take care of the conversion. Here is an example of a custom type that supports implicit conversions from string:

    class X(metaclass=types.ConvertibleType):
        def __init__(self, x):
            self.x = x

        @classmethod
        def __rfm_cast_str__(cls, s):
            return X(int(s))

    assert X('3').x == 3

Note: the __rfm_cast_<type>__ methods can only take a single argument and do not support keyword arguments.

This mechanism allows us to support conversions also for built-in types that they normally do not, such as the NoneType. We define the Null type that supports conversions from 'null' to None and we register to this type the NoneType, so that isinstance(None, Null) is true.

Conversion to None are treated at the loader. The loader simply checks if the passed value is @none and in that case sets the variable to None, otherwise the conversion is delegated to the TypedField.

Limitations

  • Conversion of iterables is not supported, but we could add it.
  • The implicit conversion of 'null' to None will fail to set a string-or-None field to None, since the types are tried in the order they are declared. If the field was defined as None-or-string, the conversion would be successful. To mitigate that, all relevant variables have been changed to None-or-string.
  • Currently all the fields that allow None values are of type type(None). If we want to allow implicit conversions to None from the command line, we should update all these fields to typ.Null.

Todos

  • Document the feature
  • Document the new functionality of the typecheck module
  • Support conversion of iterables
  • Support conversions to None properly
  • Support the mytest.var = val syntax

Fixes #563.

@pep8speaks
Copy link

pep8speaks commented Aug 10, 2021

Hello @vkarak, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request!Do see the ReFrame Coding Style Guide

Comment last updated at 2021-08-24 11:54:42 UTC

@vkarak vkarak marked this pull request as draft August 10, 2021 08:03
@codecov-commenter
Copy link

codecov-commenter commented Aug 10, 2021

Codecov Report

Merging #2117 (87fa0f3) into master (0ca19e7) will increase coverage by 0.13%.
The diff coverage is 97.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2117      +/-   ##
==========================================
+ Coverage   86.15%   86.29%   +0.13%     
==========================================
  Files          53       53              
  Lines        9465     9557      +92     
==========================================
+ Hits         8155     8247      +92     
  Misses       1310     1310              
Impacted Files Coverage Δ
reframe/core/pipeline.py 92.24% <ø> (ø)
reframe/core/variables.py 96.10% <ø> (ø)
reframe/frontend/loader.py 93.95% <93.33%> (+0.25%) ⬆️
reframe/core/fields.py 98.14% <96.87%> (-0.59%) ⬇️
reframe/core/meta.py 99.12% <100.00%> (+0.01%) ⬆️
reframe/frontend/cli.py 76.31% <100.00%> (+0.36%) ⬆️
reframe/utility/typecheck.py 97.14% <100.00%> (+1.68%) ⬆️
reframe/core/decorators.py 51.00% <0.00%> (+0.67%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ca19e7...87fa0f3. Read the comment docs.

@victorusu
Copy link
Contributor

@vkarak, I (think that I can) understand the reasoning behind the use of *args and **kwargs.
But I remember that in 2.x ReFrame era we had a bunch of those in the framework which made data flow a bit difficult to understand. I remember that I particularly suffered from things that were crossing the boundaries of the front end and backend. We had major code refactors and some design changes that made the need for the *args and **kwargs usage minimal throughout the framework. Now, the code is much easier to follow.

So, I wonder if the use of *args and **kwargs can be avoided for the sake of code understanding and readability.

I am particularly concerned with the functions

def _instantiate(cls, inst_args, *args, **kwargs):
         kwargs['_rfm_use_params'] = True
         ...

Can't we name the **kwargs to something that makes sense here?

The interface of these functions is a bit more difficult to understand. What is expected to be passed in the *args and **kwargs for the following functions?

def load_from_module(self, module, *args, **kwargs):
def load_from_file(self, filename, force=False, *args, **kwargs):
def load_from_dir(self, dirname, recurse=False, force=False, *args, **kwargs):
def load_all(self, force=False, *args, **kwargs):

An example of where it became difficult to reason about the code is

def load_from_file(self, filename, force=False, *args, **kwargs):
         if not self._validate_source(filename):
             return []

         try:
             return self.load_from_module(
                 util.import_module_from_file(filename, force), *args, **kwargs)
    ...

As you can see, in order to know what is being passed to load_from_file and I need to look into util.import_module_from_file(filename, force), *args, **kwargs) and the chain continues.

What do you think?

Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a bit of a similar concern as @victorusu with respect to that very long argument forwarding. If we ignore for a second the type conversion issue, the metaclass already provides the mechanism to set the default values of a variable by simply doing cls.var = 5. The only reason why we're not able to do so right now, is because the test registry returns the tests already instantiated, which means it would be too late for the loader to change a variable's default value. Therefore, I think this test registry is the main "bottleneck", and I think this should be made a proper class that the loader can interact with. This would avoid that very long argument forwarding, which is very difficult to keep track of.

Regarding the type conversion, I don't think it should be handled at the variable level. The variables are type-agnostic and their sole function is to hold a default value until the class is instantiated and the value gets passed to the data descriptor, whatever that descriptor is. The descriptor is the one responsible for handling the types (if it cares about types at all), and I think that this should be the pace where the type conversion is addressed, and not in the variables.

Below I described an alternative approach, which I think would simplify things.

@jjotero
Copy link
Contributor

jjotero commented Aug 16, 2021

I have a minor comment regarding which entity should be the one that actually updates the variables with the values from the command line (this is just me thinking out loud). It feels a bit strange that the registry's instantiate_all function takes the external vars as an argument and causes a side effect on the classes present in the registry. If we were to call this instantiate_all function multiple times from the loader (for whatever reason) as

registry.instantiate_all(extern_vars)
...
registry.instantiate_all()

the second call would also return the test instances with the variables set to the values in extern_vars even though this argument is not passed in this second call. We just instantiate this once, so it's not a big deal in any case, but it just feels weird to me.
Perhaps a more explicit way to do this is setting the variables from the loader. The registry is an iterable and exposes the classes to the loader. The downside of this is that we'd be looping over the classes twice ... 😅
Alternatively, the instantiate_all could also copy the classes before setting the value.

@vkarak
Copy link
Contributor Author

vkarak commented Aug 16, 2021

[...] and causes a side effect on the classes present in the registry.

Yes, I agree. The fact that the argument passed causes a side effects works against the idea of actually passing this argument. I will fix that.

The downside of this is that we'd be looping over the classes twice ... 😅

I bet it won't bother @victorusu's crazy parameter exploration experiments. :-)

Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments on the error checking side of things.

Also, I'm a big fan of this new _set_defaults function in the loader, because it now gives us the room to make this loader account for any corner case we might encounter, such as the conversion of None :)
I know that is now dealt with by the Null type in the typecheck utility, but this _set_defaults now allows us to interpret certain values specially. For example, we could say that in order to set a variable to None, users would have to set it as -S x=<None> (or something like that) and this _set_defaults would match that string and then just do a test.setvar('x', None) (and not a test.setvar('x', fields.make_convertible(value))). This would allow users to set to None from the command line without having to use the custom Null type.

Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor comment on the __setattr__ in the metaclass. Other than that, all this looks ready for the docs to me!

@jjotero
Copy link
Contributor

jjotero commented Aug 20, 2021

Looks good!

@vkarak vkarak marked this pull request as ready for review August 21, 2021 21:36
Copy link
Contributor

@teojgo teojgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, I have some minor comments that you can take into account if you think they make sense.

@vkarak vkarak requested a review from teojgo August 24, 2021 08:58
Copy link
Contributor

@jjotero jjotero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vkarak vkarak merged commit 85cf3e6 into reframe-hpc:master Aug 24, 2021
@vkarak vkarak deleted the feat/set-vars-from-cmdline branch August 24, 2021 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pass command line options to a regression test

6 participants