Support imported suite modules. #1829

hjoliver · 2016-05-04T23:07:46Z

This supersedes #173.

[UPDATE] Description edited to clarify the basic concept - which is sufficient for all use cases I know of - and to remove (my) unnecessary speculation beyond that (sorry if this removes context for some comments below!)

We should support imported suite modules to allow sharing - instead of duplicating - of common (sub-)workflows.

Concept

A suite module

defines one or more workflows that can be imported into other suites, e.g. a build workflow to be imported into the R1 graph of another suite, and a processing workflow to be imported into its cycling graph.
can be stored like a normal suite (and may contain bin scripts and other files for use by its tasks)
(it may be a valid non-cycling suite by itself, for standalone testing purposes)
defines a module interface that allows it to be treated as a black box by the importing suite, e.g.:
- inputs for its initial tasks,
- output locations for its final products,
- (some internal configurability could also be exposed to the importing suite)

Proposed Interface

%import MODULE as NAME
[scheduling]
    [[dependencies]]
         graph = get-data => proc-data => NAME
[runtime]
    [[NAME]]
         # (an externally defined workflow containing many tasks)
         [[[module interface]]]
               INPUT_DIR = $CYLC_SUITE_SHARE_DIR
               OUTPUT_DIR = $CYLC_SUITE_SHARE_DIR
               # etc.

When this suite is parsed, the module

will be installed into the suite from its source location
its [runtime] will be added to that of the importing suite,
- with names qualified by "NAME" to avoid clashes,
- and inside a family called "NAME"
its graph will be inserted into the suite graph, also with qualified task names
(at run time the full module workflow is present, inside a collapsible family).

The text was updated successfully, but these errors were encountered:

arjclark · 2016-05-05T07:52:08Z

Things that are occurring to me as to be thought about:

Can (or should) the importer supply or override module runtime config such as task hosting details, or could this be made part of the module interface, or should we just accept that modules aren't portable in that way? (perhaps the latter, at least initially).

Additionally, if something is changed in the imported module, how do we handle that in terms of reloads and restarts? Possible situations being:

file in some central location has been changed (non-revision based solution), I want to reload my suite, how does that work?
file in some central location has changed (again, non-revision based solution), I want to restart my suite but not pick up changes to that file, just continue as was.

How do we go about validating the imported suite - as "In terms of suite design, the module workflow looks to the importing suite like a single task (albeit without some of the usual runtime settings)." - so how would we capture validating graphing in the imported suite to avoid cases where it might be invalid and would break things at runtime?

Do you envisage us being able to work with graphing brought in from the module in a standard way? i.e. can we do things like PRODUCTS.FAM_FOO => fail-any => some_task and/or PRODUCTS.foo:fail => recovery_task_specific_to_my_suite => !PRODUCTS.foo

Do we need to do something special to avoid:

[runtime]
[[PRODUCTS]]

being mixed up with the result of:

%import <location>:<module-name>@<revision-tag> as PRODUCTS

matthewrmshin · 2016-05-05T13:25:11Z

Can a module run as a standalone suite, e.g. during development and testing?

what to do with the module interface in a standalone run?

support an internal cycling section that is ignored/replaced by the importing suite?

We can do something like Python's if __name__ == "__main__" kind of logic.

matthewrmshin · 2016-05-05T13:36:11Z

Additionally, if something is changed in the imported module, how do we handle that in terms of reloads and restarts? Possible situations being:

file in some central location has been changed (non-revision based solution), I want to reload my suite, how does that work?

file in some central location has changed (again, non-revision based solution), I want to restart my suite but not pick up changes to that file, just continue as was.

If you want ultimate control of suite configuration, you should only import from version controlled location, or locations that you have full control of.

However, it is sometimes advantageous to import from a moving HEAD or latest stable. In which case:

The suite reloads only if the user has to tell the suite to reload, so it should not be problematic.
I would imagine that the full running suite configuration will be stored under log/suiterc/
like we do now, so it should be clear what we are running with.
Restart is problematic under the current logic.
In fact, I think we should modify the current logic so that a restart does not automatically loads
the suite.rc from the registered location, but should loads the relevant suite.rc from the
log/suiterc/ location, unless a reload is also requested for the restart.

arjclark · 2016-05-05T13:50:34Z

Restart is problematic under the current logic. In fact, I think we should modify the current logic so that a restart does not automatically loads the suite.rc from the registered location, but should loads the relevant suite.rc from the log/suiterc/ location, unless a reload is also requested for the restart.

Which I guess means that we'll need to keep a copy of the processed out contents of the imported content in there rather than a reference to some external (possibly processed) suite files.

matthewrmshin · 2016-05-05T13:53:29Z

%import <location>:<module-name>@<revision-tag> as PRODUCTS

We should probably support a search path + import from relative path mechanism - so that the import statement can be detached from site-specific locations.

hjoliver · 2016-05-05T23:03:58Z

@arjclark (above #1829 (comment)):

if something is changed in the imported module, how do we handle that in terms of reloads and restarts? ... (non-revision based solution)

I suppose we could design a solution such that restart and reload do not re-import by default, but in my view we shouldn't do that. If using a "non-revision based solution" (or similarly, extracting from HEAD of a branch in other contexts) you get what you paid for - so be aware, and be careful!

How do we go about validating the imported suite ...

By "in terms of suite design" I mean all the suite writer has to worry about assuming that imported modules are not broken or incompatible with the importer. I suppose we could have a validation that doesn't actually do the import, but I was envisaging that validation would do a full suite parse as for a suite start-up, in which case a broken module would cause the validation to fail (c.f. importing a broken Python module).

Do you envisage us being able to work with graphing brought in from the module in a standard way? ...

Yes, although I haven't thought about this in much depth. As described above, the imported module does end up as a regular task family with internal dependencies, so MODULE:succeed-all => foo would actually be equivalent to triggering off the final internal task(s), without having to know (at suite design time) the internal task names.

Do we need to do something special to avoid...

Yes - abort if a name collision is detected.

hjoliver · 2016-05-05T23:10:57Z

@matthewrmshin (above #1829 (comment)):

Restart is problematic under the current logic. ...

Good point. We also need to take into account our plans to move the "rose suite-run" installed-suite paradigm into cylc, which separates the installed/registered location from the source location.

hjoliver · 2016-05-05T23:28:51Z

Does import MODULE-URL@MODULE-REV as NAME inside a suite.rc file seem a little odd given that cylc suites have hitherto only been objects of (/subject to) revision control, not actually aware of the concept themselves? Maybe we should only support import MODULE as NAME and require that the location (and potentially revision) of MODULE is somehow provided by external means ?

arjclark · 2016-05-06T07:46:48Z

Does import MODULE-URL@MODULE-REV as NAME inside a suite.rc file seem a little odd given that cylc suites have hitherto only been objects of (/subject to) revision control, not actually aware of the concept themselves?

Yes, to me at least, and kind of prompted my questions about things in relation to non-revision controlled files i.e. ones where you can't use a @MODULE-REV extension.

matthewrmshin · 2016-05-06T10:29:56Z

(I am thinking about a system similar to Python's import statement and sys.path, or FCM's configuration file's include and include-path.)

hjoliver · 2016-05-12T23:59:48Z

Module source locations:

This should wait on migration of rose suite-run #1885. Then, the suite.rc syntax should only support import MODULE as NAME, with MODULE expected to be located in a local sub-directory of the suite directory, modules/MODULE say, or perhaps externally via $CYLC_MODULE_PATH or similar. For revision controlled modules, some cylc file analogous to rose-suite.conf can specify URL+revision of the module, to be extracted and installed into the suite installation.

hjoliver · 2016-06-29T05:06:13Z

[meeting] - we agreed on the proposal as described in the updated issue description, plus previous comment on source locations.

An additional thought was:

don't need to wait on Migrate "rose suite-run" functionality into cylc. #1885 for the initial implementation, using modules in a local sub-directory
in fact a module could be imported from a suite (where it exists in the modules sub-directory)
- so you could modularize an existing suite without actually taking the module config out of it, but still make it available to others
- this automatically solves the problem of standalone testing discussed above, without requiring additional cycling config etc. that is ignored on import: just store a module within a simple suite designed for standalone testing.

hjoliver · 2019-06-30T01:29:05Z

Closing this issue: the proposed mechanism worked reasonably well within the confines of the config file format, but we decided quite some time ago to wait on the future Python API to get proper modularity via Python itself.

hjoliver added this to the later milestone May 4, 2016

hjoliver mentioned this issue May 4, 2016

Modular suite building? #173

Closed

hjoliver changed the title ~~Modular suites.~~ Support for importable suite modules. May 4, 2016

hjoliver changed the title ~~Support for importable suite modules.~~ Support importable suite modules. May 4, 2016

hjoliver changed the title ~~Support importable suite modules.~~ Support imported suite modules. May 5, 2016

hjoliver mentioned this issue May 30, 2017

Support imported suite modules. #2311

Closed

matthewrmshin modified the milestones: soon, later May 30, 2017

matthewrmshin mentioned this issue Aug 30, 2017

Better batch system directives configuration. #1560

Closed

oliver-sanders mentioned this issue Oct 18, 2017

Relative Inter Cycle Dependencies #2452

Open

hjoliver closed this as completed Jun 30, 2019

matthewrmshin added the superseded label Jul 2, 2019

matthewrmshin removed this from the soon milestone Jul 2, 2019

matthewrmshin added the wontfix label Jul 2, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support imported suite modules. #1829

Support imported suite modules. #1829

hjoliver commented May 4, 2016 •

edited

Loading

arjclark commented May 5, 2016

matthewrmshin commented May 5, 2016

matthewrmshin commented May 5, 2016

arjclark commented May 5, 2016

matthewrmshin commented May 5, 2016

hjoliver commented May 5, 2016

hjoliver commented May 5, 2016

hjoliver commented May 5, 2016 •

edited

Loading

arjclark commented May 6, 2016

matthewrmshin commented May 6, 2016

hjoliver commented May 12, 2016 •

edited

Loading

hjoliver commented Jun 29, 2016

hjoliver commented Jun 30, 2019

Support imported suite modules. #1829

Support imported suite modules. #1829

Comments

hjoliver commented May 4, 2016 • edited Loading

Concept

Proposed Interface

arjclark commented May 5, 2016

matthewrmshin commented May 5, 2016

matthewrmshin commented May 5, 2016

arjclark commented May 5, 2016

matthewrmshin commented May 5, 2016

hjoliver commented May 5, 2016

hjoliver commented May 5, 2016

hjoliver commented May 5, 2016 • edited Loading

arjclark commented May 6, 2016

matthewrmshin commented May 6, 2016

hjoliver commented May 12, 2016 • edited Loading

hjoliver commented Jun 29, 2016

hjoliver commented Jun 30, 2019

hjoliver commented May 4, 2016 •

edited

Loading

hjoliver commented May 5, 2016 •

edited

Loading

hjoliver commented May 12, 2016 •

edited

Loading