This issue is for discussion of potential improvements to the Python interface for creating and running steps and pipelines. First a summary of the current state of affairs:
Methods for creating steps
Column key:
Selects subclass: Does the method attempt to select the correct Step subclass based on arguments or config file contents?
Runs step: Does the method run the step before returning?
CRDS pars: Does the method incorporate parameters from CRDS pars files?
Config input: Does the method accept a path to a user config?
Override parameters: Does the method support overriding parameters on an individual basis?
Override style: If parameter overrides are supported, are they passed as standard keyword arguments or CLI-style arguments?
| Method | Selects subclass | Runs step | CRDS pars | Config input | Override parameters | Override style | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `__init__` | ✗ | ✗ | ✗ | ✗ | ✓ | keyword | Accepts a config_file argument but does not apply parameters from it. |
| `call` | ✗ | ✓ | ✓ | ✓ | ✓ | keyword | User config file passed as keyword argument. Config file's class field ignored. |
| `from_cmdline` | ✓ | ✓ | ✓ | ✓ | ✓ | CLI | Selects Step subclass based on config class field or class name argument. |
| `from_config_file` | ✓ | ✗ | ✗ | ✓ | ✗ | | Selects Step subclass based on config class field. |
| `from_config_section` | ✗ | ✗ | ✗ | ✓ | ✗ | | Probably not intended to be part of the public API. |
Methods for running steps
Column key:
Creates step: Does the method create the Step instance before running it?
| Method | Creates step | Notes |
| --- | --- | --- |
| `__call__` | ✗ | Alias for run. |
| `call` | ✓ | Python API only (not used by CLI code). |
| `from_cmdline` | ✓ | The strun script is a thin wrapper around this method. |
| `process` | ✗ | Subclass implementation method. Not intended to be called directly by general users. |
| `run` | ✗ | Eventually called by any method that needs to run the step. |
Suggestions for improvement
Eliminate the run-and-call methods, which both create a step and run it. They add little value and present a confusing interface in which step creation arguments and step run arguments are blended into one method signature.
Move methods that can return an instance of a different class than the one they were invoked on out of Step. This behavior is confusing and is better handled with module-level functions.
Remove from_cmdline and instead call the corresponding cmdline module method directly.
Rename process to make clear that it shouldn't be invoked by users. Maybe a name with a leading underscore, or something like run_impl.
Remove one of run or __call__ so that usage is uniform.
Rename the config_file argument of __init__ to something like working_dir, to make clear that the config is not loaded.
Change the CLI code to be a relatively thin wrapper around the Python interface (instead of parallel implementations like the current from_cmdline vs call). This will ensure consistency between the two interfaces. There is already some divergence between call and from_cmdline, e.g. the _pars_model attribute is not set by call, and call doesn't know how to select the Step subclass based on a config.
Pass around step parameters as a separate dict argument instead of **kwargs. This provides a clear separation between the parameters and other method arguments.
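To illustrate the last suggestion, here is a minimal sketch of the difference. `KwargsStep` and `DictStep` are hypothetical stand-ins written for this comment, not stpipe classes, and the parameter names are made up.

```python
class KwargsStep:
    """Hypothetical step that mixes parameters with other arguments."""

    def __init__(self, name=None, **kwargs):
        # A step parameter that happened to be called "name" could not be
        # passed here, because it collides with the constructor's own argument.
        self.name = name
        self.params = dict(kwargs)


class DictStep:
    """Hypothetical step that takes parameters in a dedicated dict."""

    def __init__(self, params=None, name=None):
        self.name = name
        # Copy so that later changes to the caller's dict do not leak in.
        self.params = dict(params or {})


mixed = KwargsStep(name="flat_field", threshold=3.0)
separate = DictStep(params={"name": "custom", "threshold": 3.0}, name="flat_field")
```

With the dict style, a parameter may share a name with a method argument without ambiguity, and the method signature shows at a glance which arguments are not step parameters.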
Possible new interface
Step.__init__(self, params=None, working_dir=None, ...): Parameters are passed to the initializer in a dict.
Step.call_impl(self, *args): Step subclass implementation.
Step.__call__(self, *args): Wrapper around call_impl that handles common setup and teardown.
stpipe.create_step(*, step_class=None, config_path=None, crds_params_enabled=True, dataset=None, params=None, working_dir=None, ...): Convenience function for creating steps. At least one of step_class or config_path is required to determine the step class. dataset is required if crds_params_enabled is True.
stpipe.cmdline.from_cmdline(args): Function that parses CLI arguments. Ends in a call to stpipe.create_step.
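A rough sketch of how create_step might validate its arguments and pick the step class. Everything here is an assumption made for illustration: the `Step` stand-in, the helpers `read_config`, `fetch_crds_params` and `resolve_class`, the flat `key = value` config format, and the merge order (CRDS pars, then user config, then explicit params). None of it is existing stpipe code.

```python
import importlib


class Step:
    """Hypothetical stand-in for the proposed Step base class."""

    def __init__(self, params=None, working_dir=None):
        self.params = dict(params or {})
        self.working_dir = working_dir


def read_config(config_path):
    """Hypothetical loader: parse flat 'key = value' lines into a dict of strings."""
    config = {}
    with open(config_path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                config[key.strip()] = value.strip().strip('"')
    return config


def fetch_crds_params(step_class, dataset):
    """Placeholder for a CRDS pars lookup; returns nothing in this sketch."""
    return {}


def resolve_class(qualified_name):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = qualified_name.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def create_step(*, step_class=None, config_path=None, crds_params_enabled=True,
                dataset=None, params=None, working_dir=None):
    if step_class is None and config_path is None:
        raise ValueError("one of step_class or config_path is required")
    if crds_params_enabled and dataset is None:
        raise ValueError("dataset is required when crds_params_enabled is True")

    config = read_config(config_path) if config_path is not None else {}
    if step_class is None:
        # Fall back to the config's class field to select the subclass.
        step_class = resolve_class(config["class"])
    config.pop("class", None)

    # Assumed precedence, lowest to highest: CRDS pars, user config, params.
    merged = {}
    if crds_params_enabled:
        merged.update(fetch_crds_params(step_class, dataset))
    merged.update(config)
    merged.update(params or {})
    return step_class(params=merged, working_dir=working_dir)
```

Because this is a module-level function, it can return an instance of whatever class the config names without the surprise of a classmethod returning a different class than the one it was invoked on.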
Run step from CLI
```python
from stpipe.cmdline import from_cmdline

# stpipe step run config.cfg dataset.asdf --foo=42
step, inputs = from_cmdline(args)
step(*inputs)
```
Run step from Python
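A sketch of what running a step from Python could look like under the proposed interface, either by instantiating the step class directly or by going through a factory function. The `Step` base class, `ScaleStep`, and the reduced `create_step` below are hypothetical stand-ins so that the example runs on its own; they are not the real stpipe implementation.

```python
class Step:
    """Hypothetical stand-in for the proposed base class."""

    def __init__(self, params=None, working_dir=None):
        self.params = dict(params or {})
        self.working_dir = working_dir

    def __call__(self, *args):
        return self.call_impl(*args)

    def call_impl(self, *args):
        raise NotImplementedError


class ScaleStep(Step):
    """Hypothetical step that multiplies its input by a 'factor' parameter."""

    def call_impl(self, values):
        factor = self.params.get("factor", 1.0)
        return [v * factor for v in values]


def create_step(*, step_class, params=None, working_dir=None):
    """Reduced stand-in for the proposed create_step (no config or CRDS handling)."""
    return step_class(params=params, working_dir=working_dir)


# Option 1: instantiate the step class directly, then call the instance.
step = ScaleStep(params={"factor": 2.0})
direct_result = step([1.0, 2.0, 3.0])

# Option 2: go through the factory function, then call the instance.
step = create_step(step_class=ScaleStep, params={"factor": 2.0})
factory_result = step([1.0, 2.0, 3.0])
```

In both cases creation and execution are separate calls, so creation arguments and run arguments never share a signature.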
Developing a step
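A sketch of how a step author might work with the proposed split between `__call__` and `call_impl`: the base class owns setup and teardown, and the subclass only implements `call_impl`. The logging-based setup and teardown, the logger name, and `ClipStep` with its `lower` and `upper` parameters are all assumptions for illustration.

```python
import logging

logger = logging.getLogger("stpipe_sketch")  # hypothetical logger name


class Step:
    """Hypothetical stand-in for the proposed base class."""

    def __init__(self, params=None, working_dir=None):
        self.params = dict(params or {})
        self.working_dir = working_dir

    def __call__(self, *args):
        # Common setup and teardown live here, so subclasses cannot skip them.
        logger.info("Starting %s", type(self).__name__)
        try:
            return self.call_impl(*args)
        finally:
            logger.info("Finished %s", type(self).__name__)

    def call_impl(self, *args):
        raise NotImplementedError("Step subclasses must implement call_impl")


class ClipStep(Step):
    """Hypothetical step that clips values to the range [lower, upper]."""

    def call_impl(self, values):
        lower = self.params.get("lower", 0.0)
        upper = self.params.get("upper", 1.0)
        return [min(max(v, lower), upper) for v in values]


clipped = ClipStep(params={"upper": 0.5})([-1.0, 0.25, 2.0])
```

Users only ever call the instance; `call_impl` stays an implementation detail, which is the intent behind renaming process.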