[torchx/configs] Make runopts, Runopt, RunConfig, scheduler_args more consistent #250

@kiukchung

Description

Consolidate redundant names, classes, and arguments that represent scheduler RunConfig.

Motivation/Background

Currently there are different names for what essentially ends up being the additional runtime options for the torchx.scheduler (see dryrun(..., cfg: RunConfig)).

This run config:

  1. has class type torchx.specs.api.RunConfig (a dataclass)
  2. is named cfg or runcfg as a function argument in most places in the scheduler and runner source code
  3. is passed from the torchx run CLI as --scheduler_args

Additionally, each scheduler has what is called runopts: the run config options that the scheduler advertises and accepts (see runopts for local_scheduler).

The difference between RunConfig and runopts is that the RunConfig object is simply a holder for the user-provided config key-value pairs, while runopts is the schema (type, default, is_required, help string) of the configs that the scheduler takes. Think of runopts as the argparse.ArgumentParser of the Scheduler if it were a CLI tool, and RunConfig as the sys.argv[1:] (but a map instead of an array).
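To make the schema-versus-values distinction concrete, here is a minimal, self-contained sketch. It does not use the torchx classes: OptSpec, OptSchema, resolve, and the option names image and log_dir are hypothetical stand-ins chosen for illustration, and ConfigValue is a simplified alias.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Type, Union

# Assumption: simplified stand-in for the values a config entry may hold.
ConfigValue = Union[str, int, float, bool, List[str], None]


@dataclass
class OptSpec:
    """Hypothetical schema entry: one option a scheduler advertises."""
    opt_type: Type
    default: ConfigValue
    is_required: bool
    help_text: str


@dataclass
class OptSchema:
    """Hypothetical stand-in for runopts: the schema, keyed by option name."""
    specs: Dict[str, OptSpec] = field(default_factory=dict)

    def add(self, name, opt_type, help_text, default=None, required=False):
        self.specs[name] = OptSpec(opt_type, default, required, help_text)

    def resolve(self, cfg: Dict[str, ConfigValue]) -> Dict[str, ConfigValue]:
        """Check user-provided values against the schema and fill in defaults."""
        resolved = dict(cfg)
        for name, spec in self.specs.items():
            value = resolved.get(name)
            if value is None:
                if spec.is_required:
                    raise ValueError(f"missing required option: {name}")
                resolved[name] = spec.default
            elif not isinstance(value, spec.opt_type):
                raise TypeError(f"{name} must be of type {spec.opt_type.__name__}")
        return resolved


# The schema plays the role of the ArgumentParser ...
schema = OptSchema()
schema.add("image", str, "container image to run", required=True)
schema.add("log_dir", str, "where to write logs", default="/tmp/logs")

# ... and the plain map plays the role of sys.argv[1:].
user_cfg = {"image": "example:latest"}
resolved = schema.resolve(user_cfg)
```

In this sketch the user-provided map carries no type or default information of its own; everything the scheduler knows about an option lives in the schema.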

Detailed Proposal

The proposal is to clean up the nomenclature as follows:

  1. Deprecate the --scheduler_args option in the torchx CLI and instead call it --cfg (consistent with the parameter names in the Scheduler API).
  2. Change the section name in the runner INI config files from [$profile.scheduler_args.$sched_name] to [$profile.$scheduler_name.cfg] (e.g. [default.scheduler_args.local_cwd] would become [default.local_cwd.cfg]).
  3. Rename Runopt to runopt (to be consistent with runopts, which is a holder for runopt by name).
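The section rename in item 2 could be rolled out with a fallback to the old name during the deprecation period. The sketch below uses only the standard-library configparser. The helper load_scheduler_cfg and the log_dir key are hypothetical and are not part of torchx; only the two section-name patterns come from the proposal.

```python
import configparser
from typing import Dict

OLD_STYLE = """
[default.scheduler_args.local_cwd]
log_dir = /tmp/logs
"""

NEW_STYLE = """
[default.local_cwd.cfg]
log_dir = /tmp/logs
"""


def load_scheduler_cfg(ini_text: str, profile: str, scheduler: str) -> Dict[str, str]:
    """Hypothetical helper: read a scheduler's cfg, preferring the new section name."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)

    new_section = f"{profile}.{scheduler}.cfg"
    old_section = f"{profile}.scheduler_args.{scheduler}"  # deprecated name

    if parser.has_section(new_section):
        return dict(parser[new_section])
    if parser.has_section(old_section):
        return dict(parser[old_section])
    return {}


old_cfg = load_scheduler_cfg(OLD_STYLE, "default", "local_cwd")
new_cfg = load_scheduler_cfg(NEW_STYLE, "default", "local_cwd")
```

Both calls yield the same key-value map, so existing config files would keep working while users migrate to the new section name.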

Alternatives

(not really an alternative but other deeper cleanups considered)

  1. Changing the cfg parameter name in the Scheduler and Runner interfaces to runconfig (consistent with RunConfig), or alternatively renaming RunConfig to RunCfg. This would be a huge codemod, hence I've decided to live with it and change the rest of the settings to match cfg.
  2. RunConfig is simply a wrapper around a regular Python Dict[str, ConfigValue] (ConfigValue is a type alias, not an actual class) and does not provide any additional functionality on top of the dict other than a pretty-printing __repr__(). I am considering dropping the RunConfig dataclass and using Dict[str, ConfigValue] directly (this also requires a huge codemod).
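As a rough illustration of alternative 2, the sketch below compares a thin dataclass wrapper with a plain dict. RunConfigSketch, its cfgs field, and its set/get methods are assumptions made for this example and may not match the actual torchx.specs.api.RunConfig. The ConfigValue alias is likewise simplified.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# Assumption: simplified version of the ConfigValue type alias.
ConfigValue = Union[str, int, float, bool, List[str], None]


@dataclass
class RunConfigSketch:
    """Hypothetical thin wrapper: a dict plus a pretty-printing __repr__."""
    cfgs: Dict[str, ConfigValue] = field(default_factory=dict)

    def set(self, name: str, value: ConfigValue) -> None:
        self.cfgs[name] = value

    def get(self, name: str) -> ConfigValue:
        return self.cfgs.get(name)

    def __repr__(self) -> str:
        # The only behavior the wrapper adds on top of the dict.
        return ", ".join(f"{k}={v!r}" for k, v in self.cfgs.items())


# Wrapper-based usage.
wrapped = RunConfigSketch()
wrapped.set("log_dir", "/tmp/logs")

# Equivalent plain-dict usage: same data, no extra class.
plain: Dict[str, ConfigValue] = {"log_dir": "/tmp/logs"}
```

Under these assumptions the wrapper and the dict hold identical data, which is the argument for dropping the class. The cost is touching every call site that currently constructs or accepts the wrapper.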

Additional context/links

See hyperlinks above.
