Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the possibility of defining default decorators for steps #1837

Merged
merged 10 commits into from
May 29, 2024

Conversation

romain-intel
Copy link
Contributor

Default decorators can be defined in one of two ways:

  • using the METAFLOW_DEFAULT_STEP_DECORATORS configuration (env var or configuration value). If this is set, it will be applied regardless of anything else. This is a comma separated string.
  • having extensions use TOGGLE_DEFAULT_STEP_DECORATORS. This is a list of strings that:
    • start with + or - (or something else which is equivalent to a +) -- the + means add a default decorator and a - means remove one.
    • are a decospec for the decorator to be added or removed.

I will add tests and try it out but putting it here for initial review to see if the method is OK.

@saikonen saikonen self-requested a review May 21, 2024 16:11
@romain-intel romain-intel changed the title [WIP] Add the possibility of defining default decorators for steps Add the possibility of defining default decorators for steps May 24, 2024
attrstr = ",".join(attr_list)
return "%s:%s" % (self.name, attrstr)
else:
return self.name

def __str__(self):
mode = "decorated" if self.statically_defined else "cli"
mode = "decorated" if self.statically_defined else "cli or configuration"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe undecorated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "undecorated" to be less specific. It doesn't really give much information. What do you think of "injected". We could also have "static" and "dynamic" simply. Wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - static and dynamic works well.

metaflow/cli.py Outdated
@@ -837,6 +859,7 @@ def version(obj):
multiple=True,
help="Add a decorator to all steps. You can specify this option "
"multiple times to attach multiple decorators in steps.",
callback=use_conf_and_merge_cb,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit - decospecs_callback

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't have a huge preference but my intent was to make this function useful for anything that comes from conf too. Maybe use_conf_for_multiple_cb. It also doesn't apply to all decospecs for example.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just config_callback?

savingoyal
savingoyal previously approved these changes May 28, 2024
Copy link
Collaborator

@saikonen saikonen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure on the cause, as no changes touch this codepath directly:
when running a flow remotely with f.ex. --with kubernetes, I'm getting to following in stdout

'nbrun' requires an interactive python environment (such as Jupyter)

same is not happening with master

This is in preparation of having default decorators injected by
the user and/or various metaflow extensions.
Mechanism is now as follows:
 - users can set METAFLOW_DECOSPECS in their configuration or in the environment
 - extensions can set DECOSPECS in their config or set a list of TOGGLE_DECOSPECS

METAFLOW_DECOSPECS/DECOSPECS is a string (space separated to form an array)

Precedence order (from first priority to last):
  - DECOSPECS from extension if it sets DECOSPECS = ...
  - METAFLOW_DECOSPECS in the environment
  - METAFLOW_DECOSPECS in the configuration
  - DECOSPECS (default from extension if it uses from_conf...)
  - If still not set after all these checks, form by adding all the
    TOGGLE_DECOSPECS together
metaflow/cli.py Outdated
Comment on lines 137 to 140
splits = DECOSPECS.split()
if len(splits) == len(value) and all([a == b for (a, b) in zip(splits, value)]):
return value
return tuple(DECOSPECS.split() + list(value))
Copy link
Collaborator

@saikonen saikonen May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this was decided already before, but noting some oddities regarding the hierarchy in the config.

test setup:

# metaflow config.json
"METAFLOW_DECOSPECS": "kubernetes:memory=3072"

# cmd
python HelloFlow.py run --with "kubernetes:memory=1024,cpu=2"

results in the higher memory value being used, and cpu omitted.


same setup with additional env var set

METAFLOW_DECOSPECS="kubernetes:memory=2048"

results in env var overriding value from config.json (as expected)


directly supplying values in a deco seems to override everything else as expected

@kubernetes(memory=512, cpu=3)

so the succession right now seems to be

explicit decorator in flow file
> METAFLOW_DECOSPEC env
> METAFLOW_DECOSPEC in config.json
> CLI --with

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only thing I found odd with the succession is treating an explicit CLI --with secondary when the config/envvar values are advertised as setting defaults only. My expectation would be that an option passed through CLI would perhaps override defaults

Disregard if this is intended behaviour :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed the behavior that is coded. There is no notion of "decorator merge" although there should be. For now though, I think this is good enough but going forward we should see how we can merge decorators. This is the same problem as the @Environment decorator where one overrides all others (and they don't merge). There is also no warning of duplicate decorators because they are all non static. In short, we have not changed the behavior at all but it can be a bit weird. You can try for example --with kubernetes:memory=2048 --with kubernetes:memory=1024,cpu=2, the second one will be ignored just like it is right now. You can view the default decospecs as stuff that is prepended basically. Am happy to make it post pended though if ppl prefer, as in:
explicit deco in flow file
CLI --with
METAFLOW_DECOSPEC env
METAFLOW_DECOSPEC in config.json.

Given we have no promises on any ordering right now, I don't view that as a huge issue and I think we need to more cleanly address it but that would be for another set of changes.

@romain-intel
Copy link
Contributor Author

@saikonen -- could you try again. I rebased on master and addressed @savingoyal 's comments. I suspect the error you are getting had to do with the previous changes for the runner and I didn't take all the "fixes" and master was, at some point, in a bad state.

@saikonen
Copy link
Collaborator

@saikonen -- could you try again. I rebased on master and addressed @savingoyal 's comments. I suspect the error you are getting had to do with the previous changes for the runner and I didn't take all the "fixes" and master was, at some point, in a bad state.

seems to be fixed now 👍

The order is now:
 - statically defined using a decorator
 - passed using --with
 - passed using METAFLOW_DECOSPECS in the environment variable
 - passed using METAFLOW_DECOSPECS in a configuration
 - passed using TOGGLE_DECOSPECS in the configuration

Note that an error will occur if:
 - a decorator is defined statically and dynamically and it doesn't allow multiple

An error will NOT be raised if:
 - a decorator is passed multiple times dynamically (newer ones are *ignored*)
./myflow.py run --with foo both work properly
Comment on lines +532 to +537
elif n == "TOGGLE_DECOSPECS":
if any([x.startswith("-") for x in o]):
raise ValueError("Removing decospecs is not currently supported")
if any(" " in x for x in o):
raise ValueError("Decospecs cannot contain spaces")
_TOGGLE_DECOSPECS.extend(o)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the latest on this? it seems we can only add decospecs for now through extensions, not remove them.

The usage is a bit different from the PR description as well, not requiring prepending a + for adding a decospec. This makes sense for now as - is not supported, but should we agree on the final form already, or is there a use case for this yet?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for adding decospecs the mechanism is working as expected 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@savingoyal had me remove the - option since he thought it would be confusing. It can easily be added back if we feel the need for it later (I don't need it right now, I had initially done the - to keep in line with the TOGGLE_STEP_DECORATOR for example (but that one I really needed the -)).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was more so wondering about the prefixes in general, as currently + is not supported either, so do we even need a check against -?

not working at the moment. Is this intended?

TOGGLE_DECOSPECS = ["+kubernetes:memory=1024"]

this works fine

TOGGLE_DECOSPECS = ["kubernetes:memory=1024"]

as its currently append-only, naming the var toggle is a bit confusing. We can keep it as is though if the end goal is to support toggling the decospecs.

@romain-intel romain-intel merged commit 3947a0b into master May 29, 2024
26 checks passed
@romain-intel romain-intel deleted the feat/default-decorators branch May 29, 2024 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants