Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OR Override-if-exists #1049

Closed
tkornuta-nvidia opened this issue Oct 9, 2020 · 5 comments · Fixed by #1440
Closed

Add OR Override-if-exists #1049

tkornuta-nvidia opened this issue Oct 9, 2020 · 5 comments · Fixed by #1440
Labels
enhancement Enhanvement request
Milestone

Comments

@tkornuta-nvidia
Copy link

🚀 Feature Request

Add option to "Add-OR-Override-if-exists" to command line params

e.g.

python myscript.py !gpus=2 # gpus will be set to two, disregarding whether it existed in the original config file or not

Motivation

Hydra offers few options to handle params from command line:

python myscript.py gpus=2 # override gpus (must exists in config files/structured config earlier!)
python myscript.py ~gpus # removes gpus from configuration 
python myscript.py +gpus=2 # add gpus (cannot exists in config files/structured config earlier!)

Note +gpus will raise an error if it was already present in the configuration.

So the case, when we want to have gpus set to 2, disregarding whether it was present in yaml file or now, is missing.

@tkornuta-nvidia tkornuta-nvidia added the enhancement Enhanvement request label Oct 9, 2020
@omry
Copy link
Collaborator

omry commented Oct 10, 2020

I understand the request, but t looks like you are trying to work around incomplete configuration.
If you are always expecting the config to have something, you can make it happen in multiple ways.
for example, with Structured Configs - you can extend a base config that will ensure it's there.
You can also require your users to include a nemo node in their top level schema:

@dataclass
class NemoConfig:
  gpus: int = 1

# composing (I think it's better here):
@dataclass
class AppConfig:
  nemo : NemoConfig = NemoConfig()

# or extending
@dataclass
class AppConfig2(NemoConfig):
  ...

@tkornuta-nvidia
Copy link
Author

My use-case/motivation is slightly different:

  1. I have an entrypoint script that calls several "task scripts" doing different jobs
  2. entrypoint script has a standardized list of arguments (argparse - due to external requirements)
  3. each "task script" is hydra-driven
  4. the list of scripts is not constant, pre-defined. It is created on the fly, by analyzing the content of the folder.

The problem was that some scripts assumed presence of a particular parameter ("gpus") in yaml files, and the other did not.

So I did not wanted to create a big "if-else" case in the entrypoint script.
With construct:

if xxx then args += " gpus=1"

some scripts were working, while in others hydra was raising errors (as gpus wasn't present in some yaml files).

Analogically, with:

if xxx then args += " +gpus=1"

some other scripts weren't working.

I have solved that by using Structured Configs (as in Feature Request 1048), but still the problem of having a parameter disregarding it added or overriden remains...

@omry
Copy link
Collaborator

omry commented Oct 20, 2020

When you are overriding gpus=1 on the target process, you are communicating using a protocol.
Something on the other side needs to use this information, right?

If the script does support multigpus but it calls it devices or gpu_ids or something else it will also break. It HAS to be gpus.

My proposal is to require scripts that re being executed via the entrypoint script to follow a protocol.
If you were using argparse, you would HAVE to add a --gpus option to it, right?
Why is it too much to require that Hydra apps also add a parallel gpus flag in their configuration?
Hydra makes it possible to more easily reuse the common configuration to standardize the command line protocol. (using one of the methods I outlined in #1048).
Once you do that, you can just set the gpus without a + prefix, assuming it's there.
(It would also probably be something like trainer.gpus=1 or nemo.gpu=1 to avoid abusing the global config namespace, but that's a style choice).

Having said all that, this may still be a useful thing to support. but I am not sure your use-case is convincing enough.
There is related issue #1052, where there is a different problem with the current behavior in the context of sweeps.
I suggested a workaround there of using a dictionary override (but that would not work for your gpus example because merging dictionary to the top level config node is not supported).
Take a look at that issue and see if the suggested workaround can help you.

@tkornuta-nvidia
Copy link
Author

Hey @omry thanks for the reply.

So I have solved that issue with Structured Configs - but I brought up that example just to indicate the problem.

If the script does support multigpus but it calls it devices or gpu_ids or something else it will also break. It HAS to be gpus.
...

Yap, your thinking is 100% correct.

But please note one thing:

  • when you override param by command-line argument "my.magic.param=z" then you validate the presence of the param in the original configuration (which is awesome!)

  • by adding "+" you already opened the door for additional parameters, that weren't in the original configuration. So whenever the user adds "+my.new.param=x" then there are guaranties that the python script will use that param at all.

I am just saying that I am missing solution when one wants to override the parameters disregarding its presence in the original configuration. Hmmmm, when the problem is defined like this, I am starting to think that maybe the solution is to stay with "+", but add to hydra option similar to arg_parse conflict_handler and let the user to decide whether "+" should only add or also override?

@omry
Copy link
Collaborator

omry commented Oct 28, 2020

As I pointed out in the comment on #1052, a solution here is to use dictionary merge, which has a different semantics.

$ python examples/tutorials/basic/your_first_hydra_app/5_defaults/my_app.py '+db={gpus:10,driver:gpudb}'
db:
  driver: gpudb
  user: omry
  pass: secret
  gpus: 10

This covers you if you have a key though, I don't think you can use this to add a top level key.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhanvement request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants