[Bug] Multirun does not work with differing parameter sets groups #1052

goens · 2020-10-11T19:42:42Z

🐛 Bug

Description

When running multirun over two config group options, it's impossible to sweep a variable that exists in only one of them. You might not classify this as a bug, but I believe it is, since it makes multirun unusable in such a case. The example below should make this clearer.

Checklist

I checked on the latest version of Hydra
I created a minimal repro

To reproduce

** Minimal Code/Config snippet to reproduce **
Imagine you have two database systems (as in the documentation). I'll call them, very creatively, one and two.
You could have a basic setup as follows:

main.py
config.yaml
db/one.yaml
db/two.yaml

This is the content of the files in such a minimal config:

main.py:

import hydra
@hydra.main(config_name='config')
def main(cfg):
    print(cfg['db'])
main()

config.yaml:

# @package _global_
defaults:
  - db: one

db/one.yaml:

#@package _group_
shared_feature : true

db/two.yaml:

#@package _group_
shared_feature: false
fancy_exclusive_feature : false

Say you have a shared feature, shared_feature, and you want to run your program with both databases by iterating that shared feature. You can run something like:

$> python main.py db=one,two db.shared_feature=true,false -m                                                                                                                                                   
[2020-10-11 21:31:17,032][HYDRA] Launching 4 jobs locally
[2020-10-11 21:31:17,032][HYDRA] 	#0 : db=one db.shared_feature=True
{'shared_feature': True}
[2020-10-11 21:31:17,224][HYDRA] 	#1 : db=one db.shared_feature=False
{'shared_feature': False}
[2020-10-11 21:31:17,409][HYDRA] 	#2 : db=two db.shared_feature=True
{'shared_feature': True, 'fancy_exclusive_feature': False}
[2020-10-11 21:31:17,597][HYDRA] 	#3 : db=two db.shared_feature=False
{'shared_feature': False, 'fancy_exclusive_feature': False}

This works perfectly fine. Now assume I want to run one of the two databases, two, with a fancy exclusive feature which the other one does not have. I have two options:

python main.py db=one,two +db.fancy_exclusive_feature=true,false -m

which could potentially add it to the first one (and hopefully not break anything, just unnecessarily run it twice with it). Or the other option:

python main.py db=two,one db.fancy_exclusive_feature=true,false -m

Here it would change the value only for two and ideally produce three runs: one with db=one, and two with db=two (db.fancy_exclusive_feature=true and false).

However, neither option works. Here are the corresponding error messages:

** Stack trace/error message **

$> python main.py db=one,two db.fancy_exclusive_feature=true,false -m
Could not override 'db.fancy_exclusive_feature'.
To append to your config use +db.fancy_exclusive_feature=True
Key 'fancy_exclusive_feature' is not in struct
	full_key: db.fancy_exclusive_feature
	reference_type=Optional[Dict[Union[str, Enum], Any]]
	object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

and in the other case:

$> python main.py db=one,two +db.fancy_exclusive_feature=true,false -m                                                                                                                                         
Could not append to config. An item is already at 'db.fancy_exclusive_feature'.

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Apparently Hydra checks that the override db.fancy_exclusive_feature is valid for every choice of db in the full "product" of configurations, so either way one combination fails and the whole multirun fails.
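
To make the failure mode concrete, here is a minimal pure-Python sketch of what I think is happening. It is only a toy model, not Hydra's actual code: `DB_CONFIGS`, `apply_override`, `sweep` and `failing_dbs` are hypothetical names, and the dictionaries stand in for db/one.yaml and db/two.yaml.

```python
from itertools import product

# Toy stand-ins for db/one.yaml and db/two.yaml (not loaded through Hydra).
DB_CONFIGS = {
    "one": {"shared_feature": True},
    "two": {"shared_feature": False, "fancy_exclusive_feature": False},
}


def apply_override(cfg, key, value, append):
    """Return a copy of cfg with key set, modelling the two strict override forms.

    append=False models `db.key=value` (the key must already exist);
    append=True models `+db.key=value` (the key must not exist yet).
    """
    if append and key in cfg:
        raise KeyError(f"item already at db.{key}")
    if not append and key not in cfg:
        raise KeyError(f"db.{key} is not in struct")
    return {**cfg, key: value}


def sweep(dbs, key, values, append):
    """Expand the full product dbs x values and record each job's outcome."""
    results = []
    for db, value in product(dbs, values):
        try:
            outcome = apply_override(DB_CONFIGS[db], key, value, append)
        except KeyError as err:
            outcome = err
        results.append((db, value, outcome))
    return results


def failing_dbs(results):
    """Names of the db options for which at least one job was rejected."""
    return sorted({db for db, _, outcome in results if isinstance(outcome, KeyError)})


key = "fancy_exclusive_feature"
override_results = sweep(["one", "two"], key, [True, False], append=False)
append_results = sweep(["one", "two"], key, [True, False], append=True)

print(failing_dbs(override_results))  # ['one']
print(failing_dbs(append_results))    # ['two']
```

In this model each strict form is rejected by exactly one of the two db options, which matches the two error messages above.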

Expected Behavior

Ideally, if this item exists in one option and not in the rest, Hydra would apply the override only where it makes sense, producing three configurations when running:

python main.py db=two,one db.fancy_exclusive_feature=true,false -m

It is also conceivable that the other option, which appends the feature, would produce four configurations:

python main.py db=two,one +db.fancy_exclusive_feature=true,false -m
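
The three-configuration expansion I have in mind can be sketched in plain Python as follows. This illustrates the requested behavior only; it is not something Hydra does, and `DB_CONFIGS` and `conditional_sweep` are hypothetical names.

```python
# Toy stand-ins for db/one.yaml and db/two.yaml (not loaded through Hydra).
DB_CONFIGS = {
    "one": {"shared_feature": True},
    "two": {"shared_feature": False, "fancy_exclusive_feature": False},
}


def conditional_sweep(dbs, key, values):
    """Sweep `key` only for the dbs that define it; run the others once, unchanged."""
    jobs = []
    for db in dbs:
        base = DB_CONFIGS[db]
        if key in base:
            # The key exists here: one job per swept value.
            jobs.extend((db, {**base, key: value}) for value in values)
        else:
            # The key does not exist here: a single job with the untouched config.
            jobs.append((db, dict(base)))
    return jobs


jobs = conditional_sweep(["two", "one"], "fancy_exclusive_feature", [True, False])
for db, cfg in jobs:
    print(db, cfg)
```

This yields two jobs for db=two and a single, unmodified job for db=one.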

System information

Hydra Version : 1.0.3
Python version : 3.8.6
Virtual environment type and version : virtualenv 20.0.29+ds
Operating system : debian sid (unstable)


omry · 2020-10-11T20:06:11Z

Thanks for the report, this is more of a design issue than a bug.
You can try overriding with a dictionary, which behaves differently: it is a merge rather than an assignment, so the semantics allow it if you use +.
You can learn more about dictionaries as overrides here.

Example (here timeout originally exists only in postgresql).

$ python my_app.py  -m '+db={timeout:10}' db=mysql,postgresql
[2020-10-11 13:00:47,752][HYDRA] Launching 2 jobs locally
[2020-10-11 13:00:47,752][HYDRA]        #0 : +db={timeout:10} db=mysql
db:
  driver: mysql
  user: omry
  pass: secret
  timeout: 10

[2020-10-11 13:00:47,860][HYDRA]        #1 : +db={timeout:10} db=postgresql
db:
  driver: postgresql
  user: postgre_user
  pass: drowssap
  timeout: 10
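
As a rough illustration of merge versus assignment, the effect can be modelled with plain dictionaries. This is a simplified sketch and not how Hydra or OmegaConf implement it; `merge_group` is a hypothetical helper, and the pre-existing postgresql timeout of 20 is a made-up placeholder.

```python
def merge_group(base, patch):
    """Return base with patch merged on top: new keys are added, existing ones replaced."""
    return {**base, **patch}


# Values mirror the output above, except the original postgresql timeout,
# which is a hypothetical placeholder chosen to make the replacement visible.
mysql = {"driver": "mysql", "user": "omry", "pass": "secret"}
postgresql = {
    "driver": "postgresql",
    "user": "postgre_user",
    "pass": "drowssap",
    "timeout": 20,
}

merged = {
    name: merge_group(cfg, {"timeout": 10})
    for name, cfg in (("mysql", mysql), ("postgresql", postgresql))
}

for name, cfg in merged.items():
    print(name, cfg)
```

Because the merge neither requires nor forbids the key in the target, the same override is accepted by both options.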

I will keep this open to consider changing the behavior of + (or extending it) for 1.1.

goens · 2020-10-11T20:08:38Z

that makes sense, thanks!

omry · 2020-10-11T20:10:58Z

Great!
Dictionary and list support in overrides is one of the new things in 1.0. Spend some time reading the override grammar documentation; it has many new and awesome features (also for multirun, like glob, shuffle and more).

omry · 2021-04-01T22:42:07Z

#1440 adds support for force-adding a config variable (++foo.bar=10).
I think it provides a reasonable solution to this. Please try and reopen if you run into issues.
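
Applied to the example from the report, and assuming a Hydra release that includes #1440, a sweep such as `python main.py db=one,two ++db.fancy_exclusive_feature=true,false -m` should be accepted for both db options. A toy model of the force-add semantics (plain Python, not Hydra code; `DB_CONFIGS` and `force_add` are hypothetical names):

```python
from itertools import product

# Toy stand-ins for db/one.yaml and db/two.yaml from the report.
DB_CONFIGS = {
    "one": {"shared_feature": True},
    "two": {"shared_feature": False, "fancy_exclusive_feature": False},
}


def force_add(cfg, key, value):
    """Model of `++db.key=value`: override the key if present, add it otherwise."""
    return {**cfg, key: value}


force_jobs = [
    (db, value, force_add(DB_CONFIGS[db], "fancy_exclusive_feature", value))
    for db, value in product(["one", "two"], [True, False])
]

for db, value, cfg in force_jobs:
    print(db, value, cfg)
```

Note that this gives four jobs, with the feature also added to db=one; it does not reduce the sweep to three configurations.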

IlyasMoutawwakil · 2023-05-02T01:36:47Z

Hi @goens, did you find a way to produce only 3 configurations?

omry · 2023-05-02T19:46:24Z

You can define 3 experiment configs and sweep over them.

https://hydra.cc/docs/patterns/configuring_experiments/

goens added the bug Something isn't working label Oct 11, 2020

omry added enhancement Enhanvement request and removed bug Something isn't working labels Oct 11, 2020

omry added this to the 1.1.0 milestone Oct 11, 2020

omry mentioned this issue Oct 20, 2020

Add OR Override-if-exists #1049

Closed

omry closed this as completed Apr 1, 2021

IlyasMoutawwakil mentioned this issue May 2, 2023

Question: How to sweep over alpha and beta samuelstanton/hydra-ray-demo#1

Open
