
Question: How to implement CLI shortcuts for nested configs? #2022

Closed
jbaczek opened this issue Feb 9, 2022 · 13 comments
Comments

@jbaczek (Contributor) commented Feb 9, 2022

So far I have resolved tons of config management issues with OmegaConf's custom resolvers. It isn't the neatest solution, but from the user's perspective it gets the job done. But there is one thing that I cannot do: I want to create a shortcut (alias) in the top-level config that resolves to a full path. Example:

trainer:
    criterion:
        _target_: [...]
        arg: 1337
    config:
        batch_size: 32

Typing python hydra_app.py trainer.config.batch_size=32 trainer/criterion=L1 is really cumbersome. I'd like to be able to type this interchangeably: python hydra_app.py batch_size=32 criterion=L1. Is there any way to do this?
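
For reference, the resolvers I mention above are registered roughly like this (a simplified, made-up example, not one of my actual resolvers):

from omegaconf import OmegaConf

# Made-up example: compute a derived value from other config fields,
# used in YAML as e.g. ${mul:${trainer.config.batch_size},2}
OmegaConf.register_new_resolver("mul", lambda a, b: a * b)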

@pixelb (Contributor) commented Feb 9, 2022

I'm not sure that could be achieved in a general way.
One could have a search mechanism, but that would be very brittle: if the config changed, other items might start to match.

@Jasha10 (Collaborator) commented Feb 10, 2022

At least for batch_size, wouldn't interpolation to a top-level key do the trick?

trainer:
    criterion:
        _target_: [...]
        arg: 1337
    config:
        batch_size: ${batch_size}
batch_size: 32

This would allow python hydra_app.py batch_size=64.

@jbaczek (Contributor, Author) commented Feb 10, 2022

This solution would leave batch_size in the root-level config. I want to keep my final config tidy (I use it for recursive instantiation).

@Jasha10 (Collaborator) commented Feb 10, 2022

Got it. In that case, the only workarounds I can think of are:

  • deleting the "aliases" before instantiating, or
  • using the root level as a "scratch space", and having the instantiated config be nested one level deeper.

deleting the top-level "aliases" before instantiating:

from typing import List

import hydra
from hydra.utils import instantiate
from omegaconf import OmegaConf

aliases: List[str] = ["batch_size"]

@hydra.main(...)
def app(cfg):
    OmegaConf.resolve(cfg)
    for alias in aliases:
        if alias in cfg:
            del cfg[alias]
    instantiate(cfg)

Edit: Added a call to OmegaConf.resolve before deleting the aliases.

using the root level as a "scratch space":

instantiate_me:
  trainer:
      criterion:
          _target_: [...]
          arg: 1337
      config:
          batch_size: ${batch_size}
# scratch:
batch_size: 32

@hydra.main(...)
def app(cfg):
    instantiate(cfg.instantiate_me)

@jbaczek (Contributor, Author) commented Feb 11, 2022

This will work with recursive instantiation, but in turn it won't work with duplicate config detection. In experiment_dir/.hydra/config.yaml we have the unresolved config, so the config created with python hydra_app.py batch_size=64 would be different from the one created with python hydra_app.py trainer.config.batch_size=64 (the first contains the scratch-space key and the interpolation; in the latter the scratch-space batch_size would differ from trainer.config.batch_size). I use these configs to detect whether my experiment has already been performed or killed, so I can avoid doing unnecessary computation. I need a solution that is resolved at config construction.

@Jasha10 (Collaborator) commented Feb 12, 2022

I use these configs to detect whether my experiment has already been performed or killed, so I can avoid doing unnecessary computation. I need a solution that is resolved at config construction.

Interesting! Would you be willing to share the code that you use for comparing an experiment with previous experiments? I'm wondering whether a strategically-placed call to OmegaConf.resolve (in the comparison logic) would smooth over the differences between python hydra_app.py batch_size=64 and python hydra_app.py trainer.config.batch_size=64. I'm also curious to see the method you're using to re-load the config from a previous run (as this is an area where Hydra might be improved).

@jbaczek (Contributor, Author) commented Feb 14, 2022

My approach is really simple and not very pretty :) First I set the root of the search to the outputs directory. Then I use os.walk to find all Hydra configs in the tree. I don't need any resolution in my case because all interpolations are computed from other fields, so if the formulas are the same then their values will be the same as well. I leverage the fact that configs have a string representation and compare them that way.

import os

from hydra.utils import get_original_cwd
from omegaconf import OmegaConf


def detect_duplicated_run():
    """
    Returns a list of paths of runs with the same config as the current one
    """
    # This is meant to be called in a trainer class, which means it doesn't have access to the top-level config
    current_config = OmegaConf.load('.hydra/config.yaml')
    # I know it's bad, but the output dirs for run and multirun aren't held in the same place in the config
    rel = os.path.relpath(os.getcwd(), get_original_cwd())
    rel = next(x for x in rel.split(os.path.sep))
    result_dir = os.path.join(get_original_cwd(), rel)

    duplicated = []
    for p, s, f in os.walk(result_dir):
        if '.hydra' in s:
            c = OmegaConf.load(os.path.join(p, '.hydra/config.yaml'))
            if hash(c) == hash(current_config):
                duplicated.append(p)
    # Don't take into account runs that ended before any checkpoint had been saved,
    # or the current run (at this point hydra's config has already been saved)
    duplicated = [p for p in duplicated if os.path.exists(os.path.join(p, 'last_checkpoint.zip'))]

    return duplicated

This gives me potential candidates for duplicated experiments. Then I do artifact checks, and then I check the appropriate log file to know from which point I should resume my experiment.

@Jasha10 (Collaborator) commented Feb 14, 2022

Cool, thanks for sharing!

I'm wondering whether a strategically-placed call to OmegaConf.resolve (in the comparison logic) would smooth over the differences ...

My idea was to modify detect_duplicated_run as below:

def detect_duplicated_run():
    current_config = OmegaConf.load('.hydra/config.yaml')
    OmegaConf.resolve(current_config)
    for alias in aliases:
        current_config.pop(alias, None)
    ...
        if '.hydra' in s:
            OmegaConf.resolve(c)
            for alias in aliases:
                c.pop(alias, None)
    ...

This has the drawback that you'd need the same aliases list to be compatible with both your current config and all of your past configs -- it doesn't solve the problem of experiment-specific aliases.

@Jasha10 (Collaborator) commented Feb 14, 2022

What if you use a top-level key in your config to store a list of experiment-specific alias keys?
You could then use the following logic in your main function:

@hydra.main(...)
def app(cfg):
    OmegaConf.resolve(cfg)
    for alias in cfg.pop("_aliases", []):
        if alias in cfg:
            del cfg[alias]
    instantiate(cfg)

And you could add something like this to your duplicate config detection function (before the call to hash):

def detect_duplicated_run():
    current_config = OmegaConf.load('.hydra/config.yaml')
    OmegaConf.resolve(current_config)
    for alias in current_config.pop("_aliases", []):
        current_config.pop(alias, None)
    ...
        if '.hydra' in s:
            OmegaConf.resolve(c)
            for alias in c.pop("_aliases", []):
                c.pop(alias, None)
    ...

@Jasha10 Jasha10 closed this as completed Feb 14, 2022
@Jasha10 Jasha10 reopened this Feb 14, 2022
@Jasha10 (Collaborator) commented Feb 14, 2022

Closed accidentally.

@jbaczek (Contributor, Author) commented Feb 15, 2022

The idea of maintaining a list of aliases in the root config is a really good one, I think. In my application I have a couple of different entry points which reuse config files (for training/inference/deployment/preprocessing), so I'd have to add a couple of config-specific Python lines to each of them (roughly the helper sketched below). This feels like it should be handled by Hydra. Should I open a feature request?
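
Something like this shared helper is what I have in mind (just a sketch; the name and placement are made up):

from omegaconf import DictConfig, OmegaConf, open_dict

def strip_aliases(cfg: DictConfig) -> DictConfig:
    # Resolve interpolations, then drop the keys listed under cfg._aliases
    OmegaConf.resolve(cfg)
    with open_dict(cfg):  # allow pop/delete even if the config is in struct mode
        for alias in cfg.pop("_aliases", []):
            cfg.pop(alias, None)
    return cfg

Each entry point (training, inference, deployment, preprocessing) would then call strip_aliases(cfg) right before instantiate(cfg).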

@Jasha10 (Collaborator) commented Feb 15, 2022

I can't promise that we'd implement such a feature; it can be implemented on the user side or by a third-party library, and implementing it on the Hydra side would involve committing to a new top-level reserved key (e.g. _aliases). That being said, if you're inclined to open a feature request, that would certainly be welcome, and it would give us a chance to see whether there is broad demand for such a feature among the community of Hydra users.
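
For example, one user-side approach (just an untested sketch with made-up names) would be to expand the short overrides into their full paths before Hydra parses the command line. Since this happens before config composition, the saved .hydra/config.yaml would be identical whether the short or the full form was typed:

import sys

from hydra_app import app  # hypothetical: the @hydra.main-decorated entry point

# Hypothetical alias map: short CLI key -> full override path
ALIASES = {
    "batch_size": "trainer.config.batch_size",
    "criterion": "trainer/criterion",
}

def expand_aliases(argv):
    expanded = []
    for arg in argv:
        key, sep, value = arg.partition("=")
        if sep and key in ALIASES:
            expanded.append(f"{ALIASES[key]}={value}")
        else:
            expanded.append(arg)
    return expanded

if __name__ == "__main__":
    # Rewrite the short overrides in place, then hand control to the Hydra app
    sys.argv[1:] = expand_aliases(sys.argv[1:])
    app()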

@jbaczek (Contributor, Author) commented Feb 16, 2022

Moving discussion to the FR #2032
