[Feature Request] Allow selecting config group in another config group #1091

Closed
janvainer opened this issue Oct 25, 2020 · 4 comments
Labels
enhancement Enhanvement request

Comments

@janvainer

🚀 Feature Request

I would like to be able to select an option for a given config group in another config group. I would also like to be able to instantiate all necessary nested objects recursively with a single command.

Motivation

I want to use as many of the classes in my projects as possible as config groups in Hydra configs. I also want to compose most of the objects automatically with a single instantiate call. Right now I have a defaults list in the root config file, and I instantiate and compose all the objects one by one in my main script. This lets me select each class from a config group easily (python my_app.py optionA=ClassX optionB=ClassY), but it does not help with recursive instantiation. As the project grows, I add more and more configs to the defaults list and more instantiate statements to my_app.py, even though most of the instantiated objects are simply passed to a few container objects and could be instantiated recursively. However, if I nest the configs for the contained objects inside the container objects, I lose the ability to treat the contained objects as config groups, which is frustrating.

Pitch

Describe the solution you'd like
For example, I have Tokenizer classes and Normalizer classes. I would like to be able to do the following:

config/
  config.yaml
  tokenizer/
    CharTokenizer.yaml
    SentencePieceTokenizer.yaml
  normalizer/
    EnglishTextNormalizer.yaml
    GermanTextNormalizer.yaml

<config.yaml>
defaults:
    - tokenizer: CharTokenizer

<CharTokenizer.yaml> (and similar for SentencePiece tokenizer)
_target_: some.path.CharTokenizer
tokens: ["a", "b", "c"]
normalizer: ${EnglishTextNormalizer}  # this should simply copy the contents of the normalizer config here - other syntax could be used

<EnglishTextNormalizer.yaml>
_target_: some.other.path.EnglishTextNormalizer
lowercase: true
remove_punctuation: true

<my_app.py>

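import hydra
from hydra.utils import instantiate

from some.path import CharTokenizer
from some.other.path import EnglishTextNormalizer

@hydra.main(config_path="config", config_name="config")
def main(cfg):
    # this
    tokenizer = instantiate(cfg.tokenizer)
    # should be equal to this
    tokenizer = CharTokenizer(
        tokens=["a", "b", "c"],
        normalizer=EnglishTextNormalizer(lowercase=True, remove_punctuation=True),
    )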

Then I would simply call python my_app.py tokenizer.normalizer=GermanTextNormalizer to switch the normalizer config group option from the English normalizer to the German one.
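To make the requested behavior concrete, here is a minimal stdlib-only sketch of recursive instantiation over nested `_target_` dictionaries. It is not Hydra's implementation: the `build` function and the `TARGETS` registry are hypothetical names, the registry stands in for importing the dotted path, and the two dataclasses are placeholders for the real tokenizer and normalizer classes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class EnglishTextNormalizer:
    lowercase: bool
    remove_punctuation: bool


@dataclass
class CharTokenizer:
    tokens: List[str]
    normalizer: Any


# Hypothetical registry standing in for importing the dotted `_target_` path.
TARGETS: Dict[str, Callable[..., Any]] = {
    "some.path.CharTokenizer": CharTokenizer,
    "some.other.path.EnglishTextNormalizer": EnglishTextNormalizer,
}


def build(node: Any) -> Any:
    """Recursively turn dicts that carry a `_target_` key into objects."""
    if isinstance(node, list):
        return [build(item) for item in node]
    if not isinstance(node, dict):
        return node
    # Build the children first so nested objects exist before the parent.
    kwargs = {key: build(value) for key, value in node.items() if key != "_target_"}
    if "_target_" not in node:
        return kwargs
    return TARGETS[node["_target_"]](**kwargs)


# The composed tokenizer config, with the normalizer config nested inside it.
tokenizer_cfg = {
    "_target_": "some.path.CharTokenizer",
    "tokens": ["a", "b", "c"],
    "normalizer": {
        "_target_": "some.other.path.EnglishTextNormalizer",
        "lowercase": True,
        "remove_punctuation": True,
    },
}

tokenizer = build(tokenizer_cfg)
```

A single `build` call on the outer config yields a `CharTokenizer` that already holds its `EnglishTextNormalizer`, which is the "one instantiate call" behavior asked for above.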

Describe alternatives you've considered
Right now, if I want to use both Tokenizer and Normalizer classes as config groups, I have to instantiate the objects separately and pass the normalizer to the tokenizer as an argument inside my python code. This becomes cumbersome as the project grows.

config/
  config.yaml
  tokenizer/
    CharTokenizer.yaml
    SentencePieceTokenizer.yaml
  normalizer/
    EnglishTextNormalizer.yaml
    GermanTextNormalizer.yaml

<config.yaml>
defaults:
    - tokenizer: CharTokenizer
    - normalizer: EnglishTextNormalizer

<CharTokenizer.yaml> (and similar for SentencePiece tokenizer)
_target_: some.path.CharTokenizer
tokens: ["a", "b", "c"]
normalizer: ${EnglishTextNormalizer}  # this should simply copy the contents of the normalizer config here - other syntax could be used

<EnglishTextNormalizer.yaml>
_target_: some.other.path.EnglishTextNormalizer
lowercase: true
remove_punctuation: true

<my_app.py>

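import hydra
from hydra.utils import instantiate

@hydra.main(config_path="config", config_name="config")
def main(cfg):
    # build the nested object first, then pass it to the container
    normalizer = instantiate(cfg.normalizer)
    tokenizer = instantiate(cfg.tokenizer, normalizer=normalizer)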

Would this be considered a useful feature in the future?

@janvainer janvainer added the enhancement Enhanvement request label Oct 25, 2020
@omry
Collaborator

omry commented Oct 25, 2020

Hi @lordofluck,
There are two existing issues for those two things:

  1. Recursive defaults support #171
  2. Recursive instantiation #566

Those two are high priority, and in fact recursive instantiation is already implemented in master, and recursive defaults list is in advanced stages in #1044.

Please experiment with the existing recursive instantiation from master. You can learn about it in the updated docs for instantiation here.
As for recursive defaults, it's already more or less working and in cleanup stages in #1044; you are welcome to check it out as well (although it's not documented yet).

Closing as duplicate.

@omry omry closed this as completed Oct 25, 2020
@janvainer
Author

Hi @omry, thanks for your answer. I am aware of the state of recursive instantiation that you described, but there might be a misunderstanding. I would like to use recursive configuration without having to spell out the config for the nested object inside the container object when the nested object already has its configuration in a separate file (that is, it is part of a defined config group). Right now, I have to do, e.g.:

trainer:
  _target_: trainer.Trainer
  batch_size: 32
  model:
    _target_: model.ResNet
    num_layers: 50

even if a ResNet.yaml with the same defaults already exists:

 _target_: model.ResNet
 num_layers: 50

I want to be able to do the following:

trainer:
  _target_: trainer.Trainer
  batch_size: 32
  model: ResNet   # interpret this as a config group instead of string!

Then, I should be able to do something like python my_app.py trainer.model=ResNet or python my_app.py trainer.model=VGG, where ResNet.yaml and VGG.yaml are in the same config group, and the trainer should be instantiable recursively. Does this explanation make a bit more sense now? Issue #171 looks relatively close to what I mean, but it deals with merging multiple defaults lists.
If there is already a way to do this, could you show a short example? Maybe I completely misunderstood the related issues, and in that case I am sorry. Regards, Jan
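The request above can be sketched in plain Python as follows. This only illustrates the desired semantics and is not how Hydra resolves config groups: `GROUPS`, `select_groups`, and `apply_override` are hypothetical names, the groups are held in memory instead of YAML files, and the contents of the VGG option are made up for the example.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical in-memory stand-in for the files of the `model` config group.
# The VGG entry is invented for illustration; only ResNet appears in the thread.
GROUPS: Dict[str, Dict[str, Dict[str, Any]]] = {
    "model": {
        "ResNet": {"_target_": "model.ResNet", "num_layers": 50},
        "VGG": {"_target_": "model.VGG", "num_layers": 16},
    }
}


def select_groups(cfg: Dict[str, Any], groups: Dict[str, Any]) -> Dict[str, Any]:
    """Replace `key: option` with the option's config when `key` names a group."""
    resolved: Dict[str, Any] = {}
    for key, value in cfg.items():
        if isinstance(value, dict):
            resolved[key] = select_groups(value, groups)
        elif key in groups and isinstance(value, str) and value in groups[key]:
            resolved[key] = deepcopy(groups[key][value])
        else:
            resolved[key] = value
    return resolved


def apply_override(cfg: Dict[str, Any], override: str) -> Dict[str, Any]:
    """Apply a `dotted.key=value` override and return a new config."""
    path, value = override.split("=", 1)
    *parents, leaf = path.split(".")
    result = deepcopy(cfg)
    node = result
    for name in parents:
        node = node[name]
    node[leaf] = value
    return result


base = {
    "trainer": {
        "_target_": "trainer.Trainer",
        "batch_size": 32,
        "model": "ResNet",  # treated as a group option, not a plain string
    }
}

default_cfg = select_groups(base, GROUPS)
vgg_cfg = select_groups(apply_override(base, "trainer.model=VGG"), GROUPS)
```

Here `default_cfg["trainer"]["model"]` holds the ResNet config, and applying the `trainer.model=VGG` override before resolution swaps in the VGG config without touching the trainer's own keys.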

@omry
Collaborator

omry commented Nov 10, 2020

You can already do this in 1.0 by having two items in the defaults list of your primary config:

defaults:
 - trainer                 # will use trainer.yaml
 - model: resnet           # will use model/resnet.yaml

You can use package overrides to place the model inside the trainer:

defaults:
 - trainer
 - model@trainer: resnet   # the content of model/resnet.yaml will be rooted at trainer

You can also place your config groups hierarchically from the start (in such a case I strongly recommend using # @package _group_, which will be the default in 1.1).

trainer/
  trainer.yaml
  model/
    resnet.yaml
    alexnet.yaml

@janvainer
Author

Ok thank you very much. I will try the mentioned approaches.
