Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer from type-annotations for auto-config support #294

Open
addisonklinke opened this issue Aug 12, 2022 · 5 comments
Open

Infer from type-annotations for auto-config support #294

addisonklinke opened this issue Aug 12, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@addisonklinke
Copy link

When builds(..., populate_full_signature=True) is used on an object that expects recursive instantiation, the example config from a script's --help doesn't provide the user with any insight about the nested fields (and their potential defaults). To demonstrate, I've adapted this Hydra example to use Zen for the config creation

import hydra
from hydra.core.config_store import ConfigStore
from hydra.utils import instantiate
from hydra_zen import builds
from omegaconf import DictConfig


class Driver:
    def __init__(self, name: str = 'Bob', age: int = 16) -> None:
        self.name = name
        self.age = age


class Car:
    def __init__(self, driver: Driver):
        self.driver = driver

    def drive(self) -> None:
        print(f'Car driven by {self.driver.name}')


@hydra.main(config_path=None, config_name='config')
def my_app(cfg: DictConfig) -> None:
    car = instantiate(cfg)
    car.drive()


if __name__ == '__main__':

    cs = ConfigStore.instance()
    cs.store(name='config', node=builds(Car, populate_full_signature=True))
    my_app()

The help output is

_target_: __main__.Car
driver: ???

The correct CLI override syntax would be something like driver='{name:Bob, age:25}', but a user cannot discern this without reading the source code which is rather unfortunate. As an alternative, I can switch the type-hint in Car to have a default like so

class Car:
    def __init__(self, driver: Driver = builds(Driver, populate_full_signature=True):

This provides a much more informative help output which lets the user see that they must provide driver.name, but driver.age is optional and defaults to 16

_target_: __main__.Car
driver:
  _target_: __main__.Driver
  name: ???
  age: 16

Would it be reasonable to propagate the populate_full_signature kwarg from builds(Car) through to the recursive Driver parameter automatically? It seems oddly redundant to have to specify it both with cs.store() and then again in the type-hints. Especially if the signature of Car required multiple nested objects

@jgbos
Copy link
Contributor

jgbos commented Aug 15, 2022

Interesting use case. The configuration for Car cannot be instantiated because there is no default value provided for Car.

>>> instantiate(builds(Car, populate_full_signature=True))
...
MissingMandatoryValue: Missing mandatory value: driver
    full_key: driver
    object_type=Builds_Car

This is why the --help prints driver: ???.

If we provide a default for driver, driver: Driver = Driver(), then we get

>>> instantiate(builds(Car, populate_full_signature=True))
...
HydraZenUnsupportedPrimitiveError: Building: Car ..
 The configured value <__main__.Driver object at 0x000001AE702572B0>, for field `driver`, is not supported by Hydra -- serializing or instantiating this config would ultimately result in an error.

Consider using `hydra_zen.builds(<class '__main__.Driver'>, ...)` create a config for this particular value.

I'm thinking the best approach is to configure driver as well:

if __name__ == "__main__":
    cs = ConfigStore.instance()
    cs.store(
        name="config",
        node=builds(
            Car,
            driver=builds(Driver, populate_full_signature=True),
            populate_full_signature=True,
        )
    )
    my_app()

Here the output would be what you expect:

_target_: __main__.Car
driver:
  _target_: __main__.Driver
  name: Bob
  age: 16

@jgbos
Copy link
Contributor

jgbos commented Aug 15, 2022

Oh, just playing around with this idea. I'm guessing you'd want swapable configs for Driver (and Car). So you might want to do this instead:

if __name__ == "__main__":
    cs = ConfigStore.instance()
    cs.store(
        name="default", group="car", node=builds(Car, populate_full_signature=True)
    )
    cs.store(
        name="default",
        group="car/driver",
        node=builds(Driver, populate_full_signature=True),
    )
    cs.store(
        name="config",
        node=make_config(
            hydra_defaults=["_self_", {"car": "default"}, {"car/driver": "default"}]
        ),
    )
    my_app()

Here the --help provides:

_target_: __main__.Car
driver:
  _target_: __main__.Driver
  name: Bob
  age: 16

@addisonklinke
Copy link
Author

The configuration for Car cannot be instantiated because there is no default value provided for driver

To clarify, I understand why this fails from a technical standpoint. My question is more around the use-ability of displaying driver: ??? with the error output Missing mandatory value: driver. This doesn't give any hints about what kind of value(s) should be provided to satisfy driver (even though we could determine that for the user from code inspection)

Therefore I'm left with the decision to either

  1. Use driver=builds(Driver, populate_full_signature=True) as the default value. This enables a more user friendly CLI interface, but makes the code base fairly verbose since we're already using builds(Car, populate_full_signature=True)
  2. Keep the code-base concise (by avoiding repetitive builds(..., populate_full_signature=True), but this has the use-ability downsides I explained above

Neither option is completely satisfactory, so I was hoping this might spark additional brainstorming. Perhaps there is no win-win solution though, in which case I would opt for the first approach since I think use-ability is more important than conciseness

@rsokl
Copy link
Contributor

rsokl commented Aug 15, 2022

I agree that it would be nice to have the Driver information be auto-populated in your config, and that the Pythonic way to make this happen is via type annotations.

so I was hoping this might spark additional brainstorming

Indeed you have! I had never considered leveraging type-annotations to help populate a hierarchical config (e.g. where a Driver config is a component of the larger Car config). This is could be quite useful from an auto config perspective.

Use driver=builds(Driver, populate_full_signature=True) as the default value.

I wouldn't want hydra-zen to lead people down this sort of design path, where you are designing a class that only works properly in the context of Hydra, via recursive instantiation. Something like the solution you proposed would be much nicer.

I am currently working on my write up for #293, but I will be thinking about your proposal in the meantime, and will get back to you ASAP.

Thanks for sharing your thoughts and ideas on this!

@jgbos
Copy link
Contributor

jgbos commented Aug 16, 2022

Sorry, my comments were mostly me thinking out loud on the issue. I was definitely intrigued and frustrated that there didn't seem to be an approach to provide configuration information about this type of interface. It's been ingrained in me that I have to provide configs.

Indeed you have! I had never considered leveraging type-annotations to help populate a hierarchical config (e.g. where a Driver config is a component of the larger Car config).

I think this would be very powerful.

@rsokl rsokl changed the title Recursive builds(populate_full_signature=True) Infer from type-annotations for auto-config support Aug 16, 2022
@rsokl rsokl added the enhancement New feature or request label Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants