New Feature: zen_wrappers #117

Closed
wants to merge 25 commits into from

@rsokl rsokl commented Oct 2, 2021

I'm pretty excited about this.. 😄

New Feature: zen_wrappers

The introduction of hydra_zen.funcs.zen_processing opened the door to incredible power and flexibility for modifying the instantiation process. zen_wrappers allows the user to "inject" one or more wrappers that will wrap the target-object before it is instantiated.

Whereas

conf = builds(target, *args, **kwargs)
instantiate(conf)

ultimately calls

target(*args, **kwargs)

Using zen_wrappers as

conf = builds(target, *args, **kwargs, zen_wrappers=[f1, f2])
instantiate(conf)

ultimately calls

target = f1(target)  # wrap!
target = f2(target)  # wrap!
target(*args, **kwargs)  # instantiate wrapped target!
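The wrapping order described above can be sketched in plain Python, without hydra-zen. This is only an illustration: `apply_wrappers`, `add_one`, and `double` are hypothetical names and are not part of hydra-zen.

```python
from functools import reduce


def apply_wrappers(target, wrappers):
    """Wrap `target` with each wrapper in the order listed, i.e. f2(f1(target))."""
    return reduce(lambda obj, wrapper: wrapper(obj), wrappers, target)


def add_one(func):
    # Hypothetical wrapper: adds 1 to whatever the wrapped callable returns
    return lambda *args, **kwargs: func(*args, **kwargs) + 1


def double(func):
    # Hypothetical wrapper: doubles whatever the wrapped callable returns
    return lambda *args, **kwargs: func(*args, **kwargs) * 2


wrapped = apply_wrappers(int, [add_one, double])  # double(add_one(int))
result = wrapped("3")  # (int("3") + 1) * 2
```

Because `add_one` wraps first and `double` wraps second, the result is `(3 + 1) * 2` rather than `3 * 2 + 1`; the order of the list matters.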

A Basic Example

Let's cut to the chase and see this in action:

from hydra_zen import builds, instantiate, to_yaml

def say_hello_to(func, repeat=1):
    print(f"hello, {func.__name__}!" * repeat)
    return func

def say_goodbye_to(func):
    print(f"goodbye, {func.__name__}!")
    return func

# providing a single wrapper
>>> conf = builds(int, zen_wrappers=say_hello_to)
>>> instantiate(conf)
hello, int!
0

And the yaml is nice and readable

>>> print(to_yaml(conf))
_target_: hydra_zen.funcs.zen_processing
_zen_target: builtins.int
_zen_wrappers: __main__.say_hello_to

# providing a chain of wrappers, to be composed
>>> conf = builds(int, zen_wrappers=(say_goodbye_to, say_hello_to))
>>> instantiate(conf)
hello, int!
goodbye, int!
0

Incredibly... even the wrappers can be configured

>>> conf = builds(
...     int,
...     zen_wrappers=builds(
...         say_hello_to,
...         hydra_partial=True,
...         repeat=2,
...     ),
... )
>>> instantiate(conf)
hello, int!hello, int!
0

>>> print(to_yaml(conf))
_target_: hydra_zen.funcs.zen_processing
_zen_target: builtins.int
_zen_wrappers:
  _target_: hydra_zen.funcs.zen_processing
  _zen_target: __main__.say_hello_to
  _zen_partial: true
  repeat: 2

It is kind of unbelievable how nicely this all fits together!

A Realistic Example: Data-Validation with pydantic!

Yeah... this totally works! Try it out by checking out this PR's branch!

from pydantic import PositiveInt

from hydra_zen.experimental.third_party.pydantic import validates_with_pydantic

def needs_pos_int(x: PositiveInt):
    return x

>>> conf_hydra_val = builds(needs_pos_int, -1)
>>> conf_pydantic_val = builds(needs_pos_int, -1, zen_wrappers=validates_with_pydantic)
>>> instantiate(conf_hydra_val)
-1

>>> instantiate(conf_pydantic_val)
---------------------------------------------------------------------------
ValidationError: 1 validation error for NeedsPosInt
x
  ensure this value is greater than 0 (type=value_error.number.not_gt; limit_value=0)

During handling of the above exception, another exception occurred:

This works really well... even with nested configs!

from typing import Tuple


class F:
    pass


class G:
    pass


class MyGuy:
    def __init__(self, tuple_of_classes: Tuple[F, G]):
        self.x = tuple_of_classes

>>> conf_hydra_val = builds(MyGuy, [builds(F), builds(F)])
>>> instantiate(conf_hydra_val).x
[<__main__.F object at 0x0000022909BB2A90>, <__main__.F object at 0x0000022909BB2B20>]

>>> conf_pydantic_val = builds(
...     MyGuy,
...     [builds(F), builds(F)],
...     hydra_convert="all",
...     zen_wrappers=validates_with_pydantic,
... )
>>> instantiate(conf_pydantic_val).x
---------------------------------------------------------------------------
ValidationError: 1 validation error for Init
tuple_of_classes -> 1
  instance of G expected (type=type_error.arbitrary_type; expected_arbitrary_type=G)

>>> conf_pydantic_val = builds(
...     MyGuy,
...     [builds(F), builds(G)],  # <- fixed!
...     hydra_convert="all",
...     zen_wrappers=validates_with_pydantic,
... )
>>> instantiate(conf_pydantic_val).x
[<__main__.F at 0x22909ecb550>, <__main__.G at 0x22909ecb160>]

Why a Wrapper?

Wrapping gives the user access to the inputs, the target, and the outputs! This enables pre-processing, post-processing, transformation, and... circumvention? It means that we can expose a single interface that enables users to do... anything! And the implementation on our end is super simple.
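As a rough sketch of what "access to the inputs, the target, and the outputs" looks like in practice, here is a plain-Python wrapper that pre-processes its arguments and post-processes the result. `coerces_and_rounds` and `mean` are hypothetical names made up for this illustration.

```python
import functools


def coerces_and_rounds(target):
    """Hypothetical wrapper: coerce inputs to float, call target, round the output."""

    @functools.wraps(target)
    def wrapper(*args, **kwargs):
        args = tuple(float(a) for a in args)  # pre-process the inputs
        out = target(*args, **kwargs)  # call the original target
        return round(out, 2)  # post-process the output

    return wrapper


def mean(*values):
    return sum(values) / len(values)


wrapped_mean = coerces_and_rounds(mean)
value = wrapped_mean("1", 2, 2)  # the string "1" is coerced before mean sees it
```

Per the description above, a wrapper like this would presumably be passed as `zen_wrappers=coerces_and_rounds` so that it is applied to the target before instantiation.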

The pydantic implementation was already a great proving ground for this. See the implementation here. pydantic's validate_arguments doesn't work on instance methods, but I was able to work around that pretty easily (and elegantly!). This really drives home the flexibility of the wrapper paradigm.

Enabling third parties to implement and provide their own wrappers

EDIT: This probably isn't necessary. Third party, or fourth party, sources can make these available / directly importable.

validates_with_pydantic was pretty easy to implement, but ultimately it would be nice if third parties were responsible for their own implementations. This looks like a job for entry-points!

It would be nice for hydra-zen to expose an entry-point. Suppose bear-type wants people to be able to use its type-validation abilities, but nothing in its public API quite does the trick. Its author could implement a function, bear_type._internal.hydra_zen_validator, and then make it available to anyone who has both hydra_zen and bear_type installed:

entry_points={
    # "name = dotted.module.path:attribute"
    'zen_wrappers': ['bear_type = bear_type._internal:hydra_zen_validator']
},

and we would look for it with:

import pkg_resources

def get_wrappers():
    zen_wrappers = {
        'validates_with_pydantic': validates_with_pydantic,
    }
    for entry_point in pkg_resources.iter_entry_points('zen_wrappers'):
        zen_wrappers[entry_point.name] = entry_point.load()
    return zen_wrappers
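Since pkg_resources is deprecated in setuptools, the same lookup could also be done with the standard library's importlib.metadata. The following is a minimal sketch under that assumption; the `zen_wrappers` group name comes from the snippet above, and the helper itself is illustrative rather than anything hydra-zen ships.

```python
from importlib.metadata import entry_points


def get_wrappers(group="zen_wrappers"):
    """Collect every installed entry point registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are selected by keyword
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: entry_points() returns a dict keyed by group name
        selected = eps.get(group, [])
    # Map each entry-point name to the object it points at
    return {ep.name: ep.load() for ep in selected}


wrappers = get_wrappers()
```

If no installed package registers anything under the group, the result is simply an empty dict.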

I'm not exactly sure where to go from there, in terms of making the various wrappers nice and discoverable/importable to users. But I am sure that this can be done.

Wrangling Unwieldy Configs

For validation with pydantic, users likely want to always use hydra_convert="all" so that pydantic doesn't get mad at omegaconf containers. But this makes for really clunky configs:

conf = builds(
    MyGuy,
    [builds(F), builds(F)],
    hydra_convert="all",
    zen_wrappers=validates_with_pydantic,
)

Plus, turning pydantic validation on or off across your configs becomes really painful.

We should provide a simple utility that allows people to set default options for builds. E.g.

from hydra_zen import builds as _builds

builds = set_defaults(_builds, hydra_convert="all", zen_wrappers=validates_with_pydantic)

conf = builds(MyGuy, [builds(F), builds(F)])  # uses pydantic validation!

Now it is trivial for people to turn this validation on/off. Obviously there are many nice use-cases here.

builds_schema = set_defaults(_builds, hydra_convert="all", zen_wrappers=validates_json_schema)
builds_xarray = set_defaults(_builds, hydra_convert="all", zen_wrappers=netcdf_to_xarray)
builds_verbose = set_defaults(_builds, populate_full_signature=True)
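One possible shape for set_defaults, sketched here as an assumption rather than a settled design: merge the stored defaults with whatever keyword arguments the caller passes, letting the caller win. `fake_builds` is a stand-in for `builds` so that the sketch runs without hydra-zen installed.

```python
import functools


def set_defaults(builds_fn, **defaults):
    """Return a version of `builds_fn` with pre-set keyword defaults.

    Keyword arguments given at call time override the stored defaults.
    """

    @functools.wraps(builds_fn)
    def wrapper(*args, **kwargs):
        return builds_fn(*args, **{**defaults, **kwargs})

    return wrapper


def fake_builds(target, *args, **kwargs):
    # Stand-in for hydra_zen.builds: just records what it was called with
    return {"target": target, "args": args, "kwargs": kwargs}


builds_all = set_defaults(fake_builds, hydra_convert="all")
conf = builds_all(int, 3)  # picks up hydra_convert="all"
override = builds_all(int, hydra_convert="none")  # explicit value wins
```

`functools.partial(builds_fn, **defaults)` gives the same override behavior in one line; the explicit wrapper just leaves room for validating the default names later.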

Feedback

Am I missing something here? Are there opportunities that I am missing? Obstacles I am not foreseeing?

Naming Things

First of all, I am going to change hydra_meta to zen_meta. It makes sense for all zen-specific features to have the zen_ prefix and for only hydra-specific stuff to have the hydra_ prefix. Fortunately this hasn't been released yet, so it is no biggie. Along those lines... hydra_partial should be zen_partial... but I am not sure if I am prepared to make that change.

What do you think about the name zen_wrappers? A con is that one might ask what is it wrapping, and where is it wrapping it. But zen_wraps_target_prior_to_instantiation isn't exactly succinct. I think we can be okay with people just copy-pasting from the docs, and power-users doing the work to understand what it means. That being said, totally up for suggestions here.

Config Wrangling

Any suggestions about the aforementioned set_defaults? Implementation-wise? Design-wise?

Entry-points

Note: I got some feedback on this and don't think we need entry-points

I am totally new to these, so I don't even know what to ask for feedback on. Where should they live? hydra_zen.third_party.wrappers? I wish we could be more descriptive..

@rsokl rsokl added the enhancement New feature or request label Oct 2, 2021
@rsokl rsokl changed the title from "New Feature: zen_wrappers (oh yeah, and *validation with pydantic*)" to "New Feature: zen_wrappers" Oct 3, 2021
Comment on lines +56 to +65
# flip order: first wrapper listed should be called first
wrappers = tuple(
    get_obj(path=z) if isinstance(z, str) else z for z in _zen_wrappers
)[::-1]

obj = get_obj(path=_zen_target)

for wrapper in wrappers:
    obj = wrapper(obj)

Contributor


Your comment "first wrapper listed should be called first" does not seem to agree with the result:

>>> obj = 123
>>> def first(x):
...     print("first")
...     return x
...
>>> def second(x):
...     print("second")
...     return x
...
>>> _zen_wrappers = [first, second]
>>> wrappers = tuple(
...     get_obj(path=z) if isinstance(z, str) else z for z in _zen_wrappers
... )[::-1]
>>> for wrapper in wrappers:
...     obj = wrapper(obj)
...
second
first
>>> # the second wrapper listed is called first

Contributor Author


Ah, yep! Good catch.
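For reference, a minimal sketch of the loop with the reversal dropped, so that the first wrapper listed really is applied first. `resolve` is a hypothetical stand-in for get_obj (it imports an object from a dotted path) and `wrap_target` is an illustrative name, not the code that was ultimately merged.

```python
from importlib import import_module


def resolve(path):
    """Stand-in for get_obj: import an object from a path like 'builtins.int'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(import_module(module_name), attr)


def wrap_target(target_path, zen_wrappers):
    # Wrappers may be given as objects or as dotted-path strings
    wrappers = tuple(resolve(z) if isinstance(z, str) else z for z in zen_wrappers)
    obj = resolve(target_path)
    for wrapper in wrappers:  # listed order, no [::-1]
        obj = wrapper(obj)
    return obj


order = []


def first(x):
    order.append("first")
    return x


def second(x):
    order.append("second")
    return x


obj = wrap_target("builtins.int", [first, second])
```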


rsokl commented Oct 5, 2021

Save for some additional polish and validation, zen_wrappers is proven out nicely in this PR. That being said, implementing support for pydantic and beartype made it massive and unreviewable.

I am going to close this PR and open a series of smaller ones:

  • The bugfix caught in 481a0a2
  • hydra_meta and hydra_partial -> zen_meta and zen_partial
  • Implementation for zen_wrappers including support for specifying wrappers via interpolated strings
  • Support for pydantic
  • Support for beartype, which requires some heavy machinery on our end to do sequence-coercion

After this, I want to implement a nice interface for customizing the defaults on builds, as well as add the beginner-friendly make_config. I think this will mark the feature freeze for 0.3.0, which has become much more ambitious than I had expected!

Please feel free to continue to leave feedback on this PR, to keep the thread of discussion in one place.

@rsokl rsokl closed this Oct 5, 2021
@rsokl rsokl deleted the zen-wrappers branch January 9, 2022 16:38