Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve zen pickle-compat and support for hydra_main(config_path) #384

Merged
merged 8 commits into from Jan 13, 2023

Conversation

rsokl
Copy link
Contributor

@rsokl rsokl commented Jan 10, 2023

By default hydra.main(config_path) treats config_path as being relative to the directory that houses the file in which the task function is defined. Hydra has some relatively complicated logic for determining this directory.

Because zen(task).hydra_main(...) sends a zen-wrapped task function to hydra.main, the location of task cannot be determined. I did not appreciate what was going on under the hood here until #381 was opened. Fortunately, Hydra 1.3.0 added support for wrapped task functions (i.e. it will follow a __wrapped__ attribute to find the task function.

Thus, in the case that config_path is a string and Hydra 1.3.0+ is installed, zen will now produce a temporary wrapped function that points to the original task function. The reason why zen doesn't always do this is that the resulting wrapped function is not pickle-compatible, whereas the zen-instance is. It is possible in the future that pickle-compatible task functions could be useful for some launchers.

Unfortunately, I was not able to find a way to make config_path=<str> work for Hydra 1.2.0 or earlier. Thus we emit a warning to users.

Given:

dir/
  |- bug_main.py
  |- inner/
       |- config.yaml

config.yaml

db:
  host: local
  port: 3307
# contents of bug_main.py

from hydra_zen import zen

cwd = Path.cwd()


@zen
def main(zen_cfg):
    print(zen_cfg)


if __name__ == "__main__":
    main.hydra_main(
        config_name="config", version_base="1.1", config_path="."
    )

Before:

$ python bug_main.py --config-path=inner
Primary config module 'hydra_zen.wrapper.inner' not found.
Check that it's correct and contains an __init__.py file

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

After
(Hydra 1.3.0+ only)

$ python bug_main.py --config-path=inner
{'db': {'host': 'local', 'port': 3307}}

@rsokl
Copy link
Contributor Author

rsokl commented Jan 10, 2023

Note to self: add pickle compat test

@rsokl rsokl marked this pull request as ready for review January 10, 2023 19:00
@rsokl rsokl requested a review from jgbos January 10, 2023 19:00
@rsokl rsokl changed the title Improve zen Improve zen pickle-compat and support for hydra_main(config_path) Jan 10, 2023
@rsokl
Copy link
Contributor Author

rsokl commented Jan 10, 2023

@Jasha10 I wonder if you have any thoughts about this solution -- is there anything hazardous with the solution I landed on? Any ideas about making this work for Hydra < 1.3.0?

@Jasha10
Copy link
Contributor

Jasha10 commented Jan 12, 2023

is there anything hazardous with the solution I landed on?

I.e. using a wrapper? It looks good to me!

Any ideas about making this work for Hydra < 1.3.0?

Since the hydra/_internal/utils.py:detect_calling_file_or_module_from_task_function function that you mentioned above inspects task_function.__module__, a dirty hack would be to set task_function.__module__ manually inside of the Zen.hydra_main method:

...
        if isinstance(config_path, str) and HYDRA_VERSION < Version(1, 3, 0):  # pragma: no cover
            target = self
            target.__module__ = self.func.__module__
        elif Version(1, 3, 0) <= HYDRA_VERSION and isinstance(config_path, str):
            ...
            target = wrapper
        else:
            target = self
...

To maintain some sense of decorum, you could even use a try/finally block to reset the self.__module__ attribute after the call hydra.main(**kw)(target)() has finished running.

try:
    return hydra.main(**kw)(target)()
finally:
    if self_module_was_fudged:
        self.__module__ = self_module_old

Speaking of "hazardous solutions," hacking the __module__ attribute is definitely not "a normal thing to do." It's possible this hack could have unforeseen consequences (e.g. will the Zen object still be picklable if __module__ has been overridden?)
So feel free to discount this idea :)

Comment on lines 412 to 418
if (config_path is _UNSPECIFIED_ and HYDRA_VERSION < Version(1, 2, 0)) or (
isinstance(config_path, str) and HYDRA_VERSION < Version(1, 3, 0)
): # pragma: no cover
warnings.warn(
"Specifying config_path via hydra_zen.zen(...).hydra_main "
"is only supported for Hydra 1.3.0+"
)
Copy link
Contributor

@Jasha10 Jasha10 Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning message would only seem to be relevant if config_path has been specified. I'm confused about why it's necessary to check whether config_path is _UNSPECIFIED_ in this if statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this because if version_base is Hydra 1.1 and config_path is _UNSPECIFIED_ then Hydra defaults config_path to be "."

But I realize that I need to have additional logic here to handle the case where the version_base via zen is specified

@rsokl
Copy link
Contributor Author

rsokl commented Jan 13, 2023

I.e. using a wrapper? It looks good to me!

Great! 🚀

a dirty hack would be to set task_function.module manually inside of the Zen.hydra_main method

This is an interesting idea. I tried this and it seems that when the module becomes __main__, as in many cases, Hydra looks to the call stack, where ends up finding the hydra-zen internals :/ .

I haven't been able to find a complete backwards-compat w/ Hydra solution, so I think I will just punt and promise functionality for Hydra 1.3.0+. Fortunately the upgrade from 1.2 to 1.3 should be quite benign for most people.

Thanks for the help @Jasha10 !

@rsokl rsokl merged commit 7c455a6 into main Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

config_path broken for hydra_main
3 participants