
Reproducing a Hydra run from previous Hydra job configs #1805

Closed
Tracked by #2070
ashleve opened this issue Aug 31, 2021 · 20 comments
@ashleve

ashleve commented Aug 31, 2021

Hi,

Currently Hydra generates the following logs:
[screenshot: the run's generated log directory, including the .hydra config files]

When I want to reproduce the experiment, I can simply load the config.yaml:
python run.py --config-path /logs/runs/.../.hydra --config-name config.yaml

The problem is that config.yaml doesn't contain the Hydra output path configuration, which means that after running the line above, the logging path reverts to the default.

Is there a workaround for that?

Is it possible to additionally load hydra specific config from some other path with a simple oneliner?

Thanks

@omry
Collaborator

omry commented Aug 31, 2021

This is not an officially supported workflow.

Try overriding the run path as well from the command line.

python run.py --config-path /logs/runs/.../.hydra --config-name config hydra.run.dir=/logs/runs/.../ 

@ashleve ashleve closed this as completed Aug 31, 2021
@Yevgnen

Yevgnen commented Feb 10, 2022

@omry

Hi, how can I load the config with properly resolved placeholders like ${abc.xyz} in scripts? Can I do this without wrapping another main function? I'm not trying to re-run, but to use the saved output. To instantiate some classes related to the output, I'd like to read the parameters saved in .hydra/config.yaml, but some of them are placeholders.

@Jasha10
Collaborator

Jasha10 commented Feb 10, 2022

@Yevgnen see OmegaConf.load.

@Yevgnen

Yevgnen commented Feb 18, 2022

config/config.yaml

work_dir: ${hydra:runtime.cwd}
when: ${now:%Y-%m-%d}/${now:%H-%M-%S}

main.py

# -*- coding: utf-8 -*-

import hydra
from omegaconf import DictConfig


@hydra.main(config_path="config", config_name="config")
def main(cfg: DictConfig) -> None:
    print(cfg.work_dir)
    print(cfg.when)


if __name__ == "__main__":
    main()

@Jasha10 Hi, I tried to load the .hydra/config.yaml of a run with OmegaConf.load, but got the following error when accessing config.work_dir:

UnsupportedInterpolationType: Unsupported interpolation type hydra
    full_key: work_dir
    object_type=dict

And I don't see any information related to the interpolated values of work_dir and when in the .hydra/config.yaml, .hydra/hydra.yaml, or .hydra/overrides.yaml of that run.

@jieru-hu
Contributor

@Yevgnen thank you for trying it out and for the repro code. Could you provide a bit more detail on what you tried with OmegaConf.load so we can repro? Many thanks!

@jieru-hu jieru-hu reopened this Feb 25, 2022
@jieru-hu jieru-hu changed the title Reproducing the run Reproducing a Hydra run from previous Hydra job configs Feb 25, 2022
@jieru-hu
Contributor

Reopening; we should look into this and at least provide some documentation on the approach.

@Jasha10
Collaborator

Jasha10 commented Mar 2, 2022

I've had success reproducing the config by using the saved outputs/.../.hydra/overrides.yaml file. Specifically, suppose I run my_app.py as follows:

$ python my_app.py a=b +c/d=e

This produces a file outputs/.../.hydra/overrides.yaml with the following contents:

- a=b
- +c/d=e

This list of overrides can be used to reproduce the config. The following recipe has worked for me:

import hydra
from omegaconf import OmegaConf

with hydra.initialize(config_path=config_path):  # same config_path as used by @hydra.main
    recomposed_config = hydra.compose(
        config_name="main",  # same config_name as used by @hydra.main
        overrides=OmegaConf.load("outputs/.../.hydra/overrides.yaml"),
    )

There are a few caveats here:

  • Currently the ${hydra:...} resolver cannot be used outside the context of @hydra.main, so compose does not play well with configs that rely on it.
  • If your config uses non-deterministic resolvers, e.g. to generate a random number or get the current time of day, those resolvers will be run anew (and will probably give different values from the original run). If you need to re-use the values from the initial run, you could try something like this:
    config = OmegaConf.merge(recomposed_config, OmegaConf.load("outputs/.../.hydra/config.yaml"))

    EDIT: I think that config = OmegaConf.merge(recomposed_config, OmegaConf.load("outputs/.../.hydra/config.yaml")) will not work because .hydra/config.yaml may contain unresolved interpolations/OmegaConf resolvers. A yaml file containing resolved values could be useful here, as @jieru-hu pointed out below.

@jieru-hu
Contributor

jieru-hu commented Mar 4, 2022

How about we just save the final composed config of everything related to a job in a separate YAML file (everything, including the Hydra and job configs, fully resolved, with no OmegaConf syntax in the YAML)? We could also provide an experimental CLI option for rerunning an application from that finalized composed config. We would only support single runs for now and see how it goes.

Saving the finalized YAML would also help with debugging; I've heard from users that the saved YAML files have OmegaConf resolvers in them, which sometimes makes it hard to figure out what's going on.

@Jasha10
Collaborator

Jasha10 commented Mar 5, 2022

Saving a resolved yaml config (e.g. outputs/.../.hydra/resolved.yaml) could certainly be useful! I've just edited the end of my previous comment since this made me realize that loading .hydra/config.yaml is not good enough to reconstruct the resolved values.

That being said, creating resolved.yaml could be expensive, e.g. if the user has defined custom resolvers to do some complicated calculation. For this reason, I think it might make sense to have automatic saving of resolved.yaml be an opt-in feature (or at least have the option to disable saving resolved.yaml).

Another potential complication is that some application use-cases could rely on interpolations or OmegaConf resolvers being unresolved, e.g. if the user app wants to call these resolvers at runtime.

For re-running the application, more information is needed than just what is in resolved.yaml. The resolved yaml config will not contain typing information. Info about structured config types or enum types cannot be reconstructed from the resolved.yaml file. This is why the compose+overrides.yaml approach is useful: it produces a typed config.

@jieru-hu
Contributor

Thanks @Jasha10, those are great points! I agree that eagerly resolving all configs is not the best way to go, especially when user applications also play a role here.
Arguably, we should be able to repro a run if the task_cfg and hydra_cfg passed into the task_function stay the same; the rest is left up to the user application. I think one lightweight way of enabling this could be pickling JobReturn both before and after the run of the user application.
Right now JobReturn (which can be very helpful) is not saved anywhere. I think we can make this opt-in by adding an experimental Callback which dumps the JobReturn for each run.
The second part of this would be providing an experimental CLI option that takes in a JobReturn pickle and runs the application as a single run.
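A rough, Hydra-agnostic sketch of what such a callback could look like (the hook name mirrors Hydra's experimental Callback interface, but the output_dir parameter and file name are assumptions for illustration):

```python
import pickle
from pathlib import Path
from tempfile import mkdtemp


class PickleConfigCallback:
    """Sketch of the proposed callback: dump the composed config at the
    end of each job so it can be reloaded later with pickle.load()."""

    def on_job_end(self, config, output_dir=".", **kwargs):
        # In Hydra, this hook would receive the composed config object;
        # here we just pickle whatever is passed in.
        out = Path(output_dir) / "config.pickle"
        with out.open("wb") as f:
            pickle.dump(config, f)
        return out


# Demo with a plain dict standing in for the composed config:
out_path = PickleConfigCallback().on_job_end({"lr": 0.1}, output_dir=mkdtemp())
```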

@Jasha10
Collaborator

Jasha10 commented Mar 14, 2022

I think that using a Callback to pickle the JobReturn (and later using a CLI option to load the pickle) sounds like a great idea.

I think one lightweight way of enabling this could be pickling JobReturn both before and after the run of user application.

To pickle before the run, we might need to change the signature of the on_job_start hook, since on_job_start currently does not accept a JobReturn argument.

@jieru-hu
Contributor

I think that using a Callback to pickle the JobReturn (and later using a CLI option to load the pickle) sounds like a great idea.

I think one lightweight way of enabling this could be pickling JobReturn both before and after the run of user application.

To pickle before the run, we might need to change the signature of the on_job_start hook, since on_job_start currently does not accept a JobReturn argument.

I think for now we can just pickle the config object instead of an incomplete JobReturn; that seems like it should be enough (both the task and Hydra configs are included in the config object).

@jieru-hu
Contributor

jieru-hu commented Mar 17, 2022

Hi @ashleve @Yevgnen
I just created #2098 to address this. It is not a perfect solution but could be useful for what you are trying to achieve.
If you are interested, here is the documentation on how to use this particular experimental CLI option.

@addisonklinke

I see three potential approaches suggested in this thread:

  1. @ashleve: use the built-in Hydra --config-path and --config-name flags to point the script to an old config [comment]
  2. @Jasha10: use the Compose API to merge config and overrides [comment]
  3. @jieru-hu: use the newly added --experimental-rerun flag [comment]

I have some questions to better understand the differences between these approaches

  1. Do approaches 1 and 2 end up generating the same config object?
  2. If the above answer is yes, should approach 1 be preferred whenever @hydra.main is applicable? I say that based on this docs recommendation

Please avoid using the Compose API in cases where @hydra.main() can be used. Doing so forfeits many of the benefits of Hydra (e.g., Tab completion, Multirun, Working directory management, Logging management and more)

  3. Is approach 3 only necessary if you want to preserve the original output directory and append to the old log file? Are there other advantages over approaches 1 or 2 that I'm missing?

Thanks everyone for the clarification!

@Jasha10
Collaborator

Jasha10 commented Nov 8, 2022

Do approaches 1 and 2 end up generating the same config object?

Approach 1 involves a round-trip through yaml, which does not always capture the original config in full fidelity. Meanwhile, the approach 2 config should be essentially identical to the original config.

To see some of the differences, you can experiment with round-tripping a config through yaml as follows:

approach1 = OmegaConf.create(OmegaConf.to_yaml(original_config))  # round-trip through yaml

Approach 1 will produce an untyped config (i.e. a config that is not structured). This means you'll miss out on the runtime type-safety that comes with structured configs. Also, you'll see a difference when calling routines such as OmegaConf.to_object (since OmegaConf.to_object treats structured configs differently from unstructured configs). Another pitfall with the round-trip through yaml comes up if you use Enum values in your config:

>>> from enum import Enum
>>> from omegaconf import OmegaConf
>>> class MyEnum(Enum):
...     foo = 1
...
>>> original_config = OmegaConf.create({"key": MyEnum.foo})
>>> assert isinstance(original_config.key, MyEnum)
>>> approach1 = OmegaConf.create(OmegaConf.to_yaml(original_config))
>>> assert approach1.key == "foo"
>>> assert isinstance(approach1.key, str)
>>> assert not isinstance(approach1.key, MyEnum)  # approach 1 doesn't preserve enum values

If the above answer is yes, should approach 1 be preferred whenever @hydra.main is applicable?

If you use structured configs or enum values, I'd recommend against approach 1.

Another difference between approaches 1 and 2 is that, if you refactor your python dataclasses or change your input yaml files, the result of approach 2 will change accordingly. Approach 2 reconstructs the config using Hydra's compose process; the output of that process can change if your dataclasses or your input yaml files change. Meanwhile, the result of approach 1 is always the same (as it depends only on the static config.yaml file that was the output of your previous experiment).


I say that based on this docs recommendation

The docs do recommend using @hydra.main over compose if possible, since @hydra.main can do several things that compose cannot (including sweeping, launching, etc).

Actually, approach 2 can be adapted to work with @hydra.main if you don't mind monkeying with sys.argv:

import sys

import hydra
from omegaconf import OmegaConf

# approach2 adapted to @hydra.main:
overrides: list[str] = list(OmegaConf.load("outputs/.../.hydra/overrides.yaml"))
sys.argv += overrides

@hydra.main(config_path=..., config_name="main")  # use same config_path and config_name as you used in the original run
def my_app(cfg):
    ...

I wonder if it would make sense for @hydra.main to expose an overrides parameter so that we could avoid this sys.argv hack...


Is approach 3 only necessary if you want to preserve the original output directory and append to the old log file? Are there other advantages over approaches 1 or 2 that I'm missing?

Yes, using the --experimental-rerun flag is only necessary if you want to preserve the original output directory and append to the old log file.

There is also a fourth approach that combines a few of these ideas: you can use PickleJobInfoCallback to produce a config.pickle file, and then directly call pickle.load(open("outputs/.../.hydra/config.pickle", "rb")) to load that pickled config later. This fourth approach circumvents the need to call @hydra.main or compose.
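A self-contained sketch of this fourth approach (the pickle is fabricated inline so the snippet runs on its own; in practice the file would come from PickleJobInfoCallback under the run's .hydra directory):

```python
import pickle
from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as tmp:
    # Stand-in for the config.pickle written during a previous run;
    # a plain dict is used here instead of a real composed config.
    pkl = Path(tmp) / "config.pickle"
    pkl.write_bytes(pickle.dumps({"model": {"lr": 0.01}, "seed": 42}))

    # Approach 4: load the pickled config directly, no @hydra.main or compose.
    with pkl.open("rb") as f:
        cfg = pickle.load(f)

print(cfg["model"]["lr"])  # -> 0.01
```

Unlike the yaml round-trip, unpickling preserves the original object types.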


In summary, here are the four discussed approaches for re-animating a config from a previous experiment:

Approach 1: Run Hydra with config_name="outputs/../.hydra/config.yaml"

This approach comes from the OP.

  • First, run your experiment, which should produce some log files including an outputs/.../.hydra/config.yaml file.
  • You can later point to this config.yaml to re-run experiments using the --config-name and --config-path flags:
python run.py --config-path /outputs/.../.hydra --config-name config.yaml

Pros:

  • easy to set up

Cons:

  • produces an untyped (non-structured) config

Approach 1 always produces the same config even if you refactor your application / input yaml files. This may or may not be desirable.

Approach 2: Recompose the config with overrides="outputs/../.hydra/overrides.yaml"

This approach comes from my comment above.

  • First, run your experiment, which should produce some log files including an outputs/.../.hydra/overrides.yaml file.
  • You can then re-run the experiment by loading overrides.yaml and using those overrides as input to compose or @hydra.main.

Pros:

  • reconstructs structured configs

Cons:

  • use with @hydra.main requires monkeying with sys.argv.

The config produced by approach 2 will change if you refactor your dataclasses or your input yaml files. This may or may not be desirable.

Approach 3: PickleJobInfoCallback + --experimental-rerun

This approach is only necessary if you want to re-use the same output directory and append to the log file from your previous experiment.

Approach 4: PickleJobInfoCallback + pickle.load(open("config.pickle", "rb")).

This approach avoids using @hydra.main and compose.

@addisonklinke

Approach 1 involves a round-trip through yaml, which does not always capture the original config in full fidelity

Thanks for clarifying this subtle difference - that makes a lot of sense

Approach 1 always produces the same config even if you refactor your application / input yaml files. This may or may not be desirable.

I'll have to think more about whether this is desirable for my ML experiments. Typically, I would plan to check out the git commit of the original run before reproducing, in which case there couldn't be any changes to the application code.

I wonder if it would make sense for @hydra.main to expose an overrides parameter so that we could avoid this sys.argv hack...

I agree it could definitely be convenient to officially support that pattern. Worth opening a new issue?

@Jasha10
Collaborator

Jasha10 commented Nov 8, 2022

I agree it could definitely be convenient to officially support that pattern. Worth opening a new issue?

Yes, that would be great, thanks!

@npuichigo

@Jasha10 For Approach 2, how can we pass the overrides path from the command line, since we may want the overrides to be dynamic?

@asimukaye

@Jasha10 For Approach 2, how can we pass the overrides path from the command line, since we may want the overrides to be dynamic?

@npuichigo I haven't tried it, but #1022 seems relevant to your question and might help.
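One possible pattern (a sketch, not an official Hydra feature; the minimal parser below assumes the flat list format of .hydra/overrides.yaml, and OmegaConf.load would work just as well) is to take the run directory as a plain argv argument and hand the loaded list to compose:

```python
import sys
from pathlib import Path


def load_overrides(path: str) -> list[str]:
    # .hydra/overrides.yaml is a flat YAML list of strings ("- a=b" per
    # line), so a minimal line-based parse is enough for this sketch.
    return [
        line.strip()[2:]
        for line in Path(path).read_text().splitlines()
        if line.strip().startswith("- ")
    ]


# Hypothetical usage: pass the run directory as an argv argument, then
# hand the loaded list to hydra.compose (or append it to sys.argv for
# @hydra.main, as in the earlier comment):
#
#   overrides = load_overrides(f"{sys.argv[1]}/.hydra/overrides.yaml")
#   cfg = hydra.compose(config_name="main", overrides=overrides)
```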

@bkmi

bkmi commented Jan 2, 2024

The compose + overrides approach looks OK for my purposes, but is there any way to also include the ${hydra:...} resolvers in it at the current time?
