Duplicate pending trials from parent/child for exc #631
[Fixes #576]
Why:
An experiment cannot reserve trials of its parent experiment. This is very
problematic, as non-completed trials of parents cannot be executed anymore
unless the environment state is reverted to the one used for the parent
experiment (e.g. by resetting the code). It should look for executable trials
across the EVC tree.
Running trials from parent experiments may cause issues if the child
experiment has a different script path, a different code version or a
different cmdline call. We should attempt to run the trial with the
corresponding experiment configuration. It's not clear what to do if it
fails. If we simply leave the trial status as interrupted, the child
experiment will try it again.
Another option is to copy the trial to the child experiment and run it
with the child configuration. If the user checkpointed the trial state with
trial.hash_params, the checkpoint will be lost, as trial.hash_params
will change based on the experiment id. This is safe, since it protects users
from resuming with a different code version.
How:
Fetch trials from the EVC tree and duplicate any pending trials into the
current experiment. A hash of the params is used to avoid duplicating trials
that are already available in the current experiment.
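
The duplication step above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the `Trial` dataclass, `params_key`, `duplicate_pending_trials` and the set of statuses treated as pending are all hypothetical names and assumptions chosen for the example. The key point is that the deduplication key is computed from the params only, so it stays the same across experiments of the tree.

```python
import hashlib
import json
from dataclasses import dataclass, replace

# Assumption: statuses considered "pending" (still executable) in this sketch.
PENDING_STATUSES = {"new", "interrupted", "suspended"}


@dataclass(frozen=True)
class Trial:
    """Hypothetical, simplified stand-in for a trial record."""

    experiment: str
    params: dict
    status: str = "new"


def params_key(params):
    """Hash the params only, so the key does not depend on the experiment id."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.md5(encoded).hexdigest()


def duplicate_pending_trials(experiment_id, own_trials, tree_trials):
    """Return copies of pending trials from the tree that the experiment lacks."""
    seen = {params_key(trial.params) for trial in own_trials}
    copies = []
    for trial in tree_trials:
        if trial.experiment == experiment_id:
            continue  # already belongs to the current experiment
        if trial.status not in PENDING_STATUSES:
            continue  # completed or broken trials are not duplicated
        key = params_key(trial.params)
        if key in seen:
            continue  # same params already available in the current experiment
        seen.add(key)
        # The status is copied unchanged here; whether it should be reset
        # is left open in this sketch.
        copies.append(replace(trial, experiment=experiment_id))
    return copies


own = [Trial("child", {"x": 1})]
tree = own + [
    Trial("parent", {"x": 1}, "new"),          # duplicate of an existing trial
    Trial("parent", {"x": 2}, "interrupted"),  # pending, should be copied
    Trial("parent", {"x": 3}, "completed"),    # not pending, skipped
]
copies = duplicate_pending_trials("child", own, tree)
```

In this example only the interrupted parent trial with `{"x": 2}` is copied to the child; the trial with `{"x": 1}` is skipped because the child already holds a trial with the same params.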