Conversation

@pmrowla
Contributor

@pmrowla pmrowla commented Aug 4, 2020

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Related to #2799.

Adds a dvc repro -e --params option, which can be used to specify experiment parameters on the command line.

  • --params expects a comma-separated key=value list: [params.yaml:]foo=1,stage.bar=2,stage.baz=1.234,...
    • The filename is optional; if omitted, the default params file is used.
    • Specified params are used in addition to any changes made in the user's current workspace, but command line values take priority over workspace changes.
  • -p cannot be used to specify params in this way since it collides with repro --pipeline
  • Currently only accepts a single value per parameter (so you cannot yet specify multiple values to test at once for a hyperparameter search)
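
To illustrate the format described in the list above, here is a minimal sketch of how such a string could be parsed and layered on top of workspace values. This is not the code from this PR: the helper names `parse_params_spec` and `apply_overrides`, the use of `ast.literal_eval` for value conversion, and the treatment of dotted keys as nested dictionary paths are all assumptions made for illustration.

```python
from ast import literal_eval
from copy import deepcopy

DEFAULT_PARAMS_FILE = "params.yaml"  # assumed name of the default params file


def parse_params_spec(spec, default_file=DEFAULT_PARAMS_FILE):
    """Parse '[file:]key=value,key=value' into (filename, {key: value}).

    Hypothetical helper, not part of DVC.
    """
    filename = default_file
    head, sep, tail = spec.partition(":")
    # Text before ':' counts as a filename only if it comes before the first '='.
    if sep and "=" not in head:
        filename, spec = head, tail
    params = {}
    for item in spec.split(","):
        key, eq, raw = item.partition("=")
        if not eq or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        try:
            value = literal_eval(raw.strip())  # ints, floats, quoted strings
        except (ValueError, SyntaxError):
            value = raw.strip()  # anything else is kept as a plain string
        params[key.strip()] = value
    return filename, params


def apply_overrides(workspace_params, overrides):
    """Return a copy of workspace_params with the overrides taking priority.

    Dotted keys such as 'stage.bar' are assumed to address nested mappings.
    """
    merged = deepcopy(workspace_params)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return merged


filename, params = parse_params_spec("params.yaml:foo=1,stage.bar=2,stage.baz=1.234")
workspace = {"foo": 0, "stage": {"bar": 5, "qux": "keep"}}
merged = apply_overrides(workspace, params)
# merged == {"foo": 1, "stage": {"bar": 2, "qux": "keep", "baz": 1.234}}
```

Workspace values that are not named on the command line (`stage.qux` here) are left untouched, which matches the "in addition to workspace changes, with command line priority" behaviour described above.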

Known limitations:

  • TOML params are supported, but due to limitations in the Python toml package, key ordering is not guaranteed to be preserved when we dump the new experiment version of the .toml file. So if you specify toml params via the command line, the resulting experiment .toml file may contain unrelated changes where sections and keys have been re-ordered.

@pmrowla pmrowla added the A: experiments Related to dvc exp label Aug 4, 2020
@pmrowla pmrowla self-assigned this Aug 4, 2020
@pmrowla
Contributor Author

pmrowla commented Aug 4, 2020

[asciicast recording]

@pmrowla pmrowla requested a review from efiop August 4, 2020 08:04
Contributor

@efiop efiop left a comment

🔥

@pmrowla pmrowla merged commit ab23bcd into treeverse:master Aug 6, 2020
@pmrowla pmrowla deleted the experiments-cli-params branch August 6, 2020 15:04
@jorgeorpinel
Contributor

[image]

Hi. Can you please link to the docs issue or PR @pmrowla ?

@efiop
Contributor

efiop commented Aug 11, 2020

@jorgeorpinel Experiments are an experimental, hidden feature, so we don't need to create or update the docs yet. --exp is hidden too.

@jorgeorpinel
Contributor

Ah OK, makes sense. But still... we will need one eventually. Is there a sort of epic issue in this repo that covers all the functionality? I think I've definitely seen it but not sure which one it is. I probably even participated actively at some point...

@pmrowla
Contributor Author

pmrowla commented Aug 12, 2020

@jorgeorpinel the original general discussion is here: #2799

As for what is currently available to play around with in master, this page should always be up to date: https://github.com/iterative/dvc/wiki/Experiments-development-status

But yeah, everything is currently experimental and disabled by default. We still aren't sure if we will even end up keeping this workflow, and it's possible that all of these experiment related commands could end up being removed or reworked significantly between now and when the experiments feature is officially released. So it doesn't make sense to write formal docs for them yet.

@jorgeorpinel
Contributor

Definitely not formal docs, but I have found that trying to explain things in writing helps a lot with QA, e.g. finding inconsistencies, edge cases, even bugs. Just a thought.

@jorgeorpinel
Contributor

jorgeorpinel commented Sep 1, 2020

Hello again!

It seems users are finding the WIP repro --params option, which causes confusion. Should it be hidden? If not, that part should definitely be added to docs at least, I think.
Context: https://discord.com/channels/485586884165107732/563406153334128681/750412629511504002

Discussion: Can someone explain the motivation for this option? It seems counter-intuitive (dangerous, even) to have a run with overwritten parameter values that are not saved to the checkpoint, or that are perhaps stored somewhere internally in the run-cache but are hard or impossible to detect. Users may easily assume that the parameters used in previous runs are those in the corresponding versions of the params file, leading to bad decisions. Is there a mechanism to prevent this?

Thanks

@pmrowla
Contributor Author

pmrowla commented Sep 2, 2020

@jorgeorpinel the repro --params option not being hidden is a bug which will be fixed in #4515.

The option itself does nothing unless repro -e/--experiments is being used. For experiments, the option can be used to quickly generate several experiments at a time (since it is easily scriptable from the shell) rather than requiring the user to manually edit their params file each time. And for experiments, the values passed to the command line option are written back to params.yaml (or the appropriate custom params file) and committed in the experiments-only git workspace.
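
As a rough illustration of the scripting use case, the sketch below builds one `dvc repro -e --params ...` command line per combination of parameter values. The helper name `build_experiment_commands` and the parameter names are hypothetical, and the `-e --params` option is the hidden, experimental one discussed in this thread, so it may not exist in other DVC versions. The script only prints the commands; the `subprocess.run` call is left commented out.

```python
import itertools
import shlex


def build_experiment_commands(grid, params_file=None):
    """Build one `dvc repro -e --params ...` command per value combination.

    Hypothetical helper: `grid` maps a parameter name to the values to try.
    """
    keys = list(grid)
    commands = []
    for values in itertools.product(*(grid[k] for k in keys)):
        spec = ",".join(f"{k}={v}" for k, v in zip(keys, values))
        if params_file:
            # Optional "[file:]" prefix, as in the PR description.
            spec = f"{params_file}:{spec}"
        commands.append(["dvc", "repro", "-e", "--params", spec])
    return commands


commands = build_experiment_commands({"stage.bar": [1, 2], "stage.baz": [0.1, 0.5]})
for cmd in commands:
    print(shlex.join(cmd))
    # To actually queue each experiment (assumes a DVC build with this option):
    # subprocess.run(cmd, check=True)
```

Since each invocation takes a single value per parameter, a search over several values needs one command per combination, which is what the loop produces.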

@pmrowla
Contributor Author

pmrowla commented Sep 2, 2020

Also, we have discussed possibly moving all of the experiment-related run/repro functionality into its own dvc exp run (or similar) command rather than using the current dvc repro approach. Clearing up this potential repro --params confusion is another good argument for moving to a dedicated experiment run command.

@jorgeorpinel
Contributor

Hey, sorry for the delay. Thanks for hiding the option for now.

The option itself does nothing unless repro -e/--experiments is being used...
for experiments, the values passed into the command line option are written back to params.yaml

OK I see, a bit confusing. Probably best to call the option something else or move it to dvc exp indeed.
