Description
UPDATE: Skip to #2799 (comment) for a summary and updated requirements, and #2799 (comment) for the beginning of the implementation discussion.
Problem
There are a lot of discussions on how to manage ML experiments with DVC. Today's DVC design supports ML experiments through Git-based primitives such as commits and branches. This works nicely for large ML experiments when code writing and testing are required. However, this model is too heavy for the hyperparameter tuning stage, when the user makes dozens of small, one-line changes in config or code. Users don't want to end up with dozens of Git commits or branches.
Requirements
A lightweight abstraction needs to be created in DVC to support tiny, hyperparameter-tuning-style experiments without Git commits. The hyperparameter tuning stage can be considered a separate user activity outside of the Git workflow, but the result of this activity still needs to be managed by Git, preferably as a single commit.
High-level requirements for the hyperparameter tuning stage:
- Run. Run dozens of experiments without committing any results into Git, while keeping track of all the experiments. Each experiment includes a small config or code change (usually 1-2 lines).
- Compare. A user should be able to compare two experiments: see diffs for the code (and probably the metrics).
- Visualize. A user should be able to see the results of all the experiments: the metrics that were generated. This might be a table of metrics or a graph. A CSV table needs to be supported for custom visualization.
- Propagate. Choose "the best" experiment (not necessarily the one with the highest metrics) and propagate it to the workspace (bring in all the config and code changes; important: without retraining). Then it can be committed to Git. This is the final result of the current hyperparameter tuning stage. After that, the user can continue to work with the project in a regular Git workflow.
- Store. Some (or all) of the experiments might still be useful (in addition to "the best" one). A user should be able to commit them to Git as well, preferably in a single commit to keep the Git history clean.
- Clean. Experiments that are not useful should be removed together with all the code and data artifacts that were created. A special subcommand of `dvc gc` might be needed.
- Parallel. In some cases, experiments can be run in parallel, which aligns with DVC's parallel-execution plans: Running DVC in production #2212, repro: add scheduler for parallelising execution jobs #755. This might not be implemented now (in the 1st version of this feature), but it is important that this new lightweight abstraction support parallel execution.
- Group. Iterations of hyperparameter tuning might not be related to each other and may need to be managed and visualized separately. Experiments need to be grouped somehow.
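The Run/Compare/Propagate semantics above can be sketched with a toy, in-memory model. This is purely illustrative: `Experiment`, `diff`, and `pick_best` are invented names for this sketch, not any existing DVC API.

```python
# Hypothetical sketch of the desired workflow semantics (run, compare,
# choose "the best").  None of these names exist in DVC; they only
# illustrate the requirements listed above.
from dataclasses import dataclass, field


@dataclass
class Experiment:
    name: str
    params: dict                    # the small config change (usually 1-2 lines)
    metrics: dict = field(default_factory=dict)


def diff(a: Experiment, b: Experiment) -> dict:
    """Compare two experiments: return keys whose values differ."""
    keys = set(a.params) | set(b.params)
    return {k: (a.params.get(k), b.params.get(k))
            for k in keys if a.params.get(k) != b.params.get(k)}


def pick_best(experiments, metric: str) -> Experiment:
    """Here "the best" is simply the highest metric, though the requirement
    notes the user may choose on other grounds."""
    return max(experiments, key=lambda e: e.metrics[metric])


# Dozens of runs would be tracked without a Git commit per run:
exps = [
    Experiment("exp-1", {"lr": 0.1}, {"auc": 0.81}),
    Experiment("exp-2", {"lr": 0.01}, {"auc": 0.87}),
    Experiment("exp-3", {"lr": 0.001}, {"auc": 0.84}),
]

print(diff(exps[0], exps[1]))       # {'lr': (0.1, 0.01)}
print(pick_best(exps, "auc").name)  # exp-2
```

"Propagate" would then amount to applying the chosen experiment's params/code back to the workspace before a single Git commit.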
What should NOT be covered by this feature?
This feature is NOT about hyperparameter grid search. In most cases, hyperparameter tuning is done by users manually, using "smart" assumptions and hypotheses about the hyperparameter space. Grid search can be implemented on top of this feature/command, using bash for example.
- The ability to run the experiments from `bash` might also be a requirement for this feature request.
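To make the "grid search on top of this feature" point concrete, here is a minimal driver sketch (in Python rather than bash, only to keep the example self-contained). The `run_experiment` stub is an assumption standing in for whatever single-run primitive this feature would expose; it does not call any real DVC command.

```python
# Grid search built on top of a single-run primitive.  run_experiment is a
# stub for the future lightweight experiment runner; here it just computes
# a fake metric from the parameters so the sketch is runnable.
import itertools


def run_experiment(params: dict) -> dict:
    # Placeholder: a real version would change 1-2 config lines and run
    # one lightweight experiment, returning its metrics.
    return {"loss": (params["lr"] - 0.01) ** 2 + params["depth"] * 1e-4}


grid = {"lr": [0.1, 0.01, 0.001], "depth": [3, 5]}

results = []
for values in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    results.append((params, run_experiment(params)))

best_params, best_metrics = min(results, key=lambda r: r[1]["loss"])
print(best_params)  # {'lr': 0.01, 'depth': 3}
```

The same loop translates directly to a bash `for` loop once there is a CLI entry point to run a single experiment.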
Possible implementations
This is an open question, but many data scientists create a directory for each experiment. In some cases, people create a directory for a group of experiments, with the individual experiments inside. We can use some of these ideas/practices to better align with users' experience and intuition.
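The directory-per-experiment convention could look roughly like the sketch below, which also aggregates each experiment's metrics into the CSV table mentioned in the Visualize requirement. The layout and the `params.json`/`metrics.json` file names are assumptions for illustration, not an existing DVC format.

```python
# Sketch of a "directory per experiment" layout, plus aggregation of all
# per-experiment metrics into one CSV table for custom visualization.
import csv
import io
import json
import os
import tempfile

root = tempfile.mkdtemp()

# One directory per experiment, holding that experiment's params and metrics.
for name, params, metrics in [
    ("exp-1", {"lr": 0.1}, {"auc": 0.81}),
    ("exp-2", {"lr": 0.01}, {"auc": 0.87}),
]:
    exp_dir = os.path.join(root, name)
    os.makedirs(exp_dir)
    with open(os.path.join(exp_dir, "params.json"), "w") as f:
        json.dump(params, f)
    with open(os.path.join(exp_dir, "metrics.json"), "w") as f:
        json.dump(metrics, f)

# Aggregate everything into a single CSV table.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["experiment", "lr", "auc"])
for name in sorted(os.listdir(root)):
    exp_dir = os.path.join(root, name)
    with open(os.path.join(exp_dir, "params.json")) as f:
        params = json.load(f)
    with open(os.path.join(exp_dir, "metrics.json")) as f:
        metrics = json.load(f)
    writer.writerow([name, params["lr"], metrics["auc"]])

print(buf.getvalue())
```

Grouping (the last requirement above) would fall out naturally: a group is a parent directory containing its experiments.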
Actions
This is a high-level feature request (epic). The requirements and an initial design need to be discussed and more feature requests need to be created. @iterative/engineering please share your feedback. Is something missing here?
EDITED:
Related issues
#2379
#2532
#1018 can be relevant (?)
Discussion