-
Notifications
You must be signed in to change notification settings - Fork 1.3k
exp run: add experiment name check. #6848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We should not support slashes in experiment names. Yes, the full ref can contain slashes, and we already use slashes in exp refs to organize them, but the user-readable (short) name should not contain slashes - the same as git branch or tag (short) names. |
|
Ah ok, that's my misunderstanding then, I didn't think cgit allowed you to nest anything in |
I will ban it in this PR, and then clarify in doc. |
Is it harder to handle nested refs in |
1. Add experiment name check (https://git-scm.com/docs/git-check-ref-format) 2. Add duplicate exp name check. 3. Add some unit test for it.
e75b4eb to
b812d61
Compare
The |
There's nothing preventing us from allowing slashes in the exp refs, but once we start allowing them it means we cannot extend the current naming convention of Regarding user naming conventions, dots |
Co-authored-by: Peter Rowlands (변기호) <peter@pmrowla.com>
Co-authored-by: Peter Rowlands (변기호) <peter@pmrowla.com>
dvc/repo/experiments/__init__.py
Outdated
| name = kwargs.pop("name", None) | ||
| self._validate_ref_name(name, **kwargs) | ||
|
|
||
| return self._stash_exp(*args, name=name, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks a bit strange to pop a kwarg and then re-use it again in the following call (to _stash_exp
| name = kwargs.pop("name", None) | |
| self._validate_ref_name(name, **kwargs) | |
| return self._stash_exp(*args, name=name, **kwargs) | |
| self._validate_ref_name(kwargs.get("name"), **kwargs) | |
| return self._stash_exp(*args, **kwargs) |
Alternatively, you could also just make name default to None in _validate_ref_name and call it with **kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that kwargs still contains name argument and this would cause a duplication. If we want to use this method, we need to get name, force, reset and pass them separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the simplest fix is to just do
def _validate_ref_name(self, name: Optional[str] = None, **kwargs):
and call it from here as
self._validate_ref_name(**kwargs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also change this function to be
def new(self, *args, checkpoint_resume: Optional[str] = None, name: Optional[str] = None, **kwargs):
and then keep the call you have as
self._validate_ref_name(name, **kwargs)
return self._stash_exp(*args, name=name, **kwargs)
The point is just that doing
value = kwargs.pop(key)
func(key=value, **kwargs)
isn't particularly clean code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the simplest fix is to just do
def _validate_ref_name(self, name: Optional[str] = None, **kwargs):
and call it from here as
self._validate_ref_name(**kwargs)
Yes, I also considered it, but I think it is more clear to call it explicitly.
| "\tdvc exp branch <exp> <branch>\n" | ||
| ) | ||
|
|
||
| def _validate_new_ref(self, exp_ref: ExpRefInfo): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is used for validating a new reference, not only the name. Because we include duplication detect in it. And also the duplication detection is not to the name but to the whole reference.
Sounds good. I don't think there's a hard requirement for now at least, but helpful to have more context about the rationale. |
|
Looks good! One small request: can we remove "Reproduced" from https://github.com/iterative/dvc/blob/c07592131a616afc51f093ca4d4fe8615b63062e/dvc/repo/experiments/base.py#L39? |
|
This PR forbids the duplicate name and name with some illegal character (space, asterisk, etc), in my opinion, better to make an error msg if someone wants to use an illegal name. |
|
OK I created treeverse/dvc.org#3065 then. |
* ref: state exp names shoold be unique rel. treeverse/dvc#6848 * Restyled by prettier (#3066) Co-authored-by: Restyled.io <commits@restyled.io> Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com> Co-authored-by: Restyled.io <commits@restyled.io>
* ref: state exp names shoold be unique rel. treeverse/dvc#6848 * Restyled by prettier (#3066) Co-authored-by: Restyled.io <commits@restyled.io> Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com> Co-authored-by: Restyled.io <commits@restyled.io>
fix #6674 and fix #6652
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
@pmrowla , in #6819, exp name with '/' inside are request. And these types of exp names are stored in a subdirectory. For example in #6843, it is stored in
.git/refs/heads/dependabot/pip/pylint-2.11.1. The subdirectory here is used for grouping branches. Currently, ourdvc expdid not support this. It involves not only name checking here, maybe we should reopen #6819 and create a new PR for it?