New cache type: read-only hardlink #799
Comments
I think a better way to achieve this would be to introduce a 'protected' mode for the dvc repository. I.e.
It looks like this feature should become the default behavior in dvc 1.0 (with the Protect\Abandon - does it affect all the cache types?
@dmpetrov I agree, we should definitely consider making it the default behaviour for 1.0. And yes, only hardlinks and symlinks should be protected by default.
So, two cache types (hardlink and symlink) need the new parameter (Protect\Abandon) for cache immutability, and the other two (reflink and copy) do not. As a result, DVC can protect immutable data files in two possible ways:
@efiop could you please clarify if my understanding is correct?
Correct. The advantage of this one is that it fits nicely into the current `cache.type` logic. We will still have to introduce something similar to
I was actually thinking about something like `core.protected = true|false`, but it is essentially the same thing. I am also considering having it affect all cache types, just for simple symmetry, but I'm not yet sure. I will reconsider all the options while preparing a PR for this. My current draft is using
Good! It looks like the We should think more about how to avoid these commands.
This is an opt-in mode that makes data in the workspace read-only, thus protecting the cache from corruption when hardlinks or symlinks are used. Users can run the `dvc unprotect file` command to replace a read-only link to the cache with an editable copy. We can use this for now to decide whether we want to go with a `unified workflow` in the future or to make unprotected `reflink,copy` the default.

Fixes iterative#799

Signed-off-by: Ruslan Kuprieiev <ruslan@iterative.ai>
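To make the protect/unprotect idea concrete, here is a minimal Python sketch. It is not DVC's implementation: the helper names `protect` and `unprotect`, the `.tmp` suffix, and the file names are all hypothetical and chosen only for illustration. It assumes a POSIX-like file system where `os.link` is available.

```python
import os
import shutil
import stat
import tempfile


def protect(path):
    """Strip all write bits so in-place edits fail with a permission error.

    Permissions live on the inode, so for a hardlink this also makes the
    cache entry itself read-only.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def unprotect(path):
    """Replace a read-only link with a private, writable copy."""
    tmp = path + ".tmp"  # hypothetical temporary name
    shutil.copyfile(path, tmp)  # copies the contents into a brand-new inode
    os.chmod(tmp, stat.S_IMODE(os.stat(tmp).st_mode) | stat.S_IWUSR)
    os.replace(tmp, path)  # swap the copy in; the cache inode is untouched


with tempfile.TemporaryDirectory() as root:
    cache_path = os.path.join(root, "cache_entry")  # stand-in for a cache file
    work_path = os.path.join(root, "data.csv")      # stand-in for a workspace file
    with open(cache_path, "wb") as f:
        f.write(b"original")
    os.link(cache_path, work_path)  # workspace file is a hardlink to the cache

    protect(work_path)
    protected_mode = stat.S_IMODE(os.stat(work_path).st_mode)

    unprotect(work_path)
    with open(work_path, "wb") as f:
        f.write(b"edited")  # edits now land in the private copy

    with open(cache_path, "rb") as f:
        cache_content = f.read()
    same_inode = os.path.samefile(cache_path, work_path)
```

After `unprotect`, the workspace file no longer shares an inode with the cache entry, so editing it leaves the cached content as it was.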
It is often confusing that users can edit the data files, which breaks the immutable cache, and this is often not clear to them. An example: #754.
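The problem can be reproduced outside of dvc with a few lines of Python. This is only a sketch of the underlying file-system behavior, assuming a file system that supports hardlinks; the function name and the file names `cache_entry` and `data.csv` are made up for the example.

```python
import os
import tempfile


def demo_hardlink_corruption():
    """Return the cache file's content before and after a workspace edit."""
    with tempfile.TemporaryDirectory() as root:
        cache_path = os.path.join(root, "cache_entry")  # stand-in for a cache file
        work_path = os.path.join(root, "data.csv")      # stand-in for a workspace file

        with open(cache_path, "wb") as f:
            f.write(b"original")

        # A hardlink gives the same inode a second name in the workspace.
        os.link(cache_path, work_path)

        with open(cache_path, "rb") as f:
            before = f.read()

        # Editing in place through the workspace name rewrites the shared inode.
        with open(work_path, "r+b") as f:
            f.write(b"MODIFIED")

        with open(cache_path, "rb") as f:
            after = f.read()

    return before, after


before, after = demo_hardlink_corruption()
print(before, after)
```

The cache file was never opened for writing, yet its content changes, because both names point at the same data on disk.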
The read-only semantics of data files should be clear. Let's:

1. Introduce a new `CacheType`: read-only hardlink, `ro-hardlink`.
2. Use `ro-hardlink` as the default type for file systems without `reflinks` support.

btw... It should be easy to implement (2) without breaking backward compatibility: just keep `hardlink` as the default type and create a config file with `ro-hardlink` in new dvc versions.

We should understand that `ro-hardlink` will change the expected Git-like behavior (dvc will change data file permissions), but it will protect the immutable cache, which is more important.