Skip to content

Conversation

@efiop
Copy link
Contributor

@efiop efiop commented Dec 8, 2019

As it turned out (see issue numbers down below), we can't really take
hardlinks for granted, so flufl.lock is not a panacea for all
filesystems. Considering that the vast majority of filesystems that our
users use support zc.lockfile(flock-based) and it has benefits like
more reliable mechanism, auto-delete when process exits, more sturdy
implementation, etc, it makes more sense to bring it back and use by
default again. For filesystems that don't support flock(), users will
be able to manually enable flufl.lock use through the config option.
It would be ideal if we could auto-detect that flock is not supported,
but in the real world, it turned out to be non-trivial, as it might hang
forever in a kernel context, which makes the implementation way too
complex for our purposes. So what we're doing instead is showing a
message before locking with zc.lockfile that, under normal
circumstances will disappear once the lock is taken or failed, otherwise
it will point users to the related documentation where they can learn
about how to opt-in for flufl.lock.

Fixes #2831
Fixes #2897
Fixes #2944
Related #2860

  • ❗ Have you followed the guidelines in the Contributing to DVC list?

  • 📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.

treeverse/dvc.org#858

  • ❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@efiop efiop changed the title dvc: make flufl.lock opt-in and use zc.lockfile [WIP] dvc: make flufl.lock opt-in and use zc.lockfile Dec 8, 2019
setup.py Outdated
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: no need to update this in our conda package, as I've left zc.lockfile there in preparation for this PR.

dvc/updater.py Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name it explicitly, this is shadowing the intent and tying up the interfaces.

@efiop efiop changed the title [WIP] dvc: make flufl.lock opt-in and use zc.lockfile dvc: make flufl.lock opt-in and use zc.lockfile Dec 12, 2019
@efiop efiop requested a review from pared December 12, 2019 02:23
Copy link
Contributor

@Suor Suor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, @efiop ;

@shcheklein
Copy link
Contributor

@efiop @jorgeorpinel I've just put a few minor UI comments, please take a look and improve/fix if needed.

@shcheklein
Copy link
Contributor

@efiop btw, just curious, how does it affect dvc get (DVC API?) - is this config supposed to be saved in Git? should we recommend using --local and/or --global for this option?

@efiop
Copy link
Contributor Author

efiop commented Dec 13, 2019

@shcheklein dvc get is not using repo locks. The only thing I am worried about is analytics lock (for the user_id file), but that one runs after the command and is located in ~, which is less common to have some weird fs, so the user won't notice it most of the time. Will give it a closer look, but it could wait until someone complains about it.

As it turned out (see issue numbers down below), we can't really take
hardlinks for granted, so `flufl.lock` is not a panacea for all
filesystems. Considering that the vast majority of filesystems that our
users use support `zc.lockfile`(flock-based) and it has benefits like
more reliable mechanism, auto-delete when process exits, more sturdy
implementation, etc, it makes more sense to bring it back and use by
default again. For filesystems that don't support `flock()`, users will
be able to manually enable `flufl.lock` use through the config option.
It would be ideal if we could auto-detect that flock is not supported,
but in the real world, it turned out to be non-trivial, as it might hang
forever in a kernel context, which makes the implementation way too
complex for our purposes. So what we're doing instead is showing a
message before locking with `zc.lockfile` that, under normal
circumstances will disappear once the lock is taken or failed, otherwise
it will point users to the related documentation where they can learn
about how to opt-in for `flufl.lock`.

Fixes treeverse#2831
Fixes treeverse#2897
Related treeverse#2860
@efiop efiop merged commit 602a34d into treeverse:master Dec 13, 2019
@pared
Copy link
Contributor

pared commented Dec 13, 2019

@efiop does config have verify argument? When I run any dvc command, I get ERROR: unexpected error - init() got an unexpected keyword argument ' verify' in the log.

efiop added a commit that referenced this pull request Dec 13, 2019
Thanks @pared for catching this one #2918 (comment) . Will add a test on top, need to re-release ASAP
@efiop efiop deleted the lock branch December 13, 2019 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can't set up DVC on a Windows network share Fresh install /.dvc/updater.lock for all commands Unexpected error - Adding files

5 participants