Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc: make flufl.lock opt-in and use zc.lockfile #2918

Merged
merged 1 commit into from Dec 13, 2019
Merged

Conversation

efiop
Copy link
Member

@efiop efiop commented Dec 8, 2019

As it turned out (see issue numbers down below), we can't really take
hardlinks for granted, so flufl.lock is not a panacea for all
filesystems. Considering that the vast majority of filesystems that our
users use support zc.lockfile(flock-based) and it has benefits like
more reliable mechanism, auto-delete when process exits, more sturdy
implementation, etc, it makes more sense to bring it back and use by
default again. For filesystems that don't support flock(), users will
be able to manually enable flufl.lock use through the config option.
It would be ideal if we could auto-detect that flock is not supported,
but in the real world, it turned out to be non-trivial, as it might hang
forever in a kernel context, which makes the implementation way too
complex for our purposes. So what we're doing instead is showing a
message before locking with zc.lockfile that, under normal
circumstances will disappear once the lock is taken or failed, otherwise
it will point users to the related documentation where they can learn
about how to opt-in for flufl.lock.

Fixes #2831
Fixes #2897
Fixes #2944
Related #2860

  • ❗ Have you followed the guidelines in the Contributing to DVC list?

  • 📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.

iterative/dvc.org#858

  • ❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@efiop efiop changed the title dvc: make flufl.lock opt-in and use zc.lockfile [WIP] dvc: make flufl.lock opt-in and use zc.lockfile Dec 8, 2019
dvc/command/base.py Outdated Show resolved Hide resolved
dvc/config.py Outdated Show resolved Hide resolved
dvc/lock.py Outdated Show resolved Hide resolved
@@ -79,6 +79,7 @@ def run(self):
"shortuuid>=0.5.0",
"tqdm>=4.40.0,<5",
"packaging>=19.0",
"zc.lockfile>=1.2.1",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: no need to update this in our conda package, as I've left zc.lockfile there in preparation for this PR.

dvc/lock.py Show resolved Hide resolved
dvc/lock.py Outdated Show resolved Hide resolved
dvc/lock.py Show resolved Hide resolved
dvc/lock.py Show resolved Hide resolved
dvc/lock.py Outdated Show resolved Hide resolved
dvc/updater.py Outdated
@@ -24,11 +23,13 @@ class Updater(object): # pragma: no cover
TIMEOUT = 24 * 60 * 60 # every day
TIMEOUT_GET = 10

def __init__(self, dvc_dir):
def __init__(self, dvc_dir, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name it explicitly, this is shadowing the intent and tying up the interfaces.

dvc/command/daemon.py Show resolved Hide resolved
dvc/config.py Outdated Show resolved Hide resolved
dvc/lock.py Show resolved Hide resolved
dvc/lock.py Show resolved Hide resolved
dvc/lock.py Outdated Show resolved Hide resolved
dvc/repo/__init__.py Outdated Show resolved Hide resolved
@efiop efiop changed the title [WIP] dvc: make flufl.lock opt-in and use zc.lockfile dvc: make flufl.lock opt-in and use zc.lockfile Dec 12, 2019
@efiop efiop requested a review from pared December 12, 2019 02:23
Copy link
Contributor

@Suor Suor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, @efiop ;

dvc/lock.py Outdated Show resolved Hide resolved
dvc/lock.py Outdated Show resolved Hide resolved
@shcheklein
Copy link
Member

@efiop @jorgeorpinel I've just put a few minor UI comments, please take a look and improve/fix if needed.

@shcheklein
Copy link
Member

@efiop btw, just curious, how does it affect dvc get (DVC API?) - is this config supposed to be saved in Git? should we recommend using --local and/or --global for this option?

@efiop
Copy link
Member Author

efiop commented Dec 13, 2019

@shcheklein dvc get is not using repo locks. The only thing I am worried about is analytics lock (for the user_id file), but that one runs after the command and is located in ~, which is less common to have some weird fs, so the user won't notice it most of the time. Will give it a closer look, but it could wait until someone complains about it.

dvc/lock.py Outdated Show resolved Hide resolved
As it turned out (see issue numbers down below), we can't really take
hardlinks for granted, so `flufl.lock` is not a panacea for all
filesystems. Considering that the vast majority of filesystems that our
users use support `zc.lockfile`(flock-based) and it has benefits like
more reliable mechanism, auto-delete when process exits, more sturdy
implementation, etc, it makes more sense to bring it back and use by
default again. For filesystems that don't support `flock()`, users will
be able to manually enable `flufl.lock` use through the config option.
It would be ideal if we could auto-detect that flock is not supported,
but in the real world, it turned out to be non-trivial, as it might hang
forever in a kernel context, which makes the implementation way too
complex for our purposes. So what we're doing instead is showing a
message before locking with `zc.lockfile` that, under normal
circumstances will disappear once the lock is taken or failed, otherwise
it will point users to the related documentation where they can learn
about how to opt-in for `flufl.lock`.

Fixes iterative#2831
Fixes iterative#2897
Related iterative#2860
@efiop efiop merged commit 602a34d into iterative:master Dec 13, 2019
@pared
Copy link
Contributor

pared commented Dec 13, 2019

@efiop does config have verify argument? When I run any dvc command, I get ERROR: unexpected error - init() got an unexpected keyword argument ' verify' in the log.

efiop added a commit that referenced this pull request Dec 13, 2019
Thanks @pared for catching this one #2918 (comment) . Will add a test on top, need to re-release ASAP
@efiop efiop deleted the lock branch December 13, 2019 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants