dvc: .dvc/lock and sqlite lock on NFS and CIFS #1918

Open
efiop opened this issue Apr 23, 2019 · 3 comments

@efiop
Member

commented Apr 23, 2019

As it turns out, regular locks can't be relied on when dvc is running on NFS or CIFS (see #1823 and https://discordapp.com/channels/485586884165107732/485596304961962003/570270243964846081).

There are workarounds, such as adding the proper mount options or moving the dvc project off the NFS/CIFS mount and keeping only an external cache directory there, but it would be great to mitigate such issues in general. Here are a few ways to go about it:

One way is to use git-like locks (which, if I recall correctly, use symlinks as an atomic way to lock/unlock a file). In that case, we would have to use an unlocked sqlite db that relies on our .dvc/lock (or maybe introduce a separate lock specifically for the db).
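To make the idea above more concrete, here is a rough sketch of what a symlink-based lock guarding an unlocked sqlite db could look like. This is not how dvc does it: `symlink_lock` and `bump_count` are hypothetical helpers, the lock path is arbitrary, and whether symlink creation is actually atomic (or even supported) on a given NFS/CIFS mount is an assumption that would have to be verified. The `nolock=1` URI parameter asks SQLite to skip its own file locking, so all exclusion comes from the symlink.

```python
import contextlib
import os
import pathlib
import sqlite3
import time


@contextlib.contextmanager
def symlink_lock(lock_path, timeout=5.0, poll=0.05):
    """Hold a lock represented by a symlink at lock_path (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    # The target does not need to exist; the dangling link is only a marker.
    target = "pid-{}".format(os.getpid())
    while True:
        try:
            # os.symlink raises FileExistsError if lock_path already exists.
            os.symlink(target, lock_path)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("could not acquire {}".format(lock_path))
            time.sleep(poll)
    try:
        yield
    finally:
        # Releasing the lock is just removing the symlink.
        os.unlink(lock_path)


def bump_count(db_path, lock_path):
    """Insert one row while holding the symlink lock (hypothetical helper)."""
    with symlink_lock(lock_path):
        # nolock=1 disables SQLite's own file locking for this connection,
        # so the symlink above is the only thing preventing concurrent writes.
        uri = pathlib.Path(db_path).absolute().as_uri() + "?nolock=1"
        db = sqlite3.connect(uri, uri=True)
        try:
            db.execute("CREATE TABLE IF NOT EXISTS state (count INTEGER)")
            db.execute("INSERT INTO state VALUES (1)")
            db.commit()
        finally:
            db.close()
```

A stale symlink left behind by a crashed process would block everyone until the timeout, so a real implementation would also need some way to detect and break stale locks.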

@efiop efiop added the enhancement label Apr 23, 2019

@efiop

Member Author

commented Apr 23, 2019

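Running

import sqlite3

# 'db' is a relative path, so the database file lives on the CIFS mount.
db = sqlite3.connect('db')
cursor = db.cursor()
cmd = "CREATE TABLE IF NOT EXISTS 'state' (count INTEGER)"
# This is the call that fails in the traceback below.
cursor.execute(cmd)

on CIFS results in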


Traceback (most recent call last):
  File "azureml-setup/context_manager_injector.py", line 161, in <module>
    execute_with_context(cm_objects, options.invocation)
  File "azureml-setup/context_manager_injector.py", line 90, in execute_with_context
    runpy.run_path(sys.argv[0], globals(), run_name="__main__")
  File "/azureml-envs/azureml_f46203ca27ee37bd5932e64f3549ae1c/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/azureml-envs/azureml_f46203ca27ee37bd5932e64f3549ae1c/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/azureml-envs/azureml_f46203ca27ee37bd5932e64f3549ae1c/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "code/training_scripts/train_azure_test.py", line 108, in <module>
    cursor.execute(cmd)
sqlite3.OperationalError: database is locked
@efiop

Member Author

commented May 9, 2019

@efiop efiop assigned mroutis and unassigned mroutis May 9, 2019

@efiop efiop added this to To do in Weekly tasks via automation May 9, 2019

@efiop efiop removed this from To do in Weekly tasks May 9, 2019

@efiop efiop added p2-medium and removed p1-important labels May 16, 2019

@efiop

Member Author

commented May 27, 2019

Another user might be hitting the same issue, but this time on Lustre FS: https://discordapp.com/channels/485586884165107732/485596304961962003/582672973660684291
