error adding a file to dvc repo on NAS storage #2818

Closed
midnightradio opened this issue Nov 20, 2019 · 17 comments
Labels: awaiting response, bug

Comments

@midnightradio

DVC version is 0.62.1 installed by pip on Ubuntu.

I have shared NAS storage mounted on my system and want to create a DVC repo in the storage.
I could successfully initialize a repo with the dvc init command, but adding a file fails with the error message shown below.

$ ls -al
total 0
drwxrwxrwx 2 root root 0 Nov 20  2019 .  <-- has write permission
drwxrwxrwx 2 root root 0 Oct 25 10:37 ..
drwxrwxrwx 2 root root 0 Oct 29 15:30 data  <-- want to add this directory after creating a repo
$ dvc init --no-scm
+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|              https://dvc.org/doc/user-guide/analytics               |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: https://dvc.org/doc
- Get help and share ideas: https://dvc.org/chat
- Star us on GitHub: https://github.com/iterative/dvc
$ ls -al
total 0
drwxrwxrwx 2 root root 0 Nov 20 09:50 .
drwxrwxrwx 2 root root 0 Oct 25 10:37 ..
drwxrwxrwx 2 root root 0 Oct 29 15:30 data
drwxrwxrwx 2 root root 0 Nov 20 09:50 .dvc  <-- repo created successfully
$ dvc add -R data  <-- trying to add the directory recursively
ERROR: unexpected error - [Errno 1] Operation not permitted  <-- failed with an error message that does not reveal the cause

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!  <-- made me write this issue
$
triage-new-issues bot added the triage label Nov 20, 2019
@efiop
Contributor

efiop commented Nov 20, 2019

Hi @midnightradio !

Could you please show the full log for dvc add -v -R data?

Also, are you sure -R is really what you want? Is data a directory with a giant number of files?

efiop added the bug label Nov 20, 2019
triage-new-issues bot removed the triage label Nov 20, 2019
efiop added the awaiting response label Nov 20, 2019
@midnightradio
Author

midnightradio commented Nov 20, 2019

Hi, @efiop
Thanks for your quick follow up!

There are just a few files in the directory. The verbose output follows.

$ tree data
data
├── darpa-timit-acousticphonetic-continuous-speech.zip
└── openslr
    └── zeroth
        ├── README
        └── zeroth_korean.tar.gz

2 directories, 3 files
$ dvc add -v -R data
ERROR: unexpected error - [Errno 1] Operation not permitted
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/main.py", line 41, in main
    cmd = args.func(args)
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/command/base.py", line 47, in __init__
    updater.check()
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/updater.py", line 51, in check
    self._with_lock(self._check, "checking")
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/updater.py", line 41, in _with_lock
    with self.lock:
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/flufl/lock/_lockfile.py", line 334, in __enter__
    self.lock()
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/lock.py", line 54, in lock
    super(Lock, self).lock(timedelta(seconds=DEFAULT_TIMEOUT))
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/flufl/lock/_lockfile.py", line 208, in lock
    self._touch()
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/flufl/lock/_lockfile.py", line 462, in _touch
    os.utime(filename or self._claimfile, (t, t))
PermissionError: [Errno 1] Operation not permitted
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

@efiop
Contributor

efiop commented Nov 20, 2019

@midnightradio Hm, weird. Could you show the dvc version output? Also, could you check your permissions? Are you able to create files in your project's directory? Are you able to run touch foo && stat foo? So far it seems like you have an issue with your mount, maybe some incorrect or missing mount options, but it's hard to put my finger on anything specific. Also, does git status work in your repo dir?
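
For the mount check suggested here, the following is a small sketch that looks up the filesystem type and mount options for a path. It assumes a Linux system that exposes /proc/mounts. Mount, parse_mounts, and find_mount are hypothetical names, and the sample table in the comment is made up, not the reporter's actual mount table.

```python
import os
from typing import List, NamedTuple, Optional


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]


def parse_mounts(text: str) -> List[Mount]:
    """Parse text in the /proc/mounts format (Linux only).

    Made-up sample input, for illustration:
        /dev/sdb1 / ext4 rw,relatime 0 0
        //nas/share /mnt/DSshare cifs rw,uid=0 0 0
    Mount points with spaces are escaped as \\040; this sketch ignores that.
    """
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append(Mount(device, mount_point, fs_type, options.split(",")))
    return mounts


def find_mount(path: str, mounts: List[Mount]) -> Optional[Mount]:
    """Return the mount whose mount point is the longest prefix of path."""
    path = os.path.abspath(path)
    best = None
    for m in mounts:
        root = m.mount_point.rstrip("/")  # "" for the root mount "/"
        if path == root or path.startswith(root + "/"):
            if best is None or len(root) > len(best.mount_point.rstrip("/")):
                best = m
    return best


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        print(find_mount(".", parse_mounts(f.read())))
```

Running it from inside the project directory prints the mount entry that covers the current directory, including the options the share was mounted with.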

@midnightradio
Author

@efiop Actually, the answers to your questions are already shown in the first post, but let me double-check.

$ dvc --version
0.62.1
$ touch foo
$ stat foo
  File: foo
  Size: 0               Blocks: 0          IO Block: 16384  regular empty file
Device: 48h/72d Inode: 8260955347  Links: 1
Access: (0777/-rwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-11-20 13:27:51.860794700 +0900
Modify: 2019-11-20 13:27:51.860794700 +0900
Change: 2019-11-20 13:28:44.555995200 +0900
 Birth: -

When I do the same operation (init and add) for the same data in local storage, it works well.
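
One possible explanation for why touch foo succeeds while the os.utime call in the earlier traceback fails, not confirmed anywhere in this thread: on Linux, setting a file's timestamps to the current time only needs write permission, whereas setting explicit timestamps, as os.utime(filename, (t, t)) does, normally requires owning the file, and the listings above show everything owned by root. On a network share the server may apply its own rules, so treat this as an assumption. A minimal sketch for checking both cases in a given directory follows. probe_utime is a hypothetical helper name, not part of DVC or flufl.lock.

```python
import os
import tempfile
import time


def probe_utime(directory: str) -> dict:
    """Try both flavours of os.utime on a scratch file inside directory."""
    results = {}
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        # No times given: "set to now", which is what a plain `touch`
        # typically does. Write permission is normally enough.
        try:
            os.utime(path)
            results["utime_now"] = True
        except OSError:
            results["utime_now"] = False

        # Explicit timestamps, as in the traceback. This normally needs
        # file ownership and fails with EPERM (errno 1) otherwise.
        t = time.time()
        try:
            os.utime(path, (t, t))
            results["utime_explicit"] = True
        except OSError:
            results["utime_explicit"] = False
    finally:
        os.unlink(path)
    return results


if __name__ == "__main__":
    print(probe_utime("."))
```

If utime_now is True and utime_explicit is False when run inside the mounted directory, the mount's ownership mapping is a likely suspect.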

@midnightradio
Author

Previously, I missed the last thing @efiop asked me to try. This time I tried git in the same directory and found something interesting: it seems the filesystem does not allow creating a lock file, even though the directory has full permissions.

$ git init
error: chmod on /mnt/DSshare/DAI/STT/.git/config.lock failed: Operation not permitted
fatal: could not set 'core.filemode' to 'false'
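
Like setting explicit timestamps, chmod normally requires owning the file regardless of its permission bits, so this failure may share a root cause with the os.utime error (again an assumption, not something confirmed in the thread). A rough sketch that checks ownership and chmod on a scratch file follows. probe_chmod is a hypothetical helper name, and os.geteuid is only available on Unix.

```python
import os
import stat
import tempfile


def probe_chmod(directory: str) -> dict:
    """Check whether a freshly created file is ours and can be chmod-ed."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        st = os.stat(path)
        # On some mounts every file is reported with a fixed owner
        # (for example root), whoever actually created it.
        owner_matches = st.st_uid == os.geteuid()
        try:
            # Re-apply the current mode: harmless, but still a chmod call.
            os.chmod(path, stat.S_IMODE(st.st_mode))
            chmod_ok = True
        except OSError:
            chmod_ok = False
    finally:
        os.unlink(path)
    return {"owner_matches": owner_matches, "chmod": chmod_ok}


if __name__ == "__main__":
    print(probe_chmod("."))
```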

@shcheklein
Member

I think we should still do something about the lock. Git is probably not the best analogy for us in this case.

@midnightradio could you elaborate on your use case a little, please? Why do you want to run the repo directly on NAS storage?

@efiop
Contributor

efiop commented Nov 20, 2019

@midnightradio I meant dvc version and not dvc --version 🙂 Could you run that one please?

@midnightradio
Author

@efiop dvc version gives the error below, and it prints the same message twice. /mnt/DSshare is the mount point of the NAS storage, and /mnt/DSshare/DAI/STT is the working directory where I ran dvc version.

$ dvc version
ERROR: unexpected error - /mnt/DSshare/DAI/STT is not a git repository

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
$ ERROR: unexpected error - /mnt/DSshare/DAI/STT is not a git repository

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

Here's the result of the same command run when the working directory is in local storage.

$ dvc version
DVC version: 0.62.1
Python version: 3.7.3
Platform: Linux-4.15.0-64-generic-x86_64-with-debian-buster-sid
Binary: False
Cache: reflink - False, hardlink - True, symlink - True
Filesystem type (cache directory): ('ext4', '/dev/sdb1')
Filesystem type (workspace): ('ext4', '/dev/sdb1')

@efiop
Contributor

efiop commented Nov 20, 2019

@midnightradio Hm, that's interesting. Could you run dvc version -v please?

@midnightradio
Author

Hi, @shcheklein

There's no specific reason for storing the data or keeping the DVC repo on shared NAS storage. There simply wasn't enough space on my local disk for the newly downloaded data, so I stored it on the shared disk. Then I wanted to set up a DVC remote for the data and tried to create a DVC repo in the same directory where I had stored it, but that failed.

I don't think this is a bug, given that even git cannot initialize a repo on this kind of storage.

@efiop It does look like a bug, though, that dvc version depends on having a git repo while dvc init offers the --no-scm option to create a repo without one.

ERROR: unexpected error - /mnt/DSshare/DAI/STT is not a git repository
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/scm/git/__init__.py", line 50, in __init__
    self.repo = git.Repo(self.root_dir)
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/git/repo/base.py", line 184, in __init__
    raise InvalidGitRepositoryError(epath)
git.exc.InvalidGitRepositoryError: /mnt/DSshare/DAI/STT

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/main.py", line 42, in main
    ret = cmd.run()
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/command/version.py", line 48, in run
    repo = Repo()
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/repo/__init__.py", line 84, in __init__
    self.scm = SCM(self.root_dir)
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/scm/__init__.py", line 27, in SCM
    return Git(root_dir)
  File "/home/hjlee/miniconda3/envs/pandas/lib/python3.7/site-packages/dvc/scm/git/__init__.py", line 53, in __init__
    raise SCMError(msg.format(self.root_dir))
dvc.scm.base.SCMError: /mnt/DSshare/DAI/STT is not a git repository
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

@midnightradio
Author

As I mentioned previously, I would rather move the data to local storage before creating a DVC repo, since the shared storage doesn't even play well with git. So I'm ready to close this issue on that point, unless someone else needs to create a DVC repo on this kind of unusual storage more than I did.

However, dvc version failing in a DVC repo that was deliberately created without git is something I think should be fixed. Maybe I can try.

@shcheklein
Member

Thanks @midnightradio! I think both issues are pretty valid. It's a reasonable case: you run dvc add on the attached storage to get the data under DVC control initially, since it's too large to fit on your local disk. @efiop will decide whether to keep this one open (since it looks like you have a workaround), but we will definitely keep this scenario in mind.

For the version issue: yes, let's create a separate ticket, and if you can contribute a PR that would be awesome 🙏 We'll try to help you with that.

@skshetry
Member

@midnightradio, the error in #2818 (comment) is due to the earlier git init. But, anyway, version should quietly work here.

I was able to reproduce the same error with:

temp=$(mktemp -d)
cd "$temp"
dvc init --no-scm
mkdir .git
dvc version -v

Output is quite similar:

Traceback (most recent call last):
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/__init__.py", line 52, in __init__
    self.repo = git.Repo(self.root_dir)
  File "/home/saugat/repos/iterative/dvc/.env/py36/lib/python3.6/site-packages/git/repo/base.py", line 184, in __init__
    raise InvalidGitRepositoryError(epath)
git.exc.InvalidGitRepositoryError: /tmp/tmp.hOiYxhj5Wh

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/saugat/repos/iterative/dvc/dvc/main.py", line 49, in main
    ret = cmd.run()
  File "/home/saugat/repos/iterative/dvc/dvc/command/version.py", line 49, in run
    repo = Repo()
  File "/home/saugat/repos/iterative/dvc/dvc/repo/__init__.py", line 87, in __init__
    self.scm = SCM(self.root_dir)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/__init__.py", line 26, in SCM
    return Git(root_dir)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/__init__.py", line 55, in __init__
    raise SCMError(msg.format(self.root_dir))
dvc.scm.base.SCMError: /tmp/tmp.hOiYxhj5Wh is not a git repository
------------------------------------------------------------


Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
ERROR: unexpected error - /tmp/tmp.hOiYxhj5Wh is not a git repository                                                                                   


Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

/tmp/tmp.hOiYxhj5Wh via py36 
ERROR: unexpected error - /tmp/tmp.hOiYxhj5Wh is not a git repository

@efiop
Contributor

efiop commented Nov 30, 2019

@skshetry Thanks for the reproducer! So it looks like we are effectively dealing with a broken .git here. I'm a bit hesitant to catch this error and fall back to NoSCM, as it might cause us to ignore real errors in the future. A broken .git in a git repo seems like a very bad environment issue to me, so I don't feel it should be solved on dvc's side. Also, the initial "operation not permitted" error is even more serious (os.utime not working!), and no one knows what else would break there. So I'll close this issue for now, since there is a workaround: simply use that weird FS as an external cache directory.
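
To illustrate the trade-off described here, the following is a hypothetical sketch (not DVC's actual code) that treats a missing .git as "no SCM" but still raises when a .git directory exists and looks broken, as in the mkdir .git reproducer. The check for HEAD, objects, and refs is only a rough heuristic, and detect_scm, looks_like_git_dir, and BrokenGitDirError are made-up names.

```python
import os


class BrokenGitDirError(Exception):
    """Hypothetical error: .git exists but does not look like a repository."""


def looks_like_git_dir(git_dir: str) -> bool:
    """Rough heuristic: a usable git dir has HEAD, objects/ and refs/."""
    return (
        os.path.isfile(os.path.join(git_dir, "HEAD"))
        and os.path.isdir(os.path.join(git_dir, "objects"))
        and os.path.isdir(os.path.join(git_dir, "refs"))
    )


def detect_scm(root_dir: str) -> str:
    """Return "no-scm" or "git"; raise if .git is present but broken."""
    git_dir = os.path.join(root_dir, ".git")
    if not os.path.exists(git_dir):
        # Nothing there at all: safe to treat as a --no-scm project.
        return "no-scm"
    if os.path.isfile(git_dir):
        # A .git *file* is used by worktrees and submodules.
        # This sketch does not validate it further.
        return "git"
    if looks_like_git_dir(git_dir):
        return "git"
    # Present but incomplete, e.g. left over from a failed `git init`.
    # Surfacing this keeps real environment problems visible.
    raise BrokenGitDirError(f"{git_dir} exists but is not a valid git directory")
```

With this shape, an empty .git directory left behind by a failed git init produces a specific error instead of silently switching to no-SCM mode.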

efiop closed this as completed Nov 30, 2019
@midnightradio
Author

@skshetry You are right! I tried again with your comment in mind, and it turned out the dvc version -v error was caused neither by the weird FS nor by dvc itself. An incomplete '.git' directory must have been left behind by the git init I ran before trying git status, and that caused the error. Thanks for making it clear.

I should have been more precise. Sorry for my misleading report on the dvc version command.

@efiop
Contributor

efiop commented Dec 1, 2019

@midnightradio No worries! Glad you've found the cause! Thank you for the feedback! 🙂

@mikolysz

Posting this here in case anybody stumbles upon a similar issue in the future.

I had this problem on macOS when adding a file downloaded from Google Drive. The file was copied with Finder from a folder exposed by Google Drive for Desktop. For some reason, either Finder or Drive marked the file as locked, which DVC couldn't handle, so it failed with an "operation not permitted" error.

This problem can be fixed in two ways. If it's a single file, click "Get Info" on that file in Finder and remove the locked attribute. If you have many files and that's too much work, run chflags -R nouchg <path_to_directory_to_unlock> from your terminal.
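
A Python sketch of the same idea, for listing locked files before unlocking them. It assumes macOS or a BSD, where os.stat exposes st_flags and os.chflags is available. find_locked and unlock are hypothetical helper names. On systems without st_flags, find_locked simply reports nothing.

```python
import os
import stat
from typing import List


def is_user_immutable(flags: int) -> bool:
    """True if the user-immutable flag (set by `chflags uchg`) is present."""
    return bool(flags & stat.UF_IMMUTABLE)


def find_locked(root: str) -> List[str]:
    """Walk root and return paths that carry the user-immutable flag."""
    locked = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks to keep the sketch simple
            # st_flags only exists on macOS/BSD; default to 0 elsewhere.
            flags = getattr(os.stat(path), "st_flags", 0)
            if is_user_immutable(flags):
                locked.append(path)
    return locked


def unlock(path: str) -> None:
    """Clear the user-immutable flag, like `chflags nouchg` (macOS/BSD only)."""
    flags = os.stat(path).st_flags
    os.chflags(path, flags & ~stat.UF_IMMUTABLE)


if __name__ == "__main__":
    for p in find_locked("."):
        print("locked:", p)
```

Listing first and calling unlock only on the reported paths makes it easier to see which files were affected.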


5 participants