Closed
Labels
awaiting response: we are waiting for your reply, please respond! :)
Description
Execute the following script on a Windows network drive:
mkdir repo
cd repo
git init --quiet
dvc init -q
dvc config cache.type copy
measure-command { fsutil file createnew data 134217728 }
# Probably less than 1 second
measure-command { dvc add data }
# Around 30 seconds on our setup
Digging into the problem a little, it seems that the cache copy operations end up in shutil.copyfileobj, which reads the source file into memory in 16 KB chunks and writes each chunk out again. On a network drive every chunk costs a read and a write round trip, so unless the network is very local this is always going to be a performance killer.
The situation might be better with Python 3.8 (#3033), but it would be good to ease the pain until DVC supports that version, and even then some users will not be able to upgrade straight away.
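One possible way to ease the pain in the meantime is to pass a larger buffer to shutil.copyfileobj through its optional length argument. The following is only a minimal sketch: copy_with_buffer is a hypothetical helper, not part of DVC, and the 1 MiB default is an assumed starting point that would need tuning against the actual network drive.

```python
import shutil


def copy_with_buffer(src, dst, buffer_size=1024 * 1024):
    """Copy src to dst in chunks of buffer_size bytes.

    Hypothetical helper, not DVC's actual implementation. The 1 MiB
    default is an assumption; larger chunks mean fewer read/write
    round trips than the 16 KB default of older Python versions.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        # The third argument of shutil.copyfileobj is the chunk length.
        shutil.copyfileobj(fsrc, fdst, buffer_size)
```

Re-running the measure-command timing from the script above with a helper like this in place would show whether the chunk size is really the bottleneck on a given setup.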