-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
Bug Report
Description
While pushing data to our azure remote, errors occur in updating the progress bar for some files ( I've so far seen 1-3 file uploads fail out of just 7 being pushed). This causes dvc to think that the file upload failed and so it decides not to upload the .dir file. Actually the file upload did succeed, so on a retry of the dvc push command only the missing .dir file is uploaded.
Reproduce
- dvc repro dvc.yaml (produces a directory dir with nested directories and e.g. 8 files)
- dvc add dir
- dvc commit -f dir.dvc
- dvc push dir.dvc
Expected
All files in the directory structure are pushed to the remote cache, as well as the .dir file representing them.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.10.2 (pip)
---------------------------------
Platform: Python 3.9.12 on Linux-5.15.0-27-generic-x86_64-with-glibc2.35
Supports:
azure (adlfs = 2022.4.0, knack = 0.9.0, azure-identity = 1.10.0),
webhdfs (fsspec = 2022.3.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p3
Caches: local
Remotes: azure
Workspace directory: ext4 on /dev/nvme0n1p3
Repo: dvc, git
Additional Information (if any):
First try with failed uploads:
2022-05-05 13:10:00,228 DEBUG: Preparing to transfer data from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache' to 'dvc/models_v3/some_new_folder'
2022-05-05 13:10:00,228 DEBUG: Preparing to collect status from 'dvc/models_v3/some_new_folder'
2022-05-05 13:10:00,228 DEBUG: Collecting status from 'dvc/models_v3/some_new_folder'
2022-05-05 13:10:00,229 DEBUG: Querying 3 hashes via object_exists
2022-05-05 13:10:01,080 DEBUG: Querying 1 hashes via object_exists
2022-05-05 13:10:03,613 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:03,614 DEBUG: Estimated remote size: 4096 files
2022-05-05 13:10:03,614 DEBUG: Querying '9' hashes via traverse
2022-05-05 13:10:06,704 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,707 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,714 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,724 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,726 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,727 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,734 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,736 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,755 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,755 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,757 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,773 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:49,294 DEBUG: Preparing to collect status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 13:10:49,294 DEBUG: Collecting status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 13:10:49,648 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/43/0e4a44fc39b482c1344b2d2ca286f1' to 'dvc/models_v3/some_new_folder/43/0e4a44fc39b482c1344b2d2ca286f1'
2022-05-05 13:10:50,236 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/f9/50bdf84b33643f71b7eefd93864d5d' to 'dvc/models_v3/some_new_folder/f9/50bdf84b33643f71b7eefd93864d5d'
2022-05-05 13:10:50,271 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/bb/f019f9da1f79aa4bcd611925756a60' to 'dvc/models_v3/some_new_folder/bb/f019f9da1f79aa4bcd611925756a60'
2022-05-05 13:10:50,279 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/0d/69161c5a8d2f5883675e21454c9f5b' to 'dvc/models_v3/some_new_folder/0d/69161c5a8d2f5883675e21454c9f5b'
2022-05-05 13:10:50,300 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/5e/c95974eacea034d009e940062eba14' to 'dvc/models_v3/some_new_folder/5e/c95974eacea034d009e940062eba14'
2022-05-05 13:10:50,307 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/72/58fd7add28137624d7214a146efac5' to 'dvc/models_v3/some_new_folder/72/58fd7add28137624d7214a146efac5'
2022-05-05 13:11:43,447 ERROR: failed to transfer 'md5: 430e4a44fc39b482c1344b2d2ca286f1' - unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 25, in wrapper
func(fs_path, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 162, in func
return dest.add(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 126, in add
self._add_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 98, in _add_file
return fs.utils.transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 96, in transfer
_try_links(links, from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 66, in _try_links
return _copy(from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 44, in _copy
return to_fs.upload(from_path, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 386, in upload
return self.put_file(from_info, to_info, callback=cb, size=size)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 329, in put_file
self.fs.put_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 85, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 65, in sync
raise return_result
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 1574, in _put_file
await bc.upload_blob(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/tracing/decorator_async.py", line 74, in wrapper_use_tracer
return await func(*args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_blob_client_async.py", line 388, in upload_blob
return await upload_block_blob(**options)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 129, in upload_block_blob
return await client.commit_block_list(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_generated/aio/operations/_block_blob_operations.py", line 1006, in commit_block_list
pipeline_response = await self._client._pipeline.run(request, stream=False, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 215, in run
return await first_node.send(pipeline_request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
[Previous line repeated 4 more times]
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/policies/_redirect_async.py", line 64, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 111, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 90, in send
response_callback(response)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 298, in wrapper
callback.absolute_update(current)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/_callback.py", line 122, in absolute_update
self.progress_bar.update_to(value)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/progress.py", line 121, in update_to
self.update(current - self.n)
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
2022-05-05 13:11:45,918 ERROR: failed to transfer 'md5: 7258fd7add28137624d7214a146efac5' - unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 25, in wrapper
func(fs_path, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 162, in func
return dest.add(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 126, in add
self._add_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 98, in _add_file
return fs.utils.transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 96, in transfer
_try_links(links, from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 66, in _try_links
return _copy(from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 44, in _copy
return to_fs.upload(from_path, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 386, in upload
return self.put_file(from_info, to_info, callback=cb, size=size)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 329, in put_file
self.fs.put_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 85, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 65, in sync
raise return_result
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 1574, in _put_file
await bc.upload_blob(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/tracing/decorator_async.py", line 74, in wrapper_use_tracer
return await func(*args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_blob_client_async.py", line 388, in upload_blob
return await upload_block_blob(**options)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 129, in upload_block_blob
return await client.commit_block_list(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_generated/aio/operations/_block_blob_operations.py", line 1006, in commit_block_list
pipeline_response = await self._client._pipeline.run(request, stream=False, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 215, in run
return await first_node.send(pipeline_request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
[Previous line repeated 4 more times]
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/policies/_redirect_async.py", line 64, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 111, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 90, in send
response_callback(response)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 298, in wrapper
callback.absolute_update(current)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/_callback.py", line 122, in absolute_update
self.progress_bar.update_to(value)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/progress.py", line 121, in update_to
self.update(current - self.n)
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
2022-05-05 13:11:45,920 DEBUG: failed to upload full contents of 'None', aborting .dir file upload
2022-05-05 13:11:45,921 ERROR: failed to upload '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/76/df0b9081b1d33145e688a4b23b5cf3.dir' to 'dvc/models_v3/some_new_folder/76/df0b9081b1d33145e688a4b23b5cf3.dir'
2022-05-05 13:11:45,922 ERROR: failed to push data to the cloud - 3 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/repo/push.py", line 68, in push
pushed += self.cloud.push(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data_cloud.py", line 85, in push
return transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 176, in transfer
_do_transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 116, in _do_transfer
raise FileTransferError(total_fails)
dvc.exceptions.FileTransferError: 3 files failed to transfer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/commands/data_sync.py", line 58, in run
processed_files_count = self.repo.push(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 48, in wrapper
return f(repo, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/repo/push.py", line 72, in push
raise UploadError(exc.amount)
dvc.exceptions.UploadError: 3 files failed to upload
------------------------------------------------------------
2022-05-05 13:11:45,932 DEBUG: Analytics is enabled.
2022-05-05 13:11:46,047 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpu8n72eqg']'
2022-05-05 13:11:46,050 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpu8n72eqg']'
Second try which just uploads the .dir file:
2022-05-05 18:04:51,465 DEBUG: Preparing to transfer data from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache' to 'dvc/models_v3/some_new_folder'
2022-05-05 18:04:51,465 DEBUG: Preparing to collect status from 'dvc/models_v3/some_new_folder'
2022-05-05 18:04:51,466 DEBUG: Collecting status from 'dvc/models_v3/some_new_folder'
2022-05-05 18:04:51,467 DEBUG: Querying 3 hashes via object_exists
2022-05-05 18:04:52,254 DEBUG: Querying 1 hashes via object_exists
2022-05-05 18:04:54,340 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:54,341 DEBUG: Estimated remote size: 4096 files
2022-05-05 18:04:54,342 DEBUG: Querying '9' hashes via traverse
2022-05-05 18:04:57,243 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,257 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,263 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,272 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,281 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,291 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,343 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,360 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,405 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,407 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,418 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,422 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:05:17,034 DEBUG: Preparing to collect status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 18:05:17,034 DEBUG: Collecting status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 18:05:17,389 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/76/df0b9081b1d33145e688a4b23b5cf3.dir' to 'dvc/models_v3/some_new_folder/76/df0b9081b1d33145e688a4b23b5cf3.dir'
2022-05-05 18:05:17,578 DEBUG: Indexing pushed dir 'md5: 76df0b9081b1d33145e688a4b23b5cf3.dir' with '8' nested files
1 file pushed
2022-05-05 18:05:17,599 DEBUG: Analytics is enabled.
2022-05-05 18:05:17,714 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmp_8_jinwu']'
2022-05-05 18:05:17,716 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmp_8_jinwu']'
I already went digging a bit myself and found a temporary fix. The problem seems to be that a response from azure can have None for the current progress, which causes an error when updating the progress bar. I fixed this for now by changing the update_to function where the error happens as follows:
def update_to(self, current, total=None):
if total:
self.total = total
if current is not None:
self.update(current - self.n)
I don't know for sure if this is intended behaviour on azure's side, but it does seem like it. By following the traceback from the bottom up I found this Azure/azure-sdk-for-python#11419 (comment) link in their code. The linked comment also checks if current is None, so perhaps the solution is as simple as that?