_list_oids_traverse is not traversing 00 prefix for s3 remote #192
Comments
Ahh! Too bad I missed this message before looking into this. dvc-objects/src/dvc_objects/fs/base.py Line 669 in 06a9ab4
I've been using dvc 2.10, which presumably had a
oids = self._oids_with_limit(
    max_oids / total_prefixes, prefixes=[prefix]
)
and finally we'll get an empty set of oids. As a result, I couldn't pull an existing oid, because it could not be found in the bucket. Let me know if this sounds right.
This problem has come up before (iterative/dvc#4141, iterative/dvc#6089) and was fixed, but may have been reintroduced after the performance-related refactoring in #180. We do have tests that are supposed to catch this issue, but the performance refactor also touched those tests, which may be why it was not caught.
Also, just to clarify:
This is intentional, the missing |
The issue is that |
This issue is fixed by installing |
There was another issue that came up in the downstream DVC CI, the updated fix is in |
Hi,

we are using dvc with an s3 remote, and dvc status -c has started to behave a little bit strangely.

With dvc 2.43.2 (dvc data 0.37.3), dvc status -c returns a new file to push; however, I'm 100% sure that the file already exists in storage (I've called dvc push several times and also called aws s3 ls just to be 100% sure).

I've also tried the same with dvc==2.30.0 (dvc data 0.17.1), and it seems that this version acts as expected: it returns no file to push.

I did a little bit of debugging and found out that the md5 of the problematic file starts with 00.

After more digging I may have found a problematic piece of code that was probably refactored a few weeks ago.

The _list_oids_traverse function traverses over prefixes, but it seems that traverse_prefixes contains prefixes from 01 to ff and prefixes from 001 to 00f.

When I tried to hack the function by adding a simple traverse_prefixes.append("00"), dvc status -c returned no file to push, as expected.

It smells a little bit fishy to me and I guess that could indicate a bug.
https://github.com/iterative/dvc-objects/blob/main/src/dvc_objects/db.py#L302
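
To make the reported gap concrete, here is a minimal sketch that models only the prefix list as described above (01 to ff, plus 001 to 00f). It is not the dvc-objects implementation: reported_prefixes and uncovered are hypothetical helper names invented for this illustration, and the real function may get the missing range from elsewhere.

```python
from itertools import product

HEX = "0123456789abcdef"


def reported_prefixes():
    """Prefix list as described in the report (assumption, not library code)."""
    two_char = [f"{i:02x}" for i in range(1, 256)]   # "01" .. "ff"
    three_char = [f"{i:03x}" for i in range(1, 16)]  # "001" .. "00f"
    return two_char + three_char


def uncovered(prefixes):
    """Return every 3-hex-digit oid start that no prefix in the list matches."""
    starts = ("".join(chars) for chars in product(HEX, repeat=3))
    return [s for s in starts if not any(s.startswith(p) for p in prefixes)]


missing_before = uncovered(reported_prefixes())
# The workaround from the report: also traverse the plain "00" prefix.
missing_after = uncovered(reported_prefixes() + ["00"])

print("not covered by reported list:", missing_before)
print("not covered after adding '00':", missing_after)
```

Under this model, an oid is only reachable by the traversal if one of the listed prefixes matches its start, so any range left in missing_before would have to be listed by some other step for those files to be seen as present on the remote.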