utils: add decorator to wrap diskcache pickle errors #7300
f"Could not open pickled '{name}' cache. Remove the "
f"'.dvc/tmp/{name}' directory and then retry this command. "
f"See {link} for more information."
What are "pickled index/md5s/links/test caches" ? Seems like a totally alien concept.
p.s. I realize it refers to some of the dirs listed in https://dvc.org/doc/user-guide/project-structure/internal-files but I doubt many users will be familiar at all. Calling these locations "caches" on top of that seems extra confusing. Let me think...
Also, "Could not" -> "Failed to". For consistency with treeverse/dvc.org#3205 and existing Troubleshooting entries
- So firstly, should we document somewhere that something is pickled in the .dvc/tmp/ dirs? I'm inclined to avoid this as it seems like an unnecessary implementation detail. Plus I thought they're SQLite dbs...
- Assuming not, let's generalize here with something like "'.dvc/tmp/{name}' seems to be corrupted. Please remove the directory and try again. See ... for more info."
Not 100% sure on the term "corrupted" though. Rel. treeverse/dvc.org#3062
It's an implementation detail, but the dir is a cache, and it works by using a combination of sqlite db + pickled data on disk.
It's also not corrupted, because in this case everything would work properly (without removing the dir) if the user used DVC with a more recent python version (and this scenario only happens when the user uses DVC with both python 3.8+ and python < 3.8).
OK thanks for the info @pmrowla but TBH I'm still somewhat confused...
Is it possible to install a same release of DVC with different underlying Python versions?
Even after reading #7300 (review) I'm struggling to understand how users would end up with this problem. Is there one specific pathway e.g. you installed DVC 2.x having Python 3.7 and then update only Python to 3.9 and your repo starts to fail? Or is it staying on py37 and updating DVC?
If there's a few specific ways to get here it would be helpful to list them in order to explain this (here and in treeverse/dvc.org#3205).
> Is it possible to install a same release of DVC with different underlying Python versions?
Yes, if you are using virtualenvs you can have more than one virtualenv with DVC installed. Or you could have both a virtualenv DVC and a binary package DVC installed on the same machine with different Python versions. Or you could have a setup where you have a repo directory that is shared across different platform installations with different Python versions - like if I have a repo dir that is exposed to VMs or containers, but the Python/DVC install inside the VM is different than my host system's Python/DVC.
Basically there is a wide range of possible ways that this could happen, especially in the case where a user has to make sure that their code works across multiple Python versions or across different platforms.
> Even after reading #7300 (review) I'm struggling to understand how users would end up with this problem.
The issue here is that someone runs DVC in Python > 3.7, and then they run DVC in Python <= 3.7 using the same repo directory. This is not a common problem, but it happens.
Just as an example, a user could have DVC installed on macOS, where the default system Python 3 is 3.8. They may also need to test their training code to make sure it works on Linux. If the user installs a Linux VM where the system Python is 3.7, they would encounter this issue inside the VM if they re-use the same repo directory (by sharing it from the host to the VM).
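For readers following along, here is a minimal sketch of the failure mode described above. It assumes the cache entries are pickled with the writing interpreter's highest protocol (protocol 5 was added in Python 3.8, so an older interpreter cannot read it). The helper name `can_unpickle` and the sample payload are hypothetical, and the "newer Python" payload is simulated by bumping the protocol byte rather than by running a second interpreter.

```python
import pickle


def can_unpickle(payload: bytes) -> bool:
    """Return True if this interpreter can load the pickled payload."""
    try:
        pickle.loads(payload)
    except ValueError:
        # pickle raises ValueError ("unsupported pickle protocol: N") when the
        # payload declares a protocol newer than this interpreter supports.
        return False
    return True


# Data written by the current interpreter with its newest protocol.
written_here = pickle.dumps({"md5": "d41d8cd9"}, protocol=pickle.HIGHEST_PROTOCOL)

# Simulate data written by a newer Python: the second byte of a protocol 2+
# pickle is the protocol number, so bump it past what this interpreter knows.
from_newer_python = (
    bytes([written_here[0], pickle.HIGHEST_PROTOCOL + 1]) + written_here[2:]
)

print(can_unpickle(written_here))
print(can_unpickle(from_newer_python))
```

The data itself is intact in both cases; only the reading interpreter differs, which is why "corrupted" is arguably the wrong word for this state.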
I have followed the Contributing to DVC checklist.
If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible.
Docs PR: treeverse/dvc.org#3205
Generates a better error message w/troubleshooting link for pickle errors in state/odb-index.
Related to #7222 (see discussion in #7222 (comment))
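As an illustration of the approach named in the PR title, the following is a rough sketch of a decorator that converts pickle errors into a user-facing error. It is not the PR's actual code: `PickledCacheError`, `wrap_pickle_errors`, `load_entry`, and the placeholder link are all hypothetical names, and the message text is copied from the diff fragment quoted in the review.

```python
import pickle
from functools import wraps


class PickledCacheError(Exception):
    """Hypothetical user-facing error; not DVC's real exception class."""


def wrap_pickle_errors(name, link="<troubleshooting link>"):
    """Re-raise pickle failures from the wrapped function with a clearer message.

    `name` is the cache directory name under .dvc/tmp; `link` is a placeholder.
    """

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except (pickle.PickleError, ValueError) as exc:
                # An unsupported protocol surfaces as a plain ValueError, so
                # let unrelated ValueErrors propagate unchanged.
                if isinstance(exc, ValueError) and "pickle protocol" not in str(exc):
                    raise
                raise PickledCacheError(
                    f"Could not open pickled '{name}' cache. Remove the "
                    f"'.dvc/tmp/{name}' directory and then retry this command. "
                    f"See {link} for more information."
                ) from exc

        return wrapper

    return decorator


@wrap_pickle_errors("index")
def load_entry(payload: bytes):
    # Stand-in for a cache read that unpickles data from disk.
    return pickle.loads(payload)
```

Chaining with `from exc` keeps the original traceback available for debugging while the user sees only the actionable message.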