ENH - Shorten hash used in environment path #611
Comments
I'd consider using a different encoding for the hash. Our current If we want readability, I'd keep a series of symlinks to the hashed path. If you want to map back, you can always query the database. Convenience shouldn't trump usability. |
[UPD: Nevermind, see my messages below. This is a conda limitation on all platforms, we cannot solve this just by using the extended length prefix.] @jaimergp Not sure where base32 comes from, it's sha256 + some other stuff:

    @property
    def build_key(self):
        """A conda environment build is a function of the sha256 of the
        environment.yaml along with the time that the package was built.

        The last two parts of the build key are to assist finding the
        record in the database.

        The build key should be a key that allows for the environment
        build to be easily identified and found in the database.
        """
        datetime_format = "%Y%m%d-%H%M%S-%f"
        return f"{self.specification.sha256}-{self.scheduled_on.strftime(datetime_format)}-{self.id}-{self.specification.name}"

In #588, Aaron shows this example:
Yes, our IMO, a better solution, which shouldn't require changing much and won't affect other platforms, is using the extended-length path prefix: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation IIUC, this will cause Windows to use different APIs, which don't have this legacy path limit. Also, based on what I've read, support for this should go back to Windows 7, but that needs to be verified. The only issue I can think of is not being able to browse these paths using Explorer, but I think that was only a problem on XP, and Explorer was updated to use different APIs on Windows 7; again, this needs to be verified. Also, it's not something that should happen regularly, as these paths are internal. Finally, these extended paths are supposed to be absolute (cannot use A number of projects seem to be using this approach:
Not going to explore a different
Requirements for new
|
I forgot to mention advantages of the extended path prefix
|
OK, nevermind. Jaime told me this is needed because of point 1 here: #588 (comment). So there's no other way than to shorten the
Repro: #588 (comment) |
@jaimergp I've thought more about this in the context of the conda prefix. Let's look at Aaron's example, which exceeded the limit: /private/var/folders/wc/dppcpmxs1tlb36nqcw853wkm0000gn/T/pytest-of-aaronmeurer/pytest-12/test_conda_store_register_envi0/conda-store-state/pytest-namespace/35f604188f69ceb5d9e3fae2c93ffd48c2971d192f21c776e3ebfed7e1196868-20231005-205956-556312-1-pytest-name
Summary: there are many things that are non-constant and can exceed the prefix length. I think the long-term solution should be similar to what Nix uses: https://nixos.org/guides/nix-pills/nix-store-paths. For now, I'll implement the truncated hash + timestamp idea, but it's a bandaid rather than a real solution. We should use something else long-term. UPD: More info on Nix hashes: https://nixos.wiki/wiki/Nix_Hash |
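For illustration, the truncated hash + timestamp idea mentioned above could look roughly like the sketch below. This is not the code from the PR: the function name `short_build_key`, the 8-character truncation length, and the use of a Unix timestamp are all assumptions made for the example.

```python
from datetime import datetime, timezone

# Assumed truncation length; the thread does not settle on a value.
HASH_LENGTH = 8


def short_build_key(sha256: str, scheduled_on: datetime, build_id: int, name: str) -> str:
    """Hypothetical shorter key: truncated hash plus a Unix timestamp."""
    timestamp = int(scheduled_on.timestamp())
    return f"{sha256[:HASH_LENGTH]}-{timestamp}-{build_id}-{name}"


# The build key from Aaron's example, in the current long format.
old_key = (
    "35f604188f69ceb5d9e3fae2c93ffd48c2971d192f21c776e3ebfed7e1196868"
    "-20231005-205956-556312-1-pytest-name"
)

# The same build expressed with the hypothetical short format.
new_key = short_build_key(
    "35f604188f69ceb5d9e3fae2c93ffd48c2971d192f21c776e3ebfed7e1196868",
    datetime(2023, 10, 5, 20, 59, 56, 556312, tzinfo=timezone.utc),
    1,
    "pytest-name",
)

print(len(old_key), len(new_key))
print(new_key)
```

Truncating the hash raises the chance of two specifications sharing a prefix, so in this sketch the build id stays in the key to keep it unique. The environment name and the state directory root remain variable-length, which is why this only reduces the problem rather than removing it.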
Conda itself would need to be updated to use |
Status update: submitted a WIP PR but need to address review feedback: #652. |
Opened a separate issue for Nix-like hashes: #678 |
Context
Environment paths currently use a hash of the environment plus a timestamp plus the environment name. For example,
99108419ad0fd922fdeff9bbc434b58d41f68e3f923a83f6a7ab19568463bc82-20230915-201619-848782-1-test
for an environment named "test".

This path is very long (90 characters plus the environment name) and causes issues on Windows, where the default filesystem limits paths to 260 characters and some packages contain long paths. See #588. For example, notebook has a path with 122 characters (#588 (comment)), so 122 + 90 is 212 characters, leaving only 48 characters for the root of wherever the conda-store state directory is stored plus the environment name.

Long paths can also cause issues on other platforms, because some packages are built so that they can only be installed in paths of up to 255 characters. See #588 (comment).
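The character budget can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in this issue; the variable names are illustrative, and path separators between components are ignored, as they are in the estimate above.

```python
# Path-length budget on Windows, using the figures quoted in this issue.
WINDOWS_MAX_PATH = 260      # default path limit discussed in #588
LONGEST_PACKAGE_PATH = 122  # longest path inside `notebook`, per #588

# Example build key from this issue, for an environment named "test".
build_key = (
    "99108419ad0fd922fdeff9bbc434b58d41f68e3f923a83f6a7ab19568463bc82"
    "-20230915-201619-848782-1-test"
)
env_name = "test"

# Everything in the key except the environment name.
key_overhead = len(build_key) - len(env_name)

# What is left for the state directory root plus the environment name.
remaining = WINDOWS_MAX_PATH - LONGEST_PACKAGE_PATH - key_overhead

print(f"key overhead: {key_overhead}, remaining budget: {remaining}")
```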
Value and/or benefit
On Windows, this can be fixed by setting a registry key, but this requires administrator privileges, so some deployments might not be able to do it.
Anything else?
See #588 for more technical details on the Windows path length problem.
My suggestion would be to use a shorter hash. We can incorporate the timestamp into the hash, since it is already stored in the database. However, we would need to make sure this is done in a way that is backwards compatible.
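As a rough sketch of this suggestion, the timestamp can be folded into a single truncated digest, and keys in the existing long format can be recognized by their shape so that old builds remain resolvable. Everything here is hypothetical: `combined_key`, `is_legacy_key`, the 8-character digest length, and the regular expression are assumptions for illustration, not conda-store's API.

```python
import hashlib
import re
from datetime import datetime

# Shape of the current long format: sha256, date, time, microseconds, id, name.
LEGACY_KEY = re.compile(r"^[0-9a-f]{64}-\d{8}-\d{6}-\d{6}-\d+-.+$")


def combined_key(spec_sha256: str, scheduled_on: datetime, build_id: int,
                 name: str, length: int = 8) -> str:
    """Hypothetical key that folds the timestamp into one truncated digest."""
    payload = f"{spec_sha256}-{scheduled_on.isoformat()}".encode()
    digest = hashlib.sha256(payload).hexdigest()[:length]
    return f"{digest}-{build_id}-{name}"


def is_legacy_key(key: str) -> bool:
    """Return True for keys in the existing long format."""
    return LEGACY_KEY.match(key) is not None


legacy = (
    "99108419ad0fd922fdeff9bbc434b58d41f68e3f923a83f6a7ab19568463bc82"
    "-20230915-201619-848782-1-test"
)
new = combined_key(
    "99108419ad0fd922fdeff9bbc434b58d41f68e3f923a83f6a7ab19568463bc82",
    datetime(2023, 9, 15, 20, 16, 19, 848782),
    1,
    "test",
)

print(is_legacy_key(legacy), is_legacy_key(new), len(new))
```

Because the timestamp can no longer be read back from the key, this approach relies on the database record to recover it, which matches the observation that the timestamp is already stored there.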