Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use file locking to protect selfcheck state file #6954

Closed
chrahunt opened this issue Aug 31, 2019 · 3 comments · Fixed by #6879

Comments

@chrahunt
Copy link
Member

@chrahunt chrahunt commented Aug 31, 2019

What's the problem this feature will solve?

There are several issues around file locking that have been filed over the years, specifically related to:

  1. Underlying OS/filesystem does not support hardlinks as used by the file lock (#2993, #5322, #6761)
  2. Lingering lock files and/or lock files in an inconsistent state can cause pip to hang when attempting to acquire the lock (some of #3532, #5034)
  3. lockfile uses hostname when creating its unique name, which can result in invalid paths when hostname includes a / (#6938)

Describe the solution you'd like

  1. Write a selfcheck state file per-prefix, to remove the need to read and then write the file within a lock
  2. Write the file atomically (write to a separate tmp file and then move into place) to avoid partial writes if the process is killed

This will satisfy the linked issues and help us progress on #4766 to remove lockfile entirely.

Alternative Solutions

  1. Switch to MkdirLockFile as currently used in the HTTP cache - the downside of this approach is that it is not backwards-compatible, so we would need to use a separate file to track the information for modern pip versions. If we would need to use a separate file anyway, we might as well go one step further to progress #4766.

Additional context

  • PR #6855 - writes per-prefix selfcheck state files
  • PR #6879 - removes file locking
@pfmoore

This comment has been minimized.

Copy link
Member

@pfmoore pfmoore commented Aug 31, 2019

Write a selfcheck state file per-prefix

Where would this go? If it's within site-packages, it will have problems with permissions (the system site-packages is often read-only) but if it's elsewhere, how will it get cleared up? For example, I routinely use pew mktmpenv to create throwaway virtual environments - I would not want selfcheck files remaining on my system for every one of these.

@chrahunt

This comment has been minimized.

Copy link
Member Author

@chrahunt chrahunt commented Aug 31, 2019

For context, moving to per-prefix selfcheck state files is implemented by #6855 (I have added this to the original issue).

Where would this go?

In #6855, it is <cache dir>/selfcheck/<sha224 of absolute prefix path>

  1. The total path length is exactly the same as for any of our HTTP cache files
  2. The prefix path itself is now in the contained JSON object so we can still tell where it comes from.

how will it get cleared up?

Great question. There is nothing for it in #6855, but we could envision either an ad-hoc activity supported as part of #4685, or a task that removes any selfcheck state files where the prefix path no longer exists. The task could be done by pip when it is updating its state file. I would prefer to implement this after we get rid of the file locking.

@pfmoore

This comment has been minimized.

Copy link
Member

@pfmoore pfmoore commented Aug 31, 2019

I would prefer to implement this after we get rid of the file locking.

I guess this is reasonable, as it's in the cache directory (which is already essentially a repository of opaque "clutter" from the POV of the user. It's not like the state files are a space issue. I've never really thought about it, but I guess we don't have any form of cache housekeeping at the moment, so this is really just a further case of that.

I raised #6956 to track the question of whether we need some form of cache housekeeping.

@lock lock bot added the S: auto-locked label Oct 15, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
2 participants
You can’t perform that action at this time.