
Support volatile processes. #10768

Merged: 3 commits merged into pantsbuild:master on Sep 13, 2020
Conversation

@jsirois (Member) commented Sep 12, 2020

This just wraps the current UUID hack up behind a type to keep the
surface area of the hack from spreading. If we eventually plumb a
volatility flag through Process to the Rust side, we can still do so with
a small change to the current UUID env var hack and a controlled
deprecation of the VolatileProcess type in favor of the Process flag,
should it come to exist.
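For context, the hack amounts to something like the following sketch. Process here is a simplified stand-in for the engine type, and the __PANTS_VOLATILE_UUID env var name is an assumption for illustration; only make_process_volatile is a name referenced elsewhere in this thread.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Process:
    """Simplified stand-in for the engine's Process type."""
    argv: tuple[str, ...]
    env: dict[str, str] = field(default_factory=dict)


def make_process_volatile(process: Process) -> Process:
    # A fresh UUID in the env changes the process's cache key on every run,
    # so the engine can never serve a stale cached result for it.
    env = dict(process.env)
    env["__PANTS_VOLATILE_UUID"] = uuid.uuid4().hex
    return Process(argv=process.argv, env=env)
```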

[ci skip-rust]
[ci skip-build-wheels]

@coveralls commented Sep 12, 2020

Coverage remained the same at 0.0% when pulling 15c306c on jsirois:pull/10751/alternative into 263ce68 on pantsbuild:master.

# could be gone tomorrow. Ideally we'd only do this for local processes since all known
# remoting configurations include a static container image as part of their cache key which
# automatically avoids this problem.
VolatileProcess(
@jsirois (Member, Author) commented Sep 13, 2020

Aaaaand ... this is still broken in realistic ways. Namely, upgrading or downgrading a binary found here previously will result in the same output from this rule and thus not trigger a re-run of rules depending on this data, which is wrong. The only cases currently covered by this fix from the list below are 1 and 2:

  1. An applicable binary is added to the search path.
  2. An applicable binary is removed from the search path.
  3. An applicable binary on the search path is modified (upgraded or downgraded most likely).

To handle 3, the contents of the discovered binaries need to be hashed. (And even that is not enough! Dynamically linked libraries could change a binary's output ... but we'll need to whistle past the graveyard; that level of detail can really only be addressed by always running in an image / using a local docker / podman / runc / crun type solution.) This could be done by storing the binary in the CAS as a hack, but we won't / can't generally use the binary stored in the CAS, so that just adds more storage bloat (see the make_process_volatile comment). Ideally we'd ask an engine intrinsic to get us a fingerprint without storing the object. It could also be done by just hashing the file at the resulting absolute path in Python rule code, at the expense of potentially heavy IO inhibiting parallelism. On my machine, hashing my biggest binary (/usr/bin/packer @ 154M) takes ~400ms in pure Python code, and hashing the currently more typical python binary (@ 19M) takes ~50ms.
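For reference, a minimal sketch of that pure-Python content-hashing approach (the path below is just an example; timings will vary by machine and disk cache):

```python
import hashlib
import time


def fingerprint_binary(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through sha256 so large binaries aren't read into memory at once."""
    hasher = hashlib.sha256()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


if __name__ == "__main__":
    start = time.monotonic()
    digest = fingerprint_binary("/usr/bin/python3")  # example path
    elapsed_ms = (time.monotonic() - start) * 1000
    print(f"sha256={digest[:16]}... ({elapsed_ms:.0f}ms)")
```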

I'd like to defer handling case 3 to a follow-up if folks are amenable.

@Eric-Arellano (Contributor) commented

Indeed. Tom had suggested hashing the binaries, as well. My original design was going to update the find_binary script to also run shasum -a 256 if available, then to store a private field _hashes on the BinaryPaths object. Call sites wouldn't use this, but it would result in the cache becoming invalidated if the hashes change, even if the paths remain the same.
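Roughly, that design would have looked like this (a sketch with a simplified stand-in for BinaryPaths; _hashes is the hypothetical private field described above):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryPaths:
    """Simplified stand-in for the real BinaryPaths result type."""
    binary_name: str
    paths: tuple[str, ...]
    # Never read by call sites; its presence alone means that a changed binary
    # changes this object's identity, invalidating dependent cache entries.
    _hashes: tuple[str, ...] = ()
```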

I don't think we need to block on case 3. By definition, fixing cases 1 and 2 is an improvement over what we have right now.

We should track this, though. It could result in really subtle issues for users, requiring them to know to purge the lmdb_store.

--

It gets even worse for Pyenv, which was the whole motivation for this fix. ~/.pyenv/shims/python is a Bash script. It will always have the same hash, regardless of which interpreter is used. The only "fixes" I could imagine are a) calling pyenv which python somehow, or b) telling users to stop using Pyenv the way it's intended to be used, e.g. don't use pyenv global and instead explicitly list the .pyenv/versions on their PATH.
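A sketch of option a), assuming pyenv is on the PATH (resolve_pyenv_shim is a hypothetical helper, not existing Pants code):

```python
import subprocess


def resolve_pyenv_shim(binary_name: str) -> str:
    """Return the absolute path of the real interpreter behind a pyenv shim.

    `pyenv which` resolves through the shim, so hashing the contents of the
    returned path tracks interpreter changes that the shim script itself hides.
    """
    result = subprocess.run(
        ["pyenv", "which", binary_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```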

@jsirois (Member, Author) commented

Yeah, I think shasum-if-available is not a great idea. Even relying on bash and an optional which, as we do today, is bad and TODO'd for replacement with a native binary for each platform we support.

Pyenv shims would be solved, IIUC, by something this should already be doing: allowing a test function. Really a test argv, where each found binary is executed as argv0 + argvTest. After all, the found binary is just name-matched. It may be broken, it may be malicious, etc. A test function at least provides a nominal check that the binary works, and the output of the test function, if any, could be mixed into the hash.
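A minimal sketch of that idea (check_binary, the timeout, and the digest choice are all assumptions for illustration):

```python
from __future__ import annotations

import hashlib
import subprocess


def check_binary(path: str, test_args: tuple[str, ...]) -> str | None:
    """Run the candidate binary as argv0 + test argv; return a digest of its
    output to mix into the cache key, or None if the candidate fails the check."""
    try:
        result = subprocess.run([path, *test_args], capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return hashlib.sha256(result.stdout + result.stderr).hexdigest()


# For a Python interpreter candidate, the test argv might be ("--version",):
# check_binary("/usr/bin/python3", ("--version",))
```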

@jsirois (Member, Author) commented

I'll follow up with this change and temporarily push hashing into the check process, with an issue broken out to track separating it back out. Since python is an interpreter, we only happen to be able to do fancy checks with it; for other binaries that won't be possible.

@jsirois (Member, Author) commented

The follow-up is in #10770.

@Eric-Arellano (Contributor) left a comment

Very good call to reduce the surface area. This is a good change for our plugin API, too. We know at least one user wants to run Git processes for their plugin, which should never be cached.

How about "uncachable" instead of "volatile"? I acknowledge "volatile" has prior art in C/C++/Java. However, in an ideal world, to avoid storage bloat, we would never cache an uncachable process in the first place. Personally, I think "uncachable" is simpler English than having to be familiar with "volatile" from those languages.

@jsirois (Member, Author) commented Sep 13, 2020

Uncacheable sgtm.

[ci skip-rust]
[ci skip-build-wheels]