Environment variable sandboxing #2794

Open
wants to merge 6 commits into
base: master
from

Conversation

@jsgf

jsgf commented Oct 26, 2019

Rendered

First draft of a proposal to add a mechanism to precisely control the environment available to the env!/option_env! macros.



# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

@sfackler

sfackler Oct 26, 2019

Member

env(1) seems like a plausible alternative to me.

@jsgf

jsgf Oct 27, 2019

Author

Sure, I'll add it, along with explicitly using the shell.

@Ixrec

Contributor

Ixrec commented Oct 27, 2019

  • AFAIK FOO=BAR rustc ... in most shells will set the (real) environment variable FOO for just that invocation of rustc. So what problem does --env-set solve? Is there a reason you'd ever want to set an environment variable in the "logical" environment but not the "real" environment of a single rustc invocation?
  • The other use cases seem to want --env-whitelist. What would --env-blacklist be used for?
  • Allowing arbitrary combinations of all three flags and making the combined semantics of conflicting combinations depend on the order they're passed feels very strange to me. Why would anyone pass something like --env-set FOO=BAR --env-blacklist FOO in the first place? This almost feels like it's designed for some automated tool to generate these cli args; is that a use case we have?
@jsgf

Author

jsgf commented Oct 27, 2019

@Ixrec:

  • AFAIK FOO=BAR rustc ... in most shells will set the (real) environment variable FOO for just that invocation of rustc. So what problem does --env-set solve?

A shell-based mechanism only applies if you're invoking rustc from a shell. If it's some other build tool doing the invocation, then it may not have a mechanism to control the environment in that way.
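For a tool that spawns rustc directly, the closest equivalent to env(1) is to build the child's environment explicitly. The following is a minimal sketch using Rust's `std::process::Command`; the helper name `sandboxed_rustc`, the input file `main.rs`, and the `FOO=BAR` value are placeholders for illustration, not part of the RFC. Note that clearing the real environment can also remove variables rustc itself needs in order to run.

```rust
use std::process::Command;

/// Hypothetical helper: describe a rustc invocation whose real environment
/// contains only the variables listed here.
fn sandboxed_rustc() -> Command {
    let mut cmd = Command::new("rustc");
    // Drop everything inherited from the parent, then add back what we want.
    cmd.env_clear().env("FOO", "BAR");
    cmd.arg("main.rs");
    cmd
}

fn main() {
    let cmd = sandboxed_rustc();
    // Only inspect the command here; spawning it would need a real main.rs.
    for (name, value) in cmd.get_envs() {
        println!("{:?} = {:?}", name, value);
    }
}
```

This controls the real environment of the process, so it cannot express a variable that is visible to env! but absent from the compiler's own environment.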

Is there a reason you'd ever want to set an environment variable in the "logical" environment but not the "real" environment of a single rustc invocation?

There seem to be a few use cases for things like env!("PATH") to capture a fallback path of last resort (if every runtime mechanism fails). In this case you'd want to specify a path independently of the build environment, especially if you're cross-compiling, or otherwise building in an environment different from the deployment environment.

Or you may just want to block anything reading PATH entirely.
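As a rough illustration of the fallback pattern, the sketch below captures PATH at compile time with `option_env!` and uses it only when no runtime value is available. The helper `search_path` and the hard-coded default `/usr/bin:/bin` are hypothetical names chosen for this example. Under the proposal, the value seen by `option_env!` would come from the logical environment rather than from whatever PATH the build machine happens to have.

```rust
use std::env;

/// PATH as seen by the compiler, if it was visible at build time.
/// `option_env!` expands to an `Option<&'static str>` during compilation.
const BUILD_TIME_PATH: Option<&str> = option_env!("PATH");

/// Hypothetical helper: prefer the runtime value, then the compile-time
/// capture, then a hard-coded default of last resort.
fn search_path(runtime: Option<String>) -> String {
    runtime
        .or_else(|| BUILD_TIME_PATH.map(str::to_owned))
        .unwrap_or_else(|| String::from("/usr/bin:/bin"))
}

fn main() {
    let path = search_path(env::var("PATH").ok());
    println!("using search path: {}", path);
}
```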

  • The other use cases seem to want --env-whitelist. What would --env-blacklist be used for?

That's just for completeness because it's hard to do negative matching in a regex. One could imagine that you might want to blacklist security-sensitive variables, such as SSH_.*, without needing/wanting to manage everything.

  • Allowing arbitrary combinations of all three flags and making the combined semantics of conflicting combinations depend on the order they're passed feels very strange to me. Why would anyone pass something like --env-set FOO=BAR --env-blacklist FOO in the first place? This almost feels like it's designed for some automated tool to generate these cli args; is that a use case we have?

Yes, it is primarily intended for tooling to generate these options (especially since rustc is rarely invoked directly anyway). I suggest a Cargo-based idea in the RFC, though my primary use case is a non-Cargo environment.

It doesn't make any real sense to set a variable then blacklist it, but it is important to define semantics for this case. I propose these because they're easy to describe and implement, but I'm open to other proposals if they make more sense.
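To make the order-dependent semantics concrete, here is a small model of how the options could be applied in sequence. It is a sketch under several assumptions: `EnvOp`, `Pattern`, and `logical_env` are hypothetical names; the set and blacklist behaviour is inferred from the option names; and, to stay within the standard library, patterns are reduced to exact names or name prefixes instead of full regexes. Matching is against the whole variable name, as the RFC text describes.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the REGEX argument (an assumption of this sketch).
enum Pattern {
    Exact(&'static str),
    Prefix(&'static str),
}

impl Pattern {
    fn matches(&self, name: &str) -> bool {
        match self {
            Pattern::Exact(n) => name == *n,
            Pattern::Prefix(p) => name.starts_with(*p),
        }
    }
}

/// One command-line option, kept in the order it was given.
enum EnvOp {
    Set(&'static str, &'static str),
    Whitelist(Pattern),
    Blacklist(Pattern),
}

/// Start from the process environment and apply each option in turn.
fn logical_env(
    process_env: BTreeMap<String, String>,
    ops: &[EnvOp],
) -> BTreeMap<String, String> {
    let mut env = process_env;
    for op in ops {
        match op {
            EnvOp::Set(name, value) => {
                env.insert(name.to_string(), value.to_string());
            }
            // Any name that does not match is removed.
            EnvOp::Whitelist(pat) => env.retain(|name, _| pat.matches(name)),
            // Any name that matches is removed.
            EnvOp::Blacklist(pat) => env.retain(|name, _| !pat.matches(name)),
        }
    }
    env
}

fn main() {
    let process_env: BTreeMap<String, String> = [
        ("CARGO_PKG_NAME", "demo"),
        ("SSH_AUTH_SOCK", "/tmp/agent.sock"),
        ("FOO", "from-process"),
    ]
    .iter()
    .map(|(k, v)| (k.to_string(), v.to_string()))
    .collect();

    // Setting FOO and then blacklisting it leaves FOO unset: the later option wins.
    let ops = [
        EnvOp::Blacklist(Pattern::Prefix("SSH_")),
        EnvOp::Set("FOO", "BAR"),
        EnvOp::Blacklist(Pattern::Exact("FOO")),
    ];
    println!("{:?}", logical_env(process_env, &ops));
}
```

Swapping the last two options would leave FOO set to BAR, which is why the order of the flags has to be part of the specification.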

jsgf added 3 commits Oct 27, 2019
…chanisms.

Complexities of knowing which env vars are actually needed to run rustc.
initialized from the process environment. Then each `--env` option is processed in turn, as it appears, to update the logical
environment. Specifically:

- `--env-whitelist REGEX` - Any name which doesn't match the REGEX is removed from the logical environment,

@pcpthm

pcpthm Oct 27, 2019

Is the regular expression needed? Explicitly listing all individual variables seems not unreasonable.
Another question: does TEST match NOT_TEST?

@jsgf

jsgf Oct 27, 2019

Author

I chose regex for things like allowing all CARGO_ vars without having to enumerate all the features or whatever else cargo adds.

I mentioned that the regex is anchored, so it implicitly has ^$ around it. So no, TEST wouldn't match NOT_TEST.


These options are:
- `--env-whitelist REGEX` - match the REGEX against all existing process environment
variables and allow them to be seen. This overrides `--env-remove-all`. The regex is matched against the entire variable name

@matklad

matklad Oct 27, 2019

Member

It seems that --env-remove-all is not defined in the RFC?

@jsgf

jsgf Oct 27, 2019

Author

--env-blacklist .*

@matklad

Member

matklad commented Oct 27, 2019

rust-analyzer will have to do something along these lines once we get to supporting env!. Unlike rustc, rust-analyzer has to work with many crates at the same time. If we just used std::env::var, we would make env vars the same for all crates, which clearly doesn't work for things like CARGO_OUTPUT_DIR. So instead we plan to maintain a per-crate logical environment, possibly with a fallback to std::env::var.
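A per-crate logical environment of this kind could be modelled roughly as follows. This is only a sketch: `CrateEnv` and its fields are hypothetical and are not rust-analyzer's actual types, and the fallback to `std::env::var` is made an explicit opt-in here purely for illustration.

```rust
use std::collections::HashMap;
use std::env;

/// Hypothetical per-crate environment table.
#[derive(Default)]
struct CrateEnv {
    vars: HashMap<String, String>,
    /// Whether lookups may fall back to the process environment.
    fallback_to_process: bool,
}

impl CrateEnv {
    fn set(&mut self, name: &str, value: &str) {
        self.vars.insert(name.to_owned(), value.to_owned());
    }

    /// Look up a variable the way an `env!`/`option_env!` expansion would need to.
    fn get(&self, name: &str) -> Option<String> {
        match self.vars.get(name) {
            Some(value) => Some(value.clone()),
            None if self.fallback_to_process => env::var(name).ok(),
            None => None,
        }
    }
}

fn main() {
    let mut crate_a = CrateEnv::default();
    crate_a.set("CARGO_PKG_NAME", "crate_a");
    let mut crate_b = CrateEnv::default();
    crate_b.set("CARGO_PKG_NAME", "crate_b");

    // Each crate sees its own value even though both live in one process.
    println!("{:?}", crate_a.get("CARGO_PKG_NAME"));
    println!("{:?}", crate_b.get("CARGO_PKG_NAME"));
}
```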

@jsgf

Author

jsgf commented Oct 27, 2019

Yes, I was thinking about the benefits of setting the env explicitly for persistent/embedded compiler instances.

@jsgf jsgf changed the title from "First draft for environment variable sandboxing" to "Environment variable sandboxing" on Oct 28, 2019

By default, all environment variables are available, with their values taken from the process environment. There are several
additional options to control the logical environment accessed by `env!()`/`option_env!()`:
- only allow access to a specific whitelist of variables

@joshtriplett

joshtriplett Oct 31, 2019

Member

I like this idea in general, and am broadly in favor of this RFC. I'll review it again later. As an initial review pass, though, could you please avoid using "blacklist" and "whitelist" (both in the RFC and in the option names), in favor of (for instance) "blocklist" and "safelist", or perhaps "block" and "allow"? See numerous references such as https://twitter.com/dhh/status/1032050325513940992 and https://twitter.com/andybohm/status/1038491107829530625 for rationale. Thank you.

@kennytm

kennytm Oct 31, 2019

Member

The two arguments take regexes, not lists, so calling them "lists" is a misnomer anyway 🙃.

(If we call them --env-allow and --env-deny, I wonder if --env-warn makes sense 😛)

@jsgf

jsgf Oct 31, 2019

Author

Allow/deny is consistent with other rustc terminology.

[drawbacks]: #drawbacks

The primary cost is additional complexity in invoking `rustc` itself, and additional complexity in documenting
`env!`/`option_env!`. Procedural macros would need to be changed to access the logical environment, either by

@programmerjake

programmerjake Oct 31, 2019

It appears as though you forgot to write the rest of the sentence.

@joshtriplett

Member

joshtriplett commented Oct 31, 2019
