Skip to content
This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

Allowlist for envp (& argv?) #2356

Closed
Nirusu opened this issue May 7, 2021 · 29 comments · Fixed by gramineproject/gramine#56
Closed

Allowlist for envp (& argv?) #2356

Nirusu opened this issue May 7, 2021 · 29 comments · Fixed by gramineproject/gramine#56

Comments

@Nirusu
Copy link
Contributor

Nirusu commented May 7, 2021

Description of the problem

Sometimes there may be cases in which it is desired to pass information from the host to the enclave, as long as it is handled in a secure way, as already discussed in #2347 .

However, when choosing the approach to pass the whole environment variables and sanitize them later in the application (e.g. in some kind of "premain"), you can run into certain issues with the LD_ variables Graphene requires to correctly execute dynamically linked bianries, thus potentially achieving unwanted code execution when a manifest with loose "pass-through" mounts/options was deployed.

Suggested Solution

Given that the only other sane "easy" known way to actually pass through configuration or data from the host is by using "sgx.allowed_files", I would suggest to add certain allowed "envp" or "argv" (if this is handable in a reasonable way, argv seems more tricky to handle) arguments to be specifiable in the manifest, similar to Occlum which allows to define certain "untrusted" env vars to pass through to the enclave.

Thoughts

This is still not 100% free of misuse, however would at least prevent the case in which an user has to manually sanitize host envs or argv in their application (and potentially make mistakes on the way as it mixes host env & Graphene env vars), and would provide an easier way compared to sgx.allowed_files with a similar level of potential misuse.

I know from #2347 that @mkow discourages this kind of option, however I think there are definitely legit use cases to pass these kind of arguments from the host to the enclave, yet there seems to be a lack of other good/easy options so far. I think manually specifying the vars to pass through "untrusted" is still a better option than going with the shotgun-approach to allow all and sanitize later, or just don't and come up with clever workaround.

If there is another secure way for this use case, I think it requires at least some documentation then to show users how to handle this case correctly.

Feedback & Opinions appreciated!

@dimakuv
Copy link
Contributor

dimakuv commented May 7, 2021

So what you want is basically this syntax in Graphene manifest:

loader.env.first = "This envvar is hard-coded and trusted (always overwrites untrusted env, if any)"
loader.untrusted_env.second = "This envvar is passed to app as-is (if no env, then this default value is assigned)"

I'm fine with this, but we need to streamline our envvar handling... We have way too many options now, and it is unclear how they combine.

@Nirusu Nirusu changed the title Allowlist for argv & envp Allowlist for envp (& argv?) May 7, 2021
@Nirusu
Copy link
Contributor Author

Nirusu commented May 7, 2021

Yup, or other way around (provide default value, make it overwritable), though I suppose the "trusted does overwrite untrusted" approach is properly better.

@dimakuv
Copy link
Contributor

dimakuv commented May 7, 2021

Yup, or other way around (provide default value, make it overwritable)

Yeah, that is exactly what my new proposed syntax means:
loader.untrusted_env.second = "this default will be overwritten by user-supplied envvar, if any"

@Nirusu
Copy link
Contributor Author

Nirusu commented May 7, 2021

Ah sorry, yes that would work fine for our use case 👍🏻

@mkow
Copy link
Member

mkow commented May 8, 2021

Overall this sounds like a useful feature and I'm not against it.
What I'm against is passing LD_LIBRARY_PATH from untrusted world :) (and I don't see how adding such a selective passthrough could solve the original issue which you solved with passing LD_LIBRARY_PATH. I don't know what the issue was, but this new proposal doesn't sound like a fix for it)

@Nirusu
Copy link
Contributor Author

Nirusu commented May 8, 2021

Reality is that we couldn't care less about LD_LIBRARY_PATH. Not my lines of code, but I can only imagine we pass this through to correctly load a dynamically linked application. Normally we don't touch any of those LD vars because we have no reasons to. We know it's a bad idea - it's just that we always assumed that one would set this in the manifest (which of course does not necessarily need to be the case, so the assumption was not good). In this case they would be set in stone, so nothing a malicious host could temper with ideally. And if someone sets bad variables in the manifest, that's not quite our problem, the issue would have existed in any other way too.

It's just when it's not defined, and the case being that after entering the entrypoint, the definition of host vars and Graphene manifest vars become virtually indistinguishable unless you reparse the manifest again.

This is why we propose this. We only care about the "EDG_" vars for Coordinator discovery and the right parameters on what the application should get provisioned from it, verified by the package definitions in the manifest and the signature properties. LD_LIBRARY_PATH only was bandaid for correct execution, and we would be very delighted to get rid of this from our application :)

@mkow
Copy link
Member

mkow commented May 8, 2021

Ok, I get it now :) Until we get this feature, please ensure that LD_LIBRARY_PATH is actually defined in the user manifest, otherwise this will be completely insecure (and we can't expect users to know this quirk).

@thomasten
Copy link

After the discussion in edgelesssys/marblerun#158 we came to the conclusion that this is an important feature for us. Do you consider to implement this? And if so, can you give an approximate ETA?

@dimakuv
Copy link
Contributor

dimakuv commented May 18, 2021

@thomasten This is a generally simple feature to implement. I can do it this week, though I'm not sure if @mkow likes my proposed approach. Michal, what do you think about this:

loader.env.first = "This envvar is hard-coded and trusted (always overwrites untrusted env, if any)"
loader.untrusted_env.second = "This envvar is passed to app as-is (if no env, then this default value is assigned)"

@mkow
Copy link
Member

mkow commented May 18, 2021

@dimakuv Looks fine, although I think passthrough_env name would be easier to understand than untrusted_env for the users (in terms of what it actually does). Also, when doing this, maybe we should finally switch to proper TOML dicts syntax? This would be more work, but it parses to the same result in TOML, so the change would be only in the conventions we use in our examples. So, it would look like this:

[loader.env]
first = "asdf"
second = "qwer"

[loader.passthrough_env]
third = "1234"
fourth = "5678"

One thing I don't like is that these default values are confusing for passthroughs, it's not intuitive that they are defaults. Do we even need a defaults for passthroughs? Maybe we could just change this into a list of env names for passing through, without any defaults.

@Nirusu
Copy link
Contributor Author

Nirusu commented May 20, 2021

I think having defaults could actually prove useful for some cases, e.g. you want to set some default value that you expect most of the time, but could be overrided in certain cases? Though not too sure about any concrete examples right now where this would actually be important.

Another solution I could think of to handle this would be to define the default values in loader.env, and make them passable from the host via loader.passthrough_env. I think this would this more intuitive for the default values, however might create confusion about which vars can be trusted without special care, and which under consideration. But I don't think that would be a huge problem.

@mkow
Copy link
Member

mkow commented May 20, 2021

I think having defaults could actually prove useful for some cases, e.g. you want to set some default value that you expect most of the time, but could be overrided in certain cases? Though not too sure about any concrete examples right now where this would actually be important.

This could be handled either by the app (which has to handle the lack of this env anyways) or by the caller. I prefer to keep Graphene as simple as possible, as the logic inside Graphene is security-critical.

Another solution I could think of to handle this would be to define the default values in loader.env, and make them passable from the host via loader.passthrough_env. I think this would this more intuitive for the default values, however might create confusion about which vars can be trusted without special care, and which under consideration. But I don't think that would be a huge problem.

I like this approach, definitely better than the originally suggested syntax :)

@dimakuv
Copy link
Contributor

dimakuv commented May 21, 2021

@mkow @Nirusu So do we converge on the following syntax?

[loader]
passthrough_host_env = ["foo", "bar", "baz"]

[loader.env]
foo = "asdf"
baz = "qwer"
# "bar" retains the (insecure) value from the host
qux = "1234" # GRAPHENE LOUDLY FAILS BECAUSE NOT IN THE passthrough_host_env ARRAY!

Moreover, Graphene will loudly fail if loader.insecure__use_host_env and loader.passthrough_host_env are simultaneously specified in the manifest file (they are mutually exclusive).

P.S. Note that I renamed passthrough_env into passthrough_host_env for readability reasons and to be similar to the already existing loader.insecure__use_host_env.

@mkow
Copy link
Member

mkow commented May 21, 2021

Looks good, except this part:

qux = "1234" # GRAPHENE LOUDLY FAILS BECAUSE NOT IN THE passthrough_host_env ARRAY!

I thought you were supposed to still be able to define other variables? Without this the feature will be unusable - e.g. you won't be able to hardcode LD_LIBRARY_PATH in the manifest and simultaneously allow EDG_SOMETHING from the host.

@dimakuv
Copy link
Contributor

dimakuv commented May 21, 2021

@mkow True. So it should be this:

[loader]
passthrough_host_env = ["foo", "bar", "baz"]

[loader.env]
foo = "asdf"  # overrides the host env
# "bar" retains the (insecure) value from the host
baz = "qwer"  # overrides the host env
qux = "1234"  # not propagated from host env, created inside Graphene

@mkow
Copy link
Member

mkow commented May 21, 2021

foo = "asdf"  # overrides the host env

Not the other way? (to make it work as defaults for these)

@dimakuv
Copy link
Contributor

dimakuv commented May 21, 2021

I don't like such semantics. Why not allow the host (user) to pass the default, or no-value? I think the user should decide on this.

Also, when I see loader.foo = "asdf", I am automatically assuming that this is the "secure" value. But in your semantics, it will only be the "default", which could be overwritten by a passthrough value.

@mkow
Copy link
Member

mkow commented May 21, 2021

I don't like such semantics. Why not allow the host (user) to pass the default, or no-value? I think the user should decide on this.

That was my original suggestion (that Graphene shouldn't handle defaults), but @Nirusu argued that it would be very handy for them ;)

So, I personally prefer to not support passthrough defaults, but it's not a terribly strong preference.

@Nirusu
Copy link
Contributor Author

Nirusu commented May 21, 2021

@dimakuv The suggestion to define a pass-through variable and make it override in the same manifest does not make sense. It essentially makes the definition useless, as the manifest is set in stone after the build and thus always overrides the pass-through. Might as well just not define it at all in this case.

@mkow I don't have a terribly strong preference either. Having defaults seems like a good idea to me because it allows to use host environment variables to override certain things when desired, but I get that defining a default and then defining it as passthrough makes it worse in terms of security is a weird UX behavior.

For our own use case it's not really important, it might make our samples a little bit cleaner to launch assuming you can just put the defaults in the manifest and do not have to define them every time on launch, but that's not really critical. I was trying to think of a broader picture if it actually would make sense for some applications to use it this way. We did it in EGo with an extra flag if someone wants the host to override defintions, and Occlum does it similarly.

But feel free to choose what makes more sense for you, I get both sides :)

@mkow
Copy link
Member

mkow commented May 21, 2021

Ok, so maybe for now we could disallow the same variable to be present in both lists (i.e. fail with an error), and if we find a stronger reason to allow specifying defaults, then we could implement it. This way we could add this feature later without silently changing semantics of something which users already would have in their manifests.

@dimakuv
Copy link
Contributor

dimakuv commented Jul 22, 2021

Somehow we still don't have a good user-friendly backwards-compatible proposal for this feature.

@Nirusu
Copy link
Contributor Author

Nirusu commented Jul 22, 2021

Mhh, why should it not be backwards-compatible if pass-through is specified as an extra value in the config? When it's not set the environment variable definition can work as before. Or do I miss something here?

@dimakuv
Copy link
Contributor

dimakuv commented Jul 22, 2021

Ok, I digged our old internal conversation on Slack, and this is the latest state of affairs:

loader.env.VAR1 = "value1"
loader.env.VAR2 = { passthrough = true }
loader.env.VAR3 = { value = "for those, who always want toml table" }

# another, more TOML-ish syntax for the same
[loader.env.VAR4]
passthrough = true
[loader.env.VAR5]
value = "asdfg"
  • loader.env.VAR = { passthrough = true, value = "hard-coded" } is an error -- as per Michal's comment Allowlist for envp (& argv?) #2356 (comment)
  • loader.env.VAR = { passthrough = true}; loader.env_src_file = "file:contains_this_var" is an error -- mainly because it's unclear which one should take precedence...
  • loader.env.VAR = { passthrough = true}; loader.insecure__use_host_env = true -- the first statement is essentially ignored

Actually, this looks pretty straight-forward to me.

@dimakuv
Copy link
Contributor

dimakuv commented Aug 10, 2021

@Nirusu Could you take a look at #2641 and maybe try it out?

@Nirusu
Copy link
Contributor Author

Nirusu commented Aug 10, 2021

Will give it a shot tomorrow! Thanks, very appreciated :)

@Nirusu
Copy link
Contributor Author

Nirusu commented Aug 11, 2021

@dimakuv Tested it out with our Marblerun samples, and yup, it works as expected and we can finally get rid of our filtering hack when this is merged. Great job!

@dimakuv
Copy link
Contributor

dimakuv commented Sep 29, 2021

@Nirusu Just in case you haven't noticed, we merged the envp PR!

@thomasten
Copy link

Thanks @dimakuv. Are you able to estimate when you will publish the first (pre) release (candidate) of Gramine? Ideally, we would wait for this and then make all necessary changes to MarbleRun.

@dimakuv
Copy link
Contributor

dimakuv commented Sep 29, 2021

@thomasten I would say definitely by mid-October. Ideally by the end of next week (~ 8. October).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.