Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop unmatched environment variables. #1550

Merged
merged 6 commits into from
Nov 1, 2022
Merged

Drop unmatched environment variables. #1550

merged 6 commits into from
Nov 1, 2022

Conversation

matajoh
Copy link
Member

@matajoh matajoh commented Oct 25, 2022

This adds a allow_environment_variable_dropping flag to data.policy which allows the framework, if set, to try to drop environment variables if there are no matching containers/processes. This is an important aspect of serviceability, as it will allow customer policies to be robust to the addition of extraneous environment variables which are unused by their containers. Note throughout this that the existing logic of required environment variables holds.

Before we dive into the details of how the logic for this feature works, it may help to discuss narrowing. The fundamental challenge of creating an execution policy for GCS is that we do not know ahead of time which actions the host will take, nor in which order those actions will be taken. As a result, there can be some degree of uncertainty at each stage of policy execution. For example, when there is a request to create a container, there may be several containers with the same mounts, layers, environment variables, etc. The fact that there is more than one indicates that the action should be allowed, but at that point all we have done is narrow the set of potential containers from all containers in the policy to those which match the request. As further requests are made (e.g., exec_in_container) we may be able to narrow down the set of matching containers further. As dropping environment variables can have a large effect on the size, or cardinality, of the set of matching policy objects it is very important to approach the process in a principled manner.

Assume that we have narrowed the set of matching policy objects to a set which agree in every aspect except the set of environment variables. We want to drop environment variables such that one or more of these matching policy objects can be used to approve the request. The logic used is the following:

  1. Produce a set $A$ of valid environment variable subsets for each entity (e.g. container, process)
  2. All subsets which share the maximum cardinality from $A$ form a subset $B \subseteq A$
  3. If $\bigcup B = \bigcap B$, return $B$
  4. Else, return undefined

The resulting subset of environment variables is then used to determine the matching containers. This may be best explained via an example. We have three containers with the following sets of environment variables:

  1. $c_0 = \{a, b\}$
  2. $c_1 = \{a, b, c\}$
  3. $c_2 = \{a, b, c, d\}$

If the host requests to start a container with $[a, b, c, d, e]$ then, without dropping environment variables, the request will be denied. However, if variables are allowed to be dropped, then we could potentially match any of these containers:

$$ A = \{[a, b], [a, b, c], [a, b, c, d]\} \\ $$

however, the cardinality rule means that we will choose:

$$ B = \{[a, b, c, d]\} $$

As (in this case) $\bigcup B = \bigcap B$ is trivially true, we return $[a, b, c, d]$ as the new set of environment variables, which will then match with $c_2$. If, however, we had one more conatainer:

$$ c_3 = \{a, b, c, e\} $$

Then we get a very different result. As before, the request would be denied if dropping environment variables is allowed. If allowed, though, we get the following:

  1. $A = \{[a, b], [a, b, c], [a, b, c, d], [a, b, c, e]\}$
  2. $B = \{[a, b, c, d], [a, b, c, e]\}$
  3. $\bigcup B = [a, b, c, d, e]$
  4. $\bigcap B = [a, b, c]$

As we can see, $\bigcup B \neq \bigcap B$ and so the result is undefined. This is because, at this stage, we cannot choose between these two containers fairly.

This adds a `allow_environment_variable_dropping` flag to `data.policy`
which allows the framework, if set, to try to drop environment variables
if there are no matching containers/processes. This is an important
aspect of serviceability, as it will allow customer policies to be
robust to the addition of extraneous environment variables which are
unused by their containers. Note throughout this that the existing logic
of `required` environment variables holds. The logic used is the
following:

1. Produce a set $A$ of valid environment variable subsets for each
   entity (e.g. container, process)
2. All subsets which share the maximum cardinality from $A$ form a
   subset  $B \subseteq A$
3. If $\bigcup B = \bigcap B$, return $B$
4. Else, return undefined

The resulting subset of environment variables is then used to determine
the matching containers. This may be best explained via an example.
We have three containers with the following sets of environment
variables:

$$
c_0 = \{a, b\} \\
c_1 = \{a, b, c\} \\
c_2 = \{a, b, c, d\}
$$

If the host requests to start a container with $[a, b, c, d, e]$ then,
without dropping environment variables, the request will be denied.
However, if variables are allowed to be dropped, then we could
potentially match any of these containers:

$$
A = \{[a, b], [a, b, c], [a, b, c, d]\} \\
$$

however, the cardinality rule means that we will choose:

$$
B = \{[a, b, c, d]\}
$$

As (in this case) $\bigcup B = \bigcap B$ is trivially true, we return
$[a, b, c, d]$ as the new set of environment variables, which will then
match with $c_2$.

If, however, we had one more conatainer:

$$
c_3 = \{a, b, c, e\}
$$

Then we get a very different result. As before, the request would be
denied if dropping environment variables is allowed. If allowed, though,
we get the following:

$$
A = \{[a, b], [a, b, c], [a, b, c, d], [a, b, c, e]\} \\
B = \{[a, b, c, d], [a, b, c, e]\} \\
\bigcup B = [a, b, c, d, e] \\
\bigcap B = [a, b, c] \\
$$

As we can see, $\bigcup B \neq \bigcap B$ and so the result is
undefined. This is because, at this stage, we cannot choose between
these two containers fairly.

Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
@matajoh matajoh requested a review from a team as a code owner October 25, 2022 10:03
@matajoh
Copy link
Member Author

matajoh commented Oct 25, 2022

@anmaxvl @helsaawy

Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
internal/guest/runtime/hcsv2/uvm.go Outdated Show resolved Hide resolved
internal/guest/runtime/hcsv2/uvm.go Outdated Show resolved Hide resolved
pkg/securitypolicy/securitypolicyenforcer_rego.go Outdated Show resolved Hide resolved
pkg/securitypolicy/securitypolicy_marshal.go Show resolved Hide resolved
pkg/securitypolicy/securitypolicyenforcer_rego.go Outdated Show resolved Hide resolved
pkg/securitypolicy/securitypolicyenforcer_rego.go Outdated Show resolved Hide resolved
Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
@BryceDFisher
Copy link
Contributor

BryceDFisher commented Oct 27, 2022

Then we get a very different result. As before, the request would be denied if dropping environment variables is allowed.

Is this supposed to read "if dropping environment variables is not allowed"?

@BryceDFisher
Copy link
Contributor

BryceDFisher commented Oct 27, 2022

In your second example where UB != nB, you mention that the result is undefined. I'm concerned that this will be the typical case when looking towards production workloads. If the customer container includes 1 variable that a sidecar container does not have, and the sidecar includes 1 variable that the customer containers does not have, would this result in always undefined behavior?

@SeanTAllen
Copy link
Contributor

SeanTAllen commented Oct 27, 2022

In your second example where UB != nB, you mention that the result is undefined. I'm concerned that this will be the typical case when looking towards production workloads. If the customer container includes 1 variable that a sidecar container does not have, and the sidecar includes 1 variable that the customer containers does not have, would this result in always undefined behavior?

That wouldn't be undefined. The key context here is "all other things being equal except for environment variables".

An example of undefined would be two payload containers that are exactly the same in terms of layers, commands, working directory etc, but we have something like this:

drop environment variables allowed

container A
env: Foo required

container B
env: Bar required

and the request is to start a container that could be A or B, but has both env Foo and env Bar.

@BryceDFisher
Copy link
Contributor

Ok. That wasn't clear from the original description. Thanks

@matajoh
Copy link
Member Author

matajoh commented Oct 27, 2022

@BryceDFisher The original description was unclear, thanks for asking for clarification. I've updated it with more explanation of the "narrowing" process, which should hopefully help.

}

func (policy *regoEnforcer) query(enforcementPoint string, input map[string]interface{}) (map[string]interface{}, error) {
func (policy *regoEnforcer) query(enforcementPoint string, input inputData) (policyResults, error) {
store := inmem.NewFromObject(policy.data)

input["name"] = enforcementPoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side question: since map types are pointers, this will update the caller's map, input.
Is that an intended side effect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, intended in the sense that I realize what you mention but that it doesn't really matter in this particular case. The API has been designed such that name is reserved to always be equal to the name of the enforcement point being queried.


// cri adds TERM=xterm for all workload containers. we add to all containers
// to prevent any possible error
envRules = append(envRules, securitypolicy.EnvRuleConfig{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRI adds this by default to all container specs I believe. We'd need to handle this similar to how we handle default CRI mounts. Not sure if we just update the existing ExtendDefaultCRIMounts to be something else and also accept default env vars or add a new method that does it.


response, err := client.CreateContainer(ctx, containerRequest)
if err != nil {
if config.allowed {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is allowed config for? client.CreateContainer only creates metadata in containerd, so the actual enforcement happens during startContainer(t, client, ctx, containerID) request. So if you want to catch any policy errors and ignore/fail based on this config, that's when they'll surface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
Signed-off-by: Matthew Johnson (MSRC) <matjoh@microsoft.com>
@anmaxvl anmaxvl merged commit 0f0c894 into microsoft:main Nov 1, 2022
if err != nil {
return pid, errors.Wrapf(err, "exec in container denied due to policy")
}

if envToKeep != nil {
params.OCIProcess.Args = envToKeep
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, ok. we missed this...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed what?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that we're assigning env to args...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its how drop env needs to work at some level. currently the "protection" for this is in the engine so that the list can only contain items that were in the original list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I'm saying is that instead of assinging to OCIProcess.Env we're assining to OCIProcess.Args in the code above. effectively overriding the process command rather than environment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, we need to get that fixed asap.

@anmaxvl anmaxvl deleted the drop_envs branch November 3, 2022 00:36
anmaxvl pushed a commit that referenced this pull request Feb 7, 2023
PR is for updating hcsshim ADO branch from github equivalent.

Related work items: #1538, #1539, #1544, #1545, #1550, #1551, #1552, #1553, #1559, #1561
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants