Added image-policy proposal #27129
Conversation
@philips it might make sense to have https://github.com/coreos/clair be a backend for this image policy webhook, or something like that? Can you at-mention the right people from Clair? Also @ericchiang because it is a type of authorization, but intentionally not in RBAC.
| - For non-replicated things (size 1 ReplicaSet, PetSet), a single node failure may disable it.
| - a node rolling update will eventually check for liveness of replacements, and would be throttled if
| in the case when the image was no longer allowed and so replacements could not be started.
| - rapid node restarts will cause existing pod objects to be restared by kubelet.
ericchiang
Jun 9, 2016
Member
s/restared/restarted/
erictune
Jun 9, 2016
Author
Member
fixed
| * Implementing objects in core kubernetes which describe complete policies for what images are approved.
| * A third-party implementation of an image policy checker could optionally use ThirdPartyResource to store its policy.
| * Kubernetes core code dealing with concepts of image layers, build processes, source repositories, etc.
| * We expect there will be multiple PaaSes and/or de-facto programming enviroments, each with different takes on
ericchiang
Jun 9, 2016
Member
s/enviroments/environments/
erictune
Jun 9, 2016
Author
Member
fixed
| }
| // ImageReviewSpec is a description of the token authentication request.
| type ImageReviewSpec struct {
smarterclayton
Jun 9, 2016
Contributor
Do we need to know the authenticated user / service account that is attempting to perform this action? Who can matter as much as what in this case.
smarterclayton
Jun 9, 2016
Contributor
Nm, saw the comment below. Commented there.
| // Error should be empty unless Allowed is false in which case it
| // may contain a short description of what is wrong. Kubernetes
| // may truncate excessively long errors when displaying to the user.
| Error string
smarterclayton
Jun 9, 2016
Contributor
Shouldn't this be Reason and Message just like other status responses?
erictune
Jun 9, 2016
Author
Member
fixed.
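For reference, a minimal sketch of the status shape after that fix, assuming it followed the Reason/Message convention used by other status responses; the exact field set here is illustrative, not the final API:

```go
// Sketch only: the review verdict split into Reason and Message, per
// smarterclayton's suggestion. Field names are illustrative.
type ImageReviewStatus struct {
	// Allowed indicates whether all images in the pod were permitted.
	Allowed bool
	// Reason is a short machine-readable token, set when Allowed is false.
	Reason string
	// Message is a human-readable explanation; Kubernetes may truncate
	// excessively long messages when displaying them to the user.
	Message string
}
```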
| The ReplicaSet, or other controller, is responsible for recognizing when a 403 has happened
| (whether due to user not having permission due to bad image, or some other permission reason)
| and throttling itself and surfacing the error in a way that CLIs and UIs can show to the user.
smarterclayton
Jun 9, 2016
Contributor
I'm starting to feel like we're not doing enough to actually solve the usability problem here - the approach you describe is correct (RS needs to surface that info) but we're doing a terrible job of it. For instance... should RS have a condition "CreationForbidden": "true" with a reason "ImageRejected"? Should the deployment also have a condition surfaced?
Right now, we see this all the time in OpenShift (by virtue of security policy) and end users really suffer without this info being surfaced (in quota, in policy, and in crash looping pods).
kargakis
Jun 9, 2016
Member
> I'm starting to feel like we're not doing enough to actually solve the usability problem here - the approach you describe is correct (RS needs to surface that info) but we're doing a terrible job of it. For instance... should RS have a condition "CreationForbidden": "true" with a reason "ImageRejected"? Should the deployment also have a condition surfaced?
> Right now, we see this all the time in OpenShift (by virtue of security policy) and end users really suffer without this info being surfaced (in quota, in policy, and in crash looping pods).

Conditions are one thing that may or may not help here but I believe we already have a mechanism for reporting failures, albeit we are doing a bad job with it: events. Events are nice but fetching exactly those you are interested in is impossible currently. It feels like we should have events as a subresource to all objects or something. #11994 is related.
smarterclayton
Jun 10, 2016
Contributor
Found your answer later in the issue. I'm forgetting which proposals I've commented on at this point.
soltysh
Jul 20, 2016
Contributor
I'd like to generalize this paragraph to be applicable to all controllers. Deployments and ReplicaSets are not the only ones currently. But I agree with Clayton and Michalis here that we need to make it more obvious and Conditions are one of the possibilities here.
| We could have sent the username of the pod creator to the backend. The username could be used to allow different users to run
| different categories of images. This would require propagating the username from e.g. Deployment creation, through to
| Pod creation via, e.g. the `Impersonate-User:` header. This feature is not ready. When it is, we will re-evaluate
| adding user as a field of `ImagePolicyRequest`.
smarterclayton
Jun 9, 2016
Contributor
Isn't this ready now?
erictune
Jun 9, 2016
Author
Member
API-server respects this header. Controllers do not use it.
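For context, a sketch of the header mechanism being discussed; the server URL and username are illustrative, and (per the comment above) only the API server honored the header at the time:

```go
package main

import "net/http"

func main() {
	// Illustrative only: a client asks the API server to treat this request
	// as coming from another user by setting the Impersonate-User header.
	req, err := http.NewRequest("POST", "https://apiserver.example/api/v1/namespaces/default/pods", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Impersonate-User", "jane@example.com")
	_ = req // in real use, send with an authenticated client
}
```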
| The ImageReviewSpec is used as the key to the cache.
| In the case of a cache miss and timeout talking to the backend, the default is to allow Pod creation.
| Keeping services running is more important than a hypotetical threat from an un-verified image.
gtank
Jun 9, 2016
s/hypotetical/hypothetical
gtank
Jun 9, 2016
Doesn't this mean any backend DoS lasting more than an hour will result in a fail-open cluster?
erictune
Jun 9, 2016
Author
Member
Yes. Not sure what else to do, though.
gtank
Jun 13, 2016
Thinking out loud here: what if you fell back to using any existing "allow" cache as a whitelist if no contact with the backend can be established for a TTL period? This allows things that are already deployed to continue running without failing completely open.
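A minimal sketch of that fallback, assuming an in-memory decision cache keyed by the ImageReviewSpec (TTL bookkeeping omitted for brevity); all names are illustrative, not from the proposal:

```go
package imagepolicy

// ImageReviewSpec stands in for the real request type; only the image
// names matter for cache keying in this sketch.
type ImageReviewSpec struct{ Images []string }

// Backend is whatever client talks to the image policy webhook.
type Backend interface {
	Review(spec ImageReviewSpec) (allowed bool, err error)
}

type cachingAdmitter struct {
	decisions map[string]bool // keyed by the serialized ImageReviewSpec
	backend   Backend
}

func newCachingAdmitter(b Backend) *cachingAdmitter {
	return &cachingAdmitter{decisions: map[string]bool{}, backend: b}
}

func key(spec ImageReviewSpec) string {
	k := ""
	for _, img := range spec.Images {
		k += img + ";"
	}
	return k
}

// Admit serves cached decisions first. When the backend is unreachable,
// only specs with a prior cached "allow" are admitted, so already-deployed
// workloads keep running without the cluster failing fully open.
func (c *cachingAdmitter) Admit(spec ImageReviewSpec) bool {
	if allowed, ok := c.decisions[key(spec)]; ok {
		return allowed
	}
	allowed, err := c.backend.Review(spec)
	if err != nil {
		return false // backend down and no cached allow: deny
	}
	c.decisions[key(spec)] = allowed
	return allowed
}
```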
| ## Ubernetes
| If two clusters share an image policy backend, then they will have the same policies.
gtank
Jun 9, 2016
Is the intent to synchronize the admission caches in the federated/ubernetes cases?
erictune
Jun 9, 2016
Author
Member
no.
| We will wait and see how much demand there is for closing this hole. If the community demands a solution,
| we may suggest one of these:
| 1. Use a backend that refuses to accept images that are specified with tags, and require users to resolve to IDs
gtank
Jun 9, 2016
Always forcing users to resolve IDs would allow backends to interact more effectively with content trust measures. With just the tag, there's no binding between what was actually deployed and what the backend sees. cc @ecordell
smarterclayton
Jun 9, 2016
Contributor
Resolving IDs before they are put into pods / deployments / rcs is the best outcome. Doing that resolution in the admission control is fraught (there should be another admission controller that does the transformation, if you really want that).
ecordell
Jun 9, 2016
Contributor
Agreed; any signature verification doesn't mean much if you don't verify that the image id matches the content
erictune
Jul 20, 2016
Author
Member
I just commented on this in https://github.com/kubernetes/kubernetes/pull/27129/files/4a60be831efce93d7e210df47d79e7c18d5d13c2#r71607506
I think it makes sense to either map tag to SHA in kubectl, in CICD, or after the fact in Kubelet. I agree the admission controller is a bad place to map tag to SHA. I think admission is a fine place to require image names that use SHAs.
xref: #22888
| - a newly created replicaSet will be unable to create Pods.
| - updating a deployment will be safe in the sense that it will detect that the new ReplicaSet is not scaling
| up and not scale down the old one.
| - an existing replicaSet will not be unable to create Pods that replace ones which are terminated. If this is due to
alex-mohr
Jun 21, 2016
Contributor
nit: "will not be unable to create" -> "will be unable to create"
erictune
Jul 22, 2016
Author
Member
fixed.
| ## Admission controller
| An `ImagePolicyWebhook` admission controller will be written. The admission controller examines all pod objects which are
bgrant0607
Jun 22, 2016
Member
Is the webhook going to be tried until success? How would it distinguish retryable failure from permanent failure?
erictune
Jul 23, 2016
Author
Member
The admission controller will admit if the webhook times out.
Q-Lee
Jul 23, 2016
Contributor
Will the admin be able to set a policy for a namespace (e.g., fail-open/fail-closed)?
deads2k
Jul 25, 2016
Contributor
Will this be a generic webhook for any admission plugin? I'd like to see a generic one and it seems like the work here would be about the same.
erictune
Jul 25, 2016
Author
Member
It will not be a generic webhook. A generic webhook would need a lot more discussion:
- a generic webhook needs to touch all objects, not just pods. So it won't have a fixed schema
- a generic webhook client needs to ignore kinds it doesn't care about, or the apiserver needs to know which backends care about which kinds
- it exposes our whole API to a webhook without giving us (the project) any chance to review or understand how it is being used.
- because we don't know which fields of an object are inspected by the backend, caching is not effective. Sending fewer fields allows caching.
- sending fewer fields makes it possible to rev the version of the webhook request slower than the version of our internal objects (e.g. pod v2 could still use imageReview v1.)
- probably lots more reasons.
erictune
Jul 25, 2016
Author
Member
Added section about this in Alternatives section.
@fabioy this is the proposal for image admission controller that we talked about.
| - only run images that are scanned to confirm they do no contain vulnerabilities
| - only run images that use a "required" base image
| - only run images that contain binaries which were built from peer reviewed, checked-in source
| by a trusted compiler toolchain.
philips
Jul 23, 2016
Contributor
@erictune isn't the other obvious example images that are signed by a public key in a key list?
erictune
Jul 25, 2016
Author
Member
Yes. I would do that in conjunction with other controls. I might want to also enforce the whitelist of signing keys at the node level, at which point the check in the apiserver is more a nice-to-have. But you could do it that way.
erictune
Jul 25, 2016
Author
Member
Adding your suggestion to the doc.
| There will be a cache of decisions in the admission controller.
| If the apiserver cannot reach the webhook backend, it will log a warning and admit the pod. The rationale here is
| that keeping the system running is most important, and people who care about image provenance deeply will
deads2k
Jul 25, 2016
Contributor
I think I'd prefer to default the other way or at least offer an option for someone to choose which way they want to change. If we run a bad image and it destroys something, there might not be a way to un-ring the bell.
liggitt
Jul 25, 2016
Member
> I think I'd prefer to default the other way or at least offer an option for someone to choose which way they want to change

+1... getting a bad image in should not be as simple as DOSing an admission webhook endpoint
soltysh
Jul 25, 2016
Contributor
+1 for default deny, with an option to change it if admin wants to do so and is aware of the risk.
erictune
Jul 25, 2016
Author
Member
Interesting question: how to bootstrap an image provenance controller running on the cluster itself.
erictune
Jul 25, 2016
Author
Member
Adding line about the flag.
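A sketch of what that flag might look like, assuming a per-plugin option; the names are illustrative, not the final flag names (the plugin as eventually shipped exposes a similar choice via `defaultAllow` in its admission configuration):

```go
package imagepolicy

// WebhookOptions is an illustrative knob for the fail-open/fail-closed
// choice debated above; not the final flag name.
type WebhookOptions struct {
	// DefaultAllow selects behavior when the backend cannot be reached:
	// true means fail open (log a warning and admit the pod), false means
	// fail closed (reject pods whose images cannot be reviewed).
	DefaultAllow bool
}

// decide folds the backend result and the configured default together.
func decide(backendAllowed bool, backendErr error, opts WebhookOptions) bool {
	if backendErr != nil {
		return opts.DefaultAllow
	}
	return backendAllowed
}
```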
| An `ImagePolicyWebhook` admission controller will be written. The admission controller examines all pod objects which are
| created or updated. It can either admit the pod, or reject it. If it is rejected, the request sees a `403 FORBIDDEN`
| The admission controller code will go in `plugin/pkg/admission/imagepolicy`.
smarterclayton
Jul 25, 2016
Contributor
I would like for the admission controller to take an interface for "check decision" rather than embedding all the client logic in it. That would allow alternate admission controller implementations to be more easily implemented. As an example, we're trying to build composable chunks of admission logic for policy like this that can be reused in other contexts. Things like authorizer and authentication have succeeded pretty well at this (@liggitt has done some crazy authenticator wrappers that work cleanly). I would like to generally have our policy decision steps behind clean interfaces (in this case, an interface that answers the question about an image and it being accepted and mirrors the ImageReview object) that can be composed later on.
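A sketch of the kind of interface being requested, modeled on the Authorizer/Admission pattern; all names are illustrative:

```go
package imagepolicy

// ImageReviewSpec mirrors the request half of the ImageReview object.
type ImageReviewSpec struct{ Images []string }

// PolicyDecider hides the decision behind a clean interface, so the
// webhook client becomes just one implementation and alternate deciders
// (in-process checks, unions, wrappers) can be composed later.
type PolicyDecider interface {
	Allows(spec ImageReviewSpec) (allowed bool, reason string, err error)
}
```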
deads2k
Jul 25, 2016
Contributor
> I would like for the admission controller to take an interface for "check decision" rather than embedding all the client logic in it.

You're saying you don't like the webhook mechanism suggested, that you want a generic webhook like I asked above, or that you want the reference webhook impl to accept what amounts to an admission.Interface?
smarterclayton
Jul 25, 2016
Contributor
The latter. Authorizer and Admission interfaces are very successful. Would like to try to define the equivalents for other policy equally well.
| // ImageReviewSpec is a description of the pod creation request.
| type ImageReviewSpec struct {
| // Containers is a list of a subset of the information in each container of the Pod being created.
| Containers []ImageReviewContainerSpec
smarterclayton
Jul 25, 2016
Contributor
Unfortunately you also need to accept init containers. I would recommend creating a nested struct that is similar to pod template that has the subset of info in it, and make that hierarchical rather than flattened, i.e.:
type ImageReviewPodTemplate struct {
    Metadata ImageReviewObjectMeta
    Spec ImageReviewPodSpec
}
If we can preserve the same hierarchy as a PodTemplate, that makes automated tools easier in the future (we might take the pod object as unstructured and whitelist the things that are included in a generic fashion).
smarterclayton
Jul 25, 2016
Contributor
It also establishes a pattern for future examples of this where subsets of data are returned.
erictune
Jul 25, 2016
Author
Member
Is it settled where in the PodSpec that init containers will live?
erictune
Jul 25, 2016
Author
Member
Also, if we do what you said, it couples the versioning of Pods to the versioning of the ImageReview api. Is that desirable?
smarterclayton
via email
Jul 26, 2016
Contributor
It is extremely likely init containers will be a peer to containers, barring any sudden revelations.
smarterclayton
Jul 26, 2016
Contributor
Possibly not, although creating a new structure to learn is causing API drift. I would expect this API to evolve to be consistent with pods in the long term (v1 pods to vX ImagePolicyReview).
Q-Lee
Aug 4, 2016
Contributor
I'm of the opinion that instead of mimicking pod layout, we should eliminate the container list altogether. The goal here is to establish a chain of trust for containers, and not to enforce pod-level policies. If you create a pod with 5 containers, then you make 5 requests to the backend.
deads2k
Aug 4, 2016
Contributor
> I'm of the opinion that instead of mimicking pod layout, we should eliminate the container list altogether. The goal here is to establish a chain of trust for containers, and not to enforce pod-level policies. If you create a pod with 5 containers, then you make 5 requests to the backend.

I think that it's likely we'll end up doing both. Given the current power of PSP and its ability to describe who can request which powers for a pod/container, it seems likely that the decision about whether a particular image is allowed may be affected by the same or a similar policy.
I'm not suggesting that we add that level of complication now, but designing the structure for future expansion seems like a reasonable thing to do and this would be a way to do it.
Also, remote callouts are expensive and if we already have all the data ready, validating it all at once seems pretty reasonable.
Q-Lee
Aug 4, 2016
Contributor
Exactly, we have PSP for pod level control. The purpose here is to establish a chain of trust from source code to deployment.
A small scaling factor from pods to containers is nothing to bat an eye at.
| The exact nature of "approval" is beyond the scope of Kubernetes, but may include reasons like:
| - only run images that are scanned to confirm they do no contain vulnerabilities
soltysh
Jul 25, 2016
Contributor
s/do no/do not
| * Block creation of pods that would cause "unapproved" images to run.
| * Make it easy for users or partners to build "image provenance checkers" which check whether images are "approved".
| * We expect there will be multiple implementations.
| * Allow users to request an "override" of the policy in a convenient way (subject to the override being allowed).
soltysh
Jul 25, 2016
Contributor
Is the override available to all users? Or will this be tied to specific authz?
erictune
Jul 25, 2016
Author
Member
In one possible implementation, the override is available to all users, but a user who requests the override would be expected to answer to an auditor, sometime after the fact.
I've addressed most comments and updated the docs. The open issues, as I see it, are:
Interfaces was about making the guts of the admission controller call a policy interface that looks like Authorizer and Admission interfaces, because it imposes discipline and also may lead to experimentation.
cc the Clair team @Quentin-M @jzelinskie @josephschorr
| ## Admission controller
| An `ImagePolicyWebhook` admission controller will be written. The admission controller examines all pod objects which are
| created or updated. It can either admit the pod, or reject it. If it is rejected, the request sees a `403 FORBIDDEN`
jzelinskie
Jul 27, 2016
This description is a little vague in the plurality sense. I think this should work over a set of webhooks rather than one. For example, this Pull Request on GitHub has multiple webhooks that must validate it before it is merged.
soltysh
Jul 28, 2016
Contributor
I kinda understood this as being able to setup multiple, but having it explicit in the proposal is a good idea.
deads2k
Jul 28, 2016
Contributor
> This description is a little vague in the plurality sense. I think this should work over a set of webhooks rather than one. For example, this Pull Request on GitHub has multiple webhooks that must validate it before it is merged.

Seems like we could support a single callout and if someone wanted a union, they could write the union in their particular handler. That keeps our core code out of the business of deciding between the ands, ors, and trumps, which inevitably follow the "give me more than one".
I don't see an issue with making a reference impl for the webhook that can provide a simple union, but I don't think we want to bake multiples into our admission plugin.
soltysh
Jul 28, 2016
Contributor
I feel convinced.
jzelinskie
Jul 31, 2016
@deads2k It seems there's precedent in other parts of k8s for doing it the way you have described, so I agree.
| The WebHook request and response are JSON, and correspond to the following `go` structures:
| ```go
| // Filename: pkg/apis/authentication.k8s.io/register.go
ericchiang
Aug 3, 2016
Member
Are these types intended to live in the authentication.k8s.io API group? Is the package name correct here?
erictune
Aug 3, 2016
Author
Member
No, fixed.
| // ImageReviewContainerSpec is a description of a container within the pod creation request.
| type ImageReviewContainerSpec struct {
| Image string
Q-Lee
Aug 3, 2016
Contributor
Shouldn't this be Image, ImageHash string?
erictune
Aug 4, 2016
Author
Member
Images can be specified to docker, and in pod.spec.container[].image as either image:tag or image@SHA:012345679abcdef. So, this field also accepts either format.
It is up to the backend to decide if it accepts image:tag or only accepts image@SHA:012345679abcdef format. There are reasons you might choose to do it either way, so this API doesn't have an opinion.
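Concretely, both spellings would then be legal values for the field; a sketch reusing the example digest quoted earlier in the thread:

```go
package imagepolicy

// Redeclared here so the snippet stands alone; matches the quoted struct.
type ImageReviewContainerSpec struct{ Image string }

// Both accepted image reference formats.
var examples = []ImageReviewContainerSpec{
	{Image: "myrepo/myimage:v1"}, // tag form
	{Image: "myrepo/myimage@sha256:beb6bd6a68f114c1dc2ea4b28db81bdf91de202a9014972bec5e4d9171d90ed"}, // image ID (digest) form
}
```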
This is very close. Let's put this in and I'll make the remaining changes in a new, narrower PR.
GCE e2e build/test passed for commit 9d59ae5.
Merged commit 6f0bc85 into kubernetes:master ("Added image-policy proposal: Add proposal for image policy.")