Limit the IAM EC2 policy for the master nodes #3186

Merged

Conversation

KashifSaadat
Contributor

@KashifSaadat KashifSaadat commented Aug 11, 2017

Related to: #3158

The EC2 policy for the master nodes is currently quite open, allowing them to create/delete/modify resources that are not associated with the cluster the node originates from. I've come up with a potential solution using condition keys to validate that the ec2:ResourceTag/KubernetesCluster tag matches the cluster name.
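
The approach described above can be sketched in Go as a policy statement that carries a `StringEquals` condition. This is a minimal sketch, not the kops implementation: the `Statement` and `Condition` types and the `clusterScopedEC2` function are hypothetical names introduced for illustration, and only the condition key `ec2:ResourceTag/KubernetesCluster` comes from the PR itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Condition maps an IAM condition operator (e.g. "StringEquals") to
// condition-key/value pairs. Illustrative type, not the kops definition.
type Condition map[string]map[string]string

// Statement is a simplified IAM policy statement (hypothetical type).
type Statement struct {
	Effect    string
	Action    []string
	Resource  []string
	Condition Condition `json:",omitempty"`
}

// clusterScopedEC2 allows EC2 actions only on resources whose
// KubernetesCluster tag equals clusterName.
func clusterScopedEC2(clusterName string) Statement {
	return Statement{
		Effect:   "Allow",
		Action:   []string{"ec2:*"},
		Resource: []string{"*"},
		Condition: Condition{
			"StringEquals": {
				"ec2:ResourceTag/KubernetesCluster": clusterName,
			},
		},
	}
}

func main() {
	// The cluster name is a placeholder.
	out, err := json.MarshalIndent(clusterScopedEC2("example.k8s.local"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With the condition in place, the wildcard resource only matches EC2 resources that already carry the cluster's tag; untagged resources or resources of other clusters fall outside the grant.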

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 11, 2017
@k8s-ci-robot
Contributor

Hi @KashifSaadat. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 11, 2017
@KashifSaadat
Contributor Author

KashifSaadat commented Aug 11, 2017

/assign @justinsb

Effect IAMStatementEffect
Action stringorslice.StringOrSlice
Resource stringorslice.StringOrSlice
Condition Condition `json:",omitempty"`
Member

Why, this makes writing IAM statements almost pleasurable ;-)

Resource: wildcard,
Condition: Condition{
"StringEquals": map[string]string{
"ec2:ResourceTag/KubernetesCluster": b.Cluster.GetName(),
Member

So we're also trying to phase out this tag in favor of the tag that allows sharing, kubernetes.io/cluster/<clustername>. The value there can be shared or owned. Is it possible to match on the presence of a tag? (e.g. does StringNotEquals "" work?)

Contributor Author

Yeah, I believe we could match on the presence of a tag; I'll see which condition operator fits best.
I noticed that only some resources have the new tag that allows sharing. It only seems to be present on some shared resources, such as VPCs and Subnets. Unless we add the new tag to everything managed by kops, we wouldn't be able to control access to EC2 resources.
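
One candidate for matching on tag presence is IAM's `Null` condition operator, where a value of `"false"` means the condition key must be present in the request context. The sketch below is an assumption about how that could look for the shareable tag, not something confirmed in this thread; `tagPresentCondition` is a hypothetical helper name.

```go
package main

import "fmt"

// tagPresentCondition builds an IAM condition that is satisfied when the
// resource carries the tag kubernetes.io/cluster/<clusterName>, whatever its
// value ("shared" or "owned"). With the Null operator, "false" means
// "the key exists". Hypothetical helper for illustration only.
func tagPresentCondition(clusterName string) map[string]map[string]string {
	key := "ec2:ResourceTag/kubernetes.io/cluster/" + clusterName
	return map[string]map[string]string{
		"Null": {key: "false"},
	}
}

func main() {
	// The cluster name is a placeholder.
	fmt.Println(tagPresentCondition("example.k8s.local"))
}
```

As the reply notes, this only helps for resources that actually carry the new tag, so it would not cover resources tagged with `KubernetesCluster` alone.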

Member

That is a good point... I guess we should continue to honor the existing label for now. We can either just have two IAM grants (I believe), one on the old and one on the new tags, both with ec2:*, or we can try to be granular now. I'm inclined to do the former, or to do particular rules for subnets & VPCs (which will soon only have the new tags). I think technically we tag with both for now, even on VPCs & subnets, so it is OK to do this as-is for now, but we'll have to fix this going forward.
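
The "two IAM grants" option could be sketched as follows. This is an illustration under assumptions, not code from the PR: the `Statement` type and both function names are hypothetical, and the new-tag statement assumes the tag values `owned` and `shared` mentioned earlier in this thread. IAM treats multiple values for one condition key as alternatives, and a request is allowed if either statement matches.

```go
package main

import "fmt"

// Statement is a simplified IAM statement; an illustrative type, not the
// kops definition. Condition maps operator -> condition key -> allowed values.
type Statement struct {
	Effect    string
	Action    []string
	Resource  []string
	Condition map[string]map[string][]string
}

// ec2GrantForTag allows ec2:* on resources whose tag (named by key) has one
// of the given values.
func ec2GrantForTag(key string, values ...string) Statement {
	return Statement{
		Effect:   "Allow",
		Action:   []string{"ec2:*"},
		Resource: []string{"*"},
		Condition: map[string]map[string][]string{
			"StringEquals": {"ec2:ResourceTag/" + key: values},
		},
	}
}

// dualTagGrants returns one grant for the legacy tag and one for the
// shareable tag.
func dualTagGrants(clusterName string) []Statement {
	return []Statement{
		ec2GrantForTag("KubernetesCluster", clusterName),
		ec2GrantForTag("kubernetes.io/cluster/"+clusterName, "owned", "shared"),
	}
}

func main() {
	// The cluster name is a placeholder.
	for _, s := range dualTagGrants("example.k8s.local") {
		fmt.Printf("%+v\n", s)
	}
}
```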

Contributor

This is going to force VPCs to be tagged. I know in a perfect world the VPC should be tagged, but we are not in a perfect world. I also know that we have people who do not have their VPCs tagged. Let me chew on this a bit. I am wondering if we want to make this optional.

Effect: IAMStatementEffectAllow,
Action: stringorslice.Slice([]string{
"ec2:ModifyInstanceAttribute",
"ec2:CreateRoute",
Member

I am worried that there might be more permissions in this list, but I don't know what they would be so...

@justinsb
Member

So generally this looks good, but we really want to figure out a path for users that might be sharing the existing permissions.

  • One option is to get kube2iam working, and then use enabling kube2iam as the signal that we should lock down other permissions as much as possible.
  • Another option is to add a separate flag around iam, perhaps based on how strict we want to be, and I think some people also wanted to disable access to ECR / ELB / EBS. There is a PR out there that creates a section in the cluster spec for IAM, though that is currently only for changing the names of the policies, but we could use an iam object and have a "strictness" flag.
  • We could just make it a feature-flag, which is another dodge we use when we're not entirely sure how to enable something. The big downside is that you have to always specify the feature flag.

I would prefer kube2iam, because option 2 just feels like punting a decision to the user that really we should be doing for them - if we can secure the nodes, we should. At some stage in the future we would make kube2iam the default as well, so everyone would get the more secure setup.

If you agree, tactically then, doing a feature-flag for now would let us do this until we could get kube2iam integrated...

@justinsb
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 17, 2017
@KashifSaadat
Contributor Author

Yup I agree, all sounds sensible and would be really good to get kube2iam in place! I could use the work in PR #3210 to wrap this up under the cluster spec flag, which should give us some flexibility on tweaking and getting the permissions right?

@KashifSaadat
Contributor Author

/retest

@justinsb
Member

Another thing I was pondering ... when we create "things" we typically can't tag them. The canonical example of that is EBS volumes, but when I checked, there is actually a TagSpecification there, so it looks like we could create a volume with tags at the same time ... but we're not using that currently. I think we should fix that in kubernetes/kubernetes (opened kubernetes/kubernetes#50898). But we need to be mindful of this problem...

I was thinking about the semantics of how to allow it, I do think we could have a field in the API. The meaning would be that you're not relying on reusing the IAM policies, and thus kops should set the IAM policies to the minimal set we can derive.

Perhaps:

spec:
  iam:
    strict: true

But yes, the TL;DR is that what you've done in #3210 is spot-on, we just need to decide naming & nesting :-) The semantics (I propose) are that if we set strict to true, kops won't guarantee any IAM permissions beyond what k8s needs, and you're expected to manage your own permissions (whether directly or with kube2iam) rather than reuse the node / master permissions (which is best practice anyway!)
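
The proposed semantics could be sketched in Go as below. This is only an illustration of the `spec.iam.strict` idea from the YAML above: `IAMSpec`, `ClusterSpec` and `ec2Condition` are hypothetical names, and the real kops API types may differ in naming and nesting.

```go
package main

import "fmt"

// IAMSpec mirrors the proposed `spec.iam` section (hypothetical type).
type IAMSpec struct {
	// Strict is false by default, so existing clusters keep the open policy.
	Strict bool
}

// ClusterSpec holds only the field relevant to this sketch.
type ClusterSpec struct {
	IAM *IAMSpec
}

// ec2Condition returns the tag condition to attach to the master EC2
// statement, or nil when strict mode is off (legacy, unconditioned grant).
func ec2Condition(spec ClusterSpec, clusterName string) map[string]map[string]string {
	if spec.IAM == nil || !spec.IAM.Strict {
		return nil
	}
	return map[string]map[string]string{
		"StringEquals": {"ec2:ResourceTag/KubernetesCluster": clusterName},
	}
}

func main() {
	legacy := ClusterSpec{}
	strict := ClusterSpec{IAM: &IAMSpec{Strict: true}}
	// The cluster name is a placeholder.
	fmt.Println(ec2Condition(legacy, "example.k8s.local") == nil)
	fmt.Println(ec2Condition(strict, "example.k8s.local"))
}
```

Using a pointer for the `iam` section lets an absent section and an explicit `strict: false` behave the same way, which keeps the default non-breaking.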

@justinsb
Member

(And one more thought: if we make strict default to false, we can merge this even when it is sort of alpha/experimental...)

@KashifSaadat
Contributor Author

Yeah I was concerned about that case too, wasn't sure what resources the master nodes may attempt to create themselves as it would currently fail with the policies I've defined here. We can add specific policies for API calls that create resources, like CreateVolume, with an ec2:RequestTag condition key. That would hopefully fit well with the Kubernetes issue you've opened.
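
A statement for a create call guarded by a request-tag condition might look like the sketch below. It is an assumption-laden illustration: the `Statement` type and `createVolumeStatement` are hypothetical, and the condition key is spelled as written in this thread. The exact key should be verified against the AWS documentation, which also describes a global `aws:RequestTag/<tag-key>` condition key. It would also only work once Kubernetes passes tags within the CreateVolume call.

```go
package main

import "fmt"

// Statement is a simplified IAM statement (hypothetical type).
type Statement struct {
	Effect    string
	Action    []string
	Resource  []string
	Condition map[string]map[string]string
}

// requestTagKey follows the key name used in the thread. Assumption: check
// the exact spelling against the AWS documentation before relying on it.
const requestTagKey = "ec2:RequestTag/KubernetesCluster"

// createVolumeStatement allows CreateVolume only when the request itself
// tags the new volume with the cluster name.
func createVolumeStatement(clusterName string) Statement {
	return Statement{
		Effect:   "Allow",
		Action:   []string{"ec2:CreateVolume"},
		Resource: []string{"*"},
		Condition: map[string]map[string]string{
			"StringEquals": {requestTagKey: clusterName},
		},
	}
}

func main() {
	// The cluster name is a placeholder.
	fmt.Printf("%+v\n", createVolumeStatement("example.k8s.local"))
}
```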

The spec definition you've given looks good, I'll tweak #3210 with that change and then we can later rebase this to wrap under the API flag, defaulting to false.

Thanks for the feedback :)

@chrislovecnm
Contributor

/test pull-kops-e2e-kubernetes-aws

@KashifSaadat is this PR blocked by the other?

@KashifSaadat
Contributor Author

Yep, if the other PR is okay and goes through I was planning to wrap the API flag around this code (or potentially leave this and go straight to your PR which is much more thorough).

@chrislovecnm
Contributor

@KashifSaadat marked this as WIP, so it does not accidentally get merged.

k8s-github-robot pushed a commit that referenced this pull request Aug 23, 2017
…licies

Automatic merge from submit-queue

Allow the strict IAM policies to be optional

The stricter IAM policies could potentially cause regression for some edge-cases, or may rely on nodeup image changes that haven't yet been deployed / tagged officially (currently the case on master branch since PR #3158 was merged in).

This PR just wraps the new IAM policy rules around a cluster spec flag, `EnableStrictIAM`, so will default to the original behaviour (where the S3 policies were completely open). Could also be used to wrap PR #3186 if it progresses any further.

- Or we could reject this and have the policies always strict! :)
@KashifSaadat KashifSaadat force-pushed the limit-master-ec2-policy branch 4 times, most recently from ccf224a to 289f294 Compare August 25, 2017 14:26
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 25, 2017
@KashifSaadat
Contributor Author

KashifSaadat commented Aug 26, 2017

Some notes on implementation (taken out of main PR description):

  • ec2:Describe* - (Standard requirement) Describe Instances, Tags, etc. An instance needs to be able to describe the resources so it can see what tags are set. We could lock this down further in the future to only specific resource types that it needs to know about.
  • ec2:ModifyInstanceAttribute - Kubernetes codebase uses this, setting the 'Source->Destination' check to false on a node's network interface (does not support resource-level permissions).
  • ec2:CreateRoute - Called to update the VPC main routing table with a route to the network interface of the compute node (does not support resource-level permissions).
  • ec2:CreateVolume - Called by Kubernetes to create an EBS Volume, triggered during the e2e tests. We could use a RequestTag condition key here, but it requires a change to Kubernetes to add tags within the CreateVolume API call. Currently the tag addition is separate.
  • ec2:DeleteVolume - This API call failed with the condition key check, I'm guessing because when it's created (in the above call) the expected resource tag isn't added (KubernetesCluster).
  • ec2:CreateTags - This API call is made following on from the CreateVolume call in kubernetes. As mentioned above we should look to do something like this: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ExamplePolicies_EC2.html#iam-example-manage-volumes-tags
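
One reading of the notes above is a two-statement split: the listed actions stay unconditioned, while everything else in EC2 is gated on the cluster tag. The sketch below illustrates that reading only; it is not the merged policy, and the `Statement` type and `masterEC2Statements` function are hypothetical names.

```go
package main

import "fmt"

// Statement is a simplified IAM statement (hypothetical type).
type Statement struct {
	Effect    string
	Action    []string
	Resource  []string
	Condition map[string]map[string]string
}

// masterEC2Statements returns an unconditioned statement for the actions
// listed in the notes, plus a tag-conditioned statement for the rest.
func masterEC2Statements(clusterName string) []Statement {
	return []Statement{
		{
			Effect: "Allow",
			Action: []string{
				"ec2:Describe*",
				"ec2:ModifyInstanceAttribute", // no resource-level permissions
				"ec2:CreateRoute",             // no resource-level permissions
				"ec2:CreateVolume",            // tags are added in a separate call
				"ec2:DeleteVolume",            // failed the condition check in testing
				"ec2:CreateTags",
			},
			Resource: []string{"*"},
		},
		{
			Effect:   "Allow",
			Action:   []string{"ec2:*"},
			Resource: []string{"*"},
			Condition: map[string]map[string]string{
				"StringEquals": {"ec2:ResourceTag/KubernetesCluster": clusterName},
			},
		},
	}
}

func main() {
	// The cluster name is a placeholder.
	for _, s := range masterEC2Statements("example.k8s.local") {
		fmt.Printf("%+v\n", s)
	}
}
```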

@KashifSaadat
Contributor Author

KashifSaadat commented Aug 26, 2017

/test pull-kops-e2e-kubernetes-aws

@justinsb
Member

/retest

The Opaque Resources failures seem to be an upstream flake (though I haven't looked into it yet)

@justinsb
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 26, 2017
@KashifSaadat
Contributor Author

Aha thank you, I was just trying to debug the logs but couldn't spot any EC2 API unauthorized failures.

@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KashifSaadat, justinsb

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 26, 2017
@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@justinsb
Member

/retest

@justinsb
Member

Test failure filed as kubernetes/kubernetes#51429

@KashifSaadat
Contributor Author

PR got merged in, will check with a retest:
/test pull-kops-e2e-kubernetes-aws

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit fdce8b4 into kubernetes:master Aug 28, 2017
@KashifSaadat KashifSaadat deleted the limit-master-ec2-policy branch August 28, 2017 09:01
@nick4fake

I am curious: why is "ec2:ResourceTag/KubernetesCluster" used?
Don't we currently use "kubernetes.io/cluster/..."?

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants