Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tighten down S3 IAM policy statements #3158

Merged

Conversation

KashifSaadat
Copy link
Contributor

This PR contains updates to:

  • Remove default s3:* IAM policy for master and compute nodes
  • Allow all nodes to list bucket contents
  • Allow master nodes to get all bucket contents
  • Allow compute nodes to get specific bucket contents (certain private key files are disallowed)
  • Adds unit tests around the S3 policy build function

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 8, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @KashifSaadat. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 8, 2017
@KashifSaadat
Copy link
Contributor Author

/assign @justinsb

@@ -292,6 +273,57 @@ func addRoute53ListHostedZonesPermission(p *IAMPolicy) {
})
}

func addS3Permissions(p *IAMPolicy, iamPrefix string, s3Path *vfs.S3Path, role api.InstanceGroupRole) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have a summary comment to example what method is responsible for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes forgot about that, will add now. 👍

strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/instancegroup/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/issued/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/ssh/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/private/ca/*"}, ""),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does the node need access to the ca.key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kops-configuration service fails to start when it doesn't have access to the ca key. I'll dig through the repo and see if I can find out why it needs it.

Copy link
Contributor

@gambol99 gambol99 Aug 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue was due to the Ca_Certificate being treated differently .. i.e it trys to pull the certificate pair .. we probably just need to find the reason why and if it's actually required anymore

Resource: stringorslice.Of(
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/addons/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/instancegroup/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/issued/*"}, ""),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm ... also the issued?? .. i'm not sure how to get around this ... as issued would have a bunch of certificates which a node doesn't really need ...

Copy link
Contributor

@gambol99 gambol99 Aug 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually ... given your blocking off the private keys for the role this is "probably" fine ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree it's fine. Certificates are generally considered public. There is an information disclosure, and maybe you wouldn't even be able to get the certificate, but it is only to the nodes, and it is believe safe to expose certificates to the (hostile) world.

@KashifSaadat
Copy link
Contributor Author

@gambol99 there were two locations where the nodes were attempting to pull the CA key down. I've removed both and it appears to have had no negative impact (cluster still provisions and functions correctly). Commit hash c74c56f

@gambol99
Copy link
Contributor

gambol99 commented Aug 9, 2017

hi @justinsb .. can you confirm this safe remove?

Resource: stringorslice.Of(
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/addons/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/instancegroup/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/issued/*"}, ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree it's fine. Certificates are generally considered public. There is an information disclosure, and maybe you wouldn't even be able to get the certificate, but it is only to the nodes, and it is believe safe to expose certificates to the (hostile) world.

strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/instancegroup/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/issued/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/ssh/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/private/master/*"}, ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the master private key on the nodes (not saying you're wrong, just saying that we should fix this!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially coded without access, but the nodes failed to start up and required access to the master key for some reason. I'll try to investigate and see why!

strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/private/master/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/private/kube-proxy/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/pki/private/kubelet/*"}, ""),
strings.Join([]string{iamPrefix, ":s3:::", iamS3Path, "/secrets/*"}, ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "one big bucket" here is also not great for granularity. I think we're mostly phasing these out though anyway (using PKI instead of tokens)

@@ -300,19 +302,11 @@ func (c *VFSCAStore) FindKeypair(id string) (*Certificate, *PrivateKey, error) {
func (c *VFSCAStore) FindCert(id string) (*Certificate, error) {
var certs *certificates

if id == CertificateId_CA {
caCertificates, _, err := c.readCAKeypairs()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So readCAKeypairs performs caching of the CA keypair, but it also is a little naughty and will generate the CA certificate if needed. I guess though that this is required so that this can be used from nodeup, where we don't have access to the certificates. I believe that the auto-generation of the CA certificate was a remnant of the template-driven generation, so should not be needed any more. If we do need caching or autogeneration I think we can probably add caching back in, or we can explicitly generate the CA key as one of our first steps.

@justinsb
Copy link
Member

One comment on whether we really need that master-key permission on the nodes (and we should open a bug if we do). But this is great.

The trap in tightening these IAM permissions is that technically they can be used by pods as well (though we're promoting kube2iam to try to avoid that), and thus technically IAM changes are breaking changes - and when we have tightened them previously we have broken users. (e.g. people running alternative route53 controllers on the master). But here you're locking down access to the kops S3 bucket only, and so this is good - we should probably still have a release note, but if users are using the master keys on the nodes we're going to have to stop that eventually!

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 11, 2017
@justinsb
Copy link
Member

(And on the specific question of whether it is safe to remove that CA key code path: removing the caching is only going to be a performance issue and we can add it back if needed; removing the auto-generation might be a correctness issue but we should fail hard if so, and e2e testing should find it, and again it's an easy fix if needed. Because IssueCert still goes through readCAKeypairs I think that is where we need auto-generation, and also where the performance issue is most critical)

@KashifSaadat
Copy link
Contributor Author

@justinsb I've been testing this out. Without the ca key fix I get the following error:

Aug 11 14:57:13 <hostname>.eu-west-1.compute.internal nodeup[878]: W0811 14:57:13.628931     878 main.go:141] got error running nodeup (will retry in 30s): error building loader: error loading private key "s3://<bucket-name>/<cluster-name>/pki/private/master/<master-key-id>.key": error fetching s3://<bucket-name>/<cluster-name>/pki/private/master/<master-key-id>.key: AccessDenied: Access Denied

When I add in the ca key fix (to prevent it retrieving the key along with the certificate), the nodes no longer attempt to pull down the master key. I've removed this in commit hash: fd0ce23

@justinsb
Copy link
Member

Good news!

/lgtm (presuming it passes CI, if it doesn't the lgtm won't work anyway!)

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 11, 2017
@KashifSaadat
Copy link
Contributor Author

@justinsb sorry to prod you again, I think the lgtm command didn't trigger :D

@justinsb
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 11, 2017
@justinsb
Copy link
Member

Ooops - I think lgtm has to be on its own line :-)

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KashifSaadat, justinsb

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@KashifSaadat
Copy link
Contributor Author

Thank you! :)

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit d8dde0c into kubernetes:master Aug 11, 2017
@KashifSaadat KashifSaadat deleted the node-iam-policy-updates branch August 14, 2017 08:26
k8s-github-robot pushed a commit that referenced this pull request Aug 23, 2017
…licies

Automatic merge from submit-queue

Allow the strict IAM policies to be optional

The stricter IAM policies could potentially cause regression for some edge-cases, or may rely on nodeup image changes that haven't yet been deployed / tagged officially (currently the case on master branch since PR #3158 was merged in).

This PR just wraps the new IAM policy rules around a cluster spec flag, `EnableStrictIAM`, so will default to the original behaviour (where the S3 policies were completely open). Could also be used to wrap PR #3186 if it progresses any further.

- Or we could reject this and have the policies always strict! :)
k8s-github-robot pushed a commit that referenced this pull request Aug 28, 2017
Automatic merge from submit-queue

Limit the IAM EC2 policy for the master nodes

Related to: #3158

The EC2 policy for the master nodes are quite open currently, allowing them to create/delete/modify resources that are not associated with the cluster the node originates from. I've come up with a potential solution using condition keys to validate that the `ec2:ResourceTag/KubernetesCluster` matches the cluster name.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants