Add e2e regression tests for the kubelet being secure #64140

Merged
merged 2 commits into from Jun 21, 2018

Conversation

@dixudx
Member

dixudx commented May 22, 2018

What this PR does / why we need it:
This PR adds e2e regression tests that verify:

  1. The kubelet cAdvisor port (4194) can't be reached, either via the API server proxy or directly on the public IP address
  2. The kubelet read-only port (10255) can't be reached, either via the API server proxy or directly on the public IP address
  3. The kubelet can delegate ServiceAccount token authentication to the API server
  4. The kubelet's main port (10250) has both authentication (should fail with no credentials) and authorization (should fail with insufficient permissions) set up

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes kubernetes/kubeadm#838

Special notes for your reviewer:
/cc luxas tallclair
Release note:

Add e2e regression tests for the kubelet being secure
@dixudx

Member Author

dixudx commented May 22, 2018

@luxas
Member

luxas left a comment

This looks great overall.
Thanks for this valuable contribution @dixudx 👍!
I left a couple of initial comments.

I'd love a review from @kubernetes/sig-auth-pr-reviews, and a comment from @kubernetes/sig-architecture-pr-reviews regarding eventually graduating It("Should make the kubelet's main port 10250 enforce authentication for client requests") a Conformance test as @tallclair commented in kubernetes/kubeadm#838 (comment)

test/e2e/auth/node_authn.go (outdated)
test/e2e/auth/node_authn.go (outdated)
Expect(err).NotTo(HaveOccurred())
Expect(len(nodeList.Items)).NotTo(Equal(0))
nodeName = nodeList.Items[0].Name
sa, err := f.ClientSet.CoreV1().ServiceAccounts(ns).Get("default", metav1.GetOptions{})

@luxas

luxas May 22, 2018

Member

Add a comment why you do this. (I think it's to make sure the ServiceAccount admission controller is enabled etc., so secret generation on SA creation works, right?)

@dixudx

dixudx May 23, 2018

Author Member

Correct. Adding a comment is better.

test/e2e/auth/node_authn.go (outdated)
@@ -179,4 +181,18 @@ var _ = SIGDescribe("[Feature:NodeAuthorizer]", func() {
err := c.CoreV1().Nodes().Delete("foo", &metav1.DeleteOptions{})
Expect(apierrors.IsForbidden(err)).Should(Equal(true))
})

framework.ConformanceIt("The kubelet's main port 10250 should fail with the Forbidden error", func() {

@luxas

luxas May 22, 2018

Member

Please don't add conformance here. You can add a comment noting that it might be done in the future, though.
cc @tallclair

test/e2e/lifecycle/kubelet_security.go (outdated)

prefix := "/api/v1"
// make sure kubelet readonly (10255) and cadvisor (4194) ports are disabled via API server proxy
It("should not be able to contact the readonly port 10255", func() { portProxyDisabledTest(f, prefix+"/nodes/", ":10255/pods/") })

@luxas

luxas May 22, 2018

Member

suggest: should not be able to proxy to the readonly kubelet port 10255 using proxy subresource to be consistent with what you have below

test/e2e/lifecycle/kubelet_security.go (outdated)
test/e2e/lifecycle/kubelet_security.go (outdated)

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch from 5b4cb58 to b640ca7 May 23, 2018

@luxas

luxas approved these changes May 23, 2018

Member

luxas left a comment

Thanks a lot @dixudx!!
I'm fine for the moment, the high level functionality is there, deferring the rest of the review to SIG Auth reviewers.
I'm gonna try running these locally in a minute to make sure everything actually works 😄

/approve
/lgtm

test/e2e/auth/node_authn.go (outdated)
test/e2e/auth/node_authn.go (outdated)
test/e2e/auth/node_authn.go (outdated)
test/e2e/auth/node_authz.go (outdated)
@liggitt

Member

liggitt commented May 23, 2018

while I like the idea of exercising the described tests, these e2e tests don't actually do that. the only reason CI is green with them added is because they're behind feature tags that don't actually run them (or the API server happens to return the same response we were hoping to see from the kubelet)

@tallclair

Member

tallclair commented May 23, 2018

I recommend proxying the tests through shell commands run in a container and pointed at the kubelet's cluster-internal IP address. See the apparmor test for an example of how you might do this:
https://github.com/kubernetes/kubernetes/blob/master/test/e2e/common/apparmor.go#L60

k8s-github-robot pushed a commit that referenced this pull request May 24, 2018

Kubernetes Submit Queue
Merge pull request #64187 from luxas/kubeadm_kubelet_improve_security
Automatic merge from submit-queue (batch tested with PRs 64174, 64187, 64216, 63265, 64223). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubeadm: Improve the kubelet default configuration security-wise

**What this PR does / why we need it**:
 - Disables the readonly port for the kubelets in the cluster
 - Enables delegated SA token authentication for the secure kubelet port (GCE also did this ref: #58178)
 - Follows up #63912 to move the last flag from the system dropin to the ComponentConfig

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#732
Fixes kubernetes/kubeadm#650
Replaces #57997

**Special notes for your reviewer**:
In order to make sure this actually works, or that clusters actually are secure, we're adding e2e tests for this: kubernetes/kubeadm#838 & #64140
Depends on #63912

**Release note**:

```release-note
[action required] kubeadm: kubelets in kubeadm clusters now disable the readonly port (10255). If you're relying on unauthenticated access to the readonly port, please switch to using the secure port (10250). Instead, you can now use ServiceAccount tokens when talking to the secure port, which will make it easier to get access to e.g. the `/metrics` endpoint of the kubelet securely.
```
@kubernetes/sig-cluster-lifecycle-pr-reviews 
@kubernetes/sig-auth-pr-reviews FYI

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch from b640ca7 to fe65a18 May 29, 2018

@k8s-ci-robot

Contributor

k8s-ci-robot commented May 29, 2018

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm label May 29, 2018

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch from fe65a18 to bff5c13 May 29, 2018

@tallclair

Member

tallclair commented Jun 1, 2018

Yeah, this isn't a conformance test, so I don't think it needs to be tied to a version. We can merge it when it's ready.

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch from 21f3f54 to c4472c8 Jun 2, 2018

@dixudx

Member Author

dixudx commented Jun 2, 2018

@jberkus @tallclair Addressed the comments. PTAL. Thanks.

Expect(len(nodeList.Items)).NotTo(BeZero())

pickedNode := nodeList.Items[0]
// Internal IPs are meaningless from the e2e test

@tallclair

tallclair Jun 4, 2018

Member

This comment only applies to the test runner. Since you're execing both the tests through a pod, you can test the internal IPs here.

Expect(len(sa.Secrets)).NotTo(BeZero())
})

It("The kubelet's main port 10250 should fail with no credentials", func() {

@tallclair

tallclair Jun 4, 2018

Member

nit: s/fail/reject requests/

"--insecure",
fmt.Sprintf("https://%s:%v/metrics", nodeIP, ports.KubeletPort),
},
"401",

@tallclair

tallclair Jun 4, 2018

Member

I'd prefer to check for 401 OR 403, that way you're not depending on a specific implementation as much.
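
A minimal sketch of that check, assuming the status code arrives as text (for example from curl's `-w '%{http_code}'` output). The helper name `isRejected` is hypothetical and not part of the e2e framework:

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// isRejected is a hypothetical helper: it reports whether a textual HTTP
// status code means the request was refused, either as unauthenticated
// (401) or as unauthorized (403).
func isRejected(code string) bool {
	n, err := strconv.Atoi(strings.TrimSpace(code))
	if err != nil {
		// Anything that is not a number is not a recognizable rejection.
		return false
	}
	return n == http.StatusUnauthorized || n == http.StatusForbidden
}

func main() {
	for _, code := range []string{"401", "403", "200"} {
		fmt.Printf("%s rejected: %v\n", code, isRejected(code))
	}
}
```

Accepting either code keeps the test independent of whether a given kubelet configuration treats the request as anonymous or as unauthenticated.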

"-s",
"-I",
"--insecure",
fmt.Sprintf(`--header "Authorization: Bearer %s"`, "`cat /var/run/secrets/kubernetes.io/serviceaccount/token`"),

@tallclair

tallclair Jun 4, 2018

Member

curl doesn't know how to interpret backticks - you need to run this command in a shell. I.e.:

[]string{"sh", "-c", "curl -sI --insecure --header \"Authorization: Bearer `cat /var/run/secrets/kubernetes.io/serviceaccount/token`\" " + url}
fmt.Sprintf(`--header "Authorization: Bearer %s"`, "`cat /var/run/secrets/kubernetes.io/serviceaccount/token`"),
fmt.Sprintf("https://%s:%v/metrics", nodeIP, ports.KubeletPort),
},
"403",

@tallclair

tallclair Jun 4, 2018

Member

... or 401

Expect(statusCode).NotTo(Equal(http.StatusOK))
})
It("should not be able to proxy to cadvisor port 4194 using proxy subresource", func() {
result, err := framework.NodeProxyRequest(f.ClientSet, nodeName, "proxy/containers/", 4194)

@tallclair

tallclair Jun 4, 2018

Member

Are you sure the path should include proxy here?
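
For reference, the API server serves the node proxy subresource at `/api/v1/nodes/<name>:<port>/proxy/<path>`. The sketch below only builds that path; `nodeProxyPath` is a hypothetical helper, and whether `framework.NodeProxyRequest` already adds the `proxy` segment is an assumption to check against its source. If it does, an endpoint of `proxy/containers/` would produce a doubled `proxy/proxy/` segment.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeProxyPath is a hypothetical helper that builds the API server path
// for proxying to a port on a node via the "proxy" subresource.
func nodeProxyPath(nodeName string, port int, endpoint string) string {
	return fmt.Sprintf("/api/v1/nodes/%s:%d/proxy/%s",
		nodeName, port, strings.TrimPrefix(endpoint, "/"))
}

func main() {
	// "node-1" is a placeholder node name.
	fmt.Println(nodeProxyPath("node-1", 4194, "containers/"))
	fmt.Println(nodeProxyPath("node-1", 4194, "proxy/containers/"))
}
```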

@tallclair

Member

tallclair commented Jun 4, 2018

There seems to be some confusion about what internal IPs and external IPs are.

External IPs are available to the entire internet.
Internal IPs are from a private address space, meaning they are only meaningful within the private network. In our case, that means running in the cluster.

The E2E test runner runs outside the cluster being tested, which means that it can only see the external IP addresses. (Strictly speaking, it can see the internal IP addresses too, but they are resolved within the test runner's private network, so they probably resolve to different devices, if at all.) When you run a pod, it runs in the cluster, meaning the pod can see the internal addresses. This is why any tests that look at the internal IPs should be run through a pod.

Hopefully that clears it up?
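
A small sketch of that distinction, assuming a local stand-in type rather than the real Kubernetes `v1.NodeAddress` (the `"InternalIP"` and `"ExternalIP"` strings match the values of that API's address types; the addresses themselves are made up):

```go
package main

import "fmt"

// nodeAddress is a local stand-in for the Kubernetes v1.NodeAddress type,
// kept here so the example needs no external imports.
type nodeAddress struct {
	Type    string
	Address string
}

// addressesOfType returns the non-empty addresses of the requested type.
func addressesOfType(addrs []nodeAddress, want string) []string {
	var out []string
	for _, a := range addrs {
		if a.Type == want && a.Address != "" {
			out = append(out, a.Address)
		}
	}
	return out
}

func main() {
	// Illustrative addresses only.
	addrs := []nodeAddress{
		{Type: "InternalIP", Address: "10.128.0.2"},
		{Type: "ExternalIP", Address: "203.0.113.7"},
		{Type: "Hostname", Address: "node-1"},
	}
	// Internal IPs are only meaningful from inside the cluster (a pod).
	fmt.Println("check from a pod:", addressesOfType(addrs, "InternalIP"))
	// External IPs are what the test runner outside the cluster can reach.
	fmt.Println("check from the runner:", addressesOfType(addrs, "ExternalIP"))
}
```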

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch from c4472c8 to ff92c01 Jun 5, 2018

pod := createNodeAuthTestPod(f)
for _, nodeIP := range nodeIPs {
// Anonymous authentication is disabled by default
result, err := framework.LookForStringInPodExec(pod.Namespace,

@tallclair

tallclair Jun 5, 2018

Member

No reason to use this method if you're not looking for the string. How about just:

result := framework.RunHostCmdOrDie(ns, pod.Name, fmt.Sprintf("curl -sIk -o /dev/null -w '%%{http_code}' https://%s:%v/metrics", nodeIP, ports.KubeletPort))
Expect(result).To(Or(Equal("401"), Equal("403")), "the kubelet's main port 10250 should reject requests with no credentials")
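
One detail worth spelling out when building this command with `fmt.Sprintf`: curl's `%{http_code}` placeholder contains a percent sign, which has to be written as `%%` in a Go format string. A minimal sketch, where `statusCodeCmd` is a hypothetical helper and the address in `main` is a placeholder:

```go
package main

import "fmt"

// statusCodeCmd is a hypothetical helper that builds a curl command which
// prints only the HTTP status code. The "%%" in the format string yields a
// literal "%" so that curl receives its own %{http_code} placeholder.
func statusCodeCmd(nodeIP string, port int) string {
	return fmt.Sprintf(
		"curl -sIk -o /dev/null -w '%%{http_code}' https://%s:%d/metrics",
		nodeIP, port)
}

func main() {
	// Placeholder address; 10250 is the kubelet's main port.
	fmt.Println(statusCodeCmd("10.0.0.1", 10250))
	// prints: curl -sIk -o /dev/null -w '%{http_code}' https://10.0.0.1:10250/metrics
}
```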
pod := createNodeAuthTestPod(f)

for _, nodeIP := range nodeIPs {
result, err := framework.LookForStringInPodExec(pod.Namespace,

@tallclair

tallclair Jun 5, 2018

Member

Same as above comment.

time.Minute)
Expect(err).NotTo(HaveOccurred())

if !strings.Contains(result, "403") && !strings.Contains(result, "401") {

@tallclair

tallclair Jun 5, 2018

Member

Expect(result).To(And(BeNumerically(">=", 200), BeNumerically("<", 300)), ...)

@tallclair

tallclair Jun 5, 2018

Member

I don't particularly care for the gomega DSL, so feel free to do these checks explicitly, but semantically this is the condition you should check for.

@tallclair

tallclair Jun 5, 2018

Member

You might be able to simplify this to just: Expect(result).To(Equal("200"), ...)

@dixudx

dixudx Jun 6, 2018

Author Member

Here we send requests to the kubelet with a newly created SA token. The expected result is 403: authentication succeeds, but the token is not authorized.

Expect(result).To(Equal("200"), ...)

@tallclair I don't quite understand where the 200 comes from.


pickedNode := nodeList.Items[0]
// Here we only need to care about Internal IP, since the pods running in the cluster
// can see the internal addresses.

@tallclair

tallclair Jun 5, 2018

Member

This comment still doesn't make sense to me. Why did you drop the external IPs?

@dixudx

dixudx Jun 6, 2018

Author Member

So we add external IPs back again? I was testing both internal IPs and external IPs before.

BeforeEach(func() {
nodes := framework.GetReadySchedulableNodesOrDie(f.ClientSet)
Expect(len(nodes.Items)).NotTo(BeZero())
node = &nodes.Items[0]

@tallclair

tallclair Jun 5, 2018

Member

The 0th node is usually the master, which might be configured differently. Still a good idea to check it though, but maybe pick a few (or all of them)?

@dixudx

dixudx Jun 6, 2018

Author Member

Actually, all the nodes here are schedulable and ready. The master node is usually tainted, so it is filtered out as unschedulable.

So the 0th node is usually not the master, and the 0th node is widely used in our e2e test scenarios.

// Pick a node where all pods will run.
nodes := framework.GetReadySchedulableNodesOrDie(f.ClientSet)
Expect(len(nodes.Items)).NotTo(BeZero(), "No available nodes for scheduling")
node := &nodes.Items[0]

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch 2 times, most recently from 27c98bf to f71ad8c Jun 6, 2018

@luxas luxas removed the kind/design label Jun 6, 2018

@luxas luxas added this to the v1.11 milestone Jun 6, 2018

@liggitt

Member

liggitt commented Jun 6, 2018

per #64140 (comment) this isn't tied to v1.11 or blocking the release

/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.11 milestone Jun 6, 2018

test/e2e/auth/node_authn.go (outdated)

@dixudx dixudx force-pushed the dixudx:add_e2e_kubelet_port branch from f71ad8c to 924df8a Jun 12, 2018

@dixudx

Member Author

dixudx commented Jun 20, 2018

@luxas @tallclair Needs your lgtm. Thanks.

@tallclair

Member

tallclair commented Jun 21, 2018

/lgtm

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jun 21, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dixudx, luxas, tallclair, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dixudx

Member Author

dixudx commented Jun 21, 2018

@tallclair @luxas Seems the bot does not want to lgtm.

@luxas luxas added the lgtm label Jun 21, 2018

@k8s-github-robot

Contributor

k8s-github-robot commented Jun 21, 2018

/test all

Tests are more than 96 hours old. Re-running tests.

@k8s-github-robot

Contributor

k8s-github-robot commented Jun 21, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jun 21, 2018

@dixudx: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-kubernetes-bazel-test | 924df8a | link | `/test pull-kubernetes-bazel-test` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot

Contributor

k8s-github-robot commented Jun 21, 2018

Automatic merge from submit-queue (batch tested with PRs 64140, 64898, 65022, 65037, 65027). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 0b3af19 into kubernetes:master Jun 21, 2018

16 of 18 checks passed

pull-kubernetes-bazel-test: Job failed.
Submit Queue: Required Github CI test is not green: pull-kubernetes-bazel-test
cla/linuxfoundation: dixudx authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-cross: Skipped
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gke: Skipped
pull-kubernetes-e2e-kops-aws: Job succeeded.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped
pull-kubernetes-local-e2e-containerized: Skipped
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.

@dixudx dixudx deleted the dixudx:add_e2e_kubelet_port branch Jun 21, 2018
