Perform challenge callbacks into a node #15125

justinsb · 2023-02-11T16:01:26Z

In order to verify that the caller is running on the specified node, we source the expected IP address from the cloud, and require that the node set up a simple challenge/response server to answer requests.

Because the challenge server runs on a port outside of the nodePort range, this also makes it harder for pods to impersonate their host nodes - though we do combine this with TPM and similar functionality where it is available.
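For readers following along, the callback flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the wire protocol this PR implements: the HMAC-based response, the per-request secret, the port value `3990`, the in-memory `fakeNetwork`, and all function names are hypothetical. The only external fact relied on is the default Kubernetes NodePort range of 30000-32767.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
)

// Default Kubernetes NodePort range.
const (
	nodePortMin = 30000
	nodePortMax = 32767
)

// challengePort is an illustrative value for this sketch only.
const challengePort = 3990

// outsideNodePortRange reports whether a port cannot be claimed by a
// NodePort service under the default range.
func outsideNodePortRange(port int) bool {
	return port < nodePortMin || port > nodePortMax
}

// respond is the node side: answer a nonce with HMAC-SHA256(secret, nonce).
func respond(secret []byte, nonce string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(nonce))
	return hex.EncodeToString(mac.Sum(nil))
}

// callback abstracts "dial addr and send it a nonce". It is injected so the
// sketch runs without opening sockets (assumption: a real system would make
// an authenticated network call here).
type callback func(addr, nonce string) (string, error)

// verifyCaller is the controller side. The secret is assumed to arrive with
// the caller's bootstrap request; the IP comes from the cloud inventory.
func verifyCaller(cloudIPs map[string]string, nodeName string, secret []byte, send callback) error {
	ip, ok := cloudIPs[nodeName]
	if !ok {
		return fmt.Errorf("node %q not found in cloud inventory", nodeName)
	}
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return err
	}
	nonce := hex.EncodeToString(raw)
	// Dial the address the cloud reports, never one the caller claims.
	addr := net.JoinHostPort(ip, strconv.Itoa(challengePort))
	got, err := send(addr, nonce)
	if err != nil {
		return fmt.Errorf("challenge to %s failed: %w", addr, err)
	}
	if !hmac.Equal([]byte(got), []byte(respond(secret, nonce))) {
		return fmt.Errorf("challenge response from %s did not match", addr)
	}
	return nil
}

// fakeNetwork maps listen addresses to the secret held by the challenge
// server running there.
func fakeNetwork(servers map[string][]byte) callback {
	return func(addr, nonce string) (string, error) {
		secret, ok := servers[addr]
		if !ok {
			return "", fmt.Errorf("connection refused: %s", addr)
		}
		return respond(secret, nonce), nil
	}
}

func main() {
	cloud := map[string]string{"node-a": "10.0.0.5"}
	secret := []byte("one-time-secret-from-node-a")
	network := fakeNetwork(map[string][]byte{"10.0.0.5:3990": secret})

	fmt.Println("port outside NodePort range:", outsideNodePortRange(challengePort))
	fmt.Println("genuine node:", verifyCaller(cloud, "node-a", secret, network))
	fmt.Println("impostor:", verifyCaller(cloud, "node-a", []byte("attacker-secret"), network))
}
```

The point of the sketch is that the controller never trusts an address supplied by the caller: a request is accepted only if the server at the cloud-reported IP can answer for the same secret the caller presented.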

justinsb · 2023-02-11T16:01:48Z

/hold for discussion, I don't think we should merge this lightly :-)

zetaab · 2023-02-11T16:21:34Z

One notable thing: people can run pods with hostNetwork, and then impersonation is possible. But as I see it: why should we still accept calls from instances that are already members of the Kubernetes cluster? If we do not accept those, then it's pretty easy to protect against normal pods.

One more thing that comes to mind: if a node lives long enough, its kubelet certificates may expire and it will need to re-register (should kops-controller accept the request in that case?). That could be handled by running kubectl delete node for a node that is in the NotReady state, after which it can request new certs, but it's not a pretty solution.

justinsb · 2023-05-04T12:14:25Z

> One notable thing: people can run pods with hostNetwork, and then impersonation is possible. But as I see it: why should we still accept calls from instances that are already members of the Kubernetes cluster? If we do not accept those, then it's pretty easy to protect against normal pods.

So I think we need to do this PR and also prevent the node from re-registering. We need this PR so that we can have higher confidence when the node joins (particularly on clouds where we don't get strong node attestation, for example DigitalOcean). And we want to prevent the node from re-registering to avoid attacks where a pod tries to impersonate a node. We do want to allow some re-registration (as you've pointed out, where the node reboots, or where the cert expires) ... I am thinking we want a machine-key or similar, but I think we've agreed that we can iterate on that!
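As a rough illustration of the "no re-registration" rule discussed here, the decision can be sketched as below. This is a minimal sketch under assumptions, not kops-controller code: the `decision` type, `decideBootstrap`, and the `registered` set are hypothetical names, and the reboot and cert-expiry exceptions are deliberately left out because the thread leaves their mechanism (a machine-key or similar) open.

```go
package main

import "fmt"

// decision is the outcome of a bootstrap request in this sketch.
type decision string

const (
	allow decision = "allow"
	deny  decision = "deny"
)

// decideBootstrap refuses requests for node names that are already cluster
// members, since such a request may come from a pod on that node (for
// example one running with hostNetwork) rather than from the node itself.
func decideBootstrap(registered map[string]bool, nodeName string) decision {
	if registered[nodeName] {
		return deny
	}
	return allow // first join
}

func main() {
	registered := map[string]bool{"node-a": true}

	fmt.Println("node-a:", decideBootstrap(registered, "node-a"))
	fmt.Println("node-b:", decideBootstrap(registered, "node-b"))

	// The workaround mentioned earlier in the thread: once the Node object
	// is deleted, the same name may bootstrap again.
	delete(registered, "node-a")
	fmt.Println("node-a after delete:", decideBootstrap(registered, "node-a"))
}
```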

justinsb · 2023-05-04T18:21:14Z

/test pull-kops-e2e-cni-cilium-ipv6

justinsb · 2023-05-05T11:55:18Z

/retest

Going to look into each of the failures, but they all appear to be unrelated

justinsb · 2023-05-05T12:32:54Z

Test failures matched kubernetes/kubernetes#117363, i.e. a data race in k8s

DigitalOcean (and others) will follow shortly. Also create a method for CloudProvider, so that we are more agnostic towards bootstrapping methods.

hakman · 2023-05-06T13:56:47Z

/retest

justinsb · 2023-05-07T13:00:48Z

I made the changes here to only run this on hetzner (and I'll rebase the digitalocean branch after this merges).
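One way to picture the per-cloud gating is a method on the cloud provider that says whether the challenge callback applies. The sketch below is purely illustrative: the trimmed-down `CloudProvider` interface, the `UsesNodeChallenge` method, the `provider` type, and the check names are all hypothetical and are not taken from the kops source.

```go
package main

import "fmt"

// CloudProvider is a hypothetical, trimmed-down interface for this sketch.
type CloudProvider interface {
	Name() string
	// UsesNodeChallenge reports whether bootstrap requests on this cloud
	// are additionally verified with a challenge callback.
	UsesNodeChallenge() bool
}

// provider is a trivial implementation used only for illustration.
type provider struct {
	name      string
	challenge bool
}

func (p provider) Name() string            { return p.name }
func (p provider) UsesNodeChallenge() bool { return p.challenge }

// verifiers lists the checks to run for a bootstrap request on a cloud,
// without the caller needing to switch on the cloud's name.
func verifiers(c CloudProvider) []string {
	checks := []string{"cloud-identity"}
	if c.UsesNodeChallenge() {
		checks = append(checks, "challenge-callback")
	}
	return checks
}

func main() {
	clouds := []CloudProvider{
		provider{name: "hetzner", challenge: true},
		provider{name: "digitalocean", challenge: false}, // to follow later, per the thread
	}
	for _, c := range clouds {
		fmt.Println(c.Name(), verifiers(c))
	}
}
```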

hakman · 2023-05-07T13:24:28Z

/lgtm
/approve

k8s-ci-robot · 2023-05-07T13:24:46Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

Needs approval from an approver in each of these files:

~~OWNERS~~ [hakman]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

justinsb · 2023-05-07T14:03:48Z

Thanks for reviewing @hakman

/hold cancel

justinsb · 2023-05-07T15:28:18Z

/test pull-kops-e2e-cni-cilium

(It was the openapi data race again)

k8s-ci-robot added the cncf-cla: yes and size/XL labels Feb 11, 2023

k8s-ci-robot requested a review from hakman February 11, 2023 16:01

k8s-ci-robot added the area/api label Feb 11, 2023

k8s-ci-robot requested a review from zetaab February 11, 2023 16:01

k8s-ci-robot added the do-not-merge/hold label Feb 11, 2023

justinsb mentioned this pull request Feb 15, 2023

exit nodeup gracefully if server already exists in k8s #15138

Merged

justinsb force-pushed the node_challenge branch 2 times, most recently from d98b755 to f24fde4 Compare April 26, 2023 10:25

k8s-ci-robot added the needs-rebase label Apr 29, 2023

justinsb force-pushed the node_challenge branch from f24fde4 to eb7702d Compare May 3, 2023 16:23

k8s-ci-robot removed the needs-rebase label May 3, 2023

justinsb force-pushed the node_challenge branch from eb7702d to 2d12e06 Compare May 4, 2023 11:47

k8s-ci-robot added the size/XXL label and removed the size/XL label May 4, 2023

justinsb force-pushed the node_challenge branch 4 times, most recently from e87a61a to ee53335 Compare May 4, 2023 14:12

justinsb mentioned this pull request May 4, 2023

DigitalOcean support for node bootstrap #15367

Merged

justinsb force-pushed the node_challenge branch from ee53335 to 9c8ab2f Compare May 5, 2023 02:15

justinsb added 3 commits May 6, 2023 08:03

Add generated code

79ca260

Update expected test output

bd956f2

justinsb force-pushed the node_challenge branch from 9c8ab2f to b06c852 Compare May 6, 2023 12:11

Only use node challenge on hetzner

c89f434

justinsb force-pushed the node_challenge branch from b06c852 to c89f434 Compare May 6, 2023 12:57

k8s-ci-robot assigned hakman May 7, 2023

k8s-ci-robot added the lgtm label May 7, 2023

k8s-ci-robot added the approved label May 7, 2023

k8s-ci-robot removed the do-not-merge/hold label May 7, 2023

k8s-ci-robot merged commit 68dcc7a into kubernetes:master May 7, 2023
9 checks passed

hakman mentioned this pull request Jul 18, 2023

azure: Verify node identity using VMSS name instead of tags #15659

Merged
