Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network policy docs clarifications #39875

Merged
merged 4 commits into from
Nov 9, 2023

Conversation

npinaeva
Copy link
Member

@npinaeva npinaeva commented Mar 9, 2023

  1. Add "Pod lifecycle" section to the network policy docs.
  2. Clarify network policy behaviour for hostNetwork pods
  3. Clarify network policy behaviour for L2/L3 protocols

New sections should clarify some confusing aspect for the users, and become the base to build better-defined API for AdminNetworkPolicy.

The current version is intended for content reviews, and may not completely follow style guidelines, which will be fixed once we agree on the content.

@k8s-ci-robot
Copy link
Contributor

Welcome @npinaeva!

It looks like this is your first PR to kubernetes/website 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/website has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 9, 2023
@k8s-ci-robot k8s-ci-robot added the language/en Issues or PRs related to English language label Mar 9, 2023
@k8s-ci-robot k8s-ci-robot added the sig/docs Categorizes an issue or PR as relevant to SIG Docs. label Mar 9, 2023
@npinaeva
Copy link
Member Author

npinaeva commented Mar 9, 2023

/cc @danwinship @astoycos

@bradtopol
Copy link
Contributor

/retest

@bradtopol
Copy link
Contributor

/recheck

@bradtopol
Copy link
Contributor

@npinaeva This looks like it needs to build again.

git commit --allow-empty -m "empty commit to rerun tests"

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Here's some thoughts.

@sftim
Copy link
Contributor

sftim commented Mar 9, 2023

I'll trigger a build in Netlify (we haven't got that hooked up to /retest in Prow - sorry about that)

Please rebase this against main - there's been a breaking change to the CVE feed format.

@npinaeva
Copy link
Member Author

npinaeva commented Mar 9, 2023

@sftim thanks for your feedback! I made some changes and left some questions, ptal :)

@WillsonHG
Copy link

/easycla

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Mar 9, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Mar 9, 2023
Copy link
Member

@Arhell Arhell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/easycla

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 10, 2023
@netlify
Copy link

netlify bot commented Mar 10, 2023

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit 9fd88db
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/652ea3a9734fb5000701d829
😎 Deploy Preview https://deploy-preview-39875--kubernetes-io-main-staging.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more thoughts

@npinaeva
Copy link
Member Author

hey @kbhawkey , thanks for asking! Yes I am still working on it, just waiting for some feedback :)


Once the NetworkPolicy is handled by a network plugin,

1. All newly created pods affected by a given NetworkPolicy will be isolated before

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pods...will be isolated before they are started

I believe this bullet point is incorrect. This requirement does not even seem possible for a NetworkPolicy implementation which is disjoint from the CNI (e.g. Azure NPM).

Assuming that isolation here means blocking traffic. Please correct me if I'm misinterpreting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your interpretation, and the bullet point, are correct. A correct implementation of NetworkPolicy must ensure that pods are not able to bypass pre-existing NetworkPolicies at any point in their lifecycle.

If pods aren't fully subject to NetworkPolicy at startup time, then an attacker can exploit that race condition by continuously creating new pods and trying to make connections that should have been blocked, until they eventually succeed, and then they can do whatever bad thing the NetworkPolicy was supposed to have been preventing them from doing, rendering the NetworkPolicy useless.

This requirement does not even seem possible for a NetworkPolicy implementation which is disjoint from the CNI (e.g. Azure NPM).

Note that it's allowed for pods to come up more restricted than they're supposed to be, just not less restricted. So if the NP impl can create a fallback "block everything else" rule that would apply to newly-created pods, then it could do that, and then the pod would come up fully isolated until the NP impl has a chance to fix it up.

If the CNI plugin works in such a way that the NP implementation can't create rules that would apply to a new pod before that pod exists (eg, the CNI plugin is creating a new interface), then there is probably no way to correctly implement NetworkPolicy independently of the CNI plugin. (Though there are small-ish modifications that could be made to the plugin to make this possible. Eg, you could add a flag to the CNI plugin saying "make newly-created pods start out completely isolated", and then it would depend on the existence of some external NP impl to fix the pods up after that.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make newly-created pods start out completely isolated

(aside)
That sounds like a very reasonable workaround, especially given the (alpha) PodReadyToStartContainers condition - the new Pod might be fully isolated, then add packet filtering rules, then remove the full isolation, and finally assert PodReadyToStartContainers. But: it does need the network plugin to play at least some role.

Maybe there's an interface or API that we can define so that a network plugin waits between setting up IP networking for a Pod and asserting that condition. That would help standalone NetworkPolicy implementations to provide the guarantees we expect.

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're willing to update it, the section about SCTPSupport is stale.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please elaborate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now generally available, no need to enable it and no option to disable it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole "SCTP support" section can probably just go away; once you get rid of the stuff about the feature gate, all that's left is a warning that SCTP NetworkPolicies only work if your network plugin supports SCTP, which is kind of a no-brainer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so do you want me to add another commit to remove SCTP support section? This PR doesn't touch SCTP yet, so maybe it would be a better option to have it as another PR (knowing for how ling we've been discussing this one)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one more commit for SCTP, I left the note about plugin SCTP support, since it looks fine to me in the new "Network traffic filtering" section that talks about protocols. ptal

that pod may be started unprotected, and isolation rules will be applied when
the NetworkPolicy handling is completed.

Once the NetworkPolicy is handled by a network plugin,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Once the NetworkPolicy is handled by a network plugin,
For a cluster with a conformant networking plugin and a conformant implementation of
NetworkPolicy:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not exactly equivalent, but I added this message as a not at the beginning of the Pod lifecycle section

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(IMO we don't need this note; the entire page only applies to plugins that implement NetworkPolicy, and the entire /services-networking/ section only (fully) applies to clusters with conformant networking plugins.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah good point, @sftim what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the entire page only applies to plugins that implement NetworkPolicy

The conceptual explanation is relevant even if your cluster is using a network plugin that has never heard of NetworkPolicy.


We should document somewhere what we expect from a conforming network plugin. Otherwise, it's not fair on plugin authors - they need to know what's expected and shouldn't have to become detectives in meeting that end.

The way we inform implementers might once day be through a separate reference page, but for now this concept page is the closest thing we have.

Comment on lines 343 to 353
Every created NetworkPolicy will be handled by a network plugin eventually, but there is no
way to tell from the Kubernetes API when exactly that happens.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Every created NetworkPolicy will be handled by a network plugin eventually, but there is no
way to tell from the Kubernetes API when exactly that happens.
Every created NetworkPolicy will be handled by a network plugin eventually, but there is no
way to tell from the Kubernetes API when exactly the policy is being enforced for new or
existing Pods.
(As an example: if you write a NetworkPolicy implementation that begins enforcing the policy from
midnight the next day, that conforms to Kubernetes' requirements - but of course could be very
frustrating for cluster administrators).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion looks more implementor-focused to me, and I think we were going to outline implementor guideline in a different document, wdyt?
Another note - I don't think the implementation you described will pass conformance tests for network policy, so maybe it is not really conformant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no conformance tests for NetworkPolicy, but it wouldn't pass the ordinary e2e tests.

I think we were going to outline implementor guideline in a different document

yeah, there's been discussion about that (partly inspired by this PR). https://groups.google.com/g/kubernetes-sig-network/c/Ga9SWGs00k4 / kubernetes/community#7421

Comment on lines 345 to 355
The only way to check if the network policy is applied to a pod, is to run a connectivity check
from that pod.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even then, you might not be able to tell (I think).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is right, e.g. connections may be dropped for another reason, but I am not sure if adding these details will improve or over-complicate this section, wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd omit the sentence. I'm wary about the risk of misleading people.

(I don't want someone to think “Right, so this Pod shouldn't be able to talk to a Pod in namespace untrusted. I tried and failed, so my policy is working. Job done, :shipit:!”)

Because they are applied at Pod level, NetworkPolicies apply equally to init containers,
sidecar containers, and regular containers.

2. Allow rules will be applied eventually after the isolation rules (that includes applying both at the same time).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Allow rules will be applied eventually after the isolation rules (or may be applied at the same time)."

from that pod.

The pods must be resilient against being started up with different network
connectivity than expected. If the pod needs to make sure it has expected network
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Therefore, pods must be resilient..."

The pods must be resilient against being started up with different network
connectivity than expected. If the pod needs to make sure it has expected network
connectivity before being started, you can use an [init container](/docs/concepts/workloads/pods/init-containers/)
that will run required connectivity checks before starting the app containers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"...before letting the app containers start." or "...before kubelet starts the app containers".

```
may apply ingress and egress rules to all connections with nodeA IP, or just ignore
that pod as if it didn't match the selector, or do something else.
2. a `hostNetwork` pod is selected by `ingress` or `egress` rules.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2. a `hostNetwork` pod is selected by a `podSelector` in an `ingress` or `egress` rule.

(since there's the ipBlock case too, but you don't talk about that in this bullet point)

@danwinship
Copy link
Contributor

(left some other comments up above in earlier threads too)

@sftim
Copy link
Contributor

sftim commented Aug 23, 2023

so do you want me to add another commit to remove SCTP support section? This PR doesn't touch SCTP yet, so maybe it would be a better option to have it as another PR (knowing for how ling we've been discussing this one)?

You must fix the mandatory corrctive feedback about undefined behavior and get that update tech-reviewed. You can also address the optional feedback about SCTP if you'd like to.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 24, 2023
@npinaeva
Copy link
Member Author

@danwinship can you please tal at the hostNetwork section?
@sftim can you please check the sctp changes?

Copy link
Contributor

@danwinship danwinship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(mostly editorial comments, except for the hostNetwork section, which must be fixed)

Comment on lines 324 to 328
When a new NetworkPolicy object is created, it may take some for a network plugin
to handle the new object. If the pod that is affected by a NetworkPolicy
is created before the network plugin has completed NetworkPolicy handling,
that pod may be started unprotected, and isolation rules will be applied when
the NetworkPolicy handling is completed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When a new NetworkPolicy object is created, it may take some for a network plugin
to handle the new object. If the pod that is affected by a NetworkPolicy
is created before the network plugin has completed NetworkPolicy handling,
that pod may be started unprotected, and isolation rules will be applied when
the NetworkPolicy handling is completed.
When a new NetworkPolicy object is created, it may take some time for a network plugin
to handle the new object. If a pod that is affected by a NetworkPolicy
is created before the network plugin has completed NetworkPolicy handling,
that pod may be started unprotected, and isolation rules will be applied when
the NetworkPolicy handling is completed.

that pod may be started unprotected, and isolation rules will be applied when
the NetworkPolicy handling is completed.

Once the NetworkPolicy is handled by a network plugin,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(IMO we don't need this note; the entire page only applies to plugins that implement NetworkPolicy, and the entire /services-networking/ section only (fully) applies to clusters with conformant networking plugins.)

sidecar containers, and regular containers.

2. Allow rules will be applied eventually after the isolation rules (or may be applied at the same time).
In the worst case, a newly created pod may have no network connectivity at all, if
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In the worst case, a newly created pod may have no network connectivity at all, if
In the worst case, a newly created pod may have no network connectivity at all when it is first started, if

Therefore, pods must be resilient against being started up with different network
connectivity than expected. If the pod needs to make sure it has expected network
connectivity before being started, you can use an [init container](/docs/concepts/workloads/pods/init-containers/)
that will run required connectivity checks before kubelet starts the app containers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Tim notes, this won't work for checking "negative connectivity", so it might be good to clarify that this is only for the "positive connectivity" case. "If you need to make sure the pod can reach certain destinations before being started, you can use an init container to wait for those destinations to be reachable before kubelet starts the app containers."


## NetworkPolicy and `hostNetwork` pods

NetworkPolicy behaviour for `hostNetwork` pods is undefined, but should be limited with 2 possibilities:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NetworkPolicy behaviour for `hostNetwork` pods is undefined, but should be limited with 2 possibilities:
NetworkPolicy behaviour for `hostNetwork` pods is undefined, but it should be limited to 2 possibilities:

Comment on lines 364 to 365
- network plugin can distinguish `hostNetwork` pod traffic from any other traffic and will apply
NetworkPolicy to it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- network plugin can distinguish `hostNetwork` pod traffic from any other traffic and will apply
NetworkPolicy to it
- The network plugin can distinguish `hostNetwork` pod traffic from all other traffic
(including being able to distinguish traffic from different `hostNetwork` pods on
the same node), and will apply NetworkPolicy to `hostNetwork` pods just like it does
to pod-network pods.

NetworkPolicy behaviour for `hostNetwork` pods is undefined, but should be limited with 2 possibilities:
- network plugin can distinguish `hostNetwork` pod traffic from any other traffic and will apply
NetworkPolicy to it
- network plugin ignores `hostNetwork` pods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- network plugin ignores `hostNetwork` pods
- The network plugin cannot properly distinguish `hostNetwork` pod traffic, and so it ignores `hostNetwork` pods when matching `podSelector` and `namespaceSelector`. Traffic to/from `hostNetwork` pods is treated the same as all other traffic to/from the node IP. (This is the most common implementation.)

It's not ignoring the pods entirely; eg, a pod which is isolated for ingress will still reject connections from hostNetwork pods.

...
```

2. a `hostNetwork` pod is selected by a `podSelector` in an `ingress` or `egress` rule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
2. a `hostNetwork` pod is selected by a `podSelector` in an `ingress` or `egress` rule.
2. a `hostNetwork` pod is selected by a `podSelector` or `namespaceSelector` in an `ingress` or `egress` rule.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
"Network traffic filtering" section

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like a technical review from @kubernetes/sig-network-misc


{{< note >}}
You must be using a {{< glossary_tooltip text="CNI" term_id="cni" >}} plugin that supports SCTP
protocol NetworkPolicies.
{{< /note >}}

When a `deny all` network policy is defined, it is only guaranteed to deny TCP, UDP and SCTP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit)

Suggested change
When a `deny all` network policy is defined, it is only guaranteed to deny TCP, UDP and SCTP
When a _deny all_ network policy is defined, it is only guaranteed to deny TCP, UDP and SCTP


{{< note >}}
You must be using a {{< glossary_tooltip text="CNI" term_id="cni" >}} plugin that supports SCTP
protocol NetworkPolicies.
{{< /note >}}

When a `deny all` network policy is defined, it is only guaranteed to deny TCP, UDP and SCTP
connections. For other protocols, such as ARP or ICMP, the behaviour is undefined.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optionally)

Suggested change
connections. For other protocols, such as ARP or ICMP, the behaviour is undefined.
connections. For other protocols, such as ARP or ICMP, the behaviour is undefined: it is up to
each network plugin to decide what behavior to adopt and implement.

that pod may be started unprotected, and isolation rules will be applied when
the NetworkPolicy handling is completed.

Once the NetworkPolicy is handled by a network plugin,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the entire page only applies to plugins that implement NetworkPolicy

The conceptual explanation is relevant even if your cluster is using a network plugin that has never heard of NetworkPolicy.


We should document somewhere what we expect from a conforming network plugin. Otherwise, it's not fair on plugin authors - they need to know what's expected and shouldn't have to become detectives in meeting that end.

The way we inform implementers might once day be through a separate reference page, but for now this concept page is the closest thing we have.

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Nov 7, 2023
@danwinship
Copy link
Contributor

/lgtm

I'd like a technical review from @kubernetes/sig-network-misc

Sure, that's what I was doing all along. I declare this PR to be technically accurate!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: a9fc7ecc2e8284aed8052790b331247fec345282

@sftim
Copy link
Contributor

sftim commented Nov 9, 2023

This is good to publish.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sftim

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 9, 2023
@k8s-ci-robot k8s-ci-robot merged commit 9d663cd into kubernetes:main Nov 9, 2023
6 checks passed
@npinaeva npinaeva deleted the netpol-pod-start branch November 14, 2023 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants