Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-2819: CNI traffic shaping support #2808

Closed
wants to merge 3 commits into from

Conversation

uablrek
Copy link

@uablrek uablrek commented Jul 3, 2021

  • One-line PR description: revive old KEP
  • Other comments:

Note: This is not a new KEP. It is for regain tracking of an already-implemented feature (in alpha)

This KEP is revived to regain tracking for the traffic shaping feature. It was introduced in v1.9 (I think) but has not left "alpha" since 2018. The intention is to allow the function to graduate. (kubernetes/community#2202 (comment))

The original KEP was not merged or moved to kubernetes/enhancements.

Links

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 3, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: uablrek
To complete the pull request process, please assign caseydavenport after the PR has been reviewed.
You can assign the PR to them by writing /assign @caseydavenport in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 3, 2021
@uablrek
Copy link
Author

uablrek commented Jul 3, 2021

/cc @thockin

@k8s-ci-robot k8s-ci-robot requested a review from thockin July 3, 2021 13:25
@sftim
Copy link
Contributor

sftim commented Jul 6, 2021

Should we retitle this to KEP-0012?

@uablrek
Copy link
Author

uablrek commented Jul 6, 2021

If the number is still valid, yes. I don't know how the number is assigned. I read it is the trackíng number, but I don't know what that is.

@uablrek uablrek mentioned this pull request Jul 10, 2021
4 tasks
@uablrek uablrek changed the title KEP-0000: cni-bandwidth added KEP-2819: CNI traffic shaping support Jul 10, 2021
Now "CNI traffic shaping support" in KEP, issue and PR.
Comment on lines +77 to +78
"kubernetes.io/ingress-bandwidth": "10M",
"kubernetes.io/egress-bandwidth": "10M"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Co-authored-by: Tim Bannister <tim@scalefactory.com>
@uablrek
Copy link
Author

uablrek commented Sep 2, 2021

/assign @thockin

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 1, 2021
@frittentheke
Copy link

Is this still being pursued? Would be sad so see my proprietary implementations in various networks without going via the CNI.

@uablrek
Copy link
Author

uablrek commented Dec 3, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 3, 2021
Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that Dockershim is out (even if it were to be reverted, that would be temporary) do we need to re-position this in terms of CRI? E.g. kubelet passes these annotations to the runtime via CRI and then containerd or CRI-O (or whatever) handle it in terms of CNI ?

@thockin
Copy link
Member

thockin commented Jan 24, 2022

Is this trying to hit 1.24 ?

@uablrek
Copy link
Author

uablrek commented Jan 25, 2022

Given that Dockershim is out (even if it were to be reverted, that would be temporary) do we need to re-position this in terms of CRI? E.g. kubelet passes these annotations to the runtime via CRI and then containerd or CRI-O (or whatever) handle it in terms of CNI ?

Um, does this only work with Dockershim?

Sorry, I haven't tested this myself only relayed a wish to promote this feature from my stake-holders. I am using cri-o so I would have noticed. If it's only working with dockershim I think we should take a step back and reconsider. I have seen proposals to pass pod annotations to CRI-plugins in a generic way but I cant recall where. Perhaps a KEP? If that is implemented it should be used. If not, this feature (and this KEP) should be deprecated. IMHO this feature should not be implemented with specific code.

@uablrek
Copy link
Author

uablrek commented Jan 25, 2022

Is this trying to hit 1.24 ?

No. Considering the comments above.

@lordofire
Copy link

Instead of handling all the heavy lifting traffic shaping logics on kubernetes CNI directly, will it makes more sense to just provide a mechanism for kubernetes cni to tag (through fields like TOS, etc) the traffic generated per pod through annotation, and offload the traffic shaping logics to the network layer directly? This might not be a good fit for cloud environment, since we do not have privilege to control the traffic shaping logics, but I think it makes it much easier on on-prem setup.

I'm fine with someone adding that mechanism, but I think it could work out fine as extension APIs rather than as part of core Kubernetes. (If it's purely done by extending Kubernetes, no KEP is needed).

Thanks @sftim for the response. Is there any reference on the "extension APIs" you mentioned? I would like to take a look when I'm free.

"type": "bandwidth",
"runtimeConfig": {
"trafficShaping": {
"ingressRate": "X",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the problem with the existing annotation is that they rely on plugin-specific definitions of bandwidth and burst.

We should probably clearly define either

  1. Burst isn't required
  2. Burst has a precise definition (e.g. Bucket size is N percent / ratio of bandwidth)

@sftim
Copy link
Contributor

sftim commented May 24, 2022

Is there any reference on the "extension APIs" you mentioned?

https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/


* [add traffic shaping support](https://github.com/kubernetes/kubernetes/pull/63194)

## Drawbacks
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only drawback is a larger feature: bandwidth is not currently a scheduled resource. Do we want to consider adding node-egress-bandwidth to a pod's resources?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is made complicated by the fact that bandwidth is only accountable as a pod-level resource, rather than a container-level resource (barring serious trickery)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brief outline of an approach

Someone who wants to can define an extended resource and assign that to a Pod. Pick something like “bits per second” because you can only have whole numbers of these.

Nodes can advertise their actual uplink as the available amount of that resource. Or, if we decide to, we could implement some cleverness around a particular extended resource where this doesn't hold (allowing overcommitment).

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 5, 2022
@uablrek
Copy link
Author

uablrek commented Oct 6, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2022
@marosset marosset removed this from the v1.26 milestone Dec 12, 2022
- "@m1093782566"
owning-sig: sig-network
participating-sigs:
status: implemented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that mean that this is generally available?

Suggested change
status: implemented
status: implementable


## Alternatives

None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's document some alternative approaches.

Rather than using annotations, we could add the network restrictions to the Pod API.

That could allow the kubelet to report via .status whether the shaping was active. It also enables a richer control over flow shaping.

I'd prefer to see Pods refer indirectly to a custom resource, eg NetworkTrafficQuota, but still have the Pod report enforcement via its .status. (Why? Because you might have nodes where this works and other nodes where it doesn't).

@thockin
Copy link
Member

thockin commented Dec 22, 2022 via email

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 22, 2023
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 21, 2023
@lordofire
Copy link

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 27, 2023
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 18, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 17, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants