Endpoints API Object could not support large number of endpoints #73324

Closed
freehan opened this issue Jan 25, 2019 · 8 comments
@freehan (Contributor) commented Jan 25, 2019

What happened:

When a service selects many backend pods (e.g. 10k), the corresponding Endpoints object becomes very large (>1MB in proto encoding).

On kube-proxy, the API watcher starts dropping the Endpoints object and does not program iptables for the service. See #57073:

k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: watch of *core.Endpoints ended with: very short watch: 
k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Unexpected watch close - watch lasted less than a second and no items received

On the master, if the object is too large, writes to etcd can fail:

Error syncing endpoints for service "xxxxx/xxxxxxxxx": etcdserver: request is too large
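
For a rough sense of scale (not part of the original report): each address entry carries an IP plus a TargetRef with the pod's kind, namespace, name, and UID, so 10k backends easily push the serialized object past 1MB. A minimal Go sketch, assuming a made-up "big-service" with 10k pod-backed addresses and using JSON size as a rough proxy for the proto encoding:

    // Back-of-the-envelope size check for a large Endpoints object.
    // All names, namespaces, and UIDs below are made up for illustration.
    package main

    import (
        "encoding/json"
        "fmt"

        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/types"
    )

    func main() {
        const backends = 10000

        subset := corev1.EndpointSubset{
            Ports: []corev1.EndpointPort{{Name: "http", Port: 8080, Protocol: corev1.ProtocolTCP}},
        }
        for i := 0; i < backends; i++ {
            subset.Addresses = append(subset.Addresses, corev1.EndpointAddress{
                IP: fmt.Sprintf("10.%d.%d.%d", i/65536, (i/256)%256, i%256),
                TargetRef: &corev1.ObjectReference{
                    Kind:      "Pod",
                    Namespace: "default",
                    Name:      fmt.Sprintf("backend-%05d", i),
                    UID:       types.UID(fmt.Sprintf("00000000-0000-0000-0000-%012d", i)),
                },
            })
        }

        ep := corev1.Endpoints{
            ObjectMeta: metav1.ObjectMeta{Name: "big-service", Namespace: "default"},
            Subsets:    []corev1.EndpointSubset{subset},
        }

        raw, _ := json.Marshal(ep)
        fmt.Printf("%d addresses -> roughly %d KiB serialized\n", backends, len(raw)/1024)
    }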

What you expected to happen:

It should just work.

Proposed Short Term Fix:

  1. Add support in the endpoints controller to truncate the Endpoints object to a supported (configurable) size. With this approach, not all endpoints are reflected, but traffic disruption can be avoided because nodes no longer end up with stale iptables state (a sketch follows this list).
  2. Add events when the Endpoints object size approaches the limit and when it exceeds the limit, resulting in truncation. (Ideally, also generate events wherever we would otherwise drop the message.)
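
A minimal sketch of what the truncation in item 1 could look like; maxBytes and the TruncateEndpoints helper are hypothetical names for illustration, not the actual endpoints-controller code:

    // Hypothetical helper for the proposed short-term fix: drop endpoint
    // addresses (not-ready first) until the serialized Endpoints object fits
    // under maxBytes. This is a sketch, not the real endpoints controller.
    package endpointutil

    import (
        "encoding/json"

        corev1 "k8s.io/api/core/v1"
    )

    // TruncateEndpoints returns true if any addresses were dropped.
    func TruncateEndpoints(ep *corev1.Endpoints, maxBytes int) (truncated bool) {
        fits := func() bool {
            raw, err := json.Marshal(ep)
            return err == nil && len(raw) <= maxBytes
        }
        for !fits() {
            trimmed := false
            for i := range ep.Subsets {
                // Prefer dropping not-ready addresses before ready ones.
                if n := len(ep.Subsets[i].NotReadyAddresses); n > 0 {
                    ep.Subsets[i].NotReadyAddresses = ep.Subsets[i].NotReadyAddresses[:n-1]
                    trimmed, truncated = true, true
                } else if n := len(ep.Subsets[i].Addresses); n > 0 {
                    ep.Subsets[i].Addresses = ep.Subsets[i].Addresses[:n-1]
                    trimmed, truncated = true, true
                }
            }
            if !trimmed {
                break // nothing left to drop
            }
        }
        return truncated
    }

The controller could then emit the warning events from item 2 whenever this helper reports that truncation happened.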

Environment: OSS K8s

  • Kubernetes version (use kubectl version): 1.11
@freehan freehan added sig/network Categorizes an issue or PR as relevant to SIG Network. kind/feature Categorizes issue or PR as related to a new feature. labels Jan 25, 2019
@freehan freehan added this to the v1.14 milestone Jan 25, 2019
@freehan freehan self-assigned this Jan 25, 2019
@nikopen (Contributor) commented Mar 1, 2019

Hi @freehan, is another PR needed to close this?
Code freeze is in effect from next Friday.

@thockin thockin added the triage/unresolved Indicates an issue that can not or will not be resolved. label Mar 8, 2019
@liggitt (Member) commented Mar 13, 2019

Given this is not a regression, it doesn't seem release-blocking.

@spiffxp (Member) commented Mar 13, 2019

/milestone clear
v1.14 release lead here. At this late stage in the release cycle, this doesn't seem destined for this release.

Please come talk to us in #sig-release if you feel this was done in error.

@k8s-ci-robot k8s-ci-robot removed this from the v1.14 milestone Mar 13, 2019
@thockin thockin removed the triage/unresolved Indicates an issue that can not or will not be resolved. label Mar 21, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 19, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 19, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alejandrox1 (Contributor)

This is being tackled in https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/0752-endpointslices,
so removing the rotten label (to distinguish this from issues that are labeled as rotten and do need some work done).
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 14, 2020
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 11, 2021
This PR introduces a new API, secpolnodestatus, that aims to address the
following problems:
 - it was previously impossible to reflect per-node status, such as a profile
   failing to install on a single node or a node not supporting the selected
   security profile
 - simply adding per-node attributes such as a map might not scale: given a
   very high number of nodes, the object might exceed etcd's 1MB request
   limit (see also kubernetes/kubernetes#73324)
 - because the per-profile status is written to by several sources (each
   pod in a DaemonSet), the status might appear to "flap" as different pods
   reach different states at their own pace.

The secpolnodestatus objects are created, managed, and deleted together with
finalizers through an API in a new module. Instead of updating the global state
directly, the DaemonSet pods now call the API's update method, which updates
the node status and, if needed, the global status as well.

When a policy is deleted, the object is marked as terminating; when the policy
payload is removed, the node status object is deleted along with its finalizer.
Finally, when all finalizers are gone, so is the global policy object.
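
For illustration only, a rough Go sketch of the per-node status shape described above; the type and field names are guesses, not the operator's actual secpolnodestatus API:

    // Hypothetical shape of a small per-node status object: per-node state
    // lives in many tiny objects instead of one ever-growing map that could
    // hit etcd's request-size limit.
    package v1alpha1

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    type SecurityProfileNodeStatus struct {
        metav1.TypeMeta   `json:",inline"`
        metav1.ObjectMeta `json:"metadata,omitempty"`

        NodeName string `json:"nodeName"` // node this status belongs to
        Status   string `json:"status"`   // e.g. Pending, Installed, Error, Terminating
    }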
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 12, 2021
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 12, 2021
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 15, 2021
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 15, 2021
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 17, 2021
jhrozek added a commit to jhrozek/security-profiles-operator that referenced this issue Mar 17, 2021
k8s-ci-robot pushed a commit to kubernetes-sigs/security-profiles-operator that referenced this issue Mar 18, 2021