Add --watchers flag to allow controller to respond automatically to Ingress or Service updates #687

Open
wants to merge 1 commit into
base: master
from


@jlamillan
Contributor

jlamillan commented Aug 24, 2018

This change is related to issue #484. It builds on this pull-request and introduces a new flag, --events, that, when enabled, registers the controller's Run function as the event handler so that it is called automatically whenever a Service and/or Ingress is updated. Note that these event-driven calls are complementary to the regular polling calls that occur every --interval. For example, we are experimenting with --events as a way to increase our --interval while still processing updates to Services and Ingresses quickly, using the interval to handle out-of-band changes every hour or so.

Some other notes:

  • On start-up, the Informers perform an initial list operation. A BoundedFrequencyRunner is being used to wrap the handler to limit calls to RunOnce generated by events and to avoid a call to RunOnce for every Service/Ingress on initial startup.
  • Similarly, the resync-period of the Informer is currently being set to 0 to further avoid any unnecessary call-backs from the SharedInformers unless a change has occurred. We can continue to rely on --interval to handle regular syncs since not all sources support events.
  • When --events is specified, any k8s client timeout is disabled to avoid periodic warning messages of timeouts that are caused by the persistent HTTP connection mechanism the informers use.
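
The interplay described in these notes (events trigger a sync, a burst of events collapses into a single run, and --interval polling continues alongside) can be sketched without any Kubernetes dependencies. This is a minimal illustration, not the code in this PR: `Trigger`, `Notify`, `RunPending`, and `Loop` are hypothetical names, and the single-slot channel is a much simpler stand-in for the BoundedFrequencyRunner.

```go
package main

import (
	"fmt"
	"time"
)

// Trigger coalesces sync requests: any number of events that arrive while a
// run is already pending collapse into that one pending run.
// Hypothetical type for illustration only.
type Trigger struct {
	pending chan struct{}
}

func NewTrigger() *Trigger {
	// Capacity 1 means "at most one run is queued".
	return &Trigger{pending: make(chan struct{}, 1)}
}

// Notify is what an add/update/delete event handler would call.
// It never blocks: if a run is already pending, the event is absorbed.
func (t *Trigger) Notify() {
	select {
	case t.pending <- struct{}{}:
	default:
	}
}

// RunPending calls runOnce if a request is pending and reports whether it ran.
func (t *Trigger) RunPending(runOnce func()) bool {
	select {
	case <-t.pending:
		runOnce()
		return true
	default:
		return false
	}
}

// Loop runs runOnce whenever an event is pending or the polling interval
// elapses, until stop is closed. The ticker plays the role of --interval.
func (t *Trigger) Loop(interval time.Duration, runOnce func(), stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-t.pending:
			runOnce()
		case <-ticker.C:
			runOnce()
		case <-stop:
			return
		}
	}
}

func main() {
	t := NewTrigger()
	runs := 0

	// Simulate the initial list on start-up: one event per existing object.
	for i := 0; i < 100; i++ {
		t.Notify()
	}

	// Only one run is pending, no matter how many events arrived.
	for t.RunPending(func() { runs++ }) {
	}
	fmt.Println("syncs after 100 events:", runs) // prints "syncs after 100 events: 1"
}
```

In a long-running process, `Loop` would be started in a goroutine and `Notify` wired into the informer event handlers; that wiring is omitted here.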
@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 95323c6 to f8b3ddf Aug 24, 2018
@linki

Contributor

linki commented Aug 30, 2018

@jlamillan Thanks for the pull request. We'll look into it.

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from f8b3ddf to 068de95 Sep 7, 2018
@jlamillan

Contributor Author

jlamillan commented Sep 7, 2018

Thanks @linki.

Rebasing my branch onto the latest...

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch 4 times, most recently from e93584a to f0a931a Sep 7, 2018
@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from f0a931a to de83bcb Sep 25, 2018
@jlamillan

Contributor Author

jlamillan commented Sep 25, 2018

Rebased again. @linki, eagerly awaiting your thoughts and feedback here. We've been using this change for a while and it's been working well for us.

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch 2 times, most recently from 8ae3e21 to eed4ed0 Oct 30, 2018
@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from eed4ed0 to 384e95e Nov 7, 2018
@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 384e95e to 0f03d33 Dec 21, 2018
@jlamillan

Contributor Author

jlamillan commented Dec 21, 2018

/assign @Raffo

@Raffo

Contributor

Raffo commented Dec 27, 2018

I got no bandwidth for this right now, I have to push it to the new year (likely CW2, CW3). If someone wants to review this first, please go on.

@lbernail


lbernail commented Jan 29, 2019

We'd be really interested in using informers too: in our mid-sized clusters (~1000 nodes, ~1500 services, ~1500 endpoints), generating endpoints takes ~5 min. Because we use headless services, the current code performs a lot of list-pods calls, which are heavy for the apiservers.

If I understand the PR correctly, it just uses Informers to trigger a sync loop on events. What would be even better is to use the informer cache to store services and pods (and, yes, possibly trigger a sync on update).

In addition, using the pod objects is very inefficient compared to using endpoints (in which case we could natively rely on pod Readiness instead of looking at the pod phase). If I understand correctly, we use pods for the publishHostIP case. Maybe we could use endpoints in general and fall back on pods for this specific use case?
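
To make the "informer cache" suggestion concrete, here is a minimal sketch of the pattern: event callbacks keep a local store current, and the sync loop reads from that store instead of listing objects from the API server on every run. Everything here is an assumption for illustration: `Service`, `Cache`, and the `OnAdd`/`OnUpdate`/`OnDelete` methods are hypothetical stand-ins, not the client-go types or the code in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Service is a hypothetical, trimmed-down stand-in for a Kubernetes Service.
type Service struct {
	Namespace string
	Name      string
	Hostname  string
}

// key builds the usual "namespace/name" identifier.
func key(s Service) string { return s.Namespace + "/" + s.Name }

// Cache is a local store kept current by watch events, so that reads do not
// require a round trip to the API server.
type Cache struct {
	mu    sync.RWMutex
	items map[string]Service
}

func NewCache() *Cache { return &Cache{items: make(map[string]Service)} }

// OnAdd and OnUpdate both store the latest version of the object.
func (c *Cache) OnAdd(s Service)    { c.set(s) }
func (c *Cache) OnUpdate(s Service) { c.set(s) }

// OnDelete removes the object identified by namespace and name.
func (c *Cache) OnDelete(s Service) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, key(s))
}

func (c *Cache) set(s Service) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key(s)] = s
}

// List returns a snapshot of the cached objects, sorted by key.
// It only touches local memory.
func (c *Cache) List() []Service {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]Service, 0, len(c.items))
	for _, s := range c.items {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return key(out[i]) < key(out[j]) })
	return out
}

func main() {
	c := NewCache()
	c.OnAdd(Service{Namespace: "default", Name: "web", Hostname: "web.example.org"})
	c.OnAdd(Service{Namespace: "default", Name: "api", Hostname: "api.example.org"})
	c.OnUpdate(Service{Namespace: "default", Name: "web", Hostname: "www.example.org"})
	c.OnDelete(Service{Namespace: "default", Name: "api"})

	// A sync run would iterate over the snapshot to generate endpoints.
	for _, s := range c.List() {
		fmt.Printf("%s -> %s\n", key(s), s.Hostname)
	}
}
```

The cost of each sync then depends on the size of the local snapshot rather than on the number of list calls, which is the point of the suggestion for clusters with many headless services.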

@jlamillan

Contributor Author

jlamillan commented Feb 1, 2019

> What would be even better is to use the informer cache to store services and pods (and, yes, possibly trigger a sync on update).

Good idea, I'll adjust the PR soon...

@lbernail


lbernail commented Feb 3, 2019

Great, I can help if needed (and will definitely test)

Another idea would be to have a resync period (to avoid getting into an inconsistent state over time) but to use a BoundedFrequencyRunner for the main sync function, so that it does not run if it already ran less than X seconds ago (kube-proxy works like that, for instance).

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch 2 times, most recently from 27ccc45 to 0fb60fb Feb 19, 2019
@jlamillan jlamillan changed the title Add --watchers flag to allow controller to respond automatically to Ingress or Service updates WIP: Add --watchers flag to allow controller to respond automatically to Ingress or Service updates Feb 21, 2019
@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch 3 times, most recently from 634a0d7 to 02a8045 Feb 21, 2019
@jlamillan

Contributor Author

jlamillan commented Feb 21, 2019

@lbernail, check out this branch, which updates ingress and service sources to use the informer cache instead of making API requests for ingresses and services/pods/nodes, respectively.

If that change looks good, I could submit it as a separate pull-request and then use this pull-request to add the ability to have informers trigger a sync loop on events via the --events flag.

i.e.

commit 1: Use k8s informer cache instead of active API server calls in ingress...
commit 2: Add --events flag to use k8s informers to automatically trigger sync loop…

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 09b3ab4 to 0387847 Jul 23, 2019
@jlamillan

Contributor Author

jlamillan commented Jul 23, 2019

/retest

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jul 23, 2019

@jlamillan: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 0387847 to 3376d6a Jul 23, 2019
@jlamillan

Contributor Author

jlamillan commented Jul 29, 2019

@Raffo /cc @njuettner @linki,

the branch is rebased. Let me know if there's anything else you'd like me to take care of.

@johanneswuerbach


johanneswuerbach commented Oct 8, 2019

We would also be interested in seeing this merged. Is there anything I can do to help?

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 801a1d5 to 29cc70e Oct 8, 2019
@jlamillan

Contributor Author

jlamillan commented Oct 8, 2019

Rebased and reassigning to @njuettner, since he's already reviewed this and @Raffo hasn't had the bandwidth.

@jlamillan

Contributor Author

jlamillan commented Oct 8, 2019

/unassign @Raffo

@jlamillan

Contributor Author

jlamillan commented Oct 8, 2019

/assign @njuettner

@njuettner

Member

njuettner commented Nov 19, 2019

@jlamillan do you mind fixing the conflicting files? Then I'm happy to merge it.

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 29cc70e to e5d8857 Nov 20, 2019
@k8s-ci-robot

Contributor

k8s-ci-robot commented Nov 20, 2019

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jlamillan
To complete the pull request process, please assign linki
You can assign the PR to them by writing /assign @linki in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jlamillan

Contributor Author

jlamillan commented Nov 20, 2019

Done @njuettner.

As a reminder: --events defaults to false. At some point, we should consider having it default to true.

@SEJeff


SEJeff commented Dec 5, 2019

@jlamillan can you fix the conflicts?

@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch 2 times, most recently from 7e735a0 to 29018c2 Dec 6, 2019
@jlamillan

Contributor Author

jlamillan commented Dec 6, 2019

> @jlamillan can you fix the conflicts?

Done.

@njuettner

Member

njuettner commented Jan 9, 2020

@jlamillan sorry again, do you mind fixing the conflicts again?

… on adds/updates/deletes for supported ingress and service sources.
@jlamillan jlamillan force-pushed the jlamillan:jlamillan/add_watch_flag branch from 29018c2 to c324019 Jan 10, 2020
@FrederikNS


FrederikNS commented Jan 13, 2020

@njuettner: It seems that this has been rebased and is ready to be merged. It would be a bit of a shame if @jlamillan had to rebase a 7th time.

@johanneswuerbach


johanneswuerbach commented Jan 15, 2020

Sorry for the nudge @SEJeff / @njuettner, but we are also really looking forward to seeing this merged.

9 participants