
Kubernetes: cleanup channel handling #932

Merged: vdemeester merged 2 commits into traefik:master from yvespp:master on Dec 8, 2016

Conversation

@yvespp (Contributor) commented Dec 3, 2016

This pull request includes #929 and cleans up the channel handling of the Kubernetes provider:

  • Only use one channel for all Kubernetes watches
  • Re-use stop channel from the provider
  • Skip all events that have already been handled by the provider

@philk (Contributor) commented Dec 5, 2016

Hah, I'd started to work on a similar idea for cleaning up the channels, right down to eventHandlerFunc 😆

This looks great, I'll close my PR.

@yvespp (Author) commented Dec 5, 2016

Feel free to improve it! :-)
I also tried to get rid of the goroutine in the client and use only one select, but then it deadlocks almost immediately on HasSynced().

Maybe there is a nicer way to check HasSynced() before passing the event on?

@yvespp (Author) commented Dec 6, 2016

@emilevauge can you have a look at this?
People are using master and running into the deadlock (#929).

@emilevauge (Member) left a comment

LGTM
Thanks a lot @yvespp & @philk 🎉
/cc @errm

@macat commented Dec 6, 2016

I just deployed this PR to our cluster and it does fix the issue! Thanks a lot! @yvespp

@yvespp (Author) commented Dec 6, 2016

@philk actually fixed it! 😃

@macat commented Dec 6, 2016

Well, thanks to @philk!

@philk (Contributor) commented Dec 6, 2016

Team effort 😉

@emilevauge (Member) commented

@yvespp Could you rebase please?
Hey @containous/traefik, we need a LGTM here :)

philk and others added 2 commits December 7, 2016 20:12
On a reasonably sized cluster:
63 nodes
87 services
90 endpoints

The initialization of the k8s provider would hang.

I tracked this down to the ResourceEventHandlerFuncs. Once you reach the channel buffer size (10), the k8s Informer gets stuck: you can't read or write messages to the channel anymore. I think this is probably a lock issue somewhere in k8s, but the more reasonable solution for the traefik use case is to just drop events when the queue is full. Since we only use the events for signalling, not their content, dropping an event doesn't matter.
  • Only use one channel for all watches
  • Re-use stop channel from the provider
  • Skip events that have already been handled by the provider (builds on 007f8cc)
@vdemeester (Contributor) left a comment

LGTM 🐯

@vdemeester vdemeester merged commit c500873 into traefik:master Dec 8, 2016
@ldez ldez added this to the 1.2 milestone Oct 1, 2017