Pod get with watch loses events #41

Closed
hekike opened this issue Dec 6, 2016 · 10 comments

Comments

@hekike
Contributor

hekike commented Dec 6, 2016

I use watch in the following format:

kubernetes.api.ns.po.get({
  qs: {
    watch: true,
    labelSelector: PodManager.serviceNameLabel
  }
})

After a while it stops receiving events.
Do you have any idea why?

Thanks!

@silasbw
Contributor

silasbw commented Dec 6, 2016

I haven't seen this. I'll try to repro later tonight. Thanks for the report @hekike .

@silasbw
Contributor

silasbw commented Dec 7, 2016

@hekike I'm having trouble reproducing this. What version is your kube API server? Can you post the snippet that comes after the one you posted above, e.g., something like: stream.on('data', stuff => console.log(stuff));
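To narrow down why a watch goes quiet, it can help to log how the stream terminates. The following is only a sketch, assuming the object returned by the get call behaves like a Node.js readable stream (as the stream.on('data', ...) suggestion implies). `monitorWatch` is a hypothetical helper, not part of kubernetes-client, and the `EventEmitter` at the end is a stand-in for a real watch stream.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: records what happens to a watch stream so that a
// silent stop can be told apart from a normal end or a socket error.
function monitorWatch(stream, log = console.log) {
  const startedAt = Date.now();
  const state = { events: 0, ended: false, error: null };

  stream.on('data', () => {
    state.events += 1;
  });
  stream.on('end', () => {
    state.ended = true;
    const seconds = Math.round((Date.now() - startedAt) / 1000);
    log(`watch ended after ${seconds}s and ${state.events} events`);
  });
  stream.on('error', (err) => {
    state.error = err;
    log(`watch failed: ${err.message}`);
  });
  return state;
}

// Stand-in for a real watch stream, for demonstration only.
const fake = new EventEmitter();
const state = monitorWatch(fake, () => {});
fake.emit('data', { type: 'ADDED' });
fake.emit('end');
```

If the logged duration lines up with a server-side timeout, the server is closing the connection, not the client.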

@jcrugzz
Collaborator

jcrugzz commented Dec 7, 2016

@silasbw my only thought is that we are hitting an issue with keep-alive not being enabled by default for the HTTP agent. The TCP socket might just be closing from underneath us?

@silasbw
Contributor

silasbw commented Dec 8, 2016

Hmm, maybe. I wouldn't expect keep-alive to matter because we're only sending a single request (the GET in this example) and receiving one (long) response. I'd expect keep-alive to make a difference if we were trying to issue multiple requests over the same HTTP connection (1).

@hekike one possibility is that the kube apiserver is intentionally closing the HTTP connection according to --min-request-timeout (2). Are you using the default value there of 1800 seconds? Are your watch connections lasting at least that long?

@hekike
Contributor Author

hekike commented Dec 8, 2016

@silasbw Yes, I'm using the default one, and I would like to keep the watch open for much longer. Basically forever. Shouldn't we add reconnect logic to watch? But in that case every reconnect would fire ADD events again. What do you think? Or should I simply fetch with GET periodically and not overcomplicate things?

@silasbw
Contributor

silasbw commented Dec 9, 2016

@hekike we should consider re-connecting, but I'm not sure if I know a good solution.

If we wanted to implement re-connection logic, what would be a good general-purpose API? If we want something that means "give me all the events via a stream", I worry it would be complicated to support the "all" part. Isn't there this race?

  1. connect, get stream
  2. events created, read from stream
  3. disconnection
  4. event created
  5. reconnected (but miss event created in 4.)

Implementing something that ensures we don't miss the event in step 4 seems challenging. If we communicated re-connection attempts, application-specific logic could deal with potentially losing events, but then the API is "give me most of the events (via a stream?) and let me know when I might have missed some". At that point, I'm not sure there's much benefit.
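One way the race in steps 3 to 5 could be narrowed is to resume from the last version seen. The Kubernetes watch API accepts a `resourceVersion` query parameter, and each watch event carries the object's `metadata.resourceVersion`. The sketch below assumes a caller-supplied `openWatch(resourceVersion)` function (hypothetical, not part of kubernetes-client) that starts a watch, for example by passing `resourceVersion` next to `watch: true` in `qs`, and returns a stream of parsed events. It has no backoff and does not handle the server rejecting a version that is too old, in which case a full re-list would be needed.

```javascript
const { EventEmitter } = require('events');

// openWatch is a hypothetical function supplied by the caller: given a
// resourceVersion (or undefined), it starts a watch request and returns an
// EventEmitter-like stream of parsed watch events.
function resumableWatch(openWatch, onEvent, maxReconnects = Infinity) {
  let lastVersion;
  let reconnects = 0;

  function connect() {
    const stream = openWatch(lastVersion);
    stream.on('data', (event) => {
      const meta = event.object && event.object.metadata;
      // Remember how far we have read so a reconnect can resume from here.
      if (meta && meta.resourceVersion) lastVersion = meta.resourceVersion;
      onEvent(event);
    });
    stream.on('end', () => {
      if (reconnects < maxReconnects) {
        reconnects += 1;
        connect();
      }
    });
  }

  connect();
}

// Demonstration with fake streams instead of a real API server.
const opened = [];
const seen = [];
function fakeOpenWatch(version) {
  const stream = new EventEmitter();
  opened.push({ version, stream });
  return stream;
}

resumableWatch(fakeOpenWatch, (event) => seen.push(event.type), 1);
opened[0].stream.emit('data', {
  type: 'ADDED',
  object: { metadata: { name: 'web-1', resourceVersion: '7' } },
});
opened[0].stream.emit('end'); // triggers one reconnect, resuming at '7'
```

With this shape, an event created during the disconnection is delivered after the reconnect as long as the server still retains that version.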

Thoughts?

@hekike
Contributor Author

hekike commented Dec 10, 2016

Yes, you are right, it's not an easy one. But the current "watch until you can" is also not very good. It's misleading.

Would it be crazy to create two watch streams in the background (reconnecting them frequently, but staggered so they never reconnect at the same time) and always switch to the live one? Something like the concept of blue-green deployments? It's still not bulletproof, but it would probably solve the connection timeout issues.

@silasbw
Contributor

silasbw commented Dec 13, 2016

I like the blue-green approach, but I think it would be challenging to get right. How would we synchronize the two streams? With pods, for example, would we ignore the first N ADD events that happen when we open a new watch stream, with the expectation that doing so would help us synchronize the old and new streams? How would we know what N is? What does the Kubernetes API guarantee about events sent to different streams? In practice they're comparable, but is that something the API is committed to?
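One possible alternative to counting the first N ADD events is to deduplicate by object identity and version. This is only a sketch and rests on an assumption the questions above leave open: that both streams report the same `metadata.uid` and `metadata.resourceVersion` for the same change. `createMergedWatch` is a hypothetical helper, and the `EventEmitter` instances stand in for real watch streams.

```javascript
const { EventEmitter } = require('events');

// Forwards events from any number of overlapping watch streams to one
// handler, dropping events already delivered by another stream.
function createMergedWatch(onEvent) {
  const seen = new Set(); // grows without bound in this sketch

  return function attach(stream) {
    stream.on('data', (event) => {
      const { uid, resourceVersion } = event.object.metadata;
      const key = `${uid}@${resourceVersion}`;
      if (seen.has(key)) return; // already delivered via the other stream
      seen.add(key);
      onEvent(event);
    });
  };
}

// Demonstration with two fake streams ("blue" and "green").
const delivered = [];
const attach = createMergedWatch((event) => delivered.push(event.type));
const blue = new EventEmitter();
const green = new EventEmitter();
attach(blue);
attach(green);

const podEvent = (type, resourceVersion) => ({
  type,
  object: { metadata: { uid: 'abc', resourceVersion } },
});

blue.emit('data', podEvent('ADDED', '10'));
green.emit('data', podEvent('ADDED', '10')); // replayed when green connects
green.emit('data', podEvent('MODIFIED', '11'));
blue.emit('data', podEvent('MODIFIED', '11')); // same change seen on blue
```

A real implementation would need to bound the `seen` set and decide what to do if the assumption about matching versions does not hold.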

Can you provide a more complete example of what you're implementing? I wonder if there's a higher-level abstraction we could provide that would be useful for your implementation. For example, if you'd like to cache the state of objects locally, or be notified when objects change, we could write an abstraction on top of watching that does automatic reconnects and provides a different API (e.g., automatically updates a cache to read from, or emits an event when an object changes).
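As a rough illustration of the cache idea, the sketch below applies watch events to a local `Map`. It assumes events have the `{ type, object }` shape that the Kubernetes watch API emits, with `type` being ADDED, MODIFIED, or DELETED. `applyWatchEvent` and the `pod` factory are hypothetical names for this example only.

```javascript
// Applies one watch event to a local cache keyed by "namespace/name".
function applyWatchEvent(cache, event) {
  const { namespace, name } = event.object.metadata;
  const key = `${namespace}/${name}`;
  switch (event.type) {
    case 'ADDED':
    case 'MODIFIED':
      cache.set(key, event.object);
      break;
    case 'DELETED':
      cache.delete(key);
      break;
    default:
      // Other event types (for example ERROR) are left to the caller.
      break;
  }
  return cache;
}

// Demonstration with hand-written events.
const pod = (name, phase) => ({
  metadata: { namespace: 'default', name },
  status: { phase },
});

const cache = new Map();
applyWatchEvent(cache, { type: 'ADDED', object: pod('web-1', 'Pending') });
applyWatchEvent(cache, { type: 'ADDED', object: pod('web-1', 'Pending') }); // replayed after a reconnect
applyWatchEvent(cache, { type: 'MODIFIED', object: pod('web-1', 'Running') });
applyWatchEvent(cache, { type: 'ADDED', object: pod('web-2', 'Pending') });
applyWatchEvent(cache, { type: 'DELETED', object: pod('web-2', 'Pending') });
```

Because ADDED and MODIFIED both overwrite the entry, the ADD events replayed after a reconnect are harmless for a cache like this. Objects deleted while disconnected would still need a re-list to be noticed.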

@hekike
Contributor Author

hekike commented Dec 19, 2016

You are right, these are hard questions. I'm still thinking about a proper solution. In my use case I solved it by re-fetching the pod endpoint periodically and maintaining the state locally. But that's not really the point of the watch API. For me it was enough to know the running pods in the namespace instead of knowing the exact changes.

We can figure out a complex solution for this, but it's also up to you what the scope of this library is. Maybe it would be enough to add a notice to the README that a watch doesn't live forever and should be used with care. What do you think?
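The periodic re-fetch approach can be sketched roughly as follows. `listPodNames` is a hypothetical async function that performs a plain GET (no watch) and resolves to an array of pod names, and the 30-second default interval is an arbitrary choice for illustration.

```javascript
// Compares two snapshots of pod names and reports what changed between polls.
function diffSnapshots(previous, current) {
  const prev = new Set(previous);
  const curr = new Set(current);
  return {
    added: current.filter((name) => !prev.has(name)),
    removed: previous.filter((name) => !curr.has(name)),
  };
}

// listPodNames is a hypothetical async function that performs a plain GET
// (no watch) and resolves to an array of pod names.
function pollPods(listPodNames, onChange, intervalMs = 30000) {
  let known = [];
  const timer = setInterval(async () => {
    try {
      const latest = await listPodNames();
      const diff = diffSnapshots(known, latest);
      known = latest;
      if (diff.added.length || diff.removed.length) onChange(diff);
    } catch (err) {
      // Keep the previous snapshot and try again on the next tick.
      console.error('poll failed:', err.message);
    }
  }, intervalMs);
  return () => clearInterval(timer); // call the returned function to stop polling
}

// The diff step on its own:
const change = diffSnapshots(['web-1', 'web-2'], ['web-2', 'web-3']);
```

This trades latency for simplicity: changes that happen and revert between two polls go unnoticed, which is acceptable when only the current set of pods matters.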

@silasbw
Contributor

silasbw commented Dec 20, 2016

Good suggestion about noting this in the README.md:

#51

Another thing we could do is add some examples: link to real applications that use kubernetes-client and add some toy examples to this repo. A toy example illustrating useful ways to leverage watching (and handle disconnects) could be helpful.
