Watching third party resources gives client error: "Unable to decode an event from the watch stream: got invalid watch event type:" #26003

Open
mjg59 opened this Issue May 21, 2016 · 10 comments

mjg59 commented May 21, 2016

Creating a third party resource type from https://gist.github.com/mjg59/b24995501a7064b0b2e1768b7fe71426 , adding an object (https://gist.github.com/mjg59/98b85a924fddecbd08018f1e2f60e17d) and then running https://gist.github.com/mjg59/3c4a2ef23bc0844342774d1ed71f46fc gives a stream of:

New policy
E0521 00:33:16.589647 31839 streamwatcher.go:109] Unable to decode an event from the watch stream: got invalid watch event type:
W0521 00:33:16.589717 31839 reflector.go:343] k8s.io/kubernetes/cmd/watchtest/watchtest.go:70: watch of *api.Node ended with: very short watch
Updated policy
E0521 00:33:17.592341 31839 streamwatcher.go:109] Unable to decode an event from the watch stream: got invalid watch event type:
W0521 00:33:17.592365 31839 reflector.go:343] k8s.io/kubernetes/cmd/watchtest/watchtest.go:70: watch of *api.Node ended with: very short watch
Updated policy
E0521 00:33:18.595440 31839 streamwatcher.go:109] Unable to decode an event from the watch stream: got invalid watch event type:
W0521 00:33:18.595475 31839 reflector.go:343] k8s.io/kubernetes/cmd/watchtest/watchtest.go:70: watch of *api.Node ended with: very short watch

repeating forever. This is with current master and sitepod/kubernetes@d0e9c02 applied.

Contributor

borismattijssen commented May 23, 2016

I think that's because the dynamic client returns an UnstructuredList with Unstructured items when listing. These do not match api.Node.
@sjenning is working on this.
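
For readers following along, here is a minimal sketch of how a watch frame can be decoded into a generic map, roughly the way Unstructured holds arbitrary objects, instead of into a fixed type such as api.Node. It uses only the Go standard library and is not the client's actual code: watchEnvelope, decodeEvent and the Policy kind are hypothetical names chosen for illustration. The empty text after the colon in the logged error suggests the decoded type field was empty, which is what happens when a frame is not wrapped in a type/object envelope; that reading is an assumption, not a confirmed diagnosis.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// watchEnvelope mirrors the JSON framing of a watch event:
// an event type plus an opaque object. (Hypothetical name.)
type watchEnvelope struct {
	Type   string          `json:"type"`
	Object json.RawMessage `json:"object"`
}

// decodeEvent parses one frame and returns the event type together with
// the object as a generic map, so no compiled-in Go type is required.
func decodeEvent(frame []byte) (string, map[string]interface{}, error) {
	var env watchEnvelope
	if err := json.Unmarshal(frame, &env); err != nil {
		return "", nil, fmt.Errorf("unable to decode frame: %v", err)
	}
	switch env.Type {
	case "ADDED", "MODIFIED", "DELETED", "ERROR":
		// Known event types fall through to object decoding.
	default:
		// A frame without a "type" field leaves env.Type empty.
		return "", nil, fmt.Errorf("got invalid watch event type: %v", env.Type)
	}
	obj := map[string]interface{}{}
	if err := json.Unmarshal(env.Object, &obj); err != nil {
		return "", nil, fmt.Errorf("unable to decode object: %v", err)
	}
	return env.Type, obj, nil
}

func main() {
	// "Policy" is an assumed kind, used only for this example.
	wrapped := []byte(`{"type":"ADDED","object":{"kind":"Policy","metadata":{"name":"example"}}}`)
	bare := []byte(`{"kind":"Policy","metadata":{"name":"example"}}`)

	for _, frame := range [][]byte{wrapped, bare} {
		typ, obj, err := decodeEvent(frame)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(typ, obj["kind"])
	}
}
```

The wrapped frame decodes regardless of its kind, while the bare object is rejected with an empty event type.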

Contributor

sjenning commented Jun 1, 2016

I'm not working on this ATM. However, this is a big limitation of the dynamic client right now so if anyone wants to take it on, please feel free.

Contributor

fabiand commented Nov 4, 2016

We are seeing this as well.

wfarr commented Nov 4, 2016

There's a relevant issue in kubernetes/client-go#8 where, thanks to @caesarxuchao, we have some work-arounds that have been working well for me thus far (there are also some examples of working code linked in that issue).

Contributor

krmayankk commented Jan 24, 2017

I am also seeing that watching a TPR causes the TPR to always change, even when no update has been performed. Is this a known issue? I am seeing this on 1.3.6.

Contributor

nilebox commented Mar 9, 2017

@wfarr @caesarxuchao do you know what the exact fix or workaround for this problem is? I followed the issue client-go#8 but can't find the workaround for this periodic error in the log.

Contributor

nilebox commented Mar 9, 2017

@caesarxuchao it would be nice if you added an example of watching a TPR. There are many non-obvious issues along the way.

For example, you can't use cache.NewListWatchFromClient because it relies on the global codec, which is unaware of custom TPR types (see the workaround in my repo and the issue comment).

The "Unable to decode an event from the watch stream..." error could be another one.

Such issues are hard to cover in the docs (since they are deep technical details), but having a proper example would be really useful.

Member

ash2k commented Mar 10, 2017

It seems I have found the cause of the issue and the fix.

{
"kind":"Status",
"apiVersion":"v1",
"metadata":{},
"status":"Failure",
"message":"401: The event in requested index is outdated and cleared (the requested history has been cleared [5918/5894]) [6917]",
"reason":"Expired",
"code":410
}

This is the event that a watch receives when it cannot be re-established. The custom Scheme object I'm using cannot deserialize it because it does not know anything about the Status type. So the fix is obvious: register the type. It turns out there is a method to register all known types, including unversioned types. Take a look at atlassian/smith#19

Can someone please confirm this is the right thing to do?
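
To illustrate the failure mode described in this comment, here is a toy model written with only the Go standard library. It is not the real Scheme API: registry, its decode method and the status struct are hypothetical stand-ins, and the Policy kind is assumed. The sketch only shows why a decoder that resolves objects by kind fails on a Status payload until that kind is registered.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// registry is a toy stand-in for a scheme: it maps a kind to a constructor.
type registry map[string]func() interface{}

// status holds a subset of the Status fields shown in the event above.
type status struct {
	Kind    string `json:"kind"`
	Status  string `json:"status"`
	Message string `json:"message"`
	Reason  string `json:"reason"`
	Code    int    `json:"code"`
}

// decode reads the payload's kind, then unmarshals into the registered type.
func (r registry) decode(data []byte) (interface{}, error) {
	var meta struct {
		Kind string `json:"kind"`
	}
	if err := json.Unmarshal(data, &meta); err != nil {
		return nil, err
	}
	newObj, ok := r[meta.Kind]
	if !ok {
		return nil, fmt.Errorf("no kind %q is registered", meta.Kind)
	}
	obj := newObj() // a pointer, so json.Unmarshal can fill it in
	if err := json.Unmarshal(data, obj); err != nil {
		return nil, err
	}
	return obj, nil
}

func main() {
	payload := []byte(`{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","reason":"Expired","code":410}`)

	// Only the custom kind is registered, so the Status payload is rejected.
	r := registry{"Policy": func() interface{} { return &map[string]interface{}{} }}
	if _, err := r.decode(payload); err != nil {
		fmt.Println("before registering Status:", err)
	}

	// After registering Status, the same payload decodes.
	r["Status"] = func() interface{} { return &status{} }
	obj, err := r.decode(payload)
	if err != nil {
		fmt.Println("after registering Status:", err)
		return
	}
	st := obj.(*status)
	fmt.Println("after registering Status:", st.Reason, st.Code)
}
```

Once the payload decodes, the caller can see the Expired reason and code 410 and react to them, rather than logging a decode error on every retry.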

Member

ash2k commented Mar 10, 2017

Hm, it seems the problem with the original code in this report is slightly different, but the cause is the same. In my case a new client is used for a specific type, with a new Scheme object. To fix the code in this report, a new Scheme should be created and used for the unversioned objects that the dynamic client uses. This is my understanding, but I might be wrong.
Could someone who actually knows how kube works please tell us how to use this client?

@enisoc enisoc added this to Backlog in CustomResourceDefinition Jul 13, 2017

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale
