Error while watching pods: too old resource version with v2.4.5 #226

Closed
raja-gola opened this issue Apr 8, 2020 · 13 comments · Fixed by kubernetes/kubernetes#92974

Comments

@raja-gola

Error while watching pods: too old resource version: 23630883 (23632006) (RuntimeError)

complete stacktrace

#<Thread:0x00007f53f192fb70@/usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/filter_kubernetes_metadata.rb:274 run> terminated with exception (report_on_exception is true):
/usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:43:in `rescue in set_up_pod_thread': undefined method `<' for nil:NilClass (NoMethodError)
        from /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:38:in `set_up_pod_thread'
        from /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/filter_kubernetes_metadata.rb:274:in `block in configure'
/usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:133:in `block in process_pod_watcher_notices': Error while watching pods: too old resource version: 23630883 (23632006) (RuntimeError)
        from /usr/local/bundle/gems/kubeclient-4.6.0/lib/kubeclient/watch_stream.rb:28:in `block in each'
        from /usr/local/bundle/gems/http-4.4.1/lib/http/response/body.rb:37:in `each'
        from /usr/local/bundle/gems/kubeclient-4.6.0/lib/kubeclient/watch_stream.rb:25:in `each'
        from /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:110:in `process_pod_watcher_notices'
        from /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:40:in `set_up_pod_thread'
        from /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/filter_kubernetes_metadata.rb:274:in `block in configure'
Unexpected error undefined method `<' for nil:NilClass
  /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:43:in `rescue in set_up_pod_thread'
  /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/kubernetes_metadata_watch_pods.rb:38:in `set_up_pod_thread'
  /usr/local/bundle/gems/fluent-plugin-kubernetes_metadata_filter-2.4.5/lib/fluent/plugin/filter_kubernetes_metadata.rb:274:in `block in configure'
@cjdmax

cjdmax commented Apr 10, 2020

similar error with 2.4.5 on 1.15

@raja-gola
Author

I downgraded the plugin to v2.4.1 in the fluent-operator Gemfile and haven't seen any of these errors for the past 2 days. BTW, this is on k8s version 1.16.3, so it's definitely an issue with 2.4.5.

@smo921

smo921 commented May 15, 2020

It looks like this is addressed in 2.4.6.

v2.4.5...v2.4.6#diff-1ef0b670f3d0a49f0c40eff0977bd52dR32

@lechen26

We have the exact same issue with GKE 1.16.8-gke.15.

@Ghazgkull
Contributor

I'm seeing the same error with v2.4.6

@Ghazgkull
Contributor

Ghazgkull commented Jul 8, 2020

Digging into the "too old resource version" error a bit, I believe the problem lies in the handling of type ERROR watch responses here:

The Kubernetes API concepts documentation explains this case and how clients should handle it: https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes

Here's the relevant excerpt:

A given Kubernetes server will only preserve a historical list of changes for a limited time. Clusters using etcd3 preserve changes in the last 5 minutes by default. When the requested watch operations fail because the historical version of that resource is not available, clients must handle the case by recognizing the status code 410 Gone, clearing their local cache, performing a list operation, and starting the watch from the resourceVersion returned by that new list operation. Most client libraries offer some form of standard tool for this logic. (In Go this is called a Reflector and is located in the k8s.io/client-go/cache package.)
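
To make the excerpt concrete, here is a minimal sketch of the list-then-watch loop it describes: on a 410, clear the cache, list again, and restart the watch from the new resourceVersion. `FakeClient`, `list_pods`, `watch_pods` and `sync_pods` are hypothetical names invented for this illustration. They are not the kubeclient API or the plugin's code, and the notices are assumed to be plain hashes.

```ruby
# Hypothetical stand-in for an API client (NOT the kubeclient API).
# The first watch fails with 410 Gone; the watch after a re-list succeeds.
class FakeClient
  attr_reader :list_calls

  def initialize
    @list_calls = 0
  end

  # Returns the current pods plus the resourceVersion of this list.
  def list_pods
    @list_calls += 1
    { 'resourceVersion' => (100 * @list_calls).to_s,
      'items' => ["pod-#{@list_calls}"] }
  end

  # Yields watch notices for changes after resource_version.
  def watch_pods(resource_version)
    if resource_version == '100'
      yield({ 'type' => 'ERROR',
              'object' => { 'code' => 410, 'message' => 'too old resource version' } })
    else
      yield({ 'type' => 'ADDED', 'object' => { 'metadata' => { 'name' => 'pod-new' } } })
    end
  end
end

# List, then watch; on 410 Gone clear the cache, list again, and restart
# the watch from the resourceVersion returned by the new list.
def sync_pods(client, max_relists: 3)
  cache = {}
  max_relists.times do
    cache.clear
    list = client.list_pods
    list['items'].each { |name| cache[name] = true }

    gone = false
    client.watch_pods(list['resourceVersion']) do |notice|
      if notice['type'] == 'ERROR' && notice.dig('object', 'code') == 410
        gone = true
        break # stop this watch; the outer loop re-lists
      end

      name = notice.dig('object', 'metadata', 'name')
      case notice['type']
      when 'ADDED', 'MODIFIED' then cache[name] = true
      when 'DELETED' then cache.delete(name)
      end
    end
    return cache unless gone
  end
  raise 'gave up after repeated 410 Gone responses'
end

p sync_pods(FakeClient.new).keys # prints ["pod-2", "pod-new"]
```

A real watch would block until the stream closes. The fake ends after one notice so the example terminates.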

@Ghazgkull
Contributor

@jcantrill I suspect that the fix is to add special handling for status code 410 Gone in the ERROR block, and to handle this case similarly to the way the plugin handles DELETE.

@Ghazgkull
Contributor

The kubeclient library that this plugin uses to perform the watch also explains: "Whenever you ask for a specific version, you must be prepared for an 410 "Gone" error if the server no longer recognizes it."

See: https://github.com/abonas/kubeclient#starting-watch-version

@Ghazgkull
Contributor

Ghazgkull commented Jul 10, 2020

According to the good folks on the kubeclient project, the way to check for the 410 status in the notice is:

notice['object']['code'] == 410
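
For illustration, that check could be wrapped in a small nil-safe predicate so that an ERROR notice without an object does not raise. This is only a sketch: `gone_notice?` is a hypothetical helper name, not something from the plugin or kubeclient, and the notice is assumed to be a Hash-like object indexed the way the expression above indexes it.

```ruby
# Returns true when a watch notice is an ERROR carrying status code 410 (Gone).
# `notice` is assumed to be Hash-like, as in notice['object']['code'].
def gone_notice?(notice)
  return false unless notice['type'] == 'ERROR'

  object = notice['object']
  !object.nil? && object['code'] == 410
end

# Example notice shaped like the error reported in this issue.
expired = {
  'type' => 'ERROR',
  'object' => { 'code' => 410, 'message' => 'too old resource version: 23630883 (23632006)' }
}
puts gone_notice?(expired) # prints true
```

A caller could use this to tell "re-list and restart the watch" apart from other ERROR notices that should still be raised.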

@Ghazgkull
Contributor

@jcantrill I've submitted a PR which I believe should solve this problem based on my learnings explained in the comments above.

One question for you: Should I include a minor version bump in my PR if I'd like this to go into a new release? Or do the maintainers handle that process yourselves?

@jcantrill
Contributor

> @jcantrill I've submitted a PR which I believe should solve this problem based on my learnings explained in the comments above.
>
> One question for you: Should I include a minor version bump in my PR if I'd like this to go into a new release? Or do the maintainers handle that process yourselves?

We'll bump the version when we publish

@jcantrill
Contributor

https://rubygems.org/gems/fluent-plugin-kubernetes_metadata_filter/versions/2.5.1

@Ghazgkull
Contributor

Awesome. Thanks, @jcantrill. Active maintainers like you are a treasure.
