Fix events plugin #48

maimaisie · 2019-06-19T00:06:41Z

After discovering the issue with the k8s watch and list APIs regarding the resource version parameter, we redesigned the plugin flow. Now the main thread acts as a monitor that loops until the plugin is told to shut down, updating the configmap and preemptively recreating the watch thread periodically. In more detail, it does the following:

Pull the latest resource version (RV) from the List API and store it in the configmap every configmap_update_interval_seconds (10s by default).
Check whether the watch thread is running. If it is not, create the watch thread using the resource version from the configmap, so that it resumes from an earlier point in time. This happens if
1. this is the first iteration (the plugin has just started and the watch thread does not exist yet)
2. the watch thread somehow died (the thread is not alive)
Check whether watch_interval_seconds have elapsed since we last recreated the watch thread. If so, first pull the latest resource version, close the watch thread by calling the kubeclient method that closes the HTTP connection, then create a new watch stream using the RV we just pulled.
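The three checks above can be sketched as a single loop iteration. This is only an illustration under stated assumptions, not the plugin's actual code: the class name MonitorSketch, the tick method, and the injected list_rv and start_watch callables are all hypothetical, and the configmap is modeled as a plain instance variable. Only configmap_update_interval_seconds (with its 10s default) and watch_interval_seconds come from the description.

```ruby
# Hypothetical sketch of one monitor-loop iteration; names are illustrative only.
class MonitorSketch
  attr_reader :configmap_rv, :watch_started_with

  # list_rv:     callable returning the latest resource version (stands in for the List API)
  # start_watch: callable taking a resource version and returning an object that
  #              responds to alive? and finish (stands in for the watch thread/stream)
  def initialize(list_rv:, start_watch:, watch_interval_seconds:,
                 configmap_update_interval_seconds: 10)
    @list_rv = list_rv
    @start_watch = start_watch
    @watch_interval = watch_interval_seconds
    @update_interval = configmap_update_interval_seconds
    @configmap_rv = nil       # assumption: stands in for the value stored in the configmap
    @last_update_at = nil
    @last_watch_at = nil
    @watcher = nil
    @watch_started_with = []  # records the RV used for each watch, for inspection
  end

  # Runs one iteration of the monitor loop at time `now` (in seconds).
  def tick(now)
    # Periodically store the latest resource version.
    if @last_update_at.nil? || now - @last_update_at >= @update_interval
      @configmap_rv = @list_rv.call
      @last_update_at = now
    end

    if @watcher.nil? || !@watcher.alive?
      # First iteration or dead watcher: resume from the stored resource version.
      restart_watch(@configmap_rv, now)
    elsif now - @last_watch_at >= @watch_interval
      # Interval elapsed: pull a fresh RV first, then close the old stream, then restart.
      fresh_rv = @list_rv.call
      @watcher.finish
      restart_watch(fresh_rv, now)
    end
  end

  private

  def restart_watch(resource_version, now)
    @watcher = @start_watch.call(resource_version)
    @last_watch_at = now
    @watch_started_with << resource_version
  end
end
```

Passing the current time into tick keeps the sketch testable without sleeping; a real loop would call it with a monotonic clock reading and sleep between iterations.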

We listen for SIGTERM so that on graceful shutdown we can break out of the while loop in start_monitor (apparently, if the while loop keeps running, the shutdown sequence is never triggered).
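A minimal sketch of the flag-based pattern this describes, assuming the signal handler only records that shutdown was requested and the loop checks that flag on each pass. The ShutdownFlag class, its install and set? methods, and this simplified start_monitor signature are hypothetical; only the start_monitor name and the use of SIGTERM come from the description.

```ruby
# Hypothetical sketch: a trap handler sets a flag that the monitor loop polls.
class ShutdownFlag
  def initialize
    @shutting_down = false
  end

  # Installs a handler for the given signal (TERM by default).
  # The handler only flips a flag, keeping work inside the trap context minimal.
  def install(signal = 'TERM')
    Signal.trap(signal) { @shutting_down = true }
  end

  def set?
    @shutting_down
  end
end

# Simplified stand-in for the monitor loop: runs until the flag is set
# and returns how many iterations it completed.
def start_monitor(flag, poll_seconds: 0.01)
  iterations = 0
  until flag.set?
    iterations += 1 # a real loop would update the configmap and check the watcher here
    sleep poll_seconds
  end
  iterations
end
```

In use, the flag would be installed once at startup (flag = ShutdownFlag.new; flag.install) before start_monitor(flag) is called, so that a SIGTERM ends the loop and lets the rest of the shutdown sequence run.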

FYI @rvmiller89 @lei-sumo @frankreno

rvmiller89 · 2019-06-19T16:59:58Z

fluent-plugin-events/lib/fluent/plugin/in_events.rb

-      def close
-        @watcher.each &:finish
+      def stop
+        log.debug "Clean up before stopping completely"


Can we be sure to document in the README how to enable debug logs for a fluentd plugin, so that the customer or support team can enable these debug logs as needed? Is there a way to enable debug logs for only a specific plugin?

Yes, in the README Sam added troubleshooting steps that include turning on debug logs for metrics. We'll make them generic to all three data types afterwards.

👍 Thanks

Thanks, I've just updated the plugin with steps to enable debug logs for a specific plugin

samjsong

Overall LGTM, one nit. Question though, what is the benefit of periodically recreating the watcher thread?

maimaisie · 2019-06-19T18:02:24Z

@samjsong Good question. We found that if we don't specify the timeout parameter on the watch stream, the API server closes it anyway after around 50 minutes, even if events are continuously being generated. So now we specify a timeout on the watch stream and preemptively close it before the timeout, which lets us pull the latest RV before closing. If we relied only on the timeout and let the API server close the stream, we wouldn't have a way to do anything before it gets closed. (We could keep a timer ourselves, but we don't know whether it would be in sync with the k8s API server.)

maimaisie added 3 commits June 14, 2019 15:52

re-architect

d009065

Fix monitor thread and data loss issue on graceful shutdown

709a6a4

Change info to debug log

676ff06

maimaisie requested review from yuting-liu and samjsong June 19, 2019 00:06

Use saved rv if watch thread gets closed from the server

234ed30

rvmiller89 reviewed Jun 19, 2019

samjsong reviewed Jun 19, 2019

yuting-liu reviewed Jun 19, 2019

samjsong reviewed Jun 19, 2019

Rename maps to map

23f0ad0

yuting-liu approved these changes Jun 19, 2019

samjsong approved these changes Jun 19, 2019

maimaisie merged commit d32a2cb into master Jun 19, 2019

maimaisie deleted the maisie-fixes branch June 19, 2019 22:42

psaia pushed a commit to psaia/sumologic-kubernetes-collection that referenced this pull request May 25, 2021

Update sync tool to support adding/removing individuals (SumoLogic#48)

70cf2b0

* make a true cli * update messages * rename env vars * remove codified defaults * remove debug line
