Fix events plugin #48
Conversation
def close
  @watcher.each(&:finish)

def stop
  log.debug "Clean up before stopping completely"
can we be sure to document in the README how to enable debug logs for a fluentd plugin (is there a way to only enable debug logs for a specific plugin?) so that the customer/support team can enable these debug logs as needed?
Yes, in the README Sam put in troubleshooting steps that include turning on debug logs for metrics. We'll make that generic to all three data types afterwards.
👍 Thanks
Thanks, I've just updated the plugin with steps to enable debug logs for a specific plugin
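For reference, fluentd supports a per-plugin @log_level parameter, so debug logging can be scoped to a single plugin without raising the global log level. A minimal config sketch (the @type value here is an assumed plugin type name, not necessarily this plugin's registered name):

```
<source>
  @type events      # assumed type name for the events input plugin
  @log_level debug  # enables debug logs for this plugin section only
</source>
```

The global log level set via the -v/-q flags or the <system> section stays unchanged; only this section emits debug output.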
Overall LGTM, one nit. Question though, what is the benefit of periodically recreating the watcher thread?
@samjsong Good question. We found that if we don't specify the timeout parameter on the watch stream, the API server will close it anyway after around 50 minutes, even if events are continuously being generated. So now we specify a timeout on the watch stream and preemptively close it before that timeout, which lets us pull the latest RV before closing. If we only relied on the timeout and let the API server close the stream, we wouldn't have a way to do anything before it gets closed. (We could keep a timer ourselves, but we don't know if it would be in sync with the k8s API server.)
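The "preemptively close before the server's timeout" decision can be sketched as a small timer check. This is an illustrative sketch only; WatchRecycler and its method names are hypothetical, not the plugin's actual API:

```ruby
# Hypothetical sketch of the preemptive watch-recycling decision.
# The injectable clock makes the timing logic testable without sleeping.
class WatchRecycler
  def initialize(watch_interval_seconds:, clock: -> { Time.now.to_i })
    @interval = watch_interval_seconds
    @clock = clock
    @last_recreated = clock.call
  end

  # True when enough time has passed that we should close the current
  # watch stream ourselves, before the API server's timeout does it for us.
  def due?
    @clock.call - @last_recreated >= @interval
  end

  # Call after pulling the latest RV and opening the new watch stream.
  def mark_recreated
    @last_recreated = @clock.call
  end
end
```

When `due?` turns true, the plugin would pull the latest resource version, finish the old stream, start a new one from that RV, and call `mark_recreated`.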
* make a true cli
* update messages
* rename env vars
* remove codified defaults
* remove debug line
After discovering the issue with the k8s watch and list APIs regarding the resource version parameter, we redesigned the plugin flow. Now the main thread acts as a monitor that loops forever (until the plugin is told to shut down) to update the configmap and preemptively recreate the watch thread periodically. In more detail, it mainly does these things:

* Pull the latest resource version via the List API and store it in the configmap every configmap_update_interval_seconds (10s by default).
* Check whether watch_interval_seconds have passed since the last time we recreated the watch thread. If so, first pull the latest resource version, close the watch thread by calling the kubeclient method that closes the HTTP connection, then create a new watch stream using the RV we just pulled.
* Listen for SIGTERM so on graceful shutdown we can break out of the while loop in start_monitor (apparently if the while loop keeps running, the shutdown sequence won't get triggered).

FYI @rvmiller89 @lei-sumo @frankreno
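The monitor loop and SIGTERM handling described above can be sketched roughly as follows. This is a minimal sketch under assumptions: the class name, stubbed helpers, and the way stop is wired to the trap are illustrative, not the plugin's exact implementation:

```ruby
# Hypothetical sketch of the monitor loop with graceful shutdown.
class EventMonitor
  def initialize(configmap_update_interval_seconds: 10)
    @interval = configmap_update_interval_seconds
    @running = true
  end

  def start_monitor
    # Breaking out of this loop is what lets the shutdown sequence run.
    while @running
      update_configmap_with_latest_rv  # stubbed below
      recycle_watch_thread_if_due      # stubbed below
      sleep @interval
    end
  end

  # Called from a SIGTERM trap so a graceful shutdown exits the loop.
  def stop
    @running = false
  end

  private

  # Placeholders for the List-and-store and watch-recycle steps.
  def update_configmap_with_latest_rv; end
  def recycle_watch_thread_if_due; end
end

# Wiring sketch: break out of the loop on SIGTERM.
monitor = EventMonitor.new
Signal.trap('TERM') { monitor.stop }
```

Setting a flag from the trap and checking it in the loop condition keeps the signal handler trivial, which matters because Ruby restricts what can safely run inside a trap context.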