Thread Errors - Deadlock? Other source? #143
Comments
I have never seen this error.
fluent-plugin-bigquery v1.0.0 with commit eaec2d1 manually patched in, running on Fluentd 0.14 (td-agent 3). The failure happens whether using multiple workers or just one. If I remove the customer_id buffer param then it runs fine. I have 5 eventType handlers like these: one streams, the others load. If the flush interval is set higher, the system fails faster. It eventually locks up and the Forwarders see the Aggregator (config below) as down.
4 identical load types
Forwarder looks like this. The aggregator seizes up and fails regardless of flush_interval and thread setting.
To provide some more info: I changed the messages to carry the customer_id in the tag and to route based on that. So, instead of using message data, only the tag is used for processing. This issue has since gone away. It seems that the deadlock occurs when the messages are forwarded undifferentiated and have to be split at the aggregator level.
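To make the difference concrete, here is a minimal Python sketch of the two grouping strategies. It is a toy model only, not Fluentd's or the plugin's actual buffer code: the function names, the `events.load` tag, and the sample records are all hypothetical. It shows that keying on a record field forces the aggregator to inspect and regroup every record of a mixed batch, while tag-based routing receives events that are already separated by tag.

```python
from collections import defaultdict


def chunk_by_record_key(events, key):
    """Group (tag, record) pairs by (tag, record[key]).

    Models a buffer keyed on a record field: every record of a mixed
    batch has to be inspected and moved into its own chunk.
    """
    chunks = defaultdict(list)
    for tag, record in events:
        chunks[(tag, record[key])].append(record)
    return dict(chunks)


def chunk_by_tag(events):
    """Group (tag, record) pairs by tag only; record contents are never read."""
    chunks = defaultdict(list)
    for tag, record in events:
        chunks[tag].append(record)
    return dict(chunks)


# Hypothetical forwarded batch: a single tag carrying mixed customers.
mixed = [
    ("events.load", {"customer_id": c, "n": i})
    for i, c in enumerate(["a", "b", "a", "c"])
]

# The same events after the forwarder has moved customer_id into the tag.
tagged = [(f"events.load.{rec['customer_id']}", rec) for _, rec in mixed]

by_key = chunk_by_record_key(mixed, "customer_id")
by_tag = chunk_by_tag(tagged)

print(sorted(by_key))  # chunk keys produced by splitting on the record field
print(sorted(by_tag))  # chunk keys produced by the tag alone
```

Both strategies end up with one chunk per customer; the difference is where the split happens. In the second case it is done once at the forwarder, so the aggregator never has to break up an incoming batch.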
I believe I've found the underlying source of the thread issues; I'm just not sure whether it is a fluentd issue or this plugin. What I've found is that if forwarded chunks need a lot of reorganization, fluentd becomes unstable and has many issues. When my chunks are all aligned to 999k everything runs very well. I tried reducing the forwarded chunks to 100k on a few test servers and the system became unstable. Changing back to 999k all around restored stability. It seems something in the rechunking logic is causing deadlocks or other issues.
Hmm..
I believe this may have been the source of this issue: fluent/fluentd#1549
We had a similar issue, but fluentd-0.14.22 resolves it.
I'm seeing errors in fluent-plugin-bigquery that seem similar to the deadlock issues that have been reported in general with Fluentd 0.14. However, since the errors specifically reference fluent-plugin-bigquery, I wanted to drop them in here. Things to know: Restart takes a long time and finally dies with these errors.