feat: Add exponential backoff when connecting or reconnecting and allow plugin to start without making initial connection #12111
Looks like I missed some linter errors. I'd love it if someone could review the setting names and descriptions in addition to the code. I added the retry settings in common code so they are also available to outputs.kafka. I haven't tested them there, but I'd expect them to work. sarama has other retry timeouts and settings (search for "retry" in https://github.com/Shopify/sarama/blob/main/config.go). I'm relatively sure Metadata.Retry.* are the only ones needed to change connection/reconnection timeouts, but I would love it if anyone could shed more light on what the others are for (Admin, Producer, Consumer, Consumer.Group.Rebalance).
Just a few comments @reimda. Nothing big besides the linter issue.
Thanks for the update @reimda! Only one question about the warning...
Looks good to me. Thanks @reimda for tackling this!
This PR adds the ability to start the kafka_consumer input even if the Kafka broker is down. By default the plugin still requires a connection for a successful start, but users can set
connection_strategy = "defer"
to connect after the plugin has started.
Since starting up without a connection means the plugin will immediately start retrying the connection, this PR also adds settings to configure retries through sarama's Metadata.Retry.* settings. Previously Telegraf always used the sarama defaults of 3 retries and a 250ms delay before each retry. Now the Metadata.Retry.* settings are exposed through kafka_consumer's Telegraf config, so users can choose how many times to retry and configure the retry delay.
The PR also adds exponential backoff of connection retries by implementing a sarama Metadata.Retry.BackoffFunc. Telegraf's default backoff is constant for backward compatibility, but users can set
metadata_retry_type = "exponential"
to choose exponential backoff.
To test the new settings, I extended the integration test to run with both connection strategies and added two new tests around exponential backoff: one checks that the BackoffFunc math is right, and the other tries to connect to an unopened port on localhost to make sure backoff is configured correctly in sarama.
related influxdata/feature-requests#461