Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add exponential backoff when connecting or reconnecting and allow plugin to start without making initial connection #12111

Merged
merged 8 commits into from
Nov 3, 2022

Conversation

reimda
Copy link
Contributor

@reimda reimda commented Oct 26, 2022

This PR adds the ability to start the kafka_consumer input even if the kafka broker is down. The default behavior of the plugin is to require connection for a successful plugin start, but users can set connection_strategy = "defer" to choose to connect after the plugin has started.

Since starting up without a connection means the plugin will immediately start to retry making a connection, this PR also adds settings to configure retries through sarama's Metadata.Retry.* settings. Previously telegraf always used the sarama defaults of 3 retries and 250ms delay before each retry. Now the Metadata.Retry.* settings are exposed through kafka_consumer's telegraf config so users can choose how many times to retry and configure the retry delay.

The PR also adds exponential backoff of connection retries by implementing a sarama Metadata.Retry.BackoffFunc. Telegraf's default backoff is constant for backward compatibility, but users can set metadata_retry_type = "exponential" to choose exponential backoff.

To test the new settings, I extended the integration test to run with both connection strategies and added two new tests around exponential backoff, one just checking that the BackoffFunc math is right and another that tries to connect to an unopen port on localhost to make sure backoff is configured in sarama correctly

related influxdata/feature-requests#461

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Oct 26, 2022
@reimda
Copy link
Contributor Author

reimda commented Oct 26, 2022

Looks like I missed some linter errors.

I'd love it if someone could review the setting names and descriptions in addition to the code. I added the retry settings in common code so they are available for outputs.kafka to use too. I haven't tested them there but I'd expect them to work.

There are other retry timeouts and other settings in sarama (search for retry in https://github.com/Shopify/sarama/blob/main/config.go). I'm relatively sure Metadata.Retry.* are the only ones needed to change connection/reconnection timeouts, but I would love it if anyone could shed more light on what the other ones are for (Admin, Producer, Consumer, Consumer.Group.Rebalance)

Copy link
Contributor

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments @reimda. Nothing big besides the linter issue.

plugins/common/kafka/config.go Outdated Show resolved Hide resolved
plugins/common/kafka/config.go Outdated Show resolved Hide resolved
plugins/inputs/kafka_consumer/kafka_consumer.go Outdated Show resolved Hide resolved
plugins/inputs/kafka_consumer/kafka_consumer_test.go Outdated Show resolved Hide resolved
@reimda reimda added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Oct 31, 2022
Copy link
Contributor

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @reimda! Only one question about the warning...

plugins/common/kafka/config.go Outdated Show resolved Hide resolved
@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Nov 1, 2022

Copy link
Contributor

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks @reimda for tackling this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants