feat(inputs.mqtt_consumer): Implement startup error behaviors #15486
Cool feature, looks like it could fix the issue I'm experiencing, which I described here:
I just did a quick test with the build artifacts above and additionally setting
but I'm still getting the following log msgs:
How are the changes in this PR supposed to change the behaviour in case the connection to the MQTT broker gets interrupted for a while?
@da-phil the initial connection is influenced by the
supports options for specifying the behavior when experiencing startup errors
using the `startup_error_behavior` setting. Available values are:

- `error`: Telegraf with stop and exit in case of startup errors. This is the
Suggested change (`with` → `will`):
- `error`: Telegraf with stop and exit in case of startup errors. This is the
- `error`: Telegraf will stop and exit in case of startup errors. This is the
@da-phil could you please put up a PR against `docs/includes/startup_error_behavior.md`, as this is included here, and run `make docs` in the telegraf root dir!?
## Interval and ping timeout for keep-alive messages
## The sum of those options defines when a connection loss is detected.
## Note: The keep-alive interval needs to be in second granularity e.g. 1m
## but not 100ms.
# keep_alive = "60s"
# ping_timeout = "10s"
I think `startup_error_behavior` should also be mentioned in this example config.
No I don't think so. We do not mention global options but rather link to them in the README. This is to not run into cases where we do change/add global settings and then have to check and update all 300+ plugins...
@srebhan gotcha, so I set
Let me again explain the issue I'm facing:
@da-phil could we please get this PR merged first and not overload it with additional requests? Unfortunately this PR touches too much code already... Please open an issue stating what you need and I can take a look after this is merged. I guess we need to experiment with the options and e.g. use auto-reconnect...
@srebhan sure, I'm not blocking the PR, just wanted to give some feedback on behaviour which might be related to the changes in this PR. Then you guys can merge the PR and I'll create an issue once I'm back from vacation.
Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip. 🥳 This pull request decreases the Telegraf binary size by 1.55 % for linux amd64 (new size: 239.8 MB, nightly size: 243.5 MB).
Summary
This PR allows using the startup-error-behavior options `error`, `retry` and `ignore`. It furthermore provides integration tests for the different options and one for a "normal" startup.
Checklist
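For readers who want to try the options named in the summary, here is a minimal config sketch. The broker address, topic and data format are placeholder assumptions and do not come from this PR. The comments on `retry` and `ignore` reflect my reading of Telegraf's general startup-error-behavior docs, not anything confirmed in this thread. The keep-alive values are the commented defaults from the sample config quoted in the review.

```toml
[[inputs.mqtt_consumer]]
  ## Placeholder broker and topic (assumptions for illustration only)
  servers = ["tcp://127.0.0.1:1883"]
  topics = ["telegraf/#"]
  data_format = "influx"

  ## Behavior when the plugin fails to start, e.g. the broker is unreachable:
  ##   "error"  - Telegraf stops and exits
  ##   "retry"  - Telegraf keeps running and retries starting the plugin later
  ##   "ignore" - Telegraf keeps running without this plugin
  startup_error_behavior = "retry"

  ## With these values a lost connection should be detected after roughly
  ## keep_alive + ping_timeout = 60s + 10s = 70s.
  keep_alive = "60s"
  ping_timeout = "10s"
```

As discussed in the review, `startup_error_behavior` only covers the initial connection at startup. How the plugin behaves when an established connection drops is a separate question, which the thread defers to a follow-up issue.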
Related issues
resolves #10694