Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(inputs.prometheus): moved from watcher to informer #10932

Merged
merged 6 commits into from
Apr 8, 2022

Conversation

mesksr
Copy link
Contributor

@mesksr mesksr commented Apr 3, 2022

Required for all PRs:

Resolves:
#10148, #10878 and #10928

Issues:

  • telegraf stops monitoring pods after 30 minutes
  • high CPU utilization

watcher has a default timeout of 30 minutes, leading to the above mentioned problems.

Fix:

  • Replaced watcher with informer. Actions that are triggered for "add", "update/modify" and "delete" events remain same.
  • Added a new config cache_refresh_interval that is used by informer to rebase its cache.

@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Apr 3, 2022

Thanks so much for the pull request!
🤝 ✒️ Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

@telegraf-tiger telegraf-tiger bot added the fix pr to fix corresponding bug label Apr 3, 2022
@mesksr mesksr changed the title fix: prometheus input plugin, moved from watcher to informer fix: [inputs:prometheus] moved from watcher to informer Apr 3, 2022
@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Apr 3, 2022

Thanks so much for the pull request!
🤝 ✒️ Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

@mesksr
Copy link
Contributor Author

mesksr commented Apr 3, 2022

!signed-cla

@MyaLongmire
Copy link
Contributor

MyaLongmire commented Apr 4, 2022

To get circleci passing you need to run make fmt and then check in the changes. Let me know if that doesn't work :)

@mesksr
Copy link
Contributor Author

mesksr commented Apr 4, 2022

Thanks MyaLongmire !
I've updated the PR.

@sspaink sspaink changed the title fix: [inputs:prometheus] moved from watcher to informer fix(inputs.prometheus): moved from watcher to informer Apr 4, 2022
Copy link
Contributor

@MyaLongmire MyaLongmire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this pr!

@mesksr
Copy link
Contributor Author

mesksr commented Apr 5, 2022

Moving debug because of this known issue in informer:
kubernetes/kubernetes#87284

An extra log was reported when the pod was deleted.

Logs (before fix):

2022-04-03T21:18:10Z D! [inputs.prometheus] registered a delete request for "xyz-exporter-5d6bfd8696-vfjvg" in namespace "xyz"
2022-04-03T21:18:10Z D! [inputs.prometheus] will stop scraping for "http://192.168.101.227:9275/metrics"
2022-04-03T21:18:10Z D! [inputs.prometheus] registered a delete request for "xyz-exporter-5d6bfd8696-vfjvg" in namespace "xyz"  

Logs (after fix):

2022-04-03T21:18:10Z D! [inputs.prometheus] registered a delete request for "xyz-exporter-5d6bfd8696-vfjvg" in namespace "xyz"
2022-04-03T21:18:10Z D! [inputs.prometheus] will stop scraping for "http://192.168.101.227:9275/metrics"

@mesksr
Copy link
Contributor Author

mesksr commented Apr 5, 2022

@MyaLongmire,
Is there anything more needed from my end to merge this PR?
Can you please share an estimate on how long will it take to merge this?

@MyaLongmire
Copy link
Contributor

@mesksr all prs must be approved by 2 team members before they can be merged :)

@shubrajp
Copy link

shubrajp commented Apr 6, 2022

@powersj
Can you please review this?
We need this fix for #10878

Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the change!

@powersj
Copy link
Contributor

powersj commented Apr 7, 2022

I was catching up on PRs and issues while I have been out and noticed that in #10928 (comment) the user noticed some RBAC error messages. Is that something that needs to be resolved in this PR before merging?

@shubrajp
Copy link

shubrajp commented Apr 7, 2022

Hi @powersj...
I have replied to @simoelmou on #10928...
He most probably has to modify ClusterRole to get it working...

@powersj
Copy link
Contributor

powersj commented Apr 7, 2022

Thanks for doing that. I appreciate it as I wasn't certain if this was related to the PR or if the user needed to configure something.

…s_pods_namespace, kubernetes_label_selector and kubernetes_label_selector
@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Apr 8, 2022

Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

⚠️ This pull request increases the Telegraf binary size by 1.12 % for linux amd64 (new size: 145.0 MB, nightly size 143.4 MB)

📦 Click here to get additional PR build artifacts

Artifact URLs

DEB RPM TAR GZ ZIP
amd64.deb aarch64.rpm darwin_amd64.tar.gz windows_amd64.zip
arm64.deb armel.rpm darwin_arm64.tar.gz windows_i386.zip
armel.deb armv6hl.rpm freebsd_amd64.tar.gz
armhf.deb i386.rpm freebsd_armv7.tar.gz
i386.deb ppc64le.rpm freebsd_i386.tar.gz
mips.deb riscv64.rpm linux_amd64.tar.gz
mipsel.deb s390x.rpm linux_arm64.tar.gz
ppc64el.deb x86_64.rpm linux_armel.tar.gz
riscv64.deb linux_armhf.tar.gz
s390x.deb linux_i386.tar.gz
linux_mips.tar.gz
linux_mipsel.tar.gz
linux_ppc64le.tar.gz
linux_riscv64.tar.gz
linux_s390x.tar.gz
static_linux_amd64.tar.gz

@powersj powersj merged commit 777f8bf into influxdata:master Apr 8, 2022
@shastrisaurabh
Copy link

When can we expect these changes to be in release?

@powersj
Copy link
Contributor

powersj commented Apr 12, 2022

The next bug fix release, v1.22.2, is scheduled for the week of April 25. Until then you can use a nightly to test the change.

sspaink pushed a commit that referenced this pull request Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix pr to fix corresponding bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants