Adding a configurable scrapeTimeout for prometheus operator. #539
Conversation
Can one of the admins verify this patch?
Note that we will generally also run all of this on CI; however, we're having some trouble with our Jenkins instances right now. They should be working again soon. As a side note, a section in the README on developing might be helpful for future reference. Do you want to create a PR to add such a section 🙂?
Happy to. I'll see if I can get my minikube working today or tomorrow.
Cheers,
Gavin
On Aug 7, 2017 8:49 AM, "Frederic Branczyk" <notifications@github.com> wrote:
make test executes all unit tests, and you can execute the e2e tests on a
local minikube by compiling the static binary (which is what is used for
the container images) with make crossbuild and then build the container
image with the docker host from within minikube by running eval $(minikube
docker-env), then you can build the container using make container and then
finally run the e2e tests using make e2e-tests.
Note that generally we will also run all of this on CI, however, we're
having some trouble with our jenkins instances right now, should be working
soon again though.
As a side note a section in the readme on developing might be helpful for
future reference. Do you want to create a PR to add such a section 🙂 ?
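For reference, the quoted build-and-test workflow can be sketched as the following command sequence (this assumes a running minikube and the Makefile targets named in the comment above; exact target names may have changed since):

```shell
# Run all unit tests
make test

# Build the static binaries used for the container images
make crossbuild

# Point the Docker CLI at minikube's Docker daemon
eval $(minikube docker-env)

# Build the container image inside minikube
make container

# Finally, run the end-to-end tests against the local cluster
make e2e-tests
```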
My opinion so far has been that a scrape timeout as long as the scrape interval should generally be sane for all setups.
Yes -- in this case we're querying an appliance (vsphere) which is
responsible for 10s - 100s of machines. We're querying its SOAP API and it
takes about 20 seconds to pull down the complete metrics set.
Also, I've found similar issues when polling SNMP traps on some embedded
devices where the response simply takes tens of seconds. A universal
scrapeTimeout precludes collecting any such metrics.
Given these circumstances, I think it's pretty reasonable to give the
ServiceMonitor an opportunity to override the default for the managed
Prometheus service.
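As context for the override being discussed, a per-endpoint timeout in a ServiceMonitor might look roughly like the sketch below (the monitor name, labels, and port are hypothetical; `scrapeTimeout` on the endpoint is the kind of field this thread is about):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vsphere-exporter      # hypothetical name
spec:
  selector:
    matchLabels:
      app: vsphere-exporter   # hypothetical label
  endpoints:
    - port: metrics           # hypothetical port name
      interval: 60s
      scrapeTimeout: 30s      # give slow SOAP/SNMP-backed targets time to respond
```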
…On Mon, Aug 7, 2017 at 11:14 AM, Fabian Reinartz ***@***.***> wrote:
My opinion so far has been that the scrape timeout being as long as the
scrape interval should generally be sane for all setups.
Can you elaborate on the exact use case that makes you want to set it
explicitly?
In that case I think the functionality from PR #537 might already be all we need. I'd personally also prefer not to bloat the Prometheus object itself.
Sure. That PR didn't exist when I began work on this branch. I'd be happy with either one getting merged. I'd argue, however, that making the API spec attribute name match the configuration attribute in Prometheus is a good idea.
Closing in favor of #537 |
Any chance of getting this functionality into the operator? It would help with setting a global timeout for the ServiceMonitors that don't specify one. A use case would be the many Raspberry Pi clusters where some scraping times out due to low performance. I can send a new PR addressing this. Cc. @geerlingguy @brancz
I'm not entirely opposed to it, but I do have to wonder how much it will actually help you, as ServiceMonitors/PodMonitors can specify scrape timeouts themselves, which will always take precedence.
Many targets don't have scrape_timeout definitions, unlike the core ones (kube-apiserver, kube-scheduler, kube-controller), so having the ability to change the default value is desirable. Opened #3250 to address this.
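A global default of the kind proposed here would sit on the Prometheus object itself. A rough sketch (the object name and concrete durations are hypothetical; per-monitor timeouts would still take precedence, as noted above):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s               # hypothetical name
spec:
  scrapeInterval: 30s
  scrapeTimeout: 25s      # default for ServiceMonitors that don't set their own;
                          # must not exceed scrapeInterval
```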
Hi CoreOS team,
I couldn't find a comprehensive developer's guide for verifying that this works. I figured I'd do a local build and test it out in our QA cluster. If there's something faster and easier than just running go test in the pkg/prometheus subdirectory, I'd love to hear about it. Cheers!
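For reference, the unit-test run mentioned above can be invoked from the repository root with something like the following (the package path is taken from the comment; the Makefile target is the one referenced earlier in this thread):

```shell
# Run just the prometheus package's unit tests
go test ./pkg/prometheus/...

# Or run the full unit test suite
make test
```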