Sample/s difference between 1.1.0 and 1.2.1 #2128
Comments
It looks like the 1.2.1 instance has more targets. Can you see what the difference is there?
brian-brazil added the kind/question label Oct 27, 2016
Ohh, sorry. This graph is a bit confusing.
There appears to be a difference in target scrapes. My best guess is that you're having more timeouts. Could you binary search the config to narrow things down?
The configuration is identical (process flags and prometheus.conf). I'm seeing that "Scrape duration (0.9, 0.99)" is slightly higher for 1.2.1. That panel uses the prometheus_target_interval_length_seconds metric (I borrowed it from Grafana's Prometheus example dashboard). It is possible that the 1.2.1 instance sits on a slower network or host (EC2). The whole reason I'm doing this check is that both instances were already on 1.2.1 and showed a very erratic scrape/s metric (varying between 10K and 60K). I traced it back to the upgrade to 1.2.1, so I downgraded one instance to compare them (and reset the datadir completely). What can I use to see how target_interval_length_seconds is affected? Are there any other similar metrics, flags, or logs?
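One way to approach the comparison asked about above is to run the same instant query against both instances through the HTTP API (`/api/v1/query`) and diff the results series by series. The following is only a rough sketch: the helper names are made up for illustration, the host names in the usage note are placeholders, and the suggested query expressions are assumptions about what is worth comparing, not a confirmed diagnosis.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": expr})


def parse_vector(payload):
    """Map each series' label set to its float value (instant-vector response)."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %r" % payload.get("error"))
    values = {}
    for series in payload["data"]["result"]:
        # Sort the labels so the same series gets the same key on both sides.
        key = tuple(sorted(series["metric"].items()))
        values[key] = float(series["value"][1])
    return values


def diff_vectors(a, b):
    """Return {labels: (value_a, value_b)} for series that differ or are one-sided."""
    diffs = {}
    for key in set(a) | set(b):
        value_a, value_b = a.get(key), b.get(key)
        if value_a != value_b:
            diffs[key] = (value_a, value_b)
    return diffs


def fetch_vector(base_url, expr, timeout=10):
    """Run an instant query against one instance and parse the result."""
    with urlopen(query_url(base_url, expr), timeout=timeout) as resp:
        return parse_vector(json.load(resp))
```

With two placeholder hosts, `diff_vectors(fetch_vector("http://prom-a:9090", "up"), fetch_vector("http://prom-b:9090", "up"))` would list targets whose `up` value differs between the instances or that exist on only one of them. The same call with `prometheus_target_interval_length_seconds{quantile="0.99"}` would compare the observed scrape intervals.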
grobie closed this Mar 5, 2017
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
lightpriest commented Oct 27, 2016 • edited
This is more of a general question than a bug report.
We're running two identical instances of Prometheus, one with 1.1.0 and a mirror which we upgraded to 1.2.1. I've noticed that 1.1.0 consistently reports a larger sample rate (rate(prometheus_local_storage_ingested_samples_total[5m])). Is it something I should worry about? Is it a regression or just a change in the sampling count code?
They are completely identical. Same configuration, same flags and same job and target definitions. There is a slight delay until file_sd gets the updated file contents (10 minutes max), though.
In the attached screenshot, what's marked "mirror" in the legend is 1.2.1 and the default one is 1.1.0.
The focus should be on the Sample rate graph.
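For readers unfamiliar with what the graphed expression measures: `rate(...[5m])` over an ingested-samples counter is roughly the counter's increase across the window divided by the elapsed time. The sketch below is a deliberately simplified illustration, not Prometheus's implementation. It handles counter resets (for example after a restart) but skips the extrapolation to window boundaries that PromQL's `rate()` performs, and the function name is hypothetical.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter.

    `samples` is a time-ordered list of (timestamp_seconds, counter_value).
    A drop in value is treated as a counter reset to zero. Returns None
    when there are fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # Counter restarted: count everything accumulated since the reset.
            increase += value
        else:
            increase += value - prev
        prev = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None
```

Because the result depends only on how many samples were actually ingested in the window, failed or timed-out scrapes lower it even when both instances have the same targets configured.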