
Performance issue on Prometheus 2.7.x #5173

Closed
dmitriy-lukyanchikov opened this Issue Feb 1, 2019 · 23 comments

dmitriy-lukyanchikov commented Feb 1, 2019

Bug Report

What did you do?
Updated from Prometheus 2.6.0 to the latest release, 2.7.1.

What did you expect to see?
I expected the new Prometheus to run queries at the same speed, plus the new features.

What did you see instead? Under which circumstances?
Some queries now run more than 10 times slower. Screenshots are attached.

Environment

  • System information:

Linux 4.4.0-141-generic x86_64

  • Prometheus version:

2.7.1

  • Logs:
level=info ts=2019-02-01T11:10:36.801327342Z caller=main.go:243 msg="Starting Prometheus" version="(version=2.6.1, branch=HEAD, revision=b639fe140c1f71b2cbad3fc322b17efe60839e7e)"
level=info ts=2019-02-01T11:10:36.801453386Z caller=main.go:244 build_context="(go=go1.11.4, user=root@4c0e286fe2b3, date=20190115-19:12:04)"
level=info ts=2019-02-01T11:10:36.80150992Z caller=main.go:245 host_details="(Linux 4.4.0-141-generic #167-Ubuntu SMP Wed Dec 5 10:40:15 UTC 2018 x86_64 dlukyanchikov-prom-dev-01 (none))"
level=info ts=2019-02-01T11:10:36.801839851Z caller=main.go:246 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-02-01T11:10:36.80188953Z caller=main.go:247 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-02-01T11:10:36.804326282Z caller=main.go:561 msg="Starting TSDB ..."
level=info ts=2019-02-01T11:10:36.804477793Z caller=web.go:429 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-02-01T11:10:36.806642944Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1548948308481 maxt=1548979200000 ulid=01D2KQ4ACFS1R9658MG8TY9CS1
level=info ts=2019-02-01T11:10:36.806803417Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1549000800000 maxt=1549008000000 ulid=01D2M4JGP0B8CD2TGPY8G1DGJ4
level=info ts=2019-02-01T11:10:36.806974809Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1548979200000 maxt=1549000800000 ulid=01D2M5812NFPFE4N8SHWS0GRD0
level=info ts=2019-02-01T11:10:36.807098002Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1549008000000 maxt=1549015200000 ulid=01D2MBE7YYSP1SKT5EZ7BAVCDT
level=warn ts=2019-02-01T11:14:56.633901809Z caller=head.go:434 component=tsdb msg="unknown series references" count=341591
level=info ts=2019-02-01T11:15:10.234213133Z caller=main.go:571 msg="TSDB started"
level=info ts=2019-02-01T11:15:10.234451939Z caller=main.go:631 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2019-02-01T11:15:10.328796713Z caller=main.go:657 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2019-02-01T11:15:10.328956607Z caller=main.go:530 msg="Server is ready to receive web requests."
level=info ts=2019-02-01T11:15:54.681596505Z caller=main.go:631 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml

Prometheus 2.7.1: [screenshot]

Prometheus 2.6.0: [screenshot]

brian-brazil commented Feb 1, 2019

That query is going to touch every series in the index lookup, which is not going to be fast. Try it without the name matcher.

Also, this is not the sort of query you should be using often.
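
For illustration only, since the actual query here is only visible in the screenshots above: a selector whose metric name is a wildcard regex forces the index lookup to consider every series, while an explicit metric name narrows the lookup to that metric's postings up front. A hypothetical pair of selectors:

# wildcard name matcher: the index lookup has to touch every series
{__name__=~".+", job="consul-services"}

# explicit metric name: only that metric's series are considered
haproxy_backend_http_total_time_average_seconds{job="consul-services"}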

dmitriy-lukyanchikov commented Feb 1, 2019

I gave this query only as an example; I'm not sure a 10x slowdown is acceptable even so. Do you think this isn't an issue, or do you mean I should test another query?

brian-brazil commented Feb 1, 2019

This is not a query that needs to be fast. If you have a query showing this issue that isn't doing a wildcard match on the metric name, I'm interested.

dmitriy-lukyanchikov commented Feb 1, 2019

I ran some tests with production data loaded onto a dev server, and here are the new results: it's still slower, not by 10 times but by at least 2-5 times.

I ran this query:

avg(haproxy_backend_http_total_time_average_seconds{backend!="stats",backend!="auth",job="consul-services",host_name=~".+"}
  * on(host_name) group_left(brand,env,dc,dc_cluster,product,provider,in_sd) (avg by (host_name) (
  server_labels{host_name=~".+"}
  ))
)  by (backend)

We use this query on a graph in a Grafana dashboard. Run directly against Prometheus, it gives the following results compared to the 2.6.x version:

  1. 2.6.0 version instant query - 883ms

  2. 2.6.0 version range query 1d - 23403ms

  3. 2.7.1 version instant query - 3967ms

  4. 2.7.1 version range query 1d - 72077ms

I think this is a pretty serious slowdown: no changes to the configuration or scrape targets, and only 1 day of historical data. If you need additional tests from our side as proof, please write to me and I'll provide any information you need, because for now this is blocking us from upgrading to the new version. I also ran these tests at least 3 times to be sure the results are consistent.
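
For reference, a minimal sketch of how timings like the ones above can be taken straight against the HTTP API, assuming Prometheus listens on localhost:9090 and using a simplified stand-in for the full query (GNU date for the timestamps):

# instant query
$ curl -s -G -o /dev/null -w '%{time_total}\n' 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=avg(haproxy_backend_http_total_time_average_seconds{job="consul-services"}) by (backend)'

# range query over the last day at 60s resolution
$ curl -s -G -o /dev/null -w '%{time_total}\n' 'http://localhost:9090/api/v1/query_range' \
    --data-urlencode 'query=avg(haproxy_backend_http_total_time_average_seconds{job="consul-services"}) by (backend)' \
    --data-urlencode "start=$(date -d '1 day ago' +%s)" \
    --data-urlencode "end=$(date +%s)" \
    --data-urlencode 'step=60'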

brian-brazil commented Feb 1, 2019

Do you know which subpart of that query is slower?

That query is also broken in two ways: you're aggregating away the labels you're trying to add in, and you can't average averages.
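
For what it's worth, a hypothetical rework along those lines would keep the joined labels in the outer aggregation instead of dropping them (the labels shown are just an illustrative subset of the original group_left list). The second problem remains either way: the metric is already an average, so averaging it again would really need the underlying sum and count series, if the exporter exposes them.

avg by (backend, env, dc) (
    haproxy_backend_http_total_time_average_seconds{backend!="stats",backend!="auth",job="consul-services",host_name=~".+"}
  * on(host_name) group_left(env, dc)
    (avg by (host_name) (server_labels{host_name=~".+"}))
)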

dmitriy-lukyanchikov commented Feb 1, 2019

I think you're missing the point: the same query is much slower on the new Prometheus. This isn't a question of optimizing the query; the problem is that we cannot run the new Prometheus because it is slower than the previous version. If you think this is caused by the second part of the query, I can run another test without it, and the result is pretty much the same with a query like this:

avg(haproxy_backend_http_total_time_average_seconds{backend!="stats",backend!="auth",job="consul-services",host_name=~".+"})  by (backend)

If you want to discuss how the query can be optimized, I think that's a good idea, but the problem right now is not optimization.

allengeorge commented Feb 6, 2019

Unfortunately, we're able to replicate similar behavior on an internal 2.7.1 Prometheus install.

We're trying to run the following query:

count by(customer_id)(scheduled_executions_run_time_ms{customer_id="#####",status=~"(?i)Success",prod=~"(?i)True",event_type="XXXXXX",job_name="XXXXXX"})

We were seeing cases where our queries and our dashboards were timing out. As a simple, naive test, we ran that query in a loop on a server that was not servicing any other queries and recorded the runtimes (a sketch of such a loop follows the timings below). This gave us results that were (1) slower and (2) wildly variable:

3.48274707794
3.12657499313
6.48707699776
6.58115696907
6.20486307144
6.79951906204
6.89865684509
22.9217040539
7.77658200264
9.90158009529
9.90662908554
9.290102005
7.32188606262
6.27733278275
5.22911000252
20.3817799091
6.29669594765
8.69726109505
13.6073069572
9.87991309166
10.4150719643
9.59437084198
31.6918962002
17.5051178932
19.8027350903

Downgrading to 2.6.1 and running the same query under the same conditions resulted in the following runtimes:

0.027251958847
0.0113677978516
0.0104110240936
0.00987195968628
0.00994086265564
0.0097188949585
0.0101051330566
0.00956201553345
0.00993418693542
0.00989484786987
0.00975203514099
0.00984692573547
0.009526014328
0.00955319404602
0.00958800315857
0.0103988647461
0.0097279548645
0.00966000556946
0.0098569393158
0.00966715812683
0.00977182388306
0.00974488258362
0.0101010799408
0.00993800163269
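
The loop described above might look roughly like this (a sketch only; curl's %{time_total} is reported in seconds, and the awk line just summarizes the spread):

$ for i in {1..25}; do
    curl -s -G -o /dev/null -w '%{time_total}\n' 'http://localhost:9090/api/v1/query' \
      --data-urlencode 'query=count by(customer_id)(scheduled_executions_run_time_ms{customer_id="#####",status=~"(?i)Success",prod=~"(?i)True",event_type="XXXXXX",job_name="XXXXXX"})'
  done | awk '{ s+=$1; if ($1>max) max=$1; if (min=="" || $1<min) min=$1 }
              END { printf "min=%.3fs avg=%.3fs max=%.3fs\n", min, s/NR, max }'
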
brian-brazil commented Feb 7, 2019

Can you give me an idea of the cardinality of each of those labels, and the Prometheus overall?

dmitriy-lukyanchikov commented Feb 7, 2019

Can you give me an idea of the cardinality of each of those labels, and the Prometheus overall?

For our Prometheus, the per-label cardinality is usually not more than 10, with an average of something like 5-6. The total number of time series is near 5 million and the storage retention is 3 weeks.

brian-brazil commented Feb 7, 2019

You have no more than 10 customer_ids?

allengeorge commented Feb 7, 2019

Yes, we do have more than 10 customer IDs, but due to the nature of our market the number is not massive; we're not talking tens of thousands of unique customer IDs here.

And while not ideal (we've read the warnings about label cardinality), we had to make some internal tradeoffs and opted for this approach. FWIW, this metric structure had been working well since the 1.x releases until 2.7, at which point we experienced significant slowdown and variability.
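
For completeness, cardinality like this can be checked directly with PromQL (a sketch using the metric and label names from the query above; prometheus_tsdb_head_series is Prometheus' own count of in-memory series):

# number of distinct customer_id values currently present on that metric
count(count by (customer_id) (scheduled_executions_run_time_ms))

# overall number of series in the head block
prometheus_tsdb_head_series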

hoffie commented Feb 13, 2019

I am seeing this as well and I think I have come up with a self-contained reproducer.

I think this boils down to a change in performance characteristics regarding label matchers.

  • The example below uses a regexp which matches all non-empty strings (=~".+"). This shows the problem.
  • !="" matchers show it as well.
  • regexp matchers with static parts such as "1.+" don't show the problem

I haven't done any further tests yet. Bisecting may help (if no one has an idea up front of which commit could have introduced this). I'll try to continue debugging as time permits.

# Generate some fake metrics
$ for i in 1 2; do for m in {1..200}; do
  for x in {1..1000}; do
    echo 'example_metric'$m'{a="'"${x}"'"} 1'
  done
done > metrics$i; done

# Set up a web server which exposes those fake metrics
$ python2 -m SimpleHTTPServer

# Prepare and start prometheus 2.6.1 and 2.7.1
$ cat > prometheus.yml
global:
  scrape_interval:     5s  
  evaluation_interval: 5s

scrape_configs:
  - job_name: 'example1'
    metrics_path: '/metrics1'
    static_configs:
    - targets: ['127.0.0.1:8000']

  - job_name: 'example2'
    metrics_path: '/metrics2'
    static_configs:
    - targets: ['127.0.0.1:8000']
^D

$ ./prometheus-2.6.1 --web.listen-address 127.0.0.1:9096 --storage.tsdb.path data-2.6.1 &
$ ./prometheus-2.7.1 --web.listen-address 127.0.0.1:9097 --storage.tsdb.path data-2.7.1 &

# Benchmark
$ time for i in {1..100}; do curl -s 'http://localhost:9096/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m4,766s
user	0m0,942s
sys	0m0,550s

$ time for i in {1..100}; do curl -s 'http://localhost:9097/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m22,171s
user	0m1,334s
sys	0m0,697s
brian-brazil commented Feb 14, 2019

It's almost certainly prometheus/tsdb@296f943; however, I can't reproduce it with a micro-benchmark, so something bigger is going on.

brian-brazil commented Feb 14, 2019

Okay, I have it in the micro-benchmark now; I had missed a for loop.

brian-brazil commented Feb 14, 2019

Give this branch a try: https://github.com/prometheus/tsdb/tree/index-perf (only postings.go has changed).

hoffie commented Feb 14, 2019

Thanks for your work, highly appreciated!

I have built three prometheus binaries with go1.11.5:

  1. prometheus@v2.7.1 with postings.go replaced by your version on port 9097
  2. prometheus@master (unmodified) running on port 9098
  3. prometheus@master with tsdb@index-perf on port 9099

I can confirm that only the unpatched master has the above-mentioned performance problem. Both the patched 2.7.1 and master with your tsdb branch show greatly improved performance, equal to or even better than 2.6.1.

So, at least for this case, this seems to be a huge improvement. I have not done any other tests, nor am I able to run a test on our production environment with a patched version. However, based on my tests (documented below), I'm quite confident that the suggested change will help.

Looking forward to getting this merged and released! :)

$ make build
$ cp prometheus prometheus-b7594f650f348b9ac108ea631b63e99a88b9413d

$ GO111MODULE=on go mod edit -require github.com/prometheus/tsdb@index-perf
$ GO111MODULE=on go mod vendor
$ make build
$ cp prometheus prometheus-b7594f650f348b9ac108ea631b63e99a88b9413d-tsdb-v0.4.1-0.20190214171337-fd372973ae24

$ git checkout -f .
$ git checkout v2.7.1
$ wget https://raw.githubusercontent.com/prometheus/tsdb/index-perf/index/postings.go -O vendor/github.com/prometheus/tsdb/index/postings.go
$ make build
$ cp prometheus ~/tmp/prom-2.7-performance-regression/prometheus-2.7.1-postings.go-patched

$ ./prometheus-2.7.1-postings.go-patched --web.listen-address 127.0.0.1:9097 --storage.tsdb.path data-2.7.1-postings.go-patched &
$ ./prometheus-b7594f650f348b9ac108ea631b63e99a88b9413d --web.listen-address 127.0.0.1:9098 --storage.tsdb.path data-b7594f650f348b9ac108ea631b63e99a88b9413d &
$ ./prometheus-b7594f650f348b9ac108ea631b63e99a88b9413d-tsdb-v0.4.1-0.20190214171337-fd372973ae24 --web.listen-address 127.0.0.1:9099 --storage.tsdb.path data-b7594f650f348b9ac108ea631b63e99a88b9413d-tsdb-v0.4.1-0.20190214171337-fd372973ae24 &

$ time for i in {1..100}; do curl -s 'http://localhost:9097/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m3,608s
user	0m0,914s
sys	0m0,360s

$ time for i in {1..100}; do curl -s 'http://localhost:9098/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m21,882s
user	0m1,322s
sys	0m0,694s

$ time for i in {1..100}; do curl -s 'http://localhost:9099/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m3,935s
user	0m0,848s
sys	0m0,420s
brian-brazil commented Feb 14, 2019

Great, thanks for all the feedback. I'll look at getting in proper benchmarks for this tomorrow and raising a PR.

eahydra commented Feb 19, 2019

Hey, I have an analogous performance issue after updating from Prometheus v2.5.0 to Prometheus v2.7.1.
In my environment, I used curl with -w "%{time_total}" to time this query:

sum(job:node:count{site=~"a|b", resource_pool=~"a1|a2|a3|a4|a5|a6|a7|a8|a9|a10|a11|a12|a13|a14|a15|a16"}) by (resource_pool, site)

In Prometheus v2.5.0, %{time_total} is 0.001s, but in Prometheus v2.7.1 it is 4.248s.

In this case the metric job:node:count does not exist (it's fake), but the labels resource_pool and site do exist.

This difference seems to arise because version 2.5.0 checks the metric first and then the labels, whereas in version 2.7.1 the labels are checked first and then the metric.

brian-brazil commented Feb 19, 2019

This difference seems to arise because version 2.5.0 checks the metric first and then the labels, whereas in version 2.7.1 the labels are checked first and then the metric.

The intersection is done left to right, that hasn't changed.

brian-brazil commented Feb 21, 2019

I've put that out as a PR now with some additional changes, if anyone wants to test: prometheus/tsdb#531

hoffie commented Feb 21, 2019

Looks good, thanks!

$ ./prometheus-2.6.1 --web.listen-address 127.0.0.1:9096 --storage.tsdb.path data-2.6.1
$ ./prometheus-2.8.0_pre-5fbda4c9d72b4519adbc9447f0a0023565d03b14 --web.listen-address 127.0.0.1:9098 --storage.tsdb.path data-2.8.0_pre-5fbda4c9d72b4519adbc9447f0a0023565d03b14
$ ./prometheus-2.8.0_pre-5fbda4c9d72b4519adbc9447f0a0023565d03b14-tsdb-v0.4.1-0.20190221160417-10f171afa6d5 --web.listen-address 127.0.0.1:9099 --storage.tsdb.path data-2.8.0_pre-5fbda4c9d72b4519adbc9447f0a0023565d03b14-tsdb-v0.4.1-0.20190221160417-10f171afa6d5

$ time for i in {1..100}; do curl -s 'http://localhost:9096/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m3,574s
user	0m0,982s
sys	0m0,507s

$ time for i in {1..100}; do curl -s 'http://localhost:9098/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m22,113s
user	0m1,370s
sys	0m0,669s

$ time for i in {1..100}; do curl -s 'http://localhost:9099/api/v1/query?query=example_metric1%7Ba%3D~%22.%2B%22%2Cjob%3D%22example1%22%7D' -o /dev/null; done

real	0m3,498s
user	0m1,021s
sys	0m0,409s
eahydra commented Feb 26, 2019

I've put that out as a PR now with some additional changes, if anyone wants to test: prometheus/tsdb#531

It works fine for my case.

With Prometheus v2.7.1, the query sum(job:node:count{site=~"a|b", resource_pool=~"a1|a2|a3|a4|a5|a6|a7|a8|a9|a10|a11|a12|a13|a14|a15|a16"}) by (resource_pool, site) takes 3 seconds.

With the latest code from the Prometheus master branch, it takes just 0.001 seconds.

👍👍👍
