
Worse query performance of 2.1 than 1.8.2 observed #3771

Closed
roengram opened this Issue Jan 31, 2018 · 11 comments

roengram commented Jan 31, 2018

What did you do?
Compare query performance between 1.8.2 and 2.1.0.
Query: count(metric2{label1="v1"})

What did you expect to see?
Similar or better performance than 1.8.2.

What did you see instead? Under which circumstances?
2.1.0 shows about half the TPS of 1.8.2 when 300 concurrent users are querying.

Environment

  • System information:

    Linux 3.13.0 x86_64 (Centos7-based docker running on a Joyent Triton container)
    Container has 64GB RAM with 128GB swap memory, 800GB SSD

  • Prometheus version:

    1.8.2: prometheus, version 1.8.2 (branch: HEAD, revision: 5211b96)
    2.1.0: prometheus, version 2.1.0 (branch: HEAD, revision: 85f23d8)

  • Alertmanager version:

    N/A

  • Prometheus configuration file:

global:
  scrape_timeout: 60s
  external_labels:
    prom: prom-2.1

scrape_configs:

- job_name: job_1
  honor_labels: true
  metrics_path: /metrics_1000_15
  scrape_interval: 40s
  file_sd_configs:
    - files:
      - /griffin/prom/conf.d/targets_1000.yml

Basically each Prometheus is scraping 1000 targets every 40 sec; each target exports 1000 time series per scrape, and each time series has 15 labels. The sample values are randomly generated between -2^32 and 2^32.

  • Alertmanager configuration file:

    N/A
    
  • Logs:

level=info ts=2018-01-31T07:04:52.608000426Z caller=main.go:225 msg="Starting Prometheus" version="(version=2.1.0, branch=HEAD, revision=85f23d82a045d103ea7f3c89a91fba4a93e6367a)"
level=info ts=2018-01-31T07:04:52.608102773Z caller=main.go:226 build_context="(go=go1.9.2, user=root@6e784304d3ff, date=20180119-12:01:23)"
level=info ts=2018-01-31T07:04:52.608127694Z caller=main.go:227 host_details="(Linux 3.13.0 BrandZ virtual linux x86_64 cf71da7fa017 )"
level=info ts=2018-01-31T07:04:52.608164717Z caller=main.go:228 fd_limits="(soft=65536, hard=65536)"
level=info ts=2018-01-31T07:04:52.613690597Z caller=main.go:499 msg="Starting TSDB ..."
level=info ts=2018-01-31T07:04:52.614640861Z caller=web.go:383 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-01-31T07:05:20.177360395Z caller=main.go:509 msg="TSDB started"
level=info ts=2018-01-31T07:05:20.177635591Z caller=main.go:585 msg="Loading configuration file" filename=/griffin/prom/prometheus.yml
level=info ts=2018-01-31T07:05:20.178831559Z caller=main.go:486 msg="Server is ready to receive web requests."
level=info ts=2018-01-31T07:05:20.178851424Z caller=manager.go:59 component="scrape manager" msg="Starting scrape manager..."
level=info ts=2018-01-31T07:05:20.268231732Z caller=compact.go:387 component=tsdb msg="compact blocks" count=1 mint=1517371200000 maxt=1517378400000
level=info ts=2018-01-31T07:05:40.878286592Z caller=head.go:348 component=tsdb msg="head GC completed" duration=667.990406ms
level=info ts=2018-01-31T07:05:47.012690499Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=6.134198563s
level=info ts=2018-01-31T07:05:47.615045371Z caller=compact.go:387 component=tsdb msg="compact blocks" count=3 mint=1517356800000 maxt=1517378400000
level=info ts=2018-01-31T09:00:00.065031591Z caller=compact.go:387 component=tsdb msg="compact blocks" count=1 mint=1517378400000 maxt=1517385600000
level=info ts=2018-01-31T09:00:21.023672893Z caller=head.go:348 component=tsdb msg="head GC completed" duration=636.293717ms
level=info ts=2018-01-31T09:00:27.929158127Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=6.905299083s

Background

Before upgrading our current 1.8.2 to 2.1.0, we are doing performance tests. 2.1.0 is much better than 1.8.2 in terms of scraping performance, but its query performance is much worse.

Both Prometheus instances are scraping 1K targets every 40 sec, each target exporting 1K samples with 15 labels. Samples are gauges whose values are dynamically generated between -2^32 and 2^32. This amounts to 25K samples/sec. With the given container spec (64GB RAM, 800GB SSD), both show less than 1% CPU utilization, and 11% (2.1.0) vs. 21% (1.8.2) memory utilization when not queried.

The samples are simply metric1{label1="v1", label2="v2"..., label15="v15"}, metric2{...} ... metric1000{...}.
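
For reference, here is a minimal Go sketch of the kind of synthetic target described above; the real test exporter isn't included in this issue, so the handler, port, and the exact path are assumptions made for illustration. It serves 1000 gauge series with 15 labels and a random value in [-2^32, 2^32) on every scrape.

package main

import (
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"strings"
)

// metricsHandler emits 1000 gauge series, each carrying 15 labels, in the
// Prometheus text exposition format, with a fresh random value per scrape.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	labels := make([]string, 0, 15)
	for l := 1; l <= 15; l++ {
		labels = append(labels, fmt.Sprintf(`label%d="v%d"`, l, l))
	}
	labelSet := strings.Join(labels, ",")

	var b strings.Builder
	for i := 1; i <= 1000; i++ {
		// Random value in [-2^32, 2^32), as in the test description above.
		val := (rand.Float64()*2 - 1) * float64(int64(1)<<32)
		fmt.Fprintf(&b, "metric%d{%s} %g\n", i, labelSet, val)
	}
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, b.String())
}

func main() {
	// Path matches the metrics_path in the scrape config above;
	// the port is an arbitrary choice for this sketch.
	http.HandleFunc("/metrics_1000_15", metricsHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Scraping 1000 such targets every 40 sec gives the ~25K samples/sec ingest rate mentioned above.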

Performance Test

I used nGrinder to apply stress to Prometheus. nGrinder creates 300 virtual users, each querying Prometheus. The timeout is 6 sec, but none of the test cases below hit it.
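
The nGrinder script itself isn't included in the issue. As a rough equivalent, the Go sketch below reproduces the load pattern (300 concurrent clients, 6 sec timeout, the same instant query); the Prometheus address and the 60-second run length are assumptions.

package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	const workers = 300                             // concurrent virtual users
	const duration = 60 * time.Second               // assumed run length
	target := "http://localhost:9090/api/v1/query"  // assumed Prometheus address
	params := url.Values{"query": {`count(metric2{label1="v1"})`}}
	client := &http.Client{Timeout: 6 * time.Second} // same 6 sec timeout as the test

	var total uint64
	var wg sync.WaitGroup
	deadline := time.Now().Add(duration)

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for time.Now().Before(deadline) {
				resp, err := client.Get(target + "?" + params.Encode())
				if err != nil {
					continue // errors/timeouts are simply skipped in this sketch
				}
				io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
				resp.Body.Close()
				atomic.AddUint64(&total, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Printf("completed %d requests, ≈%.0f TPS\n", total, float64(total)/duration.Seconds())
}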

  • Query: metric1

2.1.0 shows ~330 TPS while 1.8.2 shows ~270 TPS. So far so good.

  • Query: count(metric2{label1="v1"})

1.8.2 shows about ~1.3K TPS while 2.1.0 ~470 TPS.
2.1.0 shows much higher CPU usage with similar memory utilization.

Screenshots attached: CPU, MEM (RSS), MEM (virtual), goroutines and Go threads, HTTP query duration (endpoint: /query), FS read/write bandwidth, and network connections.

CPU pprof:
https://drive.google.com/open?id=18qiiciQYLrT3TGpz9fcCl-q_1oYjPU96

HEAP pprof:
https://drive.google.com/open?id=1_zkIaGWknWEpaRnIG-w19wgCjpB8IFbX

Goroutine pprof:
https://drive.google.com/open?id=12dK4KesLeedTFWV4MwMqmf6gIGo6vuPb

Questions

CPU profiling indicates that 2.1.0 spends a lot of time reading data. I'm not sure, but this may be related to the variable-bit encoding that 2.x introduced to reduce its storage footprint. Anyway, the query filters and aggregates only the current values, which should be residing in memory for both Prometheus versions (in chunks or mmapped files), so it's very hard for me to understand these results.

2.1.0 shows many more active/established network connections and about twice as many goroutines. Could this be a cause?

Any help will be welcomed.


Edit: I fixed several typos and uploaded images with more system metrics.

TheyDroppedMe commented Jan 31, 2018

I am also seeing decreased performance with 2.1.0 compared to 2.0.0

brancz (Member) commented Feb 1, 2018

Could you check whether this is still the case when you build Prometheus from head of master? We recently fixed a performance issue that was introduced in v2.1.0 (#3715). A patch release should be coming soon.

roengram (Author) commented Feb 2, 2018

Thank you for your comment.

I've conducted the same test with the master version of Prometheus. Its version is as follows:

prometheus, version 2.1.0 (branch: master, revision: 5169ccf25850045d7d84469df8675bf9c239c017)
  build user:       hyejung@hyejung
  build date:       20180202-02:02:16
  go version:       go1.9.2

Unfortunately, the result is more or less the same. Query TPS and system metrics are all very similar to those of the previous 2.1.0.

From the CPU profiling I suspect this is because Prometheus 2 always has to decode data that has been encoded by taking deltas and XORing. I've read in articles that Prometheus 2 does not use memory chunks but relies on the page cache, and that it uses techniques involving delta/XOR compression. As I understand it, this means the ingested metrics are always stored directly to (mmapped) files in encoded form, so whenever Prometheus 2 needs to read them out, it has to decode them first.

Admittedly this is a wild guess. But if it's true, I would happily waste more storage for lower CPU consumption.
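
To make the decode-on-read idea concrete, here is a tiny, purely conceptual Go sketch of plain delta encoding. It is not Prometheus's actual chunk format (which is a Gorilla-style delta-of-delta/XOR bitstream), but it shows why reading back sample i always requires walking every delta stored before it.

package main

import "fmt"

// encodeDeltas stores each timestamp as the difference from the previous one.
func encodeDeltas(ts []int64) []int64 {
	out := make([]int64, len(ts))
	prev := int64(0)
	for i, t := range ts {
		out[i] = t - prev
		prev = t
	}
	return out
}

// decodeDeltas rebuilds the original timestamps; recovering sample i requires
// summing every delta before it, so every read pays a sequential decode cost.
func decodeDeltas(deltas []int64) []int64 {
	out := make([]int64, len(deltas))
	cur := int64(0)
	for i, d := range deltas {
		cur += d
		out[i] = cur
	}
	return out
}

func main() {
	ts := []int64{1517371200000, 1517371240000, 1517371280000}
	fmt.Println(decodeDeltas(encodeDeltas(ts))) // [1517371200000 1517371240000 1517371280000]
}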

brian-brazil (Member) commented Feb 2, 2018

300 relatively heavy PromQL queries per second is quite a bit; in practice that is the load of hundreds of dashboards being refreshed every 10 seconds. In this sort of case recording rules are recommended to reduce load, as you're hitting 1k metrics per query. I'm not sure this is a case we should be trading off RAM for. What sort of CPU is it?

roengram (Author) commented Feb 2, 2018

I'm testing in a private cloud environment, so it's hard to tell how much CPU power I actually use in terms of real hardware. At least I can tell that the Prometheus container uses 30 vCPUs and 64GB RAM on a 56-core, 256GB RAM machine. If you need more detail, please let me know.

I kept on testing, and an even stranger result came from a metadata query, /api/v1/series?match[]=metric1{label1="v1"}. In the same container and scrape setting as the above test, 1.8.2 yielded 530 TPS while 2.1 showed 11 TPS with 50 concurrent users querying.

I attach some data about the above test.

Screenshot attached: CPU.

pprof CPU profile: https://drive.google.com/open?id=1KKaYbbsDP9nbK0YID9Dy-zQK2WzX9JHJ
pprof heap: https://drive.google.com/open?id=1mW4y5usb-twJ9JSI5eX096nag5NS1N5S
pprof goroutine: https://drive.google.com/open?id=1wP5n9wdqFoTN_ZdIE60fUyDh-pAWY5_4

As @brian-brazil said, the workload I applied may not be very realistic, and we may never see such a workload on our real system. But what concerns me is the performance gap between 1.8 and 2.1 on the same container spec, because it will eventually determine whether we upgrade.

I'd like to know whether this is expected of Prometheus 2 or just a bug.

ddewar2k commented Feb 9, 2018

Don't know if this is related or not, but we are running Prometheus 2.1 in an OpenShift cluster that has 370 pods, with node_exporter and the kube-state-metrics exporter scraping metrics on all of them. With no Grafana running against it, the CPU usage is lower than that of the Heapster instance also running. But with each Grafana browser tab running the Kubernetes cluster monitoring dashboard we see pretty big jumps in CPU: with no Grafana pointed at Prometheus we see 0.097 cores, with 1 Grafana browser we see 1.5-2 cores, and with 5 browsers we hit 8 cores.
Screenshot attached.

roengram (Author) commented Feb 9, 2018

I've compared the query performance of 1.8.2 and 2.1, and the results are as follows:

  • api/v1/series?match[]=metric1{label1="v1"}
    • TPS (1.8): 260
    • TPS (2.1): 266
  • api/v1/query?query=count(metric2{label1="v1"})
    • TPS (1.8): 2,115
    • TPS (2.1): 361
  • api/v1/query?query=sum(metric1)by(label1)
    • TPS (1.8): 2,144
    • TPS (2.1): 777

Both Prometheus instances were running on m4.4xlarge.
All in all, the query performance difference is still there.

xThomo commented Feb 9, 2018

@roengram: @TheyDroppedMe mentioned above that there was also a drop in performance from 2.0 to 2.1. Do you have the numbers for 2.0 as well? Maybe there are other issues beyond #3715.

ddewar2k commented Feb 13, 2018

Note: the dashboard we are using is Kubernetes cluster monitoring (via Prometheus), and it's the CPU-related queries that seem to be a problem:

This query is used in one panel.
sum (rate (container_cpu_usage_seconds_total{image!="",name=~"^k8s_.*",kubernetes_io_hostname=~"^$Node$"}[$interval])) by (pod_name)

These queries are used in another panel.
sum (rate (container_cpu_usage_seconds_total{image!="",name=~"^k8s_.*",container_name!="POD",kubernetes_io_hostname=~"^$Node$"}[$interval])) by (container_name, pod_name)

sum (rate (container_cpu_usage_seconds_total{image!="",name!~"^k8s_.*",kubernetes_io_hostname=~"^$Node$"}[$interval])) by (kubernetes_io_hostname, name, image)

sum (rate (container_cpu_usage_seconds_total{rkt_container_name!="",kubernetes_io_hostname=~"^$Node$"}[$interval])) by (kubernetes_io_hostname, rkt_container_name)

Do these queries look onerous, or is it just the number of samples that is driving the high CPU usage?

I am going to guess many folks doing container work will use this dashboard.

brian-brazil (Member) commented Jun 12, 2018

Prometheus 2.3.0 includes a significant rewrite of PromQL which gives good performance improvements. There are unlikely to be many significant performance improvements beyond that. If you have ideas to optimise specific queries that seem slow, please let us know.

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited the conversation to collaborators on Mar 22, 2019
