
FYI: Prometheus federation master: 60s flamegraph #2010

Closed
mjtrangoni opened this Issue Sep 20, 2016 · 5 comments

@mjtrangoni
Contributor

mjtrangoni commented Sep 20, 2016

What did you do?

Just to let you know: I created a 60s FlameGraph of our Prometheus federation master using go-torch.
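For anyone wanting to reproduce this, a go-torch invocation along these lines produces such a profile (the host/port below is an example, not from this issue):

```shell
# Sample the Prometheus process for 60 seconds via its /debug/pprof
# endpoint and render a flame graph SVG (written to torch.svg by default).
go-torch -u http://prometheus.example.com:9090 -t 60
```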

What did you expect to see?

I am surprised by the GC cpu time.

What did you see instead? Under which circumstances?

62% of CPU time spent in the garbage collector. Is this expected? This is just FYI. Feel free to close this issue if this is expected behavior.

Environment
CentOS Linux release 7.2.1511 (Core)

  • System information:
    Linux 3.10.0-327.28.3.el7.x86_64 x86_64
  • Prometheus version:
    prometheus, version 1.1.3 (branch: v1.1.3_build, revision: ac374aa)
    build date: 20160916-12:58:14
    go version: go1.7.1
  • Master (Slave0):
    • Ingested Samples: 2.889k/s
    • Time Series: 479.4k
    • Memory: 4.983GiB
  • Slave1:
    • Ingested Samples: 5.38k/s
    • Time Series: 80.94k
    • Memory: 3.769GiB
  • Slave2:
    • Ingested Samples: 13.83k/s
    • Time Series: 333.2k
    • Memory: 4.375GiB

This zip contains the FlameGraph svg:

prometheus_federation_60s.zip

How to read the FlameGraph: "The x-axis shows the stack profile population, sorted alphabetically (it is not the passage of time), and the y-axis shows stack depth. Each rectangle represents a stack frame. The wider a frame is, the more often it was present in the stacks. The top edge shows what is on-CPU, and beneath it is its ancestry. The colors are usually not significant, picked randomly to differentiate frames." Source

@beorn7

Member

beorn7 commented Sep 20, 2016

  • What's your total CPU usage? (If you are light on CPU usage, then 62% is just a large chunk of a very small CPU usage.)
  • Does the flame graph look different if you take it multiple times? (GC might just have kicked in for this one, and it might look different at another time.)
@mjtrangoni

Contributor Author

mjtrangoni commented Sep 22, 2016

Hi @beorn7,

  • What's your total CPU usage? (If you are light on CPU usage, then 62% is just a large chunk of a very small CPU usage.)

    Prometheus uses between 5% and 500% CPU, depending on the number of open threads. See:

    # ps -eLf | grep /usr/sbin/prometheus | wc -l
    39
    # cat /proc/cpuinfo  | grep 'model name' | uniq -c
     24 model name      : Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz
    # vmstat 1
    procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
    r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
    0  1      0 67071196   1812 23299532    0    0    11   275   21    6  5  0 92  4  0
    1  2      0 67068400   1812 23299564    0    0     0  5840 1293 2100  1  0 93  6  0
    0  2      0 67059408   1812 23299608    0    0     0  4624 2085 2628  4  0 87  8  0
    0  1      0 67044472   1812 23299564    0    0     0  3072 1415 1871  2  0 90  7  0
    2  1      0 67071800   1812 23299568    0    0     0  3068 1338 2338  1  0 94  4  0
    0  2      0 67086432   1812 23299596    0    0     0  2820 1118 1994  1  0 95  4  0
    1  2      0 67125568   1812 23299564    0    0     0 11656 1536 2493  1  0 90  8  0
    0  2      0 67126040   1812 23299624    0    0     0  2760  869 1600  0  0 91  8  0
    1  1      0 67122192   1812 23299644    0    0     0  1412 1197 2431  1  0 94  5  0
    1  1      0 67113952   1812 23299660    0    0     0  1536 1277 1975  2  0 94  4  0
    0  1      0 67141696   1812 23299576    0    0     0  1796 1562 2769  1  0 94  4  0
    0  1      0 67143440   1812 23299568    0    0     0  1928  792 1576  0  0 95  5  0
    0  1      0 67114048   1812 23299600    0    0     0  2048 2678 2635  7  0 89  4  0
    4  2      0 67092572   1812 23299544    0    0     0  4016 2448 2324  6  0 87  6  0
    
  • Does the flame graph look different if you take it multiple times? (GC might just have kicked in for this one, and it might look different at another time.)

    I just realized that -log.level=debug was active during my first test, so I made two FlameGraphs, one with and one without debug mode. See this zip: prometheus_federation_600s.zip.

    In the new FlameGraphs the GC CPU usage is not as large as before.

    A co-worker remarked that Prometheus uses golang/protobuf instead of gogo/protobuf. See e.g. go_serialization_benchmarks. Any thoughts on that?

    benchmark                                   iter           time/iter      bytes alloc               allocs
    ---------                                   ----           ---------      -----------               ------
    BenchmarkGoprotobufMarshal-8                2000000        746 ns/op      312 B/op        4 allocs/op
    BenchmarkGoprotobufUnmarshal-8              2000000        978 ns/op      432 B/op        9 allocs/op
    BenchmarkGogoprotobufMarshal-8             10000000        211 ns/op       64 B/op        1 allocs/op
    BenchmarkGogoprotobufUnmarshal-8            5000000        289 ns/op       96 B/op        3 allocs/op
    
@beorn7

Member

beorn7 commented Sep 22, 2016

In general, I would expect a server that does nothing but scrape a light load and then federate everything further up to have a CPU profile with a significant GC share, as it's mostly shoveling data around with little else for the CPU to do.
Your "normal mode" flame graph doesn't look suspicious to me.

Having said all that, https://github.com/gogo/protobuf looks like the package I have been looking for all along, given how allocation-wasteful the standard protobuf implementation is. I just had a first glance, and I didn't know the package before. This is definitely something we should look into. Thanks for the pointer.

The practical impact might be limited, as protobuf encoding and decoding is rarely a bottleneck, but every little bit helps, and it makes for more idiomatic code anyway. @juliusv Perhaps that's something that can be stuffed into the proto3 transition, too?
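One low-overhead way to confirm the GC's share of CPU time on a running server is the runtime's GC trace (the binary path is taken from the `ps` output above; Prometheus flags are elided):

```shell
# Each "gc N @Ms P%: ..." line on stderr reports, among other things,
# the percentage of CPU time spent in GC since program start (the P% field).
GODEBUG=gctrace=1 /usr/sbin/prometheus 2> gctrace.log
```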

@beorn7

Member

beorn7 commented Sep 22, 2016

I've filed prometheus/client_model#15 about the gogo/protobuf idea. Will close this here as there is no real issue left.

@beorn7 beorn7 closed this Sep 22, 2016

@lock


lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019
