grafana-server use 100% cpu #16118

Closed
aderumier opened this issue Mar 20, 2019 · 4 comments
Labels
needs investigation
needs more info

Comments

@aderumier

For the past few days, grafana-server has been using 100% CPU (whether I give the VM 2, 4, or 8 cores, all of them are fully used).
After restarting the server, the CPU is back at 100% right after startup,
even with no external access to the GUI.

I don't see anything unusual in the log:

t=2019-03-20T19:37:47+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:37:56+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:38:06+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:38:16+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:38:26+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:38:36+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:38:46+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:38:56+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:39:06+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:39:16+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:39:26+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:39:37+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:39:47+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:39:56+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:40:06+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:40:17+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:40:26+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:40:37+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:40:46+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:40:56+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:41:06+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:41:16+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:41:27+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:41:36+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:41:47+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0
t=2019-03-20T19:41:56+0100 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=0

I have 1300 dashboards in the database. (It's a multi-organisation Grafana, with around 50 organisations with 30 dashboards each.)

Here is a perf top; it seems to be related to JSON parsing:

     9,64%     9,54%  grafana-server  grafana-server     [.] encoding/json.(*decodeState).scanWhile
+    9,01%     9,01%  grafana-server  grafana-server     [.] encoding/json.(*Decoder).readValue
+    6,71%     6,71%  grafana-server  grafana-server     [.] runtime.mallocgc
+    6,60%     6,60%  grafana-server  grafana-server     [.] encoding/json.stateEndValue
+    6,60%     6,60%  grafana-server  grafana-server     [.] runtime.scanobject
+    6,39%     6,39%  grafana-server  grafana-server     [.] runtime.findObject
+    5,77%     5,77%  grafana-server  grafana-server     [.] encoding/json.stateInString
+    4,09%     4,09%  grafana-server  grafana-server     [.] runtime.mapassign_faststr
+    3,04%     3,04%  grafana-server  grafana-server     [.] crypto/md5.block
+    3,04%     3,04%  grafana-server  grafana-server     [.] runtime.memclrNoHeapPointers
+    2,83%     2,83%  grafana-server  grafana-server     [.] encoding/json.unquoteBytes
+    2,62%     2,62%  grafana-server  grafana-server     [.] encoding/json.stateBeginString
+    2,62%     2,62%  grafana-server  grafana-server     [.] runtime.heapBitsSetType
+    2,41%     2,41%  grafana-server  grafana-server     [.] runtime.greyobject
+    2,31%     2,20%  grafana-server  grafana-server     [.] runtime.gcWriteBarrier
+    2,10%     2,10%  grafana-server  grafana-server     [.] runtime.wbBufFlush1
+    2,10%     1,99%  grafana-server  grafana-server     [.] encoding/json.stateBeginValue
+    1,57%     1,57%  grafana-server  grafana-server     [.] runtime.memhash
+    1,57%     0,10%  grafana-server  grafana-server     [.] syscall.Syscall
+    1,57%     0,00%  grafana-server  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
+    1,57%     0,21%  grafana-server  [kernel.kallsyms]  [k] do_syscall_64
+    1,47%     1,47%  grafana-server  grafana-server     [.] encoding/json.stateBeginStringOrEmpty
+    1,36%     1,26%  grafana-server  grafana-server     [.] runtime.memmove
+    0,94%     0,94%  grafana-server  grafana-server     [.] runtime.bulkBarrierPreWrite
+    0,84%     0,84%  grafana-server  grafana-server     [.] encoding/json.(*decodeState).objectInterface
+    0,84%     0,84%  grafana-server  grafana-server     [.] runtime.evacuate_faststr
+    0,73%     0,73%  grafana-server  grafana-server     [.] encoding/json.stateBeginValueOrEmpty
+    0,73%     0,73%  grafana-server  grafana-server     [.] encoding/json.unquote
+    0,73%     0,73%  grafana-server  grafana-server     [.] runtime.scanblock
+    0,63%     0,00%  grafana-server  [kernel.kallsyms]  [k] ksys_read
+    0,63%     0,00%  grafana-server  [kernel.kallsyms]  [k] vfs_read
     0,63%     0,21%  grafana-server  [kernel.kallsyms]  [k] xfs_file_read_iter
+    0,63%     0,00%  grafana-server  [kernel.kallsyms]  [k] new_sync_read
+    0,52%     0,52%  grafana-server  grafana-server     [.] encoding/json.(*decodeState).arrayInterface
+    0,52%     0,52%  grafana-server  grafana-server     [.] encoding/json.(*decodeState).literalInterface
+    0,52%     0,52%  grafana-server  grafana-server     [.] runtime.slicebytetostring
     0,52%     0,21%  grafana-server  [kernel.kallsyms]  [k] exit_to_usermode_loop
+    0,52%     0,52%  grafana-server  grafana-server     [.] runtime.gcDrain
     0,42%     0,42%  grafana-server  grafana-server     [.] runtime.gentraceba

Environment:

  • Grafana version: 6.0.2 or 5.4.3
  • Data source type & version: influxdb
  • OS Grafana is installed on: debian stretch
  • User OS & Browser: don't know
  • Grafana plugins: no
  • Others:
@torkelo
Member

torkelo commented Mar 20, 2019

How many alert rules? Is provisioning of dashboards being used?

Grafana version: 6.0.2 or 5.4.3

Both?

torkelo added the needs more info and needs investigation labels Mar 20, 2019
@aderumier
Author

Hi, thanks for helping.

How many alert rules?
Alert rules are not used.

Is provisioning of dashboards being used?
Yes.
I have 20 JSON files in /var/lib/grafana/dashboards/ (not updated for months).

These dashboards are shared between 85 organisations.
(I also have 85 YAML files, one provider per organisation, in /etc/grafana/provisioning/dashboards, and 85 datasource YAML files in /etc/grafana/provisioning/datasources.)

Grafana version: 6.0.2 or 5.4.3
Both?

Yes, both. The problem first appeared on a running 5.4.0 instance (which had been working fine for weeks). I then upgraded to 5.4.3, with the same problem, and then to 6.0.2, again with the same problem.

@torkelo
Member

torkelo commented Apr 1, 2019

Have you specified updateIntervalSeconds in the dashboard YAML files?
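
For reference, updateIntervalSeconds is set per provider in each dashboard provisioning file. Below is a minimal sketch in Python that renders one such file for a given organisation. The helper name render_provider, the "org-<id>" naming scheme, and the example path are illustrative assumptions, not taken from the reporter's setup; the YAML keys follow the usual file-provider layout and should be checked against the documentation for the Grafana version in use.

```python
# Sketch: render a Grafana dashboard provider file for one organisation.
# render_provider, the "org-<id>" naming and the example path are hypothetical.

PROVIDER_TEMPLATE = """\
apiVersion: 1

providers:
  - name: 'org-{org_id}'
    orgId: {org_id}
    folder: ''
    type: file
    updateIntervalSeconds: {interval}
    options:
      path: {path}
"""


def render_provider(org_id: int, path: str, interval: int = 3600) -> str:
    """Return the YAML text for a single file-based dashboard provider."""
    if interval <= 0:
        raise ValueError("interval must be a positive number of seconds")
    return PROVIDER_TEMPLATE.format(org_id=org_id, interval=interval, path=path)


if __name__ == "__main__":
    # Print the provider file for organisation 1 with a one-hour scan interval.
    print(render_provider(1, "/var/lib/grafana/dashboards/org-1"))
```

Generating the files from one template keeps the interval consistent across all organisations, so a too-low value cannot slip into a single provider unnoticed.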

These dashboards are shared between 85 organisations.

You share the same JSON files between orgs? Do the files contain a dashboard with a uid? If so, that could be very problematic. Make sure none of the JSON files has an id or uid property, and set updateIntervalSeconds to some high value like 360 (given that you have so many different provisioning setups). Preferably each org should have its own mapped files: you cannot share JSON files between different provisioned orgs. Try creating a separate folder for each org and placing copies of each JSON file in those.
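
The advice above (one folder per org, with no id or uid in the JSON) could be automated roughly as follows. This is only a sketch: the function names, the "org-<id>" folder layout, and the choice to strip only the top-level id and uid properties are assumptions for illustration.

```python
# Sketch: give every organisation its own copy of each dashboard JSON file,
# with the top-level "id" and "uid" properties removed.
# copy_dashboards_per_org and the "org-<id>" folder layout are hypothetical.
import json
from pathlib import Path
from typing import Sequence


def strip_identity(dashboard: dict) -> dict:
    """Return a copy of the dashboard without its top-level id and uid."""
    return {k: v for k, v in dashboard.items() if k not in ("id", "uid")}


def copy_dashboards_per_org(src: Path, dest_root: Path, org_ids: Sequence[int]) -> int:
    """Write one cleaned copy of every *.json file in src for each organisation.

    Returns the number of files written.
    """
    written = 0
    for json_file in sorted(src.glob("*.json")):
        cleaned = strip_identity(json.loads(json_file.read_text(encoding="utf-8")))
        for org_id in org_ids:
            org_dir = dest_root / f"org-{org_id}"
            org_dir.mkdir(parents=True, exist_ok=True)
            (org_dir / json_file.name).write_text(
                json.dumps(cleaned, indent=2), encoding="utf-8"
            )
            written += 1
    return written
```

Each provider's path option would then point at its own org folder instead of the shared directory.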

@torkelo torkelo closed this as completed Apr 1, 2019
@aderumier
Author

Hi,

and set updateIntervalSeconds to some high value like 360

Indeed, it was set to 3s for each organisation, which explains why the CPU usage was so high.
I have set it to 1h, and the CPU problem is gone.

Thanks !
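
A back-of-the-envelope estimate shows why the interval matters so much with the numbers from this thread (85 providers, 20 JSON files). The sketch below assumes that every provider re-reads and re-parses all of its files once per interval; that is an assumption consistent with the JSON-heavy profile above, not something confirmed in the thread, and parses_per_second is a hypothetical helper.

```python
# Back-of-the-envelope estimate of provisioning load, ASSUMING every provider
# re-reads and re-parses all of its dashboard files once per interval.
# parses_per_second is a hypothetical helper, not a Grafana API.


def parses_per_second(providers: int, files_per_provider: int, interval_s: float) -> float:
    """Average number of dashboard files parsed per second."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return providers * files_per_provider / interval_s


if __name__ == "__main__":
    # Compare the original 3 s interval with the suggested 360 s and the final 1 h.
    for label, interval in (("3 s", 3), ("360 s", 360), ("1 h", 3600)):
        rate = parses_per_second(85, 20, interval)
        print(f"{label:>5}: {rate:8.2f} file parses per second")
```

Under that assumption, 85 × 20 = 1700 files per scan works out to roughly 567 parses per second at 3 s, about 4.7 at 360 s, and under 0.5 at 1 h.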
