add new JSON for Grafana dashboard #39

Merged
merged 2 commits into from Apr 1, 2020

Conversation

MichelDiz
Contributor

@MichelDiz MichelDiz commented Mar 25, 2020

This is a work in progress, as some metrics have issues; see dgraph-io/dgraph#4772.

ping @sleto-it



@sleto-it sleto-it requested a review from danielmai March 25, 2020 18:36
Contributor

@danielmai danielmai left a comment

We can add rate panels for both queries and mutations.
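
A rate panel could use targets along these lines. This is a minimal sketch, assuming both metrics are monotonically increasing counters and that a 5m window suits the scrape interval; the `legendFormat` strings and `refId` values are placeholders, not taken from the dashboard file.

```json
{
  "targets": [
    {
      "expr": "sum(rate(dgraph_num_queries_total{instance=~'$Instance'}[5m]))",
      "format": "time_series",
      "legendFormat": "Queries/s",
      "refId": "A"
    },
    {
      "expr": "sum(rate(dgraph_num_edges_total{instance=~'$Instance'}[5m]))",
      "format": "time_series",
      "legendFormat": "Edges/s",
      "refId": "B"
    }
  ]
}
```

Wrapping the counters in `rate()` plots per-second throughput instead of an ever-growing total, which tends to be easier to read on a graph.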

I like the gauges showing the memory usage for all the different instances, but the thresholds could be better: the gauges always show in the red threshold for me.

Reviewable status: 0 of 1 files reviewed, 8 unresolved discussions (waiting on @danielmai and @MichelDiz)


scripts/grafana_dashboard.json, line 151 at r1 (raw file):

          "timeFrom": null,
          "timeShift": null,
          "title": "Zero and Alpha",

This Panel is empty for me.

image.png

Can we remove this if this isn't showing anything? Or, do I need to use a specific version of Grafana? I'm currently using 6.1.6.


scripts/grafana_dashboard.json, line 224 at r1 (raw file):

          "targets": [
              {
                  "expr": "sum(dgraph_num_edges_total{instance=~'$Instance'})",

This is the number of processed mutations. So, it can be in a chart just like the "Processed Queries" chart.


scripts/grafana_dashboard.json, line 346 at r1 (raw file):

          "options": {},
          "pluginVersion": "6.6.1",
          "postfix": " /queries",

This metric is cut off for me.

image.png


scripts/grafana_dashboard.json, line 438 at r1 (raw file):

          "targets": [
              {
                  "expr": "dgraph_memory_inuse_bytes+dgraph_memory_idle_bytes{instance=~'$Instance'}",

In Grafana the legend doesn't show what this metric is. We should give a label.

Inuse+Idle ({{instance}})


scripts/grafana_dashboard.json, line 449 at r1 (raw file):

                  "intervalFactor": 2,
                  "legendFormat": "",
                  "metric": "dgraph_memory_proc_bytes",

We can give a cleaner label for metrics.

This one can be

`Proc ({{instance}})`
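
The two labels suggested for this panel could be set through `legendFormat`, roughly as follows. This is a sketch only: the first `expr` is copied from the file, while the second `expr` (derived from the `metric` field shown above) and the `refId` values are assumptions.

```json
{
  "targets": [
    {
      "expr": "dgraph_memory_inuse_bytes+dgraph_memory_idle_bytes{instance=~'$Instance'}",
      "intervalFactor": 2,
      "legendFormat": "Inuse+Idle ({{instance}})",
      "refId": "A"
    },
    {
      "expr": "dgraph_memory_proc_bytes{instance=~'$Instance'}",
      "intervalFactor": 2,
      "legendFormat": "Proc ({{instance}})",
      "refId": "B"
    }
  ]
}
```

Grafana substitutes `{{instance}}` with the value of the `instance` label on each returned series, so every Alpha and Zero gets its own legend entry.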


scripts/grafana_dashboard.json, line 828 at r1 (raw file):

              {
                  "expr": "dgraph_num_queries_total",
                  "format": "heatmap",

This should be Time series


scripts/grafana_dashboard.json, line 1111 at r1 (raw file):

              {
                  "expr": "dgraph_raft_applied_index{instance=~'$Instance'}",
                  "format": "heatmap",

This should be the time-series format.
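
For reference, a Prometheus target that a graph panel can plot sets `format` to `time_series` rather than `heatmap`. A minimal sketch, with `legendFormat` and `refId` as illustrative assumptions:

```json
{
  "targets": [
    {
      "expr": "dgraph_raft_applied_index{instance=~'$Instance'}",
      "format": "time_series",
      "legendFormat": "{{instance}}",
      "refId": "A"
    }
  ]
}
```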


scripts/grafana_dashboard.json, line 1301 at r1 (raw file):

          "targets": [
              {
                  "expr": "go_threads",

go_goroutines is a better metric to look at than go_threads.

Contributor Author

@MichelDiz MichelDiz left a comment

I think the version you are using must have some issues. Check this Docker Compose file, which I'm using to test Grafana: https://gist.github.com/MichelDiz/42954e321620159c872c35c20e9d85c6

Also, I have added a modified JSON based on your comments: https://gist.github.com/MichelDiz/42954e321620159c872c35c20e9d85c6#file-grafana_dgraphv2-json

Take a look at this screenshot: dgraph-io/dgraph#4759 (comment)

Reviewable status: 0 of 1 files reviewed, 8 unresolved discussions (waiting on @danielmai)


scripts/grafana_dashboard.json, line 151 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

This Panel is empty for me.

image.png

Can we remove this if this isn't showing anything? Or, do I need to use a specific version of Grafana? I'm currently using 6.1.6.

I'm using version 6.7.X (the latest one). I have tested it, and all panels are working fine. This panel shows the "state/health" of each instance.


scripts/grafana_dashboard.json, line 224 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

This is the number of processed mutations. So, it can be in a chart just like the "Processed Queries" chart.

This is intended to display "92 mi /RDFs" on the screen. By the way, I believe that this one has inaccurate values.


scripts/grafana_dashboard.json, line 346 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

This metric is cut off for me.

image.png

I increased that part.


scripts/grafana_dashboard.json, line 438 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

In Grafana the legend doesn't show what this metric is. We should give a label.

Inuse+Idle ({{instance}})

Done.


scripts/grafana_dashboard.json, line 449 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

We can give a cleaner label for metrics.

This one can be

`Proc ({{instance}})`

Done.


scripts/grafana_dashboard.json, line 828 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

This should be Time series

Done.


scripts/grafana_dashboard.json, line 1111 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

This should be the time-series format.

Done.


scripts/grafana_dashboard.json, line 1301 at r1 (raw file):

Previously, danielmai (Daniel Mai) wrote…

go_goroutines is a better metric to look at than go_threads.

go_goroutines is already in a gauge panel.

@MichelDiz
Contributor Author

@danielmai here is a screenshot with the changes you suggested.

Screenshot 2020-03-27 at 23 38 10

Contributor

@danielmai danielmai left a comment

We should simplify the dashboard, and then we can merge this PR:

Change:

  • Use a timeseries graph for goroutines
  • Change "/RDFs" to "/edges inserted". This metric counts the edges of mutations, including across replicas, so it can look 3x higher in a 3-Alpha group. We should clarify this panel: "92/RDFs" can sound like there are 92 edges in Dgraph itself (that can be queried), which isn't the case.

Add:

  • Timeseries graph for dgraph_num_edges_total
  • Timeseries graph for dgraph_num_queries_total

Remove:

  • Timeseries graph for go_threads (not really useful).
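
Two of the requested changes might look roughly like this in the dashboard JSON. This is a hedged sketch, not the merged file: the panel titles, the `description` text, the `valueName` setting, and the `instance` filter on `go_goroutines` are assumptions, and layout fields such as `gridPos` and `id` are omitted.

```json
{
  "panels": [
    {
      "type": "singlestat",
      "title": "Edges inserted",
      "description": "Sketch: counts edges written by mutations, including replicas",
      "postfix": " /edges inserted",
      "valueName": "current",
      "targets": [
        {
          "expr": "sum(dgraph_num_edges_total{instance=~'$Instance'})",
          "refId": "A"
        }
      ]
    },
    {
      "type": "graph",
      "title": "Goroutines",
      "description": "Sketch: replaces the go_threads graph",
      "lines": true,
      "targets": [
        {
          "expr": "go_goroutines{instance=~'$Instance'}",
          "format": "time_series",
          "legendFormat": "{{instance}}",
          "refId": "A"
        }
      ]
    }
  ]
}
```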

Reviewable status: 0 of 1 files reviewed, 8 unresolved discussions (waiting on @danielmai)

Removed some unuseful metrics.
Contributor Author

@MichelDiz MichelDiz left a comment

Reviewable status: 0 of 1 files reviewed, 8 unresolved discussions (waiting on @danielmai)


scripts/grafana_dashboard.json, line 151 at r1 (raw file):

Previously, MichelDiz (Michel Conrado) wrote…

I'm using version 6.7.X (the latest one). I have tested it, and all panels are working fine. This panel shows the "state/health" of each instance.

Done.


scripts/grafana_dashboard.json, line 224 at r1 (raw file):

Previously, MichelDiz (Michel Conrado) wrote…

This is intended to display "92 mi /RDFs" on the screen. By the way, I believe that this one has inaccurate values.

Done.


scripts/grafana_dashboard.json, line 346 at r1 (raw file):

Previously, MichelDiz (Michel Conrado) wrote…

I increased that part.

Done.


scripts/grafana_dashboard.json, line 1301 at r1 (raw file):

Previously, MichelDiz (Michel Conrado) wrote…

go_goroutines is already in a gauge panel.

Done.

@MichelDiz MichelDiz requested a review from danielmai April 1, 2020 02:12
@MichelDiz MichelDiz merged commit 489d4b2 into master Apr 1, 2020
@MichelDiz MichelDiz deleted the micheldiz/newGrafanaJSON branch April 1, 2020 03:40