Sum of loaned data does not display #1404

Closed
sjvrijn opened this issue May 31, 2021 · 3 comments

sjvrijn commented May 31, 2021

Describe the bug
I am trying to display a combined utilization value for some of the multi-GPU machines I monitor, using an example from the documentation, but the placeholder sum is not shown in my graphs. There are no connection issues, and there is no error in the munin-graph log either.

I suspect I am missing something in my configuration, but having gone through the documentation multiple times, I can't find what it is, so any help would be much appreciated.

To Reproduce

  • A server redstone.example.com with two NVIDIA GPUs and the nvidia_gpu_[utilization] plugin

  • /etc/munin/munin.conf: (trimmed away all irrelevant machines/categories)

cgitmpdir /var/tmp
includedir /etc/munin/conf.d

# Generate all graphs on cron for faster page loads
graph_strategy cron

# Configure cgi url for dynazoom
#cgiurl_graph /munin-cgi/munin-cgi-graph

# Generate all HTML periodically on cron
html_strategy cron


[Servers;]
        node_order GPU

[Servers;GPU;]
        node_order Summary

[Servers;GPU;Summary]
        update no

        # works as expected
        cpuload.graph_title GPU machines - CPU utilization
        cpuload.graph_args --base 1000 -l 0 -u 100 -r
        cpuload.graph_vlabel CPU
        cpuload.graph_scale no
        cpuload.graph_category system
        cpuload.graph_info Current CPU utilization
        cpuload.graph_order \
                redstone=Servers;GPU;redstone.example.com:cpu.idle \
        cpuload.redstone.cdef 100,redstone,32,/,-
        cpuload.redstone.draw LINE

        # does not display `redstone`
        gpu_util.graph_title GPU machines - GPU utilization
        gpu_util.graph_args --base 1000 -l 0 -u 100 -r
        gpu_util.graph_vlabel GPU
        gpu_util.graph_category system
        gpu_util.graph_order \
                test=Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                redstone \
        gpu_util.redstone.sum \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization1


[Servers;GPU;runite.example.com]
        address runite.example.com

Expected behavior
An entry redstone that tracks the sum of the two GPU utilization values, akin to the donald_disk example from http://guide.munin-monitoring.org/en/latest/example/graph/aggregation-combined.html

Screenshots & Logs
graphs: (screenshot: munin_sum_issue)

/var/log/munin/munin-update.log:

2021/05/31 11:45:00 [INFO]: Starting munin-update
[...]
2021/05/31 11:45:09 [INFO] starting work in 3045 for redstone.example.com (redstone.example.com:4949).
[...]
2021/05/31 11:45:20 [INFO]: Munin-update finished for node Servers;GPU;redstone.example.com (11.10 sec)
[...]
2021/05/31 11:45:23 [INFO] Reaping Munin::Master::UpdateWorker<Servers;GPU;redstone.example.com>.  Exit value/signal: 0/0
[...]
2021/05/31 11:45:49 [INFO]: Munin-update finished (48.90 sec)

/var/log/munin/munin-graph.log:

2021/05/31 11:46:10 Starting munin-graph
2021/05/31 11:46:55 Munin-graph finished (44.50 sec)

(The 45 second duration is because my full config tracks ~30 machines instead of just redstone)

Desktop

  • CentOS Linux release 7.9.2009 (Core)
  • Munin Version 2.0.66 (Release 1.el7)

sjvrijn commented Jun 4, 2021

I have found the issue: the placeholder redstone requires a label to be defined before it shows up in the graphs, i.e. the line gpu_util.redstone.label redstone had to be added.

sumpfralle commented

Just for my better understanding:

Graph is missing?

        gpu_util.graph_title GPU machines - GPU utilization
        gpu_util.graph_args --base 1000 -l 0 -u 100 -r
        gpu_util.graph_vlabel GPU
        gpu_util.graph_category system
        gpu_util.graph_order \
                test=Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                redstone \
        gpu_util.redstone.sum \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization1

Graph is visible?

(.label is added before .sum)

        gpu_util.graph_title GPU machines - GPU utilization
        gpu_util.graph_args --base 1000 -l 0 -u 100 -r
        gpu_util.graph_vlabel GPU
        gpu_util.graph_category system
        gpu_util.graph_order \
                test=Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                redstone \
        gpu_util.redstone.label foo
        gpu_util.redstone.sum \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization1

Do I understand you correctly?
Is this related to .cdef vs. .sum (see your first example)?
Thank you!

sjvrijn commented Jul 23, 2021

@sumpfralle Yes, with the gpu_util.redstone.label foo line added, the entry becomes visible on the graph.
The difference is not due to cdef vs. sum, but because any entry displayed in a graph needs two things: data and a label. When an entry is created from a placeholder name in graph_order, both still have to be specified explicitly.
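
For reference, the resolved stanza might look like the sketch below. It is assembled only from lines already posted in this thread (the original gpu_util section plus the label line gpu_util.redstone.label redstone); the layout and line continuations are copied as posted and have not been checked against any Munin version other than the 2.0.66 setup described above.

```
[Servers;GPU;Summary]
        update no

        # The placeholder "redstone" needs both attributes:
        #   .label  makes the entry visible in the graph
        #   .sum    supplies the data (sum of the two loaned fields)
        gpu_util.graph_title GPU machines - GPU utilization
        gpu_util.graph_args --base 1000 -l 0 -u 100 -r
        gpu_util.graph_vlabel GPU
        gpu_util.graph_category system
        gpu_util.graph_order \
                test=Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                redstone \
        gpu_util.redstone.label redstone
        gpu_util.redstone.sum \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization0 \
                Servers;GPU;redstone.example.com:nvidia_gpu_utilization.utilization1
```

The label text itself appears to be arbitrary: both redstone and foo were reported to work in the comments above.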
