I0412 09:28:38.013326 1 gpu.go:47] Trying to initialize GPU collector using dcgm
W0412 09:28:38.014623 1 gpu_dcgm.go:104] There is no DCGM daemon running in the host: libdcgm.so not Found
W0412 09:28:38.014897 1 gpu_dcgm.go:108] Could not start DCGM. Error: libdcgm.so not Found
I0412 09:28:38.015031 1 gpu.go:54] Error initializing dcgm: not able to connect to DCGM: libdcgm.so not Found
I0412 09:28:38.015193 1 gpu.go:47] Trying to initialize GPU collector using nvidia-nvml
I0412 09:28:38.015502 1 gpu.go:54] Error initializing nvidia-nvml: failed to init nvml. ERROR_LIBRARY_NOT_FOUND
I0412 09:28:38.015652 1 gpu.go:47] Trying to initialize GPU collector using dummy
I0412 09:28:38.015796 1 gpu.go:51] Using dummy to obtain gpu power
I0412 09:28:38.046794 1 exporter.go:155] Kepler running on version: release-0.7.8
I0412 09:28:38.046874 1 config.go:280] using gCgroup ID in the BPF program: true
I0412 09:28:38.046897 1 config.go:282] kernel version: 5.15
I0412 09:28:38.046931 1 exporter.go:167] LibbpfBuilt: true, BccBuilt: false
I0412 09:28:38.046940 1 exporter.go:186] EnabledBPFBatchDelete: true
I0412 09:28:38.046977 1 rapl_msr_util.go:129] failed to open path /dev/cpu/0/msr: no such file or directory
I0412 09:28:38.047027 1 power.go:72] Unable to obtain power, use estimate method
I0412 09:28:38.047035 1 redfish.go:169] failed to get redfish credential file path
I0412 09:28:38.052031 1 acpi.go:67] Could not find any ACPI power meter path. Is it a VM?
I0412 09:28:38.052049 1 power.go:72] using none to obtain power
I0412 09:28:38.052079 1 exporter.go:201] Initializing the GPU collector
I0412 09:28:38.052443 1 watcher.go:66] Using in cluster k8s config
I0412 09:28:38.152938 1 watcher.go:137] k8s APIserver watcher was started
I0412 09:28:38.153044 1 prometheus_collector.go:92] Registered Container Prometheus metrics
I0412 09:28:38.153153 1 prometheus_collector.go:97] Registered VM Prometheus metrics
I0412 09:28:38.153222 1 prometheus_collector.go:101] Registered Node Prometheus metrics
libbpf: loading /var/lib/kepler/bpfassets/amd64_kepler.bpf.o
libbpf: elf: section(3) kprobe/finish_task_switch, size 2776, link 0, flags 6, type=1
libbpf: sec 'kprobe/finish_task_switch': found program 'kprobe__finish_task_switch' at insn offset 0 (0 bytes), code size 347 insns (2776 bytes)
libbpf: elf: section(4) .relkprobe/finish_task_switch, size 432, link 32, flags 40, type=9
libbpf: elf: section(5) tracepoint/irq/softirq_entry, size 144, link 0, flags 6, type=1
libbpf: sec 'tracepoint/irq/softirq_entry': found program 'kepler_irq_trace' at insn offset 0 (0 bytes), code size 18 insns (144 bytes)
libbpf: elf: section(6) .reltracepoint/irq/softirq_entry, size 16, link 32, flags 40, type=9
libbpf: elf: section(7) kprobe/mark_page_accessed, size 104, link 0, flags 6, type=1
libbpf: sec 'kprobe/mark_page_accessed': found program 'kprobe__mark_page_accessed' at insn offset 0 (0 bytes), code size 13 insns (104 bytes)
libbpf: elf: section(8) .relkprobe/mark_page_accessed, size 16, link 32, flags 40, type=9
libbpf: elf: section(9) kprobe/set_page_dirty, size 104, link 0, flags 6, type=1
libbpf: sec 'kprobe/set_page_dirty': found program 'kprobe__set_page_dirty' at insn offset 0 (0 bytes), code size 13 insns (104 bytes)
libbpf: elf: section(10) .relkprobe/set_page_dirty, size 16, link 32, flags 40, type=9
libbpf: elf: section(11) .data, size 4, link 0, flags 3, type=1
libbpf: elf: section(12) .bss, size 4, link 0, flags 3, type=8
libbpf: elf: section(13) .maps, size 416, link 0, flags 3, type=1
libbpf: elf: section(14) license, size 4, link 0, flags 3, type=1
libbpf: license of /var/lib/kepler/bpfassets/amd64_kepler.bpf.o is GPL
libbpf: elf: section(23) .BTF, size 35505, link 0, flags 0, type=1
libbpf: elf: section(25) .BTF.ext, size 2764, link 0, flags 0, type=1
libbpf: elf: section(32) .symtab, size 1320, link 1, flags 0, type=2
libbpf: looking for externs among 55 symbols...
libbpf: collected 0 externs total
libbpf: map 'processes': at sec_idx 13, offset 0.
libbpf: map 'processes': found type = 1.
libbpf: map 'processes': found key [6], sz = 4.
libbpf: map 'processes': found value [10], sz = 112.
libbpf: map 'processes': found max_entries = 32768.
libbpf: map 'pid_time': at sec_idx 13, offset 32.
libbpf: map 'pid_time': found type = 1.
libbpf: map 'pid_time': found key [6], sz = 4.
libbpf: map 'pid_time': found value [12], sz = 8.
libbpf: map 'pid_time': found max_entries = 32768.
libbpf: map 'cpu_cycles_event_reader': at sec_idx 13, offset 64.
libbpf: map 'cpu_cycles_event_reader': found type = 4.
libbpf: map 'cpu_cycles_event_reader': found key [2], sz = 4.
libbpf: map 'cpu_cycles_event_reader': found value [6], sz = 4.
libbpf: map 'cpu_cycles_event_reader': found max_entries = 128.
libbpf: map 'cpu_cycles': at sec_idx 13, offset 96.
libbpf: map 'cpu_cycles': found type = 2.
libbpf: map 'cpu_cycles': found key [6], sz = 4.
libbpf: map 'cpu_cycles': found value [12], sz = 8.
libbpf: map 'cpu_cycles': found max_entries = 128.
libbpf: map 'cpu_ref_cycles_event_reader': at sec_idx 13, offset 128.
libbpf: map 'cpu_ref_cycles_event_reader': found type = 4.
libbpf: map 'cpu_ref_cycles_event_reader': found key [2], sz = 4.
libbpf: map 'cpu_ref_cycles_event_reader': found value [6], sz = 4.
libbpf: map 'cpu_ref_cycles_event_reader': found max_entries = 128.
libbpf: map 'cpu_ref_cycles': at sec_idx 13, offset 160.
libbpf: map 'cpu_ref_cycles': found type = 2.
libbpf: map 'cpu_ref_cycles': found key [6], sz = 4.
libbpf: map 'cpu_ref_cycles': found value [12], sz = 8.
libbpf: map 'cpu_ref_cycles': found max_entries = 128.
libbpf: map 'cpu_instructions_event_reader': at sec_idx 13, offset 192.
libbpf: map 'cpu_instructions_event_reader': found type = 4.
libbpf: map 'cpu_instructions_event_reader': found key [2], sz = 4.
libbpf: map 'cpu_instructions_event_reader': found value [6], sz = 4.
libbpf: map 'cpu_instructions_event_reader': found max_entries = 128.
libbpf: map 'cpu_instructions': at sec_idx 13, offset 224.
libbpf: map 'cpu_instructions': found type = 2.
libbpf: map 'cpu_instructions': found key [6], sz = 4.
libbpf: map 'cpu_instructions': found value [12], sz = 8.
libbpf: map 'cpu_instructions': found max_entries = 128.
libbpf: map 'cache_miss_event_reader': at sec_idx 13, offset 256.
libbpf: map 'cache_miss_event_reader': found type = 4.
libbpf: map 'cache_miss_event_reader': found key [2], sz = 4.
libbpf: map 'cache_miss_event_reader': found value [6], sz = 4.
libbpf: map 'cache_miss_event_reader': found max_entries = 128.
libbpf: map 'cache_miss': at sec_idx 13, offset 288.
libbpf: map 'cache_miss': found type = 2.
libbpf: map 'cache_miss': found key [6], sz = 4.
libbpf: map 'cache_miss': found value [12], sz = 8.
libbpf: map 'cache_miss': found max_entries = 128.
libbpf: map 'task_clock_ms_event_reader': at sec_idx 13, offset 320.
libbpf: map 'task_clock_ms_event_reader': found type = 4.
libbpf: map 'task_clock_ms_event_reader': found key [2], sz = 4.
libbpf: map 'task_clock_ms_event_reader': found value [6], sz = 4.
libbpf: map 'task_clock_ms_event_reader': found max_entries = 128.
libbpf: map 'task_clock': at sec_idx 13, offset 352.
libbpf: map 'task_clock': found type = 2.
libbpf: map 'task_clock': found key [6], sz = 4.
libbpf: map 'task_clock': found value [12], sz = 8.
libbpf: map 'task_clock': found max_entries = 128.
libbpf: map 'cpu_freq_array': at sec_idx 13, offset 384.
libbpf: map 'cpu_freq_array': found type = 2.
libbpf: map 'cpu_freq_array': found key [6], sz = 4.
libbpf: map 'cpu_freq_array': found value [6], sz = 4.
libbpf: map 'cpu_freq_array': found max_entries = 128.
libbpf: map 'amd64_ke.data' (global data): at sec_idx 11, offset 0, flags 400.
libbpf: map 13 is "amd64_ke.data"
libbpf: map 'amd64_ke.bss' (global data): at sec_idx 12, offset 0, flags 400.
libbpf: map 14 is "amd64_ke.bss"
libbpf: sec '.relkprobe/finish_task_switch': collecting relocation for section(3) 'kprobe/finish_task_switch'
libbpf: sec '.relkprobe/finish_task_switch': relo #0: insn #0 against 'sample_rate'
libbpf: prog 'kprobe__finish_task_switch': found data map 13 (amd64_ke.data, sec 11, off 0) for insn 0
libbpf: sec '.relkprobe/finish_task_switch': relo #1: insn #4 against 'counter_sched_switch'
libbpf: prog 'kprobe__finish_task_switch': found data map 14 (amd64_ke.bss, sec 12, off 0) for insn 4
libbpf: sec '.relkprobe/finish_task_switch': relo #2: insn #35 against 'cpu_cycles_event_reader'
libbpf: prog 'kprobe__finish_task_switch': found map 2 (cpu_cycles_event_reader, sec 13, off 64) for insn #35
libbpf: sec '.relkprobe/finish_task_switch': relo #3: insn #52 against 'cpu_cycles'
libbpf: prog 'kprobe__finish_task_switch': found map 3 (cpu_cycles, sec 13, off 96) for insn #52
libbpf: sec '.relkprobe/finish_task_switch': relo #4: insn #66 against 'cpu_cycles'
libbpf: prog 'kprobe__finish_task_switch': found map 3 (cpu_cycles, sec 13, off 96) for insn #66
libbpf: sec '.relkprobe/finish_task_switch': relo #5: insn #71 against 'cpu_ref_cycles_event_reader'
libbpf: prog 'kprobe__finish_task_switch': found map 4 (cpu_ref_cycles_event_reader, sec 13, off 128) for insn #71
libbpf: sec '.relkprobe/finish_task_switch': relo #6: insn #83 against 'cpu_ref_cycles'
libbpf: prog 'kprobe__finish_task_switch': found map 5 (cpu_ref_cycles, sec 13, off 160) for insn #83
libbpf: sec '.relkprobe/finish_task_switch': relo #7: insn #95 against 'cpu_ref_cycles'
libbpf: prog 'kprobe__finish_task_switch': found map 5 (cpu_ref_cycles, sec 13, off 160) for insn #95
libbpf: sec '.relkprobe/finish_task_switch': relo #8: insn #100 against 'cpu_instructions_event_reader'
libbpf: prog 'kprobe__finish_task_switch': found map 6 (cpu_instructions_event_reader, sec 13, off 192) for insn #100
libbpf: sec '.relkprobe/finish_task_switch': relo #9: insn #116 against 'cpu_instructions'
libbpf: prog 'kprobe__finish_task_switch': found map 7 (cpu_instructions, sec 13, off 224) for insn #116
libbpf: sec '.relkprobe/finish_task_switch': relo #10: insn #130 against 'cpu_instructions'
libbpf: prog 'kprobe__finish_task_switch': found map 7 (cpu_instructions, sec 13, off 224) for insn #130
libbpf: sec '.relkprobe/finish_task_switch': relo #11: insn #135 against 'cache_miss_event_reader'
libbpf: prog 'kprobe__finish_task_switch': found map 8 (cache_miss_event_reader, sec 13, off 256) for insn #135
libbpf: sec '.relkprobe/finish_task_switch': relo #12: insn #147 against 'cache_miss'
libbpf: prog 'kprobe__finish_task_switch': found map 9 (cache_miss, sec 13, off 288) for insn #147
libbpf: sec '.relkprobe/finish_task_switch': relo #13: insn #161 against 'cache_miss'
libbpf: prog 'kprobe__finish_task_switch': found map 9 (cache_miss, sec 13, off 288) for insn #161
libbpf: sec '.relkprobe/finish_task_switch': relo #14: insn #169 against 'cpu_freq_array'
libbpf: prog 'kprobe__finish_task_switch': found map 12 (cpu_freq_array, sec 13, off 384) for insn #169
libbpf: sec '.relkprobe/finish_task_switch': relo #15: insn #182 against 'cpu_freq_array'
libbpf: prog 'kprobe__finish_task_switch': found map 12 (cpu_freq_array, sec 13, off 384) for insn #182
libbpf: sec '.relkprobe/finish_task_switch': relo #16: insn #194 against 'cpu_freq_array'
libbpf: prog 'kprobe__finish_task_switch': found map 12 (cpu_freq_array, sec 13, off 384) for insn #194
libbpf: sec '.relkprobe/finish_task_switch': relo #17: insn #217 against 'cpu_freq_array'
libbpf: prog 'kprobe__finish_task_switch': found map 12 (cpu_freq_array, sec 13, off 384) for insn #217
libbpf: sec '.relkprobe/finish_task_switch': relo #18: insn #226 against 'pid_time'
libbpf: prog 'kprobe__finish_task_switch': found map 1 (pid_time, sec 13, off 32) for insn #226
libbpf: sec '.relkprobe/finish_task_switch': relo #19: insn #235 against 'pid_time'
libbpf: prog 'kprobe__finish_task_switch': found map 1 (pid_time, sec 13, off 32) for insn #235
libbpf: sec '.relkprobe/finish_task_switch': relo #20: insn #247 against 'pid_time'
libbpf: prog 'kprobe__finish_task_switch': found map 1 (pid_time, sec 13, off 32) for insn #247
libbpf: sec '.relkprobe/finish_task_switch': relo #21: insn #252 against 'task_clock_ms_event_reader'
libbpf: prog 'kprobe__finish_task_switch': found map 10 (task_clock_ms_event_reader, sec 13, off 320) for insn #252
libbpf: sec '.relkprobe/finish_task_switch': relo #22: insn #266 against 'task_clock'
libbpf: prog 'kprobe__finish_task_switch': found map 11 (task_clock, sec 13, off 352) for insn #266
libbpf: sec '.relkprobe/finish_task_switch': relo #23: insn #279 against 'task_clock'
libbpf: prog 'kprobe__finish_task_switch': found map 11 (task_clock, sec 13, off 352) for insn #279
libbpf: sec '.relkprobe/finish_task_switch': relo #24: insn #285 against 'processes'
libbpf: prog 'kprobe__finish_task_switch': found map 0 (processes, sec 13, off 0) for insn #285
libbpf: sec '.relkprobe/finish_task_switch': relo #25: insn #309 against 'processes'
libbpf: prog 'kprobe__finish_task_switch': found map 0 (processes, sec 13, off 0) for insn #309
libbpf: sec '.relkprobe/finish_task_switch': relo #26: insn #341 against 'processes'
libbpf: prog 'kprobe__finish_task_switch': found map 0 (processes, sec 13, off 0) for insn #341
libbpf: sec '.reltracepoint/irq/softirq_entry': collecting relocation for section(5) 'tracepoint/irq/softirq_entry'
libbpf: sec '.reltracepoint/irq/softirq_entry': relo #0: insn #6 against 'processes'
libbpf: prog 'kepler_irq_trace': found map 0 (processes, sec 13, off 0) for insn #6
libbpf: sec '.relkprobe/mark_page_accessed': collecting relocation for section(7) 'kprobe/mark_page_accessed'
libbpf: sec '.relkprobe/mark_page_accessed': relo #0: insn #4 against 'processes'
libbpf: prog 'kprobe__mark_page_accessed': found map 0 (processes, sec 13, off 0) for insn #4
libbpf: sec '.relkprobe/set_page_dirty': collecting relocation for section(9) 'kprobe/set_page_dirty'
libbpf: sec '.relkprobe/set_page_dirty': relo #0: insn #4 against 'processes'
libbpf: prog 'kprobe__set_page_dirty': found map 0 (processes, sec 13, off 0) for insn #4
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
libbpf: map 'processes': created successfully, fd=9
libbpf: map 'pid_time': created successfully, fd=10
libbpf: map 'cpu_cycles_event_reader': created successfully, fd=11
libbpf: map 'cpu_cycles': created successfully, fd=12
libbpf: map 'cpu_ref_cycles_event_reader': created successfully, fd=13
libbpf: map 'cpu_ref_cycles': created successfully, fd=14
libbpf: map 'cpu_instructions_event_reader': created successfully, fd=15
libbpf: map 'cpu_instructions': created successfully, fd=16
libbpf: map 'cache_miss_event_reader': created successfully, fd=17
libbpf: map 'cache_miss': created successfully, fd=18
libbpf: map 'task_clock_ms_event_reader': created successfully, fd=19
libbpf: map 'task_clock': created successfully, fd=20
libbpf: map 'cpu_freq_array': created successfully, fd=21
libbpf: map 'amd64_ke.data': created successfully, fd=22
libbpf: map 'amd64_ke.bss': created successfully, fd=23
libbpf: sec 'kprobe/finish_task_switch': found 2 CO-RE relocations
libbpf: CO-RE relocating [58] struct pt_regs: found target candidate [172] struct pt_regs in [vmlinux]
libbpf: prog 'kprobe__finish_task_switch': relo #0: [58] struct pt_regs.di (0:14 @ offset 112)
libbpf: prog 'kprobe__finish_task_switch': relo #0: matching candidate #0 [172] struct pt_regs.di (0:14 @ offset 112)
libbpf: prog 'kprobe__finish_task_switch': relo #0: patched insn #15 (LDX/ST/STX) off 112 -> 112
libbpf: CO-RE relocating [62] struct task_struct: found target candidate [128] struct task_struct in [vmlinux]
libbpf: prog 'kprobe__finish_task_switch': relo #1: [62] struct task_struct.tgid (0:86 @ offset 2780)
libbpf: prog 'kprobe__finish_task_switch': relo #1: matching candidate #0 [128] struct task_struct.tgid (0:76 @ offset 2500)
libbpf: prog 'kprobe__finish_task_switch': relo #1: patched insn #16 (ALU/ALU64) imm 2780 -> 2500
libbpf: sec 'tracepoint/irq/softirq_entry': found 1 CO-RE relocations
libbpf: CO-RE relocating [405] struct trace_event_raw_softirq: found target candidate [14591] struct trace_event_raw_softirq in [vmlinux]
libbpf: prog 'kepler_irq_trace': relo #0: [405] struct trace_event_raw_softirq.vec (0:1 @ offset 12)
libbpf: prog 'kepler_irq_trace': relo #0: matching candidate #0 [14591] struct trace_event_raw_softirq.vec (0:1 @ offset 8)
libbpf: prog 'kepler_irq_trace': relo #0: patched insn #3 (LDX/ST/STX) off 12 -> 8
libbpf: prog 'kprobe__finish_task_switch': failed to create kprobe 'finish_task_switch+0x0' perf event: No such file or directory
I0412 09:28:38.245447 1 libbpf_attacher.go:128] failed to attach kprobe/finish_task_switch: failed to attach finish_task_switch k(ret)probe to program kprobe__finish_task_switch: no such file or directory. Try finish_task_switch.isra.0
W0412 09:28:38.273940 1 libbpf_attacher.go:187] could not attach perf event cpu_cycles_event_reader: failed to open bpf perf event on cpu 0: no such file or directory. Are you using a VM?
W0412 09:28:38.274170 1 libbpf_attacher.go:187] could not attach perf event cpu_ref_cycles_event_reader: failed to open bpf perf event on cpu 0: no such file or directory. Are you using a VM?
W0412 09:28:38.274235 1 libbpf_attacher.go:187] could not attach perf event cpu_instructions_event_reader: failed to open bpf perf event on cpu 0: no such file or directory. Are you using a VM?
W0412 09:28:38.274270 1 libbpf_attacher.go:187] could not attach perf event cache_miss_event_reader: failed to open bpf perf event on cpu 0: no such file or directory. Are you using a VM?
I0412 09:28:38.274386 1 libbpf_attacher.go:195] Successfully load eBPF module from libbpf object
I0412 09:28:38.274472 1 process_energy.go:114] Using the Ratio/DynPower Power Model to estimate Process Platform Power
I0412 09:28:38.274566 1 process_energy.go:115] Process feature names: [bpf_cpu_time_ms]
I0412 09:28:38.274663 1 process_energy.go:124] Using the Ratio/DynPower Power Model to estimate Process Component Power
I0412 09:28:38.274717 1 process_energy.go:125] Process feature names: [bpf_cpu_time_ms bpf_cpu_time_ms bpf_cpu_time_ms gpu_compute_util]
I0412 09:28:38.275449 1 node_platform_energy.go:52] Using the Regressor/AbsPower Power Model to estimate Node Platform Power
I0412 09:28:38.275702 1 node_component_energy.go:56] Using the Regressor/AbsPower Power Model to estimate Node Component Power
I0412 09:28:38.275873 1 exporter.go:270] Started Kepler in 229.061067ms