Metrics missing from intel_powerstat #13098

Closed
azw71 opened this issue Apr 17, 2023 · 6 comments · Fixed by #14263
Assignees
Labels
docs Issues related to Telegraf documentation and configuration descriptions plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins

Comments


azw71 commented Apr 17, 2023

Relevant telegraf.conf

[[inputs.intel_powerstat]]
  package_metrics = ["current_power_consumption", "current_dram_power_consumption", "thermal_design_power", "max_turbo_frequency", "cpu_base_frequency"]
  cpu_metrics = ["cpu_frequency", "cpu_busy_frequency", "cpu_temperature", "cpu_c0_state_residency", "cpu_c1_state_residency", "cpu_c6_state_residency"]

Logs from Telegraf

2023-04-17T13:06:13Z I! Loading config file: /etc/telegraf/telegraf.conf
2023-04-17T13:06:13Z I! Starting Telegraf 1.26.1
2023-04-17T13:06:13Z I! Available plugins: 235 inputs, 9 aggregators, 27 processors, 22 parsers, 57 outputs, 2 secret-stores
2023-04-17T13:06:13Z I! Loaded inputs: cpu disk diskio exec intel_powerstat kernel mem net processes sensors smart swap system
2023-04-17T13:06:13Z I! Loaded aggregators:
2023-04-17T13:06:13Z I! Loaded processors:
2023-04-17T13:06:13Z I! Loaded secretstores:
2023-04-17T13:06:13Z W! Outputs are not used in testing mode!
2023-04-17T13:06:13Z I! Tags enabled: host=fatblock
2023-04-17T13:06:13Z D! [agent] Initializing plugins
2023-04-17T13:06:13Z D! [agent] Starting service inputs
> powerstat_package,active_cores=1,host=fatblock,package_id=0 max_turbo_frequency_mhz=4200i 1681736773000000000
> powerstat_package,active_cores=2,host=fatblock,package_id=0 max_turbo_frequency_mhz=4100i 1681736773000000000
> powerstat_package,active_cores=3,host=fatblock,package_id=0 max_turbo_frequency_mhz=4100i 1681736773000000000
> powerstat_package,active_cores=4,host=fatblock,package_id=0 max_turbo_frequency_mhz=4000i 1681736773000000000
> powerstat_package,host=fatblock,package_id=0 cpu_base_frequency_mhz=3600i 1681736773000000000
> powerstat_package,host=fatblock,package_id=0 thermal_design_power_watts=65 1681736773000000000
> powerstat_core,core_id=2,cpu_id=2,host=fatblock,package_id=0 cpu_frequency_mhz=4012.05 1681736773000000000
> powerstat_core,core_id=0,cpu_id=0,host=fatblock,package_id=0 cpu_frequency_mhz=4012.04 1681736773000000000
> powerstat_core,core_id=2,cpu_id=2,host=fatblock,package_id=0 cpu_temperature_celsius=42i 1681736773000000000
> powerstat_core,core_id=3,cpu_id=3,host=fatblock,package_id=0 cpu_frequency_mhz=4000 1681736773000000000
> powerstat_core,core_id=1,cpu_id=1,host=fatblock,package_id=0 cpu_frequency_mhz=4012.52 1681736773000000000
> powerstat_core,core_id=0,cpu_id=0,host=fatblock,package_id=0 cpu_temperature_celsius=38i 1681736773000000000
> powerstat_core,core_id=3,cpu_id=3,host=fatblock,package_id=0 cpu_temperature_celsius=38i 1681736773000000000
> powerstat_core,core_id=1,cpu_id=1,host=fatblock,package_id=0 cpu_temperature_celsius=37i 1681736773000000000
2023-04-17T13:06:13Z D! [agent] Stopping service inputs
2023-04-17T13:06:13Z D! [agent] Input channel closed
2023-04-17T13:06:13Z D! [agent] Stopped Successfully

System info

Telegraf 1.26.1, Debian Bullseye with Kernel 6.1

Docker

No response

Steps to reproduce

  1. Run Telegraf with the intel_powerstat plugin

...

Expected behavior

The powerstat plugin should deliver power consumption metrics and C0/C1/C6 state residency metrics.

Actual behavior

The metrics are missing, and no additional info or error is logged.

Additional info

Please document supported CPUs for uncore metrics, see
https://github.com/torvalds/linux/blob/master/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c

cat /proc/cpuinfo

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i3-9100 CPU @ 3.60GHz
stepping : 11
microcode : 0xf0
cpu MHz : 1100.005
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
vmx flags : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple pml ept_mode_based_exec
...

azw71 added the bug (unexpected problem or unintended behavior) label Apr 17, 2023
Author

azw71 commented Apr 17, 2023

andi@fatblock:/tmp$ lsmod | egrep -i "msr|rapl"
msr 16384 0
intel_rapl_msr 20480 0
intel_rapl_common 32768 1 intel_rapl_msr
rapl 20480 0

andi@fatblock:/tmp$ uname -a
Linux fatblock 6.1.0-0.deb11.5-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.12-1~bpo11+1 (2023-03-05) x86_64 GNU/Linux

Collaborator

p-zak commented Apr 17, 2023

> powerstat plugin should deliver power consumption metrics and information about c0/c1/c6 residency

@azw71
All metrics collected by the Intel PowerStat plugin are collected at fixed intervals. Metrics that report processor C-state residency or power are calculated over the elapsed interval. When it starts measuring, the plugin skips the first iteration for any metric that is based on a delta from the previous value.
(https://github.com/influxdata/telegraf/tree/master/plugins/inputs/intel_powerstat#metrics)

I see that you ran Telegraf in testing mode, which means that only one iteration of metric gathering was run.
Please run Telegraf in normal mode and check whether all metrics are gathered properly.
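
To illustrate why a single gather iteration cannot produce delta-based metrics, here is a minimal Python sketch. It is not the plugin's actual code: the `DeltaPowerMeter` class, its method names, and the sample values are hypothetical. It assumes a cumulative energy counter in microjoules and ignores counter wraparound.

```python
from typing import Optional


class DeltaPowerMeter:
    """Hypothetical helper: turns a cumulative energy counter into average power."""

    def __init__(self) -> None:
        self._last_energy_uj: Optional[int] = None
        self._last_time_s: Optional[float] = None

    def sample(self, energy_uj: int, time_s: float) -> Optional[float]:
        """Return average watts since the previous sample, or None if there is none."""
        prev_energy, prev_time = self._last_energy_uj, self._last_time_s
        # Always remember the current reading for the next call.
        self._last_energy_uj, self._last_time_s = energy_uj, time_s
        if prev_energy is None or prev_time is None or time_s <= prev_time:
            return None  # first iteration: nothing to diff against yet
        joules = (energy_uj - prev_energy) / 1_000_000
        return joules / (time_s - prev_time)


meter = DeltaPowerMeter()
first = meter.sample(energy_uj=5_000_000, time_s=0.0)     # made-up reading
second = meter.sample(energy_uj=135_000_000, time_s=10.0)  # made-up reading
print(first, second)
```

The first call has no previous reading and yields nothing, which corresponds to the single iteration that testing mode performs. Only the second call can report a value.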

> Please document supported CPUs for uncore metrics, see
> https://github.com/torvalds/linux/blob/master/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c

I believe that uncore metrics are documented in https://github.com/influxdata/telegraf/blob/master/plugins/inputs/intel_powerstat/README.md
Please let me know if something is missing.

Author

azw71 commented Apr 18, 2023

Thanks for your explanation. The missing metrics are indeed available after restarting Telegraf.

My note about the documentation refers to the fact that the intel_uncore_frequency module cannot be used on my CPU: loading it with modprobe fails with the message "no such device". The linked kernel source lists a few Xeon CPUs that differ from the CPUs mentioned in the powerstat module.

Collaborator

p-zak commented Apr 18, 2023

Yes, you are right.

The intel-uncore-frequency module can only be loaded on these models:

Model number    Processor name
0x55            Intel Skylake-X
0x6A            Intel IceLake-X
0x6C            Intel IceLake-D
0x47            Intel Broadwell-G
0x4F            Intel Broadwell-X
0x56            Intel Broadwell-D
0x8F            Intel Sapphire Rapids X
0xCF            Intel Emerald Rapids X

The plugin will be updated in the coming months, and this information will be added to the README and/or the code.

Thanks for finding this!
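
As a rough way to check a machine against the model list above, the following Python sketch looks up the `model` value from `/proc/cpuinfo` text. The function names are hypothetical and the table only mirrors the models quoted in this thread. It also assumes the CPU is family 6 and does not verify that.

```python
import re
from typing import Optional

# Model numbers quoted in this thread for the intel-uncore-frequency module.
UNCORE_FREQUENCY_MODELS = {
    0x55: "Intel Skylake-X",
    0x6A: "Intel IceLake-X",
    0x6C: "Intel IceLake-D",
    0x47: "Intel Broadwell-G",
    0x4F: "Intel Broadwell-X",
    0x56: "Intel Broadwell-D",
    0x8F: "Intel Sapphire Rapids X",
    0xCF: "Intel Emerald Rapids X",
}


def cpu_model(cpuinfo_text: str) -> Optional[int]:
    """Return the first 'model' value (printed in decimal by the kernel), or None."""
    match = re.search(r"^model\s*:\s*(\d+)\s*$", cpuinfo_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def uncore_frequency_name(model: int) -> Optional[str]:
    """Return the processor name if the model is in the list, else None."""
    return UNCORE_FREQUENCY_MODELS.get(model)


# Excerpt of the /proc/cpuinfo output posted in this issue.
sample = (
    "cpu family : 6\n"
    "model : 158\n"
    "model name : Intel(R) Core(TM) i3-9100 CPU @ 3.60GHz\n"
)
model = cpu_model(sample)
print(hex(model), uncore_frequency_name(model))
```

For the i3-9100 from this issue, the model is 158 (0x9e), which is not in the list. That is consistent with modprobe reporting "no such device". On a real system the text could be read from `/proc/cpuinfo` instead of the inline sample.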

Hipska added the docs (Issues related to Telegraf documentation and configuration descriptions) and plugin/input (1. Request for new input plugins 2. Issues/PRs that are related to input plugins) labels and removed the bug (unexpected problem or unintended behavior) label May 8, 2023
Contributor

powersj commented Nov 7, 2023

@p-zak given that the user's issue was resolved by not running in test mode, is this issue left open to document the supported models?

powersj added the waiting for response (waiting for response from contributor) label Nov 7, 2023
Collaborator

p-zak commented Nov 8, 2023

@powersj Exactly. It can be closed after the README changes are delivered.
