Windows Performance Counters input plugin doesn't gather data from multiple instances #4280

Closed
vlastahajek opened this issue Jun 13, 2018 · 23 comments
Labels
area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) platform/windows

Comments

@vlastahajek
Contributor

Relevant telegraf.conf:

[[inputs.win_perf_counters]]
  ## By default this plugin returns basic CPU and Disk statistics.
  ## See the README file for more examples.
  ## Uncomment examples below or write your own as you see fit. If the system
  ## being polled for data does not have the Object at startup of the Telegraf
  ## agent, it will not be gathered.
  ## Settings:
  # PrintValid = false # Print All matching performance counters
  PreVistaSupport=false
  UseWildcardsExpansion=false

  [[inputs.win_perf_counters.object]]
    ObjectName = "Process"
    #Instances = ["chrome", "chrome#1", "chrome#2"]
    Instances = ["*"]
    Counters = [
      "% Processor Time"
    ]
    Measurement = "win_proc"

System info:

Telegraf 1.7.0
Windows 10 64-bit

Expected behavior:

> win_proc,host=T480,instance=aesm_service,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=AppleMobileDeviceService,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=ApplicationFrameHost#1,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=ApplicationFrameHost,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=ApsInsSvc,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=audiodg,objectname=Process Percent_Processor_Time=1.547394037246704 1528893106000000000
> win_proc,host=T480,instance=Calculator,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=cmd#1,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=cmd#2,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=cmd,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#1,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#2,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#3,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#4,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#5,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#6,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#7,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost#8,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=conhost,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=csrss#1,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=csrss#2,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=csrss,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=ctfmon#1,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=ctfmon,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=dasHost,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=DbxSvc,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=dllhost#1,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=dllhost#2,objectname=Process Percent_Processor_Time=0 1528893106000000000
> win_proc,host=T480,instance=dllhost,objectname=Process Percent_Processor_Time=0 1528893106000000000

Actual behavior:

> win_proc,host=T480,instance=aesm_service,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=AppleMobileDeviceService,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=ApplicationFrameHost,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=ApsInsSvc,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=audiodg,objectname=Process Percent_Processor_Time=1.5502903461456299 1528892485000000000
> win_proc,host=T480,instance=Calculator,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=cmd,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=conhost,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=csrss,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=ctfmon,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=dasHost,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=DbxSvc,objectname=Process Percent_Processor_Time=0 1528892485000000000
> win_proc,host=T480,instance=dllhost,objectname=Process Percent_Processor_Time=0 1528892485000000000

Additional info:

The problem is that PdhGetFormattedCounterArray returns instance names without an instance index.
Because values are grouped by instance name, each instance overwrites the value of the previous instance with the same name.
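
The overwrite can be illustrated with a minimal Go sketch. It is not the plugin's actual code: the `sample` type and the `groupByInstance` function are hypothetical names, and the only assumption taken from this issue is that every returned element carries an instance name without its `#N` index.

```go
package main

import "fmt"

// sample mimics one element returned for a wildcard counter: an instance
// name with no "#N" index, plus a formatted value.
// Hypothetical type, not Telegraf's.
type sample struct {
	Instance string
	Value    float64
}

// groupByInstance keys values by instance name alone, so a later sample
// with the same name replaces the earlier one.
func groupByInstance(samples []sample) map[string]float64 {
	grouped := make(map[string]float64)
	for _, s := range samples {
		grouped[s.Instance] = s.Value
	}
	return grouped
}

func main() {
	// Three cmd processes and one conhost process, all reported by bare name.
	samples := []sample{
		{"cmd", 0}, {"cmd", 1.5}, {"cmd", 3.0}, {"conhost", 0},
	}
	grouped := groupByInstance(samples)
	fmt.Println(len(samples), "samples in,", len(grouped), "values out")
}
```

Only the last value seen for each name survives, which matches the single `cmd`, `conhost`, and `csrss` lines in the actual output above.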

@vlastahajek
Contributor Author

Because PdhGetFormattedCounterArray doesn't return a counter handle either, it is impossible to determine the instance index from the available data.
A workaround could be to check, at the point of adding a value, whether a value already exists for that instance name and, if so, to create a temporary index for the current instance and assign the new value to it.

The flaw here is that there is no guarantee that the instances will be returned in the same order next time, so the same temporary index could be used for different instances.
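
A rough Go sketch of that workaround and its flaw, under the same assumptions as the description above. The `sample` type and `assignTempIndexes` function are hypothetical and only illustrate the idea; they are not taken from the plugin.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for one returned element: an instance
// name without its index, plus a value.
type sample struct {
	Instance string
	Value    float64
}

// assignTempIndexes gives the first sample of a name the bare name and each
// later duplicate a temporary "#N" suffix, based purely on arrival order.
func assignTempIndexes(samples []sample) map[string]float64 {
	seen := make(map[string]int)
	out := make(map[string]float64, len(samples))
	for _, s := range samples {
		n := seen[s.Instance]
		seen[s.Instance] = n + 1
		key := s.Instance
		if n > 0 {
			key = fmt.Sprintf("%s#%d", s.Instance, n)
		}
		out[key] = s.Value
	}
	return out
}

func main() {
	// The same two processes, returned in a different order on the second gather.
	first := assignTempIndexes([]sample{{"cmd", 10}, {"cmd", 20}})
	second := assignTempIndexes([]sample{{"cmd", 20}, {"cmd", 10}})
	fmt.Println(first["cmd#1"], second["cmd#1"])
}
```

No value is lost any more, but `cmd#1` refers to a different process in each gather, so a series built from the temporary index would mix values from different instances.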

Of course, the best solution is to use wildcards expansion 🥇

@vlastahajek vlastahajek changed the title from "Widows Performance Counters input plugin doesn't gather data from multiple instances" to "Windows Performance Counters input plugin doesn't gather data from multiple instances" Jun 13, 2018
@danielnelson danielnelson added the area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) label Jun 13, 2018
@danielnelson
Contributor

Can you create a unittest that shows this behavior?

@vlastahajek
Contributor Author

Sure, np.

@vlastahajek
Contributor Author

@danielnelson
Contributor

I know we have had problems with multiple instances for a long time now, but has this issue gotten worse since 1.6 or is it essentially the same behavior as before?

@vlastahajek
Contributor Author

PR #4036 caused this.

@PeterKelecom

I'm running into this issue. Any idea when / if it'll be resolved?

@vlastahajek
Contributor Author

@PeterKelecom What version of Telegraf are you using? If 1.7.x, try setting UseWildcardsExpansion = true

@PeterKelecom

@vlastahajek That works, thanks!

@andryua

andryua commented Oct 18, 2018

> @PeterKelecom What version of Telegraf are you using? If 1.7.x, try setting UseWildcardsExpansion = true

Version 1.8.2 doesn't work. On Windows 10 x64, Telegraf doesn't send data.
On Windows 7 x32, it doesn't show multiple processes.

@danielnelson
Contributor

@andryua Can you attach the config you are using?

@andryua

andryua commented Oct 31, 2018

> @andryua Can you attach the config you are using?

Yes, attached:
telegraf.txt

@natejgardner

natejgardner commented Nov 5, 2018

Setting UseWildcardsExpansion to true solved this for me. I was having the same issue where Telegraf wouldn't write anything for win_perf_counters at all until I added that.

@glinton
Contributor

glinton commented Nov 5, 2018

@vlastahajek does setting UseWildcardsExpansion work for you?

@andryua

andryua commented Nov 9, 2018

So, how can I get summary data from multiple instances?

@danielnelson
Contributor

@andryua This would be something that can be done using an aggregator plugin or, I think most flexibly, at query time.
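
As a sketch of what such a summary would compute, the following Go snippet sums values across all instances of a process. It assumes instance names carry a `#N` index (as they do in the expected output above); `baseName` and `sumByProcess` are hypothetical helpers for illustration, not part of Telegraf or of any aggregator plugin.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// baseName strips a trailing "#N" instance index, e.g. "conhost#3" -> "conhost".
func baseName(instance string) string {
	i := strings.LastIndexByte(instance, '#')
	if i < 0 {
		return instance
	}
	if _, err := strconv.Atoi(instance[i+1:]); err != nil {
		return instance // suffix is not a number, keep the name as is
	}
	return instance[:i]
}

// sumByProcess adds up the values of all instances that share a base name.
func sumByProcess(values map[string]float64) map[string]float64 {
	totals := make(map[string]float64)
	for instance, v := range values {
		totals[baseName(instance)] += v
	}
	return totals
}

func main() {
	totals := sumByProcess(map[string]float64{
		"cmd": 1, "cmd#1": 2, "cmd#2": 3, "audiodg": 1.5,
	})
	fmt.Println(totals["cmd"], totals["audiodg"])
}
```

At query time the same grouping would be done on the `instance` tag instead.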

@danielnelson
Contributor

I feel like we are getting off topic here. This issue documents a known limitation/quirk of the current plugin that came up during development. Most people should use UseWildcardsExpansion = true instead if they can, but this option will cause counters to be localized.

If anyone is experiencing additional issues, please check for other open issues and, if none is found, open a new issue.

@andryua

andryua commented Nov 12, 2018

> @andryua This would be something that can be done using an aggregator plugin or, I think most flexibly, at query time.

Can you help me? Where can I get this aggregator plugin, and how can I apply it in Telegraf for Windows? Thanks!

@vlastahajek
Contributor Author

> @vlastahajek does setting UseWildcardsExpansion work for you?

@glinton, everything seems to be working as expected with the attached config in master and in 1.8.2, both with UseWildcardsExpansion=true and with UseWildcardsExpansion=false.

@danielnelson
Contributor

@andryua Here are the docs. If you need more help, can you create a new topic on the InfluxData Community site?

https://github.com/influxdata/telegraf/tree/master/plugins/aggregators/basicstats

@andryua

andryua commented Nov 13, 2018

> @andryua Here are the docs, if you need more help can you create a new topic on the InfluxData Community site.
>
> https://github.com/influxdata/telegraf/tree/master/plugins/aggregators/basicstats

Thanks! But it only works correctly on Win 7 x64. On Win 10 x64, it shows min-max only in the process section.

@barbarajoost

With Telegraf 1.8.3 on Server 2008 R2 x64 and UseWildcardsExpansion = true I get the expected results for the process I want to monitor.

@sspaink
Contributor

sspaink commented Apr 12, 2022

Closing as it seems UseWildcardsExpansion = true is the solution. Please comment/re-open if that isn't the case for you.

@sspaink sspaink closed this as completed Apr 12, 2022