win_perf_counters fails on counters that return no data #5179

Closed
nicgrobler opened this issue Dec 21, 2018 · 3 comments
Labels
area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services)

@nicgrobler
Contributor

nicgrobler commented Dec 21, 2018

Relevant telegraf.conf:

System info:

Telegraf > 1.8, Windows Server 2016, Hyper-V

Steps to reproduce:

Enable the config item for a counter that has no associated instances on the server ("RDMA Activity", for example). The same applies to other counters without instances.

Expected behavior:

Telegraf collects data from the many other counters that are valid and do return data.

Actual behavior:

Telegraf logs an error similar to:

"E! [telegraf] Error running agent: error while getting value for counter \RDMA Activity(*)\RDMA Completion Queue Errors: No data to return."

and returns no other data.

Additional info:

Telegraf 1.8 does not behave like this: it still sends data for the other counters.

As I am new to the Windows agent, I wanted to verify that this is a bug and not expected behaviour (i.e. is the Windows agent, by design, supposed to return nothing as soon as a single input object has no data? I am guessing not).

We need this "fix" because in large environments it is a nightmare to change the local config on each server whenever the active instances of a particular perf counter go away. For example, when admins are doing maintenance or playing with new features, they still want to get data from the other counters.

The error is returned from here (within the Gather func in win_perf_counters.go), in two different places depending on whether wildcard expansion is used or not:

if !isKnownCounterDataError(err) {
    return fmt.Errorf("error while getting value for counter %s: %v", metric.counterPath, err)
}

and a fix would be to simply do this:

func isKnownCounterDataError(err error) bool {
	if pdhErr, ok := err.(*PdhError); ok && (pdhErr.ErrorCode == PDH_INVALID_DATA ||
		pdhErr.ErrorCode == PDH_CALC_NEGATIVE_VALUE ||
		pdhErr.ErrorCode == PDH_CSTATUS_INVALID_DATA ||
		pdhErr.ErrorCode == PDH_NO_DATA) { // proposed addition: also treat "no data" as a known error
		return true
	}
	return false
}
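
To make the effect of the proposed change concrete, here is a minimal, self-contained sketch of a gather loop that uses such a predicate. It is not the plugin's actual code: the `PdhError` type is simplified, the constant values are stand-ins rather than the real Windows PDH status codes, and the `counter` type and `gather` function are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// Stand-in status codes for illustration only. These are NOT the real
// Windows PDH values; the plugin takes those from its PDH bindings.
const (
	PDH_INVALID_DATA uint32 = iota + 1
	PDH_CALC_NEGATIVE_VALUE
	PDH_CSTATUS_INVALID_DATA
	PDH_NO_DATA
)

// PdhError is a simplified stand-in for the plugin's error type.
type PdhError struct {
	ErrorCode uint32
	Message   string
}

func (e *PdhError) Error() string { return e.Message }

// isKnownCounterDataError reports whether err is a per-counter data problem
// that should be skipped instead of aborting the whole gather.
func isKnownCounterDataError(err error) bool {
	pdhErr, ok := err.(*PdhError)
	if !ok {
		return false
	}
	switch pdhErr.ErrorCode {
	case PDH_INVALID_DATA, PDH_CALC_NEGATIVE_VALUE,
		PDH_CSTATUS_INVALID_DATA, PDH_NO_DATA:
		return true
	}
	return false
}

// counter is a hypothetical pairing of a counter path with a read function.
type counter struct {
	path string
	read func() (float64, error)
}

// gather collects one value per counter. Counters failing with a known data
// error are skipped; any other error aborts the run, as in the snippet above.
func gather(counters []counter) (map[string]float64, error) {
	values := make(map[string]float64)
	for _, c := range counters {
		v, err := c.read()
		if err != nil {
			if !isKnownCounterDataError(err) {
				return nil, fmt.Errorf("error while getting value for counter %s: %v", c.path, err)
			}
			continue // known data error: skip this counter only
		}
		values[c.path] = v
	}
	return values, nil
}

func main() {
	counters := []counter{
		{
			path: `\RDMA Activity(*)\RDMA Completion Queue Errors`,
			read: func() (float64, error) {
				return 0, &PdhError{ErrorCode: PDH_NO_DATA, Message: "No data to return."}
			},
		},
		{
			path: `\Processor(_Total)\% Processor Time`,
			read: func() (float64, error) { return 12.5, nil },
		},
	}
	values, err := gather(counters)
	fmt.Println(values, err)
}
```

With `PDH_NO_DATA` in the list, the RDMA counter is skipped and the processor counter is still reported. Removing it from the `case` list makes the same input abort with the error quoted under "Actual behavior".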

Obviously, this assumes that I am right about what's going on and that the fix results in the desired behaviour. If not, perhaps a global config flag could be added that tells the agent to behave this way when desired (keep collecting and sending data for the other counters that do return data).

@glinton glinton added the area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) label Dec 21, 2018
@glinton
Contributor

glinton commented Dec 21, 2018

Thanks, would you mind opening a PR with your suggested fix?

@nicgrobler
Contributor Author

Sure, no worries

@danielnelson
Contributor

Closed in #5182
