Metricbeat fails to get information for protected processes under Windows #17314

Closed
2 tasks
adriansr opened this issue Mar 30, 2020 · 17 comments · Fixed by elastic/elastic-agent-system-metrics#104 or #37027
Labels: bug, help wanted, Metricbeat, Team:Elastic-Agent-Data-Plane, :Windows

Comments

@adriansr
Contributor

adriansr commented Mar 30, 2020

Issue

For confirmed bugs, please report:

  • Version: 7.x
  • Operating System: Windows
  • Discuss Forum URL:
  • Steps to Reproduce:

When running the system metricset, the debug log is filled with messages like:

process/process.go:475  Skip process pid=0: error getting process state for pid=0: getProcName failed: OpenProcess failed for pid=0: The parameter is incorrect.; getProcStatus failed: OpenProcess failed for pid=0: The parameter is incorrect.; getParentPid failed: OpenProcess failed for pid=0: The parameter is incorrect.
2019-11-20T15:38:58.734Z        DEBUG   [processes]     process/process.go:475  Skip process pid=4: error getting process state for pid=4: getProcName failed: GetProcessImageFileName failed for pid=4: GetProcessImageFileName failed: invalid argument
2019-11-20T15:38:58.736Z        DEBUG   [processes]     process/process.go:486  Error getting details for process Registry with pid=120: error getting process mem for pid=120: OpenProcess failed for pid=120: Access is denied.
2019-11-20T15:38:58.736Z        DEBUG   [processes]     process/process.go:486  Error getting details for process smss.exe with pid=464: error getting process mem for pid=464: OpenProcess failed for pid=464: Access is denied.
2019-11-20T15:38:58.737Z        DEBUG   [processes]     process/process.go:486  Error getting details for process csrss.exe with pid=700: error getting process mem for pid=700: OpenProcess failed for pid=700: Access is denied.
2019-11-20T15:38:58.737Z        DEBUG   [processes]     process/process.go:486  Error getting details for process wininit.exe with pid=812: error getting process mem for pid=812: OpenProcess failed for pid=812: Access is denied.
2019-11-20T15:38:58.737Z        DEBUG   [processes]     process/process.go:486  Error getting details for process csrss.exe with pid=820: error getting process mem for pid=820: OpenProcess failed for pid=820: Access is denied.
2019-11-20T15:38:58.737Z        DEBUG   [processes]     process/process.go:486  Error getting details for process services.exe with pid=884: error getting process mem for pid=884: OpenProcess failed for pid=884: Access is denied.
2019-11-20T15:38:58.752Z        DEBUG   [processes]     process/process.go:486  Error getting details for process MemCompression with pid=3648: error getting process mem for pid=3648: OpenProcess failed for pid=3648: Access is denied.
2019-11-20T15:38:58.755Z        DEBUG   [processes]     process/process.go:486  Error getting details for process svchost.exe with pid=4412: error getting process mem for pid=4412: OpenProcess failed for pid=4412: Access is denied.
2019-11-20T15:38:58.764Z        DEBUG   [processes]     process/process.go:486  Error getting details for process MsMpEng.exe with pid=5708: error getting process mem for pid=5708: OpenProcess failed for pid=5708: Access is denied.
2019-11-20T15:38:58.773Z        DEBUG   [processes]     process/process.go:486  Error getting details for process NisSrv.exe with pid=9004: error getting process mem for pid=9004: OpenProcess failed for pid=9004: Access is denied.
2019-11-20T15:38:58.780Z        DEBUG   [processes]     process/process.go:486  Error getting details for process SecurityHealthService.exe with pid=12232: error getting process mem for pid=12232: OpenProcess failed for pid=12232: Access is denied.
2019-11-20T15:38:58.792Z        DEBUG   [processes]     process/process.go:486  Error getting details for process SgrmBroker.exe with pid=17156: error getting process mem for pid=17156: OpenProcess failed for pid=17156: Access is denied.
2019-11-20T15:38:58.793Z        DEBUG   [processes]     process/process.go:486  Error getting details for process svchost.exe with pid=11332: error getting process mem for pid=11332: OpenProcess failed for pid=11332: Access is denied.
2019-11-20T15:38:58.796Z        DEBUG   [processes]     process/process.go:486  Error getting details for process svchost.exe with pid=13536: error getting process mem for pid=13536: OpenProcess failed for pid=13536: Access is denied.
2019-11-20T15:38:58.797Z        DEBUG   [processes]     process/process.go:486  Error getting details for process init with pid=2828: error getting process mem for pid=2828: OpenProcess failed for pid=2828: Access is denied.
2019-11-20T15:38:58.799Z        DEBUG   [processes]     process/process.go:486  Error getting details for process init with pid=16416: error getting process mem for pid=16416: OpenProcess failed for pid=16416: Access is denied.
2019-11-20T15:38:58.799Z        DEBUG   [processes]     process/process.go:486  Error getting details for process bash with pid=15504: error getting process mem for pid=15504: OpenProcess failed for pid=15504: Access is denied.
2019-11-20T15:38:58.814Z        DEBUG   [processes]     process/process.go:434  Filtered top 

We would like to understand whether Metricbeat.exe is missing some permissions that would allow it to fetch information from all or most of the processes that are currently failing. It would be good to compare with Process Explorer, which can list this information for all processes.

Definition of done

  • Metricbeat gets information for all processes
  • Create a follow-up task to add tests once a Windows runner is available
@adriansr adriansr added bug :Windows Metricbeat Metricbeat help wanted Indicates that a maintainer wants help on an issue or pull request labels Mar 30, 2020
@andresrc andresrc added [zube]: Inbox Team:Integrations Label for the Integrations team labels Apr 14, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@blakerouse
Contributor

@adriansr How was Metricbeat started on Windows? Is it running as a service or being executed directly? If it is executed directly, can you describe how you start it? From an Administrator cmd? An Administrator PowerShell?

@willemdh

willemdh commented Oct 26, 2021

We have the same problem. Multiple processes are not getting indexed...

image

2021-10-26T16:12:17.134+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process Registry with pid=100: error getting process mem for pid=100: OpenProcess failed for pid=100: Access is denied.
2021-10-26T16:12:17.134+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process smss.exe with pid=292: error getting process mem for pid=292: OpenProcess failed for pid=292: Access is denied.
2021-10-26T16:12:17.134+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process csrss.exe with pid=396: error getting process mem for pid=396: OpenProcess failed for pid=396: Access is denied.
2021-10-26T16:12:17.135+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process csrss.exe with pid=504: error getting process mem for pid=504: OpenProcess failed for pid=504: Access is denied.
2021-10-26T16:12:17.135+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process wininit.exe with pid=552: error getting process mem for pid=552: OpenProcess failed for pid=552: Access is denied.
2021-10-26T16:12:17.136+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process services.exe with pid=636: error getting process mem for pid=636: OpenProcess failed for pid=636: Access is denied.
2021-10-26T16:12:17.164+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process cyserver.exe with pid=1604: error getting process mem for pid=1604: OpenProcess failed for pid=1604: Access is denied.
2021-10-26T16:12:17.164+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process svchost.exe with pid=4284: error getting process mem for pid=4284: OpenProcess failed for pid=4284: Access is denied.
2021-10-26T16:12:17.170+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process csrss.exe with pid=6544: error getting process mem for pid=6544: OpenProcess failed for pid=6544: Access is denied.
2021-10-26T16:12:17.182+0200	DEBUG	[processes]	process/process.go:577	Error getting details for process csrss.exe with pid=7148: error getting process mem for pid=7148: OpenProcess failed for pid=7148: Access is denied.

@narph
Contributor

narph commented Nov 2, 2021

In short, we are using standard user-mode Win32 APIs to query information on process internals, and we are unable to perform operations on protected processes due to their higher level of security.

The references below offer detailed information on this. The access rights we request to query information (through the Win32 OpenProcess API, to be precise) appear to be denied for protected processes.

https://www.microsoftpressstore.com/articles/article.aspx?p=2233328&seqNum=2
https://docs.microsoft.com/en-us/windows/win32/procthread/process-security-and-access-rights#protected-processes
https://docs.microsoft.com/en-us/windows/win32/procthread/process-security-and-access-rights

We are also using the PROCESS_QUERY_LIMITED_INFORMATION flag, which should grant limited access to information on the process. In this case, however, the information we require from the process is still not accessible through this option.

It would be good to compare with Process Explorer which can list this information for all processes.

Process Explorer also uses standard user-mode Windows APIs to query information, so it would indeed be interesting to see whether it can collect the exact process information that we cannot.
We will have to investigate this scenario and confirm whether we are unable to provide certain metrics for protected processes.

@willemdh

willemdh commented Nov 3, 2021

Thanks @narph for the info. It would be nice if we were able to get CPU and memory info from protected processes to get a complete picture, imho.

@andrewkroh
Member

Is Metricbeat using go-sysinfo, gosigar, or some of its own code to fetch process info?

Metricbeat can probably be made more robust when encountering protected processes. Ideally it would fall back to using PROCESS_QUERY_LIMITED_INFORMATION to open the process, and then fetch as much information as is available (based on the docs linked by narph, I think this is only the path to the executable).

I looked over the go-sysinfo library and made some notes.

  1. It will fall back to PROCESS_QUERY_LIMITED_INFORMATION. https://github.com/elastic/go-sysinfo/blob/504d69c91710df28aa7fa7cc5d1a624d9e918126/providers/windows/process_windows.go#L268-L280
  2. It ignores PID 0 and 4 (which are never accessible) to limit error logging. https://github.com/elastic/go-sysinfo/blob/504d69c91710df28aa7fa7cc5d1a624d9e918126/providers/windows/process_windows.go#L49-L51
  3. It should probably ignore access denied errors when calling GetProcessTimes because PROCESS_QUERY_LIMITED_INFORMATION is not listed as providing this info. https://github.com/elastic/go-sysinfo/blob/504d69c91710df28aa7fa7cc5d1a624d9e918126/providers/windows/process_windows.go#L114-L117
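
The fallback behaviour in these notes can be sketched roughly as follows. This is not the go-sysinfo code: `opener`, `openWithFallback`, `fakeOpen` and `errAccessDenied` are hypothetical names, and OpenProcess is replaced by an injected function so the sketch runs on any platform. Only the access-right values are taken from the Win32 documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// Access-right values as documented for the Win32 OpenProcess API.
const (
	processVMRead                  uint32 = 0x0010
	processQueryInformation        uint32 = 0x0400
	processQueryLimitedInformation uint32 = 0x1000
)

// errAccessDenied stands in for ERROR_ACCESS_DENIED in this sketch.
var errAccessDenied = errors.New("access is denied")

// opener is a hypothetical stand-in for OpenProcess.
type opener func(pid int, access uint32) (uintptr, error)

// openWithFallback tries the fuller rights first and retries with the
// limited right on access denied. The returned mask tells the caller
// which queries are worth attempting on the handle.
func openWithFallback(open opener, pid int) (uintptr, uint32, error) {
	if pid == 0 || pid == 4 {
		// Skipped up front to avoid noisy error logs.
		return 0, 0, fmt.Errorf("pid %d is never accessible, skipping", pid)
	}
	full := processQueryInformation | processVMRead
	h, err := open(pid, full)
	if err == nil {
		return h, full, nil
	}
	if !errors.Is(err, errAccessDenied) {
		return 0, 0, err
	}
	h, err = open(pid, processQueryLimitedInformation)
	if err != nil {
		return 0, 0, fmt.Errorf("open pid %d with limited rights: %w", pid, err)
	}
	return h, processQueryLimitedInformation, nil
}

// fakeOpen simulates a host where pid 700 is a protected process.
func fakeOpen(pid int, access uint32) (uintptr, error) {
	if pid == 700 && access != processQueryLimitedInformation {
		return 0, errAccessDenied
	}
	return uintptr(pid), nil
}

func main() {
	for _, pid := range []int{0, 700, 1234} {
		_, access, err := openWithFallback(fakeOpen, pid)
		fmt.Printf("pid=%d access=%#x err=%v\n", pid, access, err)
	}
}
```

Returning the granted mask alongside the handle lets callers skip queries that cannot succeed instead of logging an access denied error for each one.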

@andrewkroh
Member

Is Metricbeat using go-sysinfo, gosigar, or some of its own code to fetch process info?

Metricbeat is using the libbeat/metric code that uses gosigar.

// newProcess creates a new Process object and initializes it with process
// state information. If the process's command line and environment variables
// are known they should be passed in to avoid re-fetching the information.
func newProcess(pid int, cmdline string, env common.MapStr) (*Process, error) {
	state := sigar.ProcState{}
	err := state.Get(pid)

@botelastic

botelastic bot commented Dec 23, 2022

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Dec 23, 2022
@willemdh

+1

@gabriellandau

gabriellandau commented Jan 17, 2023

Metricbeat is using the libbeat/metric code that uses gosigar.

Thanks for that. I see two things we can fix.

  1. ProcMem::Get() is requesting PROCESS_VM_READ, which will definitely fail on PPL Endpoint. The handle is passed to GetProcessMemoryInfo here. That API only needs PROCESS_QUERY_LIMITED_INFORMATION on OS versions that Endpoint supports, so we can get rid of PROCESS_VM_READ.

  2. getProcCredName() here is requesting PROCESS_QUERY_INFORMATION when we really only need PROCESS_QUERY_LIMITED_INFORMATION.

ProcArgs::Get() here is requesting PROCESS_VM_READ. That will not succeed against PPL Endpoint, and there's no quick workaround. We need to ensure that such failures are non-fatal, and that Metricbeat still returns the data it can successfully collect for the Endpoint process.

@willemdh

Thanks @gabriellandau

This week we had an issue with lsass.exe on a system and we missed it completely because Metricbeat cannot capture lsass.exe CPU usage...

@jlind23 jlind23 added Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team and removed Team:Integrations Label for the Integrations team labels Jan 18, 2023
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@nfritts

nfritts commented Jun 2, 2023

FYI I've seen another customer come in with questions about why they aren't getting metric data about Endpoint that would have been solved/prevented by resolving this issue.

@rdner
Member

rdner commented Aug 21, 2023

The error message from the description is not logged anymore due to removal in #30076 (perhaps by mistake). We should put it back while fixing this issue.

@rdner
Member

rdner commented Aug 23, 2023

It's better to ask @pierrehilbert about estimates.

@belimawr
Contributor

After some investigation and discussion with the team, my current plan for this task is:

  1. Collect metrics in a different order, grouping all metrics that require the same access right and starting with the least privileged ones.
  2. Return partial metrics in case some of them fail. My understanding is that on Windows we don't have enough permissions to read all metrics from all processes.
  3. Add errors to the event's error field so it's easy to spot why metrics are missing.

@cmacknz cmacknz changed the title from "Metricbeat fails to get information for some processes under Windows" to "Metricbeat fails to get information for protected processes under Windows" Nov 1, 2023
belimawr added a commit to elastic/elastic-agent-system-metrics that referenced this issue Nov 3, 2023
## What does this PR do?

It improves metric collection on Windows hosts so we can collect some
metrics from privileged processes like Elastic Endpoint or Elastic-Agent.
To achieve that, metric collection on Windows is grouped by the access
level required. First the metrics that can be collected with
`PROCESS_QUERY_LIMITED_INFORMATION` are collected, then the others. If
the second batch fails, we still report the partial metrics.

## Why is it important?

It allows us to collect CPU and memory metrics from Endpoint and
Elastic-Agent, which improves our monitoring dashboards.

## Related issues

- Closes elastic/beats#17314
@cmacknz
Member

cmacknz commented Nov 3, 2023

Reopening, let's not close this until the system metrics dependency is updated in Beats and Agent.
