Report eBPF service statuses instead of checking installation#334

Merged: srikrishnaveturi merged 3 commits into Azure:dev from srikrishnaveturi:dev on Apr 13, 2026

@srikrishnaveturi

Overview

This PR improves the eBPF service status reporting in the GuestProxy Agent VM Extension. Previously, the eBPF substatus only reported whether the EbpfCore and NetEbpfExt driver services were installed. This meant the status could show Success even when one or both services were stopped or in a degraded state.

The change updates eBPF status reporting to query the runtime state (e.g., Running, Stopped) and start type (e.g., AutoStart, Disabled) of each eBPF driver service. The substatus now reports Success only when both services are confirmed running, and Error otherwise, which gives more accurate and actionable health information. Each error message also includes the state of both services for full context.

The implementation is also refactored to use a dedicated ServiceStatusInfo struct (replacing a raw tuple return) and a pure build_ebpf_substatus helper function that isolates the substatus decision logic from I/O, enabling deterministic unit testing without mocking.


High Level Code Changes

proxy_agent_shared/src/service/windows_service.rs

  • Added a ServiceStatusInfo struct to hold the runtime status of a Windows service, with fields for service_name, state: Option<ServiceState>, and start_type.
  • Added summary() and message() methods on ServiceStatusInfo. summary() produces a human-readable string (e.g., "Running, AutoStart" or "NotInstalled"); message() produces a log-friendly string from the service name and summary. Callers prepend the check_service_status: prefix when logging.
  • Made query_service_status public so it can be consumed by the new status-check logic.
  • Re-exported ServiceState as a public type for callers to pattern-match on.
  • Added unit tests for ServiceStatusInfo::summary() and ServiceStatusInfo::message() covering not-installed, running, and stopped states.
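
To make the shape of ServiceStatusInfo concrete, here is a minimal sketch of the struct with summary() and message(). It is not the merged code. The ServiceState and StartType enums are hypothetical stand-ins (the PR re-exports the real ServiceState rather than defining it), the Option wrapper on start_type is an assumption, and the "{name}: {summary}" message format is inferred from the status output shown under Testing.

```rust
// Hypothetical stand-ins for the real service state and start type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ServiceState {
    Running,
    Stopped,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StartType {
    AutoStart,
    Disabled,
}

/// Runtime status of one service; `state` is `None` when it is not installed.
#[derive(Debug, Clone, PartialEq)]
pub struct ServiceStatusInfo {
    pub service_name: String,
    pub state: Option<ServiceState>,
    pub start_type: Option<StartType>, // assumption: optional in this sketch
}

impl ServiceStatusInfo {
    /// Human-readable summary, e.g. "Running, AutoStart" or "NotInstalled".
    pub fn summary(&self) -> String {
        match (self.state, self.start_type) {
            (None, _) => "NotInstalled".to_string(),
            (Some(state), Some(start)) => format!("{:?}, {:?}", state, start),
            (Some(state), None) => format!("{:?}", state),
        }
    }

    /// Log-friendly string built from the service name and the summary.
    /// Callers add any logging prefix themselves.
    pub fn message(&self) -> String {
        format!("{}: {}", self.service_name, self.summary())
    }
}
```

Keeping the formatting on the struct means every caller renders a service the same way, which is what allows the substatus message and the log lines to share one format.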

proxy_agent_shared/src/service.rs

  • Added check_service_status(), which returns a ServiceStatusInfo struct. Callers determine whether a service is installed by checking state.is_some(); the previous redundant is_installed boolean and prebuilt message string have been removed.
  • Re-exported ServiceStatusInfo and ServiceState from the module for use by callers.
  • Added a unit test (test_check_service_status) that validates the function against both a non-existent service and a freshly installed test service.
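
The following sketch illustrates how a caller can derive "installed" from state.is_some() instead of a separate boolean. It is a simplified illustration, not the merged implementation: check_service_status_with and is_installed are hypothetical names, the query is injected as a closure so the example runs without the Windows service manager, and treating any query error as "not installed" is an assumption.

```rust
// Hypothetical stand-ins for the real service state and start type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ServiceState {
    Running,
    Stopped,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StartType {
    AutoStart,
    Disabled,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ServiceStatusInfo {
    pub service_name: String,
    pub state: Option<ServiceState>,
    pub start_type: Option<StartType>,
}

/// Hypothetical variant of check_service_status that receives the query
/// as a closure. The real function queries the service manager directly.
pub fn check_service_status_with<F>(service_name: &str, query: F) -> ServiceStatusInfo
where
    F: Fn(&str) -> Result<(ServiceState, StartType), String>,
{
    match query(service_name) {
        Ok((state, start_type)) => ServiceStatusInfo {
            service_name: service_name.to_string(),
            state: Some(state),
            start_type: Some(start_type),
        },
        // Assumption: a failed query is reported as "not installed".
        Err(_) => ServiceStatusInfo {
            service_name: service_name.to_string(),
            state: None,
            start_type: None,
        },
    }
}

/// Installation is derived from the state rather than stored separately.
pub fn is_installed(info: &ServiceStatusInfo) -> bool {
    info.state.is_some()
}
```

With a single source of truth for installation, a stored flag cannot drift out of sync with the queried state.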

proxy_agent_extension/src/service_main.rs

  • Extracted a pure build_ebpf_substatus(core, ext) helper function that takes two ServiceStatusInfo values and returns the SubStatus to report. This isolates all substatus decision logic from the I/O in report_ebpf_status.
  • Updated report_ebpf_status() to call check_service_status for both services as two independent (non-nested) calls, so both services are always queried and logged regardless of the other's state.
  • The substatus now reports Success/STATUS_CODE_OK only when both EbpfCore and NetEbpfExt are Running; otherwise reports Error/STATUS_CODE_NOT_OK.
  • All four cases (both missing, core missing, ext missing, both present) include the state of both services in the formatted message.
  • Added test_build_ebpf_substatus unit test covering all 7 meaningful state combinations: both not installed, one missing with the other running, both running (success), and partial/both-stopped error cases.
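
The decision logic described above can be sketched as a pure function. This is an illustration under stated assumptions, not the merged code: SubStatus is a simplified hypothetical struct, the helper takes references (the PR only says it takes two ServiceStatusInfo values), and the numeric codes 0 and 4 are taken from the status output under Testing. The message format follows the 'EbpfCore: {summary}, NetEbpfExt: {summary}' pattern named in the PR.

```rust
// Codes as they appear in the reported status JSON (assumed values).
pub const STATUS_CODE_OK: i32 = 0;
pub const STATUS_CODE_NOT_OK: i32 = 4;

// Hypothetical stand-ins for the real service state and start type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ServiceState {
    Running,
    Stopped,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StartType {
    AutoStart,
    Disabled,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ServiceStatusInfo {
    pub service_name: String,
    pub state: Option<ServiceState>,
    pub start_type: Option<StartType>,
}

impl ServiceStatusInfo {
    pub fn summary(&self) -> String {
        match (self.state, self.start_type) {
            (None, _) => "NotInstalled".to_string(),
            (Some(state), Some(start)) => format!("{:?}, {:?}", state, start),
            (Some(state), None) => format!("{:?}", state),
        }
    }
}

/// Simplified, hypothetical substatus record.
#[derive(Debug, Clone, PartialEq)]
pub struct SubStatus {
    pub name: String,
    pub status: String,
    pub code: i32,
    pub message: String,
}

/// Pure decision logic: no I/O, so it can be unit tested without mocking.
pub fn build_ebpf_substatus(core: &ServiceStatusInfo, ext: &ServiceStatusInfo) -> SubStatus {
    // Success only when both services are confirmed running.
    let both_running = core.state == Some(ServiceState::Running)
        && ext.state == Some(ServiceState::Running);
    let (status, code) = if both_running {
        ("Success", STATUS_CODE_OK)
    } else {
        ("Error", STATUS_CODE_NOT_OK)
    };
    SubStatus {
        name: "EbpfStatus".to_string(),
        status: status.to_string(),
        code,
        // Every case reports the state of both services.
        message: format!("EbpfCore: {}, NetEbpfExt: {}", core.summary(), ext.summary()),
    }
}
```

Because the function only inspects its two arguments, each state combination can be exercised by constructing ServiceStatusInfo values directly.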

Testing

Manually replaced the built ProxyAgentExt.exe in a VM and tested the following cases:

  1. Both stopped and one disabled:
[{"version":"1.0","timestampUTC":"2026-04-01T15:01:33.746","status":{"name":"ProxyAgentVMExtension","operation":"Enable","configurationAppliedTime":"2026-04-01T15:01:33Z","status":"Transitioning","code":0,"formattedMessage":{"lang":"en-US","message":"Started ProxyAgent Extension Monitoring thread."},"substatus":[{"name":"ProxyAgentConnectionSummary","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:01:33.7065978 +00:00:00"}},{"name":"ProxyAgentStatus","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:01:33.7065978 +00:00:00"}},{"name":"ProxyAgentFailedAuthenticationSummary","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:01:33.7065978 +00:00:00"}},{"name":"EbpfStatus","status":"Error","code":4,"formattedMessage":{"lang":"en-US","message":"EbpfCore: Stopped, Disabled, NetEbpfExt: Stopped, AutoStart"}}]}}]
  2. One service running and one stopped and disabled:
[{"version":"1.0","timestampUTC":"2026-04-01T15:02:34.124","status":{"name":"ProxyAgentVMExtension","operation":"Enable","configurationAppliedTime":"2026-04-01T15:02:34Z","status":"Transitioning","code":0,"formattedMessage":{"lang":"en-US","message":"Started ProxyAgent Extension Monitoring thread."},"substatus":[{"name":"ProxyAgentConnectionSummary","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:02:34.0880885 +00:00:00"}},{"name":"ProxyAgentStatus","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:02:34.0880885 +00:00:00"}},{"name":"ProxyAgentFailedAuthenticationSummary","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:02:34.0880885 +00:00:00"}},{"name":"EbpfStatus","status":"Error","code":4,"formattedMessage":{"lang":"en-US","message":"EbpfCore: Stopped, Disabled, NetEbpfExt: Running, AutoStart"}}]}}]
  3. Both services running:
[{"version":"1.0","timestampUTC":"2026-04-01T15:05:20.171","status":{"name":"ProxyAgentVMExtension","operation":"Enable","configurationAppliedTime":"2026-04-01T15:05:20Z","status":"Error","code":0,"formattedMessage":{"lang":"en-US","message":"Started ProxyAgent Extension Monitoring thread."},"substatus":[{"name":"ProxyAgentConnectionSummary","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:05:20.134109 +00:00:00"}},{"name":"ProxyAgentStatus","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:05:20.134109 +00:00:00"}},{"name":"ProxyAgentFailedAuthenticationSummary","status":"Transitioning","code":4,"formattedMessage":{"lang":"en-US","message":"Proxy agent aggregate status file is stale. Status timestamp: 2026-04-01 14:54:02.802 +00:00:00, Current time: 2026-04-01 15:05:20.134109 +00:00:00"}},{"name":"EbpfStatus","status":"Success","code":0,"formattedMessage":{"lang":"en-US","message":"EbpfCore: Running, AutoStart, NetEbpfExt: Running, AutoStart"}}]}}]
  4. After re-running GuestProxyAgent service:
[{"version":"1.0","timestampUTC":"2026-04-01T15:07:35.883","status":{"name":"ProxyAgentVMExtension","operation":"Enable","configurationAppliedTime":"2026-04-01T15:07:35Z","status":"Success","code":0,"formattedMessage":{"lang":"en-US","message":"Started ProxyAgent Extension Monitoring thread."},"substatus":[{"name":"ProxyAgentConnectionSummary","status":"Success","code":0,"formattedMessage":{"lang":"en-US","message":"proxy connection summary is empty"}},{"name":"ProxyAgentStatus","status":"Success","code":0,"formattedMessage":{"lang":"en-US","message":"{\"version\":\"1.0.41.0\",\"status\":\"SUCCESS\",\"monitorStatus\":{\"status\":\"RUNNING\",\"message\":\"Proxy agent status is running.\"},\"keyLatchStatus\":{\"status\":\"RUNNING\",\"message\":\"Found key details from local and ready to use. - 141\",\"states\":{\"imdsRuleId\":\"0de+3s0TY/ikVyv5nylfT5I8KqA=\",\"wireServerRuleId\":\"AfxJN89/TGr+9Bq112Jpk+FT2C0=\",\"hostGARuleId\":\"AfxJN89/TGr+9Bq112Jpk+FT2C0=\",\"keyGuid\":\"1938a0d4-adbd-4276-8b12-fccd7574f375\",\"secureChannelState\":\"WireServer Enforce -  IMDS Disabled - HostGA Enforce\"}},\"ebpfProgramStatus\":{\"status\":\"RUNNING\",\"message\":\"Started Redirector with eBPF maps - 120\"},\"proxyListenerStatus\":{\"status\":\"RUNNING\",\"message\":\"Started proxy listener, ready to accept request - 120\"},\"telemetryLoggerStatus\":{\"status\":\"UNKNOWN\",\"message\":\"Status unknown.\"},\"proxyConnectionsCount\":0}"}},{"name":"ProxyAgentFailedAuthenticationSummary","status":"Success","code":0,"formattedMessage":{"lang":"en-US","message":"proxy failed auth summary is empty"}},{"name":"EbpfStatus","status":"Success","code":0,"formattedMessage":{"lang":"en-US","message":"EbpfCore: Running, AutoStart, NetEbpfExt: Running, AutoStart"}}]}}]

Changed error case log messages to use the same 'EbpfCore: {summary}, NetEbpfExt: {summary}' format
@srikrishnaveturi srikrishnaveturi merged commit d499f08 into Azure:dev Apr 13, 2026
12 checks passed
ZhidongPeng added a commit that referenced this pull request Apr 27, 2026
* Report eBPF service statuses instead of checking installation (#334)

* Report eBPF service statuses instead of checking installation

---------

Co-authored-by: Srikrishna Veturi <sveturi@microsoft.com>

* Fix clippy::unnecessary_sort_by (#336)

* Bump rand from 0.8.5 to 0.8.6 (#339)

Bumps [rand](https://github.com/rust-random/rand) from 0.8.5 to 0.8.6.
- [Release notes](https://github.com/rust-random/rand/releases)
- [Changelog](https://github.com/rust-random/rand/blob/0.8.6/CHANGELOG.md)
- [Commits](rust-random/rand@0.8.5...0.8.6)

---
updated-dependencies:
- dependency-name: rand
  dependency-version: 0.8.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump openssl from 0.10.73 to 0.10.78 (#338)

Bumps [openssl](https://github.com/rust-openssl/rust-openssl) from 0.10.73 to 0.10.78.
- [Release notes](https://github.com/rust-openssl/rust-openssl/releases)
- [Commits](rust-openssl/rust-openssl@openssl-v0.10.73...openssl-v0.10.78)

---
updated-dependencies:
- dependency-name: openssl
  dependency-version: 0.10.78
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zhidong Peng <zpeng@microsoft.com>

* GPA service to use host-date-time for signed http requests (#335)

* GPA service to use host-date-time for signed http requests

* add logging

* fix typo

* resolve comments

Co-authored-by: Copilot <copilot@github.com>

* fix spelling

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zhidong Peng <zpeng@micrsoft.com>
Co-authored-by: Copilot <copilot@github.com>

* Add local file-based access-control rule support. (#329)

* Add local file-based access-control rule support.

* formatting

* resolve comments and validate the parsed local rules.

* fix formatting.

* fix case-insensitive match

* prefix_local_rule_names

Co-authored-by: Copilot <copilot@github.com>

* Display useLocalFileRules.

* update log level at attemptting

Co-authored-by: Copilot <copilot@github.com>

* fix formatting

---------

Co-authored-by: Zhidong Peng <zpeng@micrsoft.com>
Co-authored-by: Copilot <copilot@github.com>

* cmdline to take the first 4 arguments  (#340)

* cmdline to take the first 4 arguments
* fix in common code path

* Update version to 1.0.43

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Srikrishna Veturi <veturi.srikrishna@gmail.com>
Co-authored-by: Srikrishna Veturi <sveturi@microsoft.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zhidong Peng <zpeng@micrsoft.com>
Co-authored-by: Copilot <copilot@github.com>