diff --git a/pipeline/inputs/windows-exporter-metrics.md b/pipeline/inputs/windows-exporter-metrics.md index c64f39902..4cc7e6d21 100644 --- a/pipeline/inputs/windows-exporter-metrics.md +++ b/pipeline/inputs/windows-exporter-metrics.md @@ -8,13 +8,29 @@ description: A plugin based on Prometheus Windows Exporter to collect system and The initial release of Windows Exporter metrics contains a single collector available from Prometheus Windows Exporter. -Metrics collected with Windows Exporter metrics flow through a separate pipeline from logs and current filters don't operate on top of metrics. +{% hint style="info" %} + +Metrics collected with Windows Exporter metrics flow through a separate pipeline from logs and current filters don't operate on top of metrics. This plugin is only supported on Windows operating systems as it uses Windows Management Instrumentation (WMI) to access the relevant metrics. + +{% endhint %} ## Configuration +`scrape_interval` sets the default for all scrapes. To set a granular scrape interval, set the interval for a specific collector, for example `collector.cpu.scrape_interval`. When using a granular scrape interval, if a value greater than `0` is used, it overrides the global default. Otherwise, the global default is used. + +The plugin top-level `scrape_interval` setting is the global default. Any custom `collector.xxx.scrape_interval` setting overrides the scraping interval for that specific collector. + +Each `collector.xxx.scrape_interval` option only overrides the interval for that specific collector and updates the associated set of provided metrics. + +Overridden intervals only change the collection interval, not the interval for publishing the metrics, which is taken from the global setting. + +For example, if the global interval is set to `5` and an override interval of `60` is used, the published metrics will be reported every five seconds. However, the values from the overridden collector stay the same for 60 seconds until that collector is scraped again. 
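The 5-second global interval with a 60-second override described above could be written as the following input configuration. This is a minimal sketch: the `os` collector is chosen arbitrarily for illustration, the `tag` value is a placeholder, and the timeline in the comments only restates the behavior described above rather than measured output.

```yaml
pipeline:
  inputs:
    - name: windows_exporter_metrics
      tag: windows_metrics
      # Global default: metrics are published every 5 seconds.
      scrape_interval: 5
      # Override (collector picked arbitrarily for this sketch):
      # `os` metrics are collected only every 60 seconds.
      collector.os.scrape_interval: 60
      # Approximate timeline implied for the `os` collector:
      #   first scrape       values collected and published
      #   next 5s ticks      same values published again, unchanged
      #   60s after scrape   values collected again, then published
```

Collectors without an override keep following the global `scrape_interval`, so only the overridden collector is down-sampled.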
+ +This helps with down-sampling when collecting metrics. + | Key | Description | Default | |------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------| -| `scrape_interval` | The rate at which metrics are collected. | `5 seconds` | +| `scrape_interval` | The rate in seconds at which metrics are collected from the Windows host. | `5` | | `we.logical_disk.allow_disk_regex` | Specify the regular expression for logical disk metrics to allow collection of. | `"/.+/"` (all) | | `we.logical_disk.deny_disk_regex` | Specify the regular expression for logical disk metrics to prevent collection of or ignore. | `NULL` (all) | | `we.net.allow_nic_regex` | Specify the regular expression for network metrics captured by the name of the NIC. | `"/.+/"` (all) | @@ -23,22 +39,22 @@ Metrics collected with Windows Exporter metrics flow through a separate pipeline | `we.service.exclude` | Specify the key value pairs for the exclude condition for the `WHERE` clause of service metrics. | `NULL` | | `we.process.allow_process_regex` | Specify the regular expression covering the process metrics to collect. | `"/.+/"` (all) | | `we.process.deny_process_regex` | Specify the regular expression for process metrics to prevent collection of or ignore. | `NULL` (all) | -| `collector.cpu.scrape_interval` | The rate in seconds at which `cpu` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.net.scrape_interval` | The rate in seconds at which `net` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.logical_disk.scrape_interval` | The rate in seconds at which `logical_disk` metrics are collected. 
Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.cs.scrape_interval` | The rate in seconds at which `cs` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.os.scrape_interval` | The rate in seconds at which `os` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.thermalzone.scrape_interval` | The rate in seconds at which `thermalzone` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.cpu_info.scrape_interval` | The rate in seconds at which `cpu_info` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.logon.scrape_interval` | The rate in seconds at which `logon` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.system.scrape_interval` | The rate in seconds at which `system` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.service.scrape_interval` | The rate in seconds at which `service` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.memory.scrape_interval` | The rate in seconds at which `memory` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.paging_file.scrape_interval` | The rate in seconds at which `paging_file` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. 
| `0 seconds` | -| `collector.process.scrape_interval` | The rate in seconds at which `process` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.tcp.scrape_interval` | The rate in seconds at which `tcp` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `collector.cache.scrape_interval` | The rate in seconds at which `cache` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0 seconds` | -| `metrics` | Specify which metrics are collected. | `"cpu,cpu_info,os,net,logical_disk,cs,cache,thermalzone,logon,system,service,tcp"` | +| `collector.cpu.scrape_interval` | The rate in seconds at which `cpu` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.net.scrape_interval` | The rate in seconds at which `net` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.logical_disk.scrape_interval` | The rate in seconds at which `logical_disk` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.cs.scrape_interval` | The rate in seconds at which `cs` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.os.scrape_interval` | The rate in seconds at which `os` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.thermalzone.scrape_interval` | The rate in seconds at which `thermalzone` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. 
| `0` | +| `collector.cpu_info.scrape_interval` | The rate in seconds at which `cpu_info` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.logon.scrape_interval` | The rate in seconds at which `logon` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.system.scrape_interval` | The rate in seconds at which `system` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.service.scrape_interval` | The rate in seconds at which `service` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.memory.scrape_interval` | The rate in seconds at which `memory` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.paging_file.scrape_interval` | The rate in seconds at which `paging_file` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.process.scrape_interval` | The rate in seconds at which `process` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.tcp.scrape_interval` | The rate in seconds at which `tcp` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `collector.cache.scrape_interval` | The rate in seconds at which `cache` metrics are collected. Values greater than `0` override the global default. Otherwise, the global default is used. | `0` | +| `metrics` | Specify which metrics are collected. Comma-separated list of collector names. 
| `"cpu,cpu_info,os,net,logical_disk,cs,cache,thermalzone,logon,system,service,memory,paging_file,process,tcp"` | ## Collectors available @@ -46,23 +62,23 @@ The following table describes the available collectors as part of this plugin. A The Version column specifies the Fluent Bit version where the collector is available. -| Name | Description | OS | Version | -|----------------|------------------------------------|---------|---------| -| `cpu` | Exposes CPU statistics. | Windows | v1.9 | -| `net` | Exposes Network statistics. | Windows | v2.0.8 | -| `logical_disk` | Exposes `logical_disk` statistics. | Windows | v2.0.8 | -| `cs` | Exposes `cs` statistics. | Windows | v2.0.8 | -| `os` | Exposes OS statistics. | Windows | v2.0.8 | -| `thermalzone` | Exposes `thermalzone` statistics. | Windows | v2.0.8 | -| `cpu_info` | Exposes `cpu_info` statistics. | Windows | v2.0.8 | -| `logon` | Exposes `logon` statistics. | Windows | v2.0.8 | -| `system` | Exposes `system` statistics. | Windows | v2.0.8 | -| `service` | Exposes `service` statistics. | Windows | v2.1.6 | -| `memory` | Exposes `memory` statistics. | Windows | v2.1.9 | -| `paging_file` | Exposes `paging_file` statistics. | Windows | v2.1.9 | -| `process` | Exposes `process` statistics. | Windows | v2.1.9 | -| `tcp` | Exposes `tcp` statistics. | Windows | v4.1.0 | -| `cache` | Exposes `cache` statistics. | Windows | v4.1.0 | +| Name | Description | OS | Version | +|----------------|-------------------------------------------------------------------------------------------------------------|---------|---------| +| `cpu` | Exposes CPU statistics including `utilization`, `interrupts`, and `DPCs`. | Windows | v1.9 | +| `net` | Exposes network interface statistics such as bytes transferred, packets, and errors. | Windows | v2.0.8 | +| `logical_disk` | Exposes logical disk statistics including read/write operations, latency, and free space. 
| Windows | v2.0.8 | +| `cs` | Exposes computer system statistics including model, manufacturer, and system type. | Windows | v2.0.8 | +| `os` | Exposes operating system statistics including version, build number, and service pack information. | Windows | v2.0.8 | +| `thermalzone` | Exposes thermal zone statistics including temperature readings. | Windows | v2.0.8 | +| `cpu_info` | Exposes CPU information including model, cores, threads, and clock speed. | Windows | v2.0.8 | +| `logon` | Exposes logon session statistics including active sessions and session types. | Windows | v2.0.8 | +| `system` | Exposes system-level statistics including uptime, processes, and threads. | Windows | v2.0.8 | +| `service` | Exposes Windows service statistics including service state, start mode, and status. | Windows | v2.1.6 | +| `memory` | Exposes memory statistics including available, cached, and committed bytes. | Windows | v2.1.9 | +| `paging_file` | Exposes paging file statistics including usage, peak usage, and allocation. | Windows | v2.1.9 | +| `process` | Exposes process-level statistics including CPU usage, memory consumption, handles, and threads per process. | Windows | v2.1.9 | +| `tcp` | Exposes TCP connection statistics including active connections, segments, and errors. | Windows | v4.1.0 | +| `cache` | Exposes cache statistics including cache hits, misses, and utilization. | Windows | v4.1.0 | ## Threading @@ -78,9 +94,9 @@ In the following configuration file, the input plugin `windows_exporter_metrics` {% tab title="fluent-bit.yaml" %} ```yaml -# Node Exporter Metrics + Prometheus Exporter +# Windows Exporter Metrics + Prometheus Exporter # ------------------------------------------- -# The following example collect host metrics on Linux and expose +# The following example collects Windows host metrics and exposes # them through a Prometheus HTTP endpoint. 
# # After starting the service try it with: @@ -107,9 +123,9 @@ pipeline: {% tab title="fluent-bit.conf" %} ```text -# Node Exporter Metrics + Prometheus Exporter +# Windows Exporter Metrics + Prometheus Exporter # ------------------------------------------- -# The following example collect host metrics on Linux and expose +# The following example collects Windows host metrics and exposes # them through a Prometheus HTTP endpoint. # # After starting the service try it with: @@ -141,6 +157,100 @@ You can test the expose of the metrics by using `curl`: curl http://127.0.0.1:2021/metrics ``` +### Filtering disk and network metrics + +The Windows Exporter metrics plugin supports filtering logical disk and network interface metrics using regular expressions. + +#### Logical disk filtering + +Use `we.logical_disk.allow_disk_regex` and `we.logical_disk.deny_disk_regex` to control which logical disks are included in the metrics. + +Example configuration to only collect metrics from C: and D: drives: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: windows_exporter_metrics + tag: windows_metrics + we.logical_disk.allow_disk_regex: "^(C|D):$" +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[INPUT] + name windows_exporter_metrics + tag windows_metrics + we.logical_disk.allow_disk_regex ^(C|D):$ +``` + +{% endtab %} +{% endtabs %} + +#### Network interface filtering + +Use `we.net.allow_nic_regex` to filter network interfaces by name. 
+ +Example configuration to only collect metrics from Ethernet and Wi-Fi adapters: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: windows_exporter_metrics + tag: windows_metrics + we.net.allow_nic_regex: "(Ethernet|Wi-Fi)" +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[INPUT] + name windows_exporter_metrics + tag windows_metrics + we.net.allow_nic_regex (Ethernet|Wi-Fi) +``` + +{% endtab %} +{% endtabs %} + +#### Process filtering + +Use `we.process.allow_process_regex` and `we.process.deny_process_regex` to control which processes are included in the metrics. + +Example configuration to exclude system processes: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: windows_exporter_metrics + tag: windows_metrics + we.process.deny_process_regex: "(System|Idle|svchost)" +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[INPUT] + name windows_exporter_metrics + tag windows_metrics + we.process.deny_process_regex (System|Idle|svchost) +``` + +{% endtab %} +{% endtabs %} + ### Service where clause Windows service collector will retrieve all the service information for the local node or container. @@ -239,6 +349,86 @@ The WMI query will be called with the translated parameter as: SELECT * FROM Win32_Service WHERE (Name='docker' OR Name LIKE '%Svc%' OR Name LIKE '%Service') AND (NOT Name LIKE 'UdkUserSvc%' AND Name!='XboxNetApiSvc') AND (NOT Name LIKE 'webthreatdefusersvc%') ``` +### Selecting specific collectors + +You can configure the plugin to collect only specific metrics by using the `metrics` parameter. Use this to reduce resource usage or focus on specific system components. 
+ +Example configuration to collect only CPU, memory, and disk metrics: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: windows_exporter_metrics + tag: windows_metrics + metrics: "cpu,memory,logical_disk" + scrape_interval: 5 +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[INPUT] + name windows_exporter_metrics + tag windows_metrics + metrics cpu,memory,logical_disk + scrape_interval 5 +``` + +{% endtab %} +{% endtabs %} + +### Custom scrape intervals per collector + +You can set different scrape intervals for individual collectors to optimize resource usage. For example, you might want to collect CPU metrics more frequently than system information. + +Example configuration with custom intervals: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: windows_exporter_metrics + tag: windows_metrics + scrape_interval: 10 + collector.cpu.scrape_interval: 5 + collector.memory.scrape_interval: 5 + collector.system.scrape_interval: 60 +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[INPUT] + name windows_exporter_metrics + tag windows_metrics + scrape_interval 10 + collector.cpu.scrape_interval 5 + collector.memory.scrape_interval 5 + collector.system.scrape_interval 60 +``` + +{% endtab %} +{% endtabs %} + +In this example, CPU and memory metrics are collected every 5 seconds, while system metrics are collected every 60 seconds. The global `scrape_interval` of 10 seconds determines how often metrics are published to the output. + +## Requirements and permissions + +The Windows Exporter metrics plugin uses Windows Management Instrumentation (WMI) to collect metrics. The following requirements apply: + +- **Operating System**: Windows only (Windows 7/Server 2008 R2 or later) +- **Permissions**: The Fluent Bit process must have appropriate permissions to query WMI. 
Most metrics can be collected with standard user permissions, but some collectors might require elevated privileges. +- **WMI Service**: The Windows Management Instrumentation service must be running. + +If you encounter permission errors, try running Fluent Bit with administrator privileges or ensure the service account has the necessary WMI query permissions. + ## Enhancement requests The plugin implements a subset of the available collectors in the original Prometheus Windows Exporter. If you would like a specific collector prioritized, open a GitHub issue by using the following template: