Add metrics output #5

Open
wants to merge 35 commits into
base: main
a7f31e4
start of metrics output implementation
jspaleta Mar 3, 2022
f9670e4
update test to allow for no search
jspaleta Mar 3, 2022
34527dd
add memory_usage
jspaleta Mar 3, 2022
19b1906
add created_at metric
jspaleta Mar 3, 2022
3a0bec9
add num_fds metric
jspaleta Mar 3, 2022
dc87447
add io counter metrics read/write_count/bytes
jspaleta Mar 3, 2022
65cafeb
add page fault metrics
jspaleta Mar 3, 2022
ad57abd
added context switching metrics
jspaleta Mar 3, 2022
25ea598
added num_threads metric
jspaleta Mar 3, 2022
96e1f0d
added rlimit metrics
jspaleta Mar 4, 2022
f33054e
add memory info stats
jspaleta Mar 4, 2022
dac0352
add total processes metric
jspaleta Mar 4, 2022
a86b3ac
add processes total state metrics
jspaleta Mar 4, 2022
487db41
add total process state metrics
jspaleta Mar 4, 2022
1842319
help text refactor
jspaleta Mar 4, 2022
b5da95b
more metrics work
jspaleta Mar 4, 2022
933479a
ensure processes are only tabulated once
jspaleta Mar 4, 2022
e30b0f0
refactor metrics processing
jspaleta Mar 4, 2022
b178276
update test coverage
jspaleta Mar 4, 2022
1ea9cac
add test coverage
jspaleta Mar 4, 2022
83610c5
update to latest deps
jspaleta Mar 4, 2022
9c4ceae
fix metric
jspaleta Mar 4, 2022
257e16a
add metrics into readme
jspaleta Mar 4, 2022
29854e5
moar readme
jspaleta Mar 4, 2022
1308317
hide the sumologic compat flag for now
jspaleta Mar 4, 2022
2bb010a
show truncated metrics output in the matching example
jspaleta Mar 4, 2022
426eb42
updated linter
jspaleta Mar 4, 2022
139ecd8
make linter happy
jspaleta Mar 4, 2022
6c661ca
update the go version used to build and test
jspaleta Mar 4, 2022
940e87f
protect against empty status array in unprivledged windows env
jspaleta Mar 4, 2022
18dce07
update go version; run go mod tidy
Mar 10, 2022
80bc3d3
remove host.name tag
jspaleta May 12, 2022
010224e
fix invalid tag names
jspaleta May 12, 2022
648535c
update to sdk v0.16.0 and go 1.18
jspaleta Jul 11, 2022
27d27f9
add new bonsai notify workflow to run after release finishes
jspaleta Jul 11, 2022
2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
@@ -13,4 +13,4 @@ jobs:
- name: Checkout code
uses: actions/checkout@v2
- name: Run golangci-lint
uses: actions-contrib/golangci-lint@v1
uses: golangci/golangci-lint-action@v2
30 changes: 30 additions & 0 deletions .github/workflows/notify-bonsai.yml
@@ -0,0 +1,30 @@
name: Bonsai

##
# Only run this workflow after goreleaser completes
##
on:
  workflow_run:
    workflows:
      - goreleaser
    types:
      - completed

jobs:
  ##
  # Note:
  # The Bonsai GitHub webhook integration looks for a GitHub webhook payload matching a workflow_job named: `bonsai-recompile` with status: `completed`
  # To enable automatic Bonsai recompiles after building a new release in GitHub, make sure workflow_job events are enabled for
  # the Bonsai GitHub webhook installed in the GitHub repository by Bonsai as part of the asset repository registration process
  ##
  bonsai-recompile:
    ##
    # Only run this workflow_job after goreleaser completes successfully
    ##
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    steps:
      - name: bonsai-webhook
        continue-on-error: true
        run: echo "Trigger recompile on 'completed' workflow_job event matching workflow_job.name 'bonsai-recompile' and workflow.name 'bonsai'"

2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -16,7 +16,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v1
with:
go-version: 1.14.x
go-version: 1.18.x
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v1
with:
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
@@ -12,10 +12,10 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v2
- name: Set up Go 1.14
- name: Set up Go 1.18
uses: actions/setup-go@v1
with:
go-version: 1.14
go-version: 1.18
id: go
- name: Test
run: go test -v ./...
4 changes: 2 additions & 2 deletions .goreleaser.yml
@@ -4,8 +4,8 @@ builds:
- # First Build
env:
- CGO_ENABLED=0
main: main.go
ldflags: '-s -w -X github.com/sensu-community/sensu-plugin-sdk/version.version={{.Version}} -X github.com/sensu-community/sensu-plugin-sdk/version.commit={{.Commit}} -X github.com/sensu-community/sensu-plugin-sdk/version.date={{.Date}}'
main: .
ldflags: '-s -w -X github.com/sensu/sensu-plugin-sdk/version.version={{.Version}} -X github.com/sensu/sensu-plugin-sdk/version.commit={{.Commit}} -X github.com/sensu/sensu-plugin-sdk/version.date={{.Date}}'
# Set the binary output location to bin/ so archive will comply with Sensu Go Asset structure
binary: bin/{{ .ProjectName }}
goos:
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -6,8 +6,14 @@ and this project adheres to [Semantic
Versioning](http://semver.org/spec/v2.0.0.html).

## Unreleased
### Added
- Sumo Logic Dashboard compatible metric output

### Changed
- Allow for empty search configuration to match full process list
- Added metrics only flag to disable search based alerts
- Converted search alert text output into metric comment strings
- Update to sensu plugin sdk 0.15.0
- Changed types import to corev2
- Minor README fix

137 changes: 128 additions & 9 deletions README.md
@@ -6,11 +6,15 @@

## Table of Contents
- [Overview](#overview)
- [Output Metrics](#output-metrics)
- [procstat](#procstat)
- [processes](#processes)
- [Configuration](#configuration)
- [Asset registration](#asset-registration)
- [Check definition](#check-definition)
- [Usage examples](#usage-examples)
- [Help output](#help-output)
- [Environment variables](#environment-variables)
- [Search string details](#search-string-details)
- [Exit severity](#exit-severity)
- [Annotations](#annotations)
@@ -19,11 +23,92 @@

## Overview

The Sensu Processes Check is a [Sensu Check][1] that searches for certain
running processes (or other strings in a command line). It can search for
multiple processes and, on a per-string basis, set the number of processes
expected, severity if the number of processes is not met, and whether or not
to search the full command line for the requested string.
The Sensu Processes Check is a [Sensu Check][1] that provides metrics for
processes found in the host process table.

It can optionally restrict the list of processes considered, using a [search configuration](#search-string-details) specifying multiple
strings to match. The search configuration can also be used to set alert conditions based on the
number of matching processes on a per-string basis.

### Output Metrics
Metrics output conforms to the Prometheus text exposition format.

#### procstat
The `procstat` metric family provides individual per-process metrics for each process matching the configured search criteria.
Each different per-process metric in this metric family is distinguished by the value of the label named `field`.
Each metric is also labeled with `process_executable_name` and `process_executable_pid`.

| Field | Units | Description |
|--------------------------------|-------------|-------------------------------------|
| cpu_usage | percent | percent of CPU time used by process
| cpu_time | seconds | total amount of CPU time the process has used
| created_at | nanoseconds | process creation time since Unix epoch
| memory_usage | percent | percent of memory used by process
| memory_rss | bytes | process resident set (the number of virtual pages resident in RAM)
| memory_vms | bytes | size of the process's virtual memory (address space)
| memory_swap | bytes | size of the process's swap
| memory_data | bytes | size of the process's data segment (initialized data, uninitialized data, and heap)
| memory_stack | bytes | size of the process stack
| memory_locked | bytes | size of memory locked into RAM
| num_fds | count | number of file descriptors opened by process
| file_locks | count | number of file locks held by process
| num_threads | count | number of process threads
| read_count | count | number of read operations performed
| read_bytes | bytes | bytes read
| write_count | count | number of write operations performed
| writebytes | bytes | bytes written
| major_faults | count | The number of major faults the process has made which have required loading a memory page from disk.
| minor_faults | count | The number of minor faults the process has made which have not required loading a memory page from disk.
| child_major_faults | count | The number of major faults that the process's waited-for children have made.
| child_minor_faults | count | The number of minor faults that the process's waited-for children have made.
| signals_pending | count | number of currently queued signals
| nice_priority | N/A | nice priority
| realtime_priority | N/A | realtime priority
| voluntary_context_switches | count | number of voluntary context switches
| involuntary_context_switches | count | number of involuntary context switches
| rlimit_cpu_time_soft | seconds | soft maximum limit for cpu_time
| rlimit_cpu_time_hard | seconds | hard maximum limit for cpu_time
| rlimit_core_size_soft | bytes | ...
| rlimit_core_size_hard | bytes | ...
| rlimit_memory_data_soft | bytes | ...
| rlimit_memory_data_hard | bytes | ...
| rlimit_memory_stack_soft | bytes | ...
| rlimit_memory_stack_hard | bytes | ...
| rlimit_memory_rss_soft | bytes | ...
| rlimit_memory_rss_hard | bytes | ...
| rlimit_num_fds_soft | count | ...
| rlimit_num_fds_hard | count | ...
| rlimit_memory_locked_soft | bytes | ...
| rlimit_memory_locked_hard | bytes | ...
| rlimit_memory_vms_soft | bytes | ...
| rlimit_memory_vms_hard | bytes | ...
| rlimit_file_locks_soft | bytes | ...
| rlimit_file_locks_hard | bytes | ...
| rlimit_signals_pending_soft | count | ...
| rlimit_signals_pending_hard | count | ...
| rlimit_nice_priority_soft | count | ...
| rlimit_nice_priority_hard | count | ...
| rlimit_realtime_priority_soft | count | ...
| rlimit_realtime_priority_hard | count | ...
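
As a rough illustration of the labeling scheme described above, the following Go sketch renders one sample as a Prometheus text exposition line. The helper name `formatSample`, the example process name and PID, and the millisecond timestamp are assumptions made for illustration; the plugin's own implementation may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatSample renders a single sample in the Prometheus text exposition
// layout: name{label="value",...} value timestamp_ms.
// Hypothetical helper for illustration; not part of the plugin.
func formatSample(name string, labels map[string]string, value float64, tsMillis int64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable label order keeps the output readable

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		// %q wraps the value in double quotes and escapes backslashes and quotes.
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return fmt.Sprintf("%s{%s} %g %d", name, strings.Join(pairs, ","), value, tsMillis)
}

func main() {
	labels := map[string]string{
		"field":                   "num_threads",
		"process_executable_name": "sshd", // example value, assumed
		"process_executable_pid":  "1234", // example value, assumed
	}
	fmt.Println(formatSample("procstat", labels, 1, 1646434587929))
}
```

Sorting the label names is a readability choice in this sketch only; the exposition format does not require a particular label order.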

#### processes
The `processes` metric family provides summary metrics derived from the list of processes matching the configured search criteria.
Each different metric in this metric family is distinguished by the value of the label named `field`.

| Field | Units | Description |
|-------------------|-------|-------------------------------------|
| total | count | total number of processes tabulated |
| total_threads | count | total number of threads
| sleeping | count | number of processes in sleeping state
| unknown | count | number of processes in unknown state
| parked | count | number of processes in parked state
| blocked | count | number of processes in blocked state
| zombies | count | number of processes in zombie state
| stopped | count | number of processes in stopped state
| running | count | number of processes in running state
| wait | count | number of processes in wait state
| dead | count | number of processes in dead state
| idle | count | number of processes in idle state
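
The summary fields above amount to a per-state tally over the matched processes. The Go sketch below shows one way such a tally could be computed, counting each PID only once. The `proc` type and `summarize` function are hypothetical names used for illustration, and how the plugin maps OS status codes to state names is not shown.

```go
package main

import "fmt"

// proc is a hypothetical, minimal view of a process for this sketch.
type proc struct {
	PID   int32
	State string // one of the state names from the table, e.g. "sleeping"
}

// summarize counts processes per state and adds a "total" entry.
// A PID that appears more than once is tabulated only once.
func summarize(procs []proc) map[string]int {
	counts := map[string]int{"total": 0}
	seen := make(map[int32]bool)
	for _, p := range procs {
		if seen[p.PID] {
			continue // already counted, e.g. matched by two search strings
		}
		seen[p.PID] = true
		counts["total"]++
		counts[p.State]++
	}
	return counts
}

func main() {
	procs := []proc{
		{PID: 1, State: "sleeping"},
		{PID: 2, State: "running"},
		{PID: 1, State: "sleeping"}, // duplicate match for PID 1
	}
	counts := summarize(procs)
	fmt.Println(counts["total"], counts["sleeping"], counts["running"])
}
```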

## Configuration

@@ -76,12 +161,24 @@ Available Commands:

Flags:
-h, --help help for sensu-processes-check
--metrics-only Do not alert based on search configuration
-s, --search string An array of JSON search criteria, fields are "search_string", "severity", "number", "comparison", and "full_cmdline"
-S, --suppress-ok-output Aside from overal status, only output failures
-v, --verbose Verbose output

Use "sensu-processes-check [command] --help" for more information about a command.

```
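
For readers scripting around the `--search` flag, here is a minimal Go sketch of decoding the JSON array it accepts. The field names come from the help text above; the struct name `searchConfig`, the function `parseSearch`, and the Go types (in particular an integer `severity`) are assumptions for illustration. Omitted fields decode to Go zero values here, which are not necessarily the plugin's defaults.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchConfig mirrors the field names listed in the --search help text.
// The Go types are assumed for this sketch.
type searchConfig struct {
	SearchString string `json:"search_string"`
	Severity     int    `json:"severity"`
	Number       int    `json:"number"`
	Comparison   string `json:"comparison"`
	FullCmdline  bool   `json:"full_cmdline"`
}

// parseSearch decodes a JSON array of search criteria.
func parseSearch(raw string) ([]searchConfig, error) {
	var cfg []searchConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return nil, fmt.Errorf("invalid search configuration: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseSearch(`[{"search_string": "/usr/sbin/sshd", "full_cmdline": true}]`)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```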

### Environment variables

| Argument | Environment Variable |
|----------------------|------------------------------------|
| --metrics-only | PROCESSES_CHECK_METRICS_ONLY |
| --search | PROCESSES_CHECK_SEARCH |
| --suppress-ok-output | PROCESSES_CHECK_SUPPRESS_OK_OUTPUT |
| --verbose | PROCESSES_CHECK_SUMOLOGIC_VERBOSE |

### Search string details

The search string is a JSON array of processes to search for. Each JSON object
@@ -106,8 +203,19 @@ is running on a Linux server, the following output may be produced:

```
sensu-processes-check -s '[{"search_string": "sshd"}]'
OK | 3 >= 1 (found >= required) evaluated true for "sshd"
Status - OK
# OK | 3 >= 1 (found >= required) evaluated true for "sshd"
# Status - OK

# HELP procstat per-process metrics
# TYPE procstat gauge

...

# HELP processes summary metrics
# TYPE processes gauge
processes{field="total",units="count"} 3 1646434587929
...

```

If you compare the output of `ps -e` and `ps -ef` you will see the 3 matches it
@@ -132,8 +240,19 @@ and set `search_string` to `/usr/sbin/sshd`.

```
sensu-processes-check -s '[{"search_string": "/usr/sbin/sshd", "full_cmdline": true}]'
OK | 1 >= 1 (found >= required) evaluated true for "/usr/sbin/sshd"
Status - OK
# OK | 1 >= 1 (found >= required) evaluated true for "/usr/sbin/sshd"
# Status - OK

# HELP procstat per-process metrics
# TYPE procstat gauge

...

# HELP processes summary metrics
# TYPE processes gauge
processes{field="total",units="count"} 1 1646434588019
...

```

#### Supported comparisons
61 changes: 50 additions & 11 deletions go.mod
@@ -1,20 +1,59 @@
module github.com/sensu/sensu-processes-check

go 1.14
go 1.18

require (
github.com/Knetic/govaluate v3.0.0+incompatible
github.com/StackExchange/wmi v0.0.0-20190523213315-cbe66965904d // indirect
github.com/go-ole/go-ole v1.2.4 // indirect
github.com/pelletier/go-toml v1.6.0 // indirect
github.com/sensu-community/sensu-plugin-sdk v0.11.0
github.com/sensu/sensu-go/api/core/v2 v2.3.0
github.com/prometheus/client_model v0.2.0
github.com/prometheus/common v0.32.1
github.com/sensu/sensu-go/api/core/v2 v2.14.0
github.com/sensu/sensu-plugin-sdk v0.16.0
github.com/shirou/gopsutil v3.20.10+incompatible
github.com/spf13/afero v1.2.2 // indirect
github.com/stretchr/testify v1.7.0
)

require (
github.com/StackExchange/wmi v1.2.1 // indirect
github.com/blang/semver/v4 v4.0.0 // indirect
github.com/coreos/go-semver v0.3.0 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/echlebek/timeproxy v1.0.0 // indirect
github.com/fsnotify/fsnotify v1.4.7 // indirect
github.com/go-ole/go-ole v1.2.5 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang-jwt/jwt/v4 v4.0.0 // indirect
github.com/golang/protobuf v1.5.2 // indirect
github.com/google/go-cmp v0.5.7 // indirect
github.com/google/uuid v1.1.2 // indirect
github.com/hashicorp/hcl v1.0.0 // indirect
github.com/inconshreveable/mousetrap v1.0.0 // indirect
github.com/konsorten/go-windows-terminal-sequences v1.0.3 // indirect
github.com/magiconair/properties v1.8.1 // indirect
github.com/matttproud/golang_protobuf_extensions v1.0.1 // indirect
github.com/mitchellh/mapstructure v1.1.2 // indirect
github.com/pelletier/go-toml v1.2.0 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/robertkrimen/otto v0.0.0-20191219234010-c382bd3c16ff // indirect
github.com/robfig/cron/v3 v3.0.1 // indirect
github.com/sensu/sensu-go/types v0.10.0 // indirect
github.com/sensu/sensu-licensing v0.1.2 // indirect
github.com/sirupsen/logrus v1.6.0 // indirect
github.com/spf13/afero v1.1.2 // indirect
github.com/spf13/cast v1.3.1 // indirect
github.com/spf13/cobra v1.4.0 // indirect
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/stretchr/testify v1.6.0
golang.org/x/net v0.0.0-20200114155413-6afb5195e5aa // indirect
golang.org/x/sys v0.0.0-20200120151820-655fe14d7479 // indirect
gopkg.in/ini.v1 v1.51.1 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/spf13/viper v1.7.0 // indirect
github.com/subosito/gotenv v1.2.0 // indirect
go.etcd.io/etcd/api/v3 v3.5.0 // indirect
golang.org/x/net v0.0.0-20210525063256-abc453219eb5 // indirect
golang.org/x/sys v0.0.0-20210603081109-ebe580a85c40 // indirect
golang.org/x/text v0.3.6 // indirect
google.golang.org/genproto v0.0.0-20210602131652-f16073e35f0c // indirect
google.golang.org/grpc v1.38.0 // indirect
google.golang.org/protobuf v1.26.0 // indirect
gopkg.in/ini.v1 v1.51.0 // indirect
gopkg.in/sourcemap.v1 v1.0.5 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c // indirect
)