happo-agent is yet another Nagios NRPE plugin that improves on NRPE's functions.
- More secure communication. Supports TLS 1.2.
- Less fork cost at bastion(proxy) mode. Proxy request handled by thread (not fork()).
- Metric collection. Compatible with the Sensu plugin format.
- Inventory collection.
- Red Hat Enterprise Linux (RHEL) 6.x, 7.x
- CentOS 6.x, 7.x
- Ubuntu 12.04 or later
/path/to/happo-agent daemon -A [Accept from IP/Subnet] -B [Public key file] -R [Private key file] -M [Metric config file (Accept empty file)]
Many options can also be set with environment variables.
See /etc/default/happo-agent.env
(an example is in contrib/etc/default/happo-agent.env)
When a plugin is called from check_happo, happo-agent runs the local Nagios plugin program and returns its exit code and output to check_happo.
For more information, please see the check_happo README.
Every minute, happo-agent executes the Sensu metrics plugins defined in metrics.yaml and buffers the results.
To collect the buffered results, use the /metric API method.
Get command-based inventory data via the /inventory API method.
You can create a happo-agent client management server if you want.
With the API client commands below, happo-agent calls the endpoint URL of the client management server.
/path/to/happo-agent add -e [ENDPOINT_URL] -g [GROUP_NAME[!SUB_GROUP_NAME]] -i [OWN_IP] -H [HOSTNAME] [-p BASTION_IP]
/path/to/happo-agent is_added -e [ENDPOINT_URL] -g [GROUP_NAME[!SUB_GROUP_NAME]] -i [OWN_IP]
/path/to/happo-agent remove -e [ENDPOINT_URL] -g [GROUP_NAME[!SUB_GROUP_NAME]] -i [OWN_IP]
$ sudo yum install epel-release
$ sudo yum install nagios-plugins-all
$ go get -dv github.com/heartbeatsjp/happo-agent
$ sudo install $GOHOME/src/bin/happo-agent /usr/local/bin/happo-agent
$ sudo install -d -m 755 /etc/happo
$ cd /etc/happo
$ sudo openssl genrsa -aes128 -out happo-agent.key 2048
$ sudo openssl req -new -key happo-agent.key -sha256 -out happo-agent.csr
$ sudo openssl x509 -in happo-agent.csr -days 3650 -req -signkey happo-agent.key -sha256 -out happo-agent.pub
$ sudo touch metrics.yaml
$ sudo chmod go-rwx happo-agent.key
$ sudo install contrib/etc/default/happo-agent.env /etc/default/happo-agent.env
$ sudo install contrib/etc/init/happo-agent.conf /etc/init/happo-agent.conf
$ sudo initctl reload-configuration
$ sudo initctl start happo-agent
If you want to use Sensu metrics plugins, install them into /usr/local/bin.
Prebuilt binaries may be useful: see Releases · heartbeatsjp/happo-agent
metrics.yaml
metrics:
  - hostname: [HOSTNAME]
    plugins:
      - plugin_name: [Sensu plugin name (Path not needed)]
        plugin_option: [Sensu plugin name options]
      - ...
  - ...
- Listen port: 6777 (Default)
- HTTPS, TLS 1.2, CipherSuite:
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
Checks whether the agent is available.
- Input format
- None
- Return format
- String "OK"
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/
OK
Uses the agent in bastion (proxy) mode.
- Input format
- JSON
- Input variables
- proxy_hostport: (Array) bastion_ip:port. Multiple entries can be defined.
- request_type: request type (e.g. monitor)
- request_json: JSON string to send to the server.
- Return format
- JSON
- Return variables
- Depends on request_type.
If --proxy-timeout-seconds is reached, the agent returns 504 Gateway Timeout.
$ wget -q --no-check-certificate -O - https://192.0.2.1:6777/proxy --post-data='{"proxy_hostport": ["198.51.100.1:6777"], "request_type": "monitor", "request_json": "{\"apikey\": \"\", \"plugin_name\": \"check_procs\", \"plugin_option\": \"-w 100 -c 200\"}"}'
{"return_value":1,"message":"PROCS WARNING: 168 processes\n"}
The example above calls: wget host -> https://192.0.2.1:6777/proxy -> https://198.51.100.1:6777/monitor.
Gets inventory information by running a command.
- Input format
- JSON
- Input variables
- apikey: ""
- command: execute command
- command_option: command option
- Return format
- JSON
- Return variables
- return_code: the command's return code
- return_value: the command's output (stdout, stderr)
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/inventory --post-data='{"apikey": "", "command": "uname", "command_option": "-a"}'
{"return_code":0,"return_value":"Linux saito-hb-vm101 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux\n"}
Calls a monitor plugin, like NRPE does.
- Input format
- JSON
- Input variables
- apikey: ""
- plugin_name: Nagios plugin command to execute
- plugin_option: plugin options
- Return format
- JSON
- Return variables
- return_value: the plugin's return code
- message: the plugin's output (stdout, stderr)
If --command-timeout is reached, the agent returns 500 Internal Server Error.
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/monitor --post-data='{"apikey": "", "plugin_name": "check_procs", "plugin_option": "-w 100 -c 200"}'
{"return_value":1,"message":"PROCS WARNING: 168 processes\n"}
Get collected metric values.
- Input format
- JSON
- Input variables
- apikey: ""
- Return format
- JSON
- Return variables
- MetricData: (Array)
- hostname: Hostname
- timestamp: Unix time
- metrics: metric name - metric value (key-value)
- Message: message from agent (if error occurred)
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/metric --post-data='{"apikey": ""}'
{"metric_data":[{"hostname":"saito-hb-vm101","timestamp":1444028730,"metrics":{"linux.context_switches.context_switches":32662,"linux.disk.elapsed.iotime_sda":52,"linux.disk.elapsed.iotime_weighted_sda":82,"linux.disk.rwtime.tsreading_sda":0,"linux.disk.rwtime.tswriting_sda":82,"linux.forks.forks":88,"linux.interrupts.interrupts":19642,"linux.ss.CLOSE-WAIT":0,"linux.ss.CLOSING":0,"linux.ss.ESTAB":9,"linux.ss.FIN-WAIT-1":0,"linux.ss.FIN-WAIT-2":0,"linux.ss.LAST-ACK":0,"linux.ss.LISTEN":31,"linux.ss.SYN-RECV":0,"linux.ss.SYN-SENT":0,"linux.ss.TIME-WAIT":7,"linux.ss.UNCONN":0,"linux.ss.UNKNOWN":0,"linux.swap.pswpin":0,"linux.swap.pswpout":0,"linux.users.users":1}},…(snip)…],"message":""}
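As a rough illustration of consuming this response, the Go sketch below decodes a trimmed copy of the example. The JSON field names come from the example; the Go type names and `decodeMetricResponse` are hypothetical, and metric values are assumed to be numeric.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// metricsData mirrors one element of the metric_data array in the example.
type metricsData struct {
	Hostname  string             `json:"hostname"`
	Timestamp int64              `json:"timestamp"`
	Metrics   map[string]float64 `json:"metrics"`
}

// metricResponse mirrors the top-level /metric response in the example.
type metricResponse struct {
	MetricData []metricsData `json:"metric_data"`
	Message    string        `json:"message"`
}

// decodeMetricResponse (hypothetical helper) parses a /metric response body.
func decodeMetricResponse(raw []byte) (metricResponse, error) {
	var resp metricResponse
	err := json.Unmarshal(raw, &resp)
	return resp, err
}

func main() {
	// Trimmed copy of the example response.
	raw := []byte(`{"metric_data":[{"hostname":"saito-hb-vm101","timestamp":1444028730,` +
		`"metrics":{"linux.forks.forks":88,"linux.ss.ESTAB":9,"linux.users.users":1}}],"message":""}`)
	resp, err := decodeMetricResponse(raw)
	if err != nil {
		panic(err)
	}
	for _, d := range resp.MetricData {
		// Sort metric names so the printed order is stable.
		names := make([]string, 0, len(d.Metrics))
		for name := range d.Metrics {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			fmt.Printf("%s %d %s=%g\n", d.Hostname, d.Timestamp, name, d.Metrics[name])
		}
	}
}
```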
Append metric values. (passive metrics collection)
- Input format
- JSON
- Input variables
- apikey: ""
- MetricData: (Array)
- hostname: Hostname
- timestamp: Unix time
- metrics: metric name - metric value (key-value)
- Return format
- JSON
- Return variables
- Message: message from agent (if error occurred)
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/metric/append --post-data='{"apikey": "", "metric_data":[{"hostname":"saito-hb-vm101","timestamp":1444028730,"metrics":{"linux.context_switches.context_switches":32662,"linux.disk.elapsed.iotime_sda":52,"linux.disk.elapsed.iotime_weighted_sda":82,"linux.disk.rwtime.tsreading_sda":0,"linux.disk.rwtime.tswriting_sda":82,"linux.forks.forks":88,"linux.interrupts.interrupts":19642,"linux.ss.CLOSE-WAIT":0,"linux.ss.CLOSING":0,"linux.ss.ESTAB":9,"linux.ss.FIN-WAIT-1":0,"linux.ss.FIN-WAIT-2":0,"linux.ss.LAST-ACK":0,"linux.ss.LISTEN":31,"linux.ss.SYN-RECV":0,"linux.ss.SYN-SENT":0,"linux.ss.TIME-WAIT":7,"linux.ss.UNCONN":0,"linux.ss.UNKNOWN":0,"linux.swap.pswpin":0,"linux.swap.pswpout":0,"linux.users.users":1}},...(snip)...]}'
{"status": "ok", "message": ""}
TODO
Replaced by /status.
Get happo-agent status
- Input format
- None
- Input variables
- None
- Return format
- JSON
- Return variables
- app_version: happo-agent version (equivalent to happo-agent -v)
- uptime_seconds: seconds since happo-agent started
- num_goroutine: number of goroutines
- metric_buffer_status
- oldest_timestamp: oldest Timestamp(int64) in metric_data_buffer
- newest_timestamp: newest Timestamp(int64) in metric_data_buffer
- callers: filepath:linenum of each goroutine
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/status
{"app_version":"1.0.0","uptime_seconds":13,"num_goroutine":15,"metric_buffer_status":{"newest_timestamp":1505180794,"oldest_timestamp":1504852118},"callers":["/goroot/src/runtime/extern.go:219","/gopath/src/github.com/heartbeatsjp/happo-agent/model/status.go:28",...(snip)...]}
Get happo-agent memory usage status
- Input format
- None
- Input variables
- None
- Return format
- JSON
- Return variables
- runtime.MemStats
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/status/memory
{"Alloc":7155296,"TotalAlloc":12148632,"Sys":14395640,"Lookups":34,"Mallocs":23456,"Frees":6565,...(snip)...}
Get request status/count.
- Input format
- None
- Input variables
- None
- Return format
- JSON
- Return variables
- last1: results for the last 1 minute
- url: URL
- counts: <status_code> - count
- last5: results for the last 5 minutes
- same as last1
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/status/request
{"last1":[{"url":"/","counts":{"200":3,"403":1}},{"url":"/proxy","counts":{"200":1,"403":1}}],"last5":[{"url":"/","counts":{"200":3,"403":1}},{"url":"/proxy","counts":{"200":1,"403":1}}]}
Get machine state key list.
- Input format
- None
- Input variables
- None
- Return format
- JSON
- Return variables
- keys: machine-state key list
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/machine-state
{"keys":["s-1498112479","s-1498112819"]}
Get machine state.
- Input format
- None
- Input variables
- key (can be found via /machine-state/)
- Return format
- JSON
- Return variables
- machineState: command results
$ wget -q --no-check-certificate -O - https://127.0.0.1:6777/machine-state/s-1498112479
{"machineState":"********** w (2017-06-22T15:21:19+09:00) cron 15:21:19 up 13 days, ..."}
- key m-<timestamp> are metrics (timestamp is unixtime).
- value: happo_agent.MetricsData
- key s-<timestamp> are saved machine state (timestamp is unixtime).
- value: string
syndtr/goleveldb: LevelDB key/value database in Go.
- Fork (http://github.com/heartbeatsjp/happo-agent/fork)
- Create a feature branch
- Commit your changes
- Rebase your local changes against the master branch
- Run the test suite with the go test ./... command and confirm that it passes
- Run gofmt -s
- Create a new Pull Request
You can also run the test suite with Docker on your local PC.
Overview: testing is basically done with go test.
- Unit test: go test in CI
- Endpoint behavior test: go test in CI
- Regression test: run automatically in CI
The daemontest pipeline tests the running daemon.
To run daemontest locally with wercker-cli, run the command below.
wercker build --pipeline daemontest
This confirms that the binary meets the following criteria.
- Case:
- Test Duration: 2700sec(45min)
- Monitor Requests: 10k req/45 min (about 222 req/min)
- Metric Count: 200 => metrics data stored 200 metrics per minute
- Criteria:
- CPU Usage: up to 4%
- A monitoring agent's CPU usage should be small.
- Mem Usage: up to 500MB
- A monitoring agent's memory usage should be small. We also have to avoid memory leaks.
- Disk Usage: up to 250KB
- Disk usage depends mostly on the amount of stored metrics. We have to keep disk usage within bounds.
A longer-running test would be better for a daemon, but the maximum is 59 min because of Wercker's restriction.
Notes on the implementation:
- daemontest configurations are in wercker.yml, under daemontest > steps > script.name=="test daemon behavior" > code
- yq filter is .daemontest.steps[] | select(.script.name=="test daemon behavior") | .script.code
- Case:
- Test Duration: TEST_DURATION_SEC
- to complete the requests within the test duration, you may have to change MONITOR_REQUESTS and MONITOR_REQUESTS_INTERVAL
- Monitor Requests: MONITOR_REQUESTS / TEST_DURATION_SEC
- Metric Count: METRICS_COUNT
- Criteria:
- CPU Usage: CPU_THRESHOLD_PERCENT
- Mem Usage: MEM_THRESHOLD_KB
- Disk Usage: DB_DISK_THRESHOLD_KB
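The request rate implied by MONITOR_REQUESTS / TEST_DURATION_SEC can be worked out for the case listed earlier (10k requests over 2700 sec). The sketch below only does that arithmetic; the constants are assumed to match the test case and the Go names are hypothetical.

```go
package main

import "fmt"

// Values assumed from the test case: 10k monitor requests over 2700 sec.
const (
	testDurationSec = 2700
	monitorRequests = 10000
)

// requestsPerSecond returns the average request rate over the test run.
func requestsPerSecond(requests, durationSec int) float64 {
	return float64(requests) / float64(durationSec)
}

func main() {
	rps := requestsPerSecond(monitorRequests, testDurationSec)
	fmt.Printf("%.2f req/s, %.0f req/min\n", rps, rps*60)
}
```

This works out to roughly 3.7 requests per second, or about 222 per minute.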
Copyright 2016 HEARTBEATS Corporation.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.