Add unbound input plugin #3434
Changes from 5 commits
3f14a43
25f2027
9ef60ac
fd7b457
e1f6302
3bbbc20
b707914
9d28bad
b4153e6
1798d8c
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,186 @@ | ||
# Unbound Input Plugin | ||
|
||
This plugin gathers stats from [Unbound - a validating, recursive, and caching DNS resolver](https://www.unbound.net/) | ||
|
||
### Configuration: | ||
|
||
```toml | ||
# A plugin to collect stats from Unbound - a validating, recursive, and caching DNS resolver | ||
[[inputs.unbound]] | ||
## If running as a restricted user you can prepend sudo for additional access: | ||
#use_sudo = false | ||
|
||
## The default location of the unbound-control binary can be overridden with: | ||
binary = "/usr/sbin/unbound-control" | ||
|
||
## By default, telegraf gathers stats for 3 metric points. | ||
## Setting stats will override the defaults shown below. | ||
## stats may also be set to ["all"], which will collect all stats | ||
stats = ["total.*", "num.*","time.up", "mem.*"] | ||
``` | ||
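As a rough illustration of how the `stats` globs in the configuration above select stat names, the sketch below uses the standard library's `path.Match` as a stand-in. The plugin itself compiles the patterns with telegraf's `filter` package, whose glob semantics may differ in detail, and `selectStats` is a hypothetical helper that is not part of this PR.

```go
package main

import (
	"fmt"
	"path"
)

// selectStats returns the stat names matched by at least one glob pattern.
// path.Match is only a stand-in for telegraf's filter package here.
func selectStats(patterns, names []string) []string {
	var selected []string
	for _, name := range names {
		for _, pattern := range patterns {
			// Malformed patterns are treated as non-matching in this sketch.
			if ok, err := path.Match(pattern, name); err == nil && ok {
				selected = append(selected, name)
				break
			}
		}
	}
	return selected
}

func main() {
	defaults := []string{"total.*", "num.*", "time.up", "mem.*"}
	names := []string{
		"thread0.num.queries",
		"total.num.queries",
		"time.up",
		"time.now",
		"mem.cache.rrset",
		"num.query.type.A",
	}
	for _, name := range selectStats(defaults, names) {
		fmt.Println(name)
	}
}
```

With the default patterns, per-thread stats such as `thread0.num.queries` and the `time.now` stat are not selected, because a pattern has to match the whole stat name.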
|
||
### Measurements & Fields: | ||
|
||
This is the full list of stats provided by unbound. Stats will be grouped by their prefix (eg thread0, | ||
total, etc). In the output, the prefix will be used as a tag, and removed from field names. See | ||
https://www.unbound.net/documentation/unbound-control.html for details. | ||
|
||
- unbound | ||
thread0.num.queries | ||
Review comment: I think we should consider replacing dots with underscores in the field names. |
||
thread0.num.cachehits | ||
thread0.num.cachemiss | ||
thread0.num.prefetch | ||
thread0.num.recursivereplies | ||
thread0.requestlist.avg | ||
thread0.requestlist.max | ||
thread0.requestlist.overwritten | ||
thread0.requestlist.exceeded | ||
thread0.requestlist.current.all | ||
thread0.requestlist.current.user | ||
thread0.recursion.time.avg | ||
thread0.recursion.time.median | ||
total.num.queries | ||
total.num.cachehits | ||
total.num.cachemiss | ||
total.num.prefetch | ||
total.num.recursivereplies | ||
total.requestlist.avg | ||
total.requestlist.max | ||
total.requestlist.overwritten | ||
total.requestlist.exceeded | ||
total.requestlist.current.all | ||
total.requestlist.current.user | ||
total.recursion.time.avg | ||
total.recursion.time.median | ||
time.now | ||
time.up | ||
time.elapsed | ||
mem.total.sbrk | ||
mem.cache.rrset | ||
mem.cache.message | ||
mem.mod.iterator | ||
mem.mod.validator | ||
histogram.000000.000000.to.000000.000001 | ||
Review comment: How are these histogram metrics encoded? Is the field named
Author reply: Yes, actually the field name will be 000000.000000.to.000000.000001, which is not really useful. Anyway, I don't think this kind of information is relevant for telegraf. It is easier to construct the histogram from the data telegraf collects than to use these fields. I think I will drop them from the collected metrics. |
||
histogram.000000.000001.to.000000.000002 | ||
histogram.000000.000002.to.000000.000004 | ||
histogram.000000.000004.to.000000.000008 | ||
histogram.000000.000008.to.000000.000016 | ||
histogram.000000.000016.to.000000.000032 | ||
histogram.000000.000032.to.000000.000064 | ||
histogram.000000.000064.to.000000.000128 | ||
histogram.000000.000128.to.000000.000256 | ||
histogram.000000.000256.to.000000.000512 | ||
histogram.000000.000512.to.000000.001024 | ||
histogram.000000.001024.to.000000.002048 | ||
histogram.000000.002048.to.000000.004096 | ||
histogram.000000.004096.to.000000.008192 | ||
histogram.000000.008192.to.000000.016384 | ||
histogram.000000.016384.to.000000.032768 | ||
histogram.000000.032768.to.000000.065536 | ||
histogram.000000.065536.to.000000.131072 | ||
histogram.000000.131072.to.000000.262144 | ||
histogram.000000.262144.to.000000.524288 | ||
histogram.000000.524288.to.000001.000000 | ||
histogram.000001.000000.to.000002.000000 | ||
histogram.000002.000000.to.000004.000000 | ||
histogram.000004.000000.to.000008.000000 | ||
histogram.000008.000000.to.000016.000000 | ||
histogram.000016.000000.to.000032.000000 | ||
histogram.000032.000000.to.000064.000000 | ||
histogram.000064.000000.to.000128.000000 | ||
histogram.000128.000000.to.000256.000000 | ||
histogram.000256.000000.to.000512.000000 | ||
histogram.000512.000000.to.001024.000000 | ||
histogram.001024.000000.to.002048.000000 | ||
histogram.002048.000000.to.004096.000000 | ||
histogram.004096.000000.to.008192.000000 | ||
histogram.008192.000000.to.016384.000000 | ||
histogram.016384.000000.to.032768.000000 | ||
histogram.032768.000000.to.065536.000000 | ||
histogram.065536.000000.to.131072.000000 | ||
histogram.131072.000000.to.262144.000000 | ||
histogram.262144.000000.to.524288.000000 | ||
num.query.type.A | ||
num.query.type.PTR | ||
num.query.type.TXT | ||
num.query.type.AAAA | ||
num.query.type.SRV | ||
num.query.type.ANY | ||
num.query.class.IN | ||
num.query.opcode.QUERY | ||
num.query.tcp | ||
num.query.ipv6 | ||
num.query.flags.QR | ||
num.query.flags.AA | ||
num.query.flags.TC | ||
num.query.flags.RD | ||
num.query.flags.RA | ||
num.query.flags.Z | ||
num.query.flags.AD | ||
num.query.flags.CD | ||
num.query.edns.present | ||
num.query.edns.DO | ||
num.answer.rcode.NOERROR | ||
num.answer.rcode.SERVFAIL | ||
num.answer.rcode.NXDOMAIN | ||
num.answer.rcode.nodata | ||
num.answer.secure | ||
num.answer.bogus | ||
num.rrset.bogus | ||
unwanted.queries | ||
unwanted.replies | ||
|
||
### Tags: | ||
|
||
As indicated above, the prefix of an unbound stat will be used as its 'section' tag, so the section tag may have one of | ||
the following values: | ||
Review comment: It seems like each section is very different, maybe we should call the measurement
Author reply: As there is not much data (no real need to spread the information across several measurements) and I do not control how unbound-control evolves, I will start by simply dropping the section tag and pushing all fields as they are (only converting dots in field names to underscores). |
||
- section: | ||
- thread0 | ||
- total | ||
- time | ||
- mem | ||
- histogram | ||
- num | ||
- unwanted | ||
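The approach discussed in the review thread for this section (drop the `section` tag and convert dots in stat names to underscores) could look roughly like the sketch below. `fieldName` is a hypothetical helper used only for illustration; the diff shown here does not contain it.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldName converts an unbound-control stat name into a flat field name by
// replacing every dot with an underscore (hypothetical helper).
func fieldName(stat string) string {
	return strings.Replace(stat, ".", "_", -1)
}

func main() {
	for _, stat := range []string{"total.num.queries", "time.up", "mem.cache.rrset"} {
		fmt.Printf("%s -> %s\n", stat, fieldName(stat))
	}
}
```

With this mapping every stat lands in a single `unbound` measurement without a `section` tag, which keeps the plugin independent of how unbound-control groups its stats in future releases.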
|
||
### Permissions: | ||
|
||
It's important to note that this plugin references unbound-control, which may require additional permissions to execute successfully. | ||
Depending on the user/group permissions of the telegraf user executing this plugin, you may need to alter the group membership, set facls, or use sudo. | ||
|
||
**Group membership (Recommended)**: | ||
```bash | ||
$ groups telegraf | ||
telegraf : telegraf | ||
|
||
$ usermod -a -G unbound telegraf | ||
|
||
$ groups telegraf | ||
telegraf : telegraf unbound | ||
``` | ||
|
||
**Sudo privileges**: | ||
If you use this method, you will need the following in your telegraf config: | ||
```toml | ||
[[inputs.unbound]] | ||
use_sudo = true | ||
``` | ||
|
||
You will also need to update your sudoers file: | ||
```bash | ||
$ visudo | ||
# Add the following line: | ||
telegraf ALL=(ALL) NOPASSWD: /usr/sbin/unbound-control | ||
``` | ||
|
||
Please use the solution you see as most appropriate. | ||
|
||
### Example Output: | ||
|
||
``` | ||
telegraf --config etc/telegraf.conf --input-filter unbound --test | ||
* Plugin: inputs.unbound, Collection 1 | ||
> unbound,section=total,host=laptop-aromeyer num.cachemiss=0,requestlist.current.all=0,num.cachehits=0,requestlist.overwritten=0,requestlist.max=0,num.recursivereplies=0,requestlist.avg=0,recursion.time.avg=0,recursion.time.median=0,num.prefetch=0,requestlist.exceeded=0,requestlist.current.user=0,tcpusage=0,num.queries=0 1509977403000000000 | ||
> unbound,section=time,host=laptop-aromeyer up=5794.844261,elapsed=12.484727,now=1509977402.617432 1509977403000000000 | ||
|
||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
// +build !windows | ||
Review comment: Can you remove this build flag here, and also in the test file? |
||
|
||
package unbound | ||
|
||
import ( | ||
"bufio" | ||
"bytes" | ||
"fmt" | ||
"os/exec" | ||
"strconv" | ||
"strings" | ||
"time" | ||
|
||
"github.com/influxdata/telegraf" | ||
"github.com/influxdata/telegraf/filter" | ||
"github.com/influxdata/telegraf/internal" | ||
"github.com/influxdata/telegraf/plugins/inputs" | ||
) | ||
|
||
type runner func(cmdName string, UseSudo bool) (*bytes.Buffer, error) | ||
|
||
// Unbound is used to store configuration values | ||
type Unbound struct { | ||
Stats []string | ||
Binary string | ||
UseSudo bool | ||
|
||
filter filter.Filter | ||
run runner | ||
} | ||
|
||
var defaultStats = []string{"total.*", "num.*", "time.up", "mem.*"} | ||
var defaultBinary = "/usr/sbin/unbound-control" | ||
|
||
var sampleConfig = ` | ||
## If running as a restricted user you can prepend sudo for additional access: | ||
#use_sudo = false | ||
|
||
## The default location of the unbound-control binary can be overridden with: | ||
binary = "/usr/sbin/unbound-control" | ||
|
||
## By default, telegraf gather stats for 3 metric points. | ||
## Setting stats will override the defaults shown below. | ||
## Glob matching can be used, ie, stats = ["total.*"] | ||
## stats may also be set to ["*"], which will collect all stats | ||
stats = ["total.*", "num.*","time.up", "mem.*"] | ||
` | ||
|
||
func (s *Unbound) Description() string { | ||
return "A plugin to collect stats from Unbound - a validating, recursive, and caching DNS resolver " | ||
} | ||
|
||
// SampleConfig displays configuration instructions | ||
func (s *Unbound) SampleConfig() string { | ||
return sampleConfig | ||
} | ||
|
||
// Shell out to unbound_stat and return the output | ||
func unboundRunner(cmdName string, UseSudo bool) (*bytes.Buffer, error) { | ||
cmdArgs := []string{"stats"} | ||
|
||
cmd := exec.Command(cmdName, cmdArgs...) | ||
|
||
if UseSudo { | ||
cmdArgs = append([]string{cmdName}, cmdArgs...) | ||
cmd = exec.Command("sudo", cmdArgs...) | ||
} | ||
|
||
var out bytes.Buffer | ||
cmd.Stdout = &out | ||
err := internal.RunTimeout(cmd, time.Millisecond*200) | ||
Review comment: This seems like an aggressive timeout. I would move it up to at least a second and consider making it configurable. |
||
if err != nil { | ||
return &out, fmt.Errorf("error running unbound-control: %s", err) | ||
} | ||
|
||
return &out, nil | ||
} | ||
|
||
// Gather collects the configured stats from unbound_stat and adds them to the | ||
// Accumulator | ||
// | ||
// The prefix of each stat (eg MAIN, MEMPOOL, LCK, etc) will be used as a | ||
Review comment: These aren't actual sections, and there is no unbound_stat. I guess these are holdovers from an earlier version of the code; can you update it? |
||
// 'section' tag and all stats that share that prefix will be reported as fields | ||
// with that tag | ||
func (s *Unbound) Gather(acc telegraf.Accumulator) error { | ||
if s.filter == nil { | ||
var err error | ||
if len(s.Stats) == 0 { | ||
s.filter, err = filter.Compile(defaultStats) | ||
} else { | ||
// legacy support, change "all" -> "*": | ||
Review comment: We can't have legacy support already since we are a new plugin :) |
||
if s.Stats[0] == "all" { | ||
s.Stats[0] = "*" | ||
} | ||
s.filter, err = filter.Compile(s.Stats) | ||
} | ||
if err != nil { | ||
return err | ||
} | ||
} | ||
|
||
out, err := s.run(s.Binary, s.UseSudo) | ||
if err != nil { | ||
return fmt.Errorf("error gathering metrics: %s", err) | ||
} | ||
|
||
sectionMap := make(map[string]map[string]interface{}) | ||
scanner := bufio.NewScanner(out) | ||
for scanner.Scan() { | ||
|
||
cols := strings.Split(scanner.Text(), "=") | ||
|
||
stat := cols[0] | ||
value := cols[1] | ||
Review comment: This could panic if somehow there are not two fields; make sure to guard against this. |
||
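A minimal sketch of the guard the reviewer asks for, assuming each line of `unbound-control stats` output has the form `name=value`. `splitStat` is a hypothetical helper, not part of the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// splitStat splits a "name=value" line. ok is false when the line has no "="
// separator or an empty name, so callers can skip it instead of indexing past
// the end of the slice.
func splitStat(line string) (stat, value string, ok bool) {
	cols := strings.SplitN(line, "=", 2)
	if len(cols) != 2 || cols[0] == "" {
		return "", "", false
	}
	return cols[0], cols[1], true
}

func main() {
	for _, line := range []string{"total.num.queries=42", "", "not a stat line"} {
		stat, value, ok := splitStat(line)
		if !ok {
			fmt.Printf("skipping malformed line %q\n", line)
			continue
		}
		fmt.Printf("stat=%s value=%s\n", stat, value)
	}
}
```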
|
||
if s.filter != nil && !s.filter.Match(stat) { | ||
continue | ||
} | ||
|
||
parts := strings.SplitN(stat, ".", 2) | ||
|
||
section := parts[0] | ||
field := parts[1] | ||
|
||
// Init the section if necessary | ||
if _, ok := sectionMap[section]; !ok { | ||
sectionMap[section] = make(map[string]interface{}) | ||
} | ||
|
||
sectionMap[section][field], err = strconv.ParseUint(value, 10, 64) | ||
Review comment: Parse as int64, since this is what the Accumulator holds. If there is an error parsing, don't add to the fields. |
||
if err != nil { | ||
sectionMap[section][field], err = strconv.ParseFloat(value, 64) | ||
Review comment: This fallback method could switch the type of the field, which will cause the metrics to be impossible to add to InfluxDB. We may need to parse all fields as floats, or have a list of the types. |
||
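Of the two options raised in the review, the sketch below takes the first: parse every value as a float so that a field never changes type between collections, and skip values that fail to parse. `parseValue` and the sample data are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseValue parses a stat value as float64. Using one numeric type for every
// field avoids a field switching between integer and float across collections.
func parseValue(value string) (float64, error) {
	return strconv.ParseFloat(value, 64)
}

func main() {
	fields := make(map[string]interface{})
	raw := map[string]string{
		"total.num.queries":        "42",
		"total.recursion.time.avg": "0.000123",
		"bad.stat":                 "not-a-number",
	}
	for stat, value := range raw {
		v, err := parseValue(value)
		if err != nil {
			// Skip the value instead of storing a zero in the fields map.
			fmt.Printf("skipping %s: %v\n", stat, err)
			continue
		}
		fields[stat] = v
	}
	fmt.Println(len(fields), "fields parsed")
}
```

The trade-off is that counters such as `num.queries` are stored as floats as well; keeping a per-stat list of types, the other option mentioned in the review, would avoid that at the cost of tracking new stats by hand.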
if err != nil { | ||
acc.AddError(fmt.Errorf("Expected a numeric or a float value for %s = %v\n", | ||
stat, value)) | ||
} | ||
} | ||
|
||
} | ||
|
||
for section, fields := range sectionMap { | ||
tags := map[string]string{ | ||
"section": section, | ||
} | ||
if len(fields) == 0 { | ||
continue | ||
} | ||
acc.AddFields("unbound", fields, tags) | ||
} | ||
|
||
return nil | ||
} | ||
|
||
func init() { | ||
inputs.Add("unbound", func() telegraf.Input { | ||
return &Unbound{ | ||
run: unboundRunner, | ||
Stats: defaultStats, | ||
Binary: defaultBinary, | ||
UseSudo: false, | ||
} | ||
}) | ||
} |
Review comment: In this section, list the fields as they will be emitted from Telegraf, so `thread0.num.queries` will be `num.queries`.
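To make the mapping in this comment concrete, here is a small sketch of how a stat name splits into the `section` tag and the emitted field name in this revision of the plugin. `splitSection` is a hypothetical helper; unlike the loop in the diff, it also rejects names that contain no dot rather than indexing a missing second part.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSection separates the prefix of a stat name (used as the 'section'
// tag) from the remainder (the field name as emitted). ok is false when the
// name contains no dot.
func splitSection(stat string) (section, field string, ok bool) {
	parts := strings.SplitN(stat, ".", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	for _, stat := range []string{"thread0.num.queries", "time.up", "nodot"} {
		section, field, ok := splitSection(stat)
		if !ok {
			fmt.Printf("%s: no section prefix\n", stat)
			continue
		}
		fmt.Printf("%s -> section=%s field=%s\n", stat, section, field)
	}
}
```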