Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ceph Cluster Performance Input Plugin #1514

Closed
wants to merge 1 commit into from

Conversation

spjmurray
Copy link
Contributor

The existing ceph input plugin only has access to the local admin daemon socket
on the local host, and as such has access to a limited subset of data. This
plugin uses CLI commands to get access to the full spread of Ceph data. This
patch collects global OSD map and IO statistics, PG state and per pool IO
and utilization statistics.

Implements #1513

This is my first ever attempt at writing Go, so take it easy on me ;)

@spjmurray spjmurray force-pushed the ceph_pool_stats branch 2 times, most recently from 0492105 to 3cc2900 Compare July 18, 2016 14:31
@sparrc
Copy link
Contributor

sparrc commented Jul 18, 2016

Nice job with the plugin, but do you think it could get added to the plain ceph plugin, rather than standing on it's own?


## Measurements & Fields

* ceph.osdmap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the measurement names should be separated by "_" instead of "."

@spjmurray spjmurray force-pushed the ceph_pool_stats branch 2 times, most recently from a36262e to 69679ba Compare July 19, 2016 09:56
@spjmurray
Copy link
Contributor Author

Comments addressed, have at it @sparrc !

@spjmurray spjmurray force-pushed the ceph_pool_stats branch 3 times, most recently from 05adacf to 3ffc350 Compare July 19, 2016 11:50
@spjmurray
Copy link
Contributor Author

3ffc350 - refactored PG states to be a set of fields, killed off the pool ID tags as no one in the world would use them rather than the textual name, which solves Cameron's ambiguity issue, and follows my mantra of KISS

@@ -46,15 +62,78 @@ Would be parsed into the following metrics, all of which would be tagged with co

## suffix used to identify socket files
socket_suffix = "asok"

## Ceph user to authenticate as
ceph_user = "client.admin"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was the ceph user not necessary in the previous plugin? and do we need to add a password configuration? or does the user need to do something to setup their admin credentials correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original plugin poked at a socket, but that only gives basic data for a single process. The user here implicitly looks up a key (possibly via ceph.conf) which allows you to authenticate and talk to the entire cluster, all simple stuff for a ceph admin, but I'm anticipating you want less terse documentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a little more detail, only if you think it's necessary though

@spjmurray
Copy link
Contributor Author

CI fail apparently unrelated:

--- FAIL: TestPgbouncerGeneratesMetrics (0.00s)
    Error Trace:    pgbouncer_test.go:59
    Error:      Should be true


FAIL
FAIL    github.com/influxdata/telegraf/plugins/inputs/pgbouncer 0.055s

@@ -79,6 +107,23 @@ func (c *Ceph) SampleConfig() string {

func (c *Ceph) Gather(acc telegraf.Accumulator) error {
c.setDefaults()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you didn't write the setDefaults function, but this is actually incorrect and will overwrite any user-provided configuration parameters. Defaults should be set in the init() function. So in init(), you should return the Ceph struct with any default options set.

You also should probably be setting GatherAdminSocketStats to true here. If this is the current behavior, then users of the ceph plugin will be broken by the change unless they update their config file (because all bools default to false in Go).

@sparrc
Copy link
Contributor

sparrc commented Aug 17, 2016

thanks @spjmurray, just one more thing to address. Basically you will need to remove the setDefaults function and set the default fields in the init() function instead.

@spjmurray spjmurray force-pushed the ceph_pool_stats branch 2 times, most recently from 11ea4c9 to da7acc9 Compare August 19, 2016 10:00
@spjmurray
Copy link
Contributor Author

Much cleaner @sparrc. Running against my production cluster now quite happily 😄

@spjmurray spjmurray force-pushed the ceph_pool_stats branch 2 times, most recently from 0e70c19 to e70d45f Compare August 19, 2016 10:22
The existing ceph input plugin only has access to the local admin daemon socket
on the local host, and as such has access to a limited subset of data.  This
extends the plugin to use CLI commands to get access to the full spread of Ceph
data.  This patch collects global OSD map and IO statistics, PG state and per pool
IO and utilization statistics.
@sparrc
Copy link
Contributor

sparrc commented Aug 30, 2016

closed by 38d8771

@sparrc sparrc closed this Aug 30, 2016
@kako420
Copy link

kako420 commented Aug 16, 2017

is this included in v1.3.5?

@danielnelson
Copy link
Contributor

@kako420 Yes, this was added in version 1.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants