
Ceph integration/bridge needs to provide pool usage data #80

Closed
brainfunked opened this issue Dec 16, 2016 · 3 comments

Comments

@brainfunked (Contributor)

Use `ceph df` to gather the data and add it as part of the Ceph state sync in the integration. The collectd Ceph plugin may be a useful reference for how to parse this data.
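As a rough illustration, a minimal sketch of parsing `ceph df --format json` output into per-pool utilization might look like the following. The field names (`pools`, `bytes_used`, `max_avail`) reflect the JSON that `ceph df` emits, but the sample payload and numbers here are made up for demonstration:

```python
import json

# Hypothetical, trimmed sample of `ceph df --format json` output;
# the real command returns many more stats fields.
SAMPLE = """
{
  "stats": {"total_bytes": 1000000000, "total_used_bytes": 250000000,
            "total_avail_bytes": 750000000},
  "pools": [
    {"name": "rbd", "id": 0,
     "stats": {"bytes_used": 100000000, "max_avail": 400000000, "objects": 25}}
  ]
}
"""

def pool_utilization(df_json):
    """Return {pool_name: used_fraction} from parsed `ceph df` JSON."""
    usage = {}
    for pool in df_json.get("pools", []):
        stats = pool["stats"]
        used = stats["bytes_used"]
        total = used + stats["max_avail"]
        usage[pool["name"]] = used / total if total else 0.0
    return usage

print(pool_utilization(json.loads(SAMPLE)))
# → {'rbd': 0.2}
```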

@shtripat (Member)

@brainfunked My understanding is that the pool utilization data needs to be fetched, parsed, and pushed to the time series DB for trending purposes as well.

For instantaneous pool utilization data for the cluster, we could pull the same data at the time of the Ceph state sync and attach it to the cluster state.

Comments?
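For the trending side, pushing a value into Graphite ultimately comes down to one line of carbon's plaintext protocol (`<path> <value> <timestamp>`) sent to port 2003. In the setup discussed here collectd would normally do this via its write_graphite plugin; the sketch below, with a hypothetical metric path, just shows what goes over the wire:

```python
import socket
import time

def format_metric(path, value, timestamp=None):
    """Render one line of Graphite's plaintext protocol: '<path> <value> <ts>'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)

def push_metrics(lines, host="localhost", port=2003):
    """Send pre-formatted metric lines to carbon's plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))

# Metric path is a hypothetical example, not an actual tendrl schema.
line = format_metric("tendrl.ceph.pool.rbd.bytes_used", 100000000, 1481846400)
print(line)
```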

@shtripat shtripat self-assigned this Dec 19, 2016
@shtripat (Member)

Adding a snippet from the discussion on ceph-devel below (I was not able to get an archive link):

> Hi Team,
>
> Our team is currently working on a project named "tendrl" [1][2].
> Tendrl is a management platform for software-defined storage systems
> like Ceph, Gluster etc.
>
> As part of tendrl we are integrating with collectd to collect 
> performance data and we maintain the time series data in graphite.
>
> I have a question at this juncture regarding pool utilization data.
> Our current thinking is to use the output of the command "ceph df",
> parse it to figure out pool utilization data, and push it to graphite
> using collectd.
> The question here is: what is/would be the performance impact of
> running the "ceph df" command on the Ceph nodes? I feel we should be
> running this command only on the MON nodes.
>

Correct, that data comes from the MONs and is not that heavy.

> Wanted to verify with the team here whether this thought process is in
> the right direction and, if so, what the ideal frequency would be for
> running the "ceph df" command from collectd.
>

Running the command means forking a process every time and also going through the whole cephx authentication and client <-> MON session setup.

> This is just from our point of view and we are open to any other 
> foolproof solution (if any).

The best approach would be to keep an open connection to a MON and run the 'df' command directly on the MONs in a loop.

I wrote something like that in Python a while ago for 'ceph status': https://gist.github.com/wido/ac53ae01d661dd57f4a8

    cmd = {"prefix": "status", "format": "json"}

If you change that to:

    cmd = {"prefix": "df", "format": "json"}

you ask the MON for 'df' and get back JSON. Run that in a loop where you sleep for 1 or 5 seconds and you should have near real-time information.
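Putting the suggestion together, a sketch of that polling loop using the python-rados bindings (the same pattern as the linked Gist) could look like this. The `conffile` path and the 5-second interval are assumptions; `Rados.mon_command()` returns a `(ret, outbuf, outs)` tuple:

```python
import json
import time

def df_command():
    """JSON-encoded mon command asking the MON for 'df' output."""
    return json.dumps({"prefix": "df", "format": "json"})

def poll_df(interval=5):
    """Sketch of the polling loop; requires python-rados and a reachable cluster."""
    import rados  # python-rados bindings, shipped with Ceph
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")  # assumed config path
    cluster.connect()
    try:
        while True:
            ret, outbuf, outs = cluster.mon_command(df_command(), b"")
            if ret == 0:
                df = json.loads(outbuf)
                # ...push pool stats from df["pools"] to the time series DB...
            time.sleep(interval)
    finally:
        cluster.shutdown()
```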

@shtripat (Member)

This is being taken care of as part of #93.
