Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr/influx: Various time fixes #20494

Merged
merged 3 commits into from Mar 27, 2018
Merged

mgr/influx: Various time fixes #20494

merged 3 commits into from Mar 27, 2018

Conversation

wido
Copy link
Member

@wido wido commented Feb 20, 2018

This PR fixes three things:

  • Make the default interval 30 seconds
  • Make sure all values have the same timestamp and reduce syscalls for fetching the current time
  • Set a retention period by default in InfluxDB

The default interval of 5 seconds is a nice to have, but it breaks large systems. Collecting and sending all this information on a 2000 OSD cluster is killing for the Manager. We shouldn't set that by default.

In addition a optimization here is to only fetch the time once, store it in a variable and us it in that run. Otherwise values have different times, but we also perform a lot of syscalls for fetching the time.

The last fix is to set a default retention of 8 weeks in InfluxDB. Users can change this however they like, it will not be modified afterwards, but it's to prevent users setting this up and having a DB grow for ever and breaking at some point in the future.

A interval of 5 seconds (previous default) provides a high
resolution, but also a very large number of entries in the Influx
database.

By setting the default 30 seconds the amount of entries is reduced to
1/6th of that and thus saves storage and other resources.

If users want they can change the interval to anything they like.

Signed-off-by: Wido den Hollander <wido@42on.com>
Without a retention policy it is up to the administrator to set
a retention policy on the data.

This can easily be forgotten resulting in a ever growing InfluxDB
database.

Users can change the retention policy if they want to afterwards,
this is just the default policy set by the InfluxDB module when
it initially creates the database.

Signed-off-by: Wido den Hollander <wido@42on.com>
By fetching the current time once and storing it into a variable
we save a lot of system calls. On large clusters this can be a lot
of system calls.

In addition we also make sure that all points gathered in the same
loop/run have exactly the same timestamp.

Otherwise there will be a difference in time between the items in
InfluxDB which then causes problems when creating graphs with for
example Grafana.

Signed-off-by: Wido den Hollander <wido@42on.com>
@wido wido added the mgr label Feb 20, 2018
@wido wido requested a review from jcsp February 20, 2018 08:24
@jcsp
Copy link
Contributor

jcsp commented Feb 20, 2018

This all looks fine, although I'm a bit concerned that this was painfully slow at 5s intervals on a 2000 OSD cluster -- suggests that the code to gather and send the report was taking multiple seconds to execute.

@wido
Copy link
Member Author

wido commented Feb 20, 2018

@jcsp Well, Influx also has to accept the data and the JSON needed to be send.

The thing is, I don't have this cluster at my disposal to test and debug everything on it, it's in production.

However, having a 30 second interval means you have a 6th of the data compared to the 5s interval.

People can still use a 5s interval, but I don't think it's wise to make it the default

@wido
Copy link
Member Author

wido commented Mar 26, 2018

@jcsp Can you take a look at this one again?

If OK I want to create a backport PR just like I did with the Zabbix module in #20781 to make sure that Luminous has the latest fixes for the Influx en Zabbix module.

@jcsp jcsp merged commit c7049c1 into ceph:master Mar 27, 2018
@jcsp
Copy link
Contributor

jcsp commented Mar 27, 2018

(tested this locally)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants