RRD Cleanup: centralize rrd_exists check, utilize data_update() and rrd_name() #3800

Merged
merged 28 commits into from Aug 10, 2016

Projects

None yet

5 participants

@murrant
Contributor
murrant commented Jul 8, 2016 edited

Remove is_file(rrd) checks from poller and discovery.
Use data_update instead of rrd_update/rrd_create and influx_update
Centralize rrd file check so we can check against a remote rrdcached server too
Use rrd_name() to generate rrd file names
Allow read-only rrdtool commands when norrd is set
We should be able to do distributed polling without NFS with this patch, but I haven't tested it.

Tested on a couple installs for me.

murrant added some commits Jul 7, 2016
@murrant murrant Remove nfs requirement for distributed poller (with rrdtool 1.5+)
Use data_update instead of rrd_update/rrd_create and influx_update
Centralize rrd file check so we can check against a remote rrdcached server too
Use rrd_name() to generate rrd file names
57650e4
@murrant murrant Merge remote-tracking branch 'upstream/master' into rrd-create-remote
a1ab190
@murrant murrant Fix scrut issues, pass by reference
c18c01c
@murrant murrant Fix rrdtool_check_rrd_exists()
c5c5469
@murrant murrant removed the Blocker label Jul 8, 2016
@murrant murrant Eradicate is file checks for our rrds (leave them for nfsen and colle…
…ctd)
c01d8b7
@murrant murrant added the Blocker label Jul 8, 2016
@murrant murrant Fix sensors rrd name issue
Use last instead of first, as last won't flush rrdcached
Only stream_select if timeout > 0
Attempt to fix mysql
01cfb96
@murrant murrant removed the Blocker label Jul 9, 2016
@laf
Member
laf commented Jul 9, 2016

I'll throw this on at home and test, once we think we've ironed out the big flaws I'm happy to throw this onto works install.

I've been waiting to setup a new remote poller in our office so it makes sense to give it a whirl on that - in fact I can test this just on that one poller to start with.

@laf
Member
laf commented Jul 11, 2016

No issues at home here.

Will move it to work and see how that goes.

@laf
Member
laf commented Jul 11, 2016

https://scrutinizer-ci.com/g/librenms/librenms/inspections/abae7c65-126d-43e1-ba8c-1d74f11c3461/issues/files/includes/rrdtool.inc.php?status=new&selectedLabels%5B0%5D=9&orderField=path&order=asc&honorSelectedPaths=0

Only thing labelled as a bug.

@murrant
Contributor
murrant commented Jul 12, 2016

I think that is a parsing bug, as it isn't expected to pass variables that way since they will be out of scope. But we don't care about these variables and only pass them because they are required. I could break them out into separate lines, but that would be much messier and increase the scope of the variables.

@murrant
Contributor
murrant commented Jul 15, 2016

Any issues with this? I haven't found any more bugs personally. Need to check any and all graphs you can find if you want to test.

@murrant murrant Fix slow poller debug (waiting for output)
More robust rrdtool_check_rrd_exists() check
Line return after outputting the rrd command for easier to read debug output
Updated some phpdoc comments in rrdtool.inc.php
dd9e19d
@murrant murrant Fix extra data for TCP stats.
969a08c
@murrant murrant Fix missing data fonetstats-snmp
7c3ff52
@murrant murrant Check for file missing string anywhere in stdout or stderr
0a8acc7
@murrant murrant changed the title from Remove nfs requirement for distributed poller (with rrdtool 1.5+) to RRD Cleanup: centralize rrd_exists check, utilize data_update() and rrd_name() Jul 18, 2016
@murrant murrant Merge remote-tracking branch 'upstream/master' into rrd-create-remote
1f5cfe2
murrant added some commits Jul 20, 2016
@murrant murrant Doc updates
d7a7c5e
@murrant murrant Sometimes the output streams already contain output, so we don't get …
…the output we expect.

Clear those output buffers, but only when we care about it.
c43f533
@paulgear
Member

@murrant, I haven't had a chance to try this yet, but just wanted to say thanks for doing it. It was always my intent to go through bit by bit and fix up the code to consistently use data_update().

@murrant murrant referenced this pull request Jul 25, 2016
Closed

app: os-updates #3876

@murrant murrant Don't use $config['rrdcached_dir'] anymore, always use relative direc…
…tories when using rrdcached.

Wait for output on every command, this prevents input left in the buffer from messing up subsequent commands.
It doesn't seem like there is a penalty, but I'd have to do more testing.
e3772e2
@murrant murrant More compact debug output.
RRDC_USER is not used in the systemd unit file (this isn't how you set an environment variable), so remove it.
8d950a4
@murrant murrant Update nfs-stats.inc.php to use data_update
filter tags in ports.inc.php, and update the phpdoc for rrd_array_filter()
76d5311
@murrant murrant Correct missing/extra data issues
netstats-icmp missing data
netstats-tcp extra data
2f218e3
@murrant murrant Merge remote-tracking branch 'upstream/master' into rrd-create-remote
2c98fd7
@murrant murrant Merge remote-tracking branch 'upstream/master' into rrd-create-remote
97daf3d
@murrant murrant update nfs-v3, minor cleanup nfs-stats to match.
d4670d9
murrant added some commits Aug 3, 2016
@murrant murrant Update the rrdtool timeout to 5 seconds.
This should be adequate for most deployments.
ca89d74
@murrant murrant Try to flush the output stream before we check it.
668c081
@murrant
Contributor
murrant commented Aug 4, 2016 edited

Still having issues with the output from rrdtool. Here are the options @librenms/reviewers

  1. Set the timeout to wait for output command to something reasonably high 10s -30s range for every command sent. If rrdcached is unreachable, we will wait this amount of time each call.
  2. Send create for every update. Takes about the same time as does to check if it exists, but rrdcached creates a log entry every time it receives an create command. (we send -O to prevent overwriting files)
  3. Change the way rrdtool is called so we create a new process for every command

Benchmarks (to remote rrdcached server through a vpn):

rrdlast 50 = 0.00040197372436523
rrdcreate 50 = 0.00034189224243164
rrdupdate 50 = 0.00047183036804199
rrdlast timeout30 50 = 0.016049861907959
rrdcreate timeout30 50 = 3.2816460132599
rrdupdate timeout30 50 = 3.1834480762482
exec rrdlast 50 = 3.1194610595703
exec rrdcreate 50 = 3.0942099094391
exec rrdcreate nowait 50 = 0.36376404762268
exec rrdupdate 50 = 2.985221862793

Mostly it's fine but sometimes this happens:

  1. call update1
  2. call update2
  3. receive output from update1
  4. receive output from update2
@laf
Member
laf commented Aug 5, 2016

As per IRC, 2) Seems to be the best option.

murrant added some commits Aug 7, 2016
@murrant murrant Always send rrd create 837e360
@murrant murrant Merge remote-tracking branch 'upstream/master' into rrd-create-remote
ead28f6
@murrant murrant Merge remote-tracking branch 'upstream/master' into rrd-create-remote
68303a9
@murrant murrant Fix php lint error
aa859fe
@murrant murrant One more lint error
1182cb3
@laf
Member
laf commented Aug 9, 2016

Tagged to merge in 24 hours.

@librenms/reviewers speak up now.

@murrant murrant rrdtool.inc.php: Remove unused functions
Always force -O with create
Mark rrdtool() as internal
A little PSR-2 and whitespace cleanup
78723db
@laf laf merged commit e09f5c5 into librenms:master Aug 10, 2016

2 checks passed

Auto-Deploy Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@murrant murrant deleted the murrant:rrd-create-remote branch Aug 12, 2016
@tuxis-ie
Contributor

This broke the Proxmox and Ceph plugins (again).

@tuxis-ie tuxis-ie referenced this pull request Aug 12, 2016
Closed

Proxmox and Ceph plugins are broken #4036

0 of 4 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment