F5 LTM - Take 2 #5205

Merged
merged 11 commits into from Jan 19, 2017

@adaniels21487
Contributor

This is the second attempt at the F5 LTM module.
It has been modified so that the poller module fetches only the necessary OIDs, rather than rounding up to a common point in the tree.

It works fine on my installations, but these are too small to exhibit the problem seen in the first attempt. I would appreciate some testing on larger LTM installations.

Thanks,
Aaron

adaniels21487 added some commits Jun 29, 2016
@adaniels21487 adaniels21487 F5 LTM.
This module performs discovery and polling of LTM objects on F5 BigIP devices.

It contains the following features:
- Discovery of Virtual servers, Pools and Pool Members.
- Collection and Graphing of the following metrics across Virtual Servers and Pool Members:
  - Bytes In/Out, Packets In/Out, Connections
- Pagination and Searching of all tables via bootgrid.
- Alerting on the following conditions:
  - Virtual Server Down - Critical (2)
  - Pool does not have the minimum required members - Critical (2)
  - Pool Member down - warning (1)
b93fda4
@adaniels21487 adaniels21487 - make bootgrid search case insensitive
- Bits vs Bytes on Graphs
77d2d86
@adaniels21487 adaniels21487 - Scrutiniser fixes
- Laf's comments
7d9d226
@adaniels21487 adaniels21487 - Merge Master 3c40fc0
@adaniels21487 adaniels21487 - Laf's comments
4c16a98
@adaniels21487 adaniels21487 - Don't poll OIDs we don't need.
111249e
@adaniels21487 adaniels21487 referenced this pull request Dec 21, 2016
Merged

F5 LTM #5149

2 of 2 tasks complete
@laf
laf approved these changes Dec 21, 2016

This adds 10 seconds to polling for someone with 2000 vips which seems an acceptable trade off to me so I'm happy with this.

@laf
Member
laf commented Dec 21, 2016 edited

@librenms/reviewers will merge tomorrow unless anyone objects

@laf
Member
laf commented Dec 22, 2016

Blocking as issues reported in #5149 (comment)

@rucarrol
Contributor
rucarrol commented Jan 9, 2017

Hey!

I've been trialling this PR (LTM Virtual Servers (346) | LTM Pools (332)) internally, and it seems to work beautifully!

I see that since we have quite a few VIPs, we run out of colours in the multi-graph.

I was thinking, is it possible to sort the data in the graph from most used to least used? This would really help with running out of colours/70% of our graphs being red :)

/Ruairi

@adaniels21487
Contributor

Hi @rucarrol
Thanks for the feedback.

Have you had any issues?
@awlx has had intermittent "zend_mm_heap corrupted" errors, which may be related to PHP bugs. Have you seen this (#5149)?

Re: Running out of colors. If you add /debug=yes to the graph URL, it will tell you whether you have run out of colors (with that many VIPs/Pools you will have). Please confirm this and I will add some more colors to the array.

I also have many unused VIPs/Pools and am looking at excluding these from graphs in a future PR. The problem is that there is no run-time way (that I can find) of excluding them. The best I can do is pre-calculate which ones are used and build the graph with those. The drawback of this approach is that if you select a different time period, those items may or may not have any utilization in it.
I believe the v2 web UI may be able to handle this better.
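
As a rough illustration of the pre-calculation idea described above (and of why the result depends on the selected time period), here is a minimal sketch in Python rather than the module's PHP. The function name `active_series` and the `(timestamp, value)` sample layout are assumptions made for this example only; they are not part of LibreNMS or RRDtool.

```python
def active_series(samples_by_series, start, end):
    """Return the names of series with non-zero traffic in [start, end], busiest first.

    samples_by_series maps a series name (e.g. a virtual server) to a list of
    (timestamp, value) pairs. This layout is hypothetical, chosen for the sketch.
    """
    totals = {}
    for name, samples in samples_by_series.items():
        # Only count samples that fall inside the requested time window.
        total = sum(value for ts, value in samples if start <= ts <= end)
        if total > 0:
            totals[name] = total
    # Sort by total utilisation, most used first.
    return sorted(totals, key=totals.get, reverse=True)


data = {
    "vs_a": [(1, 0), (2, 5)],
    "vs_b": [(1, 0), (2, 0)],
    "vs_c": [(1, 9), (2, 1)],
}
print(active_series(data, 1, 2))
print(active_series(data, 1, 1))
```

The second call uses a narrower window and drops `vs_a`, which is the drawback mentioned above: a list computed for one time period may not match another.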

@awlx
awlx commented Jan 10, 2017

I did the Select mentioned in #5149.

mysql> SELECT type as name, count(*) as count FROM component WHERE device_id = 6 GROUP BY type;
+-------------------+-------+
| name              | count |
+-------------------+-------+
| f5-ltm-pool       |   759 |
| f5-ltm-poolmember |  1270 |
| f5-ltm-vs         |  1247 |
+-------------------+-------+
3 rows in set (0.01 sec)

@rucarrol
Contributor

Hey @adaniels21487

Adding /debug=Yes/ causes the graph to stop rendering (!). While the graphs otherwise render correctly, the fundamental issue is with the F5s here: too many VIPs.

I don't think it's a major issue tbh, just wondering if it's possible.

I've deployed this PR to my entire LB fleet now, will report back in ~24hrs and see how it is working.

/Ruairi

@awlx
awlx commented Jan 10, 2017

I just upgraded to PHP 5.6. Let's see if that helps.

@rucarrol
Contributor

Since I don't have perms to push to this repo, here is a small correction I found for table sorting:

index 4671dc3..d44f10e 100644
--- a/html/pages/device/loadbalancer/ltm_pool_all.inc.php
+++ b/html/pages/device/loadbalancer/ltm_pool_all.inc.php
@@ -17,8 +17,8 @@
     <tr>
         <th data-column-id="poolid" data-type="numeric" data-visible="false">poolid</th>
         <th data-column-id="name">Name</th>
-        <th data-column-id="minup">Minimum Members</th>
-        <th data-column-id="currentup">Current Members</th>
+        <th data-column-id="minup" data-type="numeric">Minimum Members</th>
+        <th data-column-id="currentup" data-type="numeric">Current Members</th>
         <th data-column-id="status" data-visible="false">Status</th>
         <th data-column-id="message">Status</th>
     </tr>

/Ruairi

adaniels21487 added some commits Jan 9, 2017
@adaniels21487 adaniels21487 - Fixed bug in polling minupstatus
- Fixed bug in datatable (Thanks @rucarrol)
- All VS graphs will never run out of colours
- Moved SQL for upstream changes
2a8f436
@adaniels21487 adaniels21487 Merged master
Conflicts:
	html/pages/device.inc.php
	includes/definitions.inc.php
6929386
@adaniels21487 adaniels21487 - Add loadbalancing poller and discovery modules to F5 yaml definitions
- Fixed devices page
84d9624
@awlx
awlx commented Jan 11, 2017

With the new PHP version (PHP 5.6.14 (cli) (built: Oct 2 2015 08:48:49)) it seems to be stable. Another good thing would be to add the VIPs to the IPv4 table of LibreNMS to make them searchable.

@awlx
awlx commented Jan 11, 2017

And for sure I run out of colors.

http://i.imgur.com/VZO5elN.png

@awlx
awlx commented Jan 11, 2017

And the pool members are not displayed in the Pool overview.
http://i.imgur.com/48RNnTQ.png

@adaniels21487 adaniels21487 - We don't use category anymore, we use type.
0faf090
@adaniels21487
Contributor

Hi @awlx
Can you please let me know the graph type (from the query string, e.g. type=device_bigip_ltm_allvs_conns) and send a screenshot with the legend enabled (it's OK to blank out names if they are sensitive)?
Pool member issue should be fixed now.

@awlx
awlx commented Jan 12, 2017

type=device_bigip_ltm_allvs_conns
type=device_bigip_ltm_allvs_bytesin
type=device_bigip_ltm_allvs_bytesout
type=device_bigip_ltm_allvs_pktsin
type=device_bigip_ltm_allvs_pktsout

Those graphs are affected.

Screenshot of the whole graph is not so easy ;). I don't have a screen which is able to display the whole legend. And if I save the graph as picture, I cannot pixelate it because all programs crash (graph size 355MB uncompressed).

Maybe this still helps:
http://i.imgur.com/Sx2AGAR.png

@awlx
awlx commented Jan 12, 2017 edited

And I can confirm pool-members work now.

@awlx
awlx commented Jan 12, 2017

Btw is the format of the IP address supposed to look like this? Or is it a bug? :)

http://i.imgur.com/BvoUuzv.png

@adaniels21487
Contributor
adaniels21487 commented Jan 12, 2017 edited

@awlx
Re Graphs: The code loops through an array of color codes. The change made the other day was to go back to the start when the array ended. Mine looks like: http://i.imgur.com/WoSFveD.png
At line 35 of html/includes/graphs/device/bigip_ltm_allvs_conns.inc.php you should have a block of code that looks like this, please confirm.

    // Grab a colour from the array.
    if (isset($colours[$colcount])) {
        $colour = $colours[$colcount];
    } else {
        $colcount = 0;
        $colour = $colours[$colcount];
    }
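
The wrap-around described above can also be expressed with a modulo on the series index. The following is a minimal sketch in Python rather than the module's PHP; `assign_colours` and the three-entry palette are hypothetical names and values used only for illustration, not the array shipped in the graph code.

```python
def assign_colours(series_names, palette):
    """Pair each series with a palette colour, restarting at the first colour
    once the palette is exhausted."""
    if not palette:
        raise ValueError("palette must not be empty")
    # i % len(palette) wraps the index back to 0 after the last colour.
    return [(name, palette[i % len(palette)]) for i, name in enumerate(series_names)]


# Hypothetical palette and series names, for illustration only.
palette = ["FF0000", "00FF00", "0000FF"]
for name, colour in assign_colours(["vs1", "vs2", "vs3", "vs4"], palette):
    print(name, colour)
```

With more series than colours, the fourth series reuses the first colour instead of falling through to an undefined entry.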

Re IP Address: We just pull from SNMP, so I assume this is the format your F5 is returning the data in. Please snmpwalk 1.3.6.1.4.1.3375.2.2.10.1.2.1.3. It will return a hex value, which we convert to the IP.
Mine looks like: http://i.imgur.com/nlLiEJA.png
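
For reference, a hex-to-IP conversion of this kind can be sketched as follows. This is Python rather than the module's PHP, and it assumes the value arrives as space- or colon-separated hex octets; `hex_string_to_ip` is a hypothetical helper, not the function the module uses.

```python
import ipaddress


def hex_string_to_ip(hex_value):
    """Convert a hex string such as '0A 01 02 03' into a printable IP address.

    4 bytes are treated as IPv4 and 16 bytes as IPv6; any other length
    raises ValueError.
    """
    cleaned = hex_value.replace(" ", "").replace(":", "")
    raw = bytes.fromhex(cleaned)
    # ipaddress.ip_address accepts packed bytes of length 4 or 16.
    return str(ipaddress.ip_address(raw))


print(hex_string_to_ip("0A 01 02 03"))
```

If the device returns a value of some other length, the helper raises an error rather than guessing, which would make an unexpected address format easier to spot.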

Aaron

@adaniels21487
Contributor

@rucarrol are you seeing the same color issues with the allvs graphs as @awlx?

@rucarrol
Contributor

Hey,

So after a certain amount of colours in the graph, I end up with this: http://imgur.com/a/md4Jo

It repeats down for another few hundred servers.

@awlx
awlx commented Jan 13, 2017 edited

Output of snmpwalk http://i.imgur.com/Y0BzMWS.png

I have the code you mentioned.

And I can confirm it looks like the graph of @rucarrol .

@adaniels21487
Contributor

@awlx @rucarrol
Hmm, I don't know what's going on.
Is one of you available on IRC to do some interactive troubleshooting?
Can you send me the rrd command?

@rucarrol
Contributor

Hey @adaniels21487 ,

We're both in the EU timezones, is that ok? I'll ping you in the LibreNMS channel.

/Ruairi

@adaniels21487
Contributor

So I have taken a look at the RRD command from @awlx and it all looks good.
I have PM'd him the img and the colours are rotating through the array when rendered on my system.

Perhaps it is an RRD issue. I am running 1.4.8:

adaniels@dev:~$ rrdtool -v
RRDtool 1.4.8  Copyright 1997-2013 by Tobias Oetiker <tobi@oetiker.ch>
               Compiled Nov 16 2014 14:30:06

I can't find anything relevant in the RRDtool github. What version of rrdtool do you run?

@awlx
awlx commented Jan 17, 2017

RRDtool 1.3.8 Copyright 1997-2009 by Tobias Oetiker tobi@oetiker.ch
Compiled Apr 3 2014 13:07:03

But somehow it has been working since yesterday. I did delete all the rrd files, though (because of disk space issues).

@rucarrol
Contributor
# rrdtool --version
RRDtool 1.5.5  Copyright by Tobias Oetiker <tobi@oetiker.ch>

I'm on a newer version, so I'm not 100% sure it's related to rrdtool.

I did a:

find /path/to/rra/directory -name "*f5-ltm*" -exec rm {} \;

To see if this will change anything in the same way it did for @awlx, however I still see the same cycle of reds :(

I'll poke about a bit more and see if I can spot anything more obvious.

@rucarrol
Contributor

Ok,

After a bit of looking (thanks, @awlx), I noticed I was not on the right branch. Corrected, and it looks good now!

Thank you so much!

@adaniels21487 adaniels21487 - Moved SQL for upstream changes.
dba1460
@scrutinizer-notifier

The inspection completed: 1 updated code elements

@adaniels21487
Contributor

Awesome, thanks team.
If there are no further issues, I think this is ready for a merge.

@laf
Member
laf commented Jan 17, 2017

Brilliant. Thanks everyone. Will take a look over the next day or so again just to be sure.

@laf
laf approved these changes Jan 19, 2017
@laf
Member
laf commented Jan 19, 2017

image

@laf laf merged commit de45d8d into librenms:master Jan 19, 2017

2 checks passed

Auto-Deploy Build finished.
continuous-integration/travis-ci/pr The Travis CI build passed
@adaniels21487 adaniels21487 deleted the adaniels21487:issue-4644 branch Jan 20, 2017
@adaniels21487 adaniels21487 restored the adaniels21487:issue-4644 branch Feb 6, 2017