Installing Sflow Support
IXP Manager can use sflow data to build peer-to-peer traffic graphs and traffic aggregate analysis of bytes/packets, split by VLAN and protocol (IPv4 / IPv6), for both individual IXP peering ports, and for entire VLANs.
The IXP Manager sflow peer-to-peer graphing system depends on the MAC address database system to detect point-to-point traffic. Before proceeding further, ensure this mechanism works: when you click on IXP Admin Actions | MAC Addresses in the admin portal, you should see a MAC address associated with each port.
On the IXP switches, all edge ports must be connected to sflow-capable switches, and all of these switches must support sflow accounting in the same direction. In other words, either all must support ingress accounting or all must support egress accounting.
Most switches which support sflow will support ingress accounting, because this is what's required in RFC 3176. Some switches, notably Dell Force10, only support egress sflow. If you use these on your IXP alongside other switches which only support ingress sflow, then the sflow graphs will show twice the traffic in one direction for the p2p graphs and zero traffic in the other direction. There is no way for IXP Manager to work around this problem.
If not all of the IXP edge ports are sflow capable, then sflow traffic data will be lost for these ports. This means that some point-to-point traffic graphs will show up with zero traffic, and that the sflow aggregate graphs will be wrong.
Note that of Cisco's entire product range, only the Nexus 3000 and Nexus 3100 series switches support sflow. IXP Manager does not support netflow and has no current plans to support it in the future, as most Cisco L2 Netflow implementations do not export mac address information and consequently are too incomplete to provide workable peer-to-peer statistics. Furthermore, the sflow support on the Cisco Nexus 3k range is crippled due to the software only allowing ingress + egress sflow to be configured. Functional accounting requires ingress-only or egress-only sflow to be configured on a per-port basis: ingress + egress will cause double-counting.
Similar to netflow, sflow needs to be configured using an accounting perimeter. This means that ingress sflow accounting should be enabled on all edge ports, but on none of the core ports. If it's enabled on any of the core ports, traffic will be double-counted which leads to inaccuracy.
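To illustrate the accounting perimeter principle, here is a sketch in Brocade FastIron-style syntax. The command names, collector address (192.0.2.10) and sampling ratio are assumptions for illustration; check your platform's documentation for the exact syntax.

```
! sflow collector and sampling ratio (global)
sflow destination 192.0.2.10 6343
sflow sample 2048
sflow enable
!
! edge / peering port: ingress sflow accounting on
interface ethernet 1/1
 sflow forwarding
!
! core / inter-switch ports get no "sflow forwarding" statement,
! so traffic crossing the core is not double-counted
```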
Each switch on the network sends sampled sflow packets to an sflow collector. These packets are processed by the "sflowtool" command, which converts them into an easily-parseable ASCII format. IXP Manager provides a perl script which takes the output of sflowtool, correlates it against the IXP database, and uses this to build up a matrix of traffic flows, which are then exported to RRD format.
The RRD files are stored on disk and can be accessed either by home-grown code or else by using the sample sflow grapher included in IXP Manager.
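Conceptually, the collection pipeline is a single pipe. The handler path below is an assumption (adjust it to wherever IXP Manager's sflow-to-rrd-handler script is installed); the sflowtool options shown match the sample ixpmanager.conf later in this document.

```shell
# sflowtool listens for sflow datagrams on UDP 6343 and emits line-oriented
# ASCII records; the perl handler correlates them against the IXP database
# and writes the traffic matrix to RRD files via rrdcached.
/usr/bin/sflowtool -4 -p 6343 -l | /path/to/sflow-to-rrd-handler
```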
Many vendors support sflow, but some do not. There is a partial list on the sflow web site. At this time, IXP Manager does not support netflow data export for traffic analysis.
Sflow uses data sampling. This means that the results it produces are estimated projections, but on large data sets, these projections tend to be statistically accurate. Each switch needs to be configured to have a particular sampling rate. The exact rate chosen will depend on the traffic levels on the switch, how powerful the switch management plane CPU is, and how much bandwidth is available for the switch management.
On a small setup with relatively low levels of traffic (e.g. 100kpps), it is fine to leave the sampling ratio very low (e.g. 1:256). If the switch or the entire network is handling very large quantities of traffic, the ratio should be low enough that IXP ports with small amounts of traffic still get good quality graphs, but high enough that the switch management CPU isn't trashed and that packets are not dropped on the management ethernet port.
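A rough back-of-envelope calculation helps when picking a ratio. All figures below are assumptions for illustration:

```shell
# Estimate the sample export load a candidate ratio puts on the switch CPU.
PEAK_PPS=3000000      # assumed aggregate peak packets/sec across edge ports
SAMPLE_RATIO=2048     # candidate sampling ratio (i.e. 1:2048)

# Samples/sec the switch management plane must generate and export:
SAMPLES_PER_SEC=$((PEAK_PPS / SAMPLE_RATIO))
echo "${SAMPLES_PER_SEC} samples/sec"    # prints: 1464 samples/sec
```

If the result approaches the switch's sflow export limit, raise the ratio; if it is far below, there is headroom to lower the ratio and improve graph quality for low-traffic ports.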
Some switches have automatic rate-limiting built in for sflow data export.
Brocade TurboIron 24X
By default a TIX24X will export 100 sflow records per second. This can be changed using the following command:
```
SSH@Switch# dm device-command 2762233
SSH@Switch# tor modreg CPUPKTMAXBUCKETCONFIG(3) PKT_MAX_REFRESH=0xHHHH
```
... where HHHH is the hex representation of the number of sflow records per second. INEX has done some very primitive usage profiling which suggests that going above ~3000 sflow records per second will trash the management CPU too hard, so we use PKT_MAX_REFRESH=0x0BB8. Note that this command is not reboot persistent, and any time a TIX24X is rebooted, the command needs to be re-entered manually.
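The hex register value can be computed with printf. For example, for INEX's figure of 3000 records per second:

```shell
# Convert a records-per-second figure to the hex value for PKT_MAX_REFRESH.
RECORDS_PER_SEC=3000
printf 'PKT_MAX_REFRESH=0x%04X\n' "$RECORDS_PER_SEC"
# prints: PKT_MAX_REFRESH=0x0BB8
```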
Each IXP edge port will have 4 separate RRD files for recording traffic to each other participant on the same VLAN on the IXP fabric: ipv4 bytes, ipv6 bytes, ipv4 packets and ipv6 packets. This means that the number of RRD files grows very quickly as the number of IXP participants increases. Roughly speaking, for every N participants at the IXP, there will be about 4*N^2 RRD files. As this number creates extremely high I/O requirements on even medium sized exchanges, IXP Manager requires that rrdcached is used.
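To get a feel for the numbers, a quick check of the 4*N^2 estimate (the participant count here is hypothetical):

```shell
# Approximate RRD file count for an IXP with N participants:
# 4 files (ipv4/ipv6 x bytes/packets) per pair of participants.
N=100
echo "$((4 * N * N)) RRD files"    # prints: 40000 RRD files
```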
As sflow can put a reasonably high load on a server - via sflow data bandwidth, collector CPU requirements, disk I/O and disk space for RRD files - it may be a good idea to have a separate server to handle the IXP's sflow system.
The sflow server will need:
- a web server with php5 support
- a copy of sflowtool
- rrdtool + rrdcached
- perl 5.10.2 or later
- the following perl modules: DBI, Net::IP, Config::General, RRDs
- mrtg (for Net_SNMP_util)
- a filesystem partition with enough disk space, mounted noatime, nodiratime. You may also want to consider disabling filesystem journaling.
- Install the open source packages mentioned above. On FreeBSD, these can be bulk installed using the following command.
```
pkg install apache22 sflowtool git devel/subversion databases/rrdtool php5-pdo_mysql php5-session mrtg p5-Daemon-Control p5-Config-General p5-NetAddr-IP p5-DBD-mysql
```
- Daemon::Control is available on ubuntu as libdaemon-control-perl.
- Install IXP Manager using the instructions provided in the IXP Manager wiki, including the third party libraries. You can ignore Doctrine, as it's not required by the sflow module.
- Install the IXPManager perl library (`perl Makefile.PL; make install`)
- configure and start `rrdcached`. We recommend using journaled mode with the `-P FLUSH,UPDATE -m 0666 -l unix:/var/run/rrdcached.sock` options enabled. Note that these options allow uncontrolled write access to the RRD files from anyone on the sflow machine.
- on FreeBSD it is a good idea to set `net.inet.udp.blackhole=1` in /etc/sysctl.conf, to stop the kernel from replying to unknown sflow packets with an ICMP unreachable reply.
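Putting the recommended options together, an rrdcached invocation might look like the following. The journal and base directories are assumptions; adjust them to your filesystem layout.

```shell
# Example rrdcached invocation -- paths are assumptions, adjust to taste.
# -j: journal directory (enables journaled mode)
# -b: base directory under which the RRD files live
# -P: permitted commands; -m/-l: socket permissions and location
rrdcached -j /var/journal/rrdcached -b /data/ixpmatrix \
    -P FLUSH,UPDATE -m 0666 -l unix:/var/run/rrdcached.sock
```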
The `application/configs/application.ini` configuration file is used by `sflow-graph.php`. There are three variables which can be set:
`sflow.rootdir`: the directory where all the sflow .rrd files will be stored. Must be the same as the `sflow_rrddir` setting in `ixpmanager.conf`.
`sflow.rrd.rrdcached.sock`: the location of the rrdcached socket. Usually set to `unix:/var/run/rrdcached.sock`.
`sflow.rrd.rrdtool`: the location of the rrdtool binary.
Sample configuration might look like this:

```
; sflow parameters (peer to peer graphs)
sflow.enabled = true
sflow.rootdir = /data/ixpmatrix
sflow.rrd.rrdtool = /usr/bin/rrdtool
sflow.rrd.rrdcached.sock = unix:/var/run/rrdcached.sock
```
sflow-graph.php is used to serve P2P graphs to IXP Manager on demand. For performance reasons it has no built-in authentication, so access must be restricted to the main IXP Manager web server's IP address, leaving authentication to IXP Manager itself. This can be done by placing IP address restrictions in the sflow web server location configuration, or by configuring a virtual HTTP server listening on a different port or even a different host and using firewalls or packet filters to block access from untrusted third parties.
For example, if the `sflow-graph.php` servlet is accessible via the IP address restricted URL http://sflow.example.com/p2p/sflow-graph.php, IXP Manager can be configured to use this location by going to Infrastructures and clicking to edit your IXP at the end of the list. Then place this URL in the MRTG P2P Path field.
At some stage in the future, `ixpmanager.conf` will be replaced completely by `application.ini`. For the moment, the following sflow parameters can be set in the `<ixp>` section of `ixpmanager.conf`:
`sflow_rrdcached`: set to 0 or 1, depending on whether you want to use rrdcached or not.
`sflowtool`: the location of the sflowtool binary.
`sflowtool_opts`: command-line options to pass to sflowtool.
`sflow_rrddir`: the directory where all the sflow .rrd files will be stored. Must be the same as the `sflow.rootdir` setting in `application.ini`.
Note that the `<sql>` section of `ixpmanager.conf` will need to be configured to give the sflow collector system access to your SQL database.
An example ixpmanager.conf might look like this:
```
<sql>
dbase_type     = mysql
dbase_database = ixpmanager
dbase_username = ixpmanager_user
dbase_password = blahblah
dbase_hostname = sql.example.com
</sql>

<ixp>
sflowtool       = /usr/bin/sflowtool
sflowtool_opts  = -4 -p 6343 -l
sflow_rrdcached = 1
sflow_rrddir    = /data/ixpmatrix
</ixp>
```
Things which are important to get sflow support working properly include:
- ensuring that the MAC Address table in IXP Manager is populated correctly with all customer MAC addresses using the update-l2database.pl script
- ensuring that all switchports are set up correctly in IXP Manager, i.e. Core ports and Peering ports are configured as such
- ensuring that sflow accounting is configured on peering ports and is disabled on all core ports
- Using Arista / Cisco / Dell F10 kit with LAGs? Make sure you configure the port channel name in the Advanced Options section of the customer's virtual interface port configuration.
- if the sflow-to-rrd-handler script crashes, this may indicate that the back-end filesystem is overloaded. Installing rrdcached is a first step here. If it crashes with rrdcached enabled, then you need more disk I/O (SSDs, etc).
- if there is too much of a difference between the sflow p2p aggregate stats and the port stats from the main graphing system, the switch may be throttling sflow samples. Check the maximum sflow pps rate on the switch processor, compare it with the pps rate in the Switch Statistics graphs, and work out the switch's sflow export throughput on the basis of the sampling ratio. Increasing the sampling ratio may help, at the cost of less accurate graphs for peering sessions with low traffic levels.
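The throttling comparison described in the last point can be sketched as follows; all figures are assumptions for illustration.

```shell
# Is the switch likely throttling its sflow exports?
PORT_PPS=10000000     # aggregate pps from the Switch Statistics graphs (assumed)
SAMPLE_RATIO=2048     # configured sampling ratio (assumed)
SWITCH_LIMIT=3000     # max sflow records/sec the switch CPU will export (assumed)

# Records/sec the configured ratio would require the switch to export:
EXPECTED=$((PORT_PPS / SAMPLE_RATIO))
if [ "$EXPECTED" -gt "$SWITCH_LIMIT" ]; then
    echo "throttling likely: need ${EXPECTED} records/sec, switch caps at ${SWITCH_LIMIT}"
fi
```

If the expected rate exceeds the switch's limit, either raise the sampling ratio or accept that the sflow graphs will under-report traffic.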