T5086: Add sFlow feature based on hsflowd #1891

Merged
merged 1 commit into from
Mar 16, 2023

Conversation

sever-sever
Member

@sever-sever sever-sever commented Mar 14, 2023

Change Summary

Add sFlow feature based on hsflowd
According to user reviews, it is more stable and performs better than pmacct.
I haven't deleted the pmacct-based 'system flow-accounting sflow' yet. It could be migrated or deprecated later.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related PR

vyos/vyos-build#320

Related Task(s)

Component(s) name

sflow

Proposed changes

How to test

set system sflow agent-address 'fe80::5054:ff:fef2:cace'
set system sflow agent-interface 'eth0'
set system sflow interface 'eth0'
set system sflow interface 'eth1'
set system sflow polling '30'
set system sflow sampling-rate '100'
set system sflow server 192.168.122.1 port '6343'
set system sflow server 192.168.122.11 port '6343'

check service

vyos@r14# sudo systemctl status hsflowd
● hsflowd.service - Host sFlow
     Loaded: loaded (/lib/systemd/system/hsflowd.service; disabled; preset: enabled)
    Drop-In: /run/systemd/system/hsflowd.service.d
             └─override.conf
     Active: active (running) since Tue 2023-03-14 20:58:28 EET; 3s ago
   Main PID: 13144 (hsflowd)
      Tasks: 2 (limit: 9400)
     Memory: 716.0K
        CPU: 10ms
     CGroup: /system.slice/hsflowd.service
             └─13144 /usr/sbin/hsflowd -m 4d6f4d291ae8446f8d2b3decd9da64c7 -d -f /run/sflow/hsflowd.conf

Mar 14 20:58:28 r14 systemd[1]: Started hsflowd.service - Host sFlow.
[edit]
vyos@r14# 

config:

vyos@r14# cat /run/sflow/hsflowd.conf
# Genereated by /usr/libexec/vyos/conf_mode/system_sflow.py
# Parameters http://sflow.net/host-sflow-linux-config.php

sflow {
  polling=30
  sampling=100
  sampling.bps_ratio=0
  agentIP=192.168.122.14
  agent=eth0
  collector { ip = 192.168.122.1 udpport = 6343 }
  collector { ip = 192.168.122.11 udpport = 6343 }
  pcap { dev=eth0 }
  pcap { dev=eth1 }
}

Dump:

vyos@r14# sudo tcpdump -ntvi eth0 port 6343
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes

IP (tos 0x0, ttl 64, id 63082, offset 0, flags [DF], proto UDP (17), length 200)
    192.168.122.14.55255 > 192.168.122.11.6343: sFlowv5, IPv6 agent fe80::5054:ff:fef2:cace, agent-id 100000, seqnum 10, uptime 84849, samples 1, length 172
	counter sample (2), length 124, seqnum 3, type 0, idx 2, records 2
	    enterprise 0, Unknown (1005) length 8
		0x0000:  0000 0004 6574 6830
	    enterprise 0, Generic counter (1) length 88
	      ifindex 2, iftype 6, ifspeed 0, ifdirection 1 (full-duplex)
	      ifstatus 3, adminstatus: up, operstatus: up
	      In octets 19986, unicast pkts 200, multicast pkts 4294967295, broadcast pkts 4294967295, discards 0
	      In errors 0, unknown protos 4294967295
	      Out octets 22174, unicast pkts 105, multicast pkts 4294967295, broadcast pkts 4294967295, discards 0
	      Out errors 0, promisc mode 0
IP (tos 0x0, ttl 64, id 45823, offset 0, flags [DF], proto UDP (17), length 200)
    192.168.122.14.43119 > 192.168.122.1.6343: sFlowv5, IPv6 agent fe80::5054:ff:fef2:cace, agent-id 100000, seqnum 10, uptime 84849, samples 1, length 172
	counter sample (2), length 124, seqnum 3, type 0, idx 2, records 2
	    enterprise 0, Unknown (1005) length 8
		0x0000:  0000 0004 6574 6830
	    enterprise 0, Generic counter (1) length 88
	      ifindex 2, iftype 6, ifspeed 0, ifdirection 1 (full-duplex)
	      ifstatus 3, adminstatus: up, operstatus: up
	      In octets 19986, unicast pkts 200, multicast pkts 4294967295, broadcast pkts 4294967295, discards 0
	      In errors 0, unknown protos 4294967295
	      Out octets 22174, unicast pkts 105, multicast pkts 4294967295, broadcast pkts 4294967295, discards 0
	      Out errors 0, promisc mode 0
^C
2 packets captured
2 packets received by filter

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

@sflow

sflow commented Mar 14, 2023

Hello all,

Three comments on the CLI settings:

(1) It is probably better to set the sflow agent-address to an interface, rather than an IP address. That way it doesn't cause confusion if that address goes away. Or worse, if it is allocated to some other device in the network. The line that goes in /etc/hsflowd.conf looks like this:

agent=DEVICE

e.g. agent=eth0

Did you have some other reason for wanting to set an explicit IP address here?

(2) Since the sflow agent-address goes in the payload of the sFlow datagrams, the source-address of those packets is not relevant. So there is no need to make it a setting here. It will be whatever it would normally be to send UDP to that server. If an sFlow feed is forwarded to another server the source address will change. The server will typically ignore the source-address and look only at the agent-address.

(3) The sflow server setting should probably allow an extra argument to determine the namespace or vrf. The line that goes in hsflowd.conf looks like this:

collector { ip=IPADDRESS udpport=PORT namespace=NAMESPACE }

or:

collector { ip=IPADDRESS udpport=PORT dev=DEVICE }

e.g. collector { ip=10.0.0.30 udpport=6343 dev=eth0 }

Whether you use the namespace or the dev parameter depends on how VyOS uses namespaces in Linux. I hope that makes sense?
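
To illustrate point (2), here is a minimal sketch of reading the agent-address out of an sFlow version 5 datagram header, which is what a collector looks at instead of the UDP source address. The function name `parse_sflow5_header` and the returned dictionary keys are made up for this example; the field order follows the sFlow v5 datagram layout and matches the fields tcpdump prints (agent, agent-id, seqnum, uptime, samples).

```python
import ipaddress
import struct


def parse_sflow5_header(datagram: bytes) -> dict:
    """Decode the fixed header at the start of an sFlow v5 datagram.

    Hypothetical helper for illustration only; it does not decode samples.
    """
    version, addr_type = struct.unpack_from("!II", datagram, 0)
    if version != 5:
        raise ValueError(f"unsupported sFlow version: {version}")

    # Address type 1 is IPv4 (4 bytes), type 2 is IPv6 (16 bytes).
    if addr_type == 1:
        addr_len = 4
    elif addr_type == 2:
        addr_len = 16
    else:
        raise ValueError(f"unknown agent address type: {addr_type}")

    offset = 8
    agent = ipaddress.ip_address(datagram[offset:offset + addr_len])
    offset += addr_len

    sub_agent_id, seqnum, uptime_ms, num_samples = struct.unpack_from(
        "!IIII", datagram, offset
    )
    return {
        "agent_address": str(agent),
        "sub_agent_id": sub_agent_id,
        "seqnum": seqnum,
        "uptime_ms": uptime_ms,
        "num_samples": num_samples,
    }
```

The agent address survives forwarding through relays because it travels inside the payload, while the UDP source address is rewritten at each hop.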

Neil McKee

@sever-sever
Member Author

(1) It is probably better to set the sflow agent-address to an interface, rather than an IP address.

What happens if there are several addresses on the interface? Which one will be used?
For example, eth0 with 192.0.2.1/24 and 203.0.113.1/24, plus IPv6.

(2) Sure, the source address can be removed, as it is not used anywhere right now.

(3) We don't use namespaces for services, but we can use VRF as an option. It's a good feature if it works correctly with VRFs.
So can the service be started in a VRF? That could be a next step, but it is simple to integrate and start the service in a VRF.

@sflow

sflow commented Mar 14, 2023

(1) If there are several addresses on the interface, it will choose one of them - the one it considers to be the most likely to be globally unique. And as long as nothing changes it will choose the same one every time. This election will apply to all IP addresses on the router if you do not set agent=DEVICE, so the "set system sflow agent-address" setting can be optional.

It is also possible to set agent.cidr=CIDR to indicate a preference, or agent.cidr=!CIDR to put a thumb on the scale the other way. For example:

agent.cidr=!10.0.0.0/8

tells the election that 10.* addresses should be avoided. Or:

agent.cidr=::/0

indicates that IPv6 is preferred.

I don't know if it is necessary to expose this agent.cidr setting. I guess it could be an optional parameter:

set system sflow agent DEVICE [ CIDR ]

but I don't know if anyone would use it. If the DEVICE has a global IPv4 address then it is almost always OK to use it as the sflow-agent-address. If it only has an IPv6 address then no problem, we'll use that.

(3) The process does not have to be started in a namespace/VRF. It can send to multiple collectors in different namespaces as long as it has permission to switch namespaces and open sockets:
https://github.com/sflow/host-sflow/blob/v2.0.47-1/src/Linux/hsflowd.c#L1060-L1210

And new comment:
(4) Given that there are workable defaults for all settings, is there a CLI command to just turn sFlow on/off? Something like:

set system sflow enable
set system sflow disable

@sflow

sflow commented Mar 15, 2023

(5) For the 1:N packet sampling rate the default behavior requires explanation. hsflowd will consider the ifSpeed of each port in turn. If the ifSpeed (bits/sec) is unknown then the default given as samplingRate=N is used, but if the interface has a known ifSpeed then the samplingRate is given by the expression: ifSpeed / 1e6. So for a 1G interface it is 1000. For a 10G interface it is 10000 and so on. This works well as a default. It is based on the requirement to detect a new flow of 10% bandwidth in 1 second. (I don't know if hsflowd is picking up the ifSpeed of the interfaces on VyOS. If not, we should try to ensure that it can.)

The interface sampling rates can be overridden with settings like this:

sampling.1G=2000
sampling.10G=5000
sampling.400G=65536

Those are easy to understand, so you might want to allow them to be set via the VyOS CLI. Something like:

set system sflow sampling-rate speed 400G 65536

In practice the analysis is not particularly sensitive to the sampling rate so that should be enough flexibility for almost any real-world deployment, but if you want to allow the sampling rate to be forced to a specific value for a specific interface then that can be made possible too. Let me know.
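
The default described in (5) can be sketched as a small function. This only restates the rules given in this comment and is not hsflowd's actual code; the function name, the parameter names, and keying the overrides by speed in bits/sec are assumptions made for the example.

```python
from typing import Mapping, Optional


def default_sampling_rate(
    if_speed_bps: Optional[int],
    fallback: int,
    overrides: Optional[Mapping[int, int]] = None,
) -> int:
    """Return N for 1-in-N packet sampling on one interface.

    if_speed_bps: interface speed in bits/sec, or None if unknown.
    fallback:     the top-level default (the samplingRate=N setting).
    overrides:    hypothetical map of speed in bits/sec to N, standing in
                  for settings such as sampling.1G=2000.
    """
    # Unknown speed: use the configured default.
    if if_speed_bps is None or if_speed_bps <= 0:
        return fallback

    # An explicit per-speed setting wins over the computed value.
    if overrides and if_speed_bps in overrides:
        return overrides[if_speed_bps]

    # Known speed: ifSpeed / 1e6, so 1G gives 1000 and 10G gives 10000.
    return max(1, if_speed_bps // 1_000_000)
```

For example, `default_sampling_rate(1_000_000_000, 400)` yields 1000, and passing `{1_000_000_000: 2000}` as the overrides yields 2000 for the same interface.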

(6) It seems that the VyOS Linux kernel is compiled without the "drop_monitor" module. Is that correct? If that module can be included in the build then you could add this to hsflowd.conf:

dropmon { limit=50 start=on sw=on hw=off }

so that hsflowd will export the headers of dropped packets (along with the name of the function in the linux kernel where that skb was dropped) as part of the standard sFlow feed. This measurement complements the sFlow packet-sampling and counter-telemetry well because it provides visibility into the traffic that is not flowing. Very helpful for troubleshooting. The limit (a rate limit max of N drops per second that will be sent out in the sFlow datagrams) is the parameter that you would set in the CLI. Perhaps something like:

set system sflow dropmon limit 50

I hope you will consider enabling this feature. Very powerful.

@sever-sever
Member Author

sever-sever commented Mar 15, 2023

(1) It is probably better to set the sflow agent-address to an interface, rather than an IP address. That way it doesn't cause confusion if that address goes away. Or worse, if it is allocated to some other device in the network. The line that goes in /etc/hsflowd.conf looks like this:

agent=DEVICE

e.g. agent=eth0

Did you have some other reason for wanting to set an explicit IP address here?

Sure, a typical use case is a user configuring a firewall that requires an explicitly set source/destination for the allowed traffic.

So it could be extended:

set system sflow agent-interface 'eth0'

@sever-sever
Member Author

sever-sever commented Mar 15, 2023

(3) The sflow server setting should probably allow an extra argument to determine the namespace or vrf. The line that goes in hsflowd.conf looks like this:

collector { ip=IPADDRESS udpport=PORT namespace=NAMESPACE }

or:

collector { ip=IPADDRESS udpport=PORT dev=DEVICE }

e.g. collector { ip=10.0.0.30 udpport=6343 dev=eth0 }

If interface eth1 is already associated with vrf mgmt

vyos@r14# sudo ip -d link show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master mgmt state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:c7:31:bc brd ff:ff:ff:ff:ff:ff promiscuity 0  allmulti 0 minmtu 68 maxmtu 9212 
    vrf_slave table 1010 addrgenmode none numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 parentbus pci parentdev 0000:02:00.0 
[edit]
vyos@r14# 
[edit]
vyos@r14# sudo ip -d link show dev mgmt
9: mgmt: <NOARP,MASTER,UP,LOWER_UP> mtu 65575 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:7f:d3:56:24:1e brd ff:ff:ff:ff:ff:ff promiscuity 0  allmulti 0 minmtu 1280 maxmtu 65575 
    vrf table 1010 addrgenmode none numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 
[edit]
vyos@r14# 

Should it be

collector { ip=10.0.0.30 udpport=6343 dev=eth1 }

Or

collector { ip=10.0.0.30 udpport=6343 dev=mgmt }

(I don't know if hsflowd is picking up the ifSpeed of the interfaces on VyOS. If not, we should try to ensure that it can.)

I guess it depends on the driver:

vyos@r14# sudo ethtool -i eth0
driver: virtio_net
version: 1.0.0

[edit]
vyos@r14# 
[edit]
vyos@r14# 
[edit]
vyos@r14# sudo ethtool -i eth1
driver: e1000e
version: 6.1.16-amd64-vyos

[edit]
vyos@r14# 
[edit]
vyos@r14# 
[edit]
vyos@r14# sudo ethtool eth0 | grep Speed
	Speed: Unknown!
[edit]
vyos@r14# 
[edit]
vyos@r14# sudo ethtool eth1 | grep Speed
	Speed: 1000Mb/s
[edit]
vyos@r14#

vyos@r14# cat /sys/class/net/eth0/speed
-1
[edit]
vyos@r14# cat /sys/class/net/eth1/speed
1000
[edit]
vyos@r14# 
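
The same check can be done from Python by reading the sysfs file shown above. This is a minimal sketch: the function name `read_if_speed_mbps` and the `sysfs_root` parameter are made up for the example, and treating any non-positive or unreadable value as "unknown" is an assumption based on the `-1` that virtio_net reports here.

```python
from pathlib import Path
from typing import Optional


def read_if_speed_mbps(ifname: str, sysfs_root: str = "/sys/class/net") -> Optional[int]:
    """Return the link speed in Mb/s, or None if the driver does not report one."""
    path = Path(sysfs_root) / ifname / "speed"
    try:
        value = int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing interface, unreadable attribute, or non-numeric content.
        return None
    # Drivers such as virtio_net report -1 when the speed is unknown.
    return value if value > 0 else None
```

On the router above this would return None for eth0 and 1000 for eth1, which is the case where the top-level sampling default applies to eth0 only.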

# 1G interface will be sampled at 1-in-1000
# 10G interface will be sampled at 1-in-10000
# 40G interface will be sampled at 1-in-40000
# If the ifSpeed (bits/sec) is unknown then the default given as samplingRate=N is used

@pavel-odintsov pavel-odintsov Mar 15, 2023


I checked their documentation at https://sflow.net/host-sflow-linux-config.php and found the following:

sampling=400
For interfaces with no speed or applications with no specific sampling setting, fall back on this default 1-in-N rate. 

I read this as applying only when the configuration has no sections like the following:

sampling.100M = 100
sampling.1G   = 500
sampling.10G  = 1000
sampling.40G  = 4000

I tested on my own machine with a 1G interface and the following setup:

sflow {
  DNSSD = off
  polling = 30
  agentIP = 127.0.0.1
  sampling = 150
  collector { ip = 127.0.0.1 udpport = 6343 }
  # Add all relevant interfaces to this list 
  pcap { dev = enp37s0f0 }
}

And it did not behave as I expected. I have no sampling.1G here, but it ignored the sampling setting and used a hardcoded 1:1000:

sudo tshark  -i lo -n -V -f 'port 6343'|grep 'Sampling rate'
Sampling rate: 1 out of 1000 packets
Sampling rate: 1 out of 1000 packets
Sampling rate: 1 out of 1000 packets
Sampling rate: 1 out of 1000 packets
Sampling rate: 1 out of 1000 packets

Such an approach makes sampling rate setup extremely complicated and counter-intuitive.

I believe an intermediate option, such as setting all speeds and the fallback to the same value, would be the best option:

  sampling.100M = {{ sampling_rate }}
  sampling.1G   = {{ sampling_rate }}
  sampling.10G  = {{ sampling_rate }}
  sampling.40G  = {{ sampling_rate }}
  sampling = {{ sampling_rate }}

According to our deployments, 1:1000 may be a safe option from a performance and accuracy perspective for almost all speeds.


Their default pre-calculated value for 10G is way too high, and it even conflicts with their official recommendations:

10G interface will be sampled at 1-in-10000

@sever-sever
Member Author

The interface sampling rates can be overridden with settings like this:

sampling.1G=2000
sampling.10G=5000
sampling.400G=65536

Those are easy to understand, so you might want to allow them to be set via the VyOS CLI. Something like:

set system sflow sampling-rate speed 400G 65536

@sflow, honestly, this looks like a bug; it should work differently.
It is OK for options to have default values, but those defaults should be overridden when a value is explicitly specified. As I understand it, there was some use case for the current behavior.
It would be ideal to have an option to set the rate explicitly. It could be a "feature request." 🥇

So this syntax looks ugly:

 set system sflow sampling-rate speed 10G 2000
 set system sflow sampling-rate speed 400G 65536

If we already have:

set system sflow sampling-rate '128'

We also use defaultValues for everything from ssh-port to container-registry, and they are all overridden by configured values.

Maybe a new option like hard_sampling=xxx (which would set the values explicitly), or separate default_sampling=xxx and sampling=yyy.
Another idea is sampling=auto|1000|2000.

If the automatic settings/calculation work well, I prefer to leave it as is until a better solution is found.
Thanks.

@sflow

sflow commented Mar 15, 2023

(3) dev=mgmt or dev=eth1. I think either of these will work, actually.

(5) In hsflowd version 2.0.48-1 (released today) you have some new choices. You now have more control over the bps_ratio that applies to interfaces that have an ifSpeed. The default is sampling.bps_ratio=1000000 as discussed above, but you can turn that behavior off altogether with:

sampling.bps_ratio = 0

So as long as you haven't set anything like "sampling.1G=1000" it will fall back to the top-level default.

So if the goal is to set the same sampling rate to 5000 for all interfaces regardless of ifSpeed then you can do that with:

sampling=5000
sampling.bps_ratio=0

Also in version 2.0.48-1, you can override the samplingRate down in a pcap{} section. So if you want to allow a custom samplingRate on a given interface (which can occasionally be useful for testing) then you can do this:

pcap { dev=eth1 sampling=100 }
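
Putting these options together, a generator for hsflowd.conf could look roughly like the sketch below. It is not the VyOS template or system_sflow.py; the function name `render_hsflowd_conf` and its parameters are made up for the example, and it emits only the keys that appear in this thread.

```python
from typing import Iterable, Mapping, Optional, Tuple


def render_hsflowd_conf(
    polling: int,
    sampling: int,
    collectors: Iterable[Tuple[str, int]],
    interfaces: Iterable[str],
    per_interface_sampling: Optional[Mapping[str, int]] = None,
) -> str:
    """Build an hsflowd.conf that applies one sampling rate to all interfaces."""
    per_interface_sampling = per_interface_sampling or {}
    lines = [
        "sflow {",
        f"  polling={polling}",
        f"  sampling={sampling}",
        # Turn off the ifSpeed-based calculation so `sampling` applies everywhere.
        "  sampling.bps_ratio=0",
    ]
    for ip, port in collectors:
        lines.append(f"  collector {{ ip={ip} udpport={port} }}")
    for dev in interfaces:
        rate = per_interface_sampling.get(dev)
        # Optional per-interface override inside the pcap{} section (2.0.48-1+).
        extra = f" sampling={rate}" if rate is not None else ""
        lines.append(f"  pcap {{ dev={dev}{extra} }}")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Calling it with `sampling=5000` and `per_interface_sampling={"eth1": 100}` produces the `sampling=5000` / `sampling.bps_ratio=0` pair plus `pcap { dev=eth1 sampling=100 }`, as in the examples above.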

And one more new comment:
(7) You can leave out the "DNSSD=off" line. It is off by default.

@pavel-odintsov

@sflow awesome, thank you so much for this improvement!

I've tried hsflowd 2.0.48 on a 1G physical interface:

sudo ethtool  enp37s0f0
Settings for enp37s0f0:
Supported ports: [ TP ]
Supported link modes:   10baseT/Half 10baseT/Full
                        100baseT/Half 100baseT/Full
                        1000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes:  10baseT/Half 10baseT/Full
                        100baseT/Half 100baseT/Full
                        1000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported	
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: on
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
MDI-X: off (auto)
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000007 (7)
                               drv probe link
Link detected: yes

I've used the following configuration:

sflow {
  DNSSD = off
  polling = 30
  agentIP = 127.0.0.1
  sampling = 376
  sampling.bps_ratio=0
  collector { ip = 127.0.0.1 udpport = 6343 }
  pcap { dev = enp37s0f0 }
}

And it worked just fine:

sudo tshark  -i lo -n -V -f 'port 6343'|grep 'Sampling rate'
        Sampling rate: 1 out of 376 packets
        Sampling rate: 1 out of 376 packets
        Sampling rate: 1 out of 376 packets
        Sampling rate: 1 out of 376 packets

FastNetMon was able to parse traffic without any issues:

2023-03-15 22:05:52.654403 2a01:4b00:88b8:3a00:ff33:1595:8b2f:c9a9:49588 > 2001:41d0:0203:9fe4::0001:443 protocol: tcp flags: ack frag: 0  packets: 1 size: 90 bytes ip size: 0 bytes ttl: 0 sample ratio: 376  
2023-03-15 22:05:52.654403 2a01:4b00:88b8:3a00:ff33:1595:8b2f:c9a9:49588 > 2001:41d0:0203:9fe4::0001:443 protocol: tcp flags: ack frag: 0  packets: 1 size: 90 bytes ip size: 0 bytes ttl: 0 sample ratio: 376  
2023-03-15 22:05:52.654403 2a01:4b00:88b8:3a00:ff33:1595:8b2f:c9a9:49588 > 2001:41d0:0203:9fe4::0001:443 protocol: tcp flags: rst frag: 0  packets: 1 size: 78 bytes ip size: 0 bytes ttl: 0 sample ratio: 376 

Then I repeated my tests in GCE on a virtual network interface:

sudo ethtool ens4
Settings for ens4:
Supported ports: [ ]
Supported link modes:   Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes:  Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Link detected: yes

And it worked just fine from FastNetMon too:

2023-03-15 22:16:13,239 [INFO] Dump: 2023-03-15 22:16:13.239206 140.82.121.9:443 > 10.154.15.196:47418 protocol: tcp flags: psh,ack frag: 0  packets: 1 size: 2839 bytes ip size: 2821 bytes ttl: 55 sample ratio: 143  
2023-03-15 22:16:13,239 [INFO] Dump: 2023-03-15 22:16:13.239233 10.154.15.196:47418 > 140.82.121.9:443 protocol: tcp flags: ack frag: 0  packets: 1 size: 70 bytes ip size: 52 bytes ttl: 64 sample ratio: 143  
2023-03-15 22:16:13,239 [INFO] Dump: 2023-03-15 22:16:13.239324 10.154.15.196:47418 > 140.82.121.9:443 protocol: tcp flags: ack frag: 0  packets: 1 size: 70 bytes ip size: 52 bytes ttl: 64 sample ratio: 143  
2023-03-15 22:16:14,159 [INFO] Dump: 2023-03-15 22:16:14.159022 10.154.15.196:47418 > 140.82.121.9:443 protocol: tcp flags: ack frag: 0  packets: 1 size: 70 bytes ip size: 52 bytes ttl: 64 sample ratio: 143  
2023-03-15 22:16:14,159 [INFO] Dump: 2023-03-15 22:16:14.159216 45.189.200.1:2055 > 10.154.15.196:2055 protocol: udp frag: 0  packets: 1 size: 1498 bytes ip size: 1480 bytes ttl: 240 sample ratio: 143

It looks very solid and works as expected.

Well done!

@sever-sever
Member Author

sever-sever commented Mar 16, 2023

@pavel-odintsov @sflow Thanks!!! I'll change the template soon.
Does hsflowd support arm64 build?

08:38:08  gcc -std=gnu99 -I. -I../json -I../sflow  -fPIC -g -O2 -D_GNU_SOURCE -DHSP_VERSION=2.0.48 -DPROCFS=/proc -DSYSFS=/sys -DETCFS=/etc -DVARFS=/var -DUTHEAP -DHSP_OPTICAL_STATS -DHSP_MOD_DIR=/etc/hsflowd/modules -Wall -Wstrict-prototypes -Wunused-value -Wunused-function  -c mod_json.c 
08:38:09  gcc -o mod_json.so mod_json.o -shared  
08:38:09  gcc -std=gnu99 -I. -I../json -I../sflow  -fPIC -g -O2 -D_GNU_SOURCE -DHSP_VERSION=2.0.48 -DPROCFS=/proc -DSYSFS=/sys -DETCFS=/etc -DVARFS=/var -DUTHEAP -DHSP_OPTICAL_STATS -DHSP_MOD_DIR=/etc/hsflowd/modules -Wall -Wstrict-prototypes -Wunused-value -Wunused-function  -c mod_dnssd.c 
08:38:10  gcc -o mod_dnssd.so mod_dnssd.o -shared  -lresolv
08:38:10  gcc -std=gnu99 -I. -I../json -I../sflow  -fPIC -g -O2 -D_GNU_SOURCE -DHSP_VERSION=2.0.48 -DPROCFS=/proc -DSYSFS=/sys -DETCFS=/etc -DVARFS=/var -DUTHEAP -DHSP_OPTICAL_STATS -DHSP_MOD_DIR=/etc/hsflowd/modules -Wall -Wstrict-prototypes -Wunused-value -Wunused-function  -c mod_pcap.c 
08:38:10  mod_pcap.c:21:10: fatal error: pcap.h: No such file or directory
08:38:10     21 | #include <pcap.h>
08:38:10        |          ^~~~~~~~
08:38:10  compilation terminated.
08:38:10  make[1]: *** [Makefile:458: mod_pcap.o] Error 1
08:38:10  make[1]: Leaving directory '/home/admin/workspace/vyos-build-hsflowd_current/build-arm64/packages/hsflowd/host-sflow/src/Linux'
08:38:10  make: *** [Makefile:16: hsflowd] Error 2

@c-po
Member

c-po commented Mar 16, 2023

We now have system flow-accounting and system sflow, which I find a bit confusing. Maybe system flow-accounting netflow should be moved to system netflow?

@c-po
Member

c-po commented Mar 16, 2023

@pavel-odintsov @sflow Thanks!!! I'll change the template soon. Does hsflowd support arm64 build?

08:38:08  gcc -std=gnu99 -I. -I../json -I../sflow  -fPIC -g -O2 -D_GNU_SOURCE -DHSP_VERSION=2.0.48 -DPROCFS=/proc -DSYSFS=/sys -DETCFS=/etc -DVARFS=/var -DUTHEAP -DHSP_OPTICAL_STATS -DHSP_MOD_DIR=/etc/hsflowd/modules -Wall -Wstrict-prototypes -Wunused-value -Wunused-function  -c mod_json.c 
08:38:09  gcc -o mod_json.so mod_json.o -shared  
08:38:09  gcc -std=gnu99 -I. -I../json -I../sflow  -fPIC -g -O2 -D_GNU_SOURCE -DHSP_VERSION=2.0.48 -DPROCFS=/proc -DSYSFS=/sys -DETCFS=/etc -DVARFS=/var -DUTHEAP -DHSP_OPTICAL_STATS -DHSP_MOD_DIR=/etc/hsflowd/modules -Wall -Wstrict-prototypes -Wunused-value -Wunused-function  -c mod_dnssd.c 
08:38:10  gcc -o mod_dnssd.so mod_dnssd.o -shared  -lresolv
08:38:10  gcc -std=gnu99 -I. -I../json -I../sflow  -fPIC -g -O2 -D_GNU_SOURCE -DHSP_VERSION=2.0.48 -DPROCFS=/proc -DSYSFS=/sys -DETCFS=/etc -DVARFS=/var -DUTHEAP -DHSP_OPTICAL_STATS -DHSP_MOD_DIR=/etc/hsflowd/modules -Wall -Wstrict-prototypes -Wunused-value -Wunused-function  -c mod_pcap.c 
08:38:10  mod_pcap.c:21:10: fatal error: pcap.h: No such file or directory
08:38:10     21 | #include <pcap.h>
08:38:10        |          ^~~~~~~~
08:38:10  compilation terminated.
08:38:10  make[1]: *** [Makefile:458: mod_pcap.o] Error 1
08:38:10  make[1]: Leaving directory '/home/admin/workspace/vyos-build-hsflowd_current/build-arm64/packages/hsflowd/host-sflow/src/Linux'
08:38:10  make: *** [Makefile:16: hsflowd] Error 2

Yes it does - CI will soon build both versions.

@sever-sever
Member Author

We now have system flow-accounting and system sflow, which I find a bit confusing. Maybe system flow-accounting netflow should be moved to system netflow?

@c-po Sure, I guess we'll drop system flow-accounting entirely later, i.e. we should definitely do the migration.
Adding the new sflow separately was the simplest way to integrate it, since the whole of flow-accounting sflow|netflow (pmacct) is handled by the old flow_accounting_conf.py.

And thanks for fixing arm64 :)

@sever-sever
Member Author

The PR is completely ready for review.

Add sFlow feature based on hsflowd
According to user reviews, it works more stable and more productive
than pmacct
I haven't deleted 'pmacct' 'system flow-accounting sflow' yet
It could be migrated or deprecated later

  set system sflow agent-address '192.0.2.14'
  set system sflow interface 'eth0'
  set system sflow interface 'eth1'
  set system sflow polling '30'
  set system sflow sampling-rate '100'
  set system sflow server 192.0.2.1 port '6343'
  set system sflow server 192.0.2.11 port '6343'
@c-po c-po merged commit 5619dd6 into vyos:current Mar 16, 2023
@sever-sever
Member Author

(6) It seems that the VyOS Linux kernel is compiled without the "drop_monitor" module. Is that correct? If that module can be included in the build then you could add this to hsflowd.conf:
dropmon { limit=50 start=on sw=on hw=off }
so that hsflowd will export the headers of dropped packets (along with the name of the function in the linux kernel where that skb was dropped) as part of the standard sFlow feed. This measurement complements the sFlow packet-sampling and counter-telemetry well because it provides visibility into the traffic that is not flowing. Very helpful for troubleshooting. The limit (a rate limit max of N drops per second that will be sent out in the sFlow datagrams) is the parameter that you would set in the CLI. Perhaps something like:
set system sflow dropmon limit 50
I hope you will consider enabling this feature. Very powerful.

@c-po could you take a look at this in the future?

@c-po
Member

c-po commented Mar 17, 2023

Added in vyos/vyos-build@771b1f6

@sever-sever sever-sever deleted the T5086 branch March 17, 2023 11:10
@sflow

sflow commented Mar 17, 2023

We found a minor issue. If you don't specify the UDP port in the CLI it seems that the hsflowd.conf that is generated ends up with:

collector { ip=10.1.2.3 udpport= }

which fails to parse.

Seems like this would be easy to fix. You can either generate:

collector { ip=10.1.2.3 }

or make sure the default is used:

collector { ip=10.1.2.3 udpport=6343 }

The effect will be the same either way.
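
A minimal sketch of the second option, falling back to the default port when none was configured. This is not the actual fix; the function name `collector_line` and the constant name are made up for the example, and 6343 is the port used throughout this thread.

```python
from typing import Optional, Union

DEFAULT_SFLOW_PORT = 6343  # UDP port used for sFlow in the examples above


def collector_line(ip: str, port: Optional[Union[int, str]] = None) -> str:
    """Render one collector{} entry, never leaving udpport empty."""
    # Treat both "not set" and an empty string from the config as missing.
    if port is None or str(port).strip() == "":
        port = DEFAULT_SFLOW_PORT
    return f"collector {{ ip={ip} udpport={port} }}"
```

With this, `collector_line("10.1.2.3")` renders `collector { ip=10.1.2.3 udpport=6343 }` rather than an entry that fails to parse.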

@sever-sever
Member Author

@sflow Thanks, I'll take a look

@sever-sever
Member Author

The fix is in #1898
