Packets lost at a high rate #72

Open
sevagh opened this issue Apr 12, 2021 · 3 comments


sevagh commented Apr 12, 2021

Hello. We're using samplicator to copy statsd metric traffic (UDP packets) to two destinations. My question is: should we expect samplicator to fail to deliver 100% of the traffic to every destination when the packet rate is high?

In our setup, samplicator listens on port 8125 and forwards to ports 8124 and 9125. Here are the results of capturing 5 minutes' worth of packets on each of the three ports. The traffic going into samplicator is roughly 2x what reaches each destination, i.e. each destination only sees about half of the packets:

Samplicator port, 5 minutes of packets:

user@host $ sudo tcpdump -G 300 -W 1 udp -i any 'dst port 8125' -s 0 -w ~/8125.pcap
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
Maximum file limit reached: 1
33512370 packets captured
67426415 packets received by filter
392293 packets dropped by kernel

Destination 1:

user@host $ sudo tcpdump -G 300 -W 1 udp -i any 'dst port 8124' -s 0 -w ~/8124.pcap
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
Maximum file limit reached: 1
16446120 packets captured
33060123 packets received by filter
166158 packets dropped by kernel

Destination 2:

user@host $ sudo tcpdump -G 300 -W 1 udp -i any 'dst port 9125' -s 0 -w ~/9125.pcap
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
Maximum file limit reached: 1
17295917 packets captured
34708450 packets received by filter
114771 packets dropped by kernel

The sizes of the pcap files also reflect the difference in captured packets (the largest file is for the samplicator port; the two smaller ones are the two destinations):

user@host $ ls -latrh *.pcap
-rw-r--r-- 1 root      root      5.3G Apr 12 13:43 8125.pcap
-rw-r--r-- 1 root      root      2.6G Apr 12 13:48 8124.pcap
-rw-r--r-- 1 root      root      2.8G Apr 12 13:57 9125.pcap

Our next plan is to run some more synthetic load tests to find out exactly at what packet rate samplicator breaks down and stops being able to forward 100% of the traffic.

It also appears that samplicator is at 98+% CPU usage throughout (monitored via htop).
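For illustration, a synthetic sender could be as simple as the sketch below. This is not samplicator code; the destination address, payload size, and packet count are arbitrary assumptions, and the datagrams are just filler bytes rather than real statsd metrics.

/* Minimal UDP load-generator sketch (illustration only).
 * Blasts fixed-size datagrams at 127.0.0.1:8125 as fast as possible;
 * address, port, payload size and packet count are assumptions. */
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(8125);              /* samplicator's listen port */
    inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

    char payload[200];                       /* rough statsd-like datagram size */
    memset(payload, 'x', sizeof payload);

    long sent = 0, errors = 0;
    for (long i = 0; i < 10L * 1000 * 1000; i++) {   /* 10M packets per run */
        if (sendto(fd, payload, sizeof payload, 0,
                   (struct sockaddr *)&dst, sizeof dst) < 0)
            errors++;
        else
            sent++;
    }
    printf("sent=%ld errors=%ld\n", sent, errors);
    close(fd);
    return 0;
}

Comparing the sender's count against tcpdump captures on ports 8124 and 9125 (as above) should show the rate at which loss starts.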


sevagh commented Apr 12, 2021

As for potential code changes that could handle such high traffic rates: maybe multithreading, or sendmmsg (#62)?
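For context, sendmmsg(2) lets a sender hand the kernel a whole batch of datagrams in a single system call, so the per-packet syscall cost drops by roughly the batch size. A rough sketch of the idea on Linux is below; this is not samplicator's actual send path, and the batch size, buffer layout, and single fixed receiver are assumptions.

/* Sketch of batching outgoing datagrams with sendmmsg(2) (Linux-specific).
 * Illustration only: batch size, buffer layout and destination are assumptions. */
#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 32

int send_batch(int fd, struct sockaddr_in *dst,
               char bufs[BATCH][1500], size_t lens[BATCH], unsigned n)
{
    struct mmsghdr msgs[BATCH];
    struct iovec iovs[BATCH];
    memset(msgs, 0, sizeof msgs);

    if (n > BATCH)
        n = BATCH;
    for (unsigned i = 0; i < n; i++) {
        iovs[i].iov_base = bufs[i];
        iovs[i].iov_len  = lens[i];
        msgs[i].msg_hdr.msg_iov     = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen  = 1;
        msgs[i].msg_hdr.msg_name    = dst;      /* same receiver for the whole batch */
        msgs[i].msg_hdr.msg_namelen = sizeof *dst;
    }
    /* One system call submits up to n datagrams; returns how many were accepted. */
    return sendmmsg(fd, msgs, n, 0);
}

A matching recvmmsg() on the listening side would batch the receive path in the same way.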

@mitchellsayer

@sleinen Could you weigh in on this? I'm trying to use this package for a stream of real-time sensor data and appear to be getting random packet loss at different receiving nodes. I would love to know the limitations of this package at higher transmission rates.


sleinen commented Aug 5, 2022

The main limitation of the package is that it uses regular UDP datagram sockets and basic system calls (recvfrom and sendto). This means at least N+1 (where N is the number of receivers) system calls per processed UDP packet/datagram. System calls are relatively expensive. In our own use of samplicator—mainly for redistribution of unsampled Netflow streams—we simply crank up the size of UDP receive buffers to minimize loss. Of course larger buffers don't make the system more efficient, but they help cope with traffic peaks and temporary contention for host resources such as CPU cycles from other workloads. I have written a rudimentary tool called qui that can be used to visualize (in ASCII art) the utilization of those socket buffers over time. Maybe it is helpful for estimating how much buffer you need.
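To make the buffer suggestion concrete, the sketch below shows what enlarging the UDP receive buffer typically involves on Linux. It is an illustration, not samplicator's actual code; the 32 MB figure is an arbitrary example, and the kernel silently clamps the request to net.core.rmem_max unless that sysctl is raised first (e.g. sysctl -w net.core.rmem_max=33554432).

/* Sketch: asking the kernel for a larger UDP receive buffer on a socket.
 * Illustration only; 32 MB is an arbitrary example value. */
#include <stdio.h>
#include <sys/socket.h>

int grow_rcvbuf(int fd)
{
    int want = 32 * 1024 * 1024;        /* example: 32 MB */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof want) < 0)
        return -1;

    int got = 0;
    socklen_t len = sizeof got;
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
    /* Linux reports back twice the granted value; a small number here means
     * the request was clamped by net.core.rmem_max. */
    printf("effective receive buffer: %d bytes\n", got);
    return 0;
}

Watching the buffer's fill level over time (e.g. with qui, or ss -u -m) then indicates whether the chosen size actually absorbs the peaks.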

There are some open issues concerning possible performance improvements through the use of more modern system calls, e.g. #61 and #62, but I don't think anybody is working on implementing those.

This application might also be a nice excuse for someone to delve into BPF/XDP packet processing in the kernel, which would remove the system call overhead altogether. Another platform on which to re-implement this simple tool from scratch could be VPP/FD.io, which makes it relatively easy to benefit from optimized user-mode networking drivers such as DPDK.
