
Performance impact question #80

Closed
jcdaniel14 opened this issue Jun 19, 2020 · 11 comments
@jcdaniel14

Hi there folks, here's the picture: I'm receiving flows from different routers. For some reason I receive a "copy" of the flows sent to another collector (I'm configured as just another exporter destination on IOS-XR), so I get data from ports that I don't really need, and I can't change the network configuration.

In order to save disk space, I decided to hardcode these ports into goflow (specifically in the SendKafkaFlowMessage func), so that when a flow comes from port "X" it is not sent to Kafka; the message is simply ignored.
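For reference, the drop-before-publish check can be sketched like this (a standalone sketch; the FlowMessage struct and shouldPublish name are illustrative only, not goflow's actual types or API):

```go
package main

import "fmt"

// ignoredPorts holds the interface/port indices whose flows we drop.
// The values here are illustrative.
var ignoredPorts = map[uint32]bool{
	17: true,
	42: true,
}

// FlowMessage is a stand-in for the decoded flow record; goflow's real
// message type has many more fields.
type FlowMessage struct {
	InIf uint32 // input interface index reported by the router
}

// shouldPublish reports whether a decoded flow should be forwarded to
// Kafka. Dropping here is a single map lookup per flow, so the cost is
// negligible next to packet decoding.
func shouldPublish(msg *FlowMessage) bool {
	return !ignoredPorts[msg.InIf]
}

func main() {
	fmt.Println(shouldPublish(&FlowMessage{InIf: 17})) // false (dropped)
	fmt.Println(shouldPublish(&FlowMessage{InIf: 99})) // true (kept)
}
```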

It does the job, but I'm not quite sure whether I may still be overloading the server. In networking, discarding a packet before processing it is not a performance issue for the router; would that be the case here too? Perhaps the impact is negligible and I should not worry at all?

My NetFlow traffic is considerable and continuously growing, so I can't tell whether the server is overloaded due to traffic peaks or because I introduced a bit more processing overhead.

I would appreciate some insight into this. Thank you.

@lspgn
Contributor

lspgn commented Jun 19, 2020

Hi @jcdaniel14
What metrics are you looking at that make you say your server is overloaded?
Most of GoFlow's processing happens during decoding. Filtering on the interface should be negligible.

@jcdaniel14
Author

Hi, I have a bare-metal server: 128 GB RAM, 24-core AMD Opteron 6174. CPU utilization seems to be around 50%, with spikes up to 70%.
I also run Kafka and the ELK Stack on this server, so it is difficult to tell whether I'm adding pressure with these lines of code inside goflow methods.

@lspgn
Contributor

lspgn commented Jun 22, 2020

Do you have per-process monitoring? Something like prometheus-node-exporter plus process-exporter/cadvisor?
Goflow alone should be able to handle thousands of flows per second.

If you want to dive into the effect of adding this function, I would suggest using pprof.

@mirsblog

@lspgn FWIW, I ran some tests on a VM with 4 vCPUs (2.3 GHz) and 32 GB RAM and compared it with nfacct. The maximum rate at which I could decode IPFIX and publish to Kafka without dropped packets was 15,000 packets/second; anything more caused significant packet loss. nfacct, in comparison, could easily scale up to 60,000 packets/second.

I ran the tests with -kafka=false and it seemed to have no effect. I also increased the number of workers, but did not see any marked difference in packet drops between 1 and 100 workers.

Test setup:
Host type: VM
CPU: 4 vCPU Intel Xeon E312xx
Memory: 32GB

I used an IPFIX PCAP with tcpreplay to send packets from one host to another.
$ sudo tcpreplay -i ens3 -K --loop=50000 -p 15000 ipfix.pcap

I monitored packet drops via /proc/net/udp6.

@lspgn
Contributor

lspgn commented Jun 23, 2020

@mirsblog interesting, thanks for the insights.
Does it use all the processors?

@mirsblog

> @mirsblog interesting, thanks for the insights.
> Does it use all the processors?

I assume so, given that runtime.GOMAXPROCS(runtime.NumCPU()) is set in goflow.go. Would that be a correct assumption?

@lspgn
Contributor

lspgn commented Jun 23, 2020

Yes, it should use all processors; I was just curious whether the load distribution would look the same in htop. Did you compile GoFlow or did you get a specific binary?

@mirsblog

mirsblog commented Jun 23, 2020

> Yes it should use all processors, was just curious if the load distribution would be the same when looking at htop. Did you compile GoFlow or did you get a specific binary?

GoFlow: v3.4.2
Go: 1.14
I built the Alpine image using the Dockerfile found in v3.4.2 and ran it using the instructions from the README.

Edit: tested just now and confirmed in htop that the CPU load distribution is even with workers=4.

@lspgn
Contributor

lspgn commented Jun 23, 2020

The first thing I can think of that would affect NetFlow decoding performance is the shared template cache.
Protobuf encoding (plus its memory allocations) may also be the cause. One would need to compile a specific version that bypasses it, and profile with pprof.

@mirsblog

Ok. I will consider that and start another thread when I have more to share.

@jcdaniel14
Author

> Do you have per-process monitoring? Something like prometheus-node-exporter and process-exporter/cadvisor?
> Goflow alone should be able to handle thousands of flows per second.
>
> If you want to dive into the effect of adding this function, I would suggest using pprof.

I don't really have per-process monitoring, but I will dive into it. The server is processing 25k flows/sec at the moment according to Logstash, and it hasn't been noticeably affected by the changes I made. I was just worried that, since it handles flows, Kafka, Elasticsearch, and Logstash at the same time, I could add stress by adapting the code the way I did. Thanks for clarifying and for the good-practices advice.
