docs: add performance debugging section #4525
Conversation
Nice work Andreas. Can we rename this from 'debug' to 'analysis'?

Sure, that is a better name. I will change it in another PR after any other feedback.
Looks good!
> ``sudo perf top -p $(pidof suricata)``
>
> If you see specific function calls at the top and red it's a hint that those
nit: s/at top and red/at top in red/
> sure you have it installed and also the debug symbols installed for suricata or
> the output won't be very helpful. This output is also helpful when you report
> performance issues as the Suricata Development team can narrow down possible
> bugs with that.
nit: s/bugs/issues/ ?
> eBPF/XDP.
>
> Another helpful tool is **perf** which helps to spot performance issues. Make
> sure you have it installed and also the debug symbols installed for suricata or
nit: s/suricata/Suricata/
> If you see specific function calls at the top and red it's a hint that those
> are the bottlenecks. For example if you see **IPOnlyMatchPacket** it can be
> either a result of high drop rates or incomplete flows which result in
> decreased performance.
It can be helpful to add text about checking perf top for a specific CPU and/or a thread (`-t`, `-c` I think).
will add those.
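The per-CPU and per-thread options suggested above could look roughly like the following sketch; `<tid>` is a placeholder, and the exact option set should be verified against your perf version:

```shell
# Profile a single CPU core, e.g. a worker core pinned at 100%
sudo perf top -C 3

# Profile one Suricata worker thread by its TID
# (list TIDs with: ps -T -p $(pidof suricata))
sudo perf top -t <tid>
```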
> Another recommendation is to run Suricata without any rules to see if it's
> mainly related to the traffic. It can also be helpful to use rule-profiling
> and/or packet-profiling at this step.
It can be worth mentioning that for that part Suricata needs to be compiled with `--enable-profiling`, and that this has a perf impact, so it is advised not to leave it like that in prod.
will add this note
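The profiling build and the no-rules run discussed above can be sketched as follows (paths are examples; as noted, the profiling build adds overhead and is not meant for production):

```shell
# Build with profiling support to get rule-/packet-profiling output
./configure --enable-profiling
make && sudo make install

# Run without any rules to see whether the load is traffic-related;
# /dev/null acts as an empty rule file
sudo suricata -c /etc/suricata/suricata.yaml -i eth0 -S /dev/null
```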
Good job / endeavor :)
> First steps to check are:
>
> - Check if the traffic is bidirectional, if it's mostly unidirectional you're missing relevant parts of the flow (see **tshark** example at the bottom)
Could also check if there is a big discrepancy between SYN vs SYN-ACKs and RSTs in the stats/eve logs.
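The SYN vs SYN-ACK check suggested above can be done against eve.json `stats` records. A minimal sketch, assuming the `tcp.syn` and `tcp.synack` counters present in current Suricata stats output (field names may vary between versions, and the sample record below is illustrative only):

```python
import json

def syn_counters(eve_lines):
    """Return the latest (syn, synack) totals from eve.json 'stats' records.

    Stats counters are cumulative, so the last seen value is the total.
    """
    syn = synack = 0
    for line in eve_lines:
        rec = json.loads(line)
        if rec.get("event_type") != "stats":
            continue
        tcp = rec.get("stats", {}).get("tcp", {})
        syn = tcp.get("syn", syn)
        synack = tcp.get("synack", synack)
    return syn, synack

# Hypothetical sample record for illustration
sample = ['{"event_type":"stats","stats":{"tcp":{"syn":1000,"synack":120}}}']
syn, synack = syn_counters(sample)
if synack < syn // 2:
    print("large SYN/SYN-ACK gap - traffic may be mostly unidirectional")
```

A big gap between SYNs and SYN-ACKs (or a flood of RSTs) points at one-sided capture, matching the bidirectionality check in the quoted hunk.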
> - Check for encapsulated traffic, while GRE, MPLS etc. are supported they could also lead to performance issues. Especially if there are several layers of encapsulation
> - Use tools like **iftop** to spot elephant flows. Flows that have a rate of over 1Gbit/s for a long time can result in one CPU core at 100% all the time and an increasing drop rate, while it doesn't make sense to dig deep into this traffic.
> - If VLAN is used it might help to disable **vlan.use-for-tracking**, especially in scenarios where only one direction of the flow has the VLAN tag
> - If VLAN QinQ (IEEE 802.1ad) is used be very cautious if you use **cluster_qm** in combination with Intel drivers. While the RFC expects ethertype 0x8100 and 0x88A8 in this case (see https://en.wikipedia.org/wiki/IEEE_802.1ad) most implementations only add 0x8100 on each layer. If the first seen layer has the same VLAN tag but the inner one has different VLAN tags it will still end up in the same queue in **cluster_qm** mode.
It should be mentioned what kernel level and which specific Intel drivers (e.g. i40e/ixgbe etc.) this is observed under. It may not be true for all Intel drivers/all kernel versions. Mentioning af-packet might make it easier to differentiate the runmode used.
I won't be able to test all old ones but I can at least add "up to version/firmware XY"
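The **iftop** suggestion from the quoted hunk can be sketched as below; the interface name is a placeholder:

```shell
# Per-flow bandwidth on the capture interface; -n/-N skip DNS and
# port-name lookups so sustained >1Gbit/s elephant flows stand out
sudo iftop -i eth0 -n -N
```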
> - Check for other unusual or complex protocols that aren't supported very well. In several cases we've seen that Cisco Fabric Path (ethertype 0x8903) causes performance issues. It's recommended to filter it, one option would be a bpf filter with **not ether proto 0x8903**
A useful addition could be mentioning that bulk debug targeting could help in pinpointing an issue. For example running Suricata with a bpf filter like `port 80`, `port 25` or `not port 443` could help zeroing in on a problematic protocol or rules category, in combination with perf top for a specific CPU or thread.
will add a section for this as well
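The BPF-based bisection discussed in this thread can be sketched as follows; interface and filters are examples, assuming a capture method that accepts a trailing BPF expression on the command line:

```shell
# Exclude Cisco Fabric Path frames (ethertype 0x8903)
sudo suricata -c /etc/suricata/suricata.yaml -i eth0 'not ether proto 0x8903'

# Bisect by protocol: drop one traffic class at a time and watch
# perf top / drop counters to zero in on the problematic traffic
sudo suricata -c /etc/suricata/suricata.yaml -i eth0 'not port 443'
```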
Make sure these boxes are signed before submitting your Pull Request -- thank you.
Link to redmine ticket:
Describe changes:
PRScript output (if applicable):