RFE: Alert on network SoftIRQ backlog or budget drops #108

Closed
jbainbri opened this Issue Oct 15, 2014 · 7 comments


jbainbri commented Oct 15, 2014

The file /proc/net/softnet_stat provides statistics on each CPU core's SoftIRQ network receive work, one line per core, with values in hexadecimal. Here are two cores:

$ cat /proc/net/softnet_stat 
00023aee 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
028ad20a 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

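If the second column contains a value greater than zero on any line, packets were dropped because the backlog queue was full, so net.core.netdev_max_backlog needs to be increased.

If the third column contains a value greater than zero on any line, the SoftIRQ ran out of budget with work still pending, so net.core.netdev_budget needs to be increased.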

Could xsos alert when these values are non-zero, and show the current value of the corresponding tunable?
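To make the check concrete, here is a minimal sketch in Python (not xsos's own code, which is a shell script). The function names `parse_softnet_stat` and `softirq_alerts` are made up for illustration, and the alert wording is only a suggestion. It decodes the first three hex columns of each line and reports when the second or third column is non-zero on any core.

```python
SAMPLE = """\
00023aee 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
028ad20a 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
"""


def parse_softnet_stat(text):
    """Decode the first three hex columns of each softnet_stat line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        processed, dropped, squeezed = (int(f, 16) for f in fields[:3])
        rows.append(
            {"processed": processed, "dropped": dropped, "time_squeeze": squeezed}
        )
    return rows


def softirq_alerts(rows, max_backlog, budget):
    """Return alert strings; an empty list means neither counter is non-zero."""
    alerts = []
    if any(r["dropped"] > 0 for r in rows):
        alerts.append(
            "Backlog max has been reached, increase "
            f"net.core.netdev_max_backlog (current value: {max_backlog})"
        )
    if any(r["time_squeeze"] > 0 for r in rows):
        alerts.append(
            "Budget is not sufficient, increase "
            f"net.core.netdev_budget (current value: {budget})"
        )
    return alerts


if __name__ == "__main__":
    for message in softirq_alerts(parse_softnet_stat(SAMPLE), 1000, 300):
        print(message)
```

With the two sample lines above, both columns are zero on every core, so no alert is produced.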

I'd prefer this near the ETHTOOL section, as that's where people look for drops. Mockup example:

ETHTOOL
  Interface Status:
    eth0  link=up 6000Mb/s full (autoneg=Y)  rx ring 982/1024   drv be2net v4.0.160r / fw 4.9.311.20
  Interface Errors:
    eth0  rx_errors: 679
          rx_tcp_checksum_errs: 679

SOFTIRQ
  Backlog max has been reached, increase net.core.netdev_max_backlog (current value: 1000)
  Budget is not sufficient, increase net.core.netdev_budget (current value: 300)

We don't collect /proc/net/softnet_stat in all versions of sosreport, so xsos would need to check for the presence of the file first.
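A rough sketch of that presence check, again in Python rather than xsos's shell. It assumes the sosreport tree mirrors the live filesystem layout, so the file would sit at `<sosroot>/proc/net/softnet_stat`; `read_if_present` is a hypothetical helper name.

```python
from pathlib import Path


def read_if_present(sos_root, relpath):
    """Return the file's text, or None when the sosreport did not collect it."""
    path = Path(sos_root) / relpath
    if not path.is_file():
        return None
    return path.read_text()


def softnet_stat_text(sos_root):
    # Assumption: sosreport stores the file under the same relative path
    # it has on a live system.
    return read_if_present(sos_root, "proc/net/softnet_stat")
```

A caller would skip the SOFTIRQ section entirely when `softnet_stat_text()` returns `None`.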

If a column contains zero, there is no need to report anything, though an OK status would also be fine:

SOFTIRQ
  Backlog max is sufficient (current value: net.core.netdev_max_backlog = 1000)
  Budget is sufficient (current value: net.core.netdev_budget = 300)

@ryran ryran self-assigned this Jan 1, 2015

@ryran ryran added the enhancement label Jan 1, 2015

Owner

ryran commented Jan 1, 2015

[image: xsos-softirq]

For the record Jamie: I did no independent research on this; I simply added exactly what you asked for. I can't think of anyone else I would trust like that...

PS: I added net.core.netdev_max_backlog and net.core.netdev_budget to sysctl output as well.

PPS: I did put it right below ethtool.

Owner

ryran commented Jan 2, 2015

@jbainbri : As always, speak up if you have any feedback.

@ryran ryran closed this Jan 2, 2015

Owner

ryran commented Jan 2, 2015

PS: I just noticed that I didn't add net.core.netdev_max_backlog and net.core.netdev_budget to sysctl .... today. They were already in there, thanks to your RFE from Nov 2013. haha.

Owner

ryran commented Oct 28, 2015

@jbainbri do we have a KCS for this? sroza was talking about it and it made me realize that if we do, I'd like to add it (like how I link the tainted kcs)

EDIT: Ah. https://access.redhat.com/solutions/1241943

mwtzzz commented Jun 23, 2017

This should only alert if the counters in the second and third columns of softnet_stat are increasing. Simply being greater than 0 is not sufficient unless the counters have been reset since the last check.
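For live monitoring, an increase check along these lines could be sketched as below. This is an illustration only: `column_deltas` is a hypothetical name, the two snapshots are assumed to have been captured separately from the same machine with lines in the same order, and treating a smaller later value as a reset is an assumption.

```python
def column_deltas(before, after, column):
    """Per-line increase of one hex column (0-based) between two snapshots."""
    old_lines = [l for l in before.splitlines() if l.strip()]
    new_lines = [l for l in after.splitlines() if l.strip()]
    deltas = []
    for old_line, new_line in zip(old_lines, new_lines):
        old = int(old_line.split()[column], 16)
        new = int(new_line.split()[column], 16)
        # Assumption: a smaller later value means the counter was reset,
        # so the later value itself is taken as the increase.
        deltas.append(new - old if new >= old else new)
    return deltas


if __name__ == "__main__":
    earlier = "00000010 00000001 00000000\n"
    later = "00000020 00000003 00000000\n"
    print("dropped increase per line:", column_deltas(earlier, later, 1))
    print("time_squeeze increase per line:", column_deltas(earlier, later, 2))
```

Column index 1 is the second column (backlog drops) and index 2 is the third (budget exhaustion).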

jbainbri commented Jun 25, 2017

Except sosreport provides a single snapshot in time, not a repeated collection over time, so that is outside the scope of sosreport and hence xsos as well.

Even an increasing count doesn't matter to most people, as long as they're not also suffering ring buffer drops at the same time. In most cases it's fine for the SoftIRQ to reschedule with more work to complete, as long as all the work is completed before ring buffer exhaustion.

mwtzzz commented Jun 26, 2017

Both your points make sense.
