RFE: Alert on network SoftIRQ backlog or budget drops #108
ryran self-assigned this Jan 1, 2015
ryran added the enhancement label Jan 1, 2015
added a commit that referenced this issue Jan 2, 2015
@jbainbri: As always, speak up if you have any feedback.
ryran closed this Jan 2, 2015
ryran commented Jan 2, 2015 (Owner)
PS: I just noticed that I didn't add net.core.netdev_max_backlog and net.core.netdev_budget to sysctl .... today. They were already in there, thanks to your RFE from Nov 2013. haha.
ryran commented Oct 28, 2015 (Owner)
@jbainbri do we have a KCS for this? sroza was talking about it and it made me realize that if we do, I'd like to add it (like how I link the tainted kcs)
mwtzzz commented Jun 23, 2017
This should only alert if the counters in the second and third columns of softnet_stat are increasing. Simply being greater than 0 is not sufficient unless the counters have been reset since the last check.
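The condition described here can be sketched as a comparison of two snapshots taken some time apart. This is a hypothetical Python helper, not xsos code: the names `read_counters` and `increased_counters` are assumptions made for illustration. It flags a CPU only when its second- or third-column counter has grown between the two reads.

```python
def read_counters(text):
    """Parse softnet_stat text into [(col2, col3), ...], one pair per CPU row."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # softnet_stat counters are printed in hexadecimal
            pairs.append((int(fields[1], 16), int(fields[2], 16)))
    return pairs


def increased_counters(before, after):
    """Return {cpu: (col2_delta, col3_delta)} for CPUs whose counters grew."""
    grown = {}
    snapshots = zip(read_counters(before), read_counters(after))
    for cpu, (old, new) in enumerate(snapshots):
        delta = (new[0] - old[0], new[1] - old[1])
        if delta[0] > 0 or delta[1] > 0:
            grown[cpu] = delta
    return grown
```

Both arguments are the raw text of /proc/net/softnet_stat, so this needs two reads of the file separated by an interval.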
jbainbri commented Jun 25, 2017
Except sosreport provides a single snapshot in time, not a repeated collection over time, so that is outside the scope of sosreport and hence xsos as well.
Even an increasing count doesn't matter to most people, as long as they're not also suffering ring buffer drops at the same time. In most cases it's fine for the SoftIRQ to reschedule with more work to complete, as long as all the work is completed before ring buffer exhaustion.
mwtzzz commented Jun 26, 2017
Both your points make sense.

jbainbri commented Oct 15, 2014
The file /proc/net/softnet_stat provides statistics on each CPU core's SoftIRQ network receive work. Here's two cores:
If the second column of this contains a value greater than zero, then net.core.netdev_max_backlog needs to be increased.
If the third column of this contains a value greater than zero, then net.core.netdev_budget needs to be increased.
Could xsos alert to the fact these values have incremented, and provide the current tunable value?
I'd prefer this up around the ETHTOOL section, as that's where people are looking for drops. Mockup example:
We don't collect /proc/net/softnet_stat in all versions of sosreport, so xsos would need to check for the presence of the file first.
If the column contains zero, there is no need to report, or an ok status would be fine:
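A minimal sketch of the requested check, written in Python for illustration and not taken from xsos. The function names, the `sos_root` parameter, and the wording of the status lines are all assumptions. It relies on /proc/net/softnet_stat holding one row of hexadecimal counters per CPU, returns None when the file is absent so the caller can skip the section, and falls back to a single ok line when the second and third columns are zero on every CPU.

```python
import os


def parse_softnet_stat(text):
    """Return one (col2, col3) tuple per CPU row of softnet_stat text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated rows
        # Counters are hexadecimal; only columns 2 and 3 matter here.
        rows.append((int(fields[1], 16), int(fields[2], 16)))
    return rows


def summarize(rows):
    """Build status lines: one per non-zero counter, or a single ok line."""
    lines = []
    for cpu, (backlog, budget) in enumerate(rows):
        if backlog:
            lines.append(f"cpu{cpu}: {backlog} backlog drops; "
                         "consider raising net.core.netdev_max_backlog")
        if budget:
            lines.append(f"cpu{cpu}: {budget} budget squeezes; "
                         "consider raising net.core.netdev_budget")
    return lines or ["softnet_stat: ok"]


def softnet_report(sos_root="/"):
    """Return None if the file was not collected, else a list of status lines."""
    path = os.path.join(sos_root, "proc/net/softnet_stat")
    if not os.path.isfile(path):
        return None  # this sosreport version did not collect the file
    with open(path) as handle:
        return summarize(parse_softnet_stat(handle.read()))
```

Calling `softnet_report("/path/to/sosreport")` with an extracted sosreport directory (a hypothetical path) would check that snapshot; the default of "/" reads the live system. Reporting the current tunable values alongside the alert is left out of this sketch.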