sshambar:
netifd sets the default STP forward_delay below the minimum allowed by the protocol. Conforming implementations therefore ignore its BPDUs, which risks bridge loops.
The relevant limits are set in IEEE 802.1D-1998 section 8.10.2 Table 8-3: the allowable forward_delay range is 4 - 30 seconds. netifd sets the initial default to 2 seconds.
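To make the range concrete, here is a minimal sketch of the check a conforming bridge would apply to a configured Forward Delay. The helper name `stp_forward_delay_valid` and the two macros are hypothetical and are not taken from netifd or the kernel; only the 4 and 30 second limits come from the table cited above.

```c
#include <stdbool.h>

/* Forward Delay limits in seconds, as quoted above from
 * IEEE 802.1D-1998 Table 8-3. The macro names are made up. */
#define STP_FORWARD_DELAY_MIN 4
#define STP_FORWARD_DELAY_MAX 30

/* Hypothetical helper (not a netifd or kernel function):
 * returns true when the value lies inside the allowed range. */
static bool stp_forward_delay_valid(int seconds)
{
	return seconds >= STP_FORWARD_DELAY_MIN &&
	       seconds <= STP_FORWARD_DELAY_MAX;
}
```

With this check, the current default of 2 is rejected and the proposed default of 4 is the lowest value that passes.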
I tested this with several of my Netgear managed switches; they ignore the invalid "2 second" STP packets. Once forward_delay is corrected to a value within the limits (4s), the switches accept the OpenWRT bridge as the root bridge (since it has a lower bridge-id).
netifd should definitely not be defaulting to invalid values (even if it, and the kernel, allow the values to be set).
Here's a patch to fix the default:
--- a/bridge.c
+++ b/bridge.c
@@ -875,7 +875,7 @@ bridge_apply_settings(struct bridge_state *bst, struct blob_attr **tb)
Specifically, the packet is invalid as it fails the Spanning Tree Algorithm in section A.9, step 17c.
NOTE: Since the 1998 version of the standard requires subscribing to IEEE, you can also find the limits in the "free to download" updated 802.1D-2004 standard, section 17.14, Table 17-1 for the RSTP (which has the same forward delay limits as STP).
On the subject of "additional possible fixes" (only suggestions):
The very low Forward Delay of 4 seconds still results in "non-conforming" behavior by OpenWRT, but at least no longer "breaking" behavior. Section 8.10.2 of 802.1D-1998 states:
A Bridge shall enforce the following relationships:
2 × (Bridge_Forward_Delay – 1.0 seconds) >= Bridge_Max_Age
... so even if the default Forward Delay is increased to 4 seconds, the default Max Age should also be reduced to 6 seconds, since 2 × (4 – 1) = 6 (kernel currently defaults to 20 seconds).
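The relationship is easy to check mechanically. The sketch below is only an illustration with hypothetical helper names (`stp_max_age_fits`, `stp_max_age_limit`); it is not code from netifd or the kernel, and it assumes all timers are given in whole seconds.

```c
#include <stdbool.h>

/* Hypothetical helper: checks the relationship quoted above,
 * 2 x (Bridge_Forward_Delay - 1.0 seconds) >= Bridge_Max_Age,
 * with both timers in whole seconds. */
static bool stp_max_age_fits(int forward_delay, int max_age)
{
	return 2 * (forward_delay - 1) >= max_age;
}

/* Hypothetical helper: the largest Max Age that the relationship
 * allows for a given Forward Delay. */
static int stp_max_age_limit(int forward_delay)
{
	return 2 * (forward_delay - 1);
}
```

For a Forward Delay of 4, `stp_max_age_limit` gives 6, so the kernel's Max Age default of 20 fails the check while 6 passes.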
Also the minimum value for Forward Delay of 4 seconds is calculated (in section B.4.5) based on a Hello Time of 1 second, so that value should also be set (kernel currently defaults to 2 seconds).
Neither of these updates is critical (the timers work at their current defaults), but they would give STP "sensible" timers.