Transmitter can transmit stale fragment which is already timed out at the receiver #9604

sarveshkumarv3 · 2023-11-14T02:14:13Z

Is your feature request related to a problem? Please describe.

Reassembly timeout is currently handled via OPENTHREAD_CONFIG_6LOWPAN_REASSEMBLY_TIMEOUT (2s), which means the receiver expects a new fragment at least every 2s. If the transmitter cannot send the next fragment within that time (due to an MLE or CoAP packet coming in the middle, or due to channel conditions), the receiver drops the partially reassembled message.

Describe the solution you'd like

The above behavior can be avoided if the transmitter checks the same timeout on the transmit side and drops the packet rather than transmitting a fragment after OPENTHREAD_CONFIG_6LOWPAN_REASSEMBLY_TIMEOUT has elapsed.
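A minimal sketch of the proposed sender-side check, not OpenThread code. `FragmentTxState`, `ShouldDropStaleMessage`, and the millisecond timestamps are hypothetical names introduced for illustration; the only value taken from the description is the 2-second default of OPENTHREAD_CONFIG_6LOWPAN_REASSEMBLY_TIMEOUT, and the sketch assumes the receiver uses that same value.

```cpp
#include <cstdint>

// Assumed to mirror the default of OPENTHREAD_CONFIG_6LOWPAN_REASSEMBLY_TIMEOUT (seconds).
constexpr uint32_t kReassemblyTimeoutSec = 2;

// Hypothetical per-message bookkeeping on the transmitter.
struct FragmentTxState
{
    bool     mFirstFragmentSent; // true once any fragment has gone out
    uint32_t mLastFragmentTxMs;  // time the previous fragment was sent
};

// Returns true if the next fragment would go out after the receiver has
// (by assumption) already discarded the partially reassembled message.
bool ShouldDropStaleMessage(const FragmentTxState &aState, uint32_t aNowMs)
{
    if (!aState.mFirstFragmentSent)
    {
        return false; // nothing at the receiver to time out yet
    }

    // Unsigned subtraction stays correct across a 32-bit timer wrap.
    uint32_t elapsedMs = aNowMs - aState.mLastFragmentTxMs;

    return elapsedMs >= kReassemblyTimeoutSec * 1000u;
}
```

The transmitter would call such a check just before preparing each subsequent fragment and drop the whole message when it returns true.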

Describe alternatives you've considered

NA

Additional context

NA

abtink · 2023-11-15T23:25:37Z

This seems to be a useful optimization/enhancement to avoid sending frames when we know the receiver will drop or ignore them.

I see one potential downside that we may want to consider:

If I am not mistaken, the "6LoWPAN reassembly timeout" value of 2 seconds is not explicitly specified or fixed in the Thread specification, and I think it is a configurable parameter in OpenThread (OT).
- I tried searching the Thread spec and could not find it.
- In RFC 4944 I see "The reassembly timeout MUST be set to a maximum of 60 seconds" (which seems to set a max value for the timeout).

So the sender cannot be certain that the receiver is using the same timeout value of 2 seconds.
By implementing this optimization on the sender side, we are indirectly limiting the receiver's timeout value as well.

Despite this, I still believe that implementing this optimization is useful. We can consider defining and fixing the timeout in the Thread specification to be 2 seconds.

Practically, I think it is unlikely that any deployment of OT has modified the default value of OPENTHREAD_CONFIG_6LOWPAN_REASSEMBLY_TIMEOUT to use a different value.

Thoughts? @jwhui @sarveshkumarv3 @EskoDijk

sarveshkumarv3 · 2023-11-17T05:54:26Z

thanks @abtink!

I agree defining OPENTHREAD_CONFIG_6LOWPAN_REASSEMBLY_TIMEOUT as a spec parameter would help here.

> By implementing this optimization on the sender side, we are indirectly limiting the receiver's timeout value as well.

How was 2s chosen? Do you see any downside to limiting the timeout to 2s?

EskoDijk · 2023-11-17T08:43:34Z

Agree that it needs to be defined in the spec if we want to optimize anything here. From the transmitter's perspective: it only knows that the receiver implements the reassembly rules per RFC 4944 (which refers to RFC 2460, now replaced by RFC 8200), and it doesn't take into account whether the receiver is OpenThread or another software stack. I created SPEC-1239 for possible spec updates.

abtink · 2023-11-17T18:18:21Z

Thanks guys.

I am not sure how the 2 seconds was picked as the default, but it seems to me a reasonable interval. I think it should be fine to require the same value in the spec. It is probably very unlikely that any deployment of OT has changed this number from the original default of 2 seconds.

abtink · 2023-11-17T21:35:57Z

The current implementation of the reassembly timeout in OT utilizes the TimeTicker class, which is a one-second periodic timer that emits a signal at each tick (every second) to modules that have registered to receive ticks using the HandleTimeTick() callback.

Consequently, the 2-second reassembly timeout is not exact, in the sense that the actual time at which a partially reassembled message is discarded can range from 2 up to 3 seconds.
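The 2-to-3-second range can be reproduced with a small simulation. This is a sketch under an assumed model, not the OT implementation: each received fragment restarts a countdown at the timeout value, every tick decrements it, and the message is discarded on the first tick that finds the countdown at zero. `DiscardDelayMs` and `kTickPeriodMs` are hypothetical names.

```cpp
#include <cstdint>

constexpr uint32_t kTickPeriodMs = 1000; // one-second ticker, as described above

// Returns the delay (ms) between a fragment's arrival and the tick that
// discards the partial message. aOffsetMs is how long after the previous
// tick the fragment arrived (0 <= aOffsetMs < kTickPeriodMs).
uint32_t DiscardDelayMs(uint32_t aOffsetMs, uint8_t aTimeoutSec)
{
    uint8_t  remaining = aTimeoutSec; // countdown restarted by the fragment
    uint32_t nowMs     = 0;           // time measured from the previous tick

    for (;;)
    {
        nowMs += kTickPeriodMs; // next tick fires

        if (remaining == 0)
        {
            return nowMs - aOffsetMs; // discarded on this tick
        }

        remaining--;
    }
}
```

Under this model and a timeout of 2, a fragment arriving right at a tick boundary (offset 0) is discarded after 3000 ms, while one arriving 999 ms after a tick is discarded after 2001 ms.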

jwhui · 2023-11-20T16:13:51Z

> I am not sure how the 2 seconds was picked as default but seems to me as a reasonable interval. I think it should be fine to require the same value in spec.

To provide some historical context:

We used to have a timeout config for reassembling the entire datagram.
In [qos] forward messages based on the priority of the message #3317, this was changed to a timeout config for receiving the next fragment. At the time, 2 seconds seemed sufficient for receiving the next fragment.
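To illustrate the difference between the two timeout styles, here is a sketch with hypothetical names (`TimeoutPolicy`, `ReassemblyEntry`, `OnFragmentReceived`, `IsExpired`); it is not taken from the OT source. With a whole-datagram timeout the deadline is set once by the first fragment; with a next-fragment timeout every fragment pushes the deadline out again.

```cpp
#include <cstdint>

enum class TimeoutPolicy
{
    kWholeDatagram, // deadline fixed when the first fragment arrives
    kNextFragment,  // deadline restarted by every fragment
};

// Hypothetical reassembly bookkeeping for one partially received message.
struct ReassemblyEntry
{
    uint32_t mDeadlineMs;
};

void OnFragmentReceived(ReassemblyEntry &aEntry,
                        bool             aIsFirstFragment,
                        uint32_t         aNowMs,
                        uint32_t         aTimeoutMs,
                        TimeoutPolicy    aPolicy)
{
    if (aIsFirstFragment || aPolicy == TimeoutPolicy::kNextFragment)
    {
        aEntry.mDeadlineMs = aNowMs + aTimeoutMs;
    }
}

// Timer wrap-around is ignored here to keep the sketch short.
bool IsExpired(const ReassemblyEntry &aEntry, uint32_t aNowMs)
{
    return aNowMs >= aEntry.mDeadlineMs;
}
```

One consequence of the next-fragment style is that a long message can legitimately take much longer than 2 seconds to reassemble in total, as long as no single gap between fragments exceeds the timeout.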

abtink · 2023-12-05T00:44:55Z

Submitted PR for this.

[mesh-forwarder] drop msg if frag tx delay exceeds reassembly timeout #9684

@EskoDijk @sarveshkumarv3 can you please take a look when you get a chance? Thanks.

sarveshkumarv3 · 2023-12-10T05:36:35Z

> Submitted PR for this.
>
> [mesh-forwarder] drop msg if frag tx delay exceeds reassembly timeout #9684
>
> @EskoDijk @sarveshkumarv3 can you please take a look when you get a chance? Thanks.

Looks good, thanks for handling this.

EskoDijk · 2024-02-15T13:17:25Z

The PR was closed: the check is difficult to get right because fragments may take different routes, or be delayed by more than 2 seconds en route, while still meeting the final reassembly deadline.

One thing I realized when doing fragmentation experiments with OT: if a mesh router on the route decides to drop one fragment, the packet reassembly is very likely to fail at the final mesh destination. In that case this mesh router can save bandwidth by not forwarding any further fragments with that same tag/sender. The decision to drop a single fragment can come from failure to transmit the frame after 16 attempts (e.g. NoAck), or because (ECN) queue management drops it.
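A rough sketch of the bookkeeping such a router would need, using a hypothetical `DroppedDatagramTracker` class keyed by sender and datagram tag. It is illustrative only and does not reflect how OT stores this state; a real implementation would also have to expire entries so that a reused tag is not dropped forever.

```cpp
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical tracker kept by a forwarding router.
class DroppedDatagramTracker
{
public:
    // Record that one fragment of (sender, tag) was dropped on this router.
    void MarkDropped(uint16_t aSenderId, uint16_t aDatagramTag)
    {
        mDropped.emplace(aSenderId, aDatagramTag);
    }

    // True if later fragments of this datagram should not be forwarded.
    bool ShouldDrop(uint16_t aSenderId, uint16_t aDatagramTag) const
    {
        return mDropped.count(std::make_pair(aSenderId, aDatagramTag)) != 0;
    }

private:
    // No expiry here: entries are kept for the lifetime of the tracker.
    std::set<std::pair<uint16_t, uint16_t>> mDropped;
};
```

The forwarding path would call `ShouldDrop()` for every incoming fragment and `MarkDropped()` whenever it gives up on one.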

abtink · 2024-02-15T18:16:44Z

> In that case this mesh-router can save bandwidth by not forwarding any further fragments with that same tag/sender.

I believe such behavior may already be supported in certain situations.

We have OPENTHREAD_CONFIG_DROP_MESSAGE_ON_FRAGMENT_TX_FAILURE, which affects the original sender that performs the fragmentation.

openthread/src/core/config/mesh_forwarder.h

Lines 49 to 63 in 49c59ec

    
    /**
     * @def OPENTHREAD_CONFIG_DROP_MESSAGE_ON_FRAGMENT_TX_FAILURE
     *
     * Define as 1 for OpenThread to drop a message (and not send any remaining fragments of the message) if all transmit
     * attempts fail for a fragment of the message. For a direct transmission, a failure occurs after all MAC transmission
     * attempts for a given fragment are unsuccessful. For an indirect transmission, a failure occurs after all data poll
     * triggered transmission attempts for a given fragment fail.
     *
     * If set to zero (disabled), OpenThread will attempt to send subsequent fragments, whether or not all transmission
     * attempts fail for a given fragment.
     *
     */
    #ifndef OPENTHREAD_CONFIG_DROP_MESSAGE_ON_FRAGMENT_TX_FAILURE
    #define OPENTHREAD_CONFIG_DROP_MESSAGE_ON_FRAGMENT_TX_FAILURE 1
    #endif
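As a hedged example, a build that wanted the opposite behavior could define the macro before this header is processed. The file name below is hypothetical, and the sketch assumes the build already uses a project-specific core config header to override `OPENTHREAD_CONFIG_*` defaults.

```cpp
// my-project-core-config.h (hypothetical file name)
//
// The default shown above is wrapped in #ifndef, so a definition that is
// seen first takes precedence. Setting it to 0 makes the sender keep
// transmitting the remaining fragments even after one fragment fails.
#define OPENTHREAD_CONFIG_DROP_MESSAGE_ON_FRAGMENT_TX_FAILURE 0
```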

For intermediate routers we have a mechanism to track whether an earlier fragment was dropped by the queue management logic and to drop all subsequent related fragments. But I don't think this behavior is supported in the case where a fragment is dropped at the MAC layer (no ack after retx).

EskoDijk · 2024-02-21T11:13:02Z

@abtink Thanks,

> But I don't think this behavior is supported in the case where frag is dropped at MAC layer (no ack after retx).

There's also the possibility that only the Ack frame was disrupted, but not the original data frame. So after a fragment drop due to NoAck, it may still be the case that the fragment was actually delivered to the next-hop router, in which case there's no reason to drop any further frames.

If the drop is due to ECN / queue management logic, it's a different story: there's no chance that the fragment was delivered to the next hop because it wasn't sent yet.
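The distinction between the two drop causes could be expressed as a small policy function for a forwarding router. This is only one reading of the argument above, with hypothetical names (`DropCause`, `ShouldDropRemainingFragments`); it is not existing OT behavior.

```cpp
// Why a forwarding router gave up on a fragment (hypothetical enum).
enum class DropCause
{
    kNoAck,           // all MAC transmit attempts ended without an Ack
    kQueueManagement, // dropped by (ECN) queue management before any transmission
};

// Returns true if later fragments of the same datagram should also be dropped.
bool ShouldDropRemainingFragments(DropCause aCause)
{
    switch (aCause)
    {
    case DropCause::kQueueManagement:
        return true; // never transmitted, so the next hop cannot have it

    case DropCause::kNoAck:
        return false; // the data frame may have arrived; only the Ack may be lost
    }

    return false; // unreachable for valid enum values
}
```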

Possibly related to this concept of "Ack-disruption": I've seen in the ot-rfsim simulation platform that transmitting radios will not send for some time period after a frame transmission (SIFS/LIFS time), which gives time for the Ack to be sent back undisturbed. However, neighboring radios that received the same frame, and that know an Ack is likely to follow, don't apply any pause per the 802.15.4 spec. So they could disturb the Ack if they transmit something of their own at that time. To be further investigated!

jwhui assigned abtink Nov 14, 2023

abtink mentioned this issue Dec 4, 2023

[mesh-forwarder] drop msg if frag tx delay exceeds reassembly timeout #9684

Closed