TCP hangs when sending #62

pothos · 2017-10-25T03:48:03Z

Hello,
I wanted to debug this myself, but I need some help because the problem disappears, e.g. when writing to pcap or when extensive logging is enabled.

The smoltcp server thread on tap0 sends in 64-byte steps; a Linux socket client thread reads only about 292800 bytes instead of all 10000000 bytes. From that point on, smoltcp only sends keep-alives. The minimal example is here:

https://github.com/pothos/taptest

cargo run --release should reproduce the behavior.

RUST_LOG=trace cargo run --release sometimes lets it complete (as does adding print statements to the sending server).
Uncommenting the pcap writer lets it complete reliably.

I have seen it with other values as well, but I hope that this combination also reproduces it on other machines.

Best regards,
Kai

whitequark · 2017-10-25T23:19:03Z

Oh yeah, you hit a bug: the retransmit timer gets reset by ACKs, whereas it should not be.

Specifically, by duplicate ACKs that arrive when the window is zero length, and pass through the duplicate ACK checks because they count as keepalives.
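To illustrate the distinction described above, here is a minimal sketch of a retransmit timer that is restarted only by ACKs that acknowledge new data. It is not smoltcp's actual code: the type `RetransmitTimer`, its fields, and its method names are hypothetical, and the timeout is fixed (no backoff) to keep the example short.

```rust
/// Hypothetical retransmit timer; names are illustrative, not smoltcp's API.
#[derive(Debug, PartialEq)]
pub struct RetransmitTimer {
    /// Absolute deadline in milliseconds, or None when nothing is in flight.
    pub deadline_ms: Option<u64>,
    /// Fixed retransmission timeout in milliseconds (no backoff in this sketch).
    pub rto_ms: u64,
}

impl RetransmitTimer {
    pub fn new(rto_ms: u64) -> Self {
        RetransmitTimer { deadline_ms: None, rto_ms }
    }

    /// Arm the timer when data is sent, unless it is already running.
    pub fn on_send(&mut self, now_ms: u64) {
        if self.deadline_ms.is_none() {
            self.deadline_ms = Some(now_ms + self.rto_ms);
        }
    }

    /// Handle an incoming ACK.
    /// `acked_bytes` is the number of newly acknowledged bytes;
    /// `in_flight` is the number of bytes still unacknowledged afterwards.
    pub fn on_ack(&mut self, now_ms: u64, acked_bytes: usize, in_flight: usize) {
        if acked_bytes == 0 {
            // Duplicate ACK (or keep-alive reply): leave the deadline untouched.
            return;
        }
        // New data was acknowledged: restart the timer if data remains in
        // flight, otherwise stop it.
        self.deadline_ms = if in_flight > 0 {
            Some(now_ms + self.rto_ms)
        } else {
            None
        };
    }

    /// True once the deadline has passed and a retransmission is due.
    pub fn should_retransmit(&self, now_ms: u64) -> bool {
        matches!(self.deadline_ms, Some(deadline) if now_ms >= deadline)
    }
}
```

With this shape, a stream of duplicate ACKs from a peer whose window is zero cannot keep pushing the deadline forward, so a lost segment is eventually retransmitted.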

pothos · 2017-10-26T16:16:14Z

Thanks for looking at it. Now I have to read the RFCs alongside the code to find out how the timer is supposed to be set and kept.

pothos · 2017-10-28T07:50:26Z

OK, finally got it working. Besides keeping the retransmission timer from being reset, there was an overflow that needed to be fixed. Preparing a PR now.

This would produce results near usize::MAX, which is indicative of a bug. A panic is always used instead of a debug_assert!() because debug builds are easily slow enough that the underlying bugs are not tripped. Related to #62.
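The following sketch shows the general idea of panicking on an unsigned underflow in every build profile, rather than only in debug builds. The helper `offset_distance` is hypothetical and is not taken from the commit; it only illustrates why an unchecked subtraction in a release build would silently yield a value near usize::MAX.

```rust
/// Hypothetical helper: number of bytes between two offsets.
/// Panics in both debug and release builds if `start` is past `end`,
/// instead of silently wrapping around to a value near usize::MAX.
pub fn offset_distance(end: usize, start: usize) -> usize {
    end.checked_sub(start)
        .expect("offset underflow: start is past end")
}

/// What an unchecked subtraction yields in a release build, where
/// overflow checks are off by default: the result wraps around.
pub fn wrapped_distance(end: usize, start: usize) -> usize {
    end.wrapping_sub(start)
}
```

For example, `wrapped_distance(4, 10)` returns `usize::MAX - 5`, whereas `offset_distance(4, 10)` panics and points directly at the faulty call.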

whitequark · 2017-12-21T12:36:06Z

@pothos I'm finally making progress on this—see d1e2292 for one method I'm using to expose underlying bugs. I am afraid that your fix is hiding one of them.

whitequark · 2017-12-22T09:43:34Z

@pothos I've integrated a test quite like your taptest in the commit 44db954. I've managed to expose one other bug as well.

whitequark · 2017-12-22T20:23:10Z

Fixed in b247f64.

pothos · 2018-01-02T02:32:57Z

Thanks for including the fix - I am happy that the wrapping can be covered without distinction of cases ;)
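As an illustration of handling wrap-around without a case distinction, here is a sketch that assumes the wrapping in question is that of 32-bit TCP sequence numbers. The function names `seq_diff` and `seq_lt` are hypothetical, and this is the general modular-arithmetic technique rather than a reproduction of commit b247f64.

```rust
/// Signed distance from sequence number `b` to `a`, modulo 2^32.
/// Valid as long as the two numbers are less than 2^31 apart.
pub fn seq_diff(a: u32, b: u32) -> i32 {
    // The wrapping subtraction handles the 2^32 boundary; reinterpreting
    // the result as i32 gives the sign without any branching.
    a.wrapping_sub(b) as i32
}

/// True if `a` comes before `b` in sequence space.
pub fn seq_lt(a: u32, b: u32) -> bool {
    seq_diff(a, b) < 0
}
```

Here `seq_diff(5, 0xFFFF_FFFB)` is 10 even though the counter wrapped between the two values, so callers never need a separate branch for the wrapped case.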

whitequark added the kind/question label Oct 25, 2017

whitequark added kind/bug and removed kind/question labels Oct 25, 2017

pothos mentioned this issue Oct 28, 2017

Fix retransmission and send offset overflow #65

Closed

whitequark closed this as completed Dec 22, 2017

whitequark mentioned this issue Dec 27, 2017

moninj: frequent updates to TTL input value break TCP connection m-labs/artiq#874

Closed
