Network stack never recovers without a hard reset after receiving closely-spaced larger UDP packets (IDFGH-1363) #3646

Closed
ssilverman opened this issue Jun 17, 2019 · 16 comments

Comments

@ssilverman

Something happened with the network stack where it now becomes completely unresponsive if large UDP packets are received. Is this maybe related to fragmentation?

See this bug in the Arduino core: espressif/arduino-esp32#2899
It has example code and more details for reproducing the issue.

@github-actions github-actions bot changed the title Network stack never recovers without a hard reset after receiving closely-spaced larger UDP packets Network stack never recovers without a hard reset after receiving closely-spaced larger UDP packets (IDFGH-1363) Jun 17, 2019
@negativekelvin
Contributor

I don't think this problem happens with iperf example, have you tried it?

@ssilverman
Author

ssilverman commented Jun 18, 2019

I have not, but I'm 100% certain that the network stack becomes unresponsive without a hard reset after a few fragmented packets. Did you try the Arduino example I linked to (espressif/arduino-esp32#2899)?

I'll look at trying the iperf example now... Forgive me for asking: Do you have a link?
[Update: nm, I found it.]

[Update 2]: Here's a useful Bash script to test sending larger packets:

while true; do echo -n $(printf '.%.0s' {1..1400}) > /dev/udp/192.168.1.9/8000; sleep 0.05; done
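For anyone without Bash's /dev/udp support, a rough Python equivalent of the loop above might look like the sketch below. The function name `send_udp_burst` and its parameters are made up for illustration; the address and port in the usage line are the ones from the script. Each `sendto` call hands the whole payload to the kernel as one UDP datagram, so with this sender any splitting seen on the receiver is not caused by the sender's own buffering.

```python
import socket
import time


def send_udp_burst(host, port, payload_size=1400, interval_s=0.05, count=100):
    """Send `count` UDP datagrams of `payload_size` bytes, pausing `interval_s` between them.

    Returns the number of datagrams handed to the kernel.
    (Hypothetical helper for reproducing the issue, not part of any library.)
    """
    payload = b"." * payload_size  # same filler character as the Bash one-liner
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            # One sendto() call produces exactly one UDP datagram.
            sock.sendto(payload, (host, port))
            sent += 1
            time.sleep(interval_s)
    return sent
```

Calling `send_udp_burst("192.168.1.9", 8000)` sends 100 packets of 1400 bytes at roughly 20 per second; lower `interval_s` to raise the packet rate.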

@ssilverman
Author

ssilverman commented Jun 18, 2019

I have a follow-on question that may lead to a workaround and a “good enough” solution for me: Is there a way to tell the IP stack not to accept packets larger than a certain size? Then, if the problem has to do with packet reassembly, it can be avoided.

Another detail I didn’t note before: when a packet larger than 1024 bytes is sent, it arrives as two packets rather than being reassembled. For example, if I send a 1025-byte packet, I receive one 1024-byte packet and one 1-byte packet. Is this a clue? Is it correct that non-reassembled packets show up as two packets?

@negativekelvin
Contributor

@ssilverman
Author

ssilverman commented Jun 18, 2019

Sorry, I should have been clearer: when there’s fragmentation, there isn’t a way to tell (at least from AsyncUDP’s onPacket call) that the packet is part of a fragmented packet. I was, probably erroneously, assuming that receiving a split packet was indicative of an error, especially if there isn't a way to tell the data is part of a larger whole.

@r1dd1ck

r1dd1ck commented Jun 18, 2019

As I've already mentioned in the former issue thread espressif/arduino-esp32#2871 (comment) -

Packet size is not directly the cause, and neither is UDP nor packet handling in the onPacket callback. The only thing that seems to matter is packet cadence, i.e. how many packets are received over a given time period. The network/wifi stack freezes solid once a certain rate is exceeded.

Or in other words - Denial of service (DoS) without even trying 😕

The problem with larger packets @ssilverman is experiencing is most probably a side-effect of packet "fragmentation", because this obviously raises the packet cadence by a factor of 2 or more. The 1024-byte breakpoint suggests that the sending application is using a 1024-byte payload buffer (i.e. payloads larger than that get sent out in multiple packets).
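To make the arithmetic behind this hypothesis concrete, here is a small Python sketch. It assumes, as suggested above, that the sender flushes each payload through a fixed 1024-byte buffer. The function names `split_into_writes` and `effective_packet_rate` are invented for illustration and are not part of any library.

```python
def split_into_writes(payload_len, buffer_size=1024):
    """Return the chunk sizes produced when a payload of `payload_len` bytes
    is flushed through a buffer of `buffer_size` bytes (assumed sender behavior)."""
    if payload_len < 0 or buffer_size <= 0:
        raise ValueError("payload_len must be >= 0 and buffer_size must be > 0")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(remaining, buffer_size)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


def effective_packet_rate(payload_len, sends_per_second, buffer_size=1024):
    """Packets per second reaching the receiver if every send is split as above."""
    return len(split_into_writes(payload_len, buffer_size)) * sends_per_second
```

Under that assumption, `split_into_writes(1025)` gives `[1024, 1]`, matching the 1024 + 1 split reported earlier in the thread, and 20 sends per second of 1400 bytes each would reach the ESP32 as 40 packets per second.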

That said, earlier I was able to push 100+ packets/sec without any problems (TCP, 10-byte payload), and now I can't even get to 30/sec without having to deal with stability issues (network/wifi stack choking & freezing up at some point).

Currently this is happening on latest master of the arduino-esp32 core. I haven't had time to check at which point it got broken @ IDF, but I guess it must have been sometime between Aug/2018 & Jun/2019, because I did not update the core during that time span (updated just recently).

💩

@negativekelvin
Contributor

iperf example can handle large bandwidth so you may want to look at Arduino sdkconfig options or library code

@r1dd1ck

r1dd1ck commented Jun 18, 2019

@negativekelvin
So you are of the opinion that some arduino library or SDK config parameter bricking the whole network stack is OK? -__-

I would be quite surprised if the culprit was not somewhere more upstream (i.e. ESP-IDF). The arduino core is essentially just a direct copy of this repo, with pre-compiled pieces and some added "ease of use" functionality.

@r1dd1ck

r1dd1ck commented Jun 18, 2019

Update:
So I have compiled it with the iperf example sdkconfig defaults, and sure enough, it has now been running stably for ~1 hour, even at 100 packets/sec.

I guess that the arduino initiative was trying to save some memory, and as a result some of the buffers were set to values 10+ times lower than in the iperf example 😐

But the point is that it should not be possible to brick the WiFi stack so easily, or at all. This should certainly be looked at, and the underlying cause fixed.

@ssilverman
Author

ssilverman commented Jun 19, 2019

@r1dd1ck which file did you replace when compiling arduino-esp32? The one in tools/sdk/? Forgive me, I'm new to building this...

@r1dd1ck

r1dd1ck commented Jun 19, 2019

@ssilverman
Actually, you don't need to re-compile the arduino-esp32 core to change the buffer values.

All you need to do is:

  • Edit \tools\sdk\include\config\sdkconfig.h and adjust the values to match sdkconfig.defaults from the iperf example folder.
  • Force a full rebuild when compiling your project (not sure how to do this directly from the UI, but changing the flash frequency or upload speed is usually enough to force it).
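To find which values in sdkconfig.h actually differ from the iperf defaults before editing by hand, a sketch along the following lines may help. It assumes the header uses `#define CONFIG_NAME value` lines and the defaults file uses `CONFIG_NAME=value` lines, with Kconfig booleans written as `y` appearing as `1` in the header. All function names here are hypothetical helpers, not part of ESP-IDF or the Arduino core.

```python
import re

_DEFINE_RE = re.compile(r"^\s*#define\s+(CONFIG_\w+)\s+(.+?)\s*$")
_ASSIGN_RE = re.compile(r"^\s*(CONFIG_\w+)=(.*?)\s*$")


def parse_sdkconfig_h(text):
    """Map CONFIG_* names to raw values from `#define NAME value` lines."""
    values = {}
    for line in text.splitlines():
        m = _DEFINE_RE.match(line)
        if m:
            values[m.group(1)] = m.group(2)
    return values


def parse_sdkconfig_defaults(text):
    """Map CONFIG_* names to values from `NAME=value` lines.

    A boolean `y` is normalized to `1` so it compares equal to the header form.
    Comment lines such as `# CONFIG_X is not set` are ignored.
    """
    values = {}
    for line in text.splitlines():
        m = _ASSIGN_RE.match(line)
        if m:
            value = m.group(2)
            values[m.group(1)] = "1" if value == "y" else value
    return values


def settings_to_change(header_text, defaults_text):
    """Return {name: (value_in_header_or_None, value_from_defaults)} for each mismatch."""
    current = parse_sdkconfig_h(header_text)
    wanted = parse_sdkconfig_defaults(defaults_text)
    return {
        name: (current.get(name), value)
        for name, value in wanted.items()
        if current.get(name) != value
    }
```

Read both files with `open(path).read()` and pass the contents to `settings_to_change`. Each entry in the result is a setting to adjust in sdkconfig.h, and a `None` on the left means the header does not define it at all.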

This applies to the arduino environment. I have no experience with PiO, so can't help you there.

You should see a 13-14kB drop in free heap (as compared to arduino-esp32 defaults). If you don't see this drop, then the changes are not active, and you need to check what you did wrong 💩

On a side note: The buffer values from the iperf config are rather extreme, so later you should probably have a go at it and adjust them to best match your specific application needs.

@ssilverman
Author

Thanks! I’ll try these suggestions.

@negativekelvin
Contributor

Suggest you change the issue to this:

WiFi stack can freeze if CONFIG_ESP32_WIFI_RX_BA_WIN > CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM.

Tested on esp-idf v3.2

This configuration should be prevented or checked and an error thrown.
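Until such a check exists upstream, the condition can be tested on the host before building. The following is only a sketch: it assumes both settings appear in sdkconfig.h as plain integer `#define` lines, and `check_rx_window` is a hypothetical helper name, not an ESP-IDF API.

```python
import re

RX_BA_WIN = "CONFIG_ESP32_WIFI_RX_BA_WIN"
STATIC_RX_BUFFER_NUM = "CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM"

_DEFINE_INT_RE = re.compile(r"^\s*#define\s+(CONFIG_\w+)\s+(\d+)\s*$")


def check_rx_window(header_text):
    """Raise ValueError if RX_BA_WIN exceeds STATIC_RX_BUFFER_NUM.

    `header_text` is the content of an sdkconfig.h file. Returns the pair
    (rx_ba_win, static_rx_buffer_num) when the configuration passes, or None
    when either setting is missing and nothing can be checked.
    """
    ints = {}
    for line in header_text.splitlines():
        m = _DEFINE_INT_RE.match(line)
        if m:
            ints[m.group(1)] = int(m.group(2))
    if RX_BA_WIN not in ints or STATIC_RX_BUFFER_NUM not in ints:
        return None
    ba_win, rx_bufs = ints[RX_BA_WIN], ints[STATIC_RX_BUFFER_NUM]
    if ba_win > rx_bufs:
        raise ValueError(
            f"{RX_BA_WIN} ({ba_win}) > {STATIC_RX_BUFFER_NUM} ({rx_bufs}): "
            "WiFi stack can freeze with this configuration"
        )
    return ba_win, rx_bufs
```

Running this against sdkconfig.h as a pre-build step turns the freeze-prone configuration described above into an immediate error instead of a runtime hang.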

@negativekelvin
Contributor

Ok, I think you can close this issue since you made the new one

@ssilverman
Author

For posterity, this issue still happens in the v1.0.3 Arduino core: espressif/arduino-esp32#3287

@mikesowter

Gentlemen, I am so grateful for this fix. My application is an NTP server which polls for UDP packets (only 48 bytes) 1e5 times per second. It lasted random times from 0-60s between hardware resets before the sdkconfig fix, and now runs indefinitely. Thanks also to espressif for their good thread discipline on this site.

5 participants