[TW#27458] power_save example does not maintain an association #2711
I am trying to get the wifi/power_save example to work, but at boot-up it associates with my AP, loses the beacons 5 seconds later, and then cannot reconnect. I see the same behavior:
After 5 seconds the wifi disconnects due to reason 200 (lost beacons) and then reconnect attempts fail due to reason 201.
Steps to reproduce
After this it continues indefinitely with reason 201.
I did some more tests using esp-idf master. If I disable light sleep in the power_save example, the module does maintain its association:
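For anyone reading the logs, the numeric reasons are easier to follow with names attached. Below is a minimal standalone sketch, not part of the power_save example: `wifi_reason_name` is a hypothetical helper, and the values are assumed to match `wifi_err_reason_t` in `esp_wifi_types.h` (200 is the beacon timeout reported above; check the others against the header in your IDF version).

```c
/* Hypothetical helper: map a Wi-Fi disconnect reason code to a short name.
 * The numeric values are assumed to mirror wifi_err_reason_t in ESP-IDF's
 * esp_wifi_types.h; verify them against the header for your IDF version. */
const char *wifi_reason_name(int reason)
{
    switch (reason) {
    case 200: return "BEACON_TIMEOUT";    /* AP beacons no longer received */
    case 201: return "NO_AP_FOUND";       /* scan did not find the target AP */
    case 202: return "AUTH_FAIL";
    case 203: return "ASSOC_FAIL";
    case 204: return "HANDSHAKE_TIMEOUT";
    default:  return "OTHER";             /* standard 802.11 codes are not covered here */
    }
}
```

In the SYSTEM_EVENT_STA_DISCONNECTED case of the event handler, the name could be logged next to the number, for example by passing `event->event_info.disconnected.reason` to this helper inside the ESP_LOGI call.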
> diff -u /home/src/esp32/esp-idf/examples/wifi/power_save/main/power_save.c main/power_save.c
--- /home/src/esp32/esp-idf/examples/wifi/power_save/main/power_save.c 2018-11-16 17:38:21.328484065 -0800
+++ main/power_save.c 2018-11-18 10:58:44.309225040 -0800
@@ -53,7 +53,7 @@
 ip4addr_ntoa(&event->event_info.got_ip.ip_info.ip));
 break;
 case SYSTEM_EVENT_STA_DISCONNECTED:
- ESP_LOGI(TAG, "SYSTEM_EVENT_STA_DISCONNECTED");
+ ESP_LOGI(TAG, "SYSTEM_EVENT_STA_DISCONNECTED: %d", event->event_info.disconnected.reason);
 ESP_ERROR_CHECK(esp_wifi_connect());
 break;
 default:
@@ -103,7 +103,7 @@
 .max_freq_mhz = CONFIG_EXAMPLE_MAX_CPU_FREQ_MHZ,
 .min_freq_mhz = CONFIG_EXAMPLE_MIN_CPU_FREQ_MHZ,
 #if CONFIG_FREERTOS_USE_TICKLESS_IDLE
- .light_sleep_enable = true
+ //.light_sleep_enable = true
 #endif
 };
 ESP_ERROR_CHECK( esp_pm_configure(&pm_config) );
But as I was writing up this post I noticed that it crashed (several minutes after the boot):
I tried the same with v3.1.1 and it behaves the same as far as I can tell.
I let the last test run and it panic'ed a few minutes later. This is esp-idf master running the wifi/power_save example with
It connected fine to the AP and just sat there, but a few minutes later it crashed (I don't have the exact amount of time elapsed)
And as I'm writing this comment it crashed again in the same place with a slightly different backtrace and, what is worse, it now sits in an endless reboot loop:
After this it's in an infinite loop of RTCWDT_RTC_RESET.
I switched to a different AP that is open (no WPA2) and I see similar crashes:
This time the reboot worked.
Do you have an elf of a rock-solid power_save example that runs on an esp-wroom-32 module that I could try?
NB: I can be found on the arduino-esp32 gitter as @tve in case you want to ask some quick questions. I'm on USA west coast time.
I spent some time using a wifi monitor to capture packets "in the air". What I observe is that a few seconds after entering light sleep the esp32 starts sending probe requests. The AP responds with many probe responses but none of them are ACKed. The AP also keeps sending its normal beacons, so the esp32 should have no reason to send probe requests in the first place. It looks to me as if the RX section does not wake up as it should to receive the beacons or the probe responses.
Hi @tve, we have been debugging the beacon timeout issue in light sleep for more than a week and have made some progress, but we are still not sure we have found the root cause.
We need your help to check whether the beacon timeout issue is fixed on your side with the following change. rtc_sleep_init() is defined in $IDF_PATH/components/soc/esp32/rtc_sleep.c:
void rtc_sleep_init(rtc_sleep_config_t cfg)
As for the crash issue, I can't reproduce it. Could you describe the detailed steps to reproduce it, or send me your code so that I can reproduce it myself?
The result is unchanged:
I'm attaching my ELF file:
I changed the SSID to "test", no password.
OK, I now have travis builds working. I forked the esp-idf repo and only added .travis.yml, see https://github.com/tve/esp-idf/blob/master/.travis.yml
Here is my console log from downloading, flashing and running this to reproduce the issue:
Continued in the next comment...
I then created a new branch
The travis builds are (the first one is the full 280MB tree, second is just the image):
I'm still working through what I'm observing... The Adafruit Huzzah32 fails with the fix. I'm also testing a NodeMCU ESP32S and so far I can't reproduce the problem using the unmodified IDF. I'm trying to make sure it's not a power supply or similar issue...
Something else I noticed while looking at the logs of my AP, which is using hostapd. What I see is:
and then a little later:
and a little later the esp32 goes through its beacon time-out and reconnection:
and the AP reports as expected:
So the whole thing gets triggered by an association timeout on the AP side as far as I can tell. (The above was captured with the NodeMCU running the un-modified IDF).
NB: if you need some help with Travis, let me know. I don't know whether you're familiar with this type of continuous integration service.
I have been running many more tests. Unfortunately I don't know what to say... I have 3 esp32 boards:
On a NodeMCU with esp-wroom-32 module I cannot reproduce the SYSTEM_EVENT_STA_DISCONNECTED loop problem at all.
On a TTGO w/LoRa board that has an esp32 chip I cannot reproduce the problem either.
On an Adafruit Huzzah32 with esp-wroom-32 I have lots of problems. I can reproduce the SYSTEM_EVENT_STA_DISCONNECTED loop problem easily. But if I try the version with the fix, after associating it gets a Guru Meditation Error, then restarts one or two times, then goes into an infinite loop of
I tried different USB cables, ports, power supplies, LiPo. No change. I checked the strapping (expected to be correct on an Adafruit board which I'm not connecting to anything but USB). I am wondering whether there is some HW problem on this board/module. I think I will order another one to compare.
Haha, yes I can, it'll look a little odd, though. I'm also stubborn/tenacious so I kept fiddling...
First, the NodeMCU board has been running the original unmodified IDF power_save. I wanted to know whether it reproduces the issue. It sort of did:
I.e. it "recovered" on its own. This type of sequence has happened a few times over the past few hours.
About the Adafruit Huzzah32, I decided to take the shield off and touch up the solder on the flash chip. That didn't help, so I decided to replace the flash chip. But I used the wrong tools and butchered it, lifting two pads in the process. The result is an "artistic" replacement with a Winbond 25Q128 (I didn't have a 25Q32 handy). Here's a photo of that, and please don't laugh too hard :-) :-) You can make out the shield and the module PCB markings.
Now the interesting part: I flashed the image with the fix and after ~10 minutes it has not shown any signs of problems. So it seems likely that the problem was with the flash chip (a Gigadevice).
I'm planning to let it run for another half hour and then flash the unmodified version to see whether the problem reproduces then. I will report back...
Update: guess what, as soon as I posted the comment it went through some disconnects, which look OK to me (assuming disconnects should happen at all):
Again, this is the Huzzah32 with the TW27458 1-line fix
The Huzzah32 ran for an hour with the fix and while it showed individual disconnections they immediately recovered.
Hi @tve, do you still see the wifi disconnect problem with the light sleep example?
Try adding this line to the function rtc_sleep_init() in the file rtc_sleep.c:
void rtc_sleep_init(rtc_sleep_config_t cfg)
added a commit
Dec 10, 2018
Phone is Samsung Galaxy S6 running Android 5.0.1 (sigh!).
I can also reproduce the problem running an access point using an ODROID C1+ with Ubuntu 16.04 kernel 3.10, hostapd v0.8.x_rtw_r7475.20130812_beta (that has RealTek drivers), and a USB-Wifi "Edimax Technology Co., Ltd EW-7811Un 802.11n Wireless Adapter [Realtek RTL8188CUS]". The command-line I use is:
I can see whether it's reproducible on an rPi-0w, if that's easy for you to try?