Wifi breaks RMT with WS2812B on ESP32 #713

Open
dr-froehlich opened this Issue Jan 4, 2019 · 9 comments

dr-froehlich commented Jan 4, 2019

FastLED.show() stops updating the WS2812B LEDs when Wifi is enabled.
Configuration:
4 strips connected to GPIO 5, 16, 17, 18 with 21 LEDs each, running a simple animation.
FastLED version 3.2.1
Chip is ESP32D0WDQ6 (revision 1)
arduino-esp32 release 1.0 and 1.0.1-rc2 (no change in behavior)

With no Wifi, the program runs forever without issue.

With Wifi enabled, the LED updates stall after an arbitrary time (typically a few minutes). When I use FastLEDShowESP32() as a wrapper with FastLED.show() running on core 0 (code taken verbatim from DemoReelESP32.ino), my application on core 1 continues to run, but very slowly. I assume it is slow because it waits for the timeout after every show: it never receives the notification from the last show in "ulTaskNotifyTake(pdTRUE, xMaxBlockTime);". I have also run FastLED.show() directly in the core 1 application; in that case FastLED.show() doesn't return. Wifi connectivity remains OK (before narrowing down the issue I had MQTT running and could still send messages to the ESP32; I have since removed the MQTT code for testing).
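The slowdown described above is what one would expect if the caller blocks for the full notification timeout on every frame. A back-of-the-envelope sketch, assuming a hypothetical 200 ms value for `xMaxBlockTime` and a hypothetical 10 ms of application work per loop iteration (neither number is stated in this issue):

```cpp
#include <cassert>
#include <cstdint>

// Loop iterations per second when each iteration does `workMs` of application
// work and then blocks for `waitMs` waiting for the show task's notification.
// Both arguments are illustrative inputs, not measurements from this issue.
constexpr uint32_t loopsPerSecond(uint32_t workMs, uint32_t waitMs) {
    return (workMs + waitMs) == 0 ? 0u : 1000u / (workMs + waitMs);
}
```

With these assumed numbers, a healthy show that answers within about 1 ms allows roughly 90 iterations per second, while a show that always runs into the 200 ms timeout allows only about 4.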

I could narrow down the problem to this level:

In FastLED.cpp:
void CFastLED::show(uint8_t scale) {
    // guard against showing too rapidly
    [...]
    CLEDController *pCur = CLEDController::head();
    while(pCur) {
        uint8_t d = pCur->getDither();
        if(m_nFPS < 100) { pCur->setDither(0); }
        Serial.print("s");     // "s" is the last message I get when stalling
        pCur->showLeds(scale); // this does not return
        Serial.print("d");     // "d" is no longer printed once stalled
        pCur->setDither(d);
        [...]

The four strips seem to have been partially updated: three show a correct pattern, the fourth a correct pattern but strange colors for each LED.

I don't know how to further debug nor narrow down the issue. Any help is greatly appreciated! Thank you.
Peter

Wifi code:
void connectToWifi() {
    Serial.println("Connecting to Wi-Fi...");
    WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
}

Contributor

marcmerlin commented Jan 4, 2019

@dr-froehlich you don't need the special ESP32 code anymore. Does it work OK if you remove all the special ESP32 code and just define your LED strip like you would on another CPU?

dr-froehlich commented Jan 4, 2019

Hi Marc, thanks for your quick answer. I believe the special code you refer to is creation of a task in FreeRTOS and pinning it to core 0. Actually I started out without this code, but the behavior is the same with the exception that then core 1 stalls entirely (my application). I don't think that is the issue.

I did another test with DemoReelESP32.ino taken directly from GitHub with just the Wifi code added. It runs well with one strip. As soon as I move to two strips, I can get it to stall after 1 to 30 minutes. I played around with the update frequency as well. Above 300 to 400 I run into issues, including an immediate reboot. At 120 it seems to run fine for quite a while, until it stalls. So I don't think it's a direct load issue, although a higher frequency might make it fail sooner.
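For context on the update frequencies mentioned above, the wire time of a WS2812B strip puts a hard ceiling on the refresh rate. A rough sketch using nominal datasheet timing (1.25 µs per bit, 24 bits per LED, at least 50 µs reset gap) and the 21-LED strips from the first post. Whether the driver clocks strips out one after another or in parallel is not stated in this thread, so the sequential case below is an assumed worst case, and none of this explains the stall itself:

```cpp
#include <cassert>
#include <cstdint>

// Nominal WS2812B timing; newer chip revisions may need a longer reset gap.
constexpr uint64_t kBitNs      = 1250;   // one bit period: 1.25 us
constexpr uint64_t kBitsPerLed = 24;     // 8 bits per color channel
constexpr uint64_t kResetNs    = 50000;  // latch/reset gap: at least 50 us

// Minimum time to clock out one strip of `numLeds` pixels, in nanoseconds.
constexpr uint64_t stripFrameNs(uint64_t numLeds) {
    return numLeds * kBitsPerLed * kBitNs + kResetNs;
}

// Upper bound on the refresh rate when `strips` equal-length strips are sent
// one after another (assumed worst case; with parallel output the bound is
// that of a single strip).
constexpr uint64_t maxRefreshHz(uint64_t numLeds, uint64_t strips) {
    return strips == 0 ? 0 : 1000000000ull / (stripFrameNs(numLeds) * strips);
}
```

Under these assumptions one 21-LED strip needs 680 µs per frame (at most about 1470 Hz), and four such strips sent sequentially need 2.72 ms (at most about 367 Hz), which is in the same range as the frequencies where the reboots were observed.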

I don't know how to trace any further from FastLED.cpp (see above), as the controller implementation code involved is beyond my abilities.
Thanks again for any help
Peter

dr-froehlich commented Jan 4, 2019

OK, as suggested, I moved FastLED.show() back to core 1. For now, this seems to be working with Wifi and AsyncMqttClient enabled. When I first tested that, I had a larger application running, including a webserver, so maybe there was another issue overlaying this one.

So for now, it looks to me like several RMT lines run by FastLED.show() in core 0 are not compatible with Wifi running in core 0. If the system now runs stable on core 1 for a few hours, I'll close the issue.

Thanks Marc, for putting me on the right track (I keep my fingers crossed :-) )
Peter

Contributor

marcmerlin commented Jan 5, 2019

Glad it seems to be working better. This has been changing a bit, as it seems to be less an exact science and more trial and error (sadly). @samguyer wrote that driver, and note that his tree is not identical to the FastLED tree. If FastLED master isn't working well, try his tree (github.com/samguyer).
Also, as per another bug, it seems that FastLED needs to remove DemoReelESP32, which is not necessary anymore and may actually be making things worse.

Contributor

samguyer commented Jan 5, 2019

@marcmerlin I need to spend some time digging through the WiFi code to understand why it interacts with the RMT-based FastLED driver. My guess is that the WiFi code does not expect to be sharing the CPU core with any other code, so it does not yield frequently enough.

Maybe we should just remove DemoReelESP32 and NOT encourage people to run FastLED on the other core.

marcmerlin added a commit to marcmerlin/FastLED that referenced this issue Jan 5, 2019

Delete delete confusing demo.
- standard demoreeel works on ESP32 now
- FastLED#711 says this demo worked
  badly.
- FastLED#713 says it's incompatible
  with Wifi.
Contributor

marcmerlin commented Jan 5, 2019

@samguyer didn't you later move this core-pinning code inside the FastLED driver?
#711 is another example of a user having issues with DemoReelESP32, which went away when he just used the driver and let it do the task pinning on its own. I didn't check how the driver does it differently, just that whatever is in the driver works better than doing the pinning outside of it.
I just submitted #714 to delete the demo if you agree.

dr-froehlich commented Jan 6, 2019

@samguyer: Thank you for solving the RMT access for us! Thank you to all for contributing your efforts and expertise to the community!

I'd like to wrap up my further testing: After moving FastLED.show() back to the main application on core 1, I had no issues with Wifi activated, but only tested for a short time. Then I moved AsyncMqttClient back in. The MQTT client was connected to Mosquitto on another computer and working. Even without receiving messages, the application still stalled after 90 min. I then used the DemoReelESP32 code to move FastLED.show() to another task, but running on core 1. This ran OK for a few hours. Then I increased the update speed to 200 Hz and went to bed. The next morning it had stalled, meaning the show task would not return. The main application still ran, since it was sitting on the same core but in a separate task.

It seems to me that loading the processor with other tasks increases an otherwise extremely slim chance that the RMT code runs into some kind of deadlock when handling multiple strands. It is hard to debug because it happens so rarely.

To me the DemoReelESP32 code has been educational, and I'll keep the separate task on the same core for now. This gives me the option to add resilience by restarting the task should it ever stall.
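One way the "restart the task if it stalls" idea could be approached is a heartbeat check. The sketch below is a minimal, platform-independent illustration; the class name `StallDetector` and its methods are hypothetical and not part of FastLED or this issue. It assumes the show task records a timestamp after each completed `FastLED.show()` and a supervising task polls it, with timestamps taken from a free-running 32-bit millisecond counter such as Arduino `millis()`:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Flags a stall once no heartbeat has arrived within `timeoutMs`.
// Differences are computed in unsigned 32-bit arithmetic so the check
// stays correct when the millisecond counter wraps around.
class StallDetector {
public:
    StallDetector(uint32_t timeoutMs, uint32_t nowMs)
        : timeoutMs_(timeoutMs), lastBeatMs_(nowMs) {
        assert(timeoutMs > 0);
    }

    // Call from the show task after each show() that returned.
    void heartbeat(uint32_t nowMs) { lastBeatMs_.store(nowMs); }

    // Call from the supervising task; true means the show task looks stuck.
    bool isStalled(uint32_t nowMs) const {
        return static_cast<uint32_t>(nowMs - lastBeatMs_.load()) > timeoutMs_;
    }

private:
    const uint32_t timeoutMs_;
    std::atomic<uint32_t> lastBeatMs_;  // written and read by different tasks
};
```

What to do once `isStalled()` returns true (for example deleting and recreating the show task) is deliberately left out: whether the RMT driver recovers cleanly after its task is killed mid-transfer is not established in this thread.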

Thank you both samguyer and marcmerlin for your great support!

Peter

Unless there are further comments I'll close the issue.

RococoN8R commented Jan 6, 2019

X-WL commented Jan 7, 2019

> @samguyer didn't you later move this core-pinning code inside the FastLED driver?
> #711 is another example of a user having issues with DemoReelESP32, which went away when he just used the driver and let it do the task pinning on its own. I didn't check how the driver does it differently, just that whatever is in the driver works better than doing the pinning outside of it.
> I just submitted #714 to delete the demo if you agree.

I tried this solution in my project, which uses FastLED and the usual WebServer. Previously, the FastLED code hung within 1-2 minutes while the main thread continued to work.
Updating via FastLED.show() directly meant FastLED no longer froze right away. However, after 20 minutes the whole program stopped, with no exception reported.
