DHT measure() locks up (core0) when _thread is running (core1) if WIZNET5K networking is enabled #10448
I did some more testing. My current theory is that if you do not use core 1, the WIZnet driver will use core 1, and if you do use it, the driver will use core 0. I have not been able to get a Pico W to crash so far.
|
I agree that the current 1.20 WIZNET5K network module is NOT Running literally anything on the second core - even something incredibly trivial - will usually result in abnormal behaviour either during or shortly after the network has been initialised. Sometimes it may work normally for a while, which is what makes narrowing this down to the network module so much more time consuming. So long as you stick to a single core, the W5500-EVB-PICO is a very nice board - just don't try to use the second RP2040 core! |
My success varies with what is being done on core 0. Using core 1 works fine as long as you do not use something that blocks interrupts on core 0, so I have the complex stuff on core 1 and the simple stuff (polling) on core 0. I have had it crash, or appear to crash, but it was more like it froze than crashed, and where it stopped processing made logical sense as to what happened. I suspect some EMI flipped a bit in memory and weird stuff happened, as this is the most plausible explanation I can think of. I have also had a request return an error. It is still annoying that, the way I have it set up, using the network hogs both cores: I call the request on core 1, and then core 0 does the work of getting the data for core 1 to process. The Pico W is definitely easier to work with in this regard, and you do not give up a bunch of GPIO pins (I have one W5500 Pico that ran out of GPIO pins). Now if I want to add stuff I need to use a multiplexer IC.
I have a main script that is running on core0 which handles a GPIO digital input (i.e. a button; actually it's a relay output, but to all intents and purposes it's a button!) via IRQ and processes that button "press" (outside of the IRQ) in an infinite I then added a second thread using For me, starting a trivial second thread (that does almost nothing) before initialising the WIZNET5K network is sufficient for the RP2040 to hang completely. I now avoid At the very least though, a warning about this incompatibility with the WIZNET5K network library and
I am also affected by this on a W5500-EVB-Pico board running micropython v1.20.0. Here is some code that gets my board to freeze reliably within a minute when sending packets to the NIC from a second machine. On core0 it receives data from a UDP socket and prints it, then writes some data to a NeoPixel LED strip.

from machine import Pin
import network
import socket
from time import sleep
import _thread
from neopixel import NeoPixel

LED_PIN = Pin(25, Pin.OUT)
PIXEL_PIN = Pin(3, Pin.OUT)


def initialize_nic():
    print("initializing NIC")
    nic = network.WIZNET5K()
    nic.active(True)
    print("waiting for connection to come up")
    # Poll for up to 10 seconds; the for/else returns None on timeout.
    for i in range(0, 10):
        if nic.isconnected():
            break
        print(f"waited {i} seconds for nic to come up")
        print(nic.isconnected())
        sleep(1)
    else:
        return None
    print("network is up, address is:")
    print(nic.ifconfig())
    return nic


def second_thread():
    # Runs on core1: blink the on-board LED forever.
    while True:
        LED_PIN.on()
        sleep(1)
        LED_PIN.off()
        sleep(1)


def handle_packet(message, pixels):
    print(message)
    for i in range(10):
        pixels[i] = (0, 255, 0)
    pixels.write()


def main():
    pixels = NeoPixel(PIXEL_PIN, 10)
    for i in range(10):
        pixels[i] = (255, 0, 0)
    pixels.write()
    nic = initialize_nic()
    if nic is None:
        print("failed to initialize network")
        return
    core_1 = _thread.start_new_thread(second_thread, ())
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(1.0)
    sock.bind(("", 6454))
    # Runs on core0: receive UDP datagrams and update the LED strip.
    while True:
        try:
            message, address = sock.recvfrom(1024)
            print(f"received a message from {address}: { message }")
            handle_packet(message, pixels)
        except OSError:
            print("Timeout receiving from the socket")
            print(nic.ifconfig())
            print(f"connected: { nic.isconnected() }")
            sleep(1)


if __name__ == "__main__":
    main()

Replacing the neopixel code in the The neopixel code itself also works just fine: when I don't start the second thread, everything works as intended. (Note that my code does not depend on the input of other devices, except for sending UDP datagrams to the Pico. You can run it without the LEDs connected, so this is the most minimal example code I could come up with that reproduces the issue and does not depend on other hardware.)
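The example above needs UDP traffic from a second machine to trigger the freeze. Here is a minimal sender sketch for desktop Python 3, not part of the original report. The function name `send_test_datagrams`, the payload, the send rate and the address `192.168.1.50` are all assumptions for illustration; only port 6454 comes from the code above.

```python
import socket
import time


def send_test_datagrams(host, port=6454, count=100, interval_s=0.05, payload=b"ping"):
    """Send `count` UDP datagrams to (host, port); return how many were sent."""
    sent = 0
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for _ in range(count):
            sock.sendto(payload, (host, port))
            sent += 1
            time.sleep(interval_s)  # crude rate limit between datagrams
    finally:
        sock.close()
    return sent


# Hypothetical usage; replace the address with what nic.ifconfig() prints:
# send_test_datagrams("192.168.1.50")
```

Raising `count` or lowering `interval_s` should make the test more aggressive, if the freeze depends on packet rate.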
Here's another example. Perhaps I misremembered, but it seems that it DOES require an active IRQ callback in order to trigger the crash, at least for me (pretty sure it was crashing during In the test below, it configures the network with DHCP, then rapidly loops around blinking the LED and checking the The second thread is just spinning around, doing practically nothing. It can take anywhere between 1 and a couple of dozen button presses, but eventually the RP2040 WILL lock up hard. No additional network IO is required beyond the initial network configuration. Disable the Note that by default CORE0 output is written to the UART while CORE1 output goes to
|
Quick follow-up: I have code that uses the WIZNET5K driver, PIO and DMA on an RP2040 but without multithreading, and that also locks up randomly.
Using the micropython build? Thus far I have had much better luck using the v2.0 build from WIZnet's repo: https://github.com/Wiznet/RP2040-HAT-MicroPython/releases This happens with the code I posted here as well, but random lockups are far less frequent. I have no idea why, or why I am getting them in the first place, but with the micropython build it locks up about every day, compared to every month or two.
|
I wonder if the Waveshare RP2040-ETH will be more reliable and capable of using both cores. For my use case it would be a more-or-less drop-in replacement for the Wiznet W5500-EVB-Pico (I'd just need to switch from USB-Micro power to USB-C, and the smaller size of the Waveshare might even be an advantage). I'm starting to get the feeling the Wiznet5K module isn't going to get any better, so moving on to something else might be a better long-term option than flogging this dead horse.
I have a project using a Pico W, and it has not had a single issue (that was not my fault). The only thing core 1 does is directly control the 7-segment displays in software. If I end up replacing my W5500 boards I am just going to use a Pico W. I like the idea of wired more than Wi-Fi, but the Pico not crashing takes priority. At this time I do not know if the crashes I have been having are from EMI causing memory corruption. It takes a long time to debug an unknown point of failure when you only get a single boolean value every month or so (I ran out of I/O pins).
The W5500 has been totally fine for me aside from these second core/multithreading issues that were a pain to debug and are totally undocumented (this issue is possibly the only "documentation" that exists!) To be honest I don't really need the second core for my current project - which is just as well! - but it's nice to have different hardware options so I may pick up the Waveshare ETH board next time I'm ordering and see if it works any better than the W5500-EVB-Pico. I'm not expecting it to be a packet shifting monster - and the W5500 certainly isn't that! - but I only require occasional and very limited network IO so it should be fine. For reference, a single HTTPS GET request over the LAN takes between 5 and 6 seconds with the W5500-EVB-Pico, while the same query implemented as a raw TCP socket request completes in 110ms. TLS on the Pico (or maybe just the W5500?) seems to be a bit of an issue, so my project will use only the socket requests - far too much latency with the HTTPS requests! Assuming there are no dual core issues with the Waveshare RP2040-ETH I would certainly use it in any future projects in favour of the W5500, or any future Wiznet products assuming this issue with Wiznet5K is never fixed, just in case it became necessary to use the second core. |
Please let me know if you run into any issues with it. I have not noticed network times like 5-6 seconds with HTTP GET requests (no need for HTTPS on my local network). I finally have both of my W5500 units deployed; we will see if this unit acts up over the next 3 months. I managed to allocate every pin on both for use, and I still need to add the 2 fan plugs and an LED circuit to one. If nothing goes wrong on this second deployment I am going to blame my old network switch (I did replace a capacitor on the power board, so maybe another is faulty and causing intermittent issues). Again, if you have a W5500 board and are getting fed up with it crashing or locking up, try the v2.0 build here; this is what I am now running on both of my W5500s. Maybe I will try the daily micropython build in a few months. I want to see if this thing will crash (hopefully I do not lose power over the next couple of weeks, with thunderstorms every day that my 3 UPS units can't deal with). I have 2 Pico W and 2 W5500-EVB-Pico controllers deployed, and I have had one of each act up in what may be the same way, which is why I suspect my switch. The other Pico W has been running for probably 6 months or more without issue (aside from needing an update for connecting to Wi-Fi, because the Pico boots faster than the router).
Of course, will do - although not sure when I'll be ordering next, it could be some time.
Yeah HTTP is fine, but HTTPS is really quite bad (way too much latency for my needs, which is 1.5 seconds maximum response time). I'm seeing 5-6 seconds for a trivially simple HTTPS GET on my internal-only development LAN, but in "production" the HTTPS-only web server will be internet-facing, so likely even worse performance. Fortunately the production web server will be on the same LAN as the W5500(s) - potentially 9 of them all on the same LAN - and so in the end it was easier to spin up a LAN-only facing socket server that listens for the W5500 requests and proxies those requests through to the HTTPS web server and back to the W5500 in less than 110ms. A bit of a hack, but actually works fine. And for various reasons the socket server also turned out to be much easier to implement than creating and configuring an HTTP server... 😀 By the way I'm using the official "v1.20.0 (2023-04-26) .uf2" release for W5500. |
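For readers curious what such a LAN-only socket proxy might look like, here is a rough sketch for desktop Python 3. It is not the poster's implementation. The one-line request protocol, the names `make_handler`, `fetch_https` and `serve`, the port 8080 and the `UPSTREAM` URL are all assumptions made up for illustration.

```python
import socketserver
import urllib.request

UPSTREAM = "https://example.invalid"  # hypothetical HTTPS web server


def fetch_https(path):
    """Fetch UPSTREAM + path over HTTPS and return the response body."""
    with urllib.request.urlopen(UPSTREAM + path, timeout=5) as resp:
        return resp.read()


def make_handler(fetch):
    """Build a request handler that answers each client using `fetch`."""

    class ProxyHandler(socketserver.StreamRequestHandler):
        def handle(self):
            # Assumed protocol: the client sends one line holding the path.
            path = self.rfile.readline().decode().strip()
            try:
                body = fetch(path)
            except Exception as exc:  # report upstream failures to the client
                body = b"ERROR " + str(exc).encode()
            self.wfile.write(body)  # connection closes when handle() returns

    return ProxyHandler


def serve(host="0.0.0.0", port=8080, fetch=fetch_https):
    """Serve forever, one thread per client connection."""
    with socketserver.ThreadingTCPServer((host, port), make_handler(fetch)) as srv:
        srv.serve_forever()
```

With this design the board only opens a plain TCP socket, sends the path and reads until the connection closes; the TLS work happens on the proxy machine.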
In the interest of testing, try running a second thread that sleeps for some time and then runs gc.collect(); it should not be hard to work into your single-threaded application. I was trying to debug some code to see how I was able to get a lock-up using the 1.2 build (I get no crash errors in my code), but when I tried to make test code I managed to get a memory error in a very short time (less than 10 loops).
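A second thread that sleeps and then calls gc.collect() could be sketched roughly as follows. This is an illustration, not the code used in the test described above. The names `gc_worker` and `start_gc_thread`, the interval and the bounded round count are assumptions; a real test would probably loop forever.

```python
import _thread
import gc
import time


def gc_worker(interval_s, rounds, done):
    # Sleep, collect, repeat. Bounded so that the sketch terminates.
    for _ in range(rounds):
        time.sleep(interval_s)
        gc.collect()
    done.append(True)  # signal completion to the starting thread


def start_gc_thread(interval_s=5, rounds=10):
    """Start the worker on a second thread; return the list it appends to."""
    done = []
    _thread.start_new_thread(gc_worker, (interval_s, rounds, done))
    return done
```

The main thread can carry on with its normal work and check `done` occasionally to see whether the worker has finished.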
i'm gonna try running |
Not a memory issue, be sure you I think the only way you are going to safely read a sensor and have a W5[1/5]00 NIC is to do this dirty hack, which adds ~414ms of overhead to each sensor reading
if you are reading multiple sensors you could do something like this and save some time overall
|
So much for that: it still locks up with my actual code, which reads a few sensors and sends the POST data to the server. Then I press a button a couple of times (1 to 5 times), and each press makes a GET request. At this point I suspect that if you so much as have a pin configured for DHT, you are going to crash with the network loaded.
MilhouseVH i think i figured something out try running |
I have done even more testing. Even if you disable the NIC before using the second core and then enable it after you are done with the second core, it will crash. So my guess (not even sure this is a thing) is that a clock de-sync is to blame. At this time I have the
firmware file name: v1.19.1-782-g699477d12 (2022-12-20) .uf2
Board: https://micropython.org/download/W5500_EVB_PICO/
I have 4 DHT22 sensors wired up using 3 pins. They all report numbers just fine when _thread is not in use. At first I figured I could not access the GPIO at the same instant from both cores, so I made a lock to stop core1 from accessing the GPIO during a measure. This did nothing. If I comment out my newThread it seems to run till the cows come home.
this seems to be all it takes to break it (usually happens within 4 loops w/ 4 sensors):
no errors show up in Thonny, it just hangs
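The lock mentioned in this report, guarding the measurement so that core1 stays off the GPIO, could look roughly like the sketch below. The original code is not shown, so the names `gpio_lock` and `locked_measure` are assumptions; the `sensor` argument is expected to behave like a `dht.DHT22` object (`measure()`, `temperature()`, `humidity()`). As stated above, this approach did not prevent the hang.

```python
import _thread

# Shared lock: any code touching the GPIO is expected to hold it.
gpio_lock = _thread.allocate_lock()


def locked_measure(sensor):
    """Run sensor.measure() under gpio_lock and return (temperature, humidity)."""
    with gpio_lock:
        sensor.measure()
    return sensor.temperature(), sensor.humidity()
```

The thread running on core1 would wrap its own GPIO access in the same `with gpio_lock:` block.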