eth: pinging frdm k64f eventually leads to unresponsive ethernet device #16639
Comments
Interestingly, I have not been able to recreate the issue with the echo_client sample in 1.13. I will take a look at the code differences in the morning. Also, I suspect #3129 might be related to this issue; the symptoms sound awfully familiar. |
I tried frdm-k64f last week and basically the gPTP code was not working at all (time sync packets were not received properly). Dunno what was wrong with it, could be related to #16089. Has the HAL changed recently that could explain this bitrotting? |
@aunsbjerg I was able to reproduce this issue on my setup with the latest code. |
This driver is really broken; see also #16089 |
@aunsbjerg: Can you please provide more information:
That's effectively a flood ping. a) Did you ever run such a command against any other type of Zephyr device? (What were results?) b) What happens if you run just normal |
The number is not consistent between different runs. Some general, anecdotal observations:
I have only tried running the ping flood against a qemu target running the echo_client sample. I was not able to reproduce the issue over the course of one hour. I do not have any other development boards, so I cannot reproduce on other zephyr devices. Hope that helps. Let me know if there is anything else I can do. |
@aunsbjerg: Thanks for the detailed info, should be helpful when reproducing and investigating the issue.
Do you use the procedure described in https://docs.zephyrproject.org/latest/guides/networking/qemu_setup.html? I.e. SLIP networking in QEMU, the classical one? I find the above report strange, as QEMU has problems with UART emulation, which affects SLIP, which in turn affects networking stability. It was found that such a setup is not suitable for any load testing. That's why a newer, non-default setup with Ethernet emulation was put out (https://docs.zephyrproject.org/latest/guides/networking/qemu_eth_setup.html#networking-with-eth-qemu).
Thanks for the report and information. I guess next step would be bisecting the tree to find the point where it broke, as we definitely had it working better than failing on 185th ping (but I never used flood pings). I'm glad @agansari has got this ticket, I hope to be able to help with any confirmations or testing of the results needed (but otherwise concentrating on other tasks now). |
@agansari can you take a look at this? |
@MaureenHelm I've been debugging this issue. So far, pull request #17396 improves the behavior of the driver but does not completely fix the issue. |
@agansari I'm currently on holiday but will be back in the middle of next week, then I'll test your fix |
Short bug description: Long bug description: Solutions:
Debug code
TODO:
|
After further debugging I found that it's an MPU-related issue: the ENET device tries to access a user-space address in RAM via its uDMA. Will continue debugging on Monday. Disabling the MPU in the eth device's initialization bypasses the issue. |
@aunsbjerg The issue is related to the MPU denying the ENET's DMA access to RAM. See pull #17603 |
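To make the failure mode described above concrete, here is a toy model of a region-based MPU with per-bus-master permissions. It is only a sketch: the `Region` class, the `access_allowed` helper, the master names `"core"` and `"enet"`, and the addresses are all hypothetical, and the rule that an access passes if any matching region permits it is an assumption of the model. It is not Zephyr or NXP HAL code and does not describe what #17603 changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """One hypothetical MPU region: an inclusive address range plus rights."""
    start: int
    end: int
    # Maps a bus master name to the access kinds it may perform ("r", "w").
    rights: Dict[str, str] = field(default_factory=dict)


def access_allowed(regions: List[Region], master: str, addr: int, kind: str) -> bool:
    """Model assumption: an access is granted if any region covering the
    address grants that kind of access to that bus master."""
    return any(
        r.start <= addr <= r.end and kind in r.rights.get(master, "")
        for r in regions
    )


# Illustrative RAM region that only the CPU core may read and write.
regions = [Region(0x20000000, 0x2002FFFF, {"core": "rw"})]

# A DMA write by the Ethernet controller into that RAM is denied.
blocked = access_allowed(regions, "enet", 0x20001000, "w")

# Adding a region that grants the Ethernet master access to its buffers
# lets the same write through in this model.
regions.append(Region(0x20001000, 0x20001FFF, {"enet": "rw"}))
granted = access_allowed(regions, "enet", 0x20001000, "w")

print(blocked, granted)
```

In this model the CPU can keep using the buffer while the Ethernet DMA is silently refused, which would be consistent with the board itself staying alive while its network traffic stops.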
@agansari I just did a test run with your PR, and it seems to be working perfectly - good job! |
Describe the bug
Pinging the frdm_k64f board works for a while, but eventually leads to the board not responding to any ethernet requests. The board also becomes unable to send ping requests to a host. This unresponsiveness persists until the board is reset.
I am reasonably sure that this issue is not specific to ICMP as I have observed the same unresponsive behaviour doing UDP stuff - ICMP is just the most reliable way of provoking the issue.
This issue could be an mcux variant of #11255
Issue is seen on latest master and on v1.14.0 tag.
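A host-side script can help pin down when the board stops answering. The following is a minimal sketch, not part of the original report: it assumes a Linux host with the iputils `ping` command (`-c` for count, `-W` for the reply timeout in seconds), and the function names and the default threshold of five consecutive losses are hypothetical choices.

```python
import subprocess
from typing import Iterable, Optional


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo request and report whether a reply arrived.

    Assumes Linux iputils ping: -c is the packet count, -W the timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_stall(replies: Iterable[bool], threshold: int = 5) -> Optional[int]:
    """Return the 1-based sequence number at which the first run of
    `threshold` consecutive lost pings begins, or None if there is none."""
    lost = 0
    for seq, ok in enumerate(replies, start=1):
        lost = 0 if ok else lost + 1
        if lost == threshold:
            return seq - threshold + 1
    return None


def monitor(host: str, max_pings: int = 10000) -> Optional[int]:
    """Ping `host` up to `max_pings` times and return where it went silent."""
    return first_stall(ping_once(host) for _ in range(max_pings))
```

Calling `monitor()` with the board's address returns the sequence number of the first ping in the stall, which makes it easy to compare runs. Requiring several consecutive losses avoids treating a single dropped packet as the hang.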
To Reproduce
Expected behavior
I expect the frdm_k64f to not become unresponsive.
Impact
The project I'm working on is based on the frdm_k64f and will rely heavily on a stable ethernet connection. This issue is therefore a showstopper for me, especially since it seems to require a device reset to restore functionality.
Screenshots or console output
Wireshark log of when communication stops.
Environment (please complete the following information):