Network fails to transmit STM32H747DISC0 board zephyr v2.6.0 #36356

Closed
hamishguthrie opened this issue Jun 17, 2021 · 9 comments
Labels: bug, platform: STM32, priority: low, Waiting for response

Comments

@hamishguthrie

Describe the bug
With the telnet sample built for the above board, the following error messages are seen shortly after startup:

*** Booting Zephyr OS build zephyr-v2.5.0-2669-gc276088567b5  ***
[00:00:00.006,000] <inf> shell_telnet: Telnet shell backend initialized
[00:00:00.006,000] <inf> net_config: Initializing network
[00:00:00.006,000] <inf> net_config: Waiting interface 1 (0x240011d4) to be up...
[00:00:01.610,000] <err> eth_stm32_hal: Failed to enqueue frame into RX queue: -62
[00:00:01.610,000] <inf> net_config: Interface 1 (0x240011d4) coming up
[00:00:01.610,000] <inf> net_config: IPv4 address: 192.168.1.10
[00:00:01.610,000] <err> net_if: Cannot join solicit node address ff02::1:ff00:1 (-12)
[00:00:01.610,000] <inf> net_telnet_sample: Starting Telnet sample
[00:00:01.610,000] <inf> net_telnet_sample: IPv4 address: 192.168.1.10
[00:00:01.610,000] <inf> net_telnet_sample: IPv6 address: 2001:db8::1
[00:00:01.611,000] <err> eth_stm32_hal: HAL_ETH_DMAErrorCallback errorcode:8 dmaerror:1100
[00:00:01.710,000] <inf> net_config: IPv6 address: <log_strdup alloc failed>
[00:00:01.710,000] <inf> net_config: IPv6 address: <log_strdup alloc failed>
[00:00:02.610,000] <err> eth_stm32_hal: HAL_ETH_Transmit: failed!
[00:00:03.610,000] <err> eth_stm32_hal: HAL_ETH_Transmit: failed!
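
For triage, the error-level entries in a console dump like the one above can be pulled out mechanically. The following is a minimal sketch, not part of the original report: it assumes the log format shown above (`[timestamp] <level> module: message`), and the helper name `summarize_errors` and the small errno table are illustrative. Only `ENOMEM` (12) is decoded; 62 is deliberately left out because its name depends on which libc errno table the build uses, which the thread does not state.

```python
import re

# Assumption: only errno 12 (ENOMEM) is decoded here. 62 is omitted on
# purpose because its name differs between errno tables.
ERRNO_NAMES = {12: "ENOMEM"}

# Matches lines shaped like: [hh:mm:ss.mmm,uuu] <lvl> module: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[\d:.,]+)\] <(?P<level>\w+)> (?P<module>\w+): (?P<msg>.*)$"
)

# A negative integer at the end of the message: "-62" or "(-12)"
TRAILING_ERRNO = re.compile(r"\(?-(\d+)\)?$")


def summarize_errors(log_text):
    """Return (timestamp, module, message, errno_name) for each <err> line.

    errno_name is None when the message carries no trailing error code and
    "unknown" when the code is not in ERRNO_NAMES.
    """
    results = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if not m or m.group("level") != "err":
            continue
        code = TRAILING_ERRNO.search(m.group("msg"))
        name = ERRNO_NAMES.get(int(code.group(1)), "unknown") if code else None
        results.append((m.group("ts"), m.group("module"), m.group("msg"), name))
    return results


SAMPLE = """\
[00:00:01.610,000] <err> eth_stm32_hal: Failed to enqueue frame into RX queue: -62
[00:00:01.610,000] <inf> net_config: Interface 1 (0x240011d4) coming up
[00:00:01.610,000] <err> net_if: Cannot join solicit node address ff02::1:ff00:1 (-12)
[00:00:02.610,000] <err> eth_stm32_hal: HAL_ETH_Transmit: failed!
"""

for ts, module, msg, name in summarize_errors(SAMPLE):
    print(ts, module, name or "-", msg)
```

Grouping the result by module makes it easy to see that every failure except the IPv6 multicast join comes from `eth_stm32_hal`.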

To Reproduce
Steps to reproduce the behavior:

  1. Using west, initialise and update a new, clean checkout of Zephyr v2.6.0
  2. west build -b stm32h747i_disco_m7 samples/net/telnet
  3. Program the resultant image to the discovery board
  4. The above errors are shown, and it is not possible to ping the device or for the device to ping another device on the same network

Expected behavior
At a minimum, the network should come up cleanly, and it should be possible both to ping the DUT and for the DUT to ping other devices on the same network.

Impact
showstopper

Logs and console output

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: Zephyr SDK 0.12.4 (I also tested with gnuarmemb)
  • Zephyr version: v2.6.0

Additional context
I have modified my STM32H747DISC0 board as per the instructions at https://docs.zephyrproject.org/latest/boards/arm/stm32h747i_disco/doc/index.html#ethernet

This example works perfectly when built with Zephyr v2.5.0.
I ran a git bisect between v2.5.0 and v2.6.0, and the first commit at which it stops working is c276088.
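
For readers unfamiliar with bisection, the search it performs is essentially a binary search for the first failing commit. The sketch below is illustrative only: it assumes a linear history and a regression that stays broken once introduced, and the names `first_bad`, `history` and the 3000-commit range are made up. The index 2669 is borrowed from the boot banner above (`zephyr-v2.5.0-2669-gc276088567b5`), which follows the `git describe` format of tag, commits since tag, and abbreviated hash.

```python
def first_bad(commits, is_bad):
    """Return (index of the first bad commit, number of builds tested).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-flash cycle per probe
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return hi, tests


# Hypothetical linear history of 3000 commits; the regression is placed at
# index 2669 purely for illustration.
history = list(range(3000))
index, tests = first_bad(history, lambda commit: commit >= 2669)
print(index, tests)
```

The number of builds grows with the logarithm of the range, which is why bisecting a few thousand commits is practical by hand. Real `git bisect` also handles merges, so its step count on the actual Zephyr history can differ from this linear model.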

@hamishguthrie added the bug label Jun 17, 2021
@jukkar added the platform: STM32 label Jun 17, 2021
@erwango
Member

erwango commented Jun 18, 2021

^^ @reloZid FYI. Would you mind having a look?

@mjaun
Contributor

mjaun commented Jun 18, 2021

What I can tell after a quick test is that for me the following boards are working:

  • Nucleo H743
  • Nucleo H745

I also used the telnet sample, Zephyr v2.6.0 and SDK v0.12.4 on Ubuntu 20.04.

Unfortunately I don't have an H747 discovery board at hand. From my point of view the MCU should be very similar to the H745, though. Any ideas what the difference could be? I can check later whether I can spot something.

@hamishguthrie
Author

Unfortunately the only board I have is an H747. However, I will have a board with an H742 tomorrow and will try it with that. I am just wondering as a wild guess if this has some conflict between the 2 cores in the H747? I will also try completely disabling the M4 core and see if that makes any difference, but I am very new to the Zephyr and STM32 environments. My main skill set is embedded Linux on multiple platforms.

@mjaun
Contributor

mjaun commented Jun 21, 2021

> I am just wondering as a wild guess if this has some conflict between the 2 cores in the H747? I will also try completely disabling the M4 core and see if that makes any difference, but I am very new to the Zephyr and STM32 environments.

I would say this is definitely worth a try. According to the board documentation, the M4 core can be disabled by modifying the option bytes using the STM32CubeProgrammer. Otherwise I'd suggest at least making sure that the other core doesn't interfere, by flashing something simple onto it. On the Nucleo H745 I just tested running the blinky sample on the M4 core in parallel with the telnet sample on the M7 core, which worked for me.

Another wild guess from my side is that the driver for the external RAM on the H747 discovery board could define additional MPU regions which might interfere with the MPU regions defined by the Ethernet driver. You could therefore also try disabling the external RAM and the FMC in the device tree to check whether the bug lies there.

@hamishguthrie
Author

I now suspect that I have a faulty STM32H747DISC0 board. I received this board (brand new) a few weeks ago. When I first received it I was using Zephyr v2.5.0 and was able to get networking running. However, images built with the Zephyr SDK could easily be crashed simply by pinging the device, whereas images built with the GNUARMEMB toolchain worked, which puzzled me completely. Yesterday a colleague of mine in a different location built the telnet demo and tested it on his STM32H747DISC0 board, and it works as expected. I then sent him my binary, which fails on my board, and it works perfectly on his. I checked and re-checked all of the option resistors and jumpers, and they are identical to those on my colleague's board. We have ordered a few new DISC0 boards, so once they arrive I will test again and see whether I still experience the same issues.

@hamishguthrie
Author

I have now also managed to bring up our own custom board, which has an H742 on it, and the Ethernet works correctly. There is currently no support for the H742 in Zephyr, but I am working on it. As far as I can tell, the only difference between the H742 and the H743 is the amount of SRAM available, so I have created a new dts for the H742 and will make a pull request once I have it stable. I do not see any ST eval boards with the H742 on them, so there is no commonly available platform with that SoC, so I am not sure if my additions will be suitable for inclusion in upstream Zephyr?

@nashif added the priority: low label Jun 22, 2021
@erwango
Member

erwango commented Jun 23, 2021

> so I am not sure if my additions will be suitable for inclusion in upstream Zephyr?

No issue, you can add this SoC support, even if no board is available.

@erwango
Member

erwango commented Jul 30, 2021

@hamishguthrie Did you have a chance to confirm that the issue was caused by a faulty board?
If so, would you mind closing this issue?

@erwango added the Waiting for response label Jul 30, 2021
@hamishguthrie
Author

Yes, this issue was caused by a faulty board.
