
Pinging sam_e70 leads to unresponsive ethernet device at some point #11255

Closed
tbursztyka opened this issue Nov 9, 2018 · 10 comments · Fixed by #12591
Assignees
Labels
area: Ethernet · area: Networking · bug (The issue is a bug, or the PR is fixing a bug) · platform: Microchip SAM (formerly Atmel SAM) · priority: low (Low impact/importance bug)

Comments

@tbursztyka
Collaborator

On the target, a net ping shows it can send/receive packets.

From an external host, ping -i 0.00001 works for some time, until it can no longer reach the target. On the target side, net ping does not work anymore either.

It seems that either the Ethernet driver or the Ethernet device on the sam_e70 becomes unresponsive.

@jukkar jukkar added the platform: Microchip SAM Microchip SAM Platform (formerly Atmel SAM) label Nov 12, 2018
@nashif nashif added the bug The issue is a bug, or the PR is fixing a bug label Nov 13, 2018
@galak galak added the priority: low Low impact/importance bug label Nov 20, 2018
@aurel32
Collaborator

aurel32 commented Dec 5, 2018

I can't reproduce this problem with PR #11888 applied. Note that even with the data cache disabled, the existing code is missing a data synchronization barrier after writing the descriptor list.

A few packets were lost, but I guess that's normal at this rate. The Ethernet interface of the board is still working fine afterwards.

12843508 packets transmitted, 12843327 received, 0.00140927% packet loss, time 2701ms
rtt min/avg/max/mdev = 0.084/0.153/18.829/0.041 ms, pipe 2, ipg/ewma 0.183/0.153 ms

@aurel32
Collaborator

aurel32 commented Dec 8, 2018

As discussed in PR #11888, I am convinced this bug is due to the timeout code in the TX part of the Ethernet driver. Copying the description from the other PR:

I have found it happens when the Zephyr network stack has to send two packets in a row, for example an ACK for the just-received data plus the answer data. This seems to be a latent issue, but it happens more often now that the instruction cache and, recently, the data cache have been enabled. Basically the CPU is now faster than the Ethernet device (which is good in some sense).

The problem seems to be the following one:

  1. the first packet goes through eth_tx, one tx_desc_sem semaphore is taken, and the tx_timeout_work delayed work is submitted.
  2. the second packet also goes through eth_tx, another tx_desc_sem semaphore is taken, and the tx_timeout_work delayed work is not submitted again because one is already pending. Not sure that's a problem in that case.
  3. the first packet has been sent, one tx_desc_sem semaphore is given and the tx_timeout_work delayed work is cancelled.
  4. the second packet has been sent but given the delayed work has already been cancelled, tx_completed is not called: the tx_desc_sem semaphore is not given back and the network packet is not unreferenced.

I am not fully sure exactly how steps 2 and 3 are interleaved; what is sure is that the call to tx_completed is skipped for the second packet, as the work has already been cancelled. Adding some delay between sending packets (for instance at the beginning of eth_tx) works around the issue.

@mnkp suggested:

one possible and easy fix would be to get rid of all the k_delayed_* functions and associated data structures, and inside eth_tx, where we have

		k_sem_take(&queue->tx_desc_sem, K_FOREVER);

replace K_FOREVER with the timeout value. It works a bit differently than the old design but does the job. We can only get stuck on the tx_queue if there is a bug in the driver or the hardware malfunctions. So this is a safety feature, not part of the normal functional behavior. We would only have to be careful writing the small piece of code that cleans up the data in case we do fail to get the semaphore on time.

This is basically the strategy used in some of the other Ethernet drivers. I am not sure it applies here, as there is still the risk of filling descriptors while the previous frame from the queue hasn't been fully sent. I am not sure how the Ethernet device splits the data in the queue into multiple frames, or even whether it is able to do that.

If we need to ensure we have a single frame in a queue at a given moment, I guess we should use the same technique, but with an additional semaphore working on frames instead of fragments.

@aurel32
Collaborator

aurel32 commented Dec 8, 2018

This is basically the strategy used in some of the other Ethernet drivers. I am not sure it applies here, as there is still the risk of filling descriptors while the previous frame from the queue hasn't been fully sent. I am not sure how the Ethernet device splits the data in the queue into multiple frames, or even whether it is able to do that.

I have found that this is done by tagging the last fragment of the frame with GMAC_TXW1_LASTBUFFER. So indeed the solution proposed by @mnkp works.

@aurel32
Collaborator

aurel32 commented Dec 12, 2018

So indeed the solution proposed by @mnkp works.

I have just tried, simulating an issue by ignoring one Ethernet TX interrupt out of 100. It doesn't work, as it leads to net buffer exhaustion before semaphore exhaustion.

I think we should keep firing a delayed work after the timeout (defaulting to 5 s) and just check all descriptors. If one is not empty, there has been a problem and we should just call tx_error_handler. As long as there is network traffic, this delayed work should not fire, given that the timeout is updated regularly. I don't think that going through the descriptor list once, 5 seconds after the last transmitted packet, is really problematic. We should just ensure that no new packets are sent during that check, maybe by using a semaphore to access the descriptor list instead of disabling interrupts and comparing err_tx_flushed_count with its value at entry, like now.

@mnkp
Member

mnkp commented Dec 16, 2018

I have just tried, simulating an issue by ignoring one Ethernet TX interrupt out of 100. It doesn't work, as it leads to net buffer exhaustion before semaphore exhaustion.

Even if an occasional interrupt was lost, the semaphore count should be kept in sync with the net buffer count. If it's not, that's a driver bug. But maybe you meant the net pkt count, i.e. the difference between the data structures defined by NET_BUF_TX_COUNT and NET_PKT_TX_COUNT?

Sorry for the late reply. I'm no longer working with Atmel chips; this has become a hobby project.

@aurel32
Collaborator

aurel32 commented Dec 16, 2018

Even if an occasional interrupt was lost, the semaphore count should be kept in sync with the net buffer count. If it's not, that's a driver bug. But maybe you meant the net pkt count, i.e. the difference between the data structures defined by NET_BUF_TX_COUNT and NET_PKT_TX_COUNT?

The TX semaphore count is not kept in sync with the TX net buffer count, but it's not a driver issue. The assumption that only TX net buffers are used in the TX path is wrong. For instance, in the ICMP code the RX net buffers are reused in the TX path. What I observe when pinging the board is an exhaustion of the RX net buffers.

In addition, I guess that sending small packets using a single net buffer, with NET_PKT_TX_COUNT lower than NET_BUF_TX_COUNT (the default), can lead to packet buffer exhaustion.

@aurel32
Collaborator

aurel32 commented Dec 16, 2018

I wonder if we should, at least as a first step, remove this timeout error handling. It is there to prevent a lock-up if there is a bug in the driver or in case of a hardware malfunction. Right now we know that the timeout bug happens under high traffic and actually causes the lock-up it is supposed to prevent. Also note that in case of a hardware malfunction detected by the hardware (i.e. through a TX error interrupt), the whole descriptor list and net buffers are correctly reinitialized.

@mnkp
Member

mnkp commented Dec 16, 2018

The assumption that only TX net buffers are used in the TX path is wrong. For instance, in the ICMP code the RX net buffers are reused in the TX path.

Thanks for pointing that out. That adds one more dimension to take into account; I'll have to think a bit more about the issue. Also, in the case of sending small frames, we could likely have net packet exhaustion before we run out of net buffers.

I wonder if we should, at least as a first step, remove this timeout error handling.

Yes, that's a good idea. We need more time to properly implement the timeout; no need to keep the driver broken in the meantime.

@aurel32
Collaborator

aurel32 commented Jan 17, 2019

@jukkar I have seen you got this issue assigned. I have come to understand that more than one issue ends up blocking the TX path, including one introduced by me while working on the patches from PR #11888. I finally have a set of patches almost ready; I just need to write proper commit messages, which are quite important for this kind of change. I'll submit them in the next few days.

@jukkar
Member

jukkar commented Jan 18, 2019

@aurel32 thanks for the info, I am looking forward to trying any fixes for this.


6 participants