I2C Slave race condition leads to handler being out of sync #17
Comments
From @stickbreaker on August 19, 2015 1:47 I am not familiar with the Due, but shouldn't the I2C hardware be stretching SCL? That should prohibit any new transaction before the software has finished handling the current one. Chuck
From @shahokun on August 19, 2015 3:47 Clock stretching in general is not necessarily an issue. In this case, however, the HW is stretching the clock for a Master Write operation, but the software incorrectly thinks it's in a Slave Send (Master Read) operation and does not respond correctly. The clock is stretched indefinitely and the transaction never completes.
From @stickbreaker on August 19, 2015 4:29 OK, I think I see what you have explained. You are getting a slave read interrupt while processing a slave write command.
From @shahokun on August 20, 2015 20:36 Testing showed that the proposed solution above did not fix the issue. After more debugging, it seems that the problem is simpler: the Master continues on to the next I2C transaction before the Slave (Arduino) has a chance to respond to the end of the first I2C transaction. This may be because the Slave is busy in a block with interrupts disabled, or because of some other rare sequence of events. The race condition goes something like this:
The root cause of this issue is that the Wire library assumes that the status register follows some predictable sequence of events. However, because it is interrupt driven, the hardware status register can change state before the software has a chance to poll the status. Essentially, the Wire driver should not assume that just because it is currently in the SLAVE_RECV state that it will stay that way. I would propose changing the Slave driver in Wire.cpp to something like this:
This may not be the best solution depending on how you want to handle I2C error conditions. However, I think that this issue should at least be warned about in the Wire documentation. Yes, it results from a combination of the Master sending I2C messages too often and the Slave not handling I2C interrupts quickly enough, but the software driver should not be able to get stuck in this unsynchronized state.
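To illustrate the general idea that the handler should check the hardware rather than trust its cached state, here is a minimal, self-contained sketch. It is not the change proposed in this comment and not code from Wire.cpp: resyncStatus is a hypothetical helper, and the bit masks are only modeled on the SAM3X TWI status register, so treat their exact values as assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative masks modeled on the SAM3X TWI status register.
// The exact bit positions are an assumption of this sketch.
constexpr uint32_t SR_SVREAD = 1u << 3;  // 1 = Master Read (slave sends)
constexpr uint32_t SR_SVACC  = 1u << 4;  // 1 = slave is currently addressed

enum Status { SLAVE_IDLE, SLAVE_RECV, SLAVE_SEND };

// Hypothetical helper: given the cached driver state and a snapshot of the
// status register, return the state the driver should actually be in.
Status resyncStatus(Status cached, uint32_t sr) {
    // Outside a slave access the direction bit carries no information,
    // so the cached state is left alone.
    if (!(sr & SR_SVACC)) {
        return cached;
    }
    // During a slave access, the hardware direction bit wins over
    // whatever the software believed before this interrupt.
    return (sr & SR_SVREAD) ? SLAVE_SEND : SLAVE_RECV;
}
```

With this helper, the stuck case described in this issue (cached SLAVE_SEND while SVREAD reads 0 during a slave access) resolves to SLAVE_RECV instead of leaving the handler with no branch to enter. How a real driver should recover buffers or report the error is left open here.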
From @shahokun on August 18, 2015 20:42
We have an Arduino Due acting as the slave device on an I2C bus. Rarely, we will see the Arduino get "stuck" and become unresponsive. When we attached a debugger to the running processor, we saw that it was constantly being bombarded with I2C interrupts (WIRE_ISR_HANDLER). The status flag indicated that the Arduino was in SLAVE_SEND mode and had already transmitted data (TWI TXRDY bit = 0). However, the I2C transmission would never complete, the I2C clock was pulled low, and the processor was stuck in an infinite loop of servicing an interrupt but not doing anything (i.e. none of the if blocks were entered).
The confounding factor was that the SVREAD bit of the TWI status register was 0, indicating a Master Write. If this were the case, then the correct status should be SLAVE_RECV. Working backwards from this, the likely culprit would be a race condition at lines 339-343 of TwoWire::onService().
Assume a Master Read operation just completed and the Arduino has status = SLAVE_SEND. The Arduino is wrapping up the I2C transaction and enables the SVACC interrupt (TWI_EnableIt(twi, TWI_SR_SVACC);). After this happens, but before the status = SLAVE_IDLE; line is executed, a new I2C transaction comes in and triggers an interrupt. This time, it is a Master Write operation. It jumps into the handler, but because status has not been set to SLAVE_IDLE, it does not execute the first if block, which would properly set status to SLAVE_RECV. Moreover, all of our Master Write transactions are two or more bytes, so the TWI would stretch the clock indefinitely because it never reads the first byte due to status being incorrect.
This is a bit theoretical because we will need to do extended testing to see if the problem goes away. The proposed solution is to reverse the order of the cleanup steps:
This keeps the status flag protected against synchronization issues. It also reverses the order of the DisableIt()/EnableIt() calls at lines 301-303 so that you never have both groups of interrupts enabled at once. I would change the flags to use TWI_IDR_* for all DisableIt() and TWI_IER_* for all EnableIt() operations, for posterity; however, TWI_IDR_x, TWI_IER_x, and TWI_SR_x are the same value for all x, so this does not lead to any incorrect behavior.
Could anyone provide advice on whether these sorts of nested interrupts could occur in the TWI peripheral? I wasn't able to find explicit information about how the NVIC on the Due interacts with the TWI interrupt registers. I assume that since specific TWI flags are explicitly enabled and disabled that they are treated separately. In the proposed scenario, the Due would be in the handler for a TXCOMP interrupt, then get interrupted by a SVACC interrupt.
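The ordering argument above can be sketched as a small host-side simulation. This is only a model of the hypothesized scenario, not the Wire.cpp source: MockSlave, its methods, and statusSeenByNestedWrite are hypothetical names, and the sketch simply assumes that a new Master Write can be serviced the moment SVACC is re-enabled, which is the very point the question above leaves open. The follow-up comment in this thread reports that the reordering did not fix the issue in testing.

```cpp
#include <cassert>
#include <functional>

enum Status { SLAVE_IDLE, SLAVE_RECV, SLAVE_SEND };

// Hypothetical stand-in for the driver state at the end of a Master Read.
struct MockSlave {
    Status status = SLAVE_SEND;
    // Called when SVACC is re-enabled; models a new transaction arriving
    // at exactly that moment (an assumption of this sketch).
    std::function<void()> onSvaccEnabled;

    void enableSvacc() {
        if (onSvaccEnabled) onSvaccEnabled();
    }

    // Start-of-transaction branch: only taken when the driver is idle.
    void handleSlaveAccess(bool masterRead) {
        if (status == SLAVE_IDLE) {
            status = masterRead ? SLAVE_SEND : SLAVE_RECV;
        }
    }

    // Order described in the report: enable SVACC, then mark idle.
    void cleanupOriginal()  { enableSvacc(); status = SLAVE_IDLE; }
    // Proposed order: mark idle first, then enable SVACC.
    void cleanupReordered() { status = SLAVE_IDLE; enableSvacc(); }
};

// Returns the status the driver holds right after it services a Master
// Write that arrives during cleanup.
Status statusSeenByNestedWrite(bool reordered) {
    MockSlave s;
    Status seen = SLAVE_IDLE;
    s.onSvaccEnabled = [&] {
        s.handleSlaveAccess(/*masterRead=*/false);
        seen = s.status;
    };
    if (reordered) {
        s.cleanupReordered();
    } else {
        s.cleanupOriginal();
    }
    return seen;
}
```

In this model, the original order leaves the handler in SLAVE_SEND when the Master Write arrives, while the reordered cleanup lets it move to SLAVE_RECV. Whether the hardware can actually deliver the interrupt in that window is not something the simulation can show.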
Copied from original issue: arduino/Arduino#3699