STM32F407 I2C driver hangs #22751
@lowlander even if it only happens from time to time, would you have some code to reproduce it?
@erwango I will try to "port" the problem to a 407 dev board. This will take me at least until early next week. As soon as I have more info I'll post it here.
@lowlander could you be so kind as to try playing with the optimization settings?
@pavlohamov I have tried every optimization setting and it doesn't seem to influence the problem; it happens with all settings.
Setting priority to low while we're waiting for a way to reproduce.
I have been trying to debug the problem, and the only thing I can say is that the first IRQ (SB) happens and the driver writes the address to the DR register, but it never lands on the bus. No other IRQs are generated, so the driver "hangs". What is more worrying is that I found others who reported the same issue, but with different software;
@erwango would it be possible to access SDA and SCL as GPIO pins inside the driver, so they can be used to "hard reset" the I2C bus in case a slave gets confused by the start-stop pulse and doesn't release the bus? The only software workaround in the ST errata is "reset the peripheral", and then there is a risk of leaving the bus in a wrong state. So I think it would be good to have a real bus-reset function in the driver, so not everybody has to write their own hacks.
@lowlander indeed, we're working on it by introducing pinctrl definitions in the device tree.
@erwango but how do we fix this for 1.14.x? I can add the timeout and give the I2C hardware a reset, but I have no access to the GPIO pins to reset the bus in the worst-case scenario.
My bad, I forgot you were using 1.14. This would be a significant change (and not a simple fix), and I don't know the policy on adding enhancements to the LTS branch. @MaureenHelm?
@erwango I think a peripheral soft-reset is the only option; if it keeps failing, the 1.14 user will have to build their own hard reset via GPIO in their own application. Your timeout patch can be backported, and a 100 ms timeout on the mutex should make sure the driver comes back to "userspace", where the user must then check via GPIO whether SDA or SCL is still low and, if so, reset the bus via a "fake" clock signal and a "fake" stop condition. Let me try to make a PR and then move the discussion there.
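For readers who end up writing that application-level hard reset, the "fake" clock plus "fake" stop sequence could look roughly like the sketch below. It is only an illustration under assumptions: the `i2c_bus_pins` hooks, `i2c_bus_recover()`, and the `sim_*` helpers are hypothetical names, not part of the Zephyr I2C API, and the simulated slave exists only so the sequence can be exercised without hardware. The sketch handles a slave holding SDA low; a line held low on SCL is not covered.

```c
#include <stdbool.h>

/* Hypothetical pin hooks; these are NOT part of the Zephyr I2C API. */
struct i2c_bus_pins {
	void (*set_scl)(void *ctx, bool high); /* false = drive low, true = release */
	void (*set_sda)(void *ctx, bool high); /* false = drive low, true = release */
	bool (*get_sda)(void *ctx);            /* read back the actual SDA level */
	void (*half_period)(void *ctx);        /* wait half an SCL period */
	void *ctx;
};

/* Clock SCL until the slave releases SDA (at most 9 pulses), then
 * generate a STOP condition. Returns true if SDA is high afterwards.
 */
bool i2c_bus_recover(const struct i2c_bus_pins *p)
{
	p->set_sda(p->ctx, true);
	p->set_scl(p->ctx, true);
	p->half_period(p->ctx);

	/* "Fake" clock: one pulse per loop while SDA is still held low. */
	for (int i = 0; i < 9 && !p->get_sda(p->ctx); i++) {
		p->set_scl(p->ctx, false);
		p->half_period(p->ctx);
		p->set_scl(p->ctx, true);
		p->half_period(p->ctx);
	}

	/* "Fake" STOP: SDA goes low to high while SCL is high. */
	p->set_scl(p->ctx, false);
	p->half_period(p->ctx);
	p->set_sda(p->ctx, false);
	p->half_period(p->ctx);
	p->set_scl(p->ctx, true);
	p->half_period(p->ctx);
	p->set_sda(p->ctx, true);
	p->half_period(p->ctx);

	return p->get_sda(p->ctx);
}

/* Simulated slave (illustration only): it holds SDA low until it has
 * seen `release_after` rising edges on SCL.
 */
struct sim_bus {
	bool scl;
	bool sda_out;
	int rising_edges;
	int release_after;
};

static void sim_set_scl(void *ctx, bool high)
{
	struct sim_bus *b = ctx;

	if (high && !b->scl) {
		b->rising_edges++;
	}
	b->scl = high;
}

static void sim_set_sda(void *ctx, bool high)
{
	((struct sim_bus *)ctx)->sda_out = high;
}

static bool sim_get_sda(void *ctx)
{
	struct sim_bus *b = ctx;

	return b->sda_out && b->rising_edges >= b->release_after;
}

static void sim_half_period(void *ctx)
{
	(void)ctx; /* no real delay needed in the simulation */
}

struct i2c_bus_pins sim_pins(struct sim_bus *b)
{
	struct i2c_bus_pins p = {
		sim_set_scl, sim_set_sda, sim_get_sda, sim_half_period, b
	};

	return p;
}
```

On real hardware the four hooks would be backed by the application's own GPIO access to the SDA and SCL pins (configured as open drain) and a delay matching the bus speed.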
OK, after some more research it seems that the driver sets the STOP flag (not sure when/how). This flag stays pending until a data byte is finished or a START is generated, and then the hardware directly generates a STOP (and that is what I see on the bus). I now check and reset the STOP bit in CR1 before doing a START, and this seems to work. I'm not really sure where it goes wrong; it has to be some race condition, or else it would always cause a problem. The fix I have now needs some more long-term testing (several days of run time) before I create a PR.
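The check described above can be sketched as follows. This is not the actual patch, only a minimal illustration under assumptions: the function name is hypothetical, the register is accessed through a plain pointer instead of the ST LL helpers the driver normally uses, and the bit masks are redefined locally (START is bit 8 and STOP is bit 9 of I2C_CR1 on the STM32F4).

```c
#include <stdbool.h>
#include <stdint.h>

/* I2C_CR1 bit positions on STM32F4, redefined locally for this sketch. */
#define I2C_CR1_START (1u << 8)
#define I2C_CR1_STOP  (1u << 9)

/* Hypothetical helper: clear a stale STOP request before requesting a
 * START, so the hardware does not emit START immediately followed by STOP.
 * Returns true if a stale STOP bit was found and cleared.
 */
bool i2c_start_clearing_stale_stop(volatile uint32_t *cr1)
{
	bool stale = (*cr1 & I2C_CR1_STOP) != 0u;

	if (stale) {
		*cr1 &= ~I2C_CR1_STOP;
	}
	*cr1 |= I2C_CR1_START;

	return stale;
}
```

The return value is only there so a caller could log how often the stale bit is seen, which might help narrow down the suspected race.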
Sometimes the stop bit is still set when starting the next transaction. When that happens the hardware will generate a start directly followed by a stop. This will not be detected by the driver and it will endlessly wait for the next interrupt that will never come. Fixes: zephyrproject-rtos#22751 Signed-off-by: Erwin Rol <erwin@erwinrol.com>
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed; otherwise this issue will automatically be closed in 14 days. Note that you can always re-open a closed issue at any time.
The STM32F407 I2C driver hangs from time to time where it waits on the semaphore in the driver. When that happens the signal on the bus looks like this (SDA = yellow, SCL = green);
To work around it I added a timeout to the k_sem_take calls, but I have not found the real cause of the problem. The scope trace shows that it must go wrong very early, because the driver generates a start-stop without any data.
This is on the 1.14 branch, but I cherry-picked most commits (like the new timeout handling) from master. The target is an STM32F407, so it uses the V1 driver. Using polling instead of IRQs makes the problem happen less often, but it still happens.
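Since the hang also shows up in polling mode, the same kind of bound as the k_sem_take timeout can be applied to the polling loop. The sketch below is an assumption-laden illustration, not driver code: `i2c_poll_flag()` is a hypothetical helper, the poll budget stands in for a real time-based timeout, and the SB mask (bit 0 of I2C_SR1 on the STM32F4) is redefined locally.

```c
#include <errno.h>
#include <stdint.h>

/* I2C_SR1 start-bit flag on STM32F4, redefined locally for this sketch. */
#define I2C_SR1_SB (1u << 0)

/* Hypothetical helper: poll a status register until any bit in `mask`
 * is set, giving up after `max_polls` reads instead of spinning forever.
 * Returns 0 on success or -ETIMEDOUT when the budget is exhausted.
 */
int i2c_poll_flag(const volatile uint32_t *sr1, uint32_t mask,
		  uint32_t max_polls)
{
	for (uint32_t i = 0; i < max_polls; i++) {
		if ((*sr1 & mask) != 0u) {
			return 0;
		}
	}

	return -ETIMEDOUT;
}
```

A caller that gets `-ETIMEDOUT` back can then reset the peripheral and retry the transfer rather than waiting for an event that never arrives.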