STM32F407 I2C driver hangs #22751

Closed
lowlander opened this issue Feb 12, 2020 · 14 comments · Fixed by #27291
Labels: area: I2C; bug; platform: STM32; priority: low

@lowlander
Collaborator

The STM32F407 I2C driver hangs from time to time while waiting on the semaphore in the driver. When that happens, the signal on the bus looks like this (SDA = yellow, SCL = green):

[Image: i2c_error (scope capture of SDA and SCL)]

To work around it I added a timeout to the k_sem_take calls, but I have not found the real cause of the problem. The scope shows that it must go wrong very early, because the driver generates a start-stop without any data.

This is on the 1.14 branch, but I cherry-picked most commits (like the new timeout handling) from master. The target is an STM32F407, so it uses the V1 driver. Using polling instead of IRQs makes the problem happen less often, but it still happens.
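
The timeout workaround described above can be illustrated off-target. This is only a minimal sketch, not the driver code: `xfer_completion`, `wait_for_transfer` and `XFER_TIMEOUT` are hypothetical names, and a bounded polling loop stands in for a semaphore take with a finite timeout. In Zephyr the equivalent idea would be to pass a finite timeout to k_sem_take and check its return value instead of waiting forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's completion semaphore. */
struct xfer_completion {
    volatile bool done; /* set by the (simulated) interrupt handler */
};

#define XFER_OK       0
#define XFER_TIMEOUT (-1) /* illustrative error code, not a Zephyr errno */

/*
 * Poll for completion at most max_polls times instead of waiting forever.
 * Consumes the completion flag on success, like taking a semaphore.
 */
int wait_for_transfer(struct xfer_completion *c, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (c->done) {
            c->done = false;
            return XFER_OK;
        }
    }
    /* The caller regains control and can reset the peripheral. */
    return XFER_TIMEOUT;
}
```

A timeout like this only keeps the caller from hanging; it does not address whatever prevents the interrupt from arriving.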

@erwango erwango added this to the v1.14.2 milestone Feb 12, 2020
@erwango erwango added platform: STM32 ST Micro STM32 area: I2C bug The issue is a bug, or the PR is fixing a bug labels Feb 12, 2020
@erwango erwango self-assigned this Feb 12, 2020
@erwango
Member

erwango commented Feb 12, 2020

@lowlander even if it only happens from time to time, do you have some code to reproduce it?

@lowlander
Collaborator Author

@erwango I will try to "port" the problem to a 407 dev board. This will take me at least until early next week. As soon as I have more info I'll post it here.

@pavlohamov
Contributor

@lowlander could you be so kind as to try playing with the optimization settings?
CONFIG_SIZE_OPTIMIZATIONS=n
CONFIG_NO_OPTIMIZATIONS/CONFIG_DEBUG_OPTIMIZATIONS=y
and vice versa

@lowlander
Collaborator Author

@pavlohamov I have tried every optimization setting and it doesn't seem to influence the problem; it happens with all settings.

@erwango
Member

erwango commented Feb 14, 2020

Setting priority to low while we're waiting for a way to reproduce.

@erwango erwango added the priority: low Low impact/importance bug label Feb 14, 2020
@lowlander
Collaborator Author

I have been trying to debug the problem, and the only thing I can say is that the first IRQ (SB) happens and the driver writes the address to the DR register, but the address never lands on the bus. No other IRQs are generated after that, so the driver "hangs".

What is more worrying is that I found others who reported the same issue with different software:
https://community.st.com/s/question/0D50X00009Xkhfn/stm32f2xx-i2c-not-sending-address-after-start

@lowlander
Collaborator Author

@erwango would it be possible to access SDA and SCL as GPIO pins inside the driver, so they can be used to "hard reset" the I2C bus in case a slave gets confused by the start-stop pulse and doesn't release the bus?

The ST errata only lists "reset the peripheral" as a software workaround, and then there is a risk of leaving the bus in a wrong state.

So I think it would be good to have a real bus-reset function in the driver, so that not everybody has to write their own hacks.
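
A bus-reset function of the kind asked for above could look roughly like the following. This is a sketch under stated assumptions, not Zephyr code: the `i2c_pins` callback layer, `i2c_bus_recover` and the `sim_*` simulated bus are all hypothetical names invented for illustration. It follows the commonly used recovery sequence: release both lines, clock SCL up to nine times until the slave lets go of SDA, then generate a STOP condition by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pin-access layer; true = line released (pulled high). */
struct i2c_pins {
    void (*set_scl)(void *ctx, bool high);
    void (*set_sda)(void *ctx, bool high);
    bool (*get_sda)(void *ctx);
    void (*half_period)(void *ctx); /* optional delay, may be NULL */
    void *ctx;
};

static void pause_half(const struct i2c_pins *p)
{
    if (p->half_period != NULL) {
        p->half_period(p->ctx);
    }
}

/* Returns 0 once SDA is released and a STOP was sent, -1 if SDA stays low. */
int i2c_bus_recover(const struct i2c_pins *p)
{
    p->set_sda(p->ctx, true);
    p->set_scl(p->ctx, true);

    /* Clock out up to 9 "fake" SCL pulses so a slave can finish its byte. */
    for (int i = 0; i < 9 && !p->get_sda(p->ctx); i++) {
        p->set_scl(p->ctx, false);
        pause_half(p);
        p->set_scl(p->ctx, true);
        pause_half(p);
    }
    if (!p->get_sda(p->ctx)) {
        return -1;
    }

    /* "Fake" STOP: SDA rises while SCL is high. */
    p->set_scl(p->ctx, false);
    p->set_sda(p->ctx, false);
    pause_half(p);
    p->set_scl(p->ctx, true);
    pause_half(p);
    p->set_sda(p->ctx, true);
    pause_half(p);
    return 0;
}

/* --- Simulated bus for off-target experiments (entirely hypothetical) --- */
struct sim_bus {
    bool scl;
    bool sda_master;
    int clocks_needed; /* SCL rising edges until the slave releases SDA */
    int stops;         /* STOP conditions seen on the bus */
};

static void sim_set_scl(void *ctx, bool high)
{
    struct sim_bus *b = ctx;
    if (high && !b->scl && b->clocks_needed > 0) {
        b->clocks_needed--;
    }
    b->scl = high;
}

static void sim_set_sda(void *ctx, bool high)
{
    struct sim_bus *b = ctx;
    if (high && !b->sda_master && b->scl && b->clocks_needed == 0) {
        b->stops++;
    }
    b->sda_master = high;
}

static bool sim_get_sda(void *ctx)
{
    struct sim_bus *b = ctx;
    return b->sda_master && b->clocks_needed == 0;
}

struct i2c_pins sim_pins(struct sim_bus *b)
{
    struct i2c_pins p = { sim_set_scl, sim_set_sda, sim_get_sda, NULL, b };
    return p;
}
```

On real hardware the callbacks would drive the pins as open-drain GPIOs and `half_period` would wait roughly half an SCL period; the simulated bus only exists so the sequence can be exercised without a board.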

@erwango
Member

erwango commented Feb 25, 2020

@lowlander indeed, we're working on it via the introduction of pinctrl definitions in the device tree.
In this model, each peripheral driver would have access to its pin definitions and would be able to use them for multiple purposes: reset, low power, ...
The topic has only just started, but you can find some info here: #22748

@lowlander
Collaborator Author

@erwango but how to fix this for 1.14.X? I can add the timeout and give the I2C hardware a reset, but I have no access to the GPIO pins to reset the bus in the worst-case scenario.

@erwango
Member

erwango commented Feb 25, 2020

@lowlander

> @erwango but how to fix this for 1.14.X

My bad, I forgot you were using 1.14. This would be a significant change (and not a simple fix), and I don't know the policy on adding enhancements to the LTS branch. @MaureenHelm ?

@lowlander
Collaborator Author

@erwango I think a peripheral soft-reset is the only option. If it keeps failing, 1.14 users will have to build their own hard-reset via GPIO in their own application.

Your timeout patch can be backported, and a 100ms timeout on the mutex should make sure the driver comes back to "userspace", where the user then must check via GPIO whether SDA or SCL is still low and, if so, reset the bus via a "fake" clock signal and a "fake" stop-condition.

Let me try to make a PR and then move the discussion there.

@lowlander
Collaborator Author

OK, after some more research it seems that the driver sets the STOP flag (not sure when/how). This flag stays pending until a data byte is finished or a START is generated, and then the hardware directly generates a STOP (and that is what I see on the bus).

I now check and reset the STOP bit in CR1 before doing a START, and this seems to work.

I am not really sure where it goes wrong. It has to be some race condition, or else it would always cause a problem.

The fix I have now needs some more long-term testing (several days of run time) before I'll create a PR.
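
The check described above (clear a pending STOP in CR1 before requesting a START) can be sketched on a simulated register. This is not the actual patch: `i2c_regs`, `i2c_generate_start` and the `SKETCH_` macros are hypothetical names, and the bit positions (START = bit 8, STOP = bit 9 of I2C_CR1) are taken from my reading of the STM32F4 reference manual, so verify them against the manual for the exact part.

```c
#include <stdint.h>

/* Assumed I2C_CR1 bit positions on STM32F4; check the reference manual. */
#define SKETCH_CR1_START (1u << 8)
#define SKETCH_CR1_STOP  (1u << 9)

/* Simulated register block; a real driver would use the memory-mapped CR1. */
struct i2c_regs {
    volatile uint32_t cr1;
};

/*
 * Request a START, first clearing a STOP left pending by a previous
 * transaction. Returns 1 if a stale STOP was found and cleared, else 0.
 */
int i2c_generate_start(struct i2c_regs *i2c)
{
    int stale_stop = 0;

    if (i2c->cr1 & SKETCH_CR1_STOP) {
        i2c->cr1 &= ~SKETCH_CR1_STOP;
        stale_stop = 1;
    }
    i2c->cr1 |= SKETCH_CR1_START;
    return stale_stop;
}
```

Returning whether a stale STOP was seen makes it easy to count or log how often the condition occurs during a long-running test.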

@lowlander
Collaborator Author

@erwango I made an initial patch set, #23663, that fixes my problems. The ECM problem was very tricky because it only happened every 20 to 30 hours, until I figured out that it can be triggered by simply short-circuiting SDA to GND, which will directly hang the current driver.

lowlander added a commit to lowlander/zephyr that referenced this issue Mar 21, 2020
Sometimes the stop bit is still set when starting the next transaction.
When that happens the hardware will generate a start directly followed
by a stop. This will not be detected by the driver and it will endlessly
wait for the next interrupt that will never come.

Fixes: zephyrproject-rtos#22751

Signed-off-by: Erwin Rol <erwin@erwinrol.com>
@github-actions

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
