Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stm32/...hal_dma.c: Use the -O2 option for the F7 MCU DMA driver. #13549

Closed
wants to merge 1 commit into from

Conversation

robert-hh
Copy link
Contributor

Using the default -Os option causes SPI transactions to fail at e.g. a PYBD_SF6 if the source and destination in memory are identical. That is always the case for spi.read() and spi.readinto(). Instead of the data to read the data in the transmit buffer was returned.

Using the default -Os option causes SPI transactions to fail at e.g.
a PYBD_SF6 if the source and destination in memory are identical. That
is always the case for spi.read() and spi.readinto(). Instead of the
data to read the data in the transmit buffer was returned.

Signed-off-by: robert-hh <robert@hammelrath.com>
@dpgeorge
Copy link
Member

Thanks for the patch. It may seem to fix the issue but I'd like to fully understand what is going on.

For example it could be related to the following: in RM0410 rev 2 section 8.3.11 (Single and Burst transfers) it says:

rm0410

We are using burst mode for SPI DMA with the constants DMA_MBURST_INC4 and DMA_PBURST_INC4, and I don't think there's anything preventing the data from crossing a 1K boundary. So this could be a problem.

@projectgus projectgus linked an issue Jan 31, 2024 that may be closed by this pull request
@robert-hh
Copy link
Contributor Author

robert-hh commented Jan 31, 2024

We are using burst mode for SPI DMA with the constants DMA_MBURST_INC4 and DMA_PBURST_INC4, and I don't think there's anything preventing the data from crossing a 1K boundary. So this could be a problem.

This could be a different problem. The actual behavior happens with transfers >= 2 byte length. If works is source and destination buffers are different, so it seems not to be related to crossing a 1k boundary.

Edit: This error does not show at a PYB_V11, which has the same restriction in the reference manual.

@robert-hh
Copy link
Contributor Author

So I understand that the bursts have a length of 4, and the item size is 4 = burst length 4 bytes. These 4 bytes must not cross a 1k memory boundary. In MicroPython buffers seems to be allocated at 16 byte boundaries. Thus, there should be no problem with the above restriction.

@andrewleech
Copy link
Sponsor Contributor

Considering the issue didn't exist in 1.18 and seems to be attributed to the HAL_DMA_Start_IT() function, was there a hal library update around then this function can be looked at before & after to narrow down a root cause?

@robert-hh
Copy link
Contributor Author

V1.18 was published Jan 17, 2022. The most recent commit for stm32f7xx_hal_dma.c was Aug 27, 2017. Similar for stm32f7xx_hal_spi.c.

git log  -- stm32f7xx_hal_dma.c
commit 85e9eeca9955df23e0acb1810c76f2373c003c8e
Author: Damien George <damien.p.george@gmail.com>
Date:   Sun Aug 27 22:12:55 2017 +1000

I compared the F7xx version of the file also to the F4xx version, which is used by the Pyboard V1.1, which does not show the behavior. Differences exist only in comments and integer constants written e.g. like 10 vs. 10U. Besides that the code is identical.
The behavior is spooky. Scanning through the commits between v1.18 and v1.19 I also found a commit pair, where the behavior changed. But that was not related at all to SPI and DMA. So maybe that just changed the caching of code and therefore the timing of code execution.

@robert-hh
Copy link
Contributor Author

@dpgeorge
Different to previous tests, compiling the HAL SPI driver with -O2 option instead of the HAL DMA driver lets the problem disappear as well. So the patch here cannot be expected as reliable.

@robert-hh robert-hh marked this pull request as draft February 1, 2024 10:56
@dpgeorge
Copy link
Member

dpgeorge commented Feb 2, 2024

Let's close this PR then.

@dpgeorge dpgeorge closed this Feb 2, 2024
@robert-hh
Copy link
Contributor Author

I was not sure whether I should close it or keep it visible for people seeking a temporary fix. And so I had set it to the draft state.

@robert-hh
Copy link
Contributor Author

@dpgeorge
After two days break, I looked again at the topic and repeated the delay test suggested by @andrewleech . Adding a short delay before the first call to HAL_DMA_Start_IT() after line 1808 in stm32f7xx_hal_spi.c made SPI work. The same happens if I insert that delay at the begin of HAL_DMA_Start_IT(). Even HAL_Delay(0) seems sufficient, reducing the delay to upmost three function calls, one assignment and one compare.
Strange! But that complies at least with the observation, that changing the optimization for either stm32f7xx_hal_spi.c or stm32f7xx_dma_spi.c had a positive effect.

@andrewleech
Copy link
Sponsor Contributor

The delay suggestion was from @projectgus but regardless that's an interesting finding.
I wonder if it's a race condition between configuration of different bits of the SPI or dma hardware, or if it's related to the CPU caching.

@robert-hh
Copy link
Contributor Author

Sorry @projectgus. The other aspect: SPI using DMA work fine if the memory addresses for read and write are different. So it looks like a DMA config problem.

@robert-hh
Copy link
Contributor Author

Just an additional note. The call to HAL_Delay() can be placed in the function HAL_SPI_TransmitReceive_DMA() at any place before the first call to HAL_DMA_Start_IT(), even at the first line of that function. However adding it to spi.c before the call to HAL_SPI_TransmitReceive_DMA() does not fix the problem. Strange, isn't it?

@robert-hh robert-hh deleted the stm32_dma_patch branch February 5, 2024 10:25
@robert-hh
Copy link
Contributor Author

Another observation: If spi.read() is called with larger sizes it can be seen, that the number of wrong bytes seems always to be a power of 2 less or equal to 32, regardless of the requested size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

STM32: SPI.read(n) fails if n > 1 (regression)
3 participants