@zhongnuo-tang
Contributor

  1. ch8 requires both INT and EXT DMA, but the last valid data comes from the EXT DMA, so the callback that parses the data should be triggered only when the EXT DMA is done.

@zhongnuo-tang
Contributor Author

Hi @allen-kim-sec,
Could you help check this PR?

I tried the sensor with recordsamples 2048, and all data matched between the LA and the APP print.
One thing I noticed: there is one sample where channel 0 is 0000 and one sample where channel 1 is 0000, and in both cases the LA and the APP print agree.

LA:
F295 0000 E26F
APP:
f295 0000 e26f

LA:
0000 FFC7 EF22
APP:
0000 ffc7 ef22

Could the 0000 values actually be real data read from the sensor?

@allen-kim-sec changed the title from "os/arch/arm/src/amebasmart: remove INT DMA isr callback when ch8 used" to "[TDM] os/arch/arm/src/amebasmart: remove INT DMA isr callback when ch8 used" on Nov 3, 2025
@allen-kim-sec

It's the same result

0033: 026d 000b 1e99 ffff ffff ffff ffff ffff
0034: 0000 0000 1eaf ffff 0000 0000 ffff ffff
0035: 0000 0000 1ebf ffff 0000 0000 ffff ffff
0036: 0000 0000 1ec1 ffff 0000 0000 ffff ffff
0037: 0000 0000 1ebb ffff 0000 0000 ffff ffff
Continuous trailing zeroes found!!!!! count: 1
0038: 0000 0000 1eb2 0000 0000 0000 0000 0000
0039: 0000 0000 1ea8 0000 0000 0000 0000 0000
0040: 0000 0000 1ea1 ffff 0000 0000 ffff ffff
0041: 0000 0000 1e9d ffff 0000 0000 ffff ffff
0042: 0270 001c 1ea3 ffff ffff ffff ffff ffff
0043: 026c 0026 1eae 0000 0000 0000 0000 0000

@zhongnuo-tang
Contributor Author

This PR should not affect the 0000 issue; it is meant to prevent multiple callbacks from both the prime and EXT DMA.
How about measuring with the LA to confirm whether the 0000 data comes from the sensor read or from the DMA?
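
A minimal sketch of the single-callback rule this PR describes, not the driver code itself. The enum and function names are hypothetical, and it assumes the driver knows whether the EXT DMA is in use for the current mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tag for which GDMA finished; not the driver's real type. */
enum dma_src { DMA_SRC_INT, DMA_SRC_EXT };

/*
 * Decide whether the RX-done callback should run for this completion.
 * When the EXT DMA is in use (as with ch8), only its completion
 * triggers the callback, so the data is parsed exactly once.
 */
static bool should_run_rx_callback(bool ext_dma_in_use, enum dma_src done)
{
  assert(done == DMA_SRC_INT || done == DMA_SRC_EXT);

  if (ext_dma_in_use)
    {
      return done == DMA_SRC_EXT;
    }

  return done == DMA_SRC_INT;
}
```

With this rule, an INT completion in ch8 mode is ignored and the EXT completion is the only one that reaches the parser.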

@allen-kim-sec

Hello zhongnuo,

After reading your question, I felt that you might not fully understand the situation, so I am replying again.
The data on the LA is not zero. I am trying to find what sets 64 bytes to zero,
but the Trace32 breakpoint on memory writes is never triggered.
This is challenging because the location where it occurs keeps changing, which makes it very difficult to reproduce reliably.

For now, I believe this PR is safer as it uses a single handler, as I mentioned earlier.

Patterns:

  1. The channels that turn into zero are always ch1, ch2, ch5, and ch6 only.
  2. Channels ch1, ch2, ch5, and ch6 belong to the Prime GDMA buffer.
  3. The range that turns into zero is 64 bytes of the Prime GDMA page.
  4. It happens in the front part of the first DMA page, and the location changes.
  5. Naman and I both reproduced this symptom.
  6. We used "9. loadable_ext_ddr_mems" as the build configuration.

@allen-kim-sec

allen-kim-sec commented Nov 4, 2025

That is not the case in question: the problem is 64 bytes of zeros, so ch1, ch2, ch5, and ch6 should all be zero...
Could we schedule a CC call at a time convenient for you?

@zhongnuo-tang
Contributor Author

Hi @allen-kim-sec,
I am able to reproduce it using 9.loadable_ext_ddr_mems. Let me debug it first.

1. When rxbuff is placed in the struct, it is not 64-byte aligned, which results in a cache coherence issue when using DMA.
2. Change it to a static buffer instead.
@zhongnuo-tang
Contributor Author

Hi @allen-kim-sec,
Could you try again with the new commit? I tested it 400+ times using 9.loadable_ext_ddr_mems and found no 0x0000 data.

@allen-kim-sec

It's fine... clear... working fine.

@allen-kim-sec allen-kim-sec left a comment

The frame data matches and I did not find any trailing zeros. Did the problem just move to another address, or is the defect really fixed?

@zhongnuo-tang
Please attach a tech report.

(*(g_i2sdevice[2])).i2s_rx_buf is at address 0x60145df0.
The problem always occurred in the first page, that is, the first DMA page.
I think it is not an alignment issue... or is that possible?
No breakpoint was hit in Trace32; no one wrote the value 0x0000 into that memory range (the i2s_rx_buf buffer).

@allen-kim-sec allen-kim-sec merged commit ab9c175 into Samsung:TDM Nov 5, 2025
9 of 11 checks passed
@allen-kim-sec

allen-kim-sec commented Nov 7, 2025

@zhongnuo-tang
I will test again with the previous commit.
-> I think the Flat Config is not the cause of this issue; the buffer has just been attached to another memory location.
I am looking for something like a memset that writes 64 bytes of 0.

The DMA page buffer address changed from 0x60145df0 to 0x60115000.
It looks like we are only taking shelter from the rain for a moment.

@zhongnuo-tang
Contributor Author

Hi Allen, I strongly think it is an alignment issue. Our GDMA requires the memory to be 64-byte aligned, which is why we always apply aligned(64) to the buffer. In this case, however, the first page at 0x60145df0 is not aligned when the buffer lives inside the structure, maybe because aligned(64) is used incorrectly in the struct. For a static buffer, aligned(64) ensures the address is aligned.
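
A minimal sketch of one way such a mismatch can arise. It assumes a GCC or Clang style `aligned` attribute and a struct that ends up at a base address which is not itself 64-byte aligned (for example, from a heap allocator); the thread does not confirm how the device struct is allocated. The struct, its fields, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u /* cache line size quoted in this thread */

/* Hypothetical device struct; the field names are illustrative only. */
struct demo_i2s_dev
{
  uint32_t flags;
  uint8_t rx_buf[256] __attribute__((aligned(CACHE_LINE)));
};

/* The attribute fixes the member's offset inside the struct... */
static_assert(offsetof(struct demo_i2s_dev, rx_buf) % CACHE_LINE == 0,
              "rx_buf offset is a multiple of the cache line");

/* Static buffer: the attribute applies to the object's final address. */
static uint8_t g_rx_buf[256] __attribute__((aligned(CACHE_LINE)));

static int is_line_aligned(uintptr_t addr)
{
  return (addr % CACHE_LINE) == 0;
}

/* Address rx_buf would get if the struct were placed at `base`. */
static uintptr_t rx_buf_addr_at(uintptr_t base)
{
  return base + offsetof(struct demo_i2s_dev, rx_buf);
}
```

In this model the member is only as aligned as the base it is placed at: a base that is merely 16-byte aligned yields a misaligned `rx_buf`, while the static buffer is aligned regardless.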

@allen-kim-sec

What I found through repeated Trace32 analysis is that I couldn't catch any instance where 64 bytes were written as 0.
The writing was likely done by DMA.
However, I need further explanation on how the first part starts exactly at 0x60145df0,
which is not aligned to 64 bytes, and why this issue does not occur after the first page.
Upon consideration, 0x60145df0 is not aligned to 64 bytes, as it can be expressed as 25186679 * 64 + 48.
Therefore, I believe DMA processing should occur at either 0x60145dc0 or 0x60145e00,
but the value at 0x60145dc0 has never changed.
I would like to get feedback from the silicon team regarding whether the incorrect address usage in GDMA results in only 64 bytes being written as 0.
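
The address arithmetic above can be checked with a few helpers. This is only a sketch: the helper names are hypothetical and the 64-byte line size is the value quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u

static_assert((CACHE_LINE & (CACHE_LINE - 1u)) == 0,
              "CACHE_LINE must be a power of two");

/* Index of the 64-byte line that contains addr. */
static uint32_t line_index(uint32_t addr)
{
  return addr / CACHE_LINE;
}

/* Byte offset of addr inside its 64-byte line. */
static uint32_t line_offset(uint32_t addr)
{
  return addr & (CACHE_LINE - 1u);
}

/* Nearest line boundary at or below addr. */
static uint32_t align_down(uint32_t addr)
{
  return addr & ~(CACHE_LINE - 1u);
}

/* Nearest line boundary at or above addr. */
static uint32_t align_up(uint32_t addr)
{
  return align_down(addr + CACHE_LINE - 1u);
}
```

For 0x60145df0 these give line index 25186679, offset 48, and the neighbouring boundaries 0x60145dc0 and 0x60145e00; for 0x60115000 the offset is 0.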

@zhongnuo-tang
Contributor Author

Hi @allen-kim-sec,

The main reason for enforcing 64-byte alignment is related to the cache line boundary, not the DMA hardware itself.

After DMA transfers data from the peripheral to memory, the CPU typically performs a cache invalidate to ensure it reads the updated data. However, if the first page is not aligned to a 64-byte cache line, the cache line at the start of the buffer overlaps with unrelated memory.

For example, when the cache line covering the range 0x60145dc0 ~ 0x60145dff is invalidated, the entire 64 bytes are affected, including the start of the DMA buffer at 0x60145df0. Therefore, even if the data at 0x60145dc0 never changes, invalidating this cache line can overwrite part of the DMA buffer (e.g., with 0x0000) if the invalidate happens after the I2S callback performs its own cache maintenance.

In short, the corruption occurs because the first DMA buffer is not 64-byte aligned, causing its initial cache line to overlap unrelated memory. Subsequent pages are aligned and therefore not affected.
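
A rough sketch of the overlap described above: it computes the line-aligned span that a cache maintenance operation on a buffer would cover, and how many bytes of unrelated memory share the buffer's first and last lines. The struct and function names are hypothetical, and the buffer length used in the usage note is an assumption, since the thread does not state the page size.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u

struct line_span
{
  uint32_t start;      /* first byte covered, line aligned */
  uint32_t end;        /* one past the last byte covered, line aligned */
  uint32_t head_extra; /* bytes before the buffer that share its first line */
  uint32_t tail_extra; /* bytes after the buffer that share its last line */
};

/* Span of whole cache lines touched by the buffer [addr, addr + len). */
static struct line_span cache_span(uint32_t addr, uint32_t len)
{
  struct line_span s;

  assert(len > 0);

  s.start = addr & ~(CACHE_LINE - 1u);
  s.end = (addr + len + CACHE_LINE - 1u) & ~(CACHE_LINE - 1u);
  s.head_extra = addr - s.start;
  s.tail_extra = s.end - (addr + len);
  return s;
}
```

For a buffer at 0x60145df0 with an assumed length of 64 bytes, the span runs from 0x60145dc0 to 0x60145e40, with 48 unrelated bytes in the first line and 16 in the last. The same length at 0x60115000 covers exactly one line with no unrelated bytes.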
