Load-from-misaligned-address failures on Hexagon simulator #6306

Open
steven-johnson opened this issue Apr 18, 2024 · 3 comments

Comments

@steven-johnson
Contributor

Several tests fail when built for Hexagon and run under the simulator; the QuRT exit code (0x2001) indicates that the failures are loads from misaligned addresses:

test/qs8_dwconv_minmax_multipass_fp32_test
test/qs8_qc8w_dwconv_minmax_multipass_fp32_test
test/qu8_dwconv_minmax_multipass_fp32_test

(Haven't tested on hardware yet, will update this bug once I've done so.)

@steven-johnson
Contributor Author

Update: these fail in similar ways on a Samsung S22.

@fbarchard
Contributor

8-bit (or 4-bit) weights can cause an alignment issue for the bias and scale, which are 32-bit elements and usually vectors.
dwconv is an igemm. An igemm is a gemm that also has MR pointers embedded in the weights. The packed layout is:
A gemm/igemm has NR int32 bias values
An igemm then has MR pointers
A gemm/igemm has NR*KC weights
A gemm/igemm has NR floats (the scale) and an optional float bias
If the number of weights is odd (more generally, if their total size is not a multiple of 4 bytes), the bias, indirect pointers and scale can be unaligned.
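To make the arithmetic concrete, here is a minimal sketch that computes the byte offsets of each field in one packed block, following the order listed above. The struct and function names are hypothetical and this is not the actual XNNPACK packing code; it only illustrates how an odd number of 8-bit weights pushes the following 32-bit values off a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical description of one packed block. The field order follows the
// comment above; it is an assumption, not the real XNNPACK layout.
struct packed_offsets {
  size_t bias;      // NR int32 bias values
  size_t pointers;  // MR indirection pointers (igemm only)
  size_t weights;   // NR * KC int8 weights
  size_t scale;     // NR float scale values
  size_t end;       // start of the next packed block
};

// Computes the byte offset of each field for a block starting at `start`.
static struct packed_offsets compute_packed_offsets(
    size_t start, size_t mr, size_t nr, size_t kc) {
  struct packed_offsets o;
  o.bias = start;
  o.pointers = o.bias + nr * sizeof(int32_t);
  o.weights = o.pointers + mr * sizeof(void*);
  o.scale = o.weights + nr * kc * sizeof(int8_t);
  o.end = o.scale + nr * sizeof(float);
  return o;
}

// Returns nonzero if `offset` is a multiple of `alignment`.
static int is_offset_aligned(size_t offset, size_t alignment) {
  assert(alignment != 0);
  return offset % alignment == 0;
}
```

With NR = 1 and KC = 3, the scale lands 3 bytes after the weights start and is therefore not 4-byte aligned; the next block's bias inherits the same misalignment because `end` is also off by 3.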

For the packw kernel, Hexagon crashed on the int32 bias when the kernel size was smaller than 4 bytes.
The workaround was #6303,
which added an attribute to the pointers to allow unaligned stores. But that is slow.
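For readers unfamiliar with the technique, the following is a rough sketch of what "an attribute on the pointers" can look like with the GCC/Clang `aligned(1)` type attribute. The typedef and function names are illustrative assumptions and are not taken from #6303; a `memcpy`-based variant is shown as a portable alternative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// GCC/Clang extension: an int32_t type whose required alignment is 1 byte.
// may_alias lets it safely overlay byte buffers. Hypothetical name.
typedef int32_t unaligned_int32_t __attribute__((aligned(1), may_alias));

// Stores a 32-bit value at an address that may not be 4-byte aligned.
// The compiler must assume the worst case, which is why this can be slow.
static void store_int32_unaligned(void* dst, int32_t value) {
  assert(dst != NULL);
  *(unaligned_int32_t*) dst = value;
}

// Loads a 32-bit value from an address that may not be 4-byte aligned.
static int32_t load_int32_unaligned(const void* src) {
  assert(src != NULL);
  return *(const unaligned_int32_t*) src;
}

// Portable alternative using memcpy; compilers typically lower this to the
// same byte-wise or unaligned access sequence.
static void store_int32_memcpy(void* dst, int32_t value) {
  assert(dst != NULL);
  memcpy(dst, &value, sizeof(value));
}
```

Either form trades speed for safety: the target can no longer assume a single aligned word access is legal.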

I think if Hexagon kernels carefully use sizes that are a multiple of at least 4 bytes, we can access float and int values with memw instead of memb.
For vectors it won't always be possible. An IGEMM (or dwconv) has MR pointers before the weights, so MR would need to be large or padded to ensure vector-aligned values.
For gemm it should be possible, if NR is a multiple of the vector size.
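One way to get sizes that are always a multiple of 4 bytes is to round the weight region up before placing the 32-bit values after it. The helpers below are a sketch under that assumption; the names are hypothetical, and the same rounding could be applied with the vector size as the alignment where vector-aligned values are needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Rounds n up to the next multiple of q, where q must be a power of two.
static size_t round_up_po2(size_t n, size_t q) {
  assert(q != 0 && (q & (q - 1)) == 0);
  return (n + q - 1) & ~(q - 1);
}

// Hypothetical helper: bytes occupied by NR * KC int8 weights once padded so
// that whatever follows them starts on an `alignment`-byte boundary.
static size_t padded_weights_bytes(size_t nr, size_t kc, size_t alignment) {
  return round_up_po2(nr * kc * sizeof(int8_t), alignment);
}
```

For example, `padded_weights_bytes(3, 3, sizeof(int32_t))` pads 9 bytes of weights to 12, so the scale that follows can be read with a word load.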

@fbarchard
Contributor

If the multipass kernel specifically has the issue but single pass works, it's likely that the temporary accumulation buffer is not int32 aligned.
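A quick way to test that hypothesis is to check the buffer pointer before the kernel runs. The helper below is a minimal sketch with a hypothetical name; where the check would go in the test harness or operator code is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Returns nonzero if `ptr` is aligned to `alignment` bytes.
// `alignment` must be a power of two.
static int is_aligned_to(const void* ptr, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t) ptr & (alignment - 1)) == 0;
}
```

Calling `assert(is_aligned_to(buffer, sizeof(int32_t)))` on the accumulation buffer right before invoking the multipass kernel would confirm or rule out this cause.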
