Fix template argument order for batch_bool loads by kalenedrael · Pull Request #1181 · xtensor-stack/xsimd

kalenedrael · 2025-10-28T03:59:19Z

`batch_bool<T, A>::load_(un)aligned` dispatches to `kernel::load_(un)aligned<A>`, which requires that the architecture be the first template parameter.

serge-sans-paille · 2025-10-28T07:15:48Z

Hmm, that means this was mostly untested. Would you mind adding some tests in test/test_batch.cpp?

kalenedrael · 2025-10-29T04:01:42Z

I'm not really sure how to test this. We need to check that the "highest" overload is called for a given architecture, but I don't know how to do that without just checking each overload and architecture individually.

DiamonDinoia · 2025-10-29T14:21:36Z

Hi @kalenedrael,

Just my two cents, feel free to ignore if not accurate.
If this works like the other tests, it is enough to call batch_bool<T, A>::load_(un)aligned with the T and A provided by the test class and verify the result against reference answers. These tests are then compiled for a wide variety of archs in CI, which means that for sse2 only the sse2 overload exists and only that arch is tested. For avx, it calls the highest overload, as you said, so avx is tested even if sse is available, and so on.

Cheers,
Marco

kalenedrael · 2025-10-30T03:51:28Z

The test coverage for batch_bool<T, A>::load_(un)aligned seems reasonably complete. The problem is that because the template parameters are swapped, the compiler ignores the architecture-specific overloads for AVX2 and Neon due to incompatible types and falls back to the common implementation. It still compiles and isn't expected to cause any test failures; I only noticed because I was using batch_bool and performance was much worse than I expected.

The dispatching is what I don't know how to test. In general, it seems like you more or less have to hard-code the expected overloads for each architecture, and find a way to disable the other ones so they can't be used as a fallback. This is both convoluted and the worst kind of test, because it's essentially copying the code itself.
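To make the silent fallback described above concrete, here is a minimal toy model of tag-based dispatch. It is only a sketch: the tags, the `load` kernels, the `impl` enum and the `*_dispatch` helpers are hypothetical stand-ins and are not xsimd's actual declarations. It assumes the caller names the architecture explicitly as the first template argument, as in `kernel::load_(un)aligned<A>`.

```cpp
// Toy model: every name below is a hypothetical stand-in, not xsimd code.
struct common {};
struct sse2 : common {};
struct avx2 : sse2 {};

template <class T, class A>
struct batch_bool {};

enum class impl { generic_kernel, avx2_kernel };

namespace buggy
{
    // Fallback: architecture first, reachable from any tag deriving from common.
    template <class A, class T>
    constexpr impl load(bool const*, batch_bool<T, A>, common) { return impl::generic_kernel; }

    // AVX2 overload with the template parameters swapped: element type first.
    template <class T, class A>
    constexpr impl load(bool const*, batch_bool<T, A>, avx2) { return impl::avx2_kernel; }
}

namespace fixed
{
    template <class A, class T>
    constexpr impl load(bool const*, batch_bool<T, A>, common) { return impl::generic_kernel; }

    // Same overload with the architecture as the first template parameter.
    template <class A, class T>
    constexpr impl load(bool const*, batch_bool<T, A>, avx2) { return impl::avx2_kernel; }
}

template <class T, class A>
constexpr impl buggy_dispatch(bool const* mem)
{
    // load<A> binds A to the first template parameter. In the swapped overload
    // that parameter is the element type, so its second function parameter no
    // longer matches batch_bool<T, A>; deduction fails and the overload is
    // dropped without any diagnostic.
    return buggy::load<A>(mem, batch_bool<T, A> {}, A {});
}

template <class T, class A>
constexpr impl fixed_dispatch(bool const* mem)
{
    return fixed::load<A>(mem, batch_bool<T, A> {}, A {});
}

// Both variants compile; only the selected kernel differs.
static_assert(buggy_dispatch<float, avx2>(nullptr) == impl::generic_kernel, "silent fallback");
static_assert(fixed_dispatch<float, avx2>(nullptr) == impl::avx2_kernel, "dedicated kernel selected");
static_assert(fixed_dispatch<float, sse2>(nullptr) == impl::generic_kernel, "no dedicated sse2 kernel in this model");
```

In this model a test that compares loaded values against reference answers passes for both variants, which is consistent with the bug only showing up as a performance difference.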


DiamonDinoia · 2025-11-01T19:11:57Z

Oh yes, that would be hard to test. My previous suggestion missed the point. I apologize.

Then I guess the only way to make this impossible is to use either static_asserts (e.g. with std::is_base_of) or SFINAE on the common implementation, making sure that the types are what they should be.
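One possible reading of this suggestion, sketched on a toy model: the common implementation carries a `static_assert` that rejects architectures expected to have a dedicated kernel, so an accidental fallback becomes a compile error instead of a slowdown. The tags, the `has_dedicated_load` trait, the `impl` enum and `dispatch` are all hypothetical names introduced for illustration; none of them comes from xsimd.

```cpp
#include <type_traits>

// Hypothetical stand-ins; none of these names are taken from xsimd.
struct common {};
struct sse2 : common {};
struct avx2 : sse2 {};

template <class T, class A>
struct batch_bool {};

// Assumed trait: true for architectures expected to provide their own kernel
// (here avx2 and anything deriving from it).
template <class A>
struct has_dedicated_load : std::is_base_of<avx2, A> {};

enum class impl { generic_kernel, avx2_kernel };

namespace kernel
{
    // Common implementation, guarded against being picked for the wrong arch.
    // The assertion is only checked when this body is instantiated, that is,
    // when overload resolution actually selects the fallback.
    template <class A, class T>
    constexpr impl load(bool const*, batch_bool<T, A>, common)
    {
        static_assert(!has_dedicated_load<A>::value,
                      "fallback selected for an architecture that has a dedicated kernel");
        return impl::generic_kernel;
    }

    // Dedicated overload. If its template parameters were swapped, load<avx2>
    // would fall through to the guarded fallback and fail to compile.
    template <class A, class T>
    constexpr impl load(bool const*, batch_bool<T, A>, avx2) { return impl::avx2_kernel; }
}

template <class T, class A>
constexpr impl dispatch(bool const* mem)
{
    return kernel::load<A>(mem, batch_bool<T, A> {}, A {});
}

static_assert(dispatch<float, sse2>(nullptr) == impl::generic_kernel, "sse2 may use the fallback");
static_assert(dispatch<float, avx2>(nullptr) == impl::avx2_kernel, "avx2 must use its own kernel");
```

The trait is still a hand-maintained list of which architectures have dedicated kernels, so it duplicates part of the dispatch table; the difference is that a mismatch fails at compile time rather than going unnoticed.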

Fix template argument order for batch_bool loads

8705787

`batch_bool<T, A>::load_(un)aligned` dispatches to `kernel::load_(un)aligned<A>`, which requires that the architecture be the first template parameter.

kalenedrael force-pushed the booldispatch branch from 5e3f2de to 8705787 on November 1, 2025 06:39

remove unneeded overload

84a07b4

serge-sans-paille merged commit 21d9634 into xtensor-stack:master Nov 3, 2025
57 of 59 checks passed

serge-sans-paille merged 2 commits into xtensor-stack:master from kalenedrael:booldispatch

kalenedrael commented Oct 28, 2025

Uh oh!

serge-sans-paille commented Oct 28, 2025

Uh oh!

kalenedrael commented Oct 29, 2025

Uh oh!

DiamonDinoia commented Oct 29, 2025

Uh oh!

kalenedrael commented Oct 30, 2025

Uh oh!

DiamonDinoia commented Nov 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants
