Different instructions may be generated for different immediates #1

Closed
nemequ opened this issue Jun 14, 2021 · 1 comment

Comments

@nemequ
Owner

nemequ commented Jun 14, 2021

For instructions with immediate-mode parameters, the compiler may generate different instruction(s) depending on the value of the immediate. For example,

#include <stdint.h>

typedef int32_t i32x4 __attribute__((__vector_size__(16)));

i32x4
replace_lane_0(i32x4 v, int32_t value) {
    v[0] = value;
    return v;
}

i32x4
replace_lane_1(i32x4 v, int32_t value) {
    v[1] = value;
    return v;
}

Compiler Explorer: https://godbolt.org/z/9beqbP4Wj

Generates:

replace_lane_0(int __vector(4), int):               # @replace_lane_0(int __vector(4), int)
        movd    xmm1, edi
        movss   xmm0, xmm1                      # xmm0 = xmm1[0],xmm0[1,2,3]
        ret
replace_lane_1(int __vector(4), int):               # @replace_lane_1(int __vector(4), int)
        movd    xmm1, edi
        punpcklqdq      xmm1, xmm0              # xmm1 = xmm1[0],xmm0[0]
        shufps  xmm1, xmm0, 226                 # xmm1 = xmm1[2,0],xmm0[2,3]
        movaps  xmm0, xmm1
        ret

I haven't worked out all the details yet, but for most instructions I think it will be possible to work around this by generating code for every possible immediate value. With the exception of shuffle, the worst case is only 16 values. We'll have to do some work parsing the assembly to deduplicate the results, but nothing too complex.
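As a rough illustration of "generating all possible values", here is a minimal sketch for the replace-lane example above, which has only four possible lane indices. The `DEFINE_REPLACE_LANE` macro is a hypothetical name made up for this sketch and is not taken from the project; it assumes a compiler with GCC/Clang vector extensions, as in the original snippet.

```c
#include <stdint.h>

typedef int32_t i32x4 __attribute__((__vector_size__(16)));

/* Hypothetical helper macro (not from the project): defines
 * replace_lane_<N>, which overwrites lane N of v with value. */
#define DEFINE_REPLACE_LANE(N)                        \
    i32x4 replace_lane_##N(i32x4 v, int32_t value) {  \
        v[N] = value;                                 \
        return v;                                     \
    }

/* One function per possible immediate, so the compiler output for each
 * lane index can be inspected and compared separately. */
DEFINE_REPLACE_LANE(0)
DEFINE_REPLACE_LANE(1)
DEFINE_REPLACE_LANE(2)
DEFINE_REPLACE_LANE(3)
```

Compiling this gives one assembly listing per lane index, which is the input the deduplication step would need.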

Shuffle is obviously the exception here, but that's nothing new. For that, I think the best option is just to link to Compiler Explorer and let people plug in their own values.

@nemequ
Owner Author

nemequ commented Jun 17, 2021

I think this is basically resolved by 2be4bca. We now try all possible values for the immediates, and merge implementations where the output differs only by the value of an immediate parameter in the assembly. The code is a mess, but it seems to basically work.
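To make the merging idea concrete, here is a minimal sketch of one way to compare two assembly lines while ignoring immediate values. It is not the code from 2be4bca; `normalize_immediates` and `same_modulo_immediates` are hypothetical names, and the sketch assumes Intel-syntax output with `#` comments, as in the listing above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy one assembly line into out, dropping any
 * trailing '#' comment and replacing each standalone numeric token with
 * the placeholder "IMM". Digits inside identifiers such as "xmm1" are
 * kept, because they follow an identifier character. */
void normalize_immediates(const char *line, char *out, size_t out_size) {
    size_t o = 0;
    size_t i = 0;
    int prev_ident = 0; /* previous input char was alphanumeric or '_' */

    if (out_size == 0) return;

    while (line[i] != '\0' && line[i] != '#' && o + 1 < out_size) {
        unsigned char c = (unsigned char)line[i];
        if (isdigit(c) && !prev_ident) {
            /* Consume the whole numeric token (decimal or 0x-prefixed hex). */
            while (isalnum((unsigned char)line[i])) i++;
            const char *placeholder = "IMM";
            for (size_t k = 0; placeholder[k] != '\0' && o + 1 < out_size; k++)
                out[o++] = placeholder[k];
            prev_ident = 1;
        } else {
            out[o++] = (char)c;
            prev_ident = isalnum(c) || c == '_';
            i++;
        }
    }
    /* Trim trailing whitespace left over from the removed comment. */
    while (o > 0 && isspace((unsigned char)out[o - 1])) o--;
    out[o] = '\0';
}

/* Returns 1 if the two lines are identical once immediates are masked. */
int same_modulo_immediates(const char *a, const char *b) {
    char na[256], nb[256];
    normalize_immediates(a, na, sizeof na);
    normalize_immediates(b, nb, sizeof nb);
    return strcmp(na, nb) == 0;
}
```

With this, `shufps xmm1, xmm0, 226` and `shufps xmm1, xmm0, 78` compare equal, while `movss` and `movaps` lines do not, so only implementations that genuinely use different instructions would be listed separately.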

@nemequ nemequ closed this as completed Jun 17, 2021