SIMD Helpers #249

Open
ubitux opened this issue Mar 22, 2019 · 4 comments

Comments

@ubitux
ubitux commented Mar 22, 2019

Are there any SIMD helpers for the decompiler?

Here is an intrinsics example:

#include <immintrin.h>

__m128 intrinsics(__m128 src, __m128 x, __m128 y)
{
    __m128 s = _mm_shuffle_ps(x, y, _MM_SHUFFLE(0, 1, 2, 3));
    return _mm_add_ps(src, s);
}

This compiles to the following 3 instructions:

        00100000 0f c6 ca 1b     SHUFPS     XMM1,XMM2,0x1b
        00100004 0f 58 c1        ADDPS      XMM0,XMM1
        00100007 c3              RET

But the decompiler doesn't like it at all:

undefined  [16] intrinsics(undefined auParm1 [16],undefined auParm2 [16],undefined auParm3 [16])
{
    return CONCAT412(SUB164(auParm1 >> 0x60,0) + SUB164(auParm3,0),
                     CONCAT48(SUB164(auParm1 >> 0x40,0) + SUB164(auParm3 >> 0x20,0),
                              CONCAT44(SUB164(auParm1 >> 0x20,0) + SUB164(auParm2 >> 0x40,0),
                                       SUB164(auParm1,0) + SUB164(auParm2 >> 0x60,0))));
}
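
For reference, the lane-by-lane arithmetic that this CONCAT/SUB expression encodes can be written as plain scalar C. This is only a sketch for reading the output, not something Ghidra produces: `vec4f`, `shufps_model`, and `intrinsics_scalar` are hypothetical names, and lane 0 is assumed to be the least significant 32 bits of the register.

```c
/* Hypothetical stand-in for __m128: four packed floats, lane 0 lowest. */
typedef struct { float lane[4]; } vec4f;

/* Scalar model of SHUFPS a, b, imm8: the two low result lanes are
 * selected from a, the two high result lanes from b, using 2-bit
 * fields of the immediate. */
static vec4f shufps_model(vec4f a, vec4f b, unsigned imm8)
{
    vec4f r;
    r.lane[0] = a.lane[imm8 & 3];
    r.lane[1] = a.lane[(imm8 >> 2) & 3];
    r.lane[2] = b.lane[(imm8 >> 4) & 3];
    r.lane[3] = b.lane[(imm8 >> 6) & 3];
    return r;
}

/* Scalar model of the whole function: SHUFPS with 0x1b, then ADDPS. */
static vec4f intrinsics_scalar(vec4f src, vec4f x, vec4f y)
{
    vec4f s = shufps_model(x, y, 0x1b);
    vec4f r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = src.lane[i] + s.lane[i];
    return r;
}
```

With the immediate 0x1b, the result is { src[0] + x[3], src[1] + x[2], src[2] + y[1], src[3] + y[0] }, which matches the four additions in the decompiler output.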

I wasn't able to hint the decompiler about a native __m128 type.

This is a very simple case, but in most multimedia-related code it leads to an insane amount of garbage. Combine this with compiler auto-vectorization and you can give up on using the decompiler for the whole function.

@ubitux added the "Type: Question" label Mar 22, 2019
@MagnificentS
I really hope this gets fixed soon. It makes it hard to work on games or anything else that uses a lot of floats.

@ubitux

ubitux commented Dec 23, 2019

Note that on AArch64, while far from perfect, the NEON instructions are typically recognized as special SIMD code, and you end up with output such as:

auVar7 = SIMD_INT_SEXT(uVar21,2);
auVar5 = SIMD_INT_MULT(auVar5,auVar7,4);
auVar6 = SIMD_INT_ADD(auVar4,auVar5,4);

This is already much better than what you get on x86, even though other limitations arise (such as confusion between primitive types due to their sizes, and other possibly unrelated issues that drive Ghidra crazy).
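
To make the AArch64 output easier to read, here is a rough C model of two of those pseudo-ops. It rests on an assumption that is not confirmed in this thread: that the trailing argument (4) is the lane size in bytes, so a 16-byte register is treated as four 32-bit lanes. The names `vec4u32`, `simd_int_add_model`, and `simd_int_mult_model` are hypothetical and do not exist in Ghidra.

```c
#include <stdint.h>

/* Hypothetical 16-byte register viewed as four 32-bit integer lanes. */
typedef struct { uint32_t lane[4]; } vec4u32;

/* Assumed meaning of SIMD_INT_ADD(a, b, 4): lane-wise wrapping add. */
static vec4u32 simd_int_add_model(vec4u32 a, vec4u32 b)
{
    vec4u32 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];  /* wraps modulo 2^32 */
    return r;
}

/* Assumed meaning of SIMD_INT_MULT(a, b, 4): lane-wise wrapping multiply. */
static vec4u32 simd_int_mult_model(vec4u32 a, vec4u32 b)
{
    vec4u32 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] * b.lane[i];  /* keeps the low 32 bits */
    return r;
}
```

Under that assumption, the last two lines of the snippet would amount to a lane-wise multiply-accumulate: auVar6 = auVar4 + auVar5 * auVar7.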

If someone is willing to implement this correctly, they may be interested in test material. For that I would suggest projects that make heavy use of intrinsics. libvpx is one of them (git grep '[ie]mmintrin.h' to identify the relevant objects), and then maybe x264, which has pure SIMD assembly, as a stress test.

@Coder-256
I ran into this same issue with ARM: the decompiler completely failed to understand a function containing some very basic VFP instructions (vstr, vmov, vcvt, etc.). It would be really nice to see some basic support for these.

@Danil6969
Danil6969 commented Sep 16, 2020

For now there is no devectorization of any kind beyond four functions/macros that manipulate arrays of undefined (aka uchar): CONCAT, SUB, SEXT, and ZEXT. Any vector instruction emitted as C is restricted to these; no pointer arithmetic over reads/writes of varnodes is permitted (an indirect access to a register through its address is actually transformed into those functions after heritage or thereabouts). Allowing pointer arithmetic would require a serious rewrite of the decompiler engine. There are even more restrictions once memory is dereferenced as, e.g., an array of 16 or more bytes (such pointers have type "undefined (*) [16]"): the values must be written/stored right after they have been filled in via temporary registers.
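
As an illustration of how two of these helpers read, here is a scaled-down sketch in C. It assumes the semantics described in Ghidra's decompiler documentation: CONCATxy(a, b) places the x-byte value a above the y-byte value b, and SUBxy(v, c) drops the c least significant bytes of an x-byte value v and keeps the next y bytes. The names `concat44` and `sub84` are hypothetical stand-ins, reduced to 64 bits so that they fit a native integer type.

```c
#include <stdint.h>

/* Model of CONCAT44(hi, lo): hi becomes the most significant 4 bytes,
 * lo the least significant 4 bytes of an 8-byte result. */
static uint64_t concat44(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Model of SUB84(v, c): discard the c least significant bytes of the
 * 8-byte value v, then keep the low 4 bytes of what remains.
 * Only meaningful for c in the range 0..4. */
static uint32_t sub84(uint64_t v, unsigned c)
{
    return (uint32_t)(v >> (8 * c));
}
```

Read this way, SUB164(auParm1 >> 0x60,0) in the first post takes the low 4 bytes of auParm1 shifted right by 96 bits, that is, the top 32-bit lane of auParm1.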

@ryanmkurtz removed the "Type: Question" label Aug 13, 2021