cmake: add support for Intel compiler [SDL2] #7520

madebr · 2023-03-23T06:38:49Z

This is the SDL2 equivalent of #7516 (minus the intrinsics patches).

With these patches, SDL2 can be built on Linux and Windows using both the classic and the new LLVM-based Intel toolchains.

The byteswap patch reroutes the intrinsic _byteswap_u(short|long|int64) calls to libc functions.
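For illustration only, here is a minimal sketch of byte swapping that does not touch any compiler intrinsic. It uses plain shifts and masks rather than the libc functions the patch reroutes to, and the names `swap16`, `swap32`, `swap64` and `check_roundtrip` are hypothetical, not SDL's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not SDL's names): byte swaps built from shifts and
   masks only, so no compiler intrinsic is involved. */
static uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

static uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
}

static uint64_t swap64(uint64_t x)
{
    uint32_t lo = (uint32_t)(x & 0xFFFFFFFFu);
    uint32_t hi = (uint32_t)(x >> 32);
    /* The swapped low half becomes the new high half, and vice versa. */
    return ((uint64_t)swap32(lo) << 32) | swap32(hi);
}

/* Usage: swapping twice must give back the original value. */
static void check_roundtrip(uint64_t x)
{
    assert(swap16(swap16((uint16_t)x)) == (uint16_t)x);
    assert(swap32(swap32((uint32_t)x)) == (uint32_t)x);
    assert(swap64(swap64(x)) == x);
}
```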

The /Gs argument controls the number of bytes that local variables can occupy before a stack probe is initiated. By setting it to a huge value, no calls to __chkstk are inserted. This change is needed for the classic Intel C compiler to build SDL with -DSDL_LIBC=OFF.

The classic Intel compiler does not clear the ecx register prior to executing the cpuid opcode, so __cpuidex is used instead of __cpuid.
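A rough sketch of the idea, not the code from the patch: always pass an explicit subleaf so that ECX has a defined value when cpuid runs. The wrapper name `query_cpuid` and the `HAVE_*` macros are hypothetical. The GCC/Clang branch assumes `<cpuid.h>` provides `__get_cpuid_count`, and on non-x86 targets the wrapper simply reports failure.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_CPUIDEX 1            /* hypothetical feature macro */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_GET_CPUID_COUNT 1    /* hypothetical feature macro */
#endif

/* Hypothetical wrapper: query a CPUID leaf with an explicit subleaf.
   Returns 1 on success, 0 if the leaf or the instruction is unavailable.
   regs receives EAX, EBX, ECX, EDX in that order. */
static int query_cpuid(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    assert(regs != NULL);
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
#if defined(HAVE_CPUIDEX)
    {
        int info[4];
        __cpuidex(info, (int)leaf, (int)subleaf); /* subleaf goes into ECX */
        regs[0] = (unsigned)info[0];
        regs[1] = (unsigned)info[1];
        regs[2] = (unsigned)info[2];
        regs[3] = (unsigned)info[3];
    }
    return 1;
#elif defined(HAVE_GET_CPUID_COUNT)
    return __get_cpuid_count(leaf, subleaf,
                             &regs[0], &regs[1], &regs[2], &regs[3]);
#else
    (void)leaf;
    (void)subleaf;
    return 0;
#endif
}
```

Leaves that ignore ECX are unaffected by the extra argument; leaves with subleaves (leaf 7, for example) depend on it.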

Fixes this warning: warning: comparison with infinity always evaluates to false in fast floating point modes [-Wtautological-constant-compare]
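As a hedged illustration of avoiding the equality test, the sketch below classifies the value instead of comparing it with `INFINITY`. The helper names are hypothetical, and whether the patch uses `fpclassify` or another check is an assumption here. Note also that in fast floating point modes the compiler is allowed to assume infinities never occur, so even a classification check may not be reliable there.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: classify the value rather than writing
   `v == INFINITY`, which is what triggers the warning. */
static int is_infinity(double v)
{
    return fpclassify(v) == FP_INFINITE;
}

/* Usage: the kind of check a test could make on a result that is
   expected to be positive infinity. */
static void expect_positive_infinity(double result)
{
    assert(is_infinity(result) && result > 0.0);
}
```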

The new LLVM-based Intel compiler needs the /Q_no-use-libirc flag when building a no-libc SDL2 in release mode.

The classic Intel compiler generates calls to _intel_fast_memcpy and _intel_fast_memset when building the SDL library with SDL_LIBC=OFF, so SDL has to provide them.
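A minimal sketch of what providing these symbols can look like, assuming they take the same arguments as `memcpy` and `memset` (that signature is an assumption, as are the stand-in names `my_memcpy` and `my_memset`; SDL would forward to its own routines instead).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for a library's own libc-free routines (hypothetical names). */
static void *my_memcpy(void *dst, const void *src, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    while (len--) {
        *d++ = *s++;
    }
    return dst;
}

static void *my_memset(void *dst, int c, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    while (len--) {
        *d++ = (unsigned char)c;
    }
    return dst;
}

/* Definitions for the symbols the compiler emits calls to. The signatures
   are assumed to mirror memcpy/memset. */
void *_intel_fast_memcpy(void *dst, const void *src, size_t len)
{
    return my_memcpy(dst, src, len);
}

void *_intel_fast_memset(void *dst, int c, size_t len)
{
    return my_memset(dst, c, len);
}

/* Usage: both entry points behave like their libc counterparts. */
static void example_usage(void)
{
    char src[4] = { 'S', 'D', 'L', '\0' };
    char dst[4];

    assert(_intel_fast_memcpy(dst, src, sizeof(src)) == dst);
    assert(dst[0] == 'S' && dst[2] == 'L' && dst[3] == '\0');
    assert(_intel_fast_memset(dst, 'x', 2) == dst);
    assert(dst[0] == 'x' && dst[1] == 'x' && dst[2] == 'L');
}
```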

SDL_BlitCopyMMX ends with _mm_empty(), so the MMX state should be emptied. (_mm_empty is the intrinsic function for emms)

sezero · 2023-03-24T08:07:13Z

Are _intel_fast_(memcpy|memset) not needed on the SDL3 side?

madebr · 2023-03-24T11:44:55Z

Yes, they are. That's an oversight. Thanks for noticing!

sezero · 2023-03-26T12:55:22Z

What else is left for this -- and for the SDL3 version? Are all the differences between the two resolved?

madebr · 2023-03-26T13:47:20Z

Both branches build with both classic and llvm intel compilers on Linux and Windows.
The classic Intel compiler still emits warnings on Windows, but I think these are incorrect. They are not silenced.

e.g. at this location, the following warning appears:

C:\Users\maarten\source\repos\SDL\src\render\software\SDL_blendfillrect.c(197): warning #592: variable "sa" is used before its value is set
              FILLRECT(Uint32, DRAW_SETPIXEL_MUL_RGBA);

But when I expand it, sa is guaranteed to be set.

Expansion of `FILLRECT(Uint32, DRAW_SETPIXEL_MUL_RGBA)`:

Declared in: SDL_draw.h
Definition:
#define FILLRECT(type, op)                                             \
    do {                                                               \
        int width = rect->w;                                           \
        int height = rect->h;                                          \
        int pitch = (dst->pitch / dst->format->BytesPerPixel);         \
        int skip = pitch - width;                                      \
        type *pixel = (type *)dst->pixels + rect->y * pitch + rect->x; \
        while (height--) {                                             \
            {                                                          \
                int n = (width + 3) / 4;                               \
                switch (width & 3) {                                   \
                case 0:                                                \
                    do {                                               \
                        op;                                            \
                        pixel++;                                       \
                        SDL_FALLTHROUGH;                               \
                    case 3:                                            \
                        op;                                            \
                        pixel++;                                       \
                        SDL_FALLTHROUGH;                               \
                    case 2:                                            \
                        op;                                            \
                        pixel++;                                       \
                        SDL_FALLTHROUGH;                               \
                    case 1:                                            \
                        op;                                            \
                        pixel++;                                       \
                    } while (--n > 0);                                 \
                }                                                      \
            }                                                          \
            pixel += skip;                                             \
        }                                                              \
    } while (0)
Replacement:
do {
    int width = rect->w;
    int height = rect->h;
    int pitch = (dst->pitch / dst->format->BytesPerPixel);
    int skip = pitch - width;
    Uint32 *pixel = (Uint32 *)dst->pixels + rect->y * pitch + rect->x;
    while (height--) {
        {
            int n = (width + 3) / 4;
            switch (width & 3) {
            case 0:
                do {
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                    SDL_FALLTHROUGH;
                case 3:
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                    SDL_FALLTHROUGH;
                case 2:
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                    SDL_FALLTHROUGH;
                case 1:
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                } while (--n > 0);
            }
        }
        pixel += skip;
    }
} while (0)
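The control flow of the expansion above is easier to follow in a reduced form. The sketch below keeps only the unrolled switch-into-loop shape and replaces the blend operation with a plain store; `fill_row` is a hypothetical name, and the early return for an empty row is an addition of this sketch, not something FILLRECT does. Whichever case label the switch jumps to, execution starts at the beginning of a complete copy of the operation, which is why each copy's own declarations and assignments run before its result is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of the FILLRECT inner loop: write `color` into
   `width` consecutive pixels, four per loop iteration. */
static void fill_row(uint32_t *pixel, size_t width, uint32_t color)
{
    size_t n;

    assert(pixel != NULL);
    if (width == 0) {
        return; /* the unrolled loop assumes at least one pixel */
    }
    n = (width + 3) / 4;      /* number of passes through the do-loop */
    switch (width & 3) {      /* enter the loop body part-way on the first pass */
    case 0:
        do {
            *pixel++ = color;
            /* fall through */
    case 3:
            *pixel++ = color;
            /* fall through */
    case 2:
            *pixel++ = color;
            /* fall through */
    case 1:
            *pixel++ = color;
        } while (--n > 0);
    }
}
```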

It also keeps emitting the EMMS warning, that got fixed in the latest commit.

sezero · 2023-03-26T13:55:08Z

> But when I expand it, sa is guaranteed to be set.

I think it's emitting a false warning because of (void)sa; and it should be ignored.
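If silencing the diagnostic were ever wanted instead of ignoring it, one option is to place the cast after the assignment. This is a sketch under that assumption and not something this PR does; `UNPACK_ARGB` and `red_channel` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: unpack the red and alpha bytes of an ARGB8888 pixel. */
#define UNPACK_ARGB(pixel, sr, sa)        \
    do {                                  \
        sr = ((pixel) >> 16) & 0xFFu;     \
        sa = ((pixel) >> 24) & 0xFFu;     \
    } while (0)

static unsigned red_channel(uint32_t pixel)
{
    unsigned sr, sa;

    UNPACK_ARGB(pixel, sr, sa);
    (void)sa; /* sa is unpacked but unused here; the cast follows the assignment */
    assert(sr <= 0xFFu);
    return sr;
}
```

The cast still marks `sa` as used for compilers that warn about set-but-unused variables, but it no longer textually precedes the first assignment.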


madebr added 10 commits March 23, 2023 04:35

byteswap: Don't use intrinsic byteswap functions with Intel C compiler

2f61d3f

It reroutes the intrinsic _byteswap_u(short|long|int64) calls to libc functions.

cpuinfo: use __cpuidex instead of __cpuid

db492be

The classic Intel Compiler does not clear the ecx register prior to executing the cpuid opcode.

cmake: new LLVM based Intel compiler does not recognize MSVC's /MP

04bd6de

testautomation_math: avoid equality tests with INFINITY

42dcf54

Fixes this warning: warning: comparison with infinity always evaluates to false in fast floating point modes [-Wtautological-constant-compare]

cmake: add support for building with Intel C compiler

3d0fb7f

ci: test with (old) Intel compiler + (new) oneAPI compiler

4162bee

cmake: add /Q_no-use-libirc flag when building a no-libc library

a0bcd1a

The new LLVM-based Intel compiler needs this when building SDL2 in release mode.

Implement _intel_fast_(memcpy|memset)

26fd754

The classic Intel compiler generates calls to these functions when building the SDL library with SDL_LIBC=OFF.

cmake: disable warnings in libm + warning about EMMS instruction

7a884c7

SDL_BlitCopyMMX ends with _mm_empty(), so the MMX state should be emptied. (_mm_empty is the intrinsic function for emms)

madebr mentioned this pull request Mar 23, 2023

[Intel C++] error : undefined symbol: __declspec(dllimport) ldexp #7510

Closed

madebr force-pushed the intelcc-SDL2 branch from 2fbee9b to fd5cbb7 Compare March 26, 2023 13:32

SDL_blit_copy: Don't call potentially FPU using SDL_memcpy in SDL_memcpyMMX

fd5cbb7

madebr merged commit cd64e0b into libsdl-org:SDL2 Mar 27, 2023
37 checks passed

madebr deleted the intelcc-SDL2 branch March 27, 2023 06:13
