cmake: add support for Intel compiler [SDL2] #7520

Merged
merged 11 commits into from Mar 27, 2023

@madebr (Contributor) commented Mar 23, 2023

This is the SDL2 equivalent of #7516 (minus the intrinsics patches).

With these patches, SDL2 can be built on Linux and Windows using both the classic and the new LLVM-based Intel toolchains.

It reroutes the intrinsic _byteswap_u(short|long|int64) calls to libc functions.
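
For illustration, when the intrinsic cannot be relied on, a byte swap can be written in plain C with shifts and masks. This is a minimal sketch only: the names `swap16`, `swap32` and `swap64` are hypothetical and are not SDL's, and this is not necessarily what the patch does.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only (not SDL's API). */

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
}

/* Reverse the eight bytes of a 64-bit value by swapping each half
 * and exchanging the halves. */
static uint64_t swap64(uint64_t x)
{
    uint32_t lo = (uint32_t)(x & 0xFFFFFFFFu);
    uint32_t hi = (uint32_t)(x >> 32);
    return ((uint64_t)swap32(lo) << 32) | swap32(hi);
}
```

Because these helpers use only integer operations, they do not depend on a compiler intrinsic or on a libc symbol.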
The /Gs argument controls the number of bytes that local variables
can occupy before a stack probe is initiated.
By setting it to a huge value, no calls to __chkstk are inserted.

This change is needed for the classic Intel C compiler to build SDL
with -DSDL_LIBC=OFF.
The classic Intel Compiler does not clear the ecx register prior
to executing the cpuid opcode.
Fixes this warning:
 warning: comparison with infinity always evaluates to false in fast floating point modes [-Wtautological-constant-compare]
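
One way to avoid comparing against infinity directly is to inspect the bit pattern of the value. The sketch below assumes `float` is a 32-bit IEEE 754 value; `is_inf_bits` and `float_from_bits` are hypothetical names, and this is not necessarily how the patch silences the warning.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, assuming float is IEEE 754 binary32. */

/* Build a float from its raw 32-bit representation. */
static float float_from_bits(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* Return 1 if x is +infinity or -infinity, 0 otherwise.
 * Infinity has all exponent bits set and a zero mantissa, so after
 * masking off the sign bit the pattern is exactly 0x7F800000. */
static int is_inf_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof(bits));
    return (bits & 0x7FFFFFFFu) == 0x7F800000u;
}
```

Since the test is done on integers, it does not rely on a floating-point comparison that a fast floating-point mode may assume to be always false.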
The new Intel LLVM toolchain needs this when building SDL2
in release mode.
The classic Intel compiler generates calls to these functions when
building the SDL library with SDL_LIBC=OFF.
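
As a rough illustration of what libc-free replacements for such calls can look like, here is a sketch of byte-wise copy and fill routines. The names `fallback_memcpy` and `fallback_memset` are hypothetical; the symbols the patch actually provides and their implementation are not shown here.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */

/* Copy len bytes from src to dst (regions must not overlap). */
static void *fallback_memcpy(void *dst, const void *src, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    while (len--) {
        *d++ = *s++;
    }
    return dst;
}

/* Set len bytes at dst to the byte value c. */
static void *fallback_memset(void *dst, int c, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    while (len--) {
        *d++ = (unsigned char)c;
    }
    return dst;
}
```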
SDL_BlitCopyMMX ends with _mm_empty(), so the MMX state should be emptied.

(_mm_empty is the intrinsic function for emms)

@sezero (Contributor) commented Mar 24, 2023

Are _intel_fast_(memcpy|memset) not needed on the SDL3 side?

@madebr (Contributor, Author) commented Mar 24, 2023

Yes, they are needed there too. That was an oversight. Thanks for noticing!

@sezero (Contributor) commented Mar 26, 2023

What else is left for this -- and for the SDL3 version? Are all the differences between the two resolved?

@madebr (Contributor, Author) commented Mar 26, 2023

Both branches build with both the classic and the LLVM-based Intel compilers on Linux and Windows.
The classic Intel compiler still emits warnings on Windows, but I think these are incorrect. They are not silenced.

e.g. at this location, the following warning appears:

C:\Users\maarten\source\repos\SDL\src\render\software\SDL_blendfillrect.c(197): warning #592: variable "sa" is used before its value is set
              FILLRECT(Uint32, DRAW_SETPIXEL_MUL_RGBA);

But when I expand it, sa is guaranteed to be set.

Expansion of `FILLRECT(Uint32, DRAW_SETPIXEL_MUL_RGBA)` (declared in SDL_draw.h).
Definition:
#define FILLRECT(type, op)                                             \
    do {                                                               \
        int width = rect->w;                                           \
        int height = rect->h;                                          \
        int pitch = (dst->pitch / dst->format->BytesPerPixel);         \
        int skip = pitch - width;                                      \
        type *pixel = (type *)dst->pixels + rect->y * pitch + rect->x; \
        while (height--) {                                             \
            {                                                          \
                int n = (width + 3) / 4;                               \
                switch (width & 3) {                                   \
                case 0:                                                \
                    do {                                               \
                        op;                                            \
                        pixel++;                                       \
                        SDL_FALLTHROUGH;                               \
                    case 3:                                            \
                        op;                                            \
                        pixel++;                                       \
                        SDL_FALLTHROUGH;                               \
                    case 2:                                            \
                        op;                                            \
                        pixel++;                                       \
                        SDL_FALLTHROUGH;                               \
                    case 1:                                            \
                        op;                                            \
                        pixel++;                                       \
                    } while (--n > 0);                                 \
                }                                                      \
            }                                                          \
            pixel += skip;                                             \
        }                                                              \
    } while (0)
Replacement:
do {
    int width = rect->w;
    int height = rect->h;
    int pitch = (dst->pitch / dst->format->BytesPerPixel);
    int skip = pitch - width;
    Uint32 *pixel = (Uint32 *)dst->pixels + rect->y * pitch + rect->x;
    while (height--) {
        {
            int n = (width + 3) / 4;
            switch (width & 3) {
            case 0:
                do {
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                    SDL_FALLTHROUGH;
                case 3:
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                    SDL_FALLTHROUGH;
                case 2:
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                    SDL_FALLTHROUGH;
                case 1:
                    do {
                        unsigned sr, sg, sb, sa;
                        (void)sa;
                        {
                            sr = SDL_expand_byte[fmt->Rloss][((*pixel & fmt->Rmask) >> fmt->Rshift)];
                            sg = SDL_expand_byte[fmt->Gloss][((*pixel & fmt->Gmask) >> fmt->Gshift)];
                            sb = SDL_expand_byte[fmt->Bloss][((*pixel & fmt->Bmask) >> fmt->Bshift)];
                            sa = SDL_expand_byte[fmt->Aloss][((*pixel & fmt->Amask) >> fmt->Ashift)];
                        };
                        sr = (((unsigned)(sr) * (r)) / 255) + (((unsigned)(inva) * (sr)) / 255);
                        if (sr > 0xff)
                            sr = 0xff;
                        sg = (((unsigned)(sg) * (g)) / 255) + (((unsigned)(inva) * (sg)) / 255);
                        if (sg > 0xff)
                            sg = 0xff;
                        sb = (((unsigned)(sb) * (b)) / 255) + (((unsigned)(inva) * (sb)) / 255);
                        if (sb > 0xff)
                            sb = 0xff;
                        {
                            *pixel = ((sr >> fmt->Rloss) << fmt->Rshift) | ((sg >> fmt->Gloss) << fmt->Gshift) | ((sb >> fmt->Bloss) << fmt->Bshift) | ((sa >> fmt->Aloss) << fmt->Ashift);
                        };
                    } while (0);
                    pixel++;
                } while (--n > 0);
            }
        }
        pixel += skip;
    }
} while (0)

It also keeps emitting the EMMS warning, which was addressed in the latest commit.

@sezero (Contributor) commented Mar 26, 2023

> But when I expand it, sa is guaranteed to be set.

I think it's a false warning triggered by `(void)sa;`, and it should be ignored.
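
For readers unfamiliar with the loop shape in the FILLRECT expansion above: it is an instance of Duff's device, where a `switch` jumps into the middle of an unrolled `do`/`while` loop to handle the remainder first. Here is a minimal standalone sketch, with a hypothetical `fill_row` helper that is not part of SDL and a plain store standing in for the `op` blend step.

```c
#include <stdint.h>

/* Hypothetical helper, not part of SDL: store `value` into `width`
 * consecutive pixels, unrolled four at a time (Duff's device). */
static void fill_row(uint32_t *pixel, int width, uint32_t value)
{
    int n;

    /* Guard added for this sketch: with width == 0 the switch would
     * enter at case 0 and still write four pixels. */
    if (width <= 0) {
        return;
    }

    n = (width + 3) / 4;  /* number of passes through the loop body */
    switch (width & 3) {  /* enter the body at the remainder */
    case 0:
        do {
            *pixel++ = value;
            /* fall through */
    case 3:
            *pixel++ = value;
            /* fall through */
    case 2:
            *pixel++ = value;
            /* fall through */
    case 1:
            *pixel++ = value;
        } while (--n > 0);
    }
}
```

In this shape every entry point runs a complete copy of the operation. That matches the expansion above, where each `case` declares its own `sa` and assigns it before it is read.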

@madebr madebr merged commit cd64e0b into libsdl-org:SDL2 Mar 27, 2023
37 checks passed
@madebr madebr deleted the intelcc-SDL2 branch March 27, 2023 06:13