Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shifts and rotates - count operand extending #206

Closed
michalsc opened this issue Aug 5, 2023 · 11 comments
Closed

Shifts and rotates - count operand extending #206

michalsc opened this issue Aug 5, 2023 · 11 comments
Labels
close pending the ticket will be closed soon, whether fixed or not enhancement please test

Comments

@michalsc
Copy link

michalsc commented Aug 5, 2023

This is actually not related to Your changes to gcc, this is a generic gcc issue when generating m68k code. In case of shifts and rotations where count is given by register, gcc always extends the count to 32-bit operand, generating unnecessary code. The shift/rotate count is always taken modulo 64 so it does not matter if it is specified as 8, 16 or 32-bit variable.

What we have now is following:

unsigned lsl_b(unsigned a, char b)
{
    return a << b;
}

unsigned lsl_w(unsigned a, short b)
{
    return a << b;
}

unsigned lsl_l(unsigned a, unsigned b)
{
    return a << b;
} 

compiled with -Os -fomit-frame-pointer -m68020 -mregparm=2 gives:

__Z5lsl_bjc:
        extb.l d1
        lsl.l d1,d0
        rts
__Z5lsl_wjs:
        ext.l d1
        lsl.l d1,d0
        rts
__Z5lsl_ljj:
        lsl.l d1,d0
        rts

both extb.l d1 and ext.l d1 are unnecessary. Expected result would be:

__Z5lsl_bjc:
        lsl.l d1,d0
        rts
__Z5lsl_wjs:
        lsl.l d1,d0
        rts
__Z5lsl_ljj:
        lsl.l d1,d0
        rts

The same issue applies for all rotate/shift operations and gcc adds unnecessary signed/unsigned extension to long in all cases.

@bebbo
Copy link
Owner

bebbo commented Aug 6, 2023

if you solve this properly you also have to consider cases without regparam.
gcc transforms this often and since the 680** don't have a single instruction for both steps:

  1. read mem
  2. expand
    the internal code looks like
   d1:QI=[sp:SI+0xb]
   d1:SI=sign_extend(d1:QI)
   d0:SI=[sp:SI+0x4]
   d0:SI=d0:SI<<d1:SI
   use d0:SI

Thus you have to handle cases where there are instructions inbetween the sign_extend and the shift.

It can be done, just needs time...

@bebbo
Copy link
Owner

bebbo commented Aug 7, 2023

@michalsc
Copy link
Author

michalsc commented Aug 8, 2023

yeah, there it is. Now, the question is, do we need this:
d1:SI=sign_extend(d1:QI)
or can we just convince gcc to use d1:QI in any case? Ah, jus noticed the gcc rtl way of writing it: can we tell gcc to leave d1:SI, d1:QI or d1:HI intact for shifts? Or can we just tell gcc that m68k supports shifts by QI, HI and SI operands and just declare them all as the very same ASL/ASR and so on?

@bebbo
Copy link
Owner

bebbo commented Aug 28, 2023

please test, this http://franke.ms/cex/z/qGfbec looks ok now.

@michalsc
Copy link
Author

Yeah, on this example it looks great. However, the same should apply also to shifts right and it doesn't yet (unless this was just a proof of concept to see if you are on right track).

BTW: What I have also noticed is that gcc does not emit other LSL/LSR sizes as long. E.g. following code

unsigned short lsl_b(unsigned short a, char b)
{
    return a >> b;
}

unsigned short lsl_w(signed short a, short b)
{
    return (unsigned short)(a << b);
}

will be compiled to

__Z5lsl_btc:
        and.l #65535,d0
        extb.l d1
        asr.l d1,d0
        rts
__Z5lsl_wss:
        ext.l d0
        lsl.l d1,d0
        rts

where asr.l or lsl.l could be simply be replaced with asr.w or lsl.w without actually needing to mask or sign extend the operand in d0 register.

bebbo added a commit that referenced this issue Aug 28, 2023
@bebbo
Copy link
Owner

bebbo commented Aug 28, 2023

available in ~36 mins

@bebbo
Copy link
Owner

bebbo commented Aug 28, 2023

The used instructions do not contain shifts for shorter modes. Maybe that can be added, but I'm not sure about this...

@bebbo
Copy link
Owner

bebbo commented Aug 28, 2023

well, the main problem is, that C/C++ converts everything to int before the shift gets applied.

  _2 = (int) a;
  _4 = (int) b;
  _5 = _2 >> _4;
  _6 = (short unsigned int) _5;
  return _6;

bebbo added a commit that referenced this issue Sep 14, 2023
@bebbo
Copy link
Owner

bebbo commented Sep 14, 2023

please test: http://franke.ms/cex/z/4Teznb

@bebbo
Copy link
Owner

bebbo commented Sep 14, 2023

@bebbo bebbo added enhancement close pending the ticket will be closed soon, whether fixed or not please test labels Sep 14, 2023
bebbo added a commit that referenced this issue Sep 14, 2023
bebbo added a commit that referenced this issue Sep 14, 2023
@michalsc
Copy link
Author

Looks great for me! Thanks!

I've also checked ROL/ROR and they seem to behave correctly too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
close pending the ticket will be closed soon, whether fixed or not enhancement please test
Projects
None yet
Development

No branches or pull requests

2 participants