Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Transform X % C == 0 to X & (C-1) == 0 #25744

Open
wants to merge 9 commits into
base: master
from

Conversation

@EgorBo
Copy link
Contributor

commented Jul 17, 2019

Optimize X % C == 0 to X & (C-1) == 0. This expression is used in DateTime.IsLeapYear.

C#:

bool Test(int year) => year % 4 == 0;

Old:

; Method T:Test(int):bool
G_M29623_IG01:

G_M29623_IG02:
       mov      eax, edx
       sar      eax, 31
       and      eax, 3
       add      eax, edx
       and      eax, -4
       sub      edx, eax
       sete     al
       movzx    rax, al

G_M29623_IG03:
       ret      
; Total bytes of code: 22

New:

; Method T:Test(int):bool

G_M29623_IG01:

G_M29623_IG02:
       test     dl, 3
       sete     al
       movzx    rax, al

G_M29623_IG03:
       ret      
; Total bytes of code: 10

Will run jit diffs later.

@EgorBo EgorBo changed the title JIT: Transform X % C == 0 to % & (C-1) == 0 JIT: Transform X % C == 0 to X & (C-1) == 0 Jul 17, 2019

src/jit/morph.cpp Outdated Show resolved Hide resolved
src/jit/morph.cpp Outdated Show resolved Hide resolved
src/jit/morph.cpp Outdated Show resolved Hide resolved
src/jit/morph.cpp Outdated Show resolved Hide resolved
src/jit/morph.cpp Outdated Show resolved Hide resolved
src/jit/morph.cpp Outdated Show resolved Hide resolved
@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2019

Overall seems fine but I'm not sure if this is the best way to do it. Another way is to simply change the existing GT_MOD to GT_UMOD, that will allow lowering to transform it into the X & (C - 1).

EgorBo added some commits Jul 17, 2019

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Jul 17, 2019

@mikedn hm... indeed! Changed MOD to UMOD.

TODO:

        bool Test1(int year) => year % 4 != 0; // GT_GT instead of GT_NE

        int Test2(int year)
        {
            if (year > 0)
                return year % 4; // year is guaranteed to be positive
            return 0;
        }
@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2019

The part about figuring out that year is positive due to the if check is currently problematic in the JIT.

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2019

bool Test1(int year) => year % 4 != 0; // GT_GT instead of GT_NE

I don't quite understand where does GT_GT comes from.

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Jul 17, 2019

bool Test1(int year) => year % 4 != 0; // GT_GT instead of GT_NE

I don't quite understand where does GT_GT comes from.

I guess from Roslyn:

bool Test1(int year) => year % 4 != 0;
IL_0000: ldarg.1
IL_0001: ldc.i4.4
IL_0002: rem
IL_0003: ldc.i4.0
IL_0004: cgt.un
IL_0006: ret
[000006] ------------              *  STMT      void  (IL 0x000...  ???)
[000005] ------------              \--*  RETURN    int   
[000004] N--------U--                 \--*  GT        int   
[000002] ------------                    +--*  MOD       int   
[000000] ------------                    |  +--*  LCL_VAR   int    V01 arg1         
[000001] ------------                    |  \--*  CNS_INT   int    4
[000003] ------------                    \--*  CNS_INT   int    0
@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2019

Ah, right. I got confused because the code uses int rather than uint but then the GT_GT is actually unsigned. AFAIR I added code to transform that GT_NE though I don't remember where I put that right now. Must be somewhere in morph already.

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2019

#9454 does GT_GT -> GT_NE. Not sure it's not picking up your case. Perhaps these two transforms run in the wrong order.

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Jul 17, 2019

@mikedn thanks! will try to play with the order. Btw, F# doesn't generate cgt: https://sharplab.io/#v2:DYLgZgzgNALiCWwA+B7ADgUwHYAIDKAnhDBgLYCwAUMBjDicQIw4EYCGATjgLwvtcBSHABYcAHgB8OAAxA== (ceq, ceq) and I, honestly, expected the same from Roslyn 🙂

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2019

Btw, F# doesn't generate cgt:

Basically what C# does is an optimization. But AFAIR it's mentioned in ECMA so it's basically an IL idiom so it's perhaps surprising that F# does not use it.

EgorBo added some commits Jul 17, 2019

@BruceForstall

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

@danmosemsft

This comment has been minimized.

Copy link
Member

commented Jul 19, 2019

@EgorBo do you expect this to significantly close this regression (which of course was due to a different cause)? #25728
If so, perhaps it's a 3.0 candidate.

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Jul 19, 2019

@danmosemsft definitely not enough to cover even ~2% of that regression 🙁 (according to my benchmarks)

@filipnavara

This comment has been minimized.

Copy link
Collaborator

commented Jul 19, 2019

@danmosemsft The regression is mostly caused by the different kernel32.dll method being invoked. The P/Invoke itself accounts for bulk of the time difference.

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Jul 19, 2019

Summary:
(Lower is better)
Total bytes of diff: -319 (-0.01% of base)
    diff is an improvement.
Top file improvements by size (bytes):
        -319 : System.Private.CoreLib.dasm (-0.01% of base)
1 total files with size differences (1 improved, 0 regressed), 0 unchanged.
Top method regressions by size (bytes):
           3 ( 2.78% of base) : System.Private.CoreLib.dasm - GCHandle:AddrOfPinnedObject():long:this
           1 ( 0.70% of base) : System.Private.CoreLib.dasm - GCHandle:set_Target(ref):this
           1 ( 6.67% of base) : System.Private.CoreLib.dasm - GCHandle:IsPinned(long):bool
Top method improvements by size (bytes):
         -22 (-8.15% of base) : System.Private.CoreLib.dasm - MemoryExtensions:Overlaps(struct,struct,byref):bool (2 methods)
         -18 (-2.53% of base) : System.Private.CoreLib.dasm - GregorianCalendar:AddMonths(struct,int):struct:this
         -17 (-4.35% of base) : System.Private.CoreLib.dasm - GregorianCalendar:GetDaysInYear(int,int):int:this
         -16 (-2.96% of base) : System.Private.CoreLib.dasm - Enum:TryParseRareEnum(ref,ref,struct,bool,bool,byref):bool
         -14 (-17.28% of base) : System.Private.CoreLib.dasm - DateTime:IsLeapYear(int):bool
Top method regressions by size (percentage):
           1 ( 6.67% of base) : System.Private.CoreLib.dasm - GCHandle:IsPinned(long):bool
           3 ( 2.78% of base) : System.Private.CoreLib.dasm - GCHandle:AddrOfPinnedObject():long:this
           1 ( 0.70% of base) : System.Private.CoreLib.dasm - GCHandle:set_Target(ref):this
Top method improvements by size (percentage):
         -13 (-29.55% of base) : System.Private.CoreLib.dasm - JulianCalendar:IsLeapYear(int,int):bool:this
         -14 (-17.28% of base) : System.Private.CoreLib.dasm - DateTime:IsLeapYear(int):bool
         -14 (-14.74% of base) : System.Private.CoreLib.dasm - EastAsianLunisolarCalendar:GregorianIsLeapYear(int):bool
         -10 (-10.99% of base) : System.Private.CoreLib.dasm - Random:GetSampleForLargeRange():double:this
         -13 (-10.92% of base) : System.Private.CoreLib.dasm - JulianCalendar:GetDaysInMonth(int,int,int):int:this
27 total methods with size differences (24 improved, 3 regressed), 18902 unchanged.
Completed analysis in 2.68s
@BruceForstall

This comment has been minimized.

Copy link
Member

commented Jul 19, 2019

@EgorBo Why are there any regressions?

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Jul 19, 2019

@BruceForstall trying to figure out, this the first time I use jittools so probably messed up a bit.
the problematic method is

bool IsPinned(long handle) => (handle & 1) != 0; // should be a single `and` instruction
; Method ConsoleApp172.Program:IsPinned(long):bool:this

G_M43362_IG01:

G_M43362_IG02:
-      test     dl, 1
+      mov      eax, edx
+      test     al, 1
       setne    al
       movzx    rax, al

G_M43362_IG03:
       ret      
-; Total bytes of code: 10
+; Total bytes of code: 11

I guess it's what my goto did.

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 19, 2019

It appears that the goto CMP_EQ_OP you have added picks up more cases of GT_GT unsigned and something doesn't work quite right CQ wise:

               [000162] N-C------U--                 |  \--*  GT        int   
               [000159] --C---------                 |     +--*  AND       long  
               [000166] ------------                 |     |  +--*  LCL_VAR   long   V06 tmp3         
               [000158] ------------                 |     |  \--*  CAST      long <- int
               [000157] ------------                 |     |     \--*  CNS_INT   int    1
               [000161] ------------                 |     \--*  CAST      long <- int
               [000160] ------------                 |        \--*  CNS_INT   int    0

The change to NE results in tree narrowing:

               [000162] N----+-N----              \--*  NE        int   
               [000159] -----+------                 +--*  AND       int   
               [000498] ------------                 |  +--*  CAST      int <- long
               [000166] -----+------                 |  |  \--*  LCL_VAR   long   V01 loc0         
               [000158] -----+------                 |  \--*  CNS_INT   int    1
               [000161] -----+------                 \--*  CNS_INT   int    0

That long to int cast should be a no-op but it actually generates mov eax, ecx.

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jul 19, 2019

Note that the diff you did only has corelib. The entire FX diff is:

Total bytes of diff: -1765 (-0.01% of base)
    diff is an improvement.
Top file improvements by size (bytes):
        -319 : System.Private.CoreLib.dasm (-0.01% of base)
        -192 : System.Net.WebClient.dasm (-0.43% of base)
        -156 : System.Private.DataContractSerialization.dasm (-0.02% of base)
        -123 : System.Security.AccessControl.dasm (-0.18% of base)
        -120 : System.Net.Sockets.dasm (-0.07% of base)
29 total files with size differences (29 improved, 0 regressed), 100 unchanged.
Top method regressions by size (bytes):
          16 ( 0.64% of base) : System.Private.Uri.dasm - Uri:GetUriPartsFromUserString(int):ref:this
           4 ( 1.30% of base) : System.Private.Uri.dasm - Uri:CreateHostStringHelper(ref,ushort,ushort,byref,byref):ref
           3 ( 2.78% of base) : System.Private.CoreLib.dasm - GCHandle:AddrOfPinnedObject():long:this
           3 ( 0.58% of base) : System.Private.Uri.dasm - Uri:.ctor(ref,ref):this (2 methods)
           1 ( 0.70% of base) : System.Private.CoreLib.dasm - GCHandle:set_Target(ref):this
Top method improvements by size (bytes):
         -48 (-1.52% of base) : System.Private.DataContractSerialization.dasm - Base64Encoding:GetBytes(ref,int,int,ref,int):int:this (2 methods)
         -36 (-0.28% of base) : System.Reflection.Metadata.dasm - MetadataReader:InitializeTableReaders(struct,ubyte,ref,ref):this
         -32 (-0.77% of base) : System.Private.Uri.dasm - Uri:ReCreateParts(int,ushort,int):ref:this
         -32 (-1.48% of base) : System.Reflection.Metadata.dasm - MetadataBuilder:.ctor(int,int,int,int):this
         -31 (-3.98% of base) : Microsoft.CodeAnalysis.dasm - CommandLineParser:SplitCommandLineIntoArguments(ref,bool,byref):ref
Top method regressions by size (percentage):
           1 ( 6.67% of base) : System.Private.CoreLib.dasm - GCHandle:IsPinned(long):bool
           1 ( 5.00% of base) : System.Private.Uri.dasm - Uri:get_IsImplicitFile():bool:this
           1 ( 5.00% of base) : System.Private.Uri.dasm - Uri:get_UserDrivenParsing():bool:this
           1 ( 5.00% of base) : System.Private.Uri.dasm - Uri:get_IsDosPath():bool:this
           1 ( 5.00% of base) : System.Private.Uri.dasm - Uri:get_IsUncOrDosPath():bool:this
Top method improvements by size (percentage):
         -13 (-29.55% of base) : System.Private.CoreLib.dasm - JulianCalendar:IsLeapYear(int,int):bool:this
         -16 (-16.84% of base) : System.Security.Cryptography.Primitives.dasm - SymmetricAlgorithm:set_FeedbackSize(int):this
         -14 (-14.74% of base) : System.Data.Common.dasm - SqlDateTime:IsLeapYear(int):bool
         -14 (-14.74% of base) : System.Private.CoreLib.dasm - EastAsianLunisolarCalendar:GregorianIsLeapYear(int):bool
         -25 (-14.45% of base) : System.Reflection.Metadata.dasm - MethodBodyStreamEncoder:.ctor(ref):this
170 total methods with size differences (157 improved, 13 regressed), 146579 unchanged.

@EgorBo EgorBo force-pushed the EgorBo:jit-opt-mod branch from 3cc9bd4 to 6fc00a5 Aug 9, 2019

@EgorBo

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2019

Workarounded the regression (not sure how to properly fix it):

Summary:
(Lower is better)
Total bytes of diff: -1027 (-0.00% of base)
    diff is an improvement.
Top file improvements by size (bytes):
        -300 : System.Private.CoreLib.dasm (-0.01% of base)
        -156 : System.Private.DataContractSerialization.dasm (-0.02% of base)
        -123 : System.Security.AccessControl.dasm (-0.18% of base)
         -70 : System.Reflection.Metadata.dasm (-0.02% of base)
         -66 : Microsoft.CodeAnalysis.dasm (-0.01% of base)
20 total files with size differences (20 improved, 0 regressed), 109 unchanged.
Top method improvements by size (bytes):
         -48 (-1.52% of base) : System.Private.DataContractSerialization.dasm - Base64Encoding:GetBytes(ref,int,int,ref,int):int:this (2 methods)
         -32 (-1.48% of base) : System.Reflection.Metadata.dasm - MetadataBuilder:.ctor(int,int,int,int):this
         -31 (-3.98% of base) : Microsoft.CodeAnalysis.dasm - CommandLineParser:SplitCommandLineIntoArguments(ref,bool,byref):ref
         -29 (-3.73% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - CtfInteger:Read(ref,int):ref:this
         -26 (-1.15% of base) : System.Collections.dasm - SortedSet`1:ConstructRootFromSortedArray(ref,int,int,ref):ref (2 methods)
Top method improvements by size (percentage):
         -13 (-29.55% of base) : System.Private.CoreLib.dasm - JulianCalendar:IsLeapYear(int,int):bool:this
         -16 (-16.84% of base) : System.Security.Cryptography.Primitives.dasm - SymmetricAlgorithm:set_FeedbackSize(int):this
         -14 (-14.74% of base) : System.Data.Common.dasm - SqlDateTime:IsLeapYear(int):bool
         -14 (-14.74% of base) : System.Private.CoreLib.dasm - EastAsianLunisolarCalendar:GregorianIsLeapYear(int):bool
         -25 (-14.45% of base) : System.Reflection.Metadata.dasm - MethodBodyStreamEncoder:.ctor(ref):this
66 total methods with size differences (66 improved, 0 regressed), 148723 unchanged.
Completed analysis in 12.51s

0 regressions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.