Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removes unnecessary shift count masking #11594

Merged
merged 1 commit into from May 19, 2017

Conversation

Projects
None yet
7 participants
@mikedn
Copy link
Contributor

mikedn commented May 13, 2017

The C# compiler translates code like x << y to x << (y & 31) because the behavior of IL shift operations is unspecified if the shift count is greater than or equal to the bit width of the shifted value. However, x86/x64 shift instructions do mask the shift count so & 31 is redundant.

It's also possible that developers also mask the shift count, not knowing that the C# language specification guarantees masking. In that case we end up with 2 and x, 31 instructions.

@mikedn

This comment has been minimized.

Copy link
Contributor Author

mikedn commented May 13, 2017

jit-diff fx summary:

Total bytes of diff: -617 (0.00 % of base)
    diff is an improvement.
Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes
Top file improvements by size (bytes):
        -100 : System.Private.CoreLib.dasm (0.00 % of base)
         -89 : Microsoft.CodeAnalysis.dasm (-0.01 % of base)
         -67 : System.IO.Compression.dasm (-0.09 % of base)
         -64 : System.Linq.Expressions.dasm (-0.01 % of base)
         -52 : Microsoft.CSharp.dasm (-0.01 % of base)
20 total files with size differences (20 improved, 0 regressed).
Top method regessions by size (bytes):
           1 : System.Collections.dasm - BitHelper:MarkBit(int):this
Top method improvements by size (bytes):
         -24 : Microsoft.CodeAnalysis.CSharp.dasm - Binder:FoldNeverOverflowBinaryOperators(int,ref,ref):ref
         -21 : System.Private.CoreLib.dasm - TlsOverPerCoreLockedStacksArrayPool`1:.ctor():this (3 methods)
         -18 : Microsoft.CodeAnalysis.dasm - FloatingPointType:AssembleFloatingPointValue(long,int,bool,byref):int:this
         -18 : Microsoft.VisualBasic.dasm - Operators:LeftShiftObject(ref,ref):ref
         -18 : Microsoft.VisualBasic.dasm - Operators:RightShiftObject(ref,ref):ref
118 total methods with size differences (117 improved, 1 regressed).

The one byte regression in MarkBit is caused by LSRA's selection of r8d instead of edx, a few instructions (4) end up needing a REX prefix.

@mikedn

This comment has been minimized.

Copy link
Contributor Author

mikedn commented May 13, 2017

#8744 contains an alternative implementation in morph. While it's tempting to remove useless instructions earlier that approach seems a bit risky because we need to ensure that JIT optimizations account for masking done by the xarch shift operators.

Removes unnecessary shift count masking
The C# compiler translates code like `x << y` to `x << (y & 31)` because the behavior of IL shift operations is unspecified if the shift count is greater than or equal to the bit width of the shifted value. However, x86/x64 shift instructions do mask the shift count so `& 31` is redundant.

It's also possible that developers also mask the shift count, not knowing that the C# language specification guarantees masking. In that case we end up with 2 `and x, 31` instructions.
#endif

for (GenTree* andOp = shift->gtGetOp2(); andOp->OperIs(GT_AND); andOp = andOp->gtGetOp1())
{

This comment has been minimized.

Copy link
@briansull

briansull May 15, 2017

Contributor

Why is this written as a for loop.
This will match expressions that are not appropriate such as

(val<< ( t - ( q & 31 ) )

This comment has been minimized.

Copy link
@mikedn

mikedn May 15, 2017

Author Contributor

Why is this written as a for loop.

To catch multiple and instructions. See the PR description:

It's also possible that developers also mask the shift count, not knowing that the C# language specification guarantees masking. In that case we end up with 2 and x, 31 instructions.

This will match expressions that are not appropriate such as

Hmm, how so? The loop condition is andOp->OperIs(GT_AND) so it will stop at the first node that isn't an and.

This comment has been minimized.

Copy link
@briansull

briansull May 15, 2017

Contributor

Ok, I didn't catch that

@briansull
Copy link
Contributor

briansull left a comment

Looks Good

@BruceForstall

This comment has been minimized.

Copy link
Member

BruceForstall commented May 16, 2017

@CarolEidt
Copy link
Member

CarolEidt left a comment

Looks great!

@sdmaclea

This comment has been minimized.

Copy link
Member

sdmaclea commented May 16, 2017

@dotnet/arm64-contrib Looks like this will be needed by arm64 also.

@mikedn

This comment has been minimized.

Copy link
Contributor Author

mikedn commented May 17, 2017

Interesting, as far as I can tell ARM64 masks the shift count exactly like XARCH does, it doesn't zero out the destination register like ARM32. Should this code be moved from lowerxarch.cpp to lower.cpp and ifdefed out for ARM32?

@BruceForstall BruceForstall merged commit 9abc526 into dotnet:master May 19, 2017

18 checks passed

CROSS Check Build finished.
Details
CentOS7.1 x64 Debug Build and Test Build finished.
Details
FreeBSD x64 Checked Build Build finished.
Details
OSX10.12 x64 Checked Build and Test Build finished.
Details
Tizen armel Cross Debug Build Build finished.
Details
Tizen armel Cross Release Build Build finished.
Details
Ubuntu arm Cross Release Build Build finished.
Details
Ubuntu x64 Checked Build and Test Build finished.
Details
Ubuntu x64 Formatting Build finished.
Details
Ubuntu16.04 arm Cross Debug Build Build finished.
Details
Windows_NT arm Cross Debug Build Build finished.
Details
Windows_NT arm Cross Release Build Build finished.
Details
Windows_NT x64 CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x64 Debug Build and Test Build finished.
Details
Windows_NT x64 Formatting Build finished.
Details
Windows_NT x64 Release Priority 1 Build and Test Build finished.
Details
Windows_NT x86 Checked Build and Test Build finished.
Details
Windows_NT x86 CoreCLR Perf Tests Correctness Build finished.
Details
@BruceForstall

This comment has been minimized.

Copy link
Member

BruceForstall commented May 19, 2017

Should this code be moved from lowerxarch.cpp to lower.cpp and ifdefed out for ARM32?

Seems reasonable. I merged it for now. Perhaps someone on @dotnet/arm64-contrib should create a GitHub issue to do the move and test it for arm64.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.