Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: optimize (T)x | (T)y to (T)(x | y) #13816

Closed
EgorBo opened this issue Nov 22, 2019 · 5 comments · Fixed by #58727
Closed

JIT: optimize (T)x | (T)y to (T)(x | y) #13816

EgorBo opened this issue Nov 22, 2019 · 5 comments · Fixed by #58727
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI good first issue Issue should be easy to implement, good for first-time contributors help wanted [up-for-grabs] Good issue for external contributors JitUntriaged CLR JIT issues needing additional triage optimization
Milestone

Comments

@EgorBo
Copy link
Member

EgorBo commented Nov 22, 2019

Noticed in https://github.com/dotnet/coreclr/issues/27917#issuecomment-557519701 We can optimize some movs away for patterns like (byte)x | (byte)y with (byte)(x | y), etc.
For example:

static int Downcast(int a, int b)
{
    return (byte)a | (byte)b;
}

static long Upcast(int a, int b)
{
    return (long)a & (long)b; // not only "|"
}

Current codegen:

; Method Foo:Downcast(int,int):int
       movzx    rax, cl
       movzx    rdx, dl
       or       eax, edx
       ret      


; Method Foo:Upcast(int,int):long
       movsxd   rax, ecx
       movsxd   rdx, edx
       and      rax, rdx
       ret      

Expected codegen (from godbolt and linux abi)

; Method Foo:Downcast(int,int):int
       or       edi, esi
       movzx    eax, dil
       ret


; Method Foo:Upcast(int,int):long
       and      edi, esi
       movsxd   rax, edi
       ret

Should be a simple optimization, e.g.:

\--*  OR        int   
   +--*  CAST      int <- ubyte <- int
   |  \--*  LCL_VAR   int    V00 arg0    
   \--*  CAST      int <- ubyte <- int
      \--*  LCL_VAR   int    V01 arg1   

To:

\--*  CAST      int <- ubyte <- int
   \--*  OR        int   
      +--*  LCL_VAR   int    V00 arg0    
      \--*  LCL_VAR   int    V01 arg1 

Also, should work for pointers:

static unsafe int DowncastPtr(int a, int b)
{
    return *((byte*)&a) | *((byte*)&b);
}

category:cq
theme:basic-cq
skill-level:beginner
cost:small

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@EgorBo EgorBo added good first issue Issue should be easy to implement, good for first-time contributors help wanted [up-for-grabs] Good issue for external contributors labels Mar 28, 2021
@vijaya-lakshmi-venkatraman

Hi,
I would like to work on this issue if still available.
My understanding is that wherever we have the pattern as above, we can shorthand it.
Is that right?

@EgorBo
Copy link
Member Author

EgorBo commented Apr 28, 2021

Hi,
I would like to work on this issue if still available.
My understanding is that wherever we have the pattern as above, we can shorthand it.
Is that right?

Exactly, feel free to ask any questions

@vijaya-lakshmi-venkatraman

Is there a specific folder(s) or set of files that I should consider for making these changes?

@trionia
Copy link

trionia commented Jun 15, 2021

I was also wondering how to approach this issue and came up with the following shell command

(types="bool|s?byte|u?short|u?int|u?long"; find . -type f -name "*.cs" -exec grep -En "\(($types)\)\w+\s?(\||&)\s?\(\1\)\w+" {} \; -print)

I attached the result sample for the whole runtime directory. I'm not sure how and if the results should be narrowed down further.
runtime_13816.txt

@benjamin-hodgson
Copy link
Contributor

I've opened a pull request for this: #58727

EgorBo pushed a commit that referenced this issue Oct 12, 2021
* Better codegen for `(T)x | (T)y`

I added a morph pass to fold expressions like `(T)x | (T)y`
into `(T)(x | y)`. This results in fewer `movzx` instructions
in the asm.

Fixes #13816

* Code review updates

* Rename function to fgMorphCastedBitwiseOp
* Don't fold checked arithmetic
* Reuse op1 node for return value
* Don't run outside global morphing
* Various code style and comment tweaks

* Don't call gtGetOp2 if tree was folded.

If it was folded, it was folded to a unary (cast) operation
and gtGetOp2() will crash.

I also tweaked fgMorphCastedBitwiseOp to return nullptr
if it didn't do anything (to match behaviour of fgMorphCommutative)

* Code review changes for tests

* Removed all but one csproj
* Added tests for scenarios with overflow, compound exprs, side effects

* Add in some asserts and remove a redundant call

* fix typo

* Code review changes:

* Formatting
* Use getters instead of fields
* set flags on op1

* Fix formatting
@ghost ghost locked as resolved and limited conversation to collaborators Nov 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI good first issue Issue should be easy to implement, good for first-time contributors help wanted [up-for-grabs] Good issue for external contributors JitUntriaged CLR JIT issues needing additional triage optimization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants