Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace several Nullable<T>.Value with .GetValueOrDefault() #22297

Merged
merged 1 commit into from Feb 1, 2019

Conversation

@stephentoub
Copy link
Member

commented Jan 30, 2019

The former does extra work that the latter doesn't do, and they're equivalent when we know it contains a value, such as immediately after a HasValue check.

@AndyAyersMS, I was a little surprised in some of these cases that the JIT isn't able to compile down to the same code for Value as it does for GetValueOrDefault. As an example, for this:

using System;
using System.Runtime.CompilerServices;

class Program
{
    static void Main() { Positive1(42); Positive2(42); }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool Positive1(int? i) => i.HasValue && i.Value > 0;

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool Positive2(int? i) => i.HasValue && i.GetValueOrDefault() > 0;
}

I get this for Positive1 that uses Value:

G_M60389_IG01:
       4883EC28             sub      rsp, 40
       90                   nop
       48894C2430           mov      qword ptr [rsp+30H], rcx

G_M60389_IG02:
       0FB6442430           movzx    rax, byte  ptr [rsp+30H]
       85C0                 test     eax, eax
       7414                 je       SHORT G_M60389_IG05
       85C0                 test     eax, eax
       7417                 je       SHORT G_M60389_IG07

G_M60389_IG03:
       837C243400           cmp      dword ptr [rsp+34H], 0
       0F9FC0               setg     al
       0FB6C0               movzx    rax, al

G_M60389_IG04:
       4883C428             add      rsp, 40
       C3                   ret

G_M60389_IG05:
       33C0                 xor      eax, eax

G_M60389_IG06:
       4883C428             add      rsp, 40
       C3                   ret

G_M60389_IG07:
       E815FFFFFF           call     ThrowHelper:ThrowInvalidOperationException_InvalidOperation_NoValue()
       CC                   int3

and this for Positive2 that uses GetValueOrDefault:

G_M60387_IG01:
       0F1F440000           nop
       48894C2408           mov      qword ptr [rsp+08H], rcx

G_M60387_IG02:
       807C240800           cmp      byte  ptr [rsp+08H], 0
       740C                 je       SHORT G_M60387_IG04
       837C240C00           cmp      dword ptr [rsp+0CH], 0
       0F9FC0               setg     al
       0FB6C0               movzx    rax, al

G_M60387_IG03:
       C3                   ret

G_M60387_IG04:
       33C0                 xor      eax, eax

G_M60387_IG05:
       C3                   ret

The former ends up doing the HasValue check twice, even though they happen one right after the other as part of Value getting inlined, and then I'd have expected it to be able to eliminate the dead branch that includes the ThrowHelper call.

Is it expected that it can't get this?

The former does extra work that the latter doesn't do, and they're equivalent when we know it contains a value, such as immediately after a HasValue check.
@benaadams

This comment has been minimized.

Copy link
Collaborator

commented Jan 30, 2019

Different check, but looks similar to #22246 (though probably different areas handling the conditions in the Jit)

@AndyAyersMS

This comment has been minimized.

Copy link
Member

commented Jan 30, 2019

Haven't drilled in to confirm, but I suspect what you're seeing is that Nullable<int> (or any small type T) is an unfortunate special case where the struct is small enough on x64 to be passed and returned by value, but has multiple fields.... at the ABI boundaries we retype this as a long and then the consequent long<->struct "casts" in JIT IR block promotion and without this the jit can't optimize very well.

Similar issues seen over in #22079 with Range. Does not seem as simple to unwedge as one might hope.

Might be amusing to look at x86 codegen here and see if it is better. Or try long?.

@omariom

This comment has been minimized.

Copy link
Contributor

commented Jan 30, 2019

Interesting to see what code Roslyn and .NET Framework JIT generate for different approaches.

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Jan 30, 2019

Haven't drilled in to confirm, but I suspect what you're seeing is that Nullable (or any small type T) is an unfortunate special case where the struct is small enough on x64 to be passed and returned by value, but has multiple fields.... at the ABI boundaries we retype this as a long and then the consequent long<->struct "casts" in JIT IR block promotion and without this the jit can't optimize very well.

It looks like it's indeed related to struct handling. Though it can also be blamed on assertion propagation.

Due to the struct issues we end up with a GT_LCL_FLD that's not so friendly to optimizations. CSE sort of gets rid of it but the way it does this, by creating a COMMA tree, causes problems later. We have something like:

     ( 15, 14) [000009] ------------              *  STMT      void  (IL   ???...  ???)
N008 ( 15, 14) [000008] -A----------              \--*  JTRUE     void  
N006 (  1,  1) [000006] ------------                 |  /--*  CNS_INT   int    0 $40
N007 ( 13, 12) [000007] JA-----N----                 \--*  EQ        int    $180
N004 (  3,  2) [000050] ------------                    |  /--*  LCL_VAR   int    V02 cse0          $140
N005 ( 11, 10) [000051] -A----------                    \--*  COMMA     int    $140
N001 (  4,  5) [000026] ------------                       |  /--*  LCL_FLD   bool   V00 arg0         u:1[+0] Fseq[hasValue] $140
N003 (  8,  8) [000049] -A------R---                       \--*  ASG       int    $VN.Void
N002 (  3,  2) [000048] D------N----                          \--*  LCL_VAR   int    V02 cse0          $140

------------ BB02 [009..014) -> BB05 (cond), preds={BB01} succs={BB03,BB05}

***** BB02, stmt 2
     (  7,  6) [000036] ------------              *  STMT      void  (IL 0x009...  ???)
N004 (  7,  6) [000035] ----G-------              \--*  JTRUE     void  
N002 (  1,  1) [000033] ------------                 |  /--*  CNS_INT   int    0 $40
N003 (  5,  4) [000034] J---G--N----                 \--*  EQ        int    $180
N001 (  3,  2) [000052] ------------                    \--*  LCL_VAR   int    V02 cse0          $140

and assertion propagation doesn't seem too happy to deal with a LCL_VAR node hidden behind a COMMA.

Ultimately I'd say "blame struct handling". Because it's not only that it causes other optimizations issues, it also forces the argument to memory. This:

       mov      qword ptr [rsp+08H], rcx
G_M60387_IG02:
       cmp      byte  ptr [rsp+08H], 0
       je       SHORT G_M60387_IG04

should be

G_M60387_IG02:
       test     cl, cl
       je       SHORT G_M60387_IG04

So we need to either make struct promotion somehow handle this or make it so that GT_LCL_FLD doesn't force the variable into memory.

@AndyAyersMS

This comment has been minimized.

Copy link
Member

commented Jan 30, 2019

Yeah, that seems to be the case -- here LCL_FLD is the thing that extracts the hasValue field from the struct. Interestingly the jit does value number the two compares the same, so there's some hope that perhaps we could get at this even without promotion.

------------ BB01 [000..009) -> BB04 (cond), preds={} succs={BB02,BB04}

***** BB01, stmt 1
     (  8,  9) [000009] ------------              *  STMT      void  (IL   ???...  ???)
N004 (  8,  9) [000008] ------------              \--*  JTRUE     void  
N002 (  1,  1) [000006] ------------                 |  /--*  CNS_INT   int    0 $40
N003 (  6,  7) [000007] J------N----                 \--*  EQ        int    $180
N001 (  4,  5) [000026] ------------                    \--*  LCL_FLD   bool   V00 arg0         u:1[+0] Fseq[hasValue] $140

------------ BB02 [009..014) -> BB05 (cond), preds={BB01} succs={BB03,BB05}

***** BB02, stmt 2
     (  8,  9) [000036] ------------              *  STMT      void  (IL 0x009...  ???)
N004 (  8,  9) [000035] ----G-------              \--*  JTRUE     void  
N002 (  1,  1) [000033] ------------                 |  /--*  CNS_INT   int    0 $40
N003 (  6,  7) [000034] J---G--N----                 \--*  EQ        int    $180
N001 (  4,  5) [000031] ------------                    \--*  LCL_FLD   bool   V00 arg0         u:1[+0] Fseq[hasValue] $140
@AndyAyersMS

This comment has been minimized.

Copy link
Member

commented Jan 30, 2019

Even with long? we don't get rid of this redundant branch (note we promote here and load the constituent parts into registers)... that is a bit surprising.

G_M51227_IG02:
       0FB601               movzx    rax, byte  ptr [rcx]
       488B5108             mov      rdx, qword ptr [rcx+8]
       84C0                 test     al, al
       7412                 je       SHORT G_M51227_IG05
       84C0                 test     al, al
       7415                 je       SHORT G_M51227_IG07

Looks like JTRUE assertion prop is blocked by the casts we insert as we widen the bool field to an int when we promote...

------------ BB02 [000..009) -> BB05 (cond), preds={BB01} succs={BB03,BB05}

***** BB02, stmt 2
     (  8,  8) [000009] ------------              *  STMT      void  (IL   ???...  ???)
N005 (  8,  8) [000008] ------------              \--*  JTRUE     void  
N003 (  1,  1) [000006] ------------                 |  /--*  CNS_INT   int    0 $40
N004 (  6,  6) [000007] J------N----                 \--*  EQ        int    <l:$283, c:$282>
N002 (  4,  4) [000063] ------------                    \--*  CAST      int <- bool <- int <l:$281, c:$280>
N001 (  3,  2) [000028] ------------                       \--*  LCL_VAR   int    V02 tmp1         u:1 <l:$100, c:$140>

------------ BB03 [009..015) -> BB06 (cond), preds={BB02} succs={BB04,BB06}

***** BB03, stmt 3
     (  8,  8) [000037] ------------              *  STMT      void  (IL 0x009...  ???)
N005 (  8,  8) [000036] ----G-------              \--*  JTRUE     void  
N003 (  1,  1) [000034] ------------                 |  /--*  CNS_INT   int    0 $40
N004 (  6,  6) [000035] J---G--N----                 \--*  EQ        int    <l:$283, c:$282>
N002 (  4,  4) [000064] ------------                    \--*  CAST      int <- bool <- int <l:$281, c:$280>
N001 (  3,  2) [000033] ------------                       \--*  LCL_VAR   int    V02 tmp1         u:1 (last use) <l:$100, c:$140>
@stephentoub stephentoub merged commit 1b2810b into dotnet:master Feb 1, 2019
25 checks passed
25 checks passed
CentOS7.1 x64 Checked Innerloop Build and Test Build finished.
Details
CentOS7.1 x64 Debug Innerloop Build Build finished.
Details
Linux-musl x64 Debug Build Build finished.
Details
OSX10.12 x64 Checked Innerloop Build and Test Build finished.
Details
Ubuntu arm Cross Checked crossgen_comparison Build and Test Build finished.
Details
Ubuntu arm Cross Release crossgen_comparison Build and Test Build finished.
Details
Ubuntu x64 Checked CoreFX Tests Build finished.
Details
Ubuntu x64 Checked Innerloop Build and Test Build finished.
Details
Ubuntu x64 Checked Innerloop Build and Test (Jit - TieredCompilation=0) Build finished.
Details
Ubuntu x64 Formatting Build finished.
Details
WIP Ready for review
Details
Windows_NT x64 Checked CoreFX Tests Build finished.
Details
Windows_NT x64 Checked Innerloop Build and Test Build finished.
Details
Windows_NT x64 Checked Innerloop Build and Test (Jit - TieredCompilation=0) Build finished.
Details
Windows_NT x64 Formatting Build finished.
Details
Windows_NT x64 Release CoreFX Tests Build finished.
Details
Windows_NT x64 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x64 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x86 Checked Innerloop Build and Test Build finished.
Details
Windows_NT x86 Checked Innerloop Build and Test (Jit - TieredCompilation=0) Build finished.
Details
Windows_NT x86 Release Innerloop Build and Test Build finished.
Details
Windows_NT x86 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x86 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
coreclr-ci Build #20190130.715 succeeded
Details
license/cla All CLA requirements met.
Details
@stephentoub stephentoub deleted the stephentoub:replacenullable branch Feb 1, 2019
@stephentoub

This comment has been minimized.

Copy link
Member Author

commented Feb 1, 2019

There were only a few cases here, so I changed them. But there are a bunch in corefx, and I'm not currently planning on going through and doing all those... besides, having them there should give @AndyAyers and @mikedn something to verify a fix against :)

@jnm2

This comment has been minimized.

Copy link
Collaborator

commented May 16, 2019

Ever since dotnet/roslyn#22800 shipped in 16.0, I've been preferring foo ?? default or foo ?? 0 over foo.GetValueOrDefault() since the codegen is the same. It's also a more consistent style if the default were to ever differ in another place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.