This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Support Encoding devirtualization #9276

Merged 1 commit on Mar 14, 2017

Conversation

Member

@benaadams benaadams commented Feb 2, 2017

@AndyAyersMS
Member

I think it should work. I'll capture this pattern in a test case and make sure.

@@ -25,9 +25,11 @@ namespace System.Text
[System.Runtime.InteropServices.ComVisible(true)]
public class ASCIIEncoding : Encoding
{
// Allow for devirtualization (see https://github.com/dotnet/coreclr/pull/9230)
internal sealed class ASCIIEncodingSealed : ASCIIEncoding { }

Member

Nit: can you add a blank line below this?

Member

Also, do we need to do something for serialization?

Member

Ah, yes. The type will need both to be annotated as Serializable and to have a serialization ctor added.

Member

@stephentoub stephentoub Feb 2, 2017

I take that back. I thought the base types implemented ISerializable, but it doesn't look like they do. So just the attribute is fine.

internal sealed class UTF8EncodingSealed : UTF8Encoding
{
public UTF8EncodingSealed() : base(encoderShouldEmitUTF8Identifier: true) { }
}

Member

Nit: can you add a blank line below this?

Member

@stephentoub stephentoub left a comment

As long as it gets the expected benefits, LGTM.

@AndyAyersMS
Member

Unfortunately we may not get this fully optimized in the first go-round, see notes over on #1166.

@benaadams
Member Author

Will close for now

@benaadams
Member Author

Rebased and addressed feedback now that #9230 has been merged.

@@ -23,9 +23,13 @@ namespace System.Text
[Serializable]
public class ASCIIEncoding : Encoding
{
// Allow for devirtualization (see https://github.com/dotnet/coreclr/pull/9230)
[Serializable]
internal sealed class ASCIIEncodingSealed : ASCIIEncoding { }

Member

Nit: private?

@@ -53,9 +53,16 @@ public class UTF8Encoding : Encoding

private const int UTF8_CODEPAGE = 65001;

// Allow for devirtualization (see https://github.com/dotnet/coreclr/pull/9230)
[Serializable]
internal sealed class UTF8EncodingSealed : UTF8Encoding

Member

Nit: private?

Member Author

It's exposed via Encoding as

Encoding UTF8 => UTF8Encoding.s_default;

so Encoding needs to be able to see the class; otherwise, s_default being of that type would be a compile error. If the s_default field were declared as UTF8Encoding, it might be too much of a stretch to infer that it is actually a UTF8EncodingSealed, as the type would have to record what it had actually been initialised to?

I don't know whether the devirtualization extends to inspecting what a readonly static actually holds.
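
The accessibility constraint described in this comment can be sketched as follows. This is a minimal illustration using hypothetical names (`Codec`, `CodecSealed`, `Codecs`), not the actual System.Text code. Because the static field is `internal` and typed as the sealed nested class, the nested class must be at least `internal` as well; declaring it `private` is rejected by the C# compiler as inconsistent accessibility (CS0052).

```csharp
using System;

// Hypothetical stand-in for UTF8Encoding.
public class Codec
{
    // Sealed subclass. It must be at least as accessible as the field below;
    // declaring it 'private' would fail to compile with CS0052.
    internal sealed class CodecSealed : Codec { }

    // Typed as the sealed class so that readers of the field see the exact type.
    internal static readonly CodecSealed s_default = new CodecSealed();

    public virtual int GetCount(string s) => s.Length;
}

// Hypothetical stand-in for Encoding: exposes the instance under the base type.
public static class Codecs
{
    public static Codec Default => Codec.s_default;
}

public static class Program
{
    public static void Main()
    {
        // The runtime type behind the base-typed property is the sealed subclass.
        Console.WriteLine(Codecs.Default.GetType().IsSealed); // True
        Console.WriteLine(Codecs.Default.GetCount("abc"));    // 3
    }
}
```

Keeping the field typed as the sealed class, rather than as the base class, is what forces the wider accessibility here.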

@stephentoub
Member

stephentoub commented Mar 3, 2017

Is it getting devirtualized as expected? Any resulting positive benefit on consumption?

@AndyAyersMS
Member

There is an example test case here that shows what works today.

You can use COMPlus_JitPrintDevirtualizedMethods=1 (in CHK/DBG) or inspect disassembly to see if devirtualization is happening.
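
A small program along the following lines could serve as something to inspect. The names (`Shape`, `Triangle`, `ViaSealed`, `ViaBase`) are hypothetical and this is not the test case referred to above. Whether each call is actually devirtualized depends on the JIT version, so the sketch makes no claim about the JIT's output; it only sets up one call through a sealed static type and one through an unsealed base type.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Shape
{
    public virtual int Sides() => 0;
}

public sealed class Triangle : Shape
{
    public override int Sides() => 3;
}

public static class Program
{
    // The static type of 't' is sealed, so the JIT knows the exact target.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ViaSealed(Triangle t) => t.Sides();

    // The static type of 's' is the unsealed base class; without further
    // type information this remains a virtual call.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ViaBase(Shape s) => s.Sides();

    public static void Main()
    {
        var t = new Triangle();
        Console.WriteLine(ViaSealed(t));
        Console.WriteLine(ViaBase(t));
    }
}
```

Both calls return the same value; the difference, if any, would only show up in the JIT dump or the disassembly. `NoInlining` is used so that each call site is compiled as its own method and is easier to find.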

@benaadams
Member Author

Is it getting devirtualized as expected?

Not sure; it is doing a lot of good stuff, but I'm not sure it's picking this up:

Successfully inlined Encoding:get_UTF8():ref (6 IL bytes) (depth 1) [below ALWAYS_INLINE size]
**************** Inline Tree
Inlines into 06002D5B CustomAttributeBuilder:EmitString(ref,ref):this
  [1 IL=0000 TR=000001 060031A4] [below ALWAYS_INLINE size] Encoding:get_UTF8():ref
  [0 IL=0006 TR=000005 06003187] [FAILED: target not direct] Encoding:GetBytes(ref):ref:this

Digging a bit deeper

@benaadams
Member Author

Utf8String:ToString():ref:this doesn't seem to pick it up

Marking Utf8String:ToString():ref:this as NOINLINE because of too many il bytes
Successfully inlined Encoding:get_UTF8():ref (6 IL bytes) (depth 1) [below ALWAYS_INLINE size]
Successfully inlined Encoding:get_UTF8():ref (6 IL bytes) (depth 1) [below ALWAYS_INLINE size]
**************** Inline Tree
Inlines into 06000CF9 Utf8String:ToString():ref:this
  [1 IL=0065 TR=000058 060031A4] [below ALWAYS_INLINE size] Encoding:get_UTF8():ref
  [0 IL=0077 TR=000064 0600318E] [FAILED: target not direct] Encoding:GetCharCount(long,int):int:this
  [2 IL=0090 TR=000079 060031A4] [below ALWAYS_INLINE size] Encoding:get_UTF8():ref
  [0 IL=0104 TR=000087 06003193] [FAILED: target not direct] Encoding:GetChars(long,int,long,int):int:this
  [0 IL=???? TR=000097 06000367] [FAILED: not inline candidate] String:.ctor(long,int,int):this
Budget: initialTime=417, finalTime=413, initialBudget=4170, currentBudget=4170

Both virtual calls fail with "target not direct"; going to try a few things.

@AndyAyersMS
Member

For Utf8String:ToString():ref:this -- presumably in crossgen -- the first time the jit sees the virtual call (in the importer) it can't devirtualize, and because of this it can't inline either:

impDevirtualizeCall: type available, attempting devirt
$$$ In System.Utf8String:ToString():ref:this: Maybe devirt?
    class for 'this' is System.Text.Encoding (attrib 21000400)
    base method is System.Text.Encoding::GetCharCount
    devirt to System.Text.Encoding::GetCharCount
               [000064] --CXG-------             *  callv ind int    System.Text.Encoding.GetCharCount
               [000061] ------------ arg0        +--*  lclVar    long   V01 loc0         
               [000063] ---XG------- arg1        \--*  field     int    m_StringHeapByteLength
               [000062] ------------                \--*  lclVar    byref  V00 this         
    Class NOT final or exact, no devirtualization
INLINER: during 'impMarkInlineCandidate' result 'failed this call site' reason 'target not direct' for 'System.Utf8String:ToString():ref:this' calling 'System.Text.Encoding:GetCharCount(long,int):int:this'

but after the jit inlines the property getter, it gives devirtualization another try:

INLINER: during 'fgInline' result 'success' reason 'below ALWAYS_INLINE size' for 'System.Utf8String:ToString():ref:this' calling 'System.Text.Encoding:get_UTF8():ref'
...
**** Late devirt opportunity
               [000064] --CXG-------             *  callv ind int    System.Text.Encoding.GetCharCount
               [000116] --CXG------- this in rcx +--*  indir     ref   
               [000114] ------------             |  |  /--*  const     int    0x1130 Fseq[s_default]
               [000115] --CXG-------             |  \--*  +         byref 
               [000113] H-CXG-------             |     \--*  call help byref  HELPER.CORINFO_HELP_GETSHARED_GCSTATIC_BASE
               [000109] #----------- arg0        |        +--*  indir     long  
               [000108] ------------             |        |  \--*  const(h)  long   0x1b8f5d10988 cid
               [000110] ------------ arg1        |        \--*  const     int    983
               [000061] ------------ arg1        +--*  lclVar    long   V01 loc0         
               [000063] ---XG------- arg2        \--*  field     int    m_StringHeapByteLength
               [000062] ------------                \--*  lclVar    byref  V00 this   
impDevirtualizeCall: no type available (op=IND)      

Unfortunately it is then tripped up because when prejitting it doesn't directly see the static field access. If this method were being jitted then late devirtualization would kick in.

I'll have to add some more logic to handle the prejit case. Should not be too difficult to look through the IND and see if it is encapsulating a static field access, and if so, forward the type on to the devirtualization logic.

@benaadams
Member Author

If I seal UTF8Encoding itself and pass that back, it also doesn't seem to devirtualize; maybe because it's going in the opposite direction? E.g. it's an Encoding type and is relying on the inliner to discover that it's a derived type (rather than being a derived type that wants its functions devirtualized)?

@benaadams
Member Author

Ah ok, let me build an app

@AndyAyersMS
Member

I have a proof of concept for some of the prejit static field cases, but it still gets hung up when I use the code from this PR:

**** Late devirt opportunity
    ...
    Trying to tunnel through indir...
impDevirtualizeCall: type available, attempting devirt
$$$ In System.Utf8String:ToString():ref:this: Maybe devirt?
    class for 'this' is System.Text.UTF8Encoding (attrib 21000000)
    base method is System.Text.Encoding::GetChars
    devirt to System.Text.UTF8Encoding::GetChars
    ...
    Class NOT final or exact, no devirtualization

If I modify the code so the static field has the sealed type

        internal sealed class Sealed : UTF8Encoding {
            internal Sealed() : base(encoderShouldEmitUTF8Identifier: true)           
            {
            }
        } 

        // Used by Encoding.UTF8 for lazy initialization
        internal static readonly Sealed s_default = new Sealed();

then we finally devirtualize during prejitting:

**** Late devirt opportunity
     ...
     Trying to tunnel through indir
impDevirtualizeCall: type available, attempting devirt
$$$ In System.Utf8String:ToString():ref:this: Maybe devirt?
    class for 'this' is Sealed [final] (attrib 21000010)
    base method is System.Text.Encoding::GetCharCount
    devirt to System.Text.UTF8Encoding::GetCharCount
    ....
!!! final class or method; can devirtualize
... after devirt...
               [000064] --CXG-------             *  call nullcheck int    System.Text.UTF8Encoding.GetCharCount

The challenge on the jit side is that ldsfld can expand into a huge number of variations -- it's not practical or robust to try and pattern match them like I am doing in the proof of concept. Handling this properly requires some thought, so it may take a bit of time to get it all pulled together.
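
The difference between the two field declarations discussed above can be illustrated with a short sketch. The names (`BaseEnc`, `SealedEnc`, `Holder`) are hypothetical, and reflection is used here only as a stand-in for "what is statically known about the field"; the JIT does not use reflection. Both fields hold an instance of the sealed class, but only the second one advertises that through its declared type.

```csharp
using System;
using System.Reflection;

public class BaseEnc
{
    public virtual int Count() => 1;
}

public sealed class SealedEnc : BaseEnc
{
    public override int Count() => 2;
}

public static class Holder
{
    // Declared as the base type: the declaration alone does not pin down the exact class.
    public static readonly BaseEnc s_asBase = new SealedEnc();

    // Declared as the sealed type: the declaration alone pins down the exact class.
    public static readonly SealedEnc s_asSealed = new SealedEnc();
}

public static class Program
{
    // Reports whether the declared (static) type of a public field on Holder is sealed.
    public static bool DeclaredTypeIsSealed(string fieldName) =>
        typeof(Holder).GetField(fieldName).FieldType.IsSealed;

    public static void Main()
    {
        Console.WriteLine(DeclaredTypeIsSealed("s_asBase"));   // False
        Console.WriteLine(DeclaredTypeIsSealed("s_asSealed")); // True

        // At run time both fields behave identically.
        Console.WriteLine(Holder.s_asBase.Count() == Holder.s_asSealed.Count()); // True
    }
}
```

This mirrors the dumps above: with the field typed as the unsealed class, the class is reported as "NOT final or exact"; with the field typed as the sealed class, it is reported as final.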

@benaadams
Member Author

benaadams commented Mar 4, 2017

@AndyAyersMS @stephentoub Yes, it does pick it up later when jitted at runtime (in ASP.NET Kestrel):

Devirtualized call to Encoding:GetString to directly call UTF8Encoding:GetString
Devirtualized call to Encoding:GetBytes to directly call ASCIIEncoding:GetBytes

@benaadams
Member Author

I can't measure any perf impact currently, as I am running a bit of a zombie build to work with the latest coreclr/corefx.

@benaadams
Member Author

Is anything more needed for this? It is getting devirtualized correctly.

@stephentoub stephentoub merged commit b178cb3 into dotnet:master Mar 14, 2017
@karelz karelz added this to the 2.0.0 milestone Aug 28, 2017
@benaadams benaadams deleted the sealed-encodings branch March 27, 2018 05:12
6 participants