Inliner: Extend IL limit for profiled call-sites, allow inlining for switches. #55478

Merged
EgorBo merged 21 commits into dotnet:main on Jul 14, 2021

Member

@EgorBo EgorBo commented Jul 11, 2021

This PR:

  • Extends IL limit for methods with profile data
  • Removes the Amount-of-BBs-limit
  • Recognizes "x isinst/cast" as foldable when x is exact (we know its exact class)
  • Enables inlining for methods with foldable switch (closes JIT: Methods with switches aren't inlineable #55336) - these used to be non-inlineable always.
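
As an illustration of what "foldable switch" means in the last bullet, here is a minimal C++ analogy. It is not JIT code, and `Shape`, `SideCount` and `TriangleSides` are hypothetical names invented for this sketch. The idea is that when the switch operand is a constant at the call site, the whole switch collapses to a single arm after inlining, so the inlined body ends up far smaller than the callee's size suggests.

```cpp
#include <cassert>

// Hypothetical callee: a small method whose body is a single switch.
enum class Shape { Triangle, Square, Pentagon };

constexpr int SideCount(Shape s)
{
    switch (s)
    {
        case Shape::Triangle: return 3;
        case Shape::Square:   return 4;
        case Shape::Pentagon: return 5;
    }
    return 0; // unreachable for valid enum values
}

// Hypothetical caller: the switch operand is a constant at this call site,
// so once SideCount is inlined only the Triangle arm survives.
constexpr int TriangleSides()
{
    return SideCount(Shape::Triangle);
}

// The C++ compiler folds the switch at compile time; this is the analogue
// of the JIT folding the switch once the callee has been inlined.
static_assert(TriangleSides() == 3, "switch folds to a constant");
```

A call such as `SideCount(userInput)` with a non-constant operand would not be foldable in this sense; the whole switch would have to be kept.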

As a result, it significantly improves several TE benchmarks in FullPGO mode (will post the results for all of the TE benchmarks later):
[image: TE benchmark results in FullPGO mode]

SPC's R2R size with --Ot: 10.35Mb, +0.9% size increase (it's 10.42Mb in Main).
SPC's R2R size with -O (default mode): 9.93Mb

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 11, 2021
@EgorBo EgorBo marked this pull request as ready for review July 12, 2021 13:14
@EgorBo EgorBo mentioned this pull request Jul 12, 2021
Member

@AndyAyersMS AndyAyersMS left a comment

Do we see any switches getting inlined in the P0/P1 tests? If not, consider adding some test cases for them.

Also we might want to enhance jit stress to explore some of these same new inlining capabilities.

Otherwise LGTM.


if ((fgFirstBB != nullptr) && (fgPgoSource == ICorJitInfo::PgoSource::Static))
{
    const BasicBlock::weight_t sufficientSamples = 5000;
Member

This used to be 1000 -- I assume you're increasing this to keep prejit image size small?

If so, you should add a comment describing how this value influences prejit size.
If not, you might comment on what the impact of changing this would be.

Member

1000 samples feels like plenty of evidence that something is hot.

Member Author

@EgorBo EgorBo Jul 12, 2021

Reverted to 1000. Yes, this change was supposed to decrease the prejitted size; I was using this histogram:
[image: histogram of block weights]
(weights in SPC, the right column starts at 50000).

However, I no longer need to save space this way, because I found an unrelated issue that bloated the size for no reason: binary expressions like "arg op cns" used to leave "cns" on top of the pushed stack, so there were lots of false-positive foldable branches/switches.
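
The "arg op cns" problem can be pictured with a toy abstract-stack walk. This is only a sketch under stated assumptions, not the JIT's implementation: `Slot`, `Op` and `BranchLooksFoldable` are hypothetical names, and the result of the binary operation is conservatively treated as unknown because the argument is assumed not to be a constant at the call site.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of an abstract IL stack slot.
enum class Slot { Unknown, Argument, Constant };

// Hypothetical opcodes for a tiny IL-like sequence.
enum class Op { LoadArg, LoadConst, Binary, Branch };

// Returns true if the first Branch sees a constant on top of the abstract
// stack, i.e. the branch would be counted as "foldable".
// popOperands == true  models the fixed behavior,
// popOperands == false models the bug described above.
bool BranchLooksFoldable(const std::vector<Op>& code, bool popOperands)
{
    std::vector<Slot> stack;
    for (Op op : code)
    {
        switch (op)
        {
            case Op::LoadArg:
                stack.push_back(Slot::Argument);
                break;
            case Op::LoadConst:
                stack.push_back(Slot::Constant);
                break;
            case Op::Binary:
                if (popOperands && stack.size() >= 2)
                {
                    // Fixed: consume both operands and push the result,
                    // which is not known to be a constant here.
                    stack.pop_back();
                    stack.pop_back();
                    stack.push_back(Slot::Unknown);
                }
                // Buggy: operands stay in place, so "cns" remains on top.
                break;
            case Op::Branch:
                return !stack.empty() && stack.back() == Slot::Constant;
        }
    }
    return false;
}
```

For the sequence `LoadArg, LoadConst, Binary, Branch`, the buggy variant reports the branch as foldable (a false positive), while the fixed variant does not.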

Member Author

EgorBo commented Jul 12, 2021

> Do we see any switches getting inlined in the P0/P1 tests? If not, consider adding some test cases for them.

Not sure about the runtime tests yet (will check), but jit-diff (-f --pmi) for Libs found 145 methods:

Found 293 files with textual diffs.

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 60195858
Total bytes of diff: 60190743
Total bytes of delta: -5115 (-0.01% of base)
Total relative delta: 8.65
    diff is an improvement.
    relative diff is a regression.


Top file regressions (bytes):
         156 : System.Security.Cryptography.Algorithms.dasm (0.04% of base)
         143 : System.Security.Cryptography.Pkcs.dasm (0.03% of base)
         129 : System.Drawing.Common.dasm (0.03% of base)
         118 : System.Speech.dasm (0.03% of base)
          33 : ILCompiler.TypeSystem.ReadyToRun.dasm (0.01% of base)
          19 : System.Private.CoreLib.dasm (0.00% of base)

Top file improvements (bytes):
       -3901 : Microsoft.CodeAnalysis.CSharp.dasm (-0.08% of base)
        -666 : System.Private.Xml.dasm (-0.02% of base)
        -229 : System.Diagnostics.Process.dasm (-0.23% of base)
        -214 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.00% of base)
        -180 : System.Reflection.Metadata.dasm (-0.03% of base)
        -159 : System.Diagnostics.DiagnosticSource.dasm (-0.09% of base)
         -86 : Microsoft.Extensions.FileProviders.Physical.dasm (-0.46% of base)
         -78 : System.Resources.Extensions.dasm (-0.22% of base)
         -75 : System.Data.Common.dasm (-0.00% of base)
         -43 : System.Configuration.ConfigurationManager.dasm (-0.01% of base)
         -32 : Microsoft.Extensions.Primitives.dasm (-0.12% of base)
         -30 : Microsoft.Extensions.Logging.dasm (-0.10% of base)
         -20 : System.Net.HttpListener.dasm (-0.01% of base)

19 total files with Code Size differences (13 improved, 6 regressed), 254 unchanged.

Top method regressions (bytes):
         288 (51.06% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:ParseComplexElementInitializer():InitializerExpressionSyntax:this
         282 (93.38% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:ParseObjectInitializerNamedAssignment():ExpressionSyntax:this
         279 (85.85% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:ParseDictionaryInitializer():ExpressionSyntax:this
         233 (57.53% of base) : System.Private.CoreLib.dasm - TimeSpanParse:TryParseTimeSpan(ReadOnlySpan`1,ubyte,IFormatProvider,byref):bool
          78 (600.00% of base) : System.Security.Cryptography.Pkcs.dasm - CmsRecipient:.ctor(X509Certificate2):this
          65 (52.42% of base) : System.Security.Cryptography.Pkcs.dasm - CmsRecipient:.ctor(X509Certificate2,RSAEncryptionPadding):this
          60 (142.86% of base) : System.Drawing.Common.dasm - FontFamily:get_GenericSerif():FontFamily
          57 (126.67% of base) : System.Drawing.Common.dasm - FontFamily:get_GenericMonospace():FontFamily
          51 (22.27% of base) : System.Private.CoreLib.dasm - TimeSpanParse:ProcessTerminalState(byref,ubyte,byref):bool
          48 (15.74% of base) : System.Private.CoreLib.dasm - CalendarData:LoadCalendarDataFromSystemCore(String,ushort):bool:this
          39 ( 7.75% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryDecryptEcbCore(ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 7.75% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryEncryptEcbCore(ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 6.84% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryEncryptCbcCore(ReadOnlySpan`1,ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 6.84% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryDecryptCbcCore(ReadOnlySpan`1,ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 5.87% of base) : System.Private.CoreLib.dasm - CalendarData:NlsLoadCalendarDataFromSystem(String,ushort):bool:this
          33 ( 3.31% of base) : ILCompiler.TypeSystem.ReadyToRun.dasm - MetadataExtensions:GetDelegatePInvokeFlags(EcmaType):PInvokeFlags
          30 (115.38% of base) : System.Configuration.ConfigurationManager.dasm - ConfigurationSchemaErrors:SetSingleGlobalError(ConfigurationException):this
          25 ( 5.76% of base) : System.Speech.dasm - SemanticResultKey:.ctor(String,ref):this (2 methods)
          21 ( 4.07% of base) : System.Speech.dasm - Choices:Add(ref):this (2 methods)
          20 ( 0.92% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SourceAssemblySymbol:DetectAttributeAndOptionConflicts(DiagnosticBag):this

Top method improvements (bytes):
        -500 (-6.91% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitMethod(IMethodSymbol):this
        -336 (-10.60% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddTypeParameterConstraints(ImmutableArray`1):this
        -252 (-11.50% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitProperty(IPropertySymbol):this
        -252 (-9.21% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddNameAndTypeArgumentsOrParameters(INamedTypeSymbol):this
        -233 (-8.83% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitParameter(IParameterSymbol):this
        -228 (-20.05% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - SourceNamedTypeSymbol:AddSyntheticMyGroupCollectionProperty(NamedTypeSymbol,bool,String,String,String,MembersAndInitializersBuilder,Binder,AttributeSyntax,DiagnosticBag):this
        -216 (-13.22% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddAccessibilityIfRequired(ISymbol):this
        -216 (-12.04% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddMemberModifiersIfRequired(ISymbol):this
        -216 (-12.54% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddArrayRank(IArrayTypeSymbol):this
        -199 (-8.56% of base) : System.Diagnostics.Process.dasm - NtProcessManager:GetProcessInfos(PerformanceCounterLib,int,int,ReadOnlySpan`1):ref
        -180 (-15.54% of base) : System.Reflection.Metadata.dasm - MetadataAggregator:CalculateHeapSizes(IReadOnlyList`1,IReadOnlyList`1):ImmutableArray`1
        -177 (-12.29% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddParametersIfRequired(bool,bool,ImmutableArray`1):this
        -175 (-43.21% of base) : System.Private.Xml.dasm - OptimizerPatterns:Inherit(QilNode,QilNode,int)
        -160 (-10.65% of base) : System.Private.CoreLib.dasm - Directory:Move(String,String)
        -144 (-7.64% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitNamedType(INamedTypeSymbol):this
        -144 (-11.65% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddConstantValue(ITypeSymbol,Object,bool):this
        -144 (-10.88% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:MinimallyQualify(INamedTypeSymbol):this
        -142 (-10.08% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddPropertyNameAndParameters(IPropertySymbol):this
        -118 (-9.56% of base) : System.Diagnostics.DiagnosticSource.dasm - FilterAndTransform:AddNewActivitySourceTransform(String,int,int,DiagnosticSourceEventSource)
        -111 (-7.27% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddCustomModifiersIfRequired(ImmutableArray`1,bool,bool):this

Top method regressions (percentages):
          78 (600.00% of base) : System.Security.Cryptography.Pkcs.dasm - CmsRecipient:.ctor(X509Certificate2):this
          60 (142.86% of base) : System.Drawing.Common.dasm - FontFamily:get_GenericSerif():FontFamily
          19 (135.71% of base) : System.Speech.dasm - GrammarBuilderPhrase:.ctor(String):this
          57 (126.67% of base) : System.Drawing.Common.dasm - FontFamily:get_GenericMonospace():FontFamily
          30 (115.38% of base) : System.Configuration.ConfigurationManager.dasm - ConfigurationSchemaErrors:SetSingleGlobalError(ConfigurationException):this
         282 (93.38% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:ParseObjectInitializerNamedAssignment():ExpressionSyntax:this
         279 (85.85% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:ParseDictionaryInitializer():ExpressionSyntax:this
         233 (57.53% of base) : System.Private.CoreLib.dasm - TimeSpanParse:TryParseTimeSpan(ReadOnlySpan`1,ubyte,IFormatProvider,byref):bool
          65 (52.42% of base) : System.Security.Cryptography.Pkcs.dasm - CmsRecipient:.ctor(X509Certificate2,RSAEncryptionPadding):this
         288 (51.06% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:ParseComplexElementInitializer():InitializerExpressionSyntax:this
          51 (22.27% of base) : System.Private.CoreLib.dasm - TimeSpanParse:ProcessTerminalState(byref,ubyte,byref):bool
          12 (18.18% of base) : System.Resources.Extensions.dasm - TypeNameComparer:IsMscorlib(ReadOnlySpan`1):bool
          48 (15.74% of base) : System.Private.CoreLib.dasm - CalendarData:LoadCalendarDataFromSystemCore(String,ushort):bool:this
          10 ( 8.85% of base) : System.Speech.dasm - BuilderElements:Add(String):this
          10 ( 8.00% of base) : System.Speech.dasm - SemanticKeyElement:Add(String):this
          39 ( 7.75% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryDecryptEcbCore(ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 7.75% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryEncryptEcbCore(ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 6.84% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryEncryptCbcCore(ReadOnlySpan`1,ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 6.84% of base) : System.Security.Cryptography.Algorithms.dasm - DesImplementation:TryDecryptCbcCore(ReadOnlySpan`1,ReadOnlySpan`1,Span`1,int,byref):bool:this
          39 ( 5.87% of base) : System.Private.CoreLib.dasm - CalendarData:NlsLoadCalendarDataFromSystem(String,ushort):bool:this

Top method improvements (percentages):
        -175 (-43.21% of base) : System.Private.Xml.dasm - OptimizerPatterns:Inherit(QilNode,QilNode,int)
        -228 (-20.05% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - SourceNamedTypeSymbol:AddSyntheticMyGroupCollectionProperty(NamedTypeSymbol,bool,String,String,String,MembersAndInitializersBuilder,Binder,AttributeSyntax,DiagnosticBag):this
        -180 (-15.54% of base) : System.Reflection.Metadata.dasm - MetadataAggregator:CalculateHeapSizes(IReadOnlyList`1,IReadOnlyList`1):ImmutableArray`1
         -13 (-15.29% of base) : System.Private.Xml.dasm - XmlILOptimizerVisitor:IsStepPattern(OptimizerPatterns,int):bool:this
         -36 (-15.19% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddSpace():this
         -39 (-15.12% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddPunctuation(ushort):this
         -39 (-15.12% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddKeyword(ushort):this
         -39 (-15.00% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddBitwiseOr():this
         -39 (-14.34% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitModule(IModuleSymbol):this
         -39 (-14.34% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitLabel(ILabelSymbol):this
         -39 (-14.34% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitDynamicType(IDynamicTypeSymbol):this
        -106 (-13.49% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddGlobalNamespace(INamespaceSymbol):this
        -216 (-13.22% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddAccessibilityIfRequired(ISymbol):this
         -36 (-12.86% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddSpecialTypeKeyword(INamedTypeSymbol):bool:this
         -72 (-12.79% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddAccessor(ISymbol,IMethodSymbol,ushort):this
        -108 (-12.54% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddFieldModifiersIfRequired(IFieldSymbol):this
        -216 (-12.54% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddArrayRank(IArrayTypeSymbol):this
         -72 (-12.50% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddTypeParameterVarianceIfRequired(ITypeParameterSymbol):this
         -72 (-12.41% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:VisitRangeVariable(IRangeVariableSymbol):this
        -177 (-12.29% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SymbolDisplayVisitor:AddParametersIfRequired(bool,bool,ImmutableArray`1):this

145 total methods with Code Size differences (105 improved, 40 regressed), 275333 unchanged.

The diff is negative, for a change that is supposed to inline more methods than the baseline!
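
The headline numbers in the summary can be re-derived from the two totals. The helper below is a hypothetical sketch (`SizeDelta` and `ComputeDelta` are not part of jit-diff); it only reproduces the arithmetic behind "Total bytes of delta" and the percentage of base.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: byte delta and the same delta as a percentage
// of the base size.
struct SizeDelta
{
    std::int64_t bytes;
    double       percentOfBase;
};

// Computes diff - base and expresses it relative to base.
SizeDelta ComputeDelta(std::int64_t baseBytes, std::int64_t diffBytes)
{
    const std::int64_t delta = diffBytes - baseBytes;
    const double percent =
        100.0 * static_cast<double>(delta) / static_cast<double>(baseBytes);
    return {delta, percent};
}
```

With the totals reported above, `ComputeDelta(60195858, 60190743)` gives a delta of -5115 bytes, which is about -0.0085% of base and is shown as -0.01% in the summary.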

Member Author

EgorBo commented Jul 13, 2021

One more note regarding switches: the runtime tests contain several hundred of them, including foldable ones. Even non-foldable switches can be inlined in stress mode, because we apply an additional +10 multiplier in that mode.
There was a small debug-only issue with inlined switches because fgBBNumMax was not updated correctly, but I fixed it (JitDump used to assert somewhere in m_switchDescMap).

@EgorBo EgorBo merged commit 21e36e8 into dotnet:main Jul 14, 2021
thaystg added a commit to thaystg/runtime that referenced this pull request Jul 14, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 13, 2021
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: Methods with switches aren't inlineable
3 participants