Fix x86 steady state tiered compilation performance #17476

Merged
merged 2 commits into from Apr 11, 2018

@noahfalk
Member

noahfalk commented Apr 9, 2018

Fixes #17475

Also included - a few tiered compilation only test hooks + small logging fix for JitBench

Tiered compilation wasn't correctly implementing the MayHavePrecode and RequiresStableEntryPoint policy functions. On x64 this was a non-issue, but because of compact entrypoints on x86 it led to methods allocating both FuncPtrStubs and Precodes. The FuncPtrStubs would never get backpatched, which caused never-ending invocations of the Prestub for some methods. Although such code still runs correctly, it is much slower than it needs to be. On MusicStore x86 I am seeing a 20% improvement in steady state RPS after this fix, bringing us in line with what I've seen on x64.
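To make the failure mode concrete, here is a minimal toy model of it. It is not CoreCLR code: `ToyMethod`, `ToyStub`, `Call`, and `SimulateCalls` are hypothetical names invented for illustration, and the real runtime's stub and backpatching machinery is considerably more involved. The sketch only assumes what the description says: a method ends up with two indirection stubs, and only one of them is known to the backpatching step.

```cpp
#include <cassert>
#include <vector>

// Toy model only: these types are hypothetical and are not the CoreCLR ones.
struct ToyMethod;

// An indirection cell that callers jump through to reach a method.
struct ToyStub {
    explicit ToyStub(ToyMethod* m) : method(m), patched(false) {}
    ToyMethod* method;
    bool patched;  // true once the stub targets compiled code directly
};

struct ToyMethod {
    int prestubCalls = 0;                 // slow path taken
    int directCalls = 0;                  // fast path taken
    std::vector<ToyStub*> backpatchList;  // stubs the runtime knows to update

    // Stand-in for the prestub: "compile", then backpatch every known stub.
    void Prestub() {
        ++prestubCalls;
        for (ToyStub* stub : backpatchList) stub->patched = true;
    }
};

// A call through a stub takes the fast path only if the stub was backpatched.
void Call(ToyStub& stub) {
    if (stub.patched) ++stub.method->directCalls;
    else stub.method->Prestub();
}

struct ToyCounts { int prestubCalls; int directCalls; };

// Calls one method n times through each of two stubs, where only the
// first stub is registered for backpatching.
ToyCounts SimulateCalls(int n) {
    assert(n >= 0);
    ToyMethod method;
    ToyStub precode(&method);
    ToyStub funcPtrStub(&method);
    method.backpatchList.push_back(&precode);  // funcPtrStub deliberately omitted
    for (int i = 0; i < n; ++i) {
        Call(precode);
        Call(funcPtrStub);
    }
    return ToyCounts{method.prestubCalls, method.directCalls};
}
```

In this model `SimulateCalls(1000)` reports 1001 prestub calls and 999 direct calls: the registered stub takes the slow path once, while the unregistered one takes it on every call. The results stay correct, but every call through the second stub pays the slow-path cost.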

As of now - without tiered compilation:

============= Startup Performance ============

Server start (ms):   739
1st Request (ms):    688
Total (ms):         1427



========== Steady State Performance ==========

  Requests    Aggregate Time(ms)    Req/s   Req Min(ms)   Req Mean(ms)   Req Median(ms)   Req Max(ms)   SEM(%)
-----------   ------------------   ------   -----------   ------------   --------------   -----------   ------
    2-  100                 1820   251.48          3.33           3.98             3.79         16.42     3.31
  101-  250                 2406   256.26          3.16           3.90             3.82         12.43     1.69
  251-  500                 3352   264.10          3.20           3.79             3.72          8.88     0.76
  501-  750                 4293   265.60          3.16           3.77             3.69          7.70     0.70
  751- 1000                 5235   265.42          2.80           3.77             3.71          7.73     0.77
 1001- 1500                 7148   261.38          2.73           3.83             3.70         11.37     0.76
 1501- 2000                 9016   267.76          3.04           3.73             3.69          6.14     0.40
 2001- 3000                12753   267.57          3.08           3.74             3.69          6.68     0.27
 3001- 5000                20281   265.68          3.05           3.76             3.71          6.81     0.20
 5001-10000                39027   266.71          2.88           3.75             3.70          6.40     0.13

With tiered compilation:

============= Startup Performance ============

Server start (ms):   671
1st Request (ms):    540
Total (ms):         1211



========== Steady State Performance ==========

  Requests    Aggregate Time(ms)    Req/s   Req Min(ms)   Req Mean(ms)   Req Median(ms)   Req Max(ms)   SEM(%)
-----------   ------------------   ------   -----------   ------------   --------------   -----------   ------
    2-  100                 1732   189.77          3.37           5.27             4.33         18.23     4.70
  101-  250                 2345   244.66          3.20           4.09             3.98         13.00     1.81
  251-  500                 3202   291.77          2.43           3.43             3.42          6.07     0.94
  501-  750                 3984   319.69          2.45           3.13             3.02         20.45     2.33
  751- 1000                 4746   328.26          2.48           3.05             2.97          6.79     0.87
 1001- 1500                 6277   326.59          2.42           3.06             3.01          5.17     0.46
 1501- 2000                 7776   333.43          2.42           3.00             2.96          4.93     0.46
 2001- 3000                10759   335.22          2.14           2.98             2.95          5.97     0.31
 3001- 5000                16845   328.62          2.11           3.04             2.99          5.51     0.24
 5001-10000                31859   333.03          2.13           3.00             2.96          5.55     0.14
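For reading the two tables side by side, a small helper for the relative change may be useful. `PercentChange` is a hypothetical name introduced here, not part of JitBench. Note that this compares tiered against non-tiered runs after the fix, which is a different comparison from the 20% before/after improvement quoted in the description.

```cpp
#include <cassert>

// Relative change of `after` versus `before`, in percent.
// Positive means `after` is larger.
double PercentChange(double before, double after) {
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

Applied to the last steady-state bucket (requests 5001-10000), `PercentChange(266.71, 333.03)` comes to roughly 24.9% more requests per second with tiered compilation. Applied to total startup time, `PercentChange(1427, 1211)` comes to roughly -15.1%, that is, about 15% less time to serve the first request.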

@noahfalk noahfalk requested a review from kouvel Apr 10, 2018

@@ -285,6 +285,8 @@ class EEConfig
// Tiered Compilation config
#if defined(FEATURE_TIERED_COMPILATION)
bool TieredCompilation(void) const {LIMITED_METHOD_CONTRACT; return fTieredCompilation; }
bool TieredCompilation_CallCounting() const {LIMITED_METHOD_CONTRACT; return fTieredCompilation_CallCounting; }
bool TieredCompilation_OptimizeTier0() const {LIMITED_METHOD_CONTRACT; return fTieredCompilation_OptimizeTier0; }

@kouvel

kouvel Apr 10, 2018

Member

Nit: indentation seems to be a bit off

@kouvel

kouvel approved these changes Apr 10, 2018

@@ -1109,6 +1111,8 @@ class EEConfig
#if defined(FEATURE_TIERED_COMPILATION)
bool fTieredCompilation;
bool fTieredCompilation_CallCounting;
bool fTieredCompilation_OptimizeTier0;

@kouvel

kouvel Apr 10, 2018

Member

Nit: indentation

@@ -2415,6 +2415,10 @@ BOOL MethodDesc::RequiresStableEntryPoint(BOOL fEstimateForChunk /*=FALSE*/)
{
LIMITED_METHOD_CONTRACT;
// Create precodes for versionable methods
if (IsVersionableWithPrecode())
return TRUE;

@kouvel

kouvel Apr 10, 2018

Member

Nit: indentation

@noahfalk

Member

noahfalk commented Apr 11, 2018

@dotnet-bot test Windows_NT x86 Release Innerloop Build and Test

@noahfalk

Member

noahfalk commented Apr 11, 2018

@dotnet-bot Ubuntu arm Cross Checked Innerloop Build and Test

@noahfalk

Member

noahfalk commented Apr 11, 2018

@dotnet-bot test Ubuntu arm Cross Checked Innerloop Build and Test

@noahfalk noahfalk merged commit 6854a3e into dotnet:master Apr 11, 2018

2 of 15 checks passed

Alpine.3.6 x64 Debug Build Started
CROSS Check Triggered.
CentOS7.1 x64 Checked Innerloop Build and Test Triggered.
CentOS7.1 x64 Debug Innerloop Build Triggered.
OSX10.12 x64 Checked Innerloop Build and Test Triggered.
Ubuntu arm Cross Checked Innerloop Build and Test Triggered.
Ubuntu arm64 Cross Debug Innerloop Build Triggered.
Ubuntu x64 Checked Innerloop Build and Test Triggered.
Ubuntu x64 Formatting Triggered.
Windows_NT x64 Checked Innerloop Build and Test Triggered.
Windows_NT x64 Formatting Triggered.
Windows_NT x86 Checked Innerloop Build and Test Triggered.
Windows_NT x86 Release Innerloop Build and Test Triggered.
WIP ready for review
license/cla All CLA requirements met.