Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance degradation of StackFrameHelper that's shipped with 4.7.1 #529

Closed
batkaevruslan opened this issue Oct 19, 2017 · 33 comments
Closed

Comments

@batkaevruslan
Copy link

batkaevruslan commented Oct 19, 2017

Run this code

public static void Main()
{
   for (int i = 0; i < 10000; i++)
   {
      new StackTrace(true).GetFrames();
   }
}

on Windows10 with .net 4.7.1 and it will take 5 seconds to complete.
Run this code on the same system without 4.7.1 and it will take 200ms

There is performance degradation in System.Diagnostrics.StackFrameHelper.InitializeSourceInfo

image

Looks like caching is broken and pdb files are loaded every time
image

@batkaevruslan batkaevruslan changed the title Performance degradation of 4.7.1 Performance degradation of StackFrameHelper that's shipped with 4.7.1 Oct 19, 2017
@StasPerekrestov
Copy link

It greatly affects unit test projects that use Moq framework. Internally, moq references the StackTrace class
https://github.com/moq/moq4/blob/master/Source/MethodCall.cs#L190

@Petermarcu
Copy link
Member

@lt72 do you know who we should get to take a look at this?

@lt72
Copy link
Contributor

lt72 commented Oct 19, 2017

@batkaevruslan: when you are running without 4.7.1, which framework are you running against?

@chsienki @noahfalk FYI

@StasPerekrestov
Copy link

StasPerekrestov commented Oct 19, 2017

@lt72 Actually, it doesn't matter. You may still target 4.6.1 for your projects. However, if you install net-471 on your machine, then you'll get the issue. BTW, Windows 10 creators update is shipped with 4.7.1 preinstalled.
Our build servers had 4.7 preinstalled till yesterday's morning, however our projects target 4.6.1.
We got the issue yesterday after we installed 4.7.1 on the build servers.

@batkaevruslan
Copy link
Author

as @StasPerekrestov said, it doesn't depend on target framework if 4.7.1 installed

@lt72
Copy link
Contributor

lt72 commented Oct 23, 2017

@chsienki: Can you please give a look to this one as soon as you get a chance?

@noahfalk
Copy link
Member

@chsienki is out so I've started taking a look a this instead. I believe I have reproed the behavior you are seeing, I agree the perf seems surprisingly slow and I'll be digging in to find out what happened and what we can do about it. Thanks for reporting the issue!

@springy76
Copy link

springy76 commented Oct 30, 2017

I just ran into this problem by our performance unit testing which suddenly became slower by a factor of 20+ due to calling StackTrace ctor for every EF6 query which trips through IDbCommandTreeInterceptor.*Executed in order to build per-callstack dictionaries of performance statistics.

@codecat
Copy link

codecat commented Oct 30, 2017

@noahfalk Is there any more internal info about this performance decrease that might help us narrow down if this is the same issue we're seeing?

We have some reports about 4.7.1 greatly decreasing performance compared to 4.7.0, but I don't exactly know if this is using the StackTrace ctor as above.

@noahfalk
Copy link
Member

I was hoping to validate some hypothesis first, but if its useful these are my unvalidated leads so far as to why the performance is slow:

In 4.7.1 we added support for parsing the Portable PDB format in the desktop framework. The code above that @batkaevruslan referenced in the images is attempting to do that parsing. However doing code review it appears the caching policy for the results of the portable PDB parse differ from the classic PDB format parse in two ways:
a) The cache only has a lifetime for a single stacktrace formatting operation. If multiple frames in a single stack trace use the same module the parse for the corresponding PDB will be reused, but it will not be used across different StackTraces.
b) The cache only saves positive results, the cases where it successfully parsed a portable PDB. In the example above there are 4 frames from mscorlib.dll which does not have a PDB adjacent to it or embedded inside of it. In each of these frames the parsing code will load parts of mscorlib.dll into memory and parse the PE format to check if there is an embedded portable PDB.

The differences in code are definitely there and I am reasonably convinced they will turn out to be the culprit, however I am working to setup a private build here at work with a different cache policy so that I can A/B test it. Its possible that there are further issues we haven't uncovered yet, or surprisingly these cache issues didn't contribute as severely to the performance as I'd expect.

@codecat - The easiest way to check if it is this particular issue affecting you is to collect an ETW trace using a profiler. Visual Studio has a profiler built-in or PerfView (https://www.microsoft.com/en-us/download/details.aspx?id=28567) is a standalone tool I often use. If you see a large amount of time spent inside the methods in the System.Diagnostics.StackTraceSymbols class then you are probably seeing the same issue.

For folks hitting this issue, if you have the option to stop collecting source line information you can call new StackTrace(false) instead of new StackTrace(true) and that should circumvent the problem. Definitely sorry, I know hitting these types of issues is aggravating and I am working on a better solution.

@stakx
Copy link

stakx commented Oct 30, 2017

For folks hitting this issue, if you have the option to stop collecting source line information you can call new StackTrace(false) instead of new StackTrace(true) and that should circumvent the problem.

Seeing that @StasPerekrestov mentioned Moq (on which I've lately done some work), I'd just like to say that circumventing the problem this way is not really an option for Moq. We'd have to remove a fairly useful diagnostic feature.

I feel that this problem needs to be fixed (and soon) where it was caused: in the .NET framework.

In 4.7.1 we added support for parsing the Portable PDB format in the desktop framework.

Support for the new format is appreciated. I just hope it's possible to enable the new format without breaking old use cases...

@noahfalk: Thank you for looking into this problem. Your effort is much appreciated! Please keep us posted about a possible timeline for a fix.

@noahfalk
Copy link
Member

I was able to verify that both of those caching issues mattered and changing them brings the benchmark down from ~6000ms -> ~350ms on my machine. After I get some sleep, tomorrow I'll see what I can learn about the remaining perf discrepancy.

I just hope it's possible to enable the new format without breaking old use cases...

That's certainly our goal as well. We strive for a high bar on back compat in the desktop .NET framework and we clearly fell short on this one. I'll do my best to make it right.

@lt72 lt72 assigned lt72 and unassigned chsienki and lt72 Oct 31, 2017
@noahfalk
Copy link
Member

noahfalk commented Nov 1, 2017

Further investigation showed that of the remaining 350ms, about 110ms comes from CAS asserts added to the implementation in 4.7.1, 40ms from one time costs jitting + once-per-module costs checking for portable PDB, and then the original 200ms are carried over from the 4.7 implementation. I'm optimistic the CAS costs can be significantly reduced and the jitting costs should go away once my test scenario is NGEN'ed (the framework is automatically NGEN'ed when you receive it via Windows update or MSI installer). The once-per-module cost checking for the portable PDB would be hard to reduce without getting rid of Portable PDB support.

At this point I think the performance issues are reasonably understood and focus is shifting to constructing the best fix and the logistics of distribution. Thanks for your patience and I'll continue to update the thread as I learn more.

@StasPerekrestov
Copy link

@noahfalk you rock! Thank you for giving us a hand with the issue.

@noahfalk
Copy link
Member

noahfalk commented Nov 6, 2017

Nothing specific to report, just wanted to confirm that progress is very much ongoing.

@batkaevruslan
Copy link
Author

batkaevruslan commented Nov 7, 2017 via email

@noahfalk
Copy link
Member

I just met with the servicing team, the fix for this was approved and it should be available in an upcoming release within a few months. I notice the friendly Moq devs also issued their own workaround (thanks!) so if you are in a pinch between now and the time that the .Net fix is available updating to the latest version of Moq should give you some relief.

I don't think I'll have anything else update-worthy while the gears of the testing/servicing/release process are running but if there are further questions I'll do my best to answer them or find someone else who can.

@richlander
Copy link
Member

We may have a private version that folks can test to validate it. If you want in, contact me at rlander@ms.

@lt72
Copy link
Contributor

lt72 commented Dec 1, 2017

We do have a fix for 4.7.1, and it will be available in a broad release soon. Likely in the January time frame.

@rschiefer
Copy link

We've also seen our test builds double in time after upgrading to .NET 4.7.1. I tried emailing Rich to get the private version but haven't heard back (he's probably busy). Can anyone else provide it?

@zhiliangxu
Copy link
Member

Azure App Service in West US has rolled out Windows Server 2016 and .Net 4.7.1. Our production service has been dramatically slowed down! @lt72 when the fix is available in January, will that also be applied to Azure App Service?

@lt72
Copy link
Contributor

lt72 commented Dec 24, 2017

I believe so, but @AlexGhiondea will know.
@noahfalk @chsienki FYI

@joewaid
Copy link

joewaid commented Jan 23, 2018

Is the timeframe for this fix still January? I'm seeing some really terrible test performance and need to get to a solution soon.

@stakx
Copy link

stakx commented Jan 23, 2018

@joewaid:

I'm seeing some really terrible test performance [...].

@rschiefer:

We've also seen our test builds double in time after upgrading to .NET 4.7.1.

Since you guys mention tests, if you're using Moq, please upgrade to version 4.7.145 or later; see devlooped/moq#500 (comment). (If you're not using Moq, please ignore. 😃)

That said, I'd also be interested tohear when the fix for this .NET regression will be available.

@AlexGhiondea
Copy link
Contributor

Hi

The patches have been released -- here is a link to the announcement blog:

https://blogs.msdn.microsoft.com/dotnet/2018/01/09/net-framework-4-7-1-is-available-on-windows-update-wsus-and-mu-catalog/

Thanks,
Alex

@springy76
Copy link

Its this thread which first brought my attention to the existence of portable PDBs. Can anyone explain when to use them? Or why when using them, new StackTrace(true) does not provide any line numbers? (everything targetting net471 and running on 4.7.1 with latest patches).

noahfalk added a commit to dotnet/core that referenced this issue Feb 2, 2018
Unfortunately due to microsoft/dotnet#529 we had to withdraw Portable PDB from 4.7.1 but we anticipate having the feature restored in 4.7.2 with much improved performance.
@noahfalk
Copy link
Member

noahfalk commented Feb 2, 2018

Can anyone explain when to use them?

Hi @springy76! Some general information about them can be found here: https://github.com/dotnet/core/blob/master/Documentation/diagnostics/portable_pdb.md
Generally the situation where you need to use them is when you are doing .Net Core or Mono development on Linux or OSX and you want nice stack traces with source line information for your debugging. If you have an app that will only run on .Net Framework for Windows they are not necessary. We are trying to support them in .Net Framework so that eventually you could use one common format for everything rather than having to pick a different option on each OS.

Or why when using them, new StackTrace(true) does not provide any line numbers?

Apologies the messaging didn't get updated in a timely enough manner. You probably weren't doing anything wrong but unfortunately to fix the performance issue reported here in the short term we had to disable the portable PDB feature in 4.7.1. We aim to bring it back performantly in 4.7.2. This outcome wasn't our first choice and we were trying to fix the performance issue right away while leaving the portable pdb functionality intact. Sadly complications were discovered late in our patch testing process and we had to make a call to disable the portable pdb feature. Assuming you have the patch Alex refers to above then not seeing any source paths is now the expected behavior for Portable PDBs in 4.7.1.

Hopefully that makes sense and let me know if I can help with more questions.

noahfalk added a commit to dotnet/core that referenced this issue Feb 2, 2018
* Update portable pdb guidance in 4.7.1

Unfortunately due to microsoft/dotnet#529 we had to withdraw Portable PDB from 4.7.1 but we anticipate having the feature restored in 4.7.2 with much improved performance.

* Update portable_pdb.md

Clarify that in 4.7.2 the issue is targeting, not just what runtime the code is executing on
@nmvgb
Copy link

nmvgb commented Feb 7, 2018

Hi,

[09-Jan-2018] The fix for this issue is included in the Update for .NET Framework 4.7.1 - KB4054856 for all platforms except Windows 10 Fall Creators Update. This fix removes the Portable PDB functionality from 4.7.1. In 4.7.2 we anticipate having an improved fix available which both performs well and re-enables the portable PDB functionality.

When can we expect an update for Windows 10 1079 (Fall Creators Update)?

Thanks,
Nuno

@AlexGhiondea
Copy link
Contributor

@nmvgb the fix for Windows 10 Fall Creators Update is available now: https://support.microsoft.com/en-us/help/4058258/windows-10-update-kb4058258

virtuallynaked added a commit to virtuallynaked/virtually-naked that referenced this issue Mar 23, 2018
.NET 4.7.1 has a major performance regression when capturing file and
line numbers in stack traces
(microsoft/dotnet#529) that was making debug
mode unusably slow so I'm turning that option off for now.
@clain23
Copy link

clain23 commented May 24, 2018

Hi,

We've just tried 4.7.2 with portable PDB and line numbers are still missing. Is the fix actually exists in 4.7.2?

@noahfalk
Copy link
Member

In 4.7.2 we were more cautious with how the portable pdb stack trace functionality is enabled. To turn it on your application either needs to target 4.7.2 (aka set 4.7.2 as the framework target when you build) or you can opt-in to the functionality with an app config switch:
https://github.com/Microsoft/dotnet/blob/master/Documentation/compatibility/Stack-traces-obtained-when-using-portable-PDBs-now-include-source-file-and-line-information-if-requested.md

Hopefully one of those options gets it working for you, but if not please let us know.
HTH,
-Noah

@clain23
Copy link

clain23 commented May 28, 2018

Thanks, we've got it working.

VS GUI changes compilation targetFramework property and we have defined httpRuntime targetFramework in our project. Changing httpRuntime targetFramework manually did the trick. https://blogs.msdn.microsoft.com/webdev/2012/11/19/all-about-httpruntime-targetframework/

@leecow
Copy link
Member

leecow commented Jan 5, 2021

Last comment indicates this has been resolved.

This repo is no longer actively monitored. Closing up old issues that have not had activity in while. If this is still an issue, please open a new issue in an appropriate repo listed in microsoft/dotnet#1275

@leecow leecow closed this as completed Jan 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests