Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

16.3.0 is 2.7-3.5x slower than 16.3.0-preview-20190828-03 on Windows only #2215

Closed
0xd4d opened this issue Oct 6, 2019 · 12 comments
Closed

Comments

@0xd4d
Copy link

0xd4d commented Oct 6, 2019

Microsoft.NET.Test.Sdk 16.3.0 is 2.7-3.5x slower than any other previous version including 16.3.0-preview-20190828-03

This is not #2120 since I can't repro it on Linux, only on Windows.

Repro

  • Use Windows and .NET Core SDK 3.0.100
  • clone https://github.com/0xd4d/iced.git
  • dotnet build -c Release
  • dotnet test -c Release -f netcoreapp3.0 --no-build Iced.UnitTests/Iced.UnitTests.csproj
  • git clean -xdf
  • edit Iced.UnitTests/Iced.UnitTests.csproj and change Microsoft.NET.Test.Sdk version to 16.3.0
  • dotnet build -c Release
  • dotnet test -c Release -f netcoreapp3.0 --no-build Iced.UnitTests/Iced.UnitTests.csproj

Notes:

  • It repros on Windows, not Linux
  • It only repros with 16.3.0, but not 16.3.0-preview-20190828-03 or any other previous version
Version Time GitHub Actions logs
16.3.0 4.0954 Minutes https://github.com/0xd4d/iced/runs/249788558#step:3:48
16.3.0-preview-20190828-03 1.5096 Minutes https://github.com/0xd4d/iced/runs/249798345#step:3:48

It's 3.5x slower when I build it on my own machine.

Getting logs failed:

The active test run was aborted. Reason: Test host process crashed 
@singhsarab
Copy link
Contributor

@ViktorHofer After moving to using the apphost feature, we are observing a significant degradation in performance. Could you help point out what could be going wrong here?

@ViktorHofer
Copy link
Member

cc @vitek-karas can you please loop in the right people that can help with the apphost scenario?

@vitek-karas
Copy link
Member

For now that would be me... We haven't seen any performance issues with apphost versus dotnet.exe. As it seems the regression is rather severe, would it be possible to isolate it to something like:

  • Have the same app to run (same command line, same code)
  • Run it via dotnet.exe app.dll - measure
  • Run it via app.exe - measure

I just want to make sure we're trying to diagnose the right component.

@0xd4d
Copy link
Author

0xd4d commented Oct 9, 2019

I can confirm that this appears to be apphost related.

If I delete testhost.exe from the output dir before running the tests, 16.3.0 is as fast as 16.3.0-preview-20190828-03.

@vitek-karas
Copy link
Member

I think the next step is to figure out if the regression happens because the logic which runs the testhost changes behavior based on presence of testhost.exe, or if the difference is in the testhost itself.

Would it be possible to extract the command line (and environment) with which the testhost is started, and then try to run it via testhost.exe and also via dotnet.exe testhost.dll ?

@0xd4d
Copy link
Author

0xd4d commented Oct 9, 2019

Parent dotnet.exe + testhost process environments are the same with and without an apphost.

The only difference in command line args is the testhost process args: port number is different, and if you delete testhost.exe, an extra argument to testhost.dll is added (in .nuget cache). This testhost.dll is the same as the one in the output dir.

@vitek-karas
Copy link
Member

This is weird - I did repro the problem (thanks for the relatively easy repro!), but I kind of doubt it has to do specifically with apphost, it's probably something in the process which calls testhost.

Fast version

  1. It starts by testhost for cca 20 seconds where it consumes one 1 CPU on 100%
  2. Then the testhost runs for about 1 minute consuming multiple CPUs on 100% (on my machine cca 40% of the entire machine), at the same time the caller process consumes 1 CPU on 100%.
  3. Finally testhost goes to very small CPU usage and the caller process runs 100% for several seconds (like 5)
    Total: testhost process 2:43 minutes CPU time, caller process 0:41 minutes CPU time

Slow version

  1. -- same as above --
  2. -- same as above --
  3. Same behavior (the caller process eats 1 CPU, while testhost doesn't do much), but this lasts almost 3 minutes
    Total: testhost process 3:03 minutes CPU time, caller process 3:31 minutes CPU time

So clearly almost the entire regression is in the caller of the testhost, and not the testhost itself. Now this doesn't mean that there's something different about the testhost itself, but I think it's a strong indicator that the caller does something very differently.

@0xd4d
Copy link
Author

0xd4d commented Oct 10, 2019

I think the bug is in this code:

https://github.com/microsoft/vstest/blob/master/src/Microsoft.TestPlatform.TestHostProvider/Hosting/DotnetTestHostManager.cs#L209-L236

Note that in the if path (apphost path), GetTestHostPath() is never called, but this method initializes hostPackageVersion to eg. 16.3.0, the default value is 15.0.0. Later, IsVersionCheckRequired property is called which checks if version doesn't start with 15.0.0.

@0xd4d
Copy link
Author

0xd4d commented Oct 10, 2019

When the protocol version is not 2, a lot of / all json data is ~3x the normal size.

@0xd4d
Copy link
Author

0xd4d commented Oct 10, 2019

Looks like PR #2206 will fix this issue since the call to GetTestHostPath() is always executed.

@singhsarab
Copy link
Contributor

@0xd4d I guess you hit the bull's eye.
@vitek-karas Thanks a lot for helping out with the investigations.

@vagisha-nidhi Let's update your PR with the discussed changes and that should take care of all the scenarios.

@martincostello
Copy link
Contributor

This might also be related to #2189.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants