Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests are not truly parallelized when run with Traversal project #488

Open
batkaevruslan opened this issue Oct 2, 2023 · 3 comments
Open

Comments

@batkaevruslan
Copy link

Hi!
For the same set of test projects, tests that are run from .sln file are executed faster than tests, that are run from traversal project file.

How it looks in Process Explorer:
For a solution with dozens of test projects, when tests are run from traversal project, they are split among nodes, but when a node finishes its tests the node does not "steal" work from other nodes. As a result, one node does the most part of the work.
image

How to reproduce:
See instructions in https://github.com/batkaevruslan/dotCoverParallelizationIssue (ignore repo name, as at first we thought it's dotCover fault).
We tested it on several machines\OS with sdk 7.0.401 and got similar results:

~/Repos/batkaevruslan/dotCoverParallelizationIssue (main*) » /usr/bin/time -h dotnet test dotCoverParallelizationIssue.sln -maxcpucount:2
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests1.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests3.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests4.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests5.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 10 s - Tests6.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 30 s - Tests2.dll (net7.0)
31.71s real 6.47s user 1.51s sys

~/Repos/batkaevruslan/dotCoverParallelizationIssue (main*) » /usr/bin/time -h dotnet test affected.proj -maxcpucount:2 -v:q
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests1.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests4.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 10 s - Tests6.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 30 s - Tests2.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests3.dll (net7.0)
Passed! - Failed: 0, Passed: 1, Skipped: 0, Total: 1, Duration: 1 s - Tests5.dll (net7.0)
35.04s real 6.28s user 1.70s sys

Pay attention to the second run: Tests3.dll and Tests5.dll could have been executed in parallel with Tests2.dll, but they haven't.

@jeffkl
Copy link
Contributor

jeffkl commented Oct 2, 2023

I believe what's happening is that when you run dotnet test against a solution file, it schedules the build differently. It appears that the VSTest target for dotnet test doesn't depend on Build so the build of each project happens as each test run is scheduled. In some scenarios that might be better and other scenarios it would be slower.

I cloned the repo and ran dotnet test /m:2 against the solution file and the traversal project:

Solution File Traversal project
image image

Technically both executions are doing the same amount of work, just scheduled differently. To get traversal to match the behavior, you can build first with dotnet build and then run dotnet test --no-build:

image

Now the test execution is identical. That will work if you are okay with building first and executing tests separately.

The VSTestDependsOn property is invoking Build first here:

<VSTestDependsOn Condition="'$(VSTestNoBuild)' != 'true'">
Build
</VSTestDependsOn>

Currently, the only way to override it is by doing so in a custom after Traversal target:

Directory.Build.props

<Project>
  <PropertyGroup>
    <CustomAfterTraversalTargets>$(MSBuildThisFileDirectory)\Custom.After.Traversal.targets</CustomAfterTraversalTargets>
  </PropertyGroup>
</Project>

Custom.After.Traversal.targets:

<Project>
  <PropertyGroup>
    <VSTestDependsOn Condition="'$(UsingMicrosoftTraversalSdk)' == 'true'"></VSTestDependsOn>
  </PropertyGroup>
</Project>

We could add a property to set so that Traversal doesn't call Build first if necessary.

@batkaevruslan
Copy link
Author

batkaevruslan commented Oct 3, 2023

@jeffkl , thanks for quick response!
After reading your comment, I realized that my repro project is not complete.
A month ago, we noticed problems while running tests with dotCover. At first, we thought that it is dotCover problem and create issue for JB team, and then we thought that it is traversal project problem and created current thread issue.

Please see attached to https://youtrack.jetbrains.com/issue/DCVR-12167/dotCover-and-parallel-multiple-test-assemblies-execution video: first 10 secs and last 60 is enough to see that we do specify --no-build and that only one node is working at the end.

Today I was not able to reproduce the same problem behavior with --no-build parameter on my repro project and even for our product solution. The last fact makes me think that it is some kind of flaky behavior.

@jeffkl
Copy link
Contributor

jeffkl commented Oct 4, 2023

The Traversal logic is doing simple MSBuild scheduling by calling the <MSBuild /> task with the specified projects and indicating they should be built in parallel. MSBuild then takes over and schedules project builds on an available node. I highly doubt that there's something wrong with Traversal and instead would recommend looking around https://github.com/dotnet/msbuild/issues for an answer. If something is found to be wrong with Traversal, we'll of course be happy to dig deeper but since Traversal is just calling into MSBuild I think you'll find more answer there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants