Question: parallel test output interleaving #27474

BruceForstall · 2018-09-25T23:20:11Z

Is there a way to run tests in parallel, but still get good, non-interleaved, output?

For example, in:

https://ci.dot.net/job/dotnet_coreclr/job/master/view/x64/job/jitstress/job/x64_checked_windows_nt_corefx_baseline/641/consoleText

we run the tests using:

build-tests.cmd -Release -os:Windows_NT -buildArch:x64  -- /p:WithoutCategories=IgnoreForCI /p:PreExecutionTestScript=D:\j\workspace\x64_checked_w---a7bd363e\SetStressModes.bat

Then, there is apparently a failure in System.ComponentModel.Composition.Tests. However, it is extremely difficult to see what it is from the logs, because the output between the time the test starts and the time the test ends is interleaved with other test output.


BruceForstall · 2018-09-25T23:20:26Z

@weshaggard Suggestions?

weshaggard · 2018-09-25T23:26:21Z

I don't know of any good way. My best suggestion is to look at the log for the failing test project once you see it has a failure.

weshaggard · 2018-09-25T23:26:50Z

We might also be able to figure out a way to dump the log of the failing test project at the end so it is continuous.

BruceForstall · 2018-09-26T00:27:15Z

That would be great! That plus a simple list of exactly the failing test projects.

FYI, I recently copied the run-test.sh script (in the root of corefx) to coreclr and fixed it to work better when running in parallel: it no longer interleaves output, because each test project's redirected output goes to its own file. This is basically a bash script hack for running on arm32/arm64, since we can't run msbuild there, but it works out very well, IMO. See dotnet/coreclr#20107.
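
The per-project log idea can be sketched in plain sh roughly as follows. This is a minimal illustration, not the actual run-test.sh from dotnet/coreclr#20107: the function name `run_all`, the log directory layout, and the BEGIN/END banners are all assumptions. Each project runs in the background with stdout and stderr redirected to its own file, and only the logs of failing projects are printed afterwards, each in one contiguous chunk, followed by a list of the failing projects.

```shell
# Hypothetical sketch: run_all LOG_DIR NAME1 CMD1 [NAME2 CMD2 ...]
# Project names must not contain spaces or colons.
run_all() {
  log_dir=$1
  shift
  mkdir -p "$log_dir"

  # Start every project in the background, one log file per project.
  jobs_list=""
  while [ "$#" -ge 2 ]; do
    name=$1
    cmd=$2
    shift 2
    sh -c "$cmd" >"$log_dir/$name.log" 2>&1 &
    jobs_list="$jobs_list $name:$!"
  done

  # Wait for each project by PID so its exit code can be checked.
  failed=""
  for job in $jobs_list; do
    name=${job%%:*}
    pid=${job##*:}
    if ! wait "$pid"; then
      failed="$failed $name"
    fi
  done

  # Print each failing project's log as one contiguous block.
  for name in $failed; do
    echo "===== BEGIN $name ====="
    cat "$log_dir/$name.log"
    echo "===== END $name ====="
  done

  echo "Failed projects:${failed:- none}"
  [ -z "$failed" ]
}
```

A call such as `run_all ./logs ProjA './run-a.sh' ProjB './run-b.sh'` (script names made up for illustration) returns non-zero if any project failed, so it can still gate a CI step.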

BruceForstall · 2018-11-30T00:38:40Z

This is still a serious problem, and makes it extremely difficult to read corefx test run log files.

E.g., https://ci.dot.net/job/dotnet_coreclr/job/master/view/x64/job/jitstress/job/x64_checked_windows_nt_corefx_baseline/702/consoleFull

@karelz @ericstj @danmosemsft Is there anything that could be done here? Even adding a per-line "thread id" to the output would help. Ideally, the output would be one test at a time (and wait until the entire output is complete before generating any). Is it msbuild that invokes the individual RunTest.cmd files? Where?

One option would be to change the RunTest.cmd files themselves so all output is saved and then output all at once, as follows:

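rem Buffer all output in a temp file, then emit it in one piece at the end.
set "_tmpfile=%TEMP%\RunTest_%RANDOM%.txt"
echo ...foo... >>"%_tmpfile%"
echo ... bar... >>"%_tmpfile%"
...
rem Capture stderr as well (2>&1); otherwise it would still interleave.
call %RUNTIME_PATH%\dotnet.exe xunit.console.dll System.Buffers.Tests.dll ... >>"%_tmpfile%" 2>&1
set RESULT=%ERRORLEVEL%
...
type "%_tmpfile%"
del /f "%_tmpfile%"
EXIT /B %RESULT%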

Or, maybe better, move RunTest.cmd to RunTestHelp.cmd, and create a wrapper RunTest.cmd that does the output buffering, as shown above.
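
For comparison, the same wrapper idea looks roughly like this on the sh side. It is a sketch only: `run_buffered` is a hypothetical name, not part of any existing script, and it assumes `mktemp` is available. The wrapped command's stdout and stderr are collected in a temp file, the file is printed once the command has finished, and the command's exit code is passed through.

```shell
# Hypothetical wrapper: run_buffered COMMAND [ARGS...]
run_buffered() {
  tmpfile=$(mktemp) || return 1

  # Capture stdout and stderr together; remember the exit code.
  result=0
  "$@" >"$tmpfile" 2>&1 || result=$?

  # Emit everything at once, then clean up.
  cat "$tmpfile"
  rm -f "$tmpfile"
  return "$result"
}
```

Writing the whole log in one go makes interleaving much less likely, though for very large logs it is still not a strict guarantee if other processes write to the same console at the same moment.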

Where is the code that generates the RunTest.cmd files? Looks like the buildtools repo, src\Microsoft.DotNet.Build.Tasks\PackageFiles\RunnerTemplate.Windows.txt as well as src\Microsoft.DotNet.Build.Tasks\GenerateTestExecutionScripts.cs.

Comments?

cc @RussKeldorph

danmoseley · 2018-11-30T00:56:37Z

@ViktorHofer is thinking about this right now actually

BruceForstall · 2018-12-03T23:28:11Z

@ViktorHofer If you have a separate issue tracking this, please link it.

ViktorHofer · 2018-12-04T17:00:35Z

I'm not changing this with the move of the testing infrastructure to arcade but we can discuss what changes can be made to improve the output.

BruceForstall · 2018-12-04T17:35:19Z

Does "the move of the testing infrastructure to arcade" improve the situation? We still have this problem right now, when all tests are run locally (or on one machine in the CI), and I thought the corefx transition to arcade had already happened.

ViktorHofer · 2018-12-04T17:54:19Z

The transition is nearly done: dotnet/corefx#33423. I haven't made any improvements to this specific issue. Let's follow up on what can be done.

Note: I'm considering moving to vstest.console as the xunit runner is out of support. Not sure if that helps at all.

danmoseley · 2019-03-11T04:14:28Z

If the testresults.xml is available, does that solve this?

BruceForstall · 2019-03-11T04:54:50Z

It looks like we have a bug in our Jenkins configuration where it only archives the testresults.xml file if the tests all succeed. Maybe if we fixed that to always upload the file(s), we could use them for more info. They are uploaded for xunit "interpretation", but Jenkins test result display is (IMO) terrible.

So, I don't know if it "solves" this; IMO, I prefer a nice, de-interleaved output at the end of the test run log file. But maybe it makes it usable.

ericstj · 2019-04-10T17:51:41Z

@BruceForstall will this still be an issue under the new test process in Azure DevOps?

BruceForstall · 2019-04-10T18:27:40Z

@echesakovMSFT Can you answer @ericstj's question? My guess is the problem is lessened.

IMO, the problem still exists when running tests locally, if the output is still interleaved.

echesakov · 2019-04-10T19:27:23Z

@BruceForstall Can you remind me when we see the interleaved output in Jenkins? Does it come from summary messages that get printed for different test assemblies? If so, then moving to Azure DevOps should help, since we will be running each assembly on a different test machine. However, if we decide to group some of those assemblies into the same Helix payload, then it's still going to be an issue.

BruceForstall · 2019-04-10T19:48:44Z

It happens in the console output when we use the normal corefx test runner (e.g., https://ci.dot.net/job/dotnet_coreclr/job/master/view/x64/job/jitstress/job/x64_checked_windows_nt_corefx_baseline/826/consoleText). When we run using our own test running scripts (arm32/arm64), we've implemented de-interleaving in those scripts to avoid the problem.

ViktorHofer · 2019-06-23T19:34:48Z

I think we can close this now that coreclr is using a different infrastructure to run corefx tests.

BruceForstall · 2019-06-24T18:43:05Z

@ViktorHofer I guess that's ok from the CI perspective, but it still seems weird/problematic from the perspective of running all tests locally on a single machine. Maybe people just don't do that.

ericstj assigned safern Apr 10, 2019

ViktorHofer closed this as completed Jun 23, 2019

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 3.0 milestone Jan 31, 2020

ghost locked as resolved and limited conversation to collaborators Dec 15, 2020
