Open
Issue Description
After upgrading Microsoft.NET.Test.Sdk from version 17.10.0 to 17.11.0, our Azure DevOps test pipeline task encounters a System.IO.IOException at the very beginning of the run.
Steps to Reproduce
Use Microsoft.NET.Test.Sdk version 17.11.0 or above.
Run dotnet test via the DotNetCoreCLI@2 task in an Azure DevOps pipeline:
- task: DotNetCoreCLI@2
  displayName: dotnet test
  inputs:
    command: test
    projects: <project solution>
Expected Behavior
Tests run as expected, with no IOException at the start.
Actual Behavior
Error seen in the Azure DevOps test pipeline log:
System.IO.IOException: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host..
---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 count)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.BufferedStream.Flush()
at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.SocketCommunicationManager.WriteAndFlushToChannel(String rawMessage) in /_/src/Microsoft.TestPlatform.CommunicationUtilities/SocketCommunicationManager.cs:line 413
at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.DataCollection.DataCollectionRequestSender.SendAfterTestRunEndAndGetResult(ITestMessageEventHandler runEventsHandler, Boolean isCancelled) in /_/src/Microsoft.TestPlatform.CommunicationUtilities/DataCollectionRequestSender.cs:line 155
at Microsoft.VisualStudio.TestPlatform.CrossPlatEngine.DataCollection.ProxyDataCollectionManager.<>c__DisplayClass20_0.<AfterTestRunEnd>b__0() in /_/src/Microsoft.TestPlatform.CrossPlatEngine/DataCollection/ProxyDataCollectionManager.cs:line 155
at Microsoft.VisualStudio.TestPlatform.CrossPlatEngine.DataCollection.ProxyDataCollectionManager.InvokeDataCollectionServiceAction(Action action, ITestMessageEventHandler runEventsHandler) in /_/src/Microsoft.TestPlatform.CrossPlatEngine/DataCollection/ProxyDataCollectionManager.cs:line 288
Environment
Windows Server 2016 inside an Azure DevOps agent pipeline job
Additional Notes
This does not happen on version 17.10.0 or, presumably, earlier. It seems to originate at ProxyDataCollectionManager.cs line 288.
I'm not sure what is being written or connected to when the issue happens.
Activity
nohwnd commented on Oct 24, 2024
Hello, this error is a red herring; it means that the other process died unexpectedly. Please collect diagnostic logs and post them here:
https://github.com/Microsoft/vstest/blob/main/docs/diagnose.md
Or, if they contain private information, please create a Visual Studio feedback issue and upload them there; you can share logs there privately.
If you are unable to share them, please look into the .host..log for exceptions. Please start from the bottom; the relevant one should be the last exception that happens. If you see just heartbeat entries and then the process dies, it was killed externally.
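The "start from the bottom" check can be scripted when the diag output is large. The following is a minimal sketch, not part of vstest: the helper names and the `*host*` glob pattern are assumptions chosen for illustration, and the match on the word "exception" is a simple heuristic rather than a parser for the actual log format.

```python
from pathlib import Path
from typing import Dict, Optional


def last_exception_line(log_text: str) -> Optional[str]:
    """Return the last line mentioning 'exception', scanning from the bottom."""
    for line in reversed(log_text.splitlines()):
        if "exception" in line.lower():
            return line.strip()
    return None


def scan_host_logs(log_dir: str, pattern: str = "*host*") -> Dict[str, Optional[str]]:
    """Map each matching log file name to its last exception line (or None).

    The glob pattern is a hypothetical default; adjust it to the actual
    file names produced by the diag run.
    """
    results: Dict[str, Optional[str]] = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            results[path.name] = last_exception_line(text)
    return results
```

A file that maps to `None` contains no exception text at all, which is consistent with the "only heartbeat, then the process dies" case described above, though it does not prove it.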
srikcgaa2 commented on Jan 25, 2025
@nohwnd I can confirm that this error appeared after I upgraded from version 17.9.0 to 17.11.1; it did not appear before the update with the same set of test cases.
nohwnd commented on Jan 27, 2025
Can you provide the diag logs I was asking for above, please?
srikcgaa2 commented on Feb 3, 2025
Created a Microsoft feedback ticket and attached the logs privately:
https://developercommunity.visualstudio.com/t/MicrosoftNETTestSDK-1711-SystemIO/10840657
srikcgaa2 commented on Feb 3, 2025
Today I was also able to see that with 17.10.0, this exception did not occur on the same branch.
v-mykhalchuk commented on Mar 21, 2025
Hello
We are seeing the same issue when running 'dotnet test' in an ADO pipeline job.
nohwnd commented on Jun 12, 2025
If this is still happening in 17.14, please report this again. Sorry for not finding the root cause.
v-mykhalchuk commented on Jun 12, 2025
@nohwnd I can confirm this still occurs on version 17.14.1
Are there any details from the VM this runs on that we could examine to get closer to a root cause? Maybe crash logs or diag output from the dotnet test command?
As a starting point, we run this as:
dotnet.exe test MySolution.sln --logger trx --results-directory E:\testOut --no-build --blame --blame-hang --blame-hang-dump-type full --blame-hang-timeout 120000 --configuration release --collect "Code coverage" --settings test.runsettings
nohwnd commented on Jun 12, 2025
Yup, diag logs would be great for a start. :)
v-mykhalchuk commented on Jun 13, 2025
@nohwnd I did a few tests today. The diag output was huge, and I did not see anything obviously pointing to an issue; I will not be able to share it without scrubbing a good deal of sensitive data.
But I've got some additional observations.
dotnet.exe test MySolution.sln --logger trx --results-directory E:\testOut --no-build --blame --blame-hang --blame-hang-dump-type full --blame-hang-timeout 120000 --configuration release --collect "Code coverage" --settings test.runsettings
Original runsettings file that fails:
And when we update ModulePath to point to a single dll, the behavior may change:
<ModulePath>.*\\CompanyName.ProductName.ComponentName.dll</ModulePath>
If I point it to any code project dll, it fails the same way.
When I point it to a unit test project dll or to a nonexistent one, it succeeds.
Most of our unit test projects are decorated with ExcludeFromCodeCoverage attributes.
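To make the ModulePath observation easier to reason about, here is a small sketch of how such a pattern behaves as a regular expression. It assumes the ModulePath value is applied as a case-insensitive regex against the full module path; Python's `re` module stands in for the .NET regex engine, and the sample paths are hypothetical.

```python
import re

# Same pattern text as the ModulePath entry above. The dots are unescaped,
# so each one matches any single character, not only a literal ".".
MODULE_PATH = re.compile(
    r".*\\CompanyName.ProductName.ComponentName.dll", re.IGNORECASE
)


def is_included(module_path: str) -> bool:
    """Return True if the (hypothetical) module path matches the pattern."""
    return MODULE_PATH.search(module_path) is not None


# Hypothetical paths, for illustration only.
code_dll = r"E:\bin\CompanyName.ProductName.ComponentName.dll"
test_dll = r"E:\bin\CompanyName.ProductName.ComponentName.Tests.dll"
lookalike = r"E:\bin\CompanyNameXProductName.ComponentName.dll"
```

Under these assumptions, `is_included(code_dll)` is true and `is_included(test_dll)` is false. `is_included(lookalike)` is also true because of the unescaped dot, so escaping the dots as `\.` would narrow the match.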
Hope this helps.
v-mykhalchuk commented on Jun 18, 2025
hey @nohwnd
can you please confirm whether the above is helpful, or whether I should continue chasing the diag logs even though there seems to be nothing obviously pointing to an issue?
nohwnd commented on Jun 18, 2025
If you are unable to share the log, this is what I would do:
There will probably be an empty entry for errorMessage under this log, meaning that we did not capture any error output from the testhost because it crashed. (Sometimes you can find an error there, for example for a missing runtime version or similar.)
In the testhost log, I go to the last lines and check whether there are closing lines saying "terminating" or similar. If there are, the testhost terminated on purpose; if not, and the log just ends in the middle of work, it crashed.
Then I look into the datacollector log to see whether code coverage was enabled for this project. If yes, code coverage is often to blame for the crash, due to an access violation exception. If not, I typically try to make a memory dump of the process to see how it crashed.
For that you would need --blame-crash. If this is on Windows and .NET Framework, please also install procdump to your path, e.g. via Chocolatey.
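The testhost log check described above (does the log end with a closing line, or does it stop in the middle of work?) could be roughed out as follows. This is a sketch under stated assumptions: the function name, the default marker "terminating", and the 20-line tail window are illustrative choices, not values taken from the vstest log format.

```python
from typing import Sequence


def classify_log_end(
    log_text: str,
    tail_lines: int = 20,
    markers: Sequence[str] = ("terminating",),
) -> str:
    """Classify how a testhost log ends based on its last non-empty lines.

    `markers` holds hypothetical closing phrases; extend it with whatever
    closing lines appear in logs from a run that is known to be clean.
    """
    lines = [line for line in log_text.splitlines() if line.strip()]
    if not lines:
        return "empty log"
    tail = lines[-tail_lines:]
    for line in tail:
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            return "terminated on purpose"
    return "possible crash"
```

A "possible crash" result only says that no closing marker was found near the end of the log; a memory dump collected with --blame-crash is still needed to see how the process actually died.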