AppVeyor Test Failures under .NET 3.5 #2579

Closed · rprouse opened this issue Nov 28, 2017 · 29 comments
Labels: closed:done, is:bug, pri:critical

rprouse (Member) commented Nov 28, 2017

These test failures under .NET 3.5 have become frequent enough that we are getting few passing builds, and we had one build time out at 60 minutes in the .NET 3.5 tests.

In PR #2380 I am going to #if net35 and increase the timeout as a temporary workaround (a rough sketch is below the failure list), but we should look into this. I am curious why we are having so many .NET 3.5 threading issues but none on any other target.

1) Failed : NUnit.Framework.Internal.Execution.EventQueueTests+SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous
Test exceeded Timeout value of 1000ms
1) Failed : NUnit.Framework.Internal.Execution.EventQueueTests+DequeueBlocking_StopTest.DequeueBlocking_Stop
Test exceeded Timeout value of 1000ms
7) Cancelled : NUnit.Framework.Internal.Execution.EventQueueTests+SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous
Test cancelled by user
   at NUnit.Framework.Internal.Reflect.InvokeMethod(MethodInfo method, Object fixture, Object[] args) in C:\projects\nunit\src\NUnitFramework\framework\Internal\Reflect.cs:line 232
   at NUnit.Framework.Internal.Commands.TestMethodCommand.RunNonAsyncTestMethod(TestExecutionContext context) in C:\projects\nunit\src\NUnitFramework\framework\Internal\Commands\TestMethodCommand.cs:line 109
   at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context) in C:\projects\nunit\src\NUnitFramework\framework\Internal\Commands\TestMethodCommand.cs:line 64
   at NUnit.Framework.Internal.Commands.BeforeTestCommand.Execute(TestExecutionContext context) in C:\projects\nunit\src\NUnitFramework\framework\Internal\Commands\BeforeTestCommand.cs:line 50
   at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.Execute(TestExecutionContext context) in C:\projects\nunit\src\NUnitFramework\framework\Internal\Commands\BeforeAndAfterTestCommand.cs:line 59
1) Failed : NUnit.Framework.Assertions.CollectionAssertTest.PerformanceTests(System.Linq.Enumerable+<RangeIterator>d__b1)
Elapsed time of 137.7538ms exceeds maximum of 100ms
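Roughly, the net35-only workaround would look something like this. This is a sketch only: the NET35 symbol name and the 5000ms value are placeholders rather than the actual change in PR #2380, and [Timeout] is NUnit's standard per-test timeout attribute.

// Sketch only -- illustrative values, not the PR #2380 diff.
#if NET35
[Test, Timeout(5000)]   // looser limit while the net35 flakiness is investigated
#else
[Test, Timeout(1000)]
#endif
public void SetWaitHandle_Enqueue_Asynchronous()
{
    // ... existing test body ...
}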
rprouse (Member, Author) commented Dec 2, 2017

Hit yet another .NET 3.5 test failure today ☹️

  1. Failed : NUnit.Framework.Internal.Execution.EventQueueTests.PumpSynchronousAndAsynchronousEvents
    Test exceeded Timeout value of 1000ms

jnm2 (Contributor) commented Dec 2, 2017

I think we should change all timeouts to a ten second minimum. That will help distinguish actual logic errors from CI being overloaded.

rprouse (Member, Author) commented Dec 2, 2017

If it is the CI being overloaded, why is it always the .NET 3.5 tests that fail and never any of the other targets? I am okay with increasing the timeout for 3.5 since it might be inefficiencies in the .NET 3.5 runtime, but I would prefer to keep the other targets where they are since they never seem to fail.

jnm2 (Contributor) commented Dec 2, 2017

@rprouse That's a good question, but I imagine the same answer would explain either inefficient thread handling or actual logic errors. I guess my question is, is it worth special-casing 3.5? In my line of thought, the non-3.5 tests don't take 1000ms. They take maybe 1ms and would continue to take 1ms even when the timeouts are all changed to 10000ms.

rprouse (Member, Author) commented Dec 2, 2017

Maybe. I would just hate to see slowdowns creep into the code and go unnoticed because of the larger timeout, but I also agree that timeouts shouldn't be used for performance testing, so I am fine either way.

jnm2 (Contributor) commented Dec 2, 2017

I could be wrong but I haven't seen any of our timeouts measure slowness of code, only binary correctness of logic.

It would be nice to make it a standard that Timeout is only for binary logic failures (simply as a backstop so that we don't wait an hour for CI to fail) and MaxTime is for performance measurements (because we'll have actual numbers rather than interrupting the test).

In some cases, you might even use them together but each for a separate purpose.
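Something like this, with both attributes on one test (the values are illustrative): MaxTime measures and reports the actual elapsed time and fails the test if the limit is exceeded, while Timeout only aborts the run if the test hangs outright.

// Sketch of the suggested convention (illustrative values).
// MaxTime fails the test if it completes but takes longer than 100ms;
// Timeout is only a backstop that aborts the test if it hangs.
[Test, MaxTime(100), Timeout(10000)]
public void SomePerformanceSensitiveTest()
{
    // ... code under measurement ...
}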

jnm2 (Contributor) commented Dec 2, 2017

The .NET 3.5 test run timed out at 60 minutes again.

Well, let's not waste time changing test timeouts to 10,000ms then, since NUnit Framework itself is buggy. There's pretty clearly a logic error, since this has happened multiple times now that I think about it.

jnm2 (Contributor) commented Dec 6, 2017

Ran into it again.

jnm2 (Contributor) commented Dec 7, 2017

Ran into it three more times.

jnm2 (Contributor) commented Dec 7, 2017

And again.

rprouse (Member, Author) commented Dec 7, 2017

Here is a new .NET 3.5 failure 😦

1) Failed : NUnit.Framework.Assertions.CollectionAssertTest.PerformanceTests(System.Linq.Enumerable+<RangeIterator>d__b1)
Elapsed time of 137.7538ms exceeds maximum of 100ms

jnm2 (Contributor) commented Dec 7, 2017

I really want to know what's special about net35. I can't think of any way that it's just failing because it's third in the list.

rprouse (Member, Author) commented Dec 7, 2017

I was thinking of using the 3.5 Docker container microsoft/dotnet-framework:3.5 to run, and maybe debug, our 3.5 tests repeatedly to see if the failures reproduce and try to track them down.

mikkelbu (Member) commented Dec 7, 2017

Actually, in https://ci.appveyor.com/project/CharliePoole/nunit/build/3.10.0-ci-04960-pr-2586#L318, 3.5 is failing with

1) Failed : NUnit.Framework.Assertions.CollectionAssertTest.PerformanceTests(System.Linq.Enumerable+<RangeIterator>d__b1)
Elapsed time of 137.7538ms exceeds maximum of 100ms

but 4.5 is also failing (in another test)

1) Failed : NUnit.Framework.Assertions.CollectionAssertTest.PerformanceTests(System.Collections.Generic.List`1[System.Double])
Elapsed time of 118.375ms exceeds maximum of 100ms

jnm2 (Contributor) commented Dec 7, 2017

Came here to say the same as @mikkelbu.

jnm2 (Contributor) commented Dec 8, 2017

Observed a higher time on AppVeyor 4.5:

  1. Failed : NUnit.Framework.Assertions.CollectionAssertTest.PerformanceTests(System.Collections.Generic.List`1[System.Double])
    Elapsed time of 370.5775ms exceeds maximum of 100ms

jnm2 (Contributor) commented Dec 8, 2017

We should probably multiply all our performance test limits by at least ten when running under CI. That was where my Has.MaxTime request came from.
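For example, something along these lines. This is a hypothetical helper: the environment variable check and the factor of ten are assumptions, and since attribute arguments must be compile-time constants, the scaled limit has to be asserted in the test body rather than passed to [MaxTime].

using System;
using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public class ScaledPerformanceExample
{
    // Hypothetical helper: multiply the limit by ten when running under CI.
    // AppVeyor sets the APPVEYOR (and CI) environment variables.
    private static int Scaled(int milliseconds)
    {
        bool onCi = Environment.GetEnvironmentVariable("APPVEYOR") != null
                 || Environment.GetEnvironmentVariable("CI") != null;
        return onCi ? milliseconds * 10 : milliseconds;
    }

    [Test]
    public void PerformanceTest_Scaled()
    {
        var watch = Stopwatch.StartNew();
        // ... code under measurement ...
        watch.Stop();
        Assert.That(watch.ElapsedMilliseconds, Is.LessThanOrEqualTo(Scaled(100)));
    }
}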

jnm2 (Contributor) commented Dec 11, 2017

We'll be forced to take care of the performance tests next in order to ship a .NET Standard 2.0 DLL: #2555

rprouse (Member, Author) commented Dec 12, 2017

Another .NET 3.5 failure

1) Failed : NUnit.Framework.Internal.Execution.EventQueueTests+SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous
  Expected: null
  But was:  <NUnit.Framework.AssertionException:   Expected: True
  But was:  False
   at NUnit.Framework.Assert.ReportFailure(String message)
   at NUnit.Framework.Assert.That[TActual](TActual actual, IResolveConstraint expression, String message, Object[] args)
   at NUnit.Framework.Internal.Execution.EventQueueTests.SetWaitHandle_Enqueue_AsynchronousTest.Consumer() in C:\projects\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 196
   at NUnit.Framework.Internal.Execution.EventQueueTests.ProducerConsumerTest.ConsumerThreadWrapper() in C:\projects\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 316>
   at NUnit.Framework.Internal.Execution.EventQueueTests.ProducerConsumerTest.RunProducerConsumer() in C:\projects\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 304
   at NUnit.Framework.Internal.Execution.EventQueueTests.SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous() in C:\projects\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 181

jnm2 (Contributor) commented Dec 12, 2017

Latest AppVeyor net35 failure. This time it looks like the whole process died:

System.Net.Sockets.SocketException (0x80004005): An existing connection was forcibly closed by the remote host
Server stack trace: 
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Runtime.Remoting.Channels.SocketStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Runtime.Remoting.Channels.SocketHandler.ReadFromSocket(Byte[] buffer, Int32 offset, Int32 count)
   at System.Runtime.Remoting.Channels.SocketHandler.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.Runtime.Remoting.Channels.Tcp.TcpFixedLengthReadingStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.BinaryReader.ReadBytes(Int32 count)
   at System.Runtime.Serialization.Formatters.Binary.SerializationHeaderRecord.Read(__BinaryParser input)
   at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.ReadSerializationHeaderRecord()
   at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
   at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
   at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
   at System.Runtime.Remoting.Channels.CoreChannel.DeserializeBinaryResponseMessage(Stream inputStream, IMethodCallMessage reqMsg, Boolean bStrictBinding)
   at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)
Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at NUnit.Engine.ITestAgent.Stop()
   at NUnit.Engine.Runners.ProcessRunner.Dispose(Boolean disposing)
   at NUnit.Engine.Runners.AbstractTestRunner.Dispose()
   at NUnit.Engine.Runners.MasterTestRunner.Dispose(Boolean disposing)
   at NUnit.Engine.Runners.MasterTestRunner.Dispose()
   at NUnit.ConsoleRunner.ConsoleRunner.RunTests(TestPackage package, TestFilter filter)
   at NUnit.ConsoleRunner.Program.Main(String[] args)

jnm2 (Contributor) commented Dec 12, 2017

New AppVeyor/net35 crash:

Unable to acquire remote process agent
   at NUnit.Engine.Runners.ProcessRunner.CreateAgentAndRunner()
   at NUnit.Engine.Runners.ProcessRunner.RunTests(ITestEventListener listener, TestFilter filter)

jnm2 (Contributor) commented Jan 24, 2018

I'm starting to see this every now and then on Travis/netstandard2.0:

3) Failed : NUnit.Framework.Internal.Execution.TestWorkerTests.BusyExecuteIdleEventsCalledInSequence
  Expected: "BusyExecIdle" after 200 milliseconds delay
  But was:  "Busy"

rprouse (Member, Author) commented Feb 27, 2018

I've started looking at this. The first step is to reproduce it locally.

Notes so far

Running .NET 3.5 tests directly on my laptop

I cannot reproduce it just by running the .NET 3.5 tests on my machine. My best guess is that it is related to the number of cores. I'm pretty sure the AppVeyor VMs have one core, so we default to two workers. I've tried locally with one and two workers, but no luck.
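For reference, the worker count can also be pinned explicitly with the console runner's --workers option, e.g.

.\tools\nunit.consolerunner.3.7.0\NUnit.ConsoleRunner\tools\nunit3-console.exe --workers=2 .\bin\Release\net35\nunit.framework.tests.dll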

Running tests in a .NET 3.5 docker container

Next test: creating a docker image with this Dockerfile,

FROM microsoft/dotnet-framework:3.5
WORKDIR \app
COPY tools/nunit.consolerunner.3.7.0/NUnit.ConsoleRunner/tools/* ./
ENTRYPOINT ["nunit3-console.exe", "net35/nunit.framework.tests.dll"]

Which I build with

docker build -t rprouse/nunit .

The first build is slow as it pulls all of the base images needed; subsequent builds are much quicker. I can then run the tests with,

docker run -it -v c:\src\nunit\nunit\bin\Release\net35:c:\app\net35 rprouse/nunit

The number of workers is correct, but 10 runs succeeded 😦

Run multiple docker instances simultaneously

Run the .NET 3.5 tests in three docker containers at the same time to put the system under load?
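Roughly, just launching the same container three times in parallel (same mount path as before; start gives each run its own window):

start docker run -it -v c:\src\nunit\nunit\bin\Release\net35:c:\app\net35 rprouse/nunit
start docker run -it -v c:\src\nunit\nunit\bin\Release\net35:c:\app\net35 rprouse/nunit
start docker run -it -v c:\src\nunit\nunit\bin\Release\net35:c:\app\net35 rprouse/nunit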

Woot, got it! Two of the three test runs failed.

1) Failed : NUnit.Framework.Internal.Execution.EventQueueTests+DequeueBlocking_StopTest.DequeueBlocking_Stop
Test exceeded Timeout value of 2000ms

and

1) Failed : NUnit.Framework.Internal.Execution.EventQueueTests+SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous
Test exceeded Timeout value of 2000ms

It isn't failing every time, but getting closer...

rprouse (Member, Author) commented Feb 27, 2018

Comes back a bit sheepish 🐑

Three docker containers worked, so I decided to see whether it is even core-related and just ran three sets of tests on my laptop at the same time.

.\tools\nunit.consolerunner.3.7.0\NUnit.ConsoleRunner\tools\nunit3-console.exe .\bin\Release\net35\nunit.framework.tests.dll

It took a few more tries, but I've managed to reproduce again,

1) Failed : NUnit.Framework.Internal.Execution.EventQueueTests+SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous
  Expected: null
  But was:  <NUnit.Framework.AssertionException:   Expected: True
  But was:  False

   at NUnit.Framework.Assert.ReportFailure(String message) in C:\src\nunit\nunit\src\NUnitFramework\framework\Assert.cs:line 397
   at NUnit.Framework.Assert.That[TActual](TActual actual, IResolveConstraint expression, String message, Object[] args) in C:\src\nunit\nunit\src\NUnitFramework\framework\Assert.That.cs:line 246
   at NUnit.Framework.Internal.Execution.EventQueueTests.SetWaitHandle_Enqueue_AsynchronousTest.Consumer() in C:\src\nunit\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 196
   at NUnit.Framework.Internal.Execution.EventQueueTests.ProducerConsumerTest.ConsumerThreadWrapper() in C:\src\nunit\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 316>
   at NUnit.Framework.Internal.Execution.EventQueueTests.ProducerConsumerTest.RunProducerConsumer() in C:\src\nunit\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 304
   at NUnit.Framework.Internal.Execution.EventQueueTests.SetWaitHandle_Enqueue_AsynchronousTest.SetWaitHandle_Enqueue_Asynchronous() in C:\src\nunit\nunit\src\NUnitFramework\tests\Internal\EventQueueTests.cs:line 181

Initial conclusion is that it is load based, not the number of cores. Oh well, it was fun playing with docker 😄

rprouse (Member, Author) commented Mar 10, 2018

This isn't fully fixed; we had a test hang in https://ci.appveyor.com/project/CharliePoole/nunit/build/3.10.0-ci-05316-issue-2405. There have been two other failures, but they don't include the new code because the fork wasn't updated. The fix is looking promising for some of the call stacks we've seen.

I am leaving this open and moving it to the 3.11 milestone.

rprouse modified the milestones: 3.10 → 3.11 on Mar 10, 2018
jnm2 (Contributor) commented Mar 10, 2018

Your work so far is impressive! I'm hopeful.

rprouse modified the milestones: 3.11 → 3.12 on Aug 16, 2018
rprouse (Member, Author) commented Apr 22, 2019

@jnm2 I think this is fixed. Have you seen recent occurrences?

jnm2 (Contributor) commented Apr 22, 2019

@rprouse No, but I haven't been triggering builds in this repo recently.
Most recent net35 failure is a hang: https://ci.appveyor.com/project/CharliePoole/nunit/builds/23999000
But then here's netstandard2.0 hanging as well, so I think it's a separate issue: https://ci.appveyor.com/project/CharliePoole/nunit/builds/23162726

rprouse (Member, Author) commented Apr 23, 2019

I am going to close this as fixed. We can open another issue if it crops up again. Thanks.
