AppVeyor Test Failures under .NET 3.5 #2579
Comments
Hit yet another .NET 3.5 test failure today
|
I think we should change all timeouts to a ten second minimum. That will help distinguish actual logic errors from CI being overloaded. |
If it is the CI being overloaded, why is it always the .NET 3.5 tests that fail and never any of the other targets? I am okay with increasing the timeout for 3.5 since it might be inefficiencies in the .NET 3.5 runtime, but I would prefer to keep the other targets where they are since they never seem to fail. |
@rprouse That's a good question, but I imagine the same answer would explain either inefficient thread handling or actual logic errors. I guess my question is, is it worth special-casing 3.5? In my line of thought, the non-3.5 tests don't take 1000ms. They take maybe 1ms and would continue to take 1ms even when the timeouts are all changed to 10000ms. |
Maybe, I would just hate to see slowdowns creep into the code and not be noticed because of the larger timeout, but I also agree that timeouts shouldn't be used for performance testing so I am good either way. |
I could be wrong but I haven't seen any of our timeouts measure slowness of code, only binary correctness of logic. It would be nice to make it a standard that Timeout is only for binary logic failures (simply as a backstop so that we don't wait an hour for CI to fail) and MaxTime for performance measurements (because we'll have actual numbers rather than interrupting it.) In some cases, you might even use them together but each for a separate purpose. |
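The distinction drawn above between a timeout used as a backstop and a maximum time used as a measurement can be sketched in a language-neutral way. The following is a minimal Python illustration, not NUnit's implementation; `run_with_timeout` and `run_with_max_time` are hypothetical helper names made up for this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(fn, timeout_s):
    """Backstop: stop waiting after timeout_s and report only pass/fail.

    Python cannot interrupt the worker thread, so this sketch simply
    stops waiting for it; no timing figure is produced.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        future.result(timeout=timeout_s)
        return True
    except FutureTimeout:
        return False
    finally:
        # Do not block here on a worker that may still be running.
        pool.shutdown(wait=False)


def run_with_max_time(fn, max_s):
    """Measurement: let fn finish, then compare the elapsed time to max_s."""
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    return elapsed <= max_s, elapsed
```

With this split, a generous backstop (ten seconds, say) costs nothing for code that finishes in a millisecond, while the measured variant still reports an actual number when the code gets slower.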
Well, let's not waste time changing test timeouts to 10,000ms then, since NUnit Framework itself is buggy. There's pretty clearly a logic error since this happened multiple times now that I think about it. |
Ran into it again. |
Ran into it three more times. |
And again. |
Here is a new .NET 3.5 failure 😦
|
I really want to know what's special about net35. I can't think of any way that it's just failing because it's third in the list. |
I was thinking of using the 3.5 Docker container |
Actually, in https://ci.appveyor.com/project/CharliePoole/nunit/build/3.10.0-ci-04960-pr-2586#L318
but 4.5 is also failing (in another test)
|
Came here to say the same as @mikkelbu. |
Observed a higher time on AppVeyor 4.5:
|
We should probably multiply all our performance tests by at least ten when running under CI. That was where my |
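One way to scale performance limits under CI, as suggested above, is to multiply the base limit when a CI environment is detected. This is a rough Python sketch, assuming the CI service sets a `CI` environment variable to a true-like value (AppVeyor and Travis CI are understood to do so); `scaled_limit_ms` and its parameters are hypothetical names, not part of NUnit.

```python
import os


def scaled_limit_ms(base_ms, ci_factor=10, environ=None):
    """Return base_ms, multiplied by ci_factor when running under CI.

    Assumption: the CI service exports CI=true (case-insensitive).
    `environ` can be passed explicitly to make the function testable.
    """
    env = os.environ if environ is None else environ
    on_ci = env.get("CI", "").strip().lower() == "true"
    return base_ms * ci_factor if on_ci else base_ms
```

A test that allows 100 ms locally would then allow 1000 ms on an overloaded build agent, without changing the limit developers see on their own machines.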
We'll be forced to take care of the performance tests next in order to ship a .NET Standard 2.0 DLL: #2555 |
Another .NET 3.5 failure
|
Latest AppVeyor net35 failure. This time it looks like the whole process died:
|
New AppVeyor/net35 crash:
|
I'm starting to see this every now and then on Travis/netstandard2.0:
|
I've started looking at this. First step is to start reproducing locally.
Notes so far
Running .NET 3.5 tests directly on my laptop
I cannot reproduce just running the .NET 3.5 tests on my machine. Best guess is that it is related to number of cores? Pretty sure that the AppVeyor VMs have one core, so we default to two workers to run. I've tried locally with one and two workers, but no luck.
Running tests in a .NET 3.5 docker container
Next test, creating a docker image with this
Which I build with
docker build -t rprouse/nunit .
First build is slow as it pulls all of the base images needed. Subsequent builds are much quicker. I can then run the tests with
docker run -it -v c:\src\nunit\nunit\bin\Release\net35:c:\app\net35 rprouse/nunit
Number of workers is correct, but 10 runs succeeded 😦
Run multiple docker instances simultaneously
Run the .NET 3.5 tests in three docker containers at the same time to put the system under load? Woot, got it: two of the three test runs failed.
and
It isn't failing every time, but getting closer... |
Comes back a bit sheepish 🐑
3 docker containers worked, so I decided to see if it isn't cores related and just ran three sets of tests on my laptop at the same time.
.\tools\nunit.consolerunner.3.7.0\NUnit.ConsoleRunner\tools\nunit3-console.exe .\bin\Release\net35\nunit.framework.tests.dll
It took a few more tries, but I've managed to reproduce again,
Initial conclusion is that it is load based, not the number of cores. Oh well, it was fun playing with docker 😄 |
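The reproduction approach described above (several test runs started at the same time, repeated until one fails) could be scripted roughly as follows. This is a Python sketch under stated assumptions: `run_concurrently` and `repeat_until_failure` are hypothetical helpers, and it assumes the test command signals failure with a non-zero exit code.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(command, copies=3):
    """Start `copies` instances of `command` at once; return their exit codes."""
    def run_one(_):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode

    # One thread per process so that all copies run at the same time.
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(run_one, range(copies)))


def repeat_until_failure(command, copies=3, max_rounds=10):
    """Repeat concurrent rounds until any copy exits non-zero.

    Returns (round_number, exit_codes) for the first failing round,
    or (None, []) if every round passed.
    """
    for round_number in range(1, max_rounds + 1):
        codes = run_concurrently(command, copies)
        if any(code != 0 for code in codes):
            return round_number, codes
    return None, []
```

Here `command` would be the console runner invocation from the comment above, passed as a list of arguments, for example `repeat_until_failure([runner_path, test_dll_path])` with both paths supplied by the caller.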
This isn't fully fixed; we had a test hang in https://ci.appveyor.com/project/CharliePoole/nunit/build/3.10.0-ci-05316-issue-2405. There have been two other failures, but they don't have the new code because the fork wasn't updated. The fix is looking promising for some of the callstacks we've seen. I am leaving this open and moving it to the 3.11 milestone. |
Your work so far is impressive! I'm hopeful. |
@jnm2 I think this is fixed. Have you seen recent occurrences? |
@rprouse No, but I haven't been triggering builds in this repo recently. |
I am going to close this as fixed. We can open another issue if it crops up again. Thanks. |
These test failures under .NET 3.5 are getting consistent enough that we are getting few passing builds and we had one build time out at 60 minutes in the .NET 3.5 tests.
In PR #2380 I am going to #if net35 and increase the timeout as a temporary workaround, but we should look at this. I am curious why we are having so many .NET 3.5 threading issues, but no other target.