Runtime.Remoting.RemotingException in NUnit.Engine.Runners.ProcessRunner.Dispose #219
Comments
Sorry, I'm closing this until I make sure I have the proper version. Never mind. |
So, I've checked version:
OS and Mono:
|
This looks to be very much related to issue nunit/nunit#1834 where we get a similar stack trace about a closed connection when disposing. Obviously this is slightly different since you're running mono and not using the explore flag, but I'd hazard a guess that they're the same underlying bug. |
I put a confirm label on this because I think it is likely a duplicate of nunit/nunit#1834 |
I've spent some time and now have more information about cause of this issue. There is a race between
And
So, my guess is that the problem is that the remote runner process receives the stop signal and immediately closes the TCP connection, while the host process tries to gracefully finalize a TCP connection that is already closed at that moment. As a workaround, I can either ignore that exception in Sure, I could be wrong about all of this, but that is the conclusion a quick investigation led to. |
Looks like a lead! I'm surprised the |
I don't think the |
Yes, I don't know how signaling the stop event in the host process and waiting in the child process works, but I assume it might involve underlying TCP communication over the channel that the child process closes just after receiving the signal. So the host (from dispose) sends Stop(), and this stop is not atomic: for example, it sends an asynchronous signal, the remote side receives it, unblocks from its wait and closes the channel while the host is still reading from that channel to complete Send(). This is the only hypothesis I have for now. I don't know how Remoting works internally, and I'm not even sure I correctly followed the relations between the different NUnit models (I'm not familiar with the codebase), so excuse me if I'm wrong and wasting your time. The hypothesis is likely wrong in its details, but what is probably correct is that signaling the stop event on the host side is not atomic: it wakes up the remote side, whose further activity (whether closing the channel or just exiting) makes the host's Send throw an exception. |
@lsem I believe you are right about the core of the problem. While the console is in the process of making the remoting call to I have just proven this. All you need is to put a delay in between the two to make sure remoting has returned, and the error vanishes. |
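For readers following along, the race described above can be reproduced in miniature. The sketch below is not NUnit code and does not use .NET Remoting; `Channel`, `run_agent` and `drain_before_close` are hypothetical names invented for illustration. It assumes only what the thread states: the agent is woken by the stop call and closes its channel while the reply to that same call is still pending.

```python
import threading


class Channel:
    """Hypothetical stand-in for a remoting channel: sending after close fails."""

    def __init__(self):
        self.closed = threading.Event()
        self.sent = []

    def send(self, message):
        if self.closed.is_set():
            raise ConnectionError("channel already closed")
        self.sent.append(message)

    def close(self):
        self.closed.set()


def run_agent(drain_before_close):
    """Simulate an agent handling a remote 'stop' call; return the reply outcome."""
    channel = Channel()
    stop_requested = threading.Event()
    reply_sent = threading.Event()
    outcome = {}

    def handle_stop_call():
        # The remote call wakes the agent's main thread first...
        stop_requested.set()
        # ...and only afterwards is the reply written back to the caller.
        # The bounded wait widens the race window so the failure is repeatable.
        channel.closed.wait(timeout=0.2)
        try:
            channel.send("stop: ok")
            outcome["reply"] = "delivered"
        except ConnectionError as exc:
            outcome["reply"] = f"failed: {exc}"
        finally:
            reply_sent.set()

    worker = threading.Thread(target=handle_stop_call)
    worker.start()

    stop_requested.wait()      # agent main thread unblocks on the stop signal
    if drain_before_close:
        reply_sent.wait()      # let the in-flight reply finish before shutting down
    channel.close()
    worker.join()
    return outcome["reply"]


if __name__ == "__main__":
    print("close immediately:", run_agent(drain_before_close=False))
    print("drain, then close:", run_agent(drain_before_close=True))
```

Closing immediately makes the pending reply fail, which is the analogue of the exception seen by the host. Waiting for the reply first plays the role of the delay mentioned above, but it is tied to the actual completion of the call instead of a fixed sleep.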
Next up: familiarize myself with remoting best practices around shutdown. Same question, no answer: https://social.msdn.microsoft.com/Forums/en-US/e8fd1a5c-ee9d-4e37-9773-23193dfa30a0 There does not seem to be any way to synchronize with the remoting logic to make sure that you don't shut down while trying to return information. If there's a way to decorate I stepped through in the debugger. We basically need to shut down no sooner than this line finishing: Perhaps writing an |
I'm almost there but I have to call it for the night. The only hitch is that there are two channels being created and |
Fingers crossed @jnm2 - sounds like a sensible solution. Cheers for looking in to this one, it'll be great to no longer see this error! |
#223 is ready for review! |
@jnm2 I can confirm that adding a delay fixes (works around) this particular problem. That is what I actually did for my build server. |
We believe this was fixed by #223 which will be released as part of 3.7. If you'd like to confirm sooner that the issue is fixed, please try If it turns out that this does not fix your issue, please reopen and comment. |
I am still seeing this in NUnit.ConsoleRunner v3.7.0 (from NuGet) using Linux/Mono. It's intermittent and doesn't crash every time.
I have a log of the whole build used to reproduce this available on Travis. In that log, the crash above is recorded starting on line 2784. I think that the rest of the log should contain just about everything else about the build environment you might possibly want to know. |
@craigfowler Thank you for that. You're not the only one now. Please see #255 (comment) - if you're able, testing the runner in that comment with |
@rprouse Or should I suggest he try the runner from https://travis-ci.org/nunit/nunit-console/builds/266292088? |
@jnm2 I don't think my changes will help for @craigfowler. You cannot install on Linux, so it won't find another engine and my repro happens all the time. |
OK, I'm set up using that build of 3.8.0 mentioned in the comment of #255 and all of my Linux/Travis builds will use that test runner for now. So far so good (that is - I've seen at least one success), although of course it being intermittent I'll have to wait a while with an eye out for failures. One thing - and tbh I'm not sure if it's NUnit messaging or Selenium WebDriver (I don't know either's messages well enough to tell them apart). In lines 2790 and 2791 of this build I get the following warnings, although the build DID pass without error:
That might be a red herring though, if it's me being dumb and doing something wrong with a webdriver, or perhaps it is related to this issue? It's worth saying that I have a fair bit going on in an Not familiar enough with NUnit's architecture to know if that's useful information or just tangential waffle. |
Great news, keep us posted!
I didn't see this when I ran that build, but the temporary fix is |
Update: I've just seen a failure using the experimental console runner build linked above. It's logged at this Travis build, starting on line 2678. This is using the recommended command-line switches as well. The actual command-line used to run these tests is available in the build script line 34.
|
In general, I send people to an appropriate dev build on MyGet. Those are already reviewed and merged, whereas appveyor builds may or may not be. The MyGet builds are like our continuous beta release toward the next version. |
A workaround for us is to run only one test assembly agent at a time by setting the command line option This solved the intermittent error 'The object with ID 2 implements the IObjectReference interface for which all dependencies cannot be resolved. The likely cause is two instances of IObjectReference that have a mutual dependency on each other.' for us. We are running NUnit Console Runner 3.7.0 with Mono on Ubuntu 16.04. The error only occurs intermittently when testing more than one assembly. |
error: The object with ID 2 implements the IObjectReference interface for which all dependencies cannot be resolved. The likely cause is two instances of IObjectReference that have a mutual dependency on each other. ref nunit/nunit-console#219
Hi,
I'm trying to migrate my Windows build server to Linux and now have crashes in my unit tests:
Maybe this is a known issue?