Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on ICE ObjectNotExistException #3731

Merged
merged 2 commits into from Apr 24, 2015
Merged

Conversation

dominikl
Copy link
Member

Attempt to fix Ticket 12120 by catching the exception as suggested by @joshmoore . See last commit. The previous commits are just some refactoring, so that I was able to log the exception.

Test: No idea; the exception is thrown rarely and randomly, couldn't find a way to deliberately trigger the problem. Code review sufficient?

/cc @jburel

@joshmoore
Copy link
Member

@dominikl : certainly in terms of the catch this makes sense. The only thing is that it could be called again, no? Is there anything that the poll() method can do to prevent further invocations?

@dominikl
Copy link
Member Author

@joshmoore Tried to do some cleanup if the exception is thrown, but if this really prevents further calls, I'm not sure. If there just would be a way to test it.

@jburel
Copy link
Member

jburel commented Apr 20, 2015

Can the exception be thrown when loading metadata if for example we select in insight image that I have been removed in Web in the meantime?

@joshmoore
Copy link
Member

No, the Ice.ONEE can only by thrown if the service itself (the omero.cmd.HandlePrx) has been closed/cleaned up by another thread. That's also what it would take to test this:

  • create a long running handle like omero.cmd.Timing
  • start calling poll()
  • call handle.close() (in a second thread)
  • see what happens to the calling of poll() (it should stop without an exception)

@dominikl
Copy link
Member Author

Ok... I'll see if I can implement a test with @joshmoore 's scenario.

@jburel
Copy link
Member

jburel commented Apr 21, 2015

@joshmoore: thanks for the clarification.

@dominikl
Copy link
Member Author

While trying to artificially reproduce the problem, I noticed that the cause probably is the start of this thread: https://github.com/openmicroscopy/openmicroscopy/blob/692f0eb24b5ac97a0115810b4a55df674f185bc7/components/blitz/src/omero/cmd/CmdCallbackI.java#L123 . If the operation finished very quickly and this thread is kicked off, the call to poll() throws the ObjectNotExistException. But as the comment suggests, this thread is deliberately started because of a process could be finished very quickly. What's the purpose of this thread @joshmoore ?
The commits up to now are probably useless, because this should be fixed within blitz package, CmdCallbackI.java.

@joshmoore
Copy link
Member

What was happening is that non-poll() calling clients were never being notified. onFinished is only called if either 1) finished() is called by the server or 2) poll() is called. The notification can only be received after the callback has been added: handle.addCallback(cb);, i.e. a race condition.

Happy to also have a catch (Ice.ONEE) added to client.java, or even a catch (Throwable t) if this is going to lead to an abnormal exit. Do you know how it is that the handle is being closed so quickly?

@dominikl
Copy link
Member Author

No idea. It seems to happen very rarely. I could only reproduce it locally by artificially delaying the call to poll() in this thread, then doing the different tasks in Insight mentioned in the QA reports (e. g. viewing images' metadata) raise the ObjectNotExistsException. I better close this PR, can't make any more progress (further digging into the code, doesn't help, all ICE generated classes, I have no clue what they are doing).
As the purpose of this thread calling poll() is, to make sure clients are notified about the result of a process, which has finished very quickly, I guess the handle shouldn't be closed by the server in the first instance, because then you can't get the result (or status) anyway.
Closing this PR and leaving the ticket for @jburel (or someone else).

@dominikl dominikl closed this Apr 22, 2015
@dominikl dominikl reopened this Apr 22, 2015
@dominikl
Copy link
Member Author

Removed the unnecessary refactoring code and just added one commit, which catches the exceptions which might be thrown on the internal call of the poll() method.

@joshmoore
Copy link
Member

@dominikl : thoughts on joshmoore@4c5fb93 ? This would allow someone to restore the "throwing" behavior or deal with any NPEs that they might be receiving due to dominikl@c74c8af

@dominikl
Copy link
Member Author

@joshmoore Yes, looks good. Shall I cherry-pick your commit, or what's the best way to get it in?

@joshmoore
Copy link
Member

Please.

@dominikl
Copy link
Member Author

Cherry-picked... Thanks @joshmoore for your help; I'm still pretty much lost when it comes to server/ICE stuff.

@mtbc
Copy link
Member

mtbc commented Apr 24, 2015

  1. Maybe call it safePoll or somesuch? (There's nothing intrinsically initial about it?)
  2. Ought the new thread's priority be set?

Those are just ideas. This already looks good to merge to me.

@joshmoore
Copy link
Member

Maybe call it safePoll or somesuch? (There's nothing intrinsically initial about it?)

except it's the method called from the ctor so you can manipulate it to be a no-op. If named safePoll, then there's the assumption that you can call it at any point, meaning turning it into a no-op could have side effects.

Ought the new thread's priority be set?

Don't think we currently do much in terms of managing our priorities. Specific suggestion(s)?

@mtbc
Copy link
Member

mtbc commented Apr 24, 2015

None really, I just wasn't sure if we were keen to get the call done promptly.

All okay. Still good to merge.

joshmoore added a commit that referenced this pull request Apr 24, 2015
Crash on ICE ObjectNotExistException
@joshmoore joshmoore merged commit cf70904 into ome:develop Apr 24, 2015
@sbesson sbesson added this to the 5.1.1 milestone Apr 28, 2015
@dominikl dominikl deleted the crash_on_onee branch May 25, 2015 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants