Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First interrupt a thread. Stop it if that doesn't work #163

Merged
merged 4 commits into from Oct 11, 2019

Conversation

shen-tian
Copy link
Contributor

@shen-tian shen-tian commented Oct 4, 2019

Fixes #158

Right now, an interrupt first calls Thread.interrupt(). It waits 100ms for the thread to terminate itself. Failing that, it calls Thread.stop(). This gives code that respects the interrupt flag a chance to exit in a more safe manner.

A few things to note:

  • Thread.sleep() throws an InterruptedException when interrupted. This causes an eval-error if not otherwise handled. One test has been updated to reflect this. This does change nREPL's behaviour. I'm not sure how much of an issue this could be?
  • The 100ms waiting time was arbitrary. Guess the "correct" value depends on how often the thread being interrupted checks the flag. I have no idea what's the typical interval, but can imagine a situation where the client would want to set the time to be longer? I also feel like the default wait time shouldn't be too long. @FreekPaans: just to get one more data point, what would be the "right" timeout for you?

I think this provides us a way to deal with the #132, where a an interrupted eval with streamed printing can generate malformed bencode. Not too sure how that should be done though.

@FreekPaans
Copy link

Hi. Nice work! A default of 100ms seems a little bit on the low end to me. From the perspective of a UI client (like CIDER), you would typically interrupt when something takes too long, right? So I guess an additional second or 2 of waiting wouldn't be a problem in that case. For my use case, 5 seconds would be fine.

Just another thought: how hard would be to encode the timeout in the actual message (sorry, not too familiar with nrepl)? In that case you can let the client decide.

@shen-tian
Copy link
Contributor Author

Hi. Nice work! A default of 100ms seems a little bit on the low end to me. From the perspective of a UI client (like CIDER), you would typically interrupt when something takes too long, right? So I guess an additional second or 2 of waiting wouldn't be a problem in that case. For my use case, 5 seconds would be fine.

Just another thought: how hard would be to encode the timeout in the actual message (sorry, not too familiar with nrepl)? In that case you can let the client decide.

I also realised we don't actually need to wait for the thread to terminate or stop before returning on the interrupt call, so something like a 5s wait time wouldn't slow things down on the client side.

The timeout is really "what's the cap on CPU time wasted on an eval after you interrupt it".

@bbatsov : your thoughts on adding this timeout as something the client can specify?

@bbatsov
Copy link
Contributor

bbatsov commented Oct 4, 2019

@bbatsov : your thoughts on adding this timeout as something the client can specify?

I'm on the fence about this. It really depends on whether we can agree that some default is good enough for most cases.

@shen-tian
Copy link
Contributor Author

I'm fiddling a version that does the wait-and-stop asynchronously.

If we can get that to work with say, a 5s timeout, that might provide a good enough default behaviour that we won't need to add a customisable parameter for now?

@shen-tian
Copy link
Contributor Author

shen-tian commented Oct 9, 2019

Ok. Updated this to work as follows:

  1. Calls interrupt
  2. Wait 100ms. This is mainly to allow thread that respond quickly to interrupts to send a message back in response to the interrupt. Significantly, this includes an exception thrown by Thread/sleep.
  3. Asynchronously: wait another 5000ms for the thread to cleanly terminate. Only calls .stop if it fails to do so (and risk state corruption)

I think this set of behaviours strikes a balance between allowing a thread to respond to an interrupt, but also ensuring we actually kill runaway processes. The 5000ms is from discussion above, and 100ms chosen to not slow down the response too much. I can imagine someone wanting different values for their particular application, but that feels speculative.

The build is failing because of a network issue with the master build of clojure.

CHANGELOG.md Outdated Show resolved Hide resolved
@bbatsov bbatsov merged commit 2224548 into nrepl:master Oct 11, 2019
@bbatsov
Copy link
Contributor

bbatsov commented Oct 11, 2019

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Confused about interrupt
3 participants