Spikhalskiy
Contributor

What was changed:

The code in the Worker's Poller and PollerTasks was cleaned up to use one unified way of handling InterruptedException.

Before, LocalActivityPollTask and ActivityPollTask had a mix of correct and incorrect handling of InterruptedException:

  • it was swallowed and ignored
  • it was replaced with a RuntimeException
  • the interrupted flag was raised (the only correct way; Pollers got it through the GRPC stub implementation)
    See [the article](https://dzone.com/articles/how-to-handle-the-interruptedexception) for more context about appropriate ways of handling InterruptedException.

Now PollerTasks are refactored to one approach (same as GRPC): raising the interrupted flag. Poller was also refactored to handle interruptions more explicitly.
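
A minimal sketch of the difference between swallowing the exception and raising the flag. The class and method names are hypothetical, and a BlockingQueue stands in for whatever the real poll tasks block on; this is not the SDK's code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a poll task; names are illustrative only.
final class InterruptHandlingSketch {

  // Raises the interrupted flag again, so callers up the stack can still
  // observe that the thread was interrupted.
  static String pollRestoringFlag(BlockingQueue<String> queue) {
    try {
      return queue.poll(100, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }

  // Incorrect variant: the blocking call clears the flag when it throws,
  // and nothing sets it back, so the interruption is lost.
  static String pollSwallowing(BlockingQueue<String> queue) {
    try {
      return queue.poll(100, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      return null;
    }
  }
}
```

After pollRestoringFlag returns on an interrupted thread, Thread.currentThread().isInterrupted() is still true; after pollSwallowing it is false.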

Closes

This PR also includes a small cleanup of ActivityPollTask and closes Issue #204

.build());
}

if (taskQueueActivitiesPerSecond > 0) {
Contributor Author

copy-paste cleanup, Issue #204

@Spikhalskiy Spikhalskiy force-pushed the poller-interruption-cleanup branch from e08d4fc to 3330fa8 Compare May 9, 2021 21:06
}
return accepted;
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
Contributor Author

Previously, we could completely lose the interrupted flag of a worker thread here. [WorkflowWorker -> ReplayWorkflowTaskHandler -> ReplayWorkflowRunTaskHandler -> LocalActivityPollTask#apply]

@Spikhalskiy Spikhalskiy force-pushed the poller-interruption-cleanup branch from 3330fa8 to cc9dc09 Compare May 9, 2021 21:09
|| !(e.getCause() instanceof InterruptedException)
&& !(e instanceof RejectedExecutionException)) {
// if we are terminating and getting rejected execution - it's normal
if (!pollExecutor.isTerminating() || !(e instanceof RejectedExecutionException)) {
Contributor Author

pollExecutor.isTerminating() covers both pollExecutor.isShutdown() and pollExecutor.isTerminating().
The || !(e.getCause() instanceof InterruptedException) check was here to handle what LocalActivityPollTask was doing:

catch (InterruptedException e) {
      throw new RuntimeException("local activity poll task interrupted", e);
}

It's now replaced with proper interrupted flag handling.
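
To illustrate why the cause check was needed, here is a rough sketch of the two styles from the caller's point of view. Everything in it is hypothetical (Thread.sleep stands in for the blocking poll); only the exception message is taken from the snippet above.

```java
// Hypothetical illustration; not the SDK's classes.
final class WrappedInterruptSketch {

  // Old style: the interruption is hidden inside a RuntimeException.
  static void pollWrapping() {
    try {
      Thread.sleep(100);
    } catch (InterruptedException e) {
      throw new RuntimeException("local activity poll task interrupted", e);
    }
  }

  // The caller then has to inspect the cause to learn about the interruption.
  static boolean callerSeesInterruptViaCause() {
    try {
      pollWrapping();
      return false;
    } catch (RuntimeException e) {
      return e.getCause() instanceof InterruptedException;
    }
  }

  // New style: the task raises the flag and returns normally.
  static void pollRestoringFlag() {
    try {
      Thread.sleep(100);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // The caller only needs to check the thread's interrupted flag.
  static boolean callerSeesInterruptViaFlag() {
    pollRestoringFlag();
    return Thread.currentThread().isInterrupted();
  }
}
```

With the wrapping style the thread's flag stays cleared after the exception is caught, so any code that only checks the flag never learns about the interruption.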

// flush the flag
Thread.currentThread().interrupt();
}
return pollExecutor.isTerminating() || threadIsInterrupted;
Contributor Author

pollExecutor.isTerminating() covers both pollExecutor.isShutdown() and pollExecutor.isTerminating().
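
For reference, a small sketch of how these state checks relate on a plain java.util.concurrent.ThreadPoolExecutor. The class and method names are hypothetical, and the executor here is a stand-in, not the SDK's pollExecutor. Between shutdown() and full termination both isShutdown() and isTerminating() return true; once the pool has terminated, isShutdown() stays true while isTerminating() goes back to false.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; only the ThreadPoolExecutor calls are real JDK API.
final class ExecutorStateSketch {

  // Returns {isShutdown, isTerminating, isTerminated} observed while the pool
  // is terminating (row 0) and after it has terminated (row 1).
  static boolean[][] observeStates() {
    ThreadPoolExecutor executor =
        new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    executor.execute(() -> {
      started.countDown();
      try {
        release.await(); // keep the worker busy until released
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    try {
      started.await();
      executor.shutdown(); // the running task is allowed to finish
      boolean[] whileTerminating = {
        executor.isShutdown(), executor.isTerminating(), executor.isTerminated()
      };
      release.countDown();
      if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("executor did not terminate in time");
      }
      boolean[] afterTermination = {
        executor.isShutdown(), executor.isTerminating(), executor.isTerminated()
      };
      return new boolean[][] {whileTerminating, afterTermination};
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while observing executor states", e);
    }
  }
}
```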

@Spikhalskiy Spikhalskiy force-pushed the poller-interruption-cleanup branch from cc9dc09 to f4d2664 Compare May 9, 2021 21:25
@Override
public void run() {
try {
if (pollExecutor.isShutdown()) {
Contributor Author

@Spikhalskiy Spikhalskiy May 9, 2021

This check is not needed here. The condition has just been checked in the Poller's finally block and is also checked by the thread pool itself on task acceptance.
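
A rough sketch of the kind of control flow referred to here: the termination check lives in a finally block, and the thread pool rejects resubmission once it has been shut down. This is an assumption-laden illustration, not the actual Poller; the class, its fields, and the queue used as a work source are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical self-resubmitting poll loop; not the SDK's Poller.
final class PollLoopSketch implements Runnable {
  private final ThreadPoolExecutor pollExecutor;
  private final BlockingQueue<String> source;
  final AtomicInteger polled = new AtomicInteger();

  PollLoopSketch(ThreadPoolExecutor pollExecutor, BlockingQueue<String> source) {
    this.pollExecutor = pollExecutor;
    this.source = source;
  }

  @Override
  public void run() {
    try {
      source.take(); // blocks; throws InterruptedException on shutdownNow()
      polled.incrementAndGet();
    } catch (InterruptedException e) {
      // restore the flag so the finally block can see it and exit
      Thread.currentThread().interrupt();
    } finally {
      if (!shouldTerminate()) {
        try {
          pollExecutor.execute(this); // schedule the next poll
        } catch (RejectedExecutionException e) {
          // the pool was shut down after the check above; normal during shutdown
        }
      }
    }
  }

  private boolean shouldTerminate() {
    return pollExecutor.isTerminating() || Thread.currentThread().isInterrupted();
  }

  // Runs the loop over two items, then shuts the pool down; returns the poll count.
  static int demo() {
    ThreadPoolExecutor executor =
        new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    PollLoopSketch loop =
        new PollLoopSketch(executor, new LinkedBlockingQueue<>(List.of("a", "b")));
    executor.execute(loop);
    try {
      long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
      while (loop.polled.get() < 2 && System.nanoTime() < deadline) {
        Thread.sleep(10);
      }
      executor.shutdownNow(); // interrupts the worker blocked in take()
      if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("poll loop did not exit");
      }
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt();
    }
    return loop.polled.get();
  }
}
```

Because both the finally block and the pool's own rejection guard the resubmission, an extra isShutdown() check at the top of run() adds nothing in this sketch.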

@Spikhalskiy Spikhalskiy force-pushed the poller-interruption-cleanup branch 2 times, most recently from 8dee28d to 451747e Compare May 9, 2021 23:19
@Spikhalskiy Spikhalskiy force-pushed the poller-interruption-cleanup branch from 451747e to 73dd89e Compare May 9, 2021 23:20
pollBackoffThrottler.success();
} catch (InterruptedException e) {
// we restore the flag here, so it can be checked and processed (with exit) in finally
Thread.currentThread().interrupt();
Contributor

Should we also do pollBackoffThrottler.failure() to avoid leaking?

Contributor Author

@Spikhalskiy Spikhalskiy May 10, 2021

I would say no. InterruptedException is not really something that should trigger a backoff, it's not really a "failure" of any sort, it's a sign that we should terminate the thread because something is canceled or shutting down.
But I don't think it's very important or critical here, we can add it if you think it looks cleaner this way.

Contributor

@vitarb vitarb May 11, 2021

Yeah, I agree, main use case for this is probably shutdown and it doesn't really matter what happens with counters at that point, I just didn't want to have a code path where we are not incrementing the counter back. Not a blocker if you feel strongly about it though.

Contributor Author

@Spikhalskiy Spikhalskiy May 11, 2021

I'm actually not fully getting what you mean by "leaking" and "incrementing the counter back". It sounds like you expect the throttler to have some counter that decreases before the execution and increases at the end, right? And you feel uneasy about it getting stuck in a broken state where we decreased it but didn't increase it back.
But that's actually not how the throttler works inside. It has .failure(), which counts the number of failures that happened in a row, and .success(), which resets the counter of failures. I don't think either of these methods should really be called here. And there isn't anything that looks like a leak happening here, or I don't see it (.throttle doesn't really modify any counters inside).
So, yeah, let's maybe just leave it as it is.

@Spikhalskiy Spikhalskiy requested a review from vitarb May 10, 2021 23:47
@Spikhalskiy
Contributor Author

@vitarb Thank you for the review!

@vitarb vitarb merged commit 07a10cf into temporalio:master May 11, 2021
@Spikhalskiy Spikhalskiy mentioned this pull request Aug 10, 2021