Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make jetty scheduler threads daemon thread #1637

Merged
merged 1 commit into from
Sep 2, 2015

Conversation

drcrallen
Copy link
Contributor

Fixes bug from #1627

I was able to reproduce it locally for peons that answered a query. Turns out that Jetty was spawning a scheduler which was not marked as daemon.

@@ -223,8 +223,6 @@ public void run()
)
);
injector.getInstance(ExecutorLifecycle.class).join();
// Explicitly call lifecycle stop, dont rely on shutdown hook.
lifecycle.stop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is the advantage of removing this and putting the requirement that each thread started by any lifecycle handler be a daemon thread , this assumption is hard to verify in unit tests and could cause peons hanging.
If the concern is that calling stop() twice might cause issues then we should probably use the started flag in the stop() method to ensure it is effectively a noop if called twice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite simply: Because I'm trying to push the nodes to be able to have a very clean shutdown. That means all threads exit properly on their own accord or are daemon threads, and that shutdown hooks are properly executed.

For a thread to hold up the entire JVM (excluding the main thread) is probably a bug.

We don't call lifecycle.stop() on any of the other CLI's and I don't think it is reasonable to require any module or custom CLI to explicitly call lifecycle.stop(), but only if it intends to eventually exit.

This is mostly to help shake out any potential bugs where we may not have fixed a daemon thread.

You are correct that it is harder to test. And if you think this PR should be blocked until the IT are fixed, then we can talk about what the expectations are (I want to fix the IT for this kind of stuff anyways).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think peon process and other druid processes have different contract when it comes to stopping.

For the non-peon druid process, expectation is that they will always run and will stop only when a signal is sent and lifecycle.stop() is indeed called via the shutdown hook. (all non-peon druid process call https://github.com/metamx/java-util/blob/master/src/main/java/com/metamx/common/lifecycle/Lifecycle.java#L283 )
For the peon process however, there are 2 ways for it to stop, one is by sending the signal and another is that it finishes its work and lifecycle.stop() can/should be called in both cases.

BTW, I wouldn't want to hold the PR if people are, in general, agreeing with the approach chosen by this PR :) . I would be happy to merge this if there is agreement on this. Let us bring this up in the syncup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, Handler (https://github.com/metamx/java-util/blob/master/src/main/java/com/metamx/common/lifecycle/Lifecycle.java#L308) has start() and stop().
on start() a handler is free to allocate resources as it wishes (including starting non-daemon threads). It should free up resources and stop all threads on call to stop(). IMO, It would be considered a bug only if it doesn't do so after the call to its stop().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was discussed in the dev-sync that @himanshug 's approach is probably the way to go for now. This may require changes to Lifecycle in java-util such that stop can be applied multiple times. But for the purposes of this PR I'll revert the behavior here.

@@ -161,6 +162,10 @@ private static Server makeJettyServer(@Self DruidNode node, ServerConfig config)

final Server server = new Server(threadPool);

// Without this bean set, the default ScheduledExecutorScheduler runs as non-daemon, causing lifecycle hooks to fail
// to fire on main exit. Related bug: https://github.com/druid-io/druid/pull/1627
server.addBean(new ScheduledExecutorScheduler("JettyScheduler", true), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does server.stop() not terminate these threads?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@himanshug If the stop is called then yes. The problem is that if you let io.druid.cli.Main naturally exit, then the lifecycle.stop which calls server.stop is never called because this is a daemon thread. With this patch, a clean exit out of Main can be accomplished because the jetty scheduler is not holding up the shutdown hook

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like I'm missing something, but dint we decide to call lifecycle.stop() explicitly and via shutdown hook in cases where Main is expected to exit voluntarily (which is in case of CliPeon only for now) ? In other CliXXX classes, shutdown hook is always registered and they can only exit on a signal which will call the shutdown hook?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This patch eliminates the need to explicitly call for all known cases. As Eric mentioned, in general it should be harder to create non daemon threads than daemon threads. This PR is working towards that and a world where all Druid tasks can shutdown cleanly by simply returning from Main

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stopping the life cycle will still be done explicitly until there are at least better integration tests in place

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for me if that is the conclusion

@drcrallen
Copy link
Contributor Author

Bouncing for Travis

@drcrallen drcrallen closed this Aug 27, 2015
@drcrallen drcrallen reopened this Aug 27, 2015
@drcrallen drcrallen closed this Aug 31, 2015
@drcrallen drcrallen reopened this Aug 31, 2015
@nishantmonu51
Copy link
Member

👍

nishantmonu51 added a commit that referenced this pull request Sep 2, 2015
Make jetty scheduler threads daemon thread
@nishantmonu51 nishantmonu51 merged commit e79572b into apache:master Sep 2, 2015
@drcrallen drcrallen deleted the peonPlaysNiceOnShutdown branch September 2, 2015 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants