
Invoker graceful shutdown and drain mode #3704

Closed
wants to merge 12 commits

Conversation

dubee
Member

@dubee dubee commented May 24, 2018

Allows an invoker to manually be marked as down so that maintenance can be performed on the invoker. When in maintenance mode, the invoker will finish executing all actions currently running and finish processing its queued actions. Since an invoker in maintenance mode will appear as down to the load balancer, new actions will not be queued up for the invoker. The invoker will exit maintenance mode when it is restarted.

Related to #3681

Description

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

@dubee dubee changed the title Terminate invoker health scheduler via REST endpoint Provide invoker maintenance mode May 24, 2018
@dubee dubee requested a review from markusthoemmes May 24, 2018 18:30
@dubee dubee added the review Review for this PR has been requested and yet needs to be done. label May 24, 2018
@codecov-io

codecov-io commented May 24, 2018

Codecov Report

Merging #3704 into master will increase coverage by 0.44%.
The diff coverage is 13.46%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3704      +/-   ##
==========================================
+ Coverage   74.34%   74.79%   +0.44%     
==========================================
  Files         128      128              
  Lines        6081     6129      +48     
  Branches      398      387      -11     
==========================================
+ Hits         4521     4584      +63     
+ Misses       1560     1545      -15
Impacted Files Coverage Δ
.../scala/src/main/scala/whisk/common/Scheduler.scala 100% <ø> (ø) ⬆️
...er/src/main/scala/whisk/core/invoker/Invoker.scala 0% <ø> (ø) ⬆️
...ain/scala/whisk/core/invoker/InvokerReactive.scala 0% <0%> (ø) ⬆️
...scala/whisk/core/containerpool/ContainerPool.scala 84.88% <0%> (-1%) ⬇️
...n/scala/whisk/core/connector/MessageConsumer.scala 75.28% <43.75%> (-8.29%) ⬇️
...cala/src/main/scala/whisk/http/ErrorResponse.scala 89.77% <0%> (+1.13%) ⬆️
...on/scala/src/main/scala/whisk/common/Logging.scala 87.35% <0%> (+1.14%) ⬆️
...ain/scala/whisk/core/entity/ActivationResult.scala 92.3% <0%> (+1.53%) ⬆️
...in/scala/whisk/utils/ExecutionContextFactory.scala 100% <0%> (+7.69%) ⬆️
...rc/main/scala/whisk/core/entity/ExecManifest.scala 96.34% <0%> (+14.63%) ⬆️
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9e5a5e8...3c5855f.

@chetanmeh
Member

I believe this is the first management API over REST that allows changing the state of the system. Does this API have some form of security enabled?

@rabbah
Member

rabbah commented May 25, 2018

+1 @chetanmeh

The administrative interfaces should have authentication in place. While we're at it, perhaps we should apply the same to the /invokers route in the controller.

@markusthoemmes
Contributor

What do we gain over docker stopping the container in this case?

Have you thought about using kill-signal based functionality here to integrate better with tools like kube?

@csantanapr
Member

csantanapr commented May 27, 2018

I vote for not using a URL; I consider this an anti-pattern.

I'd prefer os.Signal for any use cases that the Docker CLI doesn't already cover.

@chetanmeh
Member

Both Kubernetes and Mesos support graceful shutdown via SIGTERM followed by some kind of graceful shutdown timeout (~30 secs, and configurable). The only caveat here is that the shell shipped by default with Alpine Linux does not pass signals to children, so we would need to launch with bash (we currently do install bash in the common Docker base image).

So if we can ensure that the invoker can finish shutting down within this time window, then we should rely on this support. If we do need to expose a URL endpoint, we can secure it by leveraging the same credentials we use for JMX authentication (either directly or probably via some form of JAAS auth).

Member Author

@dubee dubee left a comment

@csantanapr, @chetanmeh, I pushed a second commit that uses a signal handler to put the invoker in maintenance mode. To put an invoker in this mode, execute: docker kill --signal=USR2 invoker0.

I think it is important to note that maintenance mode and a graceful shutdown are two different features. A graceful shutdown would have to check that the invoker's queue is empty and that there are no user containers running, which I think is useful, whereas maintenance mode just marks the invoker as down in the load balancer.

case "USR2" =>
  logger.info(this, s"Stopping health scheduler")
  actorSystem.stop(healthScheduler)
  while (true) {}
Member Author

@dubee dubee May 29, 2018

For a graceful shutdown here, we would need to make sure user containers are not running and possibly check that the queue is empty.

Member

By queue do you mean the internal message feed or the kafka topic? The topic doesn't need to be empty as that can take an indeterminate amount of time --- bringing up another invoker to take over the topic would address the messages held in the topic.

Member Author

For the Kafka topic, I mean. Need to find a way to make the invoker stop consuming messages from the topic. Will take a look.

Member Author

For the above, might just be able to shutdown the activationFeed actor.

Member

I thought the feed had a shutdown hook. It is worth adding one.

Member Author

Looks like there is a shutdown hook for containerFactory. Perhaps maintenance mode is not necessary if we have a graceful shutdown.

@rabbah
Member

rabbah commented May 29, 2018

@chetanmeh 30s may not be enough time to drain all the requests live in memory, since each request could run for the max allowed duration. This assumes draining means executing the outstanding requests. It could also mean posting these requests back to the topic and letting some other invoker deal with them. For that, 30s is likely more suitable.

@markusthoemmes
Contributor

Correct, maintenance mode and graceful shutdown are different things. What's the benefit of this extra kill signal vs. just killing the whole invoker? Running actions will die either way.

@dubee
Member Author

dubee commented May 30, 2018

@markusthoemmes, as long as the thread where we hook the signal is busy, the invoker will not shut down. Currently there is just an infinite loop there, but a graceful shutdown would keep the thread busy until a shutdown can happen.
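
To make the blocking-handler idea concrete, here is a minimal sketch, assuming sun.misc.Signal is available and using a hypothetical gracefulShutdown() in place of the real drain logic:

import java.util.concurrent.CountDownLatch
import sun.misc.{Signal, SignalHandler}

object ShutdownSketch {
  // Completed once the (hypothetical) drain work has finished.
  private val drained = new CountDownLatch(1)

  def install(gracefulShutdown: () => Unit): Unit = {
    // Replaces the JVM's default TERM behavior; the handler thread stays busy
    // until the drain is done, so the process does not exit prematurely.
    Signal.handle(new Signal("TERM"), new SignalHandler {
      override def handle(sig: Signal): Unit = {
        gracefulShutdown()
        drained.countDown()
      }
    })
  }

  // The main thread can park here instead of spinning in an infinite loop.
  def awaitShutdown(): Unit = drained.await()
}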

@dubee dubee changed the title Provide invoker maintenance mode Provide graceful shutdown for invoker Jun 1, 2018
@dubee dubee added wip and removed review Review for this PR has been requested and yet needs to be done. labels Jun 1, 2018
}
})

Signal.handle(
Member Author

The code is still a little clunky, but I think the behavior is right, i.e., stopping the health scheduler, stopping the Kafka message consumer, and waiting for the user containers to finish.

@dubee dubee force-pushed the invoker-maintenance branch 3 times, most recently from 6f3cb5c to 558647a on June 4, 2018 20:48
@dubee
Member Author

dubee commented Jun 4, 2018

@dgrove-oss, do you think these changes will work with Kube's graceful shutdown feature now?

@dubee
Member Author

dubee commented Jun 4, 2018

@dgrove-oss, I am using docker kill --signal=TERM invoker0 to kick off the invoker graceful shutdown. This should be equivalent to kill 1 from within the invoker container.

@dgrove-oss
Member

This looks pretty good to me. I think we'd just need to tweak invoker.yaml to set the container's terminationGracePeriodSeconds to a bigger value than its default of 30 seconds to make sure we give all the user containers the time they need to finish cleanly. Maybe set the grace period for Kubernetes to the max time for a user function invocation + a small fudge factor?

@rabbah
Member

rabbah commented Jun 4, 2018

Max duration isn't enough, since there could still be requests queued in memory (feeds are double-buffered).
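
As a rough sizing sketch (assumptions: the default max action duration of 5 minutes and a double-buffered feed; the numbers are illustrative, not from this PR):

import scala.concurrent.duration._

// Back-of-the-envelope for a Kubernetes terminationGracePeriodSeconds value:
// currently running actions plus up to two more buffered batches in the feed.
val maxActionDuration = 5.minutes      // assumed default TimeLimit.MAX_DURATION in a stock deployment
val bufferFactor      = 3              // running + double-buffered feed
val fudge             = 30.seconds
val terminationGracePeriod = maxActionDuration * bufferFactor + fudge   // 15.5 minutes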

@markusthoemmes
Contributor

I'm wondering if a race condition is possible: can the pool be intermittently non-busy while getting new messages from the feed? That is, should we attach this condition to the feed, as in:

  1. Shut down the feed.
  2. Wait until the feed has no more messages and has received a "Processed" message for every message it already handed out.
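
To illustrate that two-step idea, a rough sketch of what such a drain condition on the feed could look like (Drain, Drained, HandedOut, and FeedLike are hypothetical names, not the actual MessageFeed protocol):

import akka.actor.{Actor, ActorRef}

case object Drain      // hypothetical: stop filling and report once everything handed out is done
case object Drained    // hypothetical reply sent when no messages remain outstanding
case object HandedOut  // hypothetical: a message was handed to the container pool
case object Processed  // stand-in for the feed's existing completion acknowledgement

class FeedLike extends Actor {
  private var outstanding = 0                    // handed out but not yet acknowledged
  private var draining: Option[ActorRef] = None  // who asked for the drain

  def receive = {
    case HandedOut =>
      outstanding += 1
    case Processed =>
      outstanding -= 1
      if (outstanding == 0) draining.foreach(_ ! Drained)
    case Drain =>
      // A real implementation would also stop fillPipeline() here so no new messages arrive.
      draining = Some(sender())
      if (outstanding == 0) sender() ! Drained
  }
}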

@rabbah
Member

rabbah commented Jun 5, 2018

There's no other way to know the invoker has quiesced - but a Processed message isn't enough, since the pool releases the resource (sends the message) before all the bookkeeping has finished. For example, outstanding DB operations might still be in flight to record the activation metadata.

while (Await.result(pool ? Busy, timeout.duration).asInstanceOf[Boolean] == true) {
  logging.info(this, s"Container pool is busy")
  Thread.sleep(1000)
}
Contributor

WDYT about applying some Future composition here to simplify the flow, like

/** Polls the pool's status and returns a future which completes once the pool is idle. */
def waitForContainerPoolIdle(pool: ActorRef): Future[Unit] = {
  (pool ? Busy).mapTo[Boolean].flatMap {
    case true  => akka.pattern.after(1.second, system.scheduler)(waitForContainerPoolIdle(pool))
    case false => Future.successful(())
  }.recoverWith { case _ => akka.pattern.after(1.second, system.scheduler)(waitForContainerPoolIdle(pool)) }
}

val shutdowns = Seq(
  gracefulStop(healthScheduler, 5.seconds).recover { case _ => logging.info(this, "Health communication failed to shutdown gracefully") },
  gracefulStop(activationFeed, 5.seconds).recover { case _ => logging.info(this, "Activation feed failed to shutdown gracefully") },
  waitForContainerPoolIdle(pool))

// Allow the shutdown to take a maximum of 3 times the maximum action runtime since the
// feed can be buffered and we want to allow for some grace period.
Await.result(Future.sequence(shutdowns), TimeLimit.MAX_DURATION * 3)

Does that make sense?

Member Author

I like it!

Member

Can you emit a log message while waiting for quiescence?

sendOutstandingMessages()
stay

case Event(FillCompleted(messages), _) =>
Member Author

@dubee dubee Jun 6, 2018

Still concerned about a race condition here. There might be a chance that the Busy event returns false while fillPipeline() is running and has yet to trigger the FillCompleted event.

Member

right - if filling, you need to wait.
you'll also need to prevent a further fill.

Member Author

@dubee dubee Jun 7, 2018

Hmm, do we know whether events are handled in a synchronous or an asynchronous fashion in the Akka FSM?

Contributor

Always asynchronous for Akka.
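
For context, a minimal standalone sketch (not OpenWhisk code) of the Akka model: messages queue up in an actor's mailbox and are processed one at a time, asynchronously with respect to the sender, so an FSM never handles two events concurrently:

import akka.actor.{Actor, ActorSystem, Props}

// Because the mailbox is drained one message at a time, the mutable counter
// needs no synchronization even though senders run on other threads.
class Counter extends Actor {
  private var count = 0
  def receive = {
    case "inc" => count += 1
    case "get" => sender() ! count
  }
}

object MailboxDemo extends App {
  val system = ActorSystem("demo")
  val counter = system.actorOf(Props[Counter], "counter")
  counter ! "inc" // tell returns immediately; the actor processes the message later
  counter ! "inc"
  system.terminate()
}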

@dubee dubee changed the title Provide graceful shutdown for invoker Invoker graceful shutdown and drain mode Jun 8, 2018
@dubee dubee added review Review for this PR has been requested and yet needs to be done. and removed wip labels Jun 8, 2018
try {
  // Allow the shutdown to take a maximum of 3 times the maximum action runtime since the feed can be
  // buffered and we want to allow for some grace period.
  Await.result(shutdowns, TimeLimit.MAX_DURATION * 3)
Member

Maybe we should return a different exit code in case of an unclean shutdown.

} finally {
  containerFactory.cleanup()
  logging.info(this, "Shutting down invoker")
  System.exit(0)
Member

Given it's a System.exit, we should catch and log exceptions from the Await operation.

// Capture SIGTERM signals to gracefully shut down the invoker. When gracefully shutting down, the health scheduler is
// shut down, preventing additional actions from being scheduled to the invoker, then the invoker processes its buffered
// messages from the activation feed, and waits for its user containers to finish running before the process exits.
Signal.handle(
Member

Do we need to rely on Signal here, or would a normal shutdown hook work, since it is also invoked in response to SIGTERM?

Member Author

From experimenting, the order of shutdown hooks is not guaranteed. That means the actor system may shut down before the graceful shutdown hook is executed. However, the actor system needs to be running in order to perform the graceful shutdown.
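
For reference, a minimal sketch of the shutdown-hook alternative discussed above, with a hypothetical drain() standing in for the real work; as noted, the JVM gives no ordering guarantee between registered hooks, so the actor system's own hook may run first:

object ShutdownHookSketch {
  def install(drain: () => Unit): Unit = {
    // Runs on SIGTERM and on normal JVM exit, but in no guaranteed order relative
    // to other hooks, e.g. the one that terminates the ActorSystem.
    Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
      override def run(): Unit = drain()
    }))
  }
}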

Member

@chetanmeh chetanmeh Jun 11, 2018

Okie. Reading this comment and the Signal source docs, it appeared that the JVM would not allow registering a handler for signals for which it registers its own handler, like SIGTERM.

 * Java code cannot register a handler for signals that are already used
 * by the Java VM implementation. The <code>Signal.handle</code>
 * function raises an <code>IllegalArgumentException</code> if such an attempt
 * is made.

Per that comment, the Oracle JDK would not throw an exception and would silently ignore the handler, so I was not sure whether the current approach would always work.
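
A hedged sketch of registering the handler defensively, since behavior may differ between JVMs (the helper name is illustrative):

import scala.util.{Failure, Success, Try}
import sun.misc.{Signal, SignalHandler}

// On JVMs that reserve the signal, Signal.handle raises IllegalArgumentException;
// log it instead of letting handler installation crash the invoker.
def tryHandle(name: String)(f: Signal => Unit): Unit =
  Try(Signal.handle(new Signal(name), new SignalHandler {
    override def handle(sig: Signal): Unit = f(sig)
  })) match {
    case Success(_) => println(s"installed handler for SIG$name")
    case Failure(e) => println(s"could not install handler for SIG$name: ${e.getMessage}")
  }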

@markusthoemmes
Contributor

@dubee what's the status of this? Seems like you've run into some dead-ends/race conditions?

@markusthoemmes markusthoemmes added the awaits-contributor The contributor needs to respond to comments from reviewer. label Sep 18, 2018
@markusthoemmes markusthoemmes added wip and removed wip labels Sep 18, 2018
@dubee dubee added awaits-reviewer The reviewer needs to respond to comments from contributor. and removed awaits-contributor The contributor needs to respond to comments from reviewer. labels Sep 24, 2018
@dubee
Member Author

dubee commented Jan 14, 2019

Closing this as there seem to be race conditions.

@dubee dubee closed this Jan 14, 2019
Labels
awaits-reviewer The reviewer needs to respond to comments from contributor.
review Review for this PR has been requested and yet needs to be done.

7 participants