Add auto backpressure, proportional load balancer and realtime monitor to your service.
Right now we are at 0.4.0, the plan is the stablize the API from 0.5.0 going on. So there will still be major API changes coming.
When you hit a service that has a max capacity of 100 request per second with 200 requests per second, you get:
Without Kanaloa
Latency increases until few requests get handled within accetable time
With Kanaloa
Latency is controlled, half of requests are handled timely, the other half are rejected immediately.
Kanaloa automatically figures out the capacity of your service and let it handles at maximum capacity, rejects requests above it.
Kanaloa also adaptes to your backend capacity change automatically.
Kanaloa work dispatchers sit in front of your service and provides auto back pressure through the following means:
- Autothrottle - it dynamically figures out the optimal number of concurreny your service can handle, and make sure that at any given time your service handles no more than that number of concurrent requests. This simple mechanism was also ported and contributed to Akka as Optimal Size Exploring Resizer with some limitations. See details of the algorithm below.
- PIE - A traffic regulator based on the PIE algo (Proportional Integral controller Enhanced) suggested in this paper https://www.ietf.org/mail-archive/web/iccrg/current/pdfB57AZSheOH.pdf by Rong Pan and his collaborators. See details of the algorithm below.
- Circuit breaker - when error rate from your service goes above a certain threshold, the kanaloa will slow down sending work to them for a short period of time to give your service a chance to "cool down".
- Real-time monitoring - a built-in statsD reporter allows you to monitor a set of critical metrics (throughput, failure rate, queue length, expected wait time, service process time, number of concurrent requests, etc) in real time. It also provides real-time insights into how kanaloa dispatchers are working. An example on Grafana:
resolvers += Resolver.jcenterRepo
libraryDependencies += "com.iheart" %% "kanaloa-core" % "0.4.0"
An example of a my-dispatcher
kanaloa {
dispatchers {
my-service1 {
workerPool {
startingPoolSize = 8
}
workTimeout = 3s
}
}
}
For more configuration settings and their documentation please see the reference configuration
There are two types kanaloa dispatchers: PushingDispatcher
and PullingDispatcher
.
PushingDispatcher
takes work requests when you send message to it.
val system = ActorSystem()
// suppose you wrote your service in an actor,
// which takes a SomeWork(someRequest) message and
// replies SuccessResult(someResults) when it succeeds
val serviceActor = system.actorOf(MyServiceActor.props)
val dispatcher =
system.actorOf(PushingDispatcher.props(
name = "my-service1",
serviceActor
) {
case SuccessResult(r) => Right(r) // ResultChecker that tells kanaloa if the request is handled succesffully
case _ => Left("something bad happened")
})
dispatcher ! SomeWork("blahblah") //dispatcher replies the result (whatever wrapped in the SuccessResult) back.
PullingPatcher
pulls work from an iterator, ideal when you have control over the source of the work (e.g. a task iterates through your DB)
//assume you have the same system and serviceActor as above.
//unrealistic example of an iterator you would pass into a PullingDispatcher
val iterator = List(SomeWork("work1"), SomeWork("work2"), SomeWork("work3")).iterator
val dispatcher =
system.actorOf(PullingDispatcher.props(
name = "my-service2",
iterator,
serviceActor
) {
case SuccessResult(r) => Right(r) // this is your ResultChecker which tell kanaloa if the request is handled succesffully
case _ => Left("something bad happened")
})
// dispatcher will pull all work in dispatch them to service and shutdown itself when all done.
We used an ActorRef as service here, Kanaloa also supports Props and T => Future[R]
function as service. You can use any T
as service if you implement an implicit T => Backend
where Backend
is a simple trait:
trait Backend { def apply(af: ActorRefFactory): ActorRef }
Kanaloa has a built-in statsD reporter that allows users to monitor important metrics in real time.
Add following to your config
kanaloa {
default-dispatcher {
metrics {
enabled = on
statsd {
host = "localhost" #host of your statsD server
port = 8125
}
}
}
}
Metrics reporting settings at the dispatcher level, e.g.
kanaloa {
dispatchers {
example {
metrics.enabled = off
}
example2 {
metrics {
statsd {
eventSampleRate = 0.01
}
}
}
}
}
For more settings please see the reference configuration
We provide a grafana dashboard if you are using grafana for statsD visualization. We also provide a docker image with which you can quickly get up and running a statsD server and visualization web app. Please follow the instructions there.
Disclosure: some of the following descriptions were adapted from the documentation of Akka's OptimalSizeExploringResizer, which was also written by the original author of this document.
Behind the scene kanaloa dispatchers create a set of workers that pull work on behave of your service. These workers wait for result coming back from the service before they accept more work from the dispatcher. This way it thorttles the number of concurrent requests the service handles. The autothrottle constantly search for the optimal concurrency, i.e. the just right concurrency that allows backend to provide maximum capactity. For example, if the service achieves maximum output by handling 30 requests at the same time, kanaloa will make sure it doesn't handle more than 30 requests concurrently. Typcially, hitting a service with above its optimum concurrency will result in either higher latency (the extra requests have to wait in a queue) or lower throughput (extra concucrrent requests have to compete for limited resource causing contention.)
The autothrottle keeps track of throughput and latency at each pool size and perform the following three resizing operations (one at a time) periodically:
- Downsize if it hasn't seen all workers ever fully utilized for a period of time.
- Explore to a random nearby pool size to try and collect throughput metrics.
- Optimize to a nearby pool size with a better (than any other nearby sizes) throughput and latency metrics.
When the pool is fully-utilized (i.e. all workers are busy), it randomly chooses between exploring and optimizing. When the pool has not been fully-utilized for a period of time, it will downsize the pool to the last seen max utilization multiplied by a configurable ratio.
By constantly exploring and optimizing, the resizer will eventually walk to the optimal size and remain nearby. When the optimal size changes it will start walking towards the new one.
If autothrottle is all you need, you should consider using OptimalSizeExploringResizer in an Akka router. (The caveat is that you need to make sure that your routees block on handling messages, i.e. they can’t simply asynchronously pass work to the backend service).
A traffic regulator based on the PIE algo (Proportional Integral controller Enhanced) suggested in this paper https://www.ietf.org/mail-archive/web/iccrg/current/pdfB57AZSheOH.pdf by Rong Pan and his collaborators. The algo drop request with a probability, here is the pseudocode:
Every update interval Tupdate
-
Estimation current queueing delay
currentDelay = queueLength / averageDequeueRate
-
Based on current drop probability, p, determine suitable step scales:
if p < 1% : α = α΄ / 8, β = β΄ / 8 else if p < 10% : α = α΄ / 2, β = β΄ 2 else : α = α΄, β = β΄
-
Calculate drop probability as:
p = p + α * (currentDelay - referenceDelay) / referenceDelay + β * (currentDelay - oldDelay) / referenceDelay
-
Update previous delay sample rate as
OldDelay - currentDelay
The regulator allows for a burst, here is the calculation
-
When enqueueing
if burstAllowed > 0 enqueue request bypassing random drop
-
upon
Tupdate
if p == 0 and currentDelay < referenceDelay / 2 and oldDelay < referenceDelay / 2 burstAllowed = maxBurst else burstAllowed = burstAllowed - timePassed (roughly Tupdate)
Any contribution and feedback is more than welcome.
The autothrottle algorithm was first suggested by @richdougherty. @ktoso kindly reviewed this library and provided valuable feedback.