Low throughput when using Flow.throttle with low maxBurst #23475
Comments
To me the api docs are pretty clear on this:
I'm not sure such a detailed description belongs in the "quickstart" but if you think this could be made more clear in the API docs, or in fact should be explained in the quickstart, a PR is definitely welcome! |
Right, that's how I figured out that the burst is not above the baseline. I'm OK with adding a note to the quickstart (and API doc) saying: "To achieve 1000 QPS you should use
@ktoso what's your opinion? |
The sample isn't about achieving 1000qps though, so that sounds weird. I'm thinking something more along the lines of replacing "(the second 1 in the argument list is the maximum size of a burst that ...)." with something that better describes the saving-up-to-a-burst behaviour. Hard to describe it in a succinct way, maybe just end the parenthesis with a "read more in the API docs" linked to API docs? |
Why not? As a user of the quickstart guide, what I'm looking for is how to throttle at X QPS, not necessarily worrying about token buckets, spare tokens, and so on. If I want to know more about how throttle works, I'll dig into the API doc. |
What would One could probably go lower, but below 20 ms interval per burst I don't think it will be very smooth. Is that not how it's working? |
I hadn't thought about what a |
ok, I hadn't thought of that scenario :) |
With

    import java.util.concurrent.ConcurrentHashMap
    import akka.actor.ActorSystem
    import akka.stream.scaladsl.Source
    import akka.stream.{ActorMaterializer, ThrottleMode}
    import scala.collection.JavaConverters._
    import scala.collection.immutable.SortedSet
    import scala.concurrent.Await
    import scala.concurrent.duration._

    object WhatsWithThrottling {
      def main(args: Array[String]): Unit = {
        implicit val system = ActorSystem("QuickStart")
        implicit val materializer = ActorMaterializer()
        val startMillis = System.currentTimeMillis()
        // Number of elements seen per 10 ms bucket since start
        val ticks = new ConcurrentHashMap[Long, Long]()
        val future = Source.fromIterator(() => Iterator.continually(1))
          .throttle(1000, per = 1.second, maximumBurst = 100, ThrottleMode.shaping)
          .take(1000)
          .runForeach(_ => {
            val bucket = (System.currentTimeMillis() - startMillis) / 10
            ticks.putIfAbsent(bucket, 0)
            ticks.computeIfPresent(bucket, (k, v) => v + 1)
          })
        Await.result(future, atMost = 1.hour)
        system.terminate()
        println(s"Elapsed: ${System.currentTimeMillis() - startMillis} ms, got ${ticks.asScala.to[SortedSet]}")
      }
    }

I get
which means that every 30 ms there are ~30 elements going through the throttler.
With .throttle(1000, per = 1.second, maximumBurst = 10, ThrottleMode.shaping) I get
which means that every 30 ms there are ~11 elements going through the throttler.
With .throttle(1000, per = 1.second, maximumBurst = 1, ThrottleMode.shaping) I get (trimmed to the first 3 seconds to not pollute the output)
which means that every 30 ms there are 2 elements going through the throttler.
So the bottom line is that to achieve 1000 QPS, maximumBurst has to be at least about 1000 × 0.030 = 30, that is, one 30 ms timer interval's worth of elements, where 30 ms is an Akka-internal value that I probably should not rely on (?). |
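The measurements above (2, ~11, and ~30 elements per 30 ms for maximumBurst of 1, 10, and 100) suggest roughly maximumBurst + 1 elements per timer interval, capped at the configured rate. The following is a back-of-the-envelope sketch of that pattern, not Akka's implementation; `BurstEstimate`, `estimatedQps`, `minimumBurst`, and the `timerDelayMs` parameter are hypothetical names introduced only for illustration, and the 30 ms delay is an observed value, not a documented guarantee.

```scala
object BurstEstimate {
  /** Rough throughput estimate: (maximumBurst + 1) elements per timer
    * interval, never more than the configured rate. This is an assumption
    * derived from the observed numbers, not from Akka's source.
    */
  def estimatedQps(ratePerSecond: Double, maximumBurst: Int, timerDelayMs: Double): Double =
    math.min(ratePerSecond, (maximumBurst + 1) * 1000.0 / timerDelayMs)

  /** Smallest burst for which the estimate above reaches the configured rate. */
  def minimumBurst(ratePerSecond: Double, timerDelayMs: Double): Int =
    math.max(0, math.ceil(ratePerSecond * timerDelayMs / 1000.0).toInt - 1)
}
```

Under this estimate, `BurstEstimate.estimatedQps(1000, 5, 30)` gives 200 QPS, and `BurstEstimate.minimumBurst(1000, 30)` gives 29, so a burst of about 30 would be enough if the timer delay really is 30 ms.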
Here is a case when burst > rate

    import java.util.concurrent.ConcurrentHashMap
    import akka.actor.ActorSystem
    import akka.stream.scaladsl.Source
    import akka.stream.{ActorMaterializer, ThrottleMode}
    import scala.collection.JavaConverters._
    import scala.collection.immutable.SortedSet
    import scala.concurrent.Await
    import scala.concurrent.duration._

    object WhatsWithThrottling {
      def main(args: Array[String]): Unit = {
        implicit val system = ActorSystem("QuickStart")
        implicit val materializer = ActorMaterializer()
        val startMillis = System.currentTimeMillis()
        // Number of elements seen per 100 ms bucket since start
        val ticks = new ConcurrentHashMap[Long, Long]()
        val future = Source.fromIterator(() => Iterator.continually(1))
          .dropWithin(4.seconds)
          .throttle(1000, per = 1.second, maximumBurst = 3000, ThrottleMode.shaping)
          .take(10000)
          .runForeach(_ => {
            val bucket = (System.currentTimeMillis() - startMillis) / 100
            ticks.putIfAbsent(bucket, 0)
            ticks.computeIfPresent(bucket, (k, v) => v + 1)
          })
        Await.result(future, atMost = 20.seconds)
        system.terminate()
        println(s"Elapsed: ${System.currentTimeMillis() - startMillis} ms, got ${ticks.asScala.to[SortedSet]}")
      }
    }

I get
which means that after 4 seconds I get a burst of elements (3018) and then 90-120 elements every 100 ms, which gives roughly 1000/s. Actually they come at ~30 elements every 30 ms, but that's too much data to put here; you can change the proper line to
That's quite nice that the elements are spread over time. |
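The reason for the confusion is the bucket model that is implemented.
Throttle implements the token bucket model. There is a bucket with a given token capacity (burst size or maximumBurst). Tokens drop into the bucket at a given rate and can be `spared` for later use up to bucket capacity to allow some burstiness.
So for burst = 0 there is no bucket, and each time an element goes through, the system schedules a timer for up to 1 millisecond. The problem here is that the time between scheduling the timer and the next push is 30 milliseconds. So for 1000 scheduled timers, we are supposed to wait 30 seconds.
The second drawback is that the bucket is filled with tokens only if the time that has passed is longer than one token (the time between two stream elements).
So I would recommend noting in the documentation that the bucket size must cover the 30-millisecond scheduling delay.
Also, it would be good to count the time that is less than one token (the timespan between two elements) to increase accuracy. |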
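To make the explanation above concrete, here is a minimal virtual-time sketch of a token bucket whose wake-up timer cannot fire sooner than a fixed delay. It is a simplified model built from the description in this thread, not Akka's actual implementation; `ThrottleTimerModel`, `elapsedMillis`, and `timerDelayMs` are hypothetical names, and both the full bucket at start and the 30 ms delay are assumptions.

```scala
object ThrottleTimerModel {
  /** Virtual milliseconds needed to pass `elements` items through the model.
    * Requires ratePerSecond > 0. No real time passes; the clock is simulated.
    */
  def elapsedMillis(elements: Int, ratePerSecond: Int, burst: Int, timerDelayMs: Long): Long = {
    val nanosPerToken = 1000000000L / ratePerSecond
    val timerDelayNanos = timerDelayMs * 1000000L
    var now = 0L               // virtual clock in nanoseconds
    var lastUpdate = 0L        // time up to which tokens have been accounted for
    var tokens = burst.toLong  // assumption: the bucket starts full

    for (_ <- 1 to elements) {
      // Refill: one token per nanosPerToken elapsed, capped at the bucket capacity.
      if (now > lastUpdate) {
        val earned = (now - lastUpdate) / nanosPerToken
        tokens = math.min(burst.toLong, tokens + earned)
        lastUpdate += earned * nanosPerToken
      }
      if (tokens > 0) {
        tokens -= 1 // a token is available: the element passes immediately
      } else {
        // No token: reserve the next one, then wait on a timer that
        // fires no sooner than timerDelayNanos after being scheduled.
        val tokenReadyAt = lastUpdate + nanosPerToken
        lastUpdate = tokenReadyAt
        now = math.max(tokenReadyAt, now + timerDelayNanos)
      }
    }
    now / 1000000L
  }
}
```

With 1000 elements at 1000 per second and a 30 ms timer delay, this model yields 30000 ms for burst 0, 4980 ms for burst 5, and 900 ms for burst 100, which is in line with the roughly 30 s and 5 s timings reported in this issue. With a timer delay of 0 and burst 0 it yields 1000 ms, which is what one would expect from an ideal timer.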
Is and will be 30 ms guaranteed by Akka? To me it sounds like an internal
thing that can change.
In the case when there are no tokens in the bucket should we refill 30ms
worth of tokens? Even if the bucket would overflow for a moment.
Marcin
…On Aug 11, 2017 12:07 AM, "Alexander Golubev" ***@***.***> wrote:
The reason for the confusion is bucket model that is implemented.
Throttle implements the token bucket model. There is a bucket with a given token capacity (burst size or maximumBurst). Tokens drop into the bucket at a given rate and can be `spared` for later use up to bucket capacity to allow some burstiness.
So for burst = 0 there is no bucket, and each time an element goes through,
the system schedules a timer for up to 1 millisecond. The problem here is
that the time between scheduling the timer and the next push is 30
milliseconds. So for 1000 scheduled timers, we are supposed to wait 30 seconds.
The second drawback is that the bucket is filled with tokens only if the time
that has passed is longer than one token (the time between two stream elements).
So I would recommend noting in the documentation that the bucket size must
cover the 30-millisecond scheduling delay.
Also, it would be good to count the time that is less than one token (the
timespan between two elements) to increase accuracy.
|
Throttling depends heavily on the SLA, so it's dangerous to vary the bucket size without the developer's approval. |
@patriknw @johanandren do you think it would be good to have a throttle without the burst parameter? It looks like there should be something simple that can be used without understanding the bucket model. |
I think the throttler and the bucket model should be described in documentation, preferably with some illustrations and typical use cases. We could introduce another When you say SLA and QPS, do you mean that it should not exceed the given frequency and simply schedule with the given |
My guess is that developers who configure 1000 events per 1 second (with a burst of 0) expect that there can be a burst of up to 1000 events during each second.
|
regarding |
To sum up we need to:
did I miss anything? |
sounds good, it would be great if you can work on that @agolubev |
@patriknw, I'll take a look |
Ok, finally started this ticket. |
When I run
then I get "Elapsed: 30103 ms" which is ~33 QPS, not 1000
For numRecords=5000 I get "Elapsed: 150093 ms", so it increased linearly, by 5x.
When I run with a modification
then I get "Elapsed: 115 ms"
When I run with a different modification
then I get "Elapsed: 4125 ms", which is how I use it now (but it does not feel intuitive to set up both params to the same value).
When I run with yet another modification, just in case a burst of 0 is the problem (maximumBurst = 5), then I get "Elapsed: 5041 ms" (down from 30 s before), but that is still below expectations.
I'm not sure if the behaviour with low values for maxBurst is intended. Examples I find on the internet use 1 or 10 elements per second, which is too low to observe the problem.
That's something that would be good to clarify in the documentation (or at least give a real-world example, if .throttle(500, per = 1.second, maximumBurst = 500, shaping) is the expected usage).
Currently the documentation uses 1 QPS, so the burst does not matter that much:
http://doc.akka.io/docs/akka/2.5.3/scala/stream/stream-quickstart.html#time-based-processing
Based on that I thought .throttle(1000, 1.second, 0, ThrottleMode.shaping) would give me 1000 QPS with 0 elements that get through immediately.
According to the documentation of Flow.throttle:
https://github.com/akka/akka/blob/master/akka-stream/src/main/scala/akka/stream/scaladsl/Flow.scala#L1784-L1793
It seems that tokens are generated way too slowly for small buckets.
I used
Thoughts?