[SPARK-28574][CORE] Allow to config different sizes for event queues#25307
yunzoud wants to merge 5 commits into apache:master
add to whitelist
ok to test
jiangxb1987 left a comment:
Looks good, only some nits.
test("event queue size can be configued through spark conf") {
  val conf = new SparkConf(false)
    .set(LISTENER_BUS_EVENT_QUEUE_CAPACITY, 5)
    .set("spark.scheduler.listenerbus.eventqueue.shared.capacity", "1")
nit: I would use s"spark.scheduler.listenerbus.eventqueue.${SHARED_QUEUE}.capacity" here.
  val conf = new SparkConf(false)
    .set(LISTENER_BUS_EVENT_QUEUE_CAPACITY, 5)
    .set("spark.scheduler.listenerbus.eventqueue.shared.capacity", "1")
    .set("spark.scheduler.listenerbus.eventqueue.eventLog.capacity", "2")
Similarly, I would use s"spark.scheduler.listenerbus.eventqueue.${EVENT_LOG_QUEUE}.capacity" here.
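To illustrate the two nits above, here is a minimal sketch of building the per-queue key from named constants instead of repeating string literals. The object name `QueueKeysSketch` and the helper `capacityKey` are hypothetical; the constant values are taken from the literal keys quoted in the diff.

```scala
object QueueKeysSketch {
  // Queue names, matching the literal keys quoted in the test above.
  val SHARED_QUEUE = "shared"
  val EVENT_LOG_QUEUE = "eventLog"

  // Hypothetical helper (not part of this PR): builds the per-queue capacity key.
  def capacityKey(queueName: String): String =
    s"spark.scheduler.listenerbus.eventqueue.${queueName}.capacity"
}
```

With this, a rename of a queue only needs to touch the constant, and the test keeps using the same name as the production code.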
val counter2 = new BasicJobCounter()
val counter3 = new BasicJobCounter()

bus.addToSharedQueue(counter1)
Maybe add a comment explaining that this call is only there to trigger the creation of a new queue.
// if no such conf is specified, use the value specified in
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
protected def capacity: Int = conf.getInt(
  s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
We should assert that the capacity is > 0 and add a test to cover that case.
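For reference, the fallback-plus-validation behavior being asked for could look roughly like the sketch below. It is not the PR's code: a plain `Map[String, String]` stands in for `SparkConf`, the global capacity is passed in as a parameter, `require` is used where the PR uses `assert`, and the error message wording is illustrative.

```scala
object QueueCapacitySketch {
  // `settings` is a hypothetical stand-in for SparkConf.
  def resolve(settings: Map[String, String], name: String, globalCapacity: Int): Int = {
    val key = s"spark.scheduler.listenerbus.eventqueue.${name}.capacity"
    // Use the per-queue value if present, otherwise fall back to the global capacity.
    val size = settings.get(key).map(_.toInt).getOrElse(globalCapacity)
    // Reject zero or negative capacities up front (illustrative message).
    require(size > 0, s"capacity for event queue ${name} must be greater than 0, got ${size}")
    size
  }
}
```

A test for the new check would then set a per-queue capacity of 0 and expect the lookup to fail.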
// For testing only.
private[scheduler] def getQueueCapacity(name: String): Int = {
  queues.asScala.find(_.name == name) match {
Maybe queues.asScala.find(_.name == name).map(_.capacity).getOrElse(-1)?
Test build #108483 has finished for PR 25307 at commit
Test build #108485 has finished for PR 25307 at commit
zsxwing left a comment:
Looks good overall. Left some nits.
}

// For testing only.
private[scheduler] def getQueueCapacity(name: String): Int = {
nit: could you change the return type to Option[Int] and use None to indicate an unknown queue rather than -1?
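A minimal sketch of the `Option[Int]` variant suggested here, assuming a simplified `Queue` case class in place of the bus's real queue type (both `QueueLookupSketch` and `Queue` are hypothetical names):

```scala
object QueueLookupSketch {
  // Simplified stand-in for the listener bus's internal queue type.
  final case class Queue(name: String, capacity: Int)

  // Returns Some(capacity) for a known queue and None for an unknown one,
  // so callers never have to interpret a sentinel such as -1.
  def getQueueCapacity(queues: Seq[Queue], name: String): Option[Int] =
    queues.find(_.name == name).map(_.capacity)
}
```

In a test this reads naturally as a comparison against `Some(1)` or `None`.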
  .set(s"spark.scheduler.listenerbus.eventqueue.${SHARED_QUEUE}.capacity", "1")
  .set(s"spark.scheduler.listenerbus.eventqueue.${EVENT_LOG_QUEUE}.capacity", "2")

val bus = new LiveListenerBus(conf)
nit: bus.stop() is missing. It should be called in a finally block to stop the internal thread.
The bus is actually never started, so there is no need to stop it.
private[scheduler] def capacity: Int = {
  val queuesize = conf.getInt(s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
    conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
  assert(queuesize > 0, s"capacity for event queue $name must be greater than 0," +
Since the names of the queues are private, it's fine to leave them undocumented and keep them internal for now.
LGTM pending tests
Test build #108574 has finished for PR 25307 at commit
Test build #108576 has finished for PR 25307 at commit
retest this please
LGTM
Test build #108579 has finished for PR 25307 at commit
Thanks! Merging to master.
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
private[scheduler] def capacity: Int = {
  val queuesize = conf.getInt(s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
    conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))

    conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
// The capacity can be configured by spark.scheduler.listenerbus.eventqueue.${name}.capacity,
// if no such conf is specified, use the value specified in
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
We need to update the conf description of LISTENER_BUS_EVENT_QUEUE_CAPACITY.
// if no such conf is specified, use the value specified in
// LISTENER_BUS_EVENT_QUEUE_CAPACITY
private[scheduler] def capacity: Int = {
  val queuesize = conf.getInt(s"spark.scheduler.listenerbus.eventqueue.${name}.capacity",
Instead of hard-coding the key here, can we define it in core/src/main/scala/org/apache/spark/internal/config/package.scala?
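One possible shape for this, sketched without Spark's own config classes: keep the key prefix and the key construction in a single object and hand out a small typed entry per queue. Everything here (`ListenerBusConfigSketch`, `IntEntry`, `queueCapacity`, `readFrom`) is a hypothetical illustration, not the API in config/package.scala.

```scala
object ListenerBusConfigSketch {
  // Hypothetical typed entry: a key plus the default used when the key is unset.
  final case class IntEntry(key: String, default: Int) {
    // `settings` is a plain map standing in for SparkConf.
    def readFrom(settings: Map[String, String]): Int =
      settings.get(key).map(_.toInt).getOrElse(default)
  }

  // The prefix lives in exactly one place.
  private val Prefix = "spark.scheduler.listenerbus.eventqueue"

  // Per-queue capacity entry; `default` would be the global queue capacity.
  def queueCapacity(name: String, default: Int): IntEntry =
    IntEntry(s"${Prefix}.${name}.capacity", default)
}
```

The queue class would then ask this object for its entry rather than assembling the key string itself.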
What changes were proposed in this pull request?
Add the configuration spark.scheduler.listenerbus.eventqueue.${name}.capacity so that each event queue can be given its own size. If it is not set for a queue, the value of LISTENER_BUS_EVENT_QUEUE_CAPACITY is used.
How was this patch tested?
Unit test in core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala