[SPARK-8979][Streaming] Implements a PIDRateEstimator #17
94c2225 to 2144472
 */
protected [streaming] val rateEstimator = ssc.conf
  .getOption("spark.streaming.RateEstimator")
  .getOrElse("noop") match {
I'd break this logic into its own method. The list of implementations might grow.
What do you suggest the signature of that method should be?
How about `String => RateEstimator`?
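A minimal sketch of what such a `String => RateEstimator` method could look like. The trait and classes below are empty stand-ins for the types named in this PR, and the `RateEstimators` object, the `create` name, and the `"pid"` key are assumptions made for illustration; only the `"noop"` default comes from the diff above.

```scala
// Hypothetical stand-ins: the real RateEstimator interface and the
// constructors of its implementations may look different.
trait RateEstimator
class NoopRateEstimator extends RateEstimator
class PIDRateEstimator extends RateEstimator

object RateEstimators {
  /** Maps a configured name to an estimator, i.e. String => RateEstimator. */
  def create(name: String): RateEstimator = name match {
    case "noop" => new NoopRateEstimator
    case "pid"  => new PIDRateEstimator // "pid" is an assumed key
    case other  =>
      // Fail fast on unknown names instead of silently falling back.
      throw new IllegalArgumentException(s"Unknown rate estimator: $other")
  }
}
```

With a method like this, the call site would only need to pass in the configured value (or the `"noop"` default), and new implementations would be added as extra `case` lines in one place.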
219e1bd to ef7b0a4
 * an estimate of the speed at which this stream should ingest messages,
 * given an estimate computation from a `RateEstimator`
 */
@DeveloperApi
This must be private[streaming]
I assume the same goes for `RateEstimator`, `NoopRateEstimator`, and `PIDRateEstimator`, at least for now, right?
Yep.
Done.
@transient private val executionContext = ExecutionContext.fromExecutorService(
  ThreadUtils.newDaemonSingleThreadExecutor("stream-rate-update"))

private val speedLimit : AtomicLong = new AtomicLong(-1L)
speedLimit --> rateLimit
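For illustration, a small sketch of the field after the suggested rename. The `RateLimitHolder` class and its `latestRate` and `publish` methods are hypothetical names, and reading `-1` as "no rate computed yet" is an assumption based only on the initial value in the diff.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical holder class; only the AtomicLong field and its -1L
// initial value are taken from the diff under review.
class RateLimitHolder {
  // Assumption: -1 stands for "no rate limit has been computed yet".
  private val rateLimit: AtomicLong = new AtomicLong(-1L)

  /** Latest published limit; safe to read from any thread. */
  def latestRate: Long = rateLimit.get()

  /** Publishes a new limit, e.g. from a background update thread. */
  def publish(newRate: Long): Unit = rateLimit.set(newRate)
}
```

An `AtomicLong` lets a single background thread write the value while other threads read it without extra locking.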
…ements the ReceiverRateController
…into its own method
Here should be the test plan.
These should be part of subtask 1 of dynamic rate limiting, together with the rate controller subtask.
Overall, this whole PR looks quite good and ready for PRs to the Spark main repo. I suggest breaking them down into smaller PRs according to the subtasks.
Refer to this link for build results (access rights to CI server needed): Build Log
Test FAILed.
Refer to this link for build results (access rights to CI server needed): Build Log
Test FAILed.
13ada97 to 0c51959
…onfig option.

## What changes were proposed in this pull request?

Currently, the `OptimizeIn` optimizer replaces an `In` expression with an `InSet` expression if the size of the set is greater than a constant, 10. This issue aims to add a configuration, `spark.sql.optimizer.inSetConversionThreshold`, for that.

After this PR, `OptimizeIn` is configurable.

```scala
scala> sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Project [a#7 IN (1,2,3) AS (a IN (1, 2, 3))#8]
:     +- INPUT
+- Generate explode([1,2]), false, false, [a#7]
   +- Scan OneRowRelation[]

scala> sqlContext.setConf("spark.sql.optimizer.inSetConversionThreshold", "2")

scala> sql("select a in (1,2,3) from (select explode(array(1,2)) a) T").explain()
== Physical Plan ==
WholeStageCodegen
:  +- Project [a#16 INSET (1,2,3) AS (a IN (1, 2, 3))#17]
:     +- INPUT
+- Generate explode([1,2]), false, false, [a#16]
   +- Scan OneRowRelation[]
```

## How was this patch tested?

Pass the Jenkins tests (with a new testcase)

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#12562 from dongjoon-hyun/SPARK-14796.
This depends on #16
Derivation in https://www.dropbox.com/s/dwgl7wa1z5wbkg6/PIDderivation.pdf?dl=0