[SPARK-21873][SS] - Avoid using return inside CachedKafkaConsumer.get#19059
Closed
YuvalItzchakov wants to merge 2 commits intoapache:masterfrom
YuvalItzchakov:master
Closed
[SPARK-21873][SS] - Avoid using return inside CachedKafkaConsumer.get#19059YuvalItzchakov wants to merge 2 commits intoapache:masterfrom YuvalItzchakov:master
return inside CachedKafkaConsumer.get#19059YuvalItzchakov wants to merge 2 commits intoapache:masterfrom
YuvalItzchakov:master
Conversation
… to `org.apache.spark.util.UninterruptibleThread.runUninterruptibly` as a function type which causes a NonLocalReturnControl to be called for every call
srowen
reviewed
Aug 26, 2017
| resetFetchedData() | ||
| null | ||
|
|
||
| if (isFetchComplete) consumerRecord else { |
Member
There was a problem hiding this comment.
Go ahead and put the if clause on a new line with braces
| while (toFetchOffset != UNKNOWN_OFFSET && !isFetchComplete) { | ||
| try { | ||
| return fetchData(toFetchOffset, untilOffset, pollTimeoutMs, failOnDataLoss) | ||
| consumerRecord = fetchData(toFetchOffset, untilOffset, pollTimeoutMs, failOnDataLoss) |
Member
There was a problem hiding this comment.
Can fetchData return null? if not, then the condition can just be on consumerRecord == null
Author
There was a problem hiding this comment.
It may return null:
|
Test build #3907 has finished for PR 19059 at commit
|
srowen
approved these changes
Aug 28, 2017
|
Test build #3910 has finished for PR 19059 at commit
|
Member
|
By the way @YuvalItzchakov could you make a JIRA and link it? this is small but it's a non-trivial improvement and think we should handle it as a normal issue. |
return inside CachedKafkaConsumer.getreturn inside CachedKafkaConsumer.get
Author
Member
|
Merged to master |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
During profiling of a structured streaming application with Kafka as the source, I came across this exception:
This is a 1 minute sample, which caused 106K
NonLocalReturnControlexceptions to be thrown.This happens because
CachedKafkaConsumer.getis ran inside:private def runUninterruptiblyIfPossible[T](body: => T): TWhere
body: => Tis thegetmethod. Turning the method into a function means that in order to escape thewhileloop defined ingetthe runtime has to do dirty tricks which involve throwing the above exception.What changes were proposed in this pull request?
Instead of using
return(which is generally not recommended in Scala), we place the result of thefetchDatamethod inside a local variable and use a boolean flag to indicate the status of fetching data, which we monitor as our predicate to thewhileloop.How was this patch tested?
I've ran the
KafkaSourceSuiteto make sure regression passes. Since the exception isn't visible from user code, there is no way (at least that I could think of) to add this as a test to the existing suite.