-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-19168][Structured Streaming] StateStore should be aborted upon error #16547
Conversation
Test build #71193 has finished for PR 16547 at commit
|
s"$outputMode output mode not supported when there are streaming aggregations on " + | ||
s"streaming DataFrames/DataSets")(plan) | ||
s"$outputMode output mode requires watermark to be speficied when there are " + | ||
s"streaming aggregations on streaming DataFrames/DataSets")(plan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an easy fix, so I've also included it in this patch
Test build #71195 has finished for PR 16547 at commit
|
AddData(inputData, 10), // Should not emit anything as data less than watermark | ||
CheckLastBatch(), | ||
assertNumStateRows(2) | ||
assertNumStateRows(2, 1) // We also processed the data 10, which is less than watermark |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note this is assertNumStateRows(2, 1)
without this patch, and is assertNumStateRows(2, 0)
with this patch
AddData(inputData, 10), // Should not emit anything as data less than watermark | ||
CheckLastBatch(), | ||
assertNumStateRows(2) | ||
assertNumStateRows(2, 0) // We processed nothing in this batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note this is assertNumStateRows(2, 1)
without this patch, and is assertNumStateRows(2, 0)
with this patch.
Test build #71203 has finished for PR 16547 at commit
|
There is another issue in this class: it doesn't call |
@zsxwing thanks for the comments. I'll update this. |
Test build #71248 has started for PR 16547 at commit |
Jenkins retest this please |
Test build #71266 has finished for PR 16547 at commit
|
@zsxwing updated as per your comments; would you take another look? |
Discussed with @tdas offline. Regarding the changes to the append mode, it's unknown that if adding the filter is better because it will apply the filter on all rows but there are usually only few rows out of watermark. It's better to just leave it as it it. For aborting StateStore, could you use |
Could you also update the JIRA and the PR description after your changes? |
@zsxwing thanks for the feedback! Ah, sure, let me update accordingly. |
Append
mode@@ -335,4 +344,67 @@ class StreamingAggregationSuite extends StreamTest with BeforeAndAfterAll { | |||
CheckLastBatch((90L, 1), (100L, 1), (105L, 1)) | |||
) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of the following test is to inject a mock state store, which:
- throws an error in its
commit()
method - marks the
aborted
flag in itsabort()
method
Then at the end of the test, we would check if aborted
is true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to not add this complicated test because:
- TaskContext.get().addTaskCompletionListener has already been tested in other tests.
- Scala Reflection is not thread-safe in 2.10 and this test may be flaky since SQL also uses Scala Reflection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! I hesitated a little when I added it -- yea it was complicated -- so let me remove it
Test build #71512 has finished for PR 16547 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to not add the test. Otherwise, LGTM.
@@ -335,4 +344,67 @@ class StreamingAggregationSuite extends StreamTest with BeforeAndAfterAll { | |||
CheckLastBatch((90L, 1), (100L, 1), (105L, 1)) | |||
) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to not add this complicated test because:
- TaskContext.get().addTaskCompletionListener has already been tested in other tests.
- Scala Reflection is not thread-safe in 2.10 and this test may be flaky since SQL also uses Scala Reflection.
Test build #71565 has finished for PR 16547 at commit
|
LGTM. Merging to master and 2.1. Thanks! |
… error ## What changes were proposed in this pull request? We should call `StateStore.abort()` when there should be any error before the store is committed. ## How was this patch tested? Manually. Author: Liwei Lin <lwlin7@gmail.com> Closes #16547 from lw-lin/append-filter. (cherry picked from commit 569e506) Signed-off-by: Shixiong Zhu <shixiong@databricks.com>
… error ## What changes were proposed in this pull request? We should call `StateStore.abort()` when there should be any error before the store is committed. ## How was this patch tested? Manually. Author: Liwei Lin <lwlin7@gmail.com> Closes apache#16547 from lw-lin/append-filter.
… error ## What changes were proposed in this pull request? We should call `StateStore.abort()` when there should be any error before the store is committed. ## How was this patch tested? Manually. Author: Liwei Lin <lwlin7@gmail.com> Closes apache#16547 from lw-lin/append-filter.
What changes were proposed in this pull request?
We should call
StateStore.abort()
when there should be any error before the store is committed.How was this patch tested?
Manually.