-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-13146][SQL] Management API for continuous queries #11030
Conversation
""" | ||
""".stripMargin | ||
|
||
def assert(condition: => Boolean, message: String): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are a bunch of changes to make sure that any kind of assert
and eventually
that is needed within testStream
is wrapped in a try catch so that we can catch them and enrich the message with the testState
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this method, but I wouldn't shadow an existing method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to verify.
Test build #50589 has finished for PR 11030 at commit
|
} | ||
} | ||
|
||
test("awaitAnyTermination") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are likely to be flaky because of the timing related issues. Other then making the timings more coarse, i am not sure how else to test awaitTerminations's behavior.
Test build #50597 has finished for PR 11030 at commit
|
@@ -84,6 +84,17 @@ final class DataStreamWriter private[sql](df: DataFrame) { | |||
} | |||
|
|||
/** | |||
* Specifies a name to the [[ContinuousQuery]] to be started. This name must be unique among |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Specifies a name for this [[ContinuousQuery]]. The name must be unique for any active query, and will be auto assigned if unspecified."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataStreamWriter is not a ContinuousQuery. I think looking at the scala doc page of DataStreamWriter, the phrase "this ContinuousQuery" is confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't have to be exactly that text, but Specifies a name to the ContinuousQuery to be started
doesn't sound right. In other parts of the documentation we say "for the underlying datastream"
How about just "Specifies a name for this query."... and then stuff about auto assigning an uniqueness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still confusing. this query
... this
is a not a query. That's why it make more sense if name is associated directly with the start()
method.
val query = ds.streamTo.path("some-path").format("text").start("query-name")
OR
val query = ds.streamTo("path").format("text").start("query-name")
Then its easy to write the docs of def start(name: String)
"@param name Name of the query".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then its easy to write the docs of def start(name: String) "@param name Name of the query".
There is already a start(path: String)
Test build #50954 has finished for PR 11030 at commit
|
Test build #50955 has finished for PR 11030 at commit
|
* If the query has terminated with an exception, then the exception will be thrown. | ||
* | ||
* If the query has terminated, then all subsequent calls to this method will either return | ||
* `true` immediately (if the query was terminated by `stop()`), or throw the exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this method doesn't return a Boolean.
Just some nits. Otherwise LGTM |
I addressed the multifailure case and added unit test for it. |
Test build #51020 has finished for PR 11030 at commit
|
Test build #51021 has finished for PR 11030 at commit
|
Test build #51022 has finished for PR 11030 at commit
|
test this again. |
test this please. |
retest this |
Test build #2529 has finished for PR 11030 at commit
|
Test build #51030 has finished for PR 11030 at commit
|
Found the failure cause: Hadoop FileSystem API will wrap
So if we call One workaround is adding Another one is we should allow exception happens in this test since we randomly start and stop the stream multiple times. |
Test build #51055 has finished for PR 11030 at commit
|
Test build #51062 has finished for PR 11030 at commit
|
LGTM |
Management API for Continuous Queries
API for getting status of each query
API for managing each query
API for managing multiple queries
API for listening to query life cycle events