[SPARK-20844] Remove experimental from Structured Streaming APIs #18065
Test build #77197 has started for PR 18065 at commit
test this please
Test build #77203 has finished for PR 18065 at commit
@@ -10,7 +10,7 @@ title: Structured Streaming Programming Guide
# Overview
Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch computation on static data. The Spark SQL engine will take care of running it incrementally and continuously and updating the final result as streaming data continues to arrive. You can use the [Dataset/DataFrame API](sql-programming-guide.html) in Scala, Java, Python or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc. The computation is executed on the same optimized Spark SQL engine. Finally, the system ensures end-to-end exactly-once fault-tolerance guarantees through checkpointing and Write Ahead Logs. In short, *Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.*
The main title still says Experimental :P
haha, good catch
@@ -2800,8 +2800,6 @@ object functions {
* @group datetime_funcs
* @since 2.0.0
*/
@Experimental
@InterfaceStability.Evolving
did you intend to remove the `evolving` annotation here?
Yes, I did. This has been out since 2.0 and works in batch, so I don't think we can change it at this point.
All the Python APIs also need to be marked as not experimental!
Other Scala classes that are still marked experimental are
Good catch on Python! Fixed, but please help me make sure I didn't miss anything. Leaving all the GroupState stuff experimental was on purpose, since this will be the first release including it.
* Policy used to indicate how often results should be produced by a [[StreamingQuery]].
*
* @since 2.0.0
*/
@Experimental
@InterfaceStability.Evolving
public class Trigger {
This file has more places with ":: Experimental ::" in the scala docs
Test build #77317 has finished for PR 18065 at commit
The following places are not removed yet.
- [SQLContext/SparkSession].[readStream/streams], both Scala and Python.
- ProcessingTime.scala
- org.apache.spark.sql.streaming.OutputMode
I suppose we don't want to remove experimental from the R APIs.
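A check like the one reviewers are doing by hand here could be sketched as a small scan for leftover markers. This is only an illustrative sketch, not tooling from this PR: the class `ExperimentalMarkerScan`, the method `findMarkers`, and the sample source text are all hypothetical, and the three patterns it looks for (the `@Experimental` annotation, the `:: Experimental ::` doc tag, and an import mentioning `Experimental`) are simply the ones raised in this review.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark or of this PR.
class ExperimentalMarkerScan {

    /** Returns the 1-based line numbers that still carry an Experimental marker. */
    static List<Integer> findMarkers(String source) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            boolean annotation = line.contains("@Experimental");
            boolean docTag = line.contains(":: Experimental ::");
            boolean importLine =
                line.trim().startsWith("import") && line.contains("Experimental");
            if (annotation || docTag || importLine) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up source text for illustration only.
        String sample = String.join("\n",
            "import org.apache.spark.annotation.Experimental",
            "/**",
            " * :: Experimental ::",
            " * Policy used to indicate how often results should be produced.",
            " */",
            "@Experimental",
            "@InterfaceStability.Evolving",
            "class Example");
        System.out.println(findMarkers(sample)); // prints [1, 3, 6]
    }
}
```

A plain substring scan will also flag intentional uses (for example the GroupState classes that are meant to stay experimental), so the output is a list to review rather than a list to delete.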
@@ -35,7 +35,6 @@ import org.apache.spark.sql.types.StructType
*
* @since 2.0.0
*/
@Experimental
The `Experimental` import in this file is now unused.
Test build #77431 has finished for PR 18065 at commit
LGTM. Merging to master and 2.2.
Now that Structured Streaming has been out for several Spark releases and has large production use cases, the `Experimental` label is no longer appropriate. I've left `InterfaceStability.Evolving`, however, as I think we may make a few changes to the pluggable Source & Sink API in Spark 2.3.
Author: Michael Armbrust <michael@databricks.com>
Closes #18065 from marmbrus/streamingGA.
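The distinction drawn above, dropping `Experimental` while keeping `InterfaceStability.Evolving`, can be illustrated with a minimal, self-contained sketch. The annotations below are hypothetical stand-ins declared locally (they are not Spark's actual definitions), and `TriggerLike`, `GroupStateLike`, and `StabilityDemo` are made-up names that only mirror the outcome discussed in this PR: most streaming APIs end up marked evolving only, while the GroupState APIs keep the experimental marker.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical demo class; it only reads the stand-in annotations declared below.
class StabilityDemo {

    /** Reports the strongest stability caveat still attached to a class. */
    static String describe(Class<?> cls) {
        if (cls.isAnnotationPresent(Experimental.class)) {
            return cls.getSimpleName() + ": experimental";
        }
        if (cls.isAnnotationPresent(Evolving.class)) {
            return cls.getSimpleName() + ": evolving";
        }
        return cls.getSimpleName() + ": unmarked";
    }

    public static void main(String[] args) {
        System.out.println(describe(TriggerLike.class));    // prints TriggerLike: evolving
        System.out.println(describe(GroupStateLike.class)); // prints GroupStateLike: experimental
    }
}

// Stand-in for an "experimental" marker (assumption: runtime retention, for reflection).
@Retention(RetentionPolicy.RUNTIME)
@interface Experimental {}

// Stand-in for an "evolving" stability marker.
@Retention(RetentionPolicy.RUNTIME)
@interface Evolving {}

// After a change like this one: no longer experimental, still evolving.
@Evolving
class TriggerLike {}

// Left experimental on purpose, as with the GroupState APIs.
@Experimental
@Evolving
class GroupStateLike {}
```

The sketch uses runtime retention only so that reflection can read the markers; how the real annotations are retained or consumed is not something this page states.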