[SPARK-20844] Remove experimental from Structured Streaming APIs #18065

Closed
wants to merge 3 commits into from

marmbrus
Contributor

Now that Structured Streaming has been out for several Spark releases and has large production use cases, the Experimental label is no longer appropriate. I've left InterfaceStability.Evolving, however, as I think we may make a few changes to the pluggable Source & Sink API in Spark 2.3.

@SparkQA

SparkQA commented May 22, 2017

Test build #77197 has started for PR 18065 at commit fe03241.

@marmbrus
Contributor Author

test this please

@SparkQA

SparkQA commented May 22, 2017

Test build #77203 has finished for PR 18065 at commit fe03241.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -10,7 +10,7 @@ title: Structured Streaming Programming Guide
# Overview
Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch computation on static data. The Spark SQL engine will take care of running it incrementally and continuously and updating the final result as streaming data continues to arrive. You can use the [Dataset/DataFrame API](sql-programming-guide.html) in Scala, Java, Python or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc. The computation is executed on the same optimized Spark SQL engine. Finally, the system ensures end-to-end exactly-once fault-tolerance guarantees through checkpointing and Write Ahead Logs. In short, *Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.*
Contributor

The main title still says Experimental :P

Contributor Author

haha, good catch

@@ -2800,8 +2800,6 @@ object functions {
* @group datetime_funcs
* @since 2.0.0
*/
@Experimental
@InterfaceStability.Evolving
Contributor

did you intend to remove the evolving here?

Contributor Author

@marmbrus marmbrus May 24, 2017

Yes, I did. This has been out since 2.0 and works in batch, so I don't think we can change it at this point.

@tdas
Contributor

tdas commented May 24, 2017

All the Python APIs also need to be marked as not experimental!

@tdas
Contributor

tdas commented May 24, 2017

Other Scala classes that are still marked experimental are:

  • FlatMapGroupsWithStateFunction
  • GroupState
  • GroupStateTimeout

Not sure if you missed them or intentionally kept them as experimental. Since mapGroupsWithState is so new, I think it's better to actually keep them experimental.

@marmbrus
Contributor Author

Good catch on Python! Fixed, but please help me make sure I didn't miss anything.

Leaving all the GroupState stuff experimental was on purpose, since this will be the first release including it.

* Policy used to indicate how often results should be produced by a [[StreamingQuery]].
*
* @since 2.0.0
*/
@Experimental
@InterfaceStability.Evolving
public class Trigger {

Contributor

@tdas tdas May 25, 2017

This file has more places with ":: Experimental ::" in the Scaladoc comments.

@SparkQA

SparkQA commented May 25, 2017

Test build #77317 has finished for PR 18065 at commit 8221c01.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@zsxwing zsxwing left a comment

The Experimental label has not yet been removed from the following places:

  • [SQLContext/SparkSession].[readStream/streams], both Scala and Python.
  • ProcessingTime.scala
  • org.apache.spark.sql.streaming.OutputMode

I suppose we don't want to remove experimental from the R APIs.

@@ -35,7 +35,6 @@ import org.apache.spark.sql.types.StructType
*
* @since 2.0.0
*/
@Experimental
Member

The import of Experimental in this file is now unused.

@SparkQA

SparkQA commented May 26, 2017

Test build #77431 has finished for PR 18065 at commit 98e8326.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented May 26, 2017

LGTM. Merging to master and 2.2.

asfgit pushed a commit that referenced this pull request May 26, 2017
Now that Structured Streaming has been out for several Spark releases and has large production use cases, the `Experimental` label is no longer appropriate. I've left `InterfaceStability.Evolving`, however, as I think we may make a few changes to the pluggable Source & Sink API in Spark 2.3.

Author: Michael Armbrust <michael@databricks.com>

Closes #18065 from marmbrus/streamingGA.
@asfgit asfgit closed this in d935e0a May 26, 2017