The Internals of Spark Structured Streaming
Name Latest commit message Commit time
graffles StreamingQueryManager and Marking Streaming Query as Terminated Oct 27, 2017
images StreamingQueryManager and Marking Streaming Query as Terminated Oct 27, 2017
src/main/scala OffsetSeqLog Aug 16, 2017
LICENSE Initial commit Feb 25, 2017
SUMMARY.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
book-intro.adoc Spark 2.4.0 + Intro Nov 8, 2018
book.json Spark 2.4.0 + Intro Nov 8, 2018
spark-sql-streaming-BaseStreamingSource.adoc ContinuousExecution.continuousSources and ContinuousReader Contract Nov 11, 2018
spark-sql-streaming-CachedKafkaConsumer.adoc KafkaSourceRDD et al Mar 6, 2017
spark-sql-streaming-CommitLog.adoc CommitLog -- HDFSMetadataLog for Batch Completion Log Jul 30, 2018
spark-sql-streaming-CompactibleFileStreamLog.adoc OffsetSeqLog -- HDFSMetadataLog with OffsetSeq Metadata Aug 16, 2017
spark-sql-streaming-ConsoleSink.adoc FileStreamSink -- Streaming Sink for Parquet Format Jun 14, 2018
spark-sql-streaming-ConsoleSinkProvider.adoc ContinuousExecution -- StreamExecution of StreamWriteSupport Sinks wi… Nov 11, 2018
spark-sql-streaming-ConsumerStrategy.adoc HDFSMetadataLog -- MetadataLog with Hadoop HDFS for Storage Aug 15, 2017
spark-sql-streaming-ContinuousExecution.adoc ContinuousExecution.logicalPlan -- Analyzed Logical Plan of Streaming… Nov 12, 2018
spark-sql-streaming-ContinuousExecutionRelation.adoc ContinuousExecutionRelation Leaf Logical Operator Nov 12, 2018
spark-sql-streaming-ContinuousReadSupport.adoc ContinuousExecution.continuousSources and ContinuousReader Contract Nov 11, 2018
spark-sql-streaming-ContinuousReader.adoc ContinuousExecution.continuousSources and ContinuousReader Contract Nov 11, 2018
spark-sql-streaming-DataSource.adoc FileStreamSink -- Streaming Sink for Parquet Format Jun 14, 2018
spark-sql-streaming-DataSourceReader.adoc ContinuousExecution.continuousSources and ContinuousReader Contract Nov 11, 2018
spark-sql-streaming-DataStreamReader.adoc DataStreamReader and DataStreamWriter APIs + menu reorg Jul 30, 2018
spark-sql-streaming-DataStreamWriter.adoc Misc (while hunting down the uses of OutputMode) Nov 22, 2018
spark-sql-streaming-Dataset-dropDuplicates.adoc withWatermark Operator -- Event Time Watermark (got promoted) Sep 4, 2017
spark-sql-streaming-Dataset-explain.adoc explain operator Sep 17, 2017
spark-sql-streaming-Dataset-groupBy.adoc groupByKey Operator "promoted" Aug 31, 2017
spark-sql-streaming-Dataset-groupByKey.adoc groupByKey Operator "promoted" Aug 31, 2017
spark-sql-streaming-Dataset-operators.adoc withWatermark Operator -- Event Time Watermark (got promoted) Sep 4, 2017
spark-sql-streaming-Dataset-withWatermark.adoc withWatermark Operator -- Event Time Watermark (got promoted) Sep 4, 2017
spark-sql-streaming-Deduplicate.adoc Streaming Dataset Operators / Streaming Dataset API Aug 17, 2017
spark-sql-streaming-EventTimeStatsAccum.adoc EventTimeStatsAccum (and EventTimeWatermarkExec physical operator) Aug 11, 2017
spark-sql-streaming-EventTimeWatermark.adoc FlatMapGroupsWithStateExec Unary Physical Operator Nov 24, 2018
spark-sql-streaming-EventTimeWatermarkExec.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-FileStreamSink.adoc FileStreamSink -- Streaming Sink for Parquet Format Jun 14, 2018
spark-sql-streaming-FileStreamSource.adoc KafkaSourceProvider and DataSource Sep 13, 2017
spark-sql-streaming-FlatMapGroupsWithState.adoc UnsupportedOperationChecker, FlatMapGroupsWithState and Append output… Sep 8, 2017
spark-sql-streaming-FlatMapGroupsWithStateExec.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
spark-sql-streaming-FlatMapGroupsWithStateStrategy.adoc StateStore, StateStoreCoordinator and StateStoreRDD Aug 25, 2017
spark-sql-streaming-ForeachBatchSink.adoc DataStreamWriter.foreachBatch and ForeachBatchSink Nov 20, 2018
spark-sql-streaming-ForeachSink.adoc [MINOR] Page renames Jul 1, 2017
spark-sql-streaming-ForeachWriter.adoc HDFSMetadataLog -- MetadataLog with Hadoop HDFS for Storage Aug 15, 2017
spark-sql-streaming-ForeachWriterProvider.adoc StreamWriteSupport Contract -- DataSourceV2 Sinks with Custom StreamW… Nov 10, 2018
spark-sql-streaming-GroupState.adoc [MINOR] HDFSBackedStateStore, StateStore and StateStoreRDD Aug 25, 2017
spark-sql-streaming-GroupStateImpl.adoc GroupState -- Contract for State Per Group For Arbitrary Stateful Agg… Aug 20, 2017
spark-sql-streaming-GroupStateTimeout.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-HDFSBackedStateStore.adoc StateStoreProvider Contract Nov 27, 2018
spark-sql-streaming-HDFSBackedStateStoreProvider.adoc HDFSBackedStateStoreProvider and spark.sql.streaming.maxBatchesToReta… Nov 28, 2018
spark-sql-streaming-HDFSMetadataLog.adoc KafkaSource.initialPartitionOffsets -- Initial Partition Offsets (of … Nov 9, 2018
spark-sql-streaming-IncrementalExecution.adoc HDFSBackedStateStore.commit -- Committing State Changes Nov 26, 2018
spark-sql-streaming-InputProcessor.adoc InputProcessor Helper Class of FlatMapGroupsWithStateExec Physical Op… Nov 25, 2018
spark-sql-streaming-KafkaOffsetReader.adoc KafkaSourceProvider and Specifying Name and Schema of Streaming Sourc… Sep 14, 2017
spark-sql-streaming-KafkaRelation.adoc StreamExecution and runBatch + KafkaRelation Oct 24, 2017
spark-sql-streaming-KafkaSink.adoc FileStreamSink -- Streaming Sink for Parquet Format Jun 14, 2018
spark-sql-streaming-KafkaSource.adoc KafkaSource.initialPartitionOffsets -- Initial Partition Offsets (of … Nov 9, 2018
spark-sql-streaming-KafkaSourceOffset.adoc KafkaSource.initialPartitionOffsets -- Initial Partition Offsets (of … Nov 9, 2018
spark-sql-streaming-KafkaSourceProvider.adoc StreamExecution and runBatch + KafkaRelation Oct 24, 2017
spark-sql-streaming-KafkaSourceRDD.adoc KafkaSourceRDD et al Mar 6, 2017
spark-sql-streaming-KeyToNumValuesStore.adoc StateStoreHandler Contract Nov 27, 2018
spark-sql-streaming-KeyValueGroupedDataset-flatMapGroupsWithState.adoc StreamingExecutionRelation logical operator and StreamExecution Sep 16, 2017
spark-sql-streaming-KeyValueGroupedDataset-mapGroupsWithState.adoc UnsupportedOperationChecker, FlatMapGroupsWithState and Append output… Sep 8, 2017
spark-sql-streaming-KeyValueGroupedDataset.adoc Misc (while hunting down the uses of OutputMode) Nov 22, 2018
spark-sql-streaming-KeyWithIndexToValueStore.adoc StateStoreHandler Contract Nov 27, 2018
spark-sql-streaming-MemoryPlan.adoc StreamExecution and runBatch (withNewSources phase) Sep 16, 2017
spark-sql-streaming-MemorySink.adoc FileStreamSink -- Streaming Sink for Parquet Format Jun 14, 2018
spark-sql-streaming-MemorySinkV2.adoc StreamWriteSupport Contract -- DataSourceV2 Sinks with Custom StreamW… Nov 10, 2018
spark-sql-streaming-MemoryStateStore.adoc StateStore Contract -- Key-Value Store for State Management Nov 25, 2018
spark-sql-streaming-MemoryStream.adoc OffsetSeqLog Aug 16, 2017
spark-sql-streaming-MetadataLog.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-MicroBatchExecution.adoc StateStoreWriter Contract -- Physical Operators That Write to StateStore Nov 24, 2018
spark-sql-streaming-Offset.adoc StreamExecution and uniqueSources and constructNextBatch Aug 6, 2017
spark-sql-streaming-OffsetSeq.adoc OffsetSeq promoted (got its own page) Jul 27, 2018
spark-sql-streaming-OffsetSeqLog.adoc OffsetSeq promoted (got its own page) Jul 27, 2018
spark-sql-streaming-OffsetSeqMetadata.adoc OffsetSeqMetadata.relevantSQLConfs Nov 26, 2018
spark-sql-streaming-OutputMode.adoc Misc (while hunting down the uses of OutputMode) Nov 22, 2018
spark-sql-streaming-ProgressReporter.adoc StateStoreWriter Contract -- Physical Operators That Write to StateStore Nov 24, 2018
spark-sql-streaming-QueryExecutionThread.adoc Changes to reflect [SPARK-22733] Split StreamExecution into MicroBatc… Jul 29, 2018
spark-sql-streaming-RateSourceProvider.adoc RateStreamSource Jul 1, 2017
spark-sql-streaming-RateStreamContinuousReader.adoc ContinuousExecution.continuousSources and ContinuousReader Contract Nov 11, 2018
spark-sql-streaming-RateStreamSource.adoc Example of flatMapGroupsWithState operator (and groupByKey also) Aug 20, 2017
spark-sql-streaming-SQLConf.adoc OffsetSeqMetadata.relevantSQLConfs Nov 26, 2018
spark-sql-streaming-Sink.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-Source.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-StateStore.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
spark-sql-streaming-StateStoreConf.adoc HDFSBackedStateStoreProvider and spark.sql.streaming.maxBatchesToReta… Nov 28, 2018
spark-sql-streaming-StateStoreCoordinator.adoc StateStore, StateStoreCoordinator and StateStoreRDD Aug 25, 2017
spark-sql-streaming-StateStoreCoordinatorRef.adoc StreamExecution.queryExecutionThread — Stream Execution Thread Jul 29, 2018
spark-sql-streaming-StateStoreCustomMetric.adoc StateStoreMetrics + StateStoreCustomMetric Nov 27, 2018
spark-sql-streaming-StateStoreHandler.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
spark-sql-streaming-StateStoreId.adoc StateStoreProvider Contract Nov 27, 2018
spark-sql-streaming-StateStoreMetrics.adoc StateStoreMetrics + StateStoreCustomMetric Nov 27, 2018
spark-sql-streaming-StateStoreOps.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
spark-sql-streaming-StateStoreProvider.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
spark-sql-streaming-StateStoreRDD.adoc HDFSBackedStateStoreProvider et al. Nov 27, 2018
spark-sql-streaming-StateStoreReader.adoc StatefulAggregationStrategy Execution Planning Strategy and planning … Jul 18, 2017
spark-sql-streaming-StateStoreRestoreExec.adoc StateStoreHandler Contract Nov 27, 2018
spark-sql-streaming-StateStoreSaveExec-Complete.adoc Demo: StateStoreSaveExec in Append Output Mode (cntd) Sep 4, 2017
spark-sql-streaming-StateStoreSaveExec-Update.adoc Demo: StateStoreSaveExec in Append Output Mode (cntd) Sep 4, 2017
spark-sql-streaming-StateStoreSaveExec.adoc StateStoreHandler Contract Nov 27, 2018
spark-sql-streaming-StateStoreUpdater.adoc FlatMapGroupsWithStateExec Jul 17, 2017
spark-sql-streaming-StateStoreWriter.adoc HDFSBackedStateStoreProvider and spark.sql.streaming.maxBatchesToReta… Nov 28, 2018
spark-sql-streaming-StatefulAggregationStrategy.adoc StreamingAggregationStateManagers Nov 26, 2018
spark-sql-streaming-StatefulOperator.adoc FlatMapGroupsWithStateExec Unary Physical Operator Nov 24, 2018
spark-sql-streaming-StatefulOperatorStateInfo.adoc HDFSBackedStateStore.commit -- Committing State Changes Nov 26, 2018
spark-sql-streaming-StreamExecution.adoc Misc (while hunting down the uses of OutputMode) Nov 22, 2018
spark-sql-streaming-StreamProgress.adoc OffsetSeq promoted (got its own page) Jul 27, 2018
spark-sql-streaming-StreamSinkProvider.adoc [MINOR] Streaming Source and Sink Sep 11, 2017
spark-sql-streaming-StreamSourceProvider.adoc KafkaSourceProvider and Specifying Name and Schema of Streaming Sourc… Sep 14, 2017
spark-sql-streaming-StreamWriteSupport.adoc StreamWriteSupport Contract -- DataSourceV2 Sinks with Custom StreamW… Nov 10, 2018
spark-sql-streaming-StreamingAggregationStateManager.adoc StreamingAggregationStateManagerImplV2 -- Default State Manager for S… Nov 26, 2018
spark-sql-streaming-StreamingAggregationStateManagerBaseImpl.adoc StateStoreHandler Contract Nov 27, 2018
spark-sql-streaming-StreamingAggregationStateManagerImplV1.adoc StreamingAggregationStateManagerImplV2 -- Default State Manager for S… Nov 26, 2018
spark-sql-streaming-StreamingAggregationStateManagerImplV2.adoc [MINOR] Polishing Nov 26, 2018
spark-sql-streaming-StreamingDeduplicateExec.adoc FlatMapGroupsWithStateExec Unary Physical Operator Nov 24, 2018
spark-sql-streaming-StreamingDeduplicationStrategy.adoc StateStore, StateStoreCoordinator and StateStoreRDD Aug 25, 2017
spark-sql-streaming-StreamingExecutionRelation.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-StreamingGlobalLimitExec.adoc StateStoreWriter Contract -- Physical Operators That Write to StateStore Nov 24, 2018
spark-sql-streaming-StreamingQuery.adoc StreamingQuery Contract and StreamingQueryWrapper Nov 11, 2018
spark-sql-streaming-StreamingQueryListener.adoc Changes to reflect [SPARK-22733] Split StreamExecution into MicroBatc… Jul 29, 2018
spark-sql-streaming-StreamingQueryListenerBus.adoc StreamingQueryListener -- Intercepting Streaming Events Sep 11, 2017
spark-sql-streaming-StreamingQueryManager.adoc Misc (while hunting down the uses of OutputMode) Nov 22, 2018
spark-sql-streaming-StreamingQueryProgress.adoc StreamingQueryProgress Sep 11, 2017
spark-sql-streaming-StreamingQueryWrapper.adoc StreamingQuery Contract and StreamingQueryWrapper Nov 11, 2018
spark-sql-streaming-StreamingRelation.adoc Changes to reflect [SPARK-22733] Split StreamExecution into MicroBatc… Jul 29, 2018
spark-sql-streaming-StreamingRelationExec.adoc StreamingRelationExec Leaf Physical Operator Aug 19, 2017
spark-sql-streaming-StreamingRelationStrategy.adoc StreamingRelationStrategy Execution Planning Strategy Sep 15, 2017
spark-sql-streaming-StreamingRelationV2.adoc ContinuousExecutionRelation Leaf Logical Operator Nov 12, 2018
spark-sql-streaming-StreamingSymmetricHashJoinExec.adoc StreamingQueryManager Oct 26, 2017
spark-sql-streaming-SymmetricHashJoinStateManager.adoc StateStoreMetrics + StateStoreCustomMetric Nov 27, 2018
spark-sql-streaming-TextSocketSource.adoc Example of StateStoreSaveExec with socket source (to control what and… Aug 26, 2017
spark-sql-streaming-TextSocketSourceProvider.adoc Initial notes drop Feb 25, 2017
spark-sql-streaming-Trigger.adoc ContinuousExecution -- StreamExecution of StreamWriteSupport Sinks wi… Nov 11, 2018
spark-sql-streaming-TriggerExecutor.adoc Changes to reflect [SPARK-22733] Split StreamExecution into MicroBatc… Jul 29, 2018
spark-sql-streaming-UnsupportedOperationChecker.adoc Streaming aggregation with Append output mode requires watermark Sep 8, 2017
spark-sql-streaming-WatermarkSupport.adoc StateStore Contract -- Key-Value Store for State Management Nov 25, 2018
spark-sql-streaming-demo-StreamingQueryManager-awaitAnyTermination-resetTerminated.adoc Demo: Using StreamingQueryManager for Query Termination Management Oct 27, 2017
spark-sql-streaming-demo-current_timestamp.adoc MicroBatchExecution Jul 29, 2018
spark-sql-streaming-demo-custom-sink-webui.adoc Demo: Using StreamingQueryManager for Query Termination Management Oct 27, 2017
spark-sql-streaming-demo-groupBy-aggregation-append.adoc StreamExecution, id and streamMetadata from checkpoint Sep 9, 2017
spark-sql-streaming-logging.adoc ProgressReporter and Updating Progress Jul 4, 2017
spark-sql-streaming-properties.adoc HDFSBackedStateStoreProvider and spark.sql.streaming.maxBatchesToReta… Nov 28, 2018
spark-sql-streaming-webui.adoc StateStore, StateStoreCoordinator and StateStoreRDD Aug 25, 2017
spark-sql-streaming-window.adoc window Function -- Stream Time Windows Sep 18, 2017
spark-structured-streaming.adoc withWatermark Operator -- Event Time Watermark (got promoted) Sep 4, 2017