OutputMode

Output mode (OutputMode) describes what data is written to a streaming sink when there is new data available in streaming data sources.

Output mode of a streaming query is specified using outputMode method of DataStreamWriter.

val inputStream = spark.
  readStream.
  format("rate").
  load
import org.apache.spark.sql.streaming.{OutputMode, Trigger}
import scala.concurrent.duration._
val consoleOutput = inputStream.
  writeStream.
  format("console").
  option("truncate", false).
  trigger(Trigger.ProcessingTime(10.seconds)).
  queryName("rate-console").
  option("checkpointLocation", "checkpoint").
  outputMode(OutputMode.Update).  // <-- update output mode
  start

Note	Append output mode is the default output mode.

Table 1. Available Output Modes

OutputMode Name Behaviour

Append

append

Default output mode that writes new rows only

Required for datasets with FileFormat format (to create FileStreamSink)

Used for flatMapGroupsWithState operator

Complete

complete

Writes all rows (every time there are updates) and therefore corresponds to a traditional batch query.

Note	Supported only for streaming queries with groupBy or groupByKey aggregations (as asserted by UnsupportedOperationChecker).

Update

update

Write the rows that were updated (every time there are updates). If the query does not contain aggregations, it is equivalent to Append mode.

Used for mapGroupsWithState and flatMapGroupsWithState operators

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

spark-sql-streaming-OutputMode.adoc

spark-sql-streaming-OutputMode.adoc

OutputMode

Files

spark-sql-streaming-OutputMode.adoc

Latest commit

History

spark-sql-streaming-OutputMode.adoc

File metadata and controls

OutputMode