### Writing to files
- Structured Streaming can write streaming query output to files in the same formats it supports for reads. The file sink supports only append mode: adding new files to the output directory is easy, but modifying existing data files, as update and complete modes would require, is hard. The file sink also supports partitioning.
- The <b>Memory Sink</b> supports <b>Append and Complete</b> modes.
- The <b>Console Sink</b> supports <b>Append, Complete, and Update</b> modes.
- The Console and Memory Sinks are normally used only for debugging, much as <b>.show()</b> is used on static tables.
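
The file sink described in the first bullet could be started roughly as follows. This is a minimal sketch, not taken from the notebook: the helper name `start_partitioned_file_sink` and its parameters are hypothetical, and it assumes `stream_df` is a streaming DataFrame that contains the partition column. It sets a checkpoint location explicitly, since a file sink needs one to track which files it has committed.

```python
def start_partitioned_file_sink(stream_df, output_dir, checkpoint_dir, partition_col):
    """Start an append-mode Parquet file sink partitioned by one column.

    Hypothetical helper: all four parameters are illustrative names.
    """
    return (
        stream_df.writeStream
        .format("parquet")                             # a file format that is also readable as a source
        .outputMode("append")                          # the only output mode the file sink supports
        .partitionBy(partition_col)                    # one sub-directory per distinct column value
        .option("path", output_dir)                    # directory that receives the new data files
        .option("checkpointLocation", checkpoint_dir)  # progress and commit log for recovery
        .start()
    )
```

A call might look like `start_partitioned_file_sink(df, "/tmp/out", "/tmp/chk", "value")`, where both paths are placeholders.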

In [None]:
import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Writing to Memory").getOrCreate()

In [1]:
df = spark.readStream.format("socket") \
    .option("host", "localhost") \
    .option("port", 12345) \
    .load()

24/04/07 14:43:34 WARN TextSocketSourceProvider: The socket source should not be used for production applications! It does not support recovery.


In [2]:
writer = df.writeStream.outputMode("append") \
    .format("memory") \
    .queryName("myTable")

In [3]:
query = writer.start()

24/04/07 14:43:59 WARN ResolveWriteToStream: Temporary checkpoint location created which is deleted normally when the query didn't fail: /tmp/temporary-32b394ab-ddbc-4877-b245-1e274c507d44. If it's required to delete it under any circumstances, please set spark.sql.streaming.forceDeleteTempCheckpointLocation to true. Important to know deleting temp checkpoint folder is best effort.
24/04/07 14:43:59 WARN ResolveWriteToStream: spark.sql.adaptive.enabled is not supported in streaming DataFrames/Datasets and will be disabled.

In [6]:
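from IPython.display import display, clear_output
from time import sleep

# Refresh the query status and the in-memory table once per second.
# The loop runs until the cell is interrupted, which raises KeyboardInterrupt.
while True:
    clear_output(wait=True)
    display(query.status)
    # show() prints the table itself and returns None, so display() also renders "None".
    display(spark.sql('SELECT * FROM myTable').show())
    sleep(1)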

{'message': 'Waiting for data to arrive',
 'isDataAvailable': False,
 'isTriggerActive': False}

+-----------------+
|            value|
+-----------------+
|hi hello hi hello|
|      hi hello hi|
|               hi|
|             ds42|
|                 |
|      hi hello hi|
|             ds44|
|      hi hello hi|
|             ds44|
|               hi|
|      hi hello hi|
|               hi|
|             ds44|
|             ds43|
|            hello|
|      hi hello hi|
|            hello|
|             ds43|
|             ds42|
|hi hello hi hello|
+-----------------+
only showing top 20 rows



None

KeyboardInterrupt: 
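
The loop above only ends when the cell is interrupted. A bounded variant is sketched below as one possible alternative; the helper name `poll_query`, its parameters, and the default values are hypothetical. It assumes `query` is a started `StreamingQuery` and that `show_table` is a zero-argument callable that prints the current table contents.

```python
import time


def poll_query(query, show_table, interval_s=1.0, max_polls=10):
    """Poll a streaming query a fixed number of times, then stop it.

    Hypothetical helper: the name, parameters, and defaults are illustrative.
    """
    try:
        for _ in range(max_polls):
            if not query.isActive:   # the query was stopped or failed elsewhere
                break
            print(query.status)      # dict with message, isDataAvailable, isTriggerActive
            show_table()             # e.g. prints the memory table
            time.sleep(interval_s)
    finally:
        query.stop()                 # always stop the query, even on error or interrupt
```

With the objects from this notebook, a call could be `poll_query(query, lambda: spark.sql('SELECT * FROM myTable').show())`. Because the `finally` clause stops the query, no separate `query.stop()` is needed after this helper returns.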

In [5]:
spark.sql('SELECT * FROM myTable').show()

+-----------------+
|            value|
+-----------------+
|hi hello hi hello|
|      hi hello hi|
|               hi|
|             ds42|
|                 |
|      hi hello hi|
|             ds44|
|      hi hello hi|
|             ds44|
|               hi|
|      hi hello hi|
|               hi|
|             ds44|
|             ds43|
|            hello|
|      hi hello hi|
|            hello|
|             ds43|
|             ds42|
+-----------------+



In [7]:
query.stop()
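
The query above uses append mode. The memory sink also accepts complete mode, which applies to aggregated streams because the whole result table is rewritten on every trigger. The following is a minimal sketch under stated assumptions: the helper name `start_word_count_memory_sink` and the table name passed to it are hypothetical, and `lines_df` is assumed to be a streaming DataFrame with a string column named `value`, as the socket source produces.

```python
def start_word_count_memory_sink(lines_df, table_name):
    """Count words in a stream and keep the full result in an in-memory table.

    Hypothetical helper: assumes lines_df has a string column named "value".
    """
    word_counts = (
        lines_df
        .selectExpr("explode(split(value, ' ')) AS word")  # one row per word
        .groupBy("word")
        .count()                                           # running count per word
    )
    return (
        word_counts.writeStream
        .outputMode("complete")   # rewrite the whole result table on each trigger
        .format("memory")         # debugging sink, queryable with spark.sql
        .queryName(table_name)    # name of the in-memory table
        .start()
    )
```

After `start_word_count_memory_sink(df, "wordCounts")`, the running counts could be inspected with `spark.sql('SELECT * FROM wordCounts').show()`, where `wordCounts` is a placeholder table name.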