Check what content we have in the remote mounted data source container (we'll assume that the content is updated automatically and periodically with new batch data).

In [0]:
%fs 
 
ls "/mnt/blobcontainer1/databricks_files"

path,name,size,modificationTime
dbfs:/mnt/blobcontainer1/databricks_files/video_games_data_01.csv,video_games_data_01.csv,3713,1678470406000
dbfs:/mnt/blobcontainer1/databricks_files/video_games_data_02.csv,video_games_data_02.csv,3830,1678469932000
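
The `modificationTime` values in the listing above look like Unix timestamps in milliseconds (an assumption based on their magnitude). A minimal sketch in plain Python, using only the two values from the listing, converts them to readable UTC timestamps and finds the most recently modified file. The helper name `to_utc` is made up for this illustration.

```python
from datetime import datetime, timezone

# (file name, modificationTime) pairs copied from the listing above
listing = [
    ("video_games_data_01.csv", 1678470406000),
    ("video_games_data_02.csv", 1678469932000),
]

def to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

for name, ms in listing:
    print(name, to_utc(ms).isoformat())

# The file with the largest timestamp is the most recently modified one
latest = max(listing, key=lambda item: item[1])[0]
```

Comparing these timestamps between runs is a quick way to confirm that the container really is being refreshed.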


Auto Loader is used through the spark.readStream command with the "cloudFiles" format, for both streaming and batch data.

In [0]:
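# "cloudFiles" is the Auto Loader source; it incrementally picks up new files from the path.
# schemaLocation is where the inferred schema is stored between runs;
# schemaHints overrides the inferred type for the listed columns.
video_games_data = spark.readStream.format("cloudFiles") \
                        .option("cloudFiles.format", "csv") \
                        .option("inferSchema", "true") \
                        .option("cloudFiles.schemaLocation", "dbfs:/FileStore/schema/video_games_schema") \
                        .option("cloudFiles.schemaHints", "Year_of_Release int") \
                        .load("dbfs:/mnt/blobcontainer1/databricks_files/*")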

In [0]:
# filter stream data if needed
racing_games_data = video_games_data.filter('Genre = "Racing"')

In [0]:
# write the stream data as a batch into a Delta table, using a one-time trigger;
# the checkpoint records which input files have already been processed
racing_games_data.writeStream\
                 .format("delta")\
                 .outputMode("append")\
                 .option("checkpointLocation", "/delta/events/_checkpoints/racing")\
                 .trigger(once=True) \
                 .start("/delta/racing")

In [0]:
%sql

SELECT * FROM delta.`/delta/racing`

Name,Platform,Year_of_Release,Genre,Publisher,_rescued_data
Mario Kart Wii,Wii,2008,Racing,Nintendo,
Mario Kart DS,DS,2005,Racing,Nintendo,
Gran Turismo 3: A-Spec,PS2,2001,Racing,Sony Computer Entertainment,
Mario Kart 7,3DS,2011,Racing,Nintendo,
Gran Turismo 4,PS2,2004,Racing,Sony Computer Entertainment,
Gran Turismo,PS,1997,Racing,Sony Computer Entertainment,
Gran Turismo 5,PS3,2010,Racing,Sony Computer Entertainment,
Mario Kart 64,N64,1996,Racing,Nintendo,
Gran Turismo 2,PS,1999,Racing,Sony Computer Entertainment,
Super Mario Kart,SNES,1992,Racing,Nintendo,
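
The `_rescued_data` column in the result above is empty because every value parsed cleanly. Auto Loader uses that column to keep data that could not be parsed into the schema, for example because of a type mismatch. The sketch below is plain Python and only mimics the idea under simplifying assumptions: `apply_hints` is a hypothetical helper, not a Databricks API, and the "Some Game" row is made up to show a value that fails the `int` hint.

```python
import json

# Hypothetical helper, not part of any Databricks API: it only mimics the idea
# of casting hinted columns and "rescuing" values that fail to parse.
def apply_hints(row, hints):
    casters = {"int": int, "double": float, "string": str}
    parsed, rescued = {}, {}
    for column, value in row.items():
        cast = casters[hints.get(column, "string")]  # unhinted columns stay strings
        try:
            parsed[column] = cast(value)
        except ValueError:
            parsed[column] = None      # the typed column becomes null
            rescued[column] = value    # the raw value is kept instead of lost
    parsed["_rescued_data"] = json.dumps(rescued) if rescued else None
    return parsed

hints = {"Year_of_Release": "int"}  # same hint as in the readStream options

# A row taken from the table above, and a made-up row with an unparsable year
good = apply_hints({"Name": "Mario Kart Wii", "Year_of_Release": "2008"}, hints)
bad = apply_hints({"Name": "Some Game", "Year_of_Release": "N/A"}, hints)

print(good)
print(bad)
```

In this sketch the well-formed row gets an integer year and an empty `_rescued_data`, while the made-up row keeps its raw year in `_rescued_data` rather than dropping it.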


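To load new batch data, all we need to do is schedule this notebook to run at periodic intervals. Because the stream uses a checkpoint, each run picks up only the files that arrived since the previous run.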
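
Scheduled runs with a one-time trigger stay incremental because the checkpoint remembers what was already ingested. The following is a simplified plain-Python sketch of that idea, not Auto Loader's actual implementation: `run_once`, the JSON checkpoint file, and the third file name `video_games_data_03.csv` are all assumptions made for illustration.

```python
import json
import os
import tempfile

def run_once(source_dir, checkpoint_path):
    """Process only files not recorded in the checkpoint, then update it."""
    seen = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            seen = set(json.load(f))
    new_files = sorted(set(os.listdir(source_dir)) - seen)
    # ... a real job would read the new files and append them to the table here ...
    with open(checkpoint_path, "w") as f:
        json.dump(sorted(seen | set(new_files)), f)
    return new_files

# Simulate three scheduled runs, with a new file arriving after the first one
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source")
    os.mkdir(source)
    checkpoint = os.path.join(tmp, "checkpoint.json")
    for name in ("video_games_data_01.csv", "video_games_data_02.csv"):
        open(os.path.join(source, name), "w").close()
    first_run = run_once(source, checkpoint)    # picks up both existing files
    open(os.path.join(source, "video_games_data_03.csv"), "w").close()
    second_run = run_once(source, checkpoint)   # picks up only the new file
    third_run = run_once(source, checkpoint)    # nothing new to process

print(first_run, second_run, third_run)
```

The same reasoning explains why the checkpoint location should stay fixed across scheduled runs: pointing a run at a fresh checkpoint would make it treat every file as new.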