#Stream Customer Data from Cloud Files to Delta Lake Using Auto Loader
1. Read files from cloud storage with the DataStreamReader API, using Auto Loader
2. Transform the DataFrame to add the following columns:
    i. file path: the cloud file path of the source file
    ii. ingestion date: the current timestamp
3. Write the transformed data stream to a Delta Lake table

###1. Read files using Auto Loader

In [0]:
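customers_df = (
    spark.readStream
         .format("cloudFiles")  # Auto Loader source
         .option("cloudFiles.format", "json")  # the source files are JSON
         # Where Auto Loader stores the inferred schema
         .option("cloudFiles.schemaLocation", "/Volumes/shoppix/landing/operational_data/customer_stream/_schema")
         # Infer column types instead of reading every JSON field as a string
         .option("cloudFiles.inferColumnTypes", "true")
         # Force specific types for the date and timestamp columns
         .option("cloudFiles.schemaHints", "date_of_birth DATE, member_since DATE, created_timestamp TIMESTAMP")
         .load("/Volumes/shoppix/landing/operational_data/customer_autoloader/")
)
customers_df.printSchema()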
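
The `cloudFiles.schemaHints` value above is a comma-separated list of `column TYPE` pairs. As the list grows, it can be easier to keep the hints in a dictionary and build the string from it. A minimal sketch, assuming a hypothetical helper named `build_schema_hints` (not part of Auto Loader or PySpark):

```python
def build_schema_hints(hints):
    """Join {column: SQL type} pairs into a 'col TYPE, col TYPE' string."""
    return ", ".join(f"{column} {sql_type}" for column, sql_type in hints.items())


# Same hints as in the cell above, kept as a dictionary
customer_hints = {
    "date_of_birth": "DATE",
    "member_since": "DATE",
    "created_timestamp": "TIMESTAMP",
}

schema_hints = build_schema_hints(customer_hints)
print(schema_hints)
# prints: date_of_birth DATE, member_since DATE, created_timestamp TIMESTAMP
```

The resulting string could then be passed as `.option("cloudFiles.schemaHints", schema_hints)`.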

###2. Transform the DataFrame to add the following columns
- file path: cloud file path
- ingestion date: current timestamp

In [0]:
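from pyspark.sql.functions import current_timestamp, col

customers_transformed_df = (
    customers_df
        # _metadata is a hidden column exposing details of each source file
        .withColumn("file_path", col("_metadata.file_path"))
        # Record when the row was ingested
        .withColumn("ingestiondate", current_timestamp())
)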
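
Before writing the stream, it may be worth confirming that both audit columns are present. A minimal sketch, assuming a hypothetical helper named `missing_columns`; the sample column list, including `customer_id`, is illustrative only and not taken from the real schema:

```python
def missing_columns(actual_columns, expected_columns):
    """Return the expected column names that are absent, in their given order."""
    present = set(actual_columns)
    return [name for name in expected_columns if name not in present]


# Illustrative column list; in the notebook, pass customers_transformed_df.columns
example_columns = ["customer_id", "file_path", "ingestiondate"]
print(missing_columns(example_columns, ["file_path", "ingestiondate"]))
# prints: []
```

An empty list means every expected column was found.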

###3. Write the transformed data to Delta Lake

In [0]:
streaming_Query = (
    customers_transformed_df.writeStream
        .format("delta")
        # The checkpoint tracks which files have already been processed
        .option("checkpointLocation", "/Volumes/shoppix/landing/operational_data/customer_autoloader/_checkpoint_stream")
        # Process all files available now, then stop the query
        .trigger(availableNow=True)
        # Start the stream and write to the bronze table
        .toTable("shoppix.bronze.customer_autoloader")
)
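
`toTable` starts the query and returns a `StreamingQuery` handle without waiting for the data to be written, so a cell that reads the table straight afterwards may run before the load has finished. A minimal sketch of a blocking helper, assuming a hypothetical function named `wait_for_stream`; the only call it makes on the query is `StreamingQuery.awaitTermination`:

```python
def wait_for_stream(query, timeout_seconds=None):
    """Block until the streaming query stops.

    Returns True if the query stopped, False if the timeout expired first.
    """
    if timeout_seconds is None:
        # No timeout: awaitTermination() blocks until the query stops
        query.awaitTermination()
        return True
    # With a timeout, awaitTermination reports whether the query stopped in time
    return bool(query.awaitTermination(timeout_seconds))


# Assumed usage in the notebook:
# wait_for_stream(streaming_Query)
```

With `trigger(availableNow=True)` the query stops on its own once the available files are processed, so the call without a timeout is expected to return.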

In [0]:
%sql
SELECT * FROM shoppix.bronze.customer_autoloader
