## Explain query
### Display the detailed Spark execution plan to verify filter pushdown and partition pruning for purchase events

In [0]:
spark.sql("SELECT * FROM ecommerce.silver.events WHERE event_type='purchase'").explain(True)
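
The extended plan is long, so it can help to pull out only the lines that show pushdown and pruning. The following is a minimal sketch, not part of the original notebook. The helper names `capture_plan` and `plan_lines_with` are hypothetical, and it assumes that `explain()` writes through Python's standard output and that the scan node labels its filters `PushedFilters` and `PartitionFilters` (the exact plan text can vary between Spark versions).

```python
import contextlib
import io


def capture_plan(df, extended=True):
    """Return the text that df.explain() prints, instead of printing it."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.explain(extended)
    return buffer.getvalue()


def plan_lines_with(plan_text, markers=("PushedFilters", "PartitionFilters")):
    """Return the stripped plan lines that mention any of the given markers."""
    return [
        line.strip()
        for line in plan_text.splitlines()
        if any(marker in line for marker in markers)
    ]


# Usage in the notebook (assumes the `spark` session and the table above):
# df = spark.sql("SELECT * FROM ecommerce.silver.events WHERE event_type='purchase'")
# for line in plan_lines_with(capture_plan(df)):
#     print(line)
```

An empty result does not prove that pushdown is absent; it may only mean the plan uses different labels, so the full `explain(True)` output remains the reference.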

### Create a partitioned Delta Silver table to improve query performance via partition pruning on event_date and event_type

In [0]:
# Partitioned table
spark.sql("""
  CREATE TABLE ecommerce.silver.events_part
  USING DELTA
  PARTITIONED BY (event_date, event_type)
  AS SELECT * FROM ecommerce.silver.events
""")
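
Partitioning on two columns can produce many small files when there are many `event_date` and `event_type` combinations, so it is worth checking the resulting layout. The sketch below is an illustration only: `partition_summary` is a hypothetical helper, and it assumes the table is a Delta table so that `DESCRIBE DETAIL` returns the `partitionColumns`, `numFiles`, and `sizeInBytes` fields.

```python
def partition_summary(spark, table_name):
    """Summarize a Delta table's layout using DESCRIBE DETAIL."""
    detail = spark.sql(f"DESCRIBE DETAIL {table_name}").first()
    num_files = detail["numFiles"]
    size_bytes = detail["sizeInBytes"]
    return {
        "partition_columns": list(detail["partitionColumns"]),
        "num_files": num_files,
        # Average file size in MiB; 0.0 for an empty table
        "avg_file_mb": (size_bytes / num_files / 1024**2) if num_files else 0.0,
    }


# Usage in the notebook (assumes the `spark` session and the table above):
# print(partition_summary(spark, "ecommerce.silver.events_part"))
```

A very small average file size would suggest that the partitioning is too fine-grained for the data volume.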

### Optimize the Delta table with Z-ordering on user_id and product_id to improve query performance through data skipping

In [0]:
%sql
OPTIMIZE ecommerce.silver.events_part ZORDER BY (user_id, product_id)
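
The same command can be issued from a Python cell, which makes it easier to parameterize and to inspect the returned metrics. This is a sketch under assumptions: `optimize_zorder` is a hypothetical helper, and the guard reflects that Delta does not accept partition columns in `ZORDER BY`, which is why `event_date` and `event_type` are kept out of the column list.

```python
def optimize_zorder(spark, table_name, columns, partition_columns=()):
    """Run OPTIMIZE ... ZORDER BY on a Delta table and return the result DataFrame."""
    if not columns:
        raise ValueError("ZORDER BY needs at least one column")
    overlap = sorted(set(columns) & set(partition_columns))
    if overlap:
        # Partition columns are already separated by directory, so they
        # cannot also be used as Z-order columns.
        raise ValueError(f"Cannot Z-order on partition columns: {overlap}")
    statement = f"OPTIMIZE {table_name} ZORDER BY ({', '.join(columns)})"
    return spark.sql(statement)


# Usage in the notebook (assumes the `spark` session and the table above):
# optimize_zorder(
#     spark,
#     "ecommerce.silver.events_part",
#     ["user_id", "product_id"],
#     partition_columns=["event_date", "event_type"],
# ).show(truncate=False)
```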

### Benchmark query performance by measuring execution time for filtering on user_id

In [0]:
import time

# perf_counter() is a monotonic clock intended for measuring elapsed time
start = time.perf_counter()
spark.sql("SELECT * FROM ecommerce.silver.events WHERE user_id=12345").count()
print(f"Time: {time.perf_counter() - start:.2f}s")
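
A single timed run is noisy, and the cell above measures only the original table. One way to compare it against the partitioned, Z-ordered table is sketched below. The helpers `time_action` and `compare_tables` are hypothetical, and the median of a few runs is an assumption about what is a reasonable summary, not a rigorous benchmark. Repeated runs may also be served from caches, so later runs can look faster than the first.

```python
import statistics
import time


def time_action(action, repeats=3):
    """Run a zero-argument callable several times; return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def compare_tables(spark, tables, predicate, repeats=3):
    """Return {table: median seconds} for a filtered count on each table."""
    results = {}
    for table in tables:
        query = f"SELECT * FROM {table} WHERE {predicate}"
        # Bind the query as a default argument so each lambda keeps its own value
        durations = time_action(lambda q=query: spark.sql(q).count(), repeats)
        results[table] = statistics.median(durations)
    return results


# Usage in the notebook (assumes the `spark` session and both tables above):
# timings = compare_tables(
#     spark,
#     ["ecommerce.silver.events", "ecommerce.silver.events_part"],
#     "user_id=12345",
# )
# for table, seconds in timings.items():
#     print(f"{table}: {seconds:.2f}s")
```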

### Cache the Silver table in memory to speed up repeated or iterative analytical queries

In [0]:

# Cache for iterative queries
cached = spark.table("ecommerce.silver.events").cache()
cached.count()  # cache() is lazy; this action materializes it
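
Cached data stays in memory until it is released, so it is worth unpersisting once the iterative work is finished. The sketch below wraps that lifecycle in a context manager. The name `cached_table` is hypothetical, and it assumes a `count()` is an acceptable way to materialize the cache for this table.

```python
from contextlib import contextmanager


@contextmanager
def cached_table(spark, table_name):
    """Cache a table for the duration of a `with` block, then release it."""
    df = spark.table(table_name).cache()
    try:
        df.count()  # cache() is lazy; an action materializes it
        yield df
    finally:
        df.unpersist()  # free the cached blocks even if the block raised


# Usage in the notebook (assumes the `spark` session and the table above):
# with cached_table(spark, "ecommerce.silver.events") as events:
#     events.filter("event_type = 'purchase'").count()
#     events.groupBy("event_type").count().show()
```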