### Day 08 | Databricks 14 Days AI Challenge  
## Unity Catalog – Data Governance

## About Day 08
Unity Catalog is Databricks’ centralized governance layer used to manage **data access, security, and metadata** across the platform.

## What this notebook covers
- Combining multiple CSV files
- Creating a governed **Delta table**
- Applying **schema- and table-level permissions** inside a dedicated catalog
- Creating a **secure view** for controlled access

## Goal
To understand how **Unity Catalog enables secure, scalable, and enterprise-grade data governance** in Databricks.


### Step 1: Create a Catalog

In [0]:
%sql
CREATE CATALOG IF NOT EXISTS day08_catalog;


In [0]:
%sql
SHOW CATALOGS;


catalog
day08_catalog
samples
system
workspace


### Step 2: Load & Rename CSV Datasets

In [0]:
# Load October CSV data
oct_events = spark.read.csv(
    "/Volumes/workspace/eccomerce/ecommerce_data/2019-Oct.csv",
    header=True,
    inferSchema=True
)

# Load November CSV data
nov_events = spark.read.csv(
    "/Volumes/workspace/eccomerce/ecommerce_data/2019-Nov.csv",
    header=True,
    inferSchema=True
)
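`inferSchema=True` requires an extra pass over each file, and the two months are inferred independently of each other. As a possible alternative, the sketch below builds a DDL schema string that could be passed to both reads. The column names come from the CSV header shown in this notebook; the column types, the `to_ddl` helper, and the `EVENT_SCHEMA_DDL` name are assumptions made for illustration.

```python
from __future__ import annotations

# Column names match the CSV header; the types are assumptions based on the
# sample rows in this notebook, not verified against the full files.
EVENT_COLUMNS = {
    "event_time": "TIMESTAMP",
    "event_type": "STRING",
    "product_id": "BIGINT",
    "category_id": "BIGINT",
    "category_code": "STRING",
    "brand": "STRING",
    "price": "DOUBLE",
    "user_id": "BIGINT",
    "user_session": "STRING",
}


def to_ddl(columns: dict[str, str]) -> str:
    """Render a {column: type} mapping as a Spark DDL schema string."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns.items())


EVENT_SCHEMA_DDL = to_ddl(EVENT_COLUMNS)
print(EVENT_SCHEMA_DDL)
```

If the types fit the data, the string can replace `inferSchema=True`, for example `spark.read.csv(path, header=True, schema=EVENT_SCHEMA_DDL)`. The raw timestamp format in the files is not shown here, so `event_time` may need a `timestampFormat` option, or may be safer to read as `STRING` first.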


In [0]:
# Bind the October DataFrame to a shorter name (no data is copied)
df_oct = oct_events

# Bind the November DataFrame to a shorter name (no data is copied)
df_nov = nov_events


In [0]:
df_oct.limit(5).display()



event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2019-10-01T00:00:00.000Z,view,44600062,2103807459595387724,,shiseido,35.79,541312140,72d76fde-8bb3-4e00-8c23-a032dfed738c
2019-10-01T00:00:00.000Z,view,3900821,2053013552326770905,appliances.environment.water_heater,aqua,33.2,554748717,9333dfbd-b87a-4708-9857-6336556b0fcc
2019-10-01T00:00:01.000Z,view,17200506,2053013559792632471,furniture.living_room.sofa,,543.1,519107250,566511c2-e2e3-422b-b695-cf8e6e792ca8
2019-10-01T00:00:01.000Z,view,1307067,2053013558920217191,computers.notebook,lenovo,251.74,550050854,7c90fc70-0e80-4590-96f3-13c02c18c713
2019-10-01T00:00:04.000Z,view,1004237,2053013555631882655,electronics.smartphone,apple,1081.98,535871217,c6bd7419-2748-4c56-95b4-8cec9ff8b80d


In [0]:
df_nov.limit(5).display()

event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2019-11-01T00:00:00.000Z,view,1003461,2053013555631882655,electronics.smartphone,xiaomi,489.07,520088904,4d3b30da-a5e4-49df-b1a8-ba5943f1dd33
2019-11-01T00:00:00.000Z,view,5000088,2053013566100866035,appliances.sewing_machine,janome,293.65,530496790,8e5f4f83-366c-4f70-860e-ca7417414283
2019-11-01T00:00:01.000Z,view,17302664,2053013553853497655,,creed,28.31,561587266,755422e7-9040-477b-9bd2-6a6e8fd97387
2019-11-01T00:00:01.000Z,view,3601530,2053013563810775923,appliances.kitchen.washer,lg,712.87,518085591,3bfb58cd-7892-48cc-8020-2f17e6de6e7f
2019-11-01T00:00:01.000Z,view,1004775,2053013555631882655,electronics.smartphone,xiaomi,183.27,558856683,313628f1-68b8-460d-84f6-cec7a8796ef2


### Step 3: Combine the CSV Data

In [0]:
# Stack the two months; unionByName matches columns by name, not by position
combined_df = df_oct.unionByName(df_nov)
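Because `unionByName` matches columns by name, a column that is missing or typed differently in one month is worth catching before the union. The sketch below is one way to compare two `(column, type)` lists such as those returned by `DataFrame.dtypes`. The `schema_diff` helper and the sample lists are hypothetical and only stand in for `df_oct.dtypes` and `df_nov.dtypes`.

```python
def schema_diff(left, right):
    """Compare two (column, type) lists, e.g. the output of DataFrame.dtypes.

    Returns the columns found only on the left, the columns found only on
    the right, and the columns present on both sides whose types differ.
    """
    left_types, right_types = dict(left), dict(right)
    only_left = sorted(set(left_types) - set(right_types))
    only_right = sorted(set(right_types) - set(left_types))
    mismatched = {
        name: (left_types[name], right_types[name])
        for name in left_types
        if name in right_types and left_types[name] != right_types[name]
    }
    return only_left, only_right, mismatched


# Hypothetical dtypes lists; column order differs on purpose.
oct_dtypes = [("event_time", "timestamp"), ("price", "double"), ("user_id", "int")]
nov_dtypes = [("price", "double"), ("event_time", "timestamp"), ("user_id", "bigint")]

print(schema_diff(oct_dtypes, nov_dtypes))
# ([], [], {'user_id': ('int', 'bigint')})
```

In the notebook this could be called as `schema_diff(df_oct.dtypes, df_nov.dtypes)`; three empty results would mean both months agree on column names and types.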

In [0]:
combined_df.limit(5).display()


event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2019-10-01T00:00:00.000Z,view,44600062,2103807459595387724,,shiseido,35.79,541312140,72d76fde-8bb3-4e00-8c23-a032dfed738c
2019-10-01T00:00:00.000Z,view,3900821,2053013552326770905,appliances.environment.water_heater,aqua,33.2,554748717,9333dfbd-b87a-4708-9857-6336556b0fcc
2019-10-01T00:00:01.000Z,view,17200506,2053013559792632471,furniture.living_room.sofa,,543.1,519107250,566511c2-e2e3-422b-b695-cf8e6e792ca8
2019-10-01T00:00:01.000Z,view,1307067,2053013558920217191,computers.notebook,lenovo,251.74,550050854,7c90fc70-0e80-4590-96f3-13c02c18c713
2019-10-01T00:00:04.000Z,view,1004237,2053013555631882655,electronics.smartphone,apple,1081.98,535871217,c6bd7419-2748-4c56-95b4-8cec9ff8b80d


### Step 4: Save the Combined Data as a Delta Table in Unity Catalog

In [0]:
%sql
CREATE SCHEMA IF NOT EXISTS day08_catalog.sales_schema;

In [0]:
# Write the combined data as a Delta table, replacing any existing table
(
    combined_df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("day08_catalog.sales_schema.ecommerce_data")
)
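The name passed to `saveAsTable` follows Unity Catalog's three-level namespace, `catalog.schema.table`. Since the same name is repeated in later steps, one option is to build it once. The sketch below is a minimal illustration; `quote_identifier`, `full_name`, and `TABLE_NAME` are hypothetical names, and the backtick quoting assumes the usual Spark SQL identifier rules.

```python
def quote_identifier(part: str) -> str:
    """Wrap one identifier in backticks, doubling any embedded backtick."""
    return "`" + part.replace("`", "``") + "`"


def full_name(catalog: str, schema: str, obj: str) -> str:
    """Build a three-level name of the form catalog.schema.object."""
    return ".".join(quote_identifier(p) for p in (catalog, schema, obj))


TABLE_NAME = full_name("day08_catalog", "sales_schema", "ecommerce_data")
print(TABLE_NAME)
# `day08_catalog`.`sales_schema`.`ecommerce_data`
```

The resulting string could then be reused wherever the table is referenced, for example in `saveAsTable(TABLE_NAME)` or inside an f-string passed to `spark.sql`.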


In [0]:
%sql
SELECT * FROM day08_catalog.sales_schema.ecommerce_data LIMIT 5;


event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2019-10-01T00:00:00.000Z,view,44600062,2103807459595387724,,shiseido,35.79,541312140,72d76fde-8bb3-4e00-8c23-a032dfed738c
2019-10-01T00:00:00.000Z,view,3900821,2053013552326770905,appliances.environment.water_heater,aqua,33.2,554748717,9333dfbd-b87a-4708-9857-6336556b0fcc
2019-10-01T00:00:01.000Z,view,17200506,2053013559792632471,furniture.living_room.sofa,,543.1,519107250,566511c2-e2e3-422b-b695-cf8e6e792ca8
2019-10-01T00:00:01.000Z,view,1307067,2053013558920217191,computers.notebook,lenovo,251.74,550050854,7c90fc70-0e80-4590-96f3-13c02c18c713
2019-10-01T00:00:04.000Z,view,1004237,2053013555631882655,electronics.smartphone,apple,1081.98,535871217,c6bd7419-2748-4c56-95b4-8cec9ff8b80d


### Step 5: Grant Basic Permissions using Unity Catalog

In [0]:
%sql
GRANT SELECT ON SCHEMA day08_catalog.sales_schema TO `account users`;

In [0]:
%sql
GRANT SELECT ON TABLE day08_catalog.sales_schema.ecommerce_data TO `account users`;

In [0]:
%sql
SHOW GRANTS ON TABLE day08_catalog.sales_schema.ecommerce_data;


Principal,ActionType,ObjectType,ObjectKey
account users,SELECT,TABLE,day08_catalog.sales_schema.ecommerce_data
account users,SELECT,SCHEMA,day08_catalog.sales_schema
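In Unity Catalog, `SELECT` on a table is not sufficient by itself: the principal also needs `USE CATALOG` on the parent catalog and `USE SCHEMA` on the parent schema before a query can reach the table. The grants above do not include those two privileges, so the sketch below generates the full set for one table. The `read_access_grants` helper is hypothetical, and whether `account users` already holds the usage privileges in a given workspace is not something this notebook shows.

```python
from __future__ import annotations


def read_access_grants(catalog: str, schema: str, table: str, principal: str) -> list[str]:
    """Return the GRANT statements a principal needs to read one table."""
    who = f"`{principal}`"  # backticks allow principals that contain spaces
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {who}",
        f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {who}",
    ]


statements = read_access_grants(
    "day08_catalog", "sales_schema", "ecommerce_data", "account users"
)
for stmt in statements:
    print(stmt)
```

In a notebook cell, each statement could be executed with `spark.sql(stmt)`, and `SHOW GRANTS` on the catalog and schema would confirm the result.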


### Step 6: Create a Secure View (Controlled Access)

In [0]:
%sql
CREATE OR REPLACE VIEW day08_catalog.sales_schema.ecommerce_view AS
SELECT
  event_time,
  event_type,
  product_id,
  price,
  user_id
FROM day08_catalog.sales_schema.ecommerce_data;


In [0]:
%sql
SELECT * 
FROM day08_catalog.sales_schema.ecommerce_view
LIMIT 5;


event_time,event_type,product_id,price,user_id
2019-10-01T00:00:00.000Z,view,44600062,35.79,541312140
2019-10-01T00:00:00.000Z,view,3900821,33.2,554748717
2019-10-01T00:00:01.000Z,view,17200506,543.1,519107250
2019-10-01T00:00:01.000Z,view,1307067,251.74,550050854
2019-10-01T00:00:04.000Z,view,1004237,1081.98,535871217
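The view hides `category_id`, `category_code`, `brand`, and `user_session`, but it only controls access if readers cannot query the base table directly. With the Step 5 grants in place, `account users` can still select every column from `ecommerce_data`. The sketch below generates statements that would move read access from the table to the view. The `view_only_access` helper is hypothetical, and using the `TABLE` keyword for a view in `GRANT` is an assumption based on Databricks treating views as table-like securables.

```python
from __future__ import annotations


def view_only_access(
    catalog: str, schema: str, table: str, view: str, principal: str
) -> list[str]:
    """Return statements that replace table-level read access with view-level access."""
    who = f"`{principal}`"
    schema_name = f"{catalog}.{schema}"
    return [
        # Remove the broad grants so the base table is no longer readable.
        f"REVOKE SELECT ON SCHEMA {schema_name} FROM {who}",
        f"REVOKE SELECT ON TABLE {schema_name}.{table} FROM {who}",
        # Assumption: the TABLE keyword is accepted for a view here.
        f"GRANT SELECT ON TABLE {schema_name}.{view} TO {who}",
    ]


statements = view_only_access(
    "day08_catalog", "sales_schema", "ecommerce_data", "ecommerce_view", "account users"
)
for stmt in statements:
    print(stmt)
```

After running these with `spark.sql(stmt)`, readers would still need `USE CATALOG` and `USE SCHEMA` on the parents, and the view owner must keep `SELECT` on the underlying table for the view to resolve.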


## ✅ Conclusion – Day 08

In this notebook, we used **Unity Catalog** to govern data loaded from multiple CSV files.  
We combined the datasets, stored them as a **Delta table** in a dedicated catalog and schema, applied **schema- and table-level permissions**, and exposed a reduced set of columns through a **secure view**.

This exercise demonstrates how Unity Catalog enables **centralized, secure, and scalable data governance**, which is essential for real-world and enterprise data platforms.
