## Data Architecture in Training

This training uses **Unity Catalog Volumes for data storage**:

- **Source**: Unity Catalog Volumes (`/Volumes/ecommerce_platform_<user>/default/datasets`)
- **Variable**: `DATASET_BASE_PATH`
- **Purpose**: Demonstrates advanced Unity Catalog features (Lakeflow, governance)
- **Example**: `spark.read.csv("/Volumes/ecommerce_platform_trainer/default/datasets/customers/customers.csv")`

> **Note (2025)**: We use Unity Catalog Volumes instead of DBFS for better governance, security, and lineage tracking.
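Paths inside a volume follow the pattern `/Volumes/<catalog>/<schema>/<volume>/<path>`. The sketch below shows one way to assemble such paths from their components; `volume_path` is a hypothetical helper and is not part of the training notebooks.

```python
from posixpath import join as path_join


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a /Volumes/... path from its components (hypothetical helper)."""
    # posixpath.join always uses forward slashes, regardless of the local OS.
    return path_join("/Volumes", catalog, schema, volume, *parts)


# Same layout as the example above, using the trainer catalog.
customers_csv = volume_path(
    "ecommerce_platform_trainer", "default", "datasets", "customers", "customers.csv"
)
print(customers_csv)
```

In a notebook the result could be passed to `spark.read.csv(customers_csv, header=True)`; whether the file actually has a header row is an assumption here.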

---

In [0]:
# Get the current user and derive a per-user slug used for catalog isolation
current_user_email = spark.sql("SELECT current_user()").collect()[0][0]
username = current_user_email.split("@")[0].replace(".", "_").replace("-", "_")

# Trainer detection (same as Demo notebooks)
if "trainer" in username or "krzysztof_burejza" in username:
    effective_user = "trainer"
else:
    effective_user = username

# Configuration - SAME as Demo notebooks for consistency
CATALOG = f"ecommerce_platform_{effective_user}"  # Normally created by the instructor; created below if missing
BRONZE_SCHEMA = "bronze"
SILVER_SCHEMA = "silver"
GOLD_SCHEMA = "gold"
ISOLATION_MODE = "Catalog"

print(f"✓ User slug: {effective_user}")
print(f"✓ Isolation mode: {ISOLATION_MODE}")
print(f"✓ Schemas: {BRONZE_SCHEMA}, {SILVER_SCHEMA}, {GOLD_SCHEMA}")
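The slug logic above only replaces `.` and `-`, so an address containing another character (for example `+`) would produce a catalog name that is not a plain identifier. A minimal sketch of a stricter variant follows; `user_slug` and `trainer_markers` are hypothetical names, and replacing every non-alphanumeric character is an assumption about the desired behaviour, not the notebook's own rule.

```python
import re


def user_slug(email: str, trainer_markers: tuple = ("trainer",)) -> str:
    """Turn an e-mail address into an identifier-safe slug (hypothetical helper)."""
    local_part = email.split("@")[0]
    # Assumption: anything other than letters, digits, and underscores becomes "_".
    slug = re.sub(r"[^0-9A-Za-z_]", "_", local_part)
    # Mirror the notebook's trainer detection: any marker match maps to "trainer".
    if any(marker in slug for marker in trainer_markers):
        return "trainer"
    return slug


print(user_slug("jane.doe-smith@example.com"))
print(user_slug("trainer.one@example.com"))
print(user_slug("a+b@example.com"))
```

Keeping the trainer markers as a parameter avoids hard-coding personal names in the detection logic.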


In [0]:
LOCATION = f"abfss://unity-catalog-storage@dbstoragejvahleou7jwnq.dfs.core.windows.net/695459409840976/{CATALOG}"
print(LOCATION)
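The location above is specific to one workspace. ADLS Gen2 URIs have the general form `abfss://<container>@<account>.dfs.core.windows.net/<path>`, so the value could be assembled from parts as sketched below. `managed_location` is a hypothetical helper, and the account name and path segments in the example are placeholders, not real resources.

```python
def managed_location(container: str, account: str, *path_parts: str) -> str:
    """Assemble an abfss:// URI from its parts (hypothetical helper)."""
    # Strip stray slashes so that parts join with exactly one "/" between them.
    path = "/".join(part.strip("/") for part in path_parts if part)
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"


# Placeholder values for illustration only.
example_location = managed_location(
    "unity-catalog-storage", "examplestorageaccount", "catalogs", "ecommerce_platform_trainer"
)
print(example_location)
```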

In [0]:

spark.sql(f"CREATE CATALOG IF NOT EXISTS {CATALOG} MANAGED LOCATION '{LOCATION}'")

In [0]:
spark.sql(f"CREATE VOLUME IF NOT EXISTS {CATALOG}.default.datasets")
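Because the catalog name is derived from user input and interpolated into SQL, it may be worth quoting identifiers. Spark SQL accepts backtick-quoted identifiers, with a literal backtick escaped by doubling it. The following is a minimal sketch under that rule; `quote_ident` and `create_volume_sql` are hypothetical helpers, not part of the notebook.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote a single identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def create_volume_sql(catalog: str, schema: str = "default", volume: str = "datasets") -> str:
    """Build a CREATE VOLUME statement with every name part quoted."""
    qualified = ".".join(quote_ident(part) for part in (catalog, schema, volume))
    return f"CREATE VOLUME IF NOT EXISTS {qualified}"


stmt = create_volume_sql("ecommerce_platform_trainer")
print(stmt)
```

The resulting string could then be passed to `spark.sql(stmt)` in place of the unquoted f-string.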

In [0]:
# === Catalog and Schema Configuration ===
# The current user was already resolved in the first cell (see effective_user).

# Base path of the Unity Catalog volume that holds the training datasets:
DATASET_BASE_PATH = f"/Volumes/{CATALOG}/default/datasets"
print(f"Dataset base path: {DATASET_BASE_PATH}")

for s in [BRONZE_SCHEMA, SILVER_SCHEMA, GOLD_SCHEMA]:
    spark.sql(f'CREATE SCHEMA IF NOT EXISTS {CATALOG}.{s}')

# Ensure the `default` schema, which holds the `datasets` volume, exists
# (a new Unity Catalog catalog normally includes it already)
spark.sql(f"CREATE SCHEMA IF NOT EXISTS {CATALOG}.default")

display(
    spark.sql(f"DESCRIBE CATALOG EXTENDED {CATALOG}")
)

spark.sql(f'USE CATALOG {CATALOG}')
spark.sql(f'USE SCHEMA {BRONZE_SCHEMA}')

print(f"✓ User slug: {effective_user}")
print(f"✓ Isolation mode: {ISOLATION_MODE}")
print(f"✓ Working catalog: {CATALOG}")
print(f"✓ Schemas: {BRONZE_SCHEMA}, {SILVER_SCHEMA}, {GOLD_SCHEMA}")