# Workshop: Unity Catalog - Row-Level Security and Column Masking

**Training Objective:** Implement advanced data security mechanisms in Unity Catalog: Row-Level Security (RLS) and Dynamic Column Masking.

**Topics covered:**
- Row-Level Security (RLS) concepts and implementation
- Dynamic Column Masking for sensitive data protection
- User and group-based access control
- Unity Catalog native security features

**Duration:** 30 minutes

## Context and Requirements

- **Training Day**: Day 1 - Governance and Security
- **Notebook Type**: Workshop
- **Technical Requirements**:
  - Databricks Runtime 13.0+ (recommended: 14.3 LTS)
  - Unity Catalog enabled
  - Permissions: CREATE TABLE, CREATE VIEW, CREATE FUNCTION, SELECT, MODIFY
  - Cluster: Standard with minimum 2 workers

## Theoretical Introduction

### Row-Level Security (RLS)

Row-Level Security allows you to control which rows a user can see based on their identity or group membership.

| Method | Description | Use Case |
|--------|-------------|----------|
| `current_user()` | Returns current user's email | User-specific data access |
| `is_member('group')` | Checks workspace-level group membership | Role-based access control |
| `is_account_group_member('group')` | Checks account-level group membership | Cross-workspace access |

### Column Masking

Column Masking hides sensitive data from unauthorized users while keeping the column structure intact.

| Masking Type | Example | Result |
|--------------|---------|--------|
| Full masking | `'***MASKED***'` | `***MASKED***` |
| Partial masking | `CONCAT('***', RIGHT(email, 4))` | `***.com` |
| Hash masking | `SHA2(email, 256)` | Hash value |
| Null masking | `NULL` | NULL |

**Why is this important?**
Data security is critical in enterprise environments. RLS and Column Masking allow fine-grained access control without duplicating data or creating multiple tables for different user groups.

## Environment Initialization

Run the initialization script for per-user catalog and schema isolation:

In [0]:
%run ../00_setup

---

## Part 1: Preparing Sample Data

### Task 1.1: Create Base Tables with Sensitive Data

**Objective:** Create tables containing sensitive customer data that needs protection.

**Instructions:**
1. Create a customers table with PII (email, phone, address)
2. Add a `region` column for RLS demonstration
3. Add an `owner_email` column for user-specific filtering

**Hints:**
- Use `CREATE TABLE IF NOT EXISTS` for idempotency
- Include a variety of regions: 'EMEA', 'APAC', 'AMERICAS'
- Use realistic email patterns for owner assignments

In [0]:
# Create sample customers table with sensitive data
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {CATALOG}.{SCHEMA}.customers_sensitive (
        customer_id INT,
        customer_name STRING,
        email STRING,
        phone STRING,
        address STRING,
        city STRING,
        country STRING,
        region STRING,
        owner_email STRING,
        total_spent DECIMAL(10,2),
        customer_tier STRING
    )
    USING DELTA
""")

# Insert sample data with different regions and owners
spark.sql(f"""
    INSERT OVERWRITE {CATALOG}.{SCHEMA}.customers_sensitive VALUES
    (1, 'John Smith', 'john.smith@example.com', '+1-555-0101', '123 Main St', 'New York', 'USA', 'AMERICAS', 'analyst_us@company.com', 15000.00, 'Gold'),
    (2, 'Emma Johnson', 'emma.j@example.com', '+1-555-0102', '456 Oak Ave', 'Los Angeles', 'USA', 'AMERICAS', 'analyst_us@company.com', 8500.50, 'Silver'),
    (3, 'Hans Mueller', 'hans.m@example.de', '+49-30-12345', '789 Berlin Str', 'Berlin', 'Germany', 'EMEA', 'analyst_emea@company.com', 22000.00, 'Platinum'),
    (4, 'Marie Dupont', 'marie.d@example.fr', '+33-1-23456789', '10 Rue Paris', 'Paris', 'France', 'EMEA', 'analyst_emea@company.com', 12000.00, 'Gold'),
    (5, 'Yuki Tanaka', 'yuki.t@example.jp', '+81-3-1234-5678', '5-1 Shibuya', 'Tokyo', 'Japan', 'APAC', 'analyst_apac@company.com', 18500.00, 'Gold'),
    (6, 'Chen Wei', 'chen.w@example.cn', '+86-21-12345678', '100 Nanjing Rd', 'Shanghai', 'China', 'APAC', 'analyst_apac@company.com', 9000.00, 'Silver'),
    (7, 'Sarah Brown', 'sarah.b@example.co.uk', '+44-20-12345678', '50 London Bridge', 'London', 'UK', 'EMEA', 'analyst_emea@company.com', 31000.00, 'Platinum'),
    (8, 'Carlos Garcia', 'carlos.g@example.mx', '+52-55-12345678', '200 Reforma', 'Mexico City', 'Mexico', 'AMERICAS', 'analyst_us@company.com', 7500.00, 'Silver')
""")

print(f"Table {CATALOG}.{SCHEMA}.customers_sensitive created with sample data")
spark.table(f"{CATALOG}.{SCHEMA}.customers_sensitive").display()

### Task 1.2: Create Region-User Mapping Table

**Objective:** Create a mapping table that links users to their allowed regions.

**Instructions:**
1. Create a mapping table with user email and allowed region
2. This will be used for RLS lookups

**Hints:**
- Use the pattern `user_email -> region` for mapping
- Include `current_user()` for testing with your own user

In [0]:
# TODO: Create a region-user mapping table
# This table defines which users can access which regions

spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {CATALOG}.{SCHEMA}.user_region_access (
        user_email STRING,
        allowed_region STRING
    )
    USING DELTA
""")

# Get current user for testing
current_user = spark.sql("SELECT current_user()").collect()[0][0]
print(f"Current user: {current_user}")

# Insert mappings - include current user for testing
spark.sql(f"""
    INSERT OVERWRITE {CATALOG}.{SCHEMA}.user_region_access VALUES
    ('analyst_us@company.com', 'AMERICAS'),
    ('analyst_emea@company.com', 'EMEA'),
    ('analyst_apac@company.com', 'APAC'),
    ('admin@company.com', 'ALL'),
    ('{current_user}', '___')  -- TODO: Choose region for current user: 'EMEA', 'AMERICAS', or 'APAC'
""")

spark.table(f"{CATALOG}.{SCHEMA}.user_region_access").display()

---

## Part 2: Row-Level Security (RLS)

### Task 2.1: Simple RLS with current_user()

**Objective:** Create a view that shows only rows owned by the current user.

**Instructions:**
1. Create a view filtering by `owner_email = current_user()`
2. Test that users only see their own data
3. Verify the filter is applied

**Hints:**
- Use `CREATE OR REPLACE VIEW` for repeatability
- `current_user()` returns the email of the logged-in user
- Compare with the `owner_email` column

In [0]:
# TODO: Create a view with Row-Level Security based on current_user()
# Users should only see customers they own

spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_my_customers AS
    SELECT 
        customer_id,
        customer_name,
        email,
        city,
        country,
        region,
        total_spent,
        customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
    WHERE owner_email = ___  -- TODO: Use current_user() function
""")

print("View v_my_customers created with RLS")

# Test: Show what current user can see
print(f"\nData visible to {spark.sql('SELECT current_user()').collect()[0][0]}:")
spark.table(f"{CATALOG}.{SCHEMA}.v_my_customers").display()

### Task 2.2: RLS with Region Mapping Table

**Objective:** Create a view that filters data based on the current user's allowed regions, as listed in the mapping table.

**Instructions:**
1. Join customers table with user_region_access mapping
2. Filter where current user has access
3. Handle 'ALL' region for admins

**Hints:**
- Use `EXISTS` or `JOIN` with the mapping table
- Use `current_user()` to find user's allowed regions
- Handle special case where `allowed_region = 'ALL'`
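
The `EXISTS` pattern from the hints can be tried outside Databricks first. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Spark SQL. The `:user` bind parameter is an assumption that stands in for `current_user()`, and the trimmed tables, sample rows, and the helper name `visible_ids` are illustrative only.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Unity Catalog tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    CREATE TABLE user_region_access (user_email TEXT, allowed_region TEXT);
    INSERT INTO customers VALUES (1, 'AMERICAS'), (3, 'EMEA'), (5, 'APAC');
    INSERT INTO user_region_access VALUES
        ('analyst_emea@company.com', 'EMEA'),
        ('admin@company.com', 'ALL');
""")

def visible_ids(user_email):
    """Return customer IDs visible to user_email (hypothetical helper)."""
    rows = conn.execute(
        """
        SELECT c.customer_id
        FROM customers c
        WHERE EXISTS (
            SELECT 1
            FROM user_region_access a
            WHERE a.user_email = :user   -- stands in for current_user()
              AND (a.allowed_region = c.region OR a.allowed_region = 'ALL')
        )
        ORDER BY c.customer_id
        """,
        {"user": user_email},
    ).fetchall()
    return [r[0] for r in rows]

print(visible_ids("analyst_emea@company.com"))  # EMEA rows only
print(visible_ids("admin@company.com"))         # every row, via 'ALL'
print(visible_ids("unknown@company.com"))       # no mapping row, no data
```

A user without any row in the mapping table sees nothing, so the pattern is deny-by-default.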

In [0]:
# TODO: Create a view with RLS using the mapping table
# Users can only see customers from their allowed regions

spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_by_region AS
    SELECT 
        c.customer_id,
        c.customer_name,
        c.email,
        c.city,
        c.country,
        c.region,
        c.total_spent,
        c.customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive c
    WHERE EXISTS (
        SELECT 1 
        FROM {CATALOG}.{SCHEMA}.user_region_access a
        WHERE a.user_email = ___  -- TODO: Use current_user()
        AND (a.allowed_region = c.___ OR a.allowed_region = '___')  -- TODO: Match region or allow 'ALL'
    )
""")

print("View v_customers_by_region created with region-based RLS")

# Test the view
print(f"\nData visible to current user (by region):")
spark.table(f"{CATALOG}.{SCHEMA}.v_customers_by_region").display()

### Task 2.3: RLS with is_member() Group Check

**Objective:** Create a view that filters based on group membership.

**Instructions:**
1. Use `is_member('group_name')` to check group membership
2. Different groups see different data subsets
3. Admins group sees all data

**Hints:**
- `is_member('data_analysts')` returns TRUE/FALSE
- Use CASE WHEN or OR conditions for multiple groups
- Groups must exist in the workspace for `is_member()`, or in the account for `is_account_group_member()`
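
The OR-chain of group checks can be reasoned about as a plain predicate before writing the SQL. The following is a minimal Python sketch of that logic, not Databricks code: the `GROUP_REGIONS` dictionary and the `row_visible` function are hypothetical names, and a set of group names stands in for what `is_member()` would report for the current user.

```python
# Hypothetical mapping from regional analyst groups to the region they may read.
GROUP_REGIONS = {
    "analysts_emea": "EMEA",
    "analysts_americas": "AMERICAS",
    "analysts_apac": "APAC",
}

def row_visible(region, user_groups):
    """Mirror the OR-chain: admins see everything, analysts see their region."""
    if "admins" in user_groups:
        return True
    return any(
        group in user_groups and region == allowed
        for group, allowed in GROUP_REGIONS.items()
    )

regions = ["AMERICAS", "EMEA", "APAC"]
print([r for r in regions if row_visible(r, {"analysts_emea"})])
print([r for r in regions if row_visible(r, {"admins"})])
print([r for r in regions if row_visible(r, set())])
```

A user in no group matches none of the branches, which is why the view can come back empty.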

In [0]:
# TODO: Create a view with RLS based on group membership
# - 'admins' group: sees all data
# - 'analysts_emea' group: sees only EMEA region
# - 'analysts_americas' group: sees only AMERICAS region
# - 'analysts_apac' group: sees only APAC region

spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_by_group AS
    SELECT 
        customer_id,
        customer_name,
        email,
        city,
        country,
        region,
        total_spent,
        customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
    WHERE 
        is_member('___')  -- TODO: admins see all
        OR (is_member('analysts_emea') AND region = '___')  -- TODO: EMEA region
        OR (is_member('analysts_americas') AND region = '___')  -- TODO: AMERICAS region
        OR (is_member('analysts_apac') AND region = 'APAC')
        OR is_account_group_member('admins')  -- Account-level admin group
""")

print("View v_customers_by_group created with group-based RLS")

# Note: This view may return no rows if you are not a member of any of these groups
# In a real scenario, the groups would already be configured in the workspace or account
spark.table(f"{CATALOG}.{SCHEMA}.v_customers_by_group").display()

---

## Part 3: Dynamic Column Masking

### Task 3.1: Basic Column Masking with CASE WHEN

**Objective:** Create a view that masks sensitive columns for non-privileged users.

**Instructions:**
1. Mask `email` column - show only domain for non-admins
2. Mask `phone` column - show only last 4 digits
3. Full access for admins group

**Hints:**
- Use `CASE WHEN is_member('group') THEN ... ELSE ... END`
- `CONCAT('***', SUBSTRING(email, INSTR(email, '@'), 100))` for email domain
- `CONCAT('***-', RIGHT(phone, 4))` for partial phone
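
To check what the two masking expressions should produce, here is a small Python sketch that mirrors them. The function names `mask_email` and `mask_phone` are chosen for illustration, and the handling of a value without `@` is an assumption of this sketch; it is worth checking how the SQL expression behaves for such a value before relying on it.

```python
def mask_email(email):
    """Keep the domain, hide the local part: 'a@b.com' -> '***@b.com'."""
    at = email.find("@")
    if at == -1:
        # No '@' found: hide the whole value instead of guessing a domain.
        return "***"
    return "***" + email[at:]

def mask_phone(phone):
    """Keep only the last four characters, like RIGHT(phone, 4)."""
    return "***-" + phone[-4:]

print(mask_email("john.smith@example.com"))
print(mask_phone("+1-555-0101"))
```

Note that `RIGHT(phone, 4)` keeps the last four characters, not necessarily four digits, so a number ending in a separator would expose fewer digits.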

In [0]:
# TODO: Create a view with column masking for sensitive data
# Admins see full data, others see masked versions

spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_masked AS
    SELECT 
        customer_id,
        customer_name,
        
        -- TODO: Mask email - show full for admins, only domain for others
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN email
            ELSE CONCAT('***', SUBSTRING(email, INSTR(email, '@'), 100))
        END AS ___,  -- TODO: name this column 'email'
        
        -- TODO: Mask phone - show full for admins, only last 4 digits for others
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN phone
            ELSE CONCAT('***-', ___(phone, 4))  -- TODO: Use RIGHT() function
        END AS phone,
        
        -- TODO: Mask address completely for non-admins
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN address
            ELSE '___'  -- TODO: Masked placeholder
        END AS address,
        
        city,
        country,
        region,
        total_spent,
        customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
""")

print("View v_customers_masked created with column masking")
spark.table(f"{CATALOG}.{SCHEMA}.v_customers_masked").display()

### Task 3.2: Advanced Masking - Hash and Tokenization

**Objective:** Implement more advanced masking techniques for analytics.

**Instructions:**
1. Use SHA2 hash for email (useful for joining without exposing data)
2. Use tokenization pattern for customer_id
3. Preserve data utility while protecting privacy

**Hints:**
- `SHA2(email, 256)` creates a consistent hash
- Hashes can be used for joining/grouping without exposing actual values
- Consider format-preserving masking for IDs
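
The reason hashes remain usable for joins is that the same input always yields the same digest. The sketch below illustrates this with Python's `hashlib`; the helper names `sha2_256` and `tokenize_customer_id` are hypothetical, and the `CUST_` plus eight hex characters format follows the pattern described in this task.

```python
import hashlib

def sha2_256(value):
    """SHA-256 hex digest of the UTF-8 bytes of value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def tokenize_customer_id(customer_id):
    """Build a 'CUST_' token from the first 8 hex characters of the hash."""
    return "CUST_" + sha2_256(str(customer_id))[:8]

email = "hans.m@example.de"
print(sha2_256(email) == sha2_256(email))  # deterministic, so joinable
print(len(sha2_256(email)))                # 64 hex characters
print(tokenize_customer_id(3))
```

An unsalted hash of a guessable value, such as a sequential ID or a known email address, can be matched by hashing candidate values, so this is pseudonymization rather than anonymization. Truncating to eight characters also raises the chance of collisions on large tables.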

In [0]:
# TODO: Create a view with hash-based masking for analytics
# This allows joining/grouping while protecting PII

spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_hashed AS
    SELECT 
        -- TODO: Hash customer_id for non-admins (format: CUST_XXXX where XXXX is hash)
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN CAST(customer_id AS STRING)
            ELSE CONCAT('CUST_', SUBSTRING(___(CAST(customer_id AS STRING), 256), 1, 8))  -- TODO: Use SHA2() function
        END AS customer_id,
        
        -- Mask name - show only first letter and asterisks
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN customer_name
            ELSE CONCAT(LEFT(customer_name, 1), '****')
        END AS customer_name,
        
        -- TODO: Hash email for consistent joining (non-admins)
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN email
            ELSE SHA2(___, 256)  -- TODO: Hash the email column
        END AS email_hash,
        
        -- Aggregate-safe columns (not masked)
        region,
        country,
        total_spent,
        customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
""")

print("View v_customers_hashed created with hash-based masking")
spark.table(f"{CATALOG}.{SCHEMA}.v_customers_hashed").display()

---

## Part 4: Combining RLS and Column Masking

### Task 4.1: Complete Secure View

**Objective:** Create a production-ready view combining both RLS and column masking.

**Instructions:**
1. Apply region-based RLS from mapping table
2. Apply column masking for PII
3. Add audit columns (who accessed, when)

**Hints:**
- Combine WHERE clause (RLS) with CASE expressions (masking)
- Add `current_user()` and `current_timestamp()` as audit columns
- Consider performance implications of complex views

In [0]:
# TODO: Create a complete secure view with RLS + Column Masking + Audit
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_secure AS
    SELECT 
        c.customer_id,
        c.customer_name,
        
        -- Column masking for email
        CASE 
            WHEN is_member('admins') THEN c.email
            ELSE CONCAT('***', SUBSTRING(c.email, INSTR(c.email, '@'), 100))
        END AS email,
        
        -- Column masking for phone
        CASE 
            WHEN is_member('admins') THEN c.phone
            ELSE CONCAT('***-', RIGHT(c.phone, 4))
        END AS phone,
        
        c.city,
        c.country,
        c.region,
        c.total_spent,
        c.customer_tier,
        
        -- Audit columns
        current_user() AS accessed_by,
        current_timestamp() AS accessed_at
        
    FROM {CATALOG}.{SCHEMA}.customers_sensitive c
    -- Row-Level Security using mapping table
    WHERE EXISTS (
        SELECT 1 
        FROM {CATALOG}.{SCHEMA}.user_region_access a
        WHERE a.user_email = current_user()
        AND (a.allowed_region = c.region OR a.allowed_region = 'ALL')
    )
""")

print("Complete secure view created with RLS + Column Masking + Audit")
spark.table(f"{CATALOG}.{SCHEMA}.v_customers_secure").display()

### Task 4.2: Testing Security - Simulate Different Users

**Objective:** Verify security works correctly for different access scenarios.

**Instructions:**
1. Test view access simulation
2. Verify row counts match expected access
3. Confirm masking is applied correctly
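
To know what "expected access" means in numbers, the row count per region can be derived from the eight sample customers inserted in Task 1.1. The following sketch computes it in plain Python; `SAMPLE_REGIONS` and `expected_row_count` are illustrative names, and the list assumes the sample data has not been changed.

```python
from collections import Counter

# Region of each of the eight sample customers inserted in Task 1.1.
SAMPLE_REGIONS = [
    "AMERICAS", "AMERICAS", "EMEA", "EMEA",
    "APAC", "APAC", "EMEA", "AMERICAS",
]

def expected_row_count(allowed_regions):
    """Rows the secure view should return for a set of allowed regions."""
    if "ALL" in allowed_regions:
        return len(SAMPLE_REGIONS)
    counts = Counter(SAMPLE_REGIONS)
    return sum(counts[region] for region in set(allowed_regions))

print(expected_row_count(["EMEA"]))
print(expected_row_count(["ALL"]))
print(expected_row_count([]))
```

The result can be compared with the row count of `v_customers_secure` for the current user's allowed regions.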

In [0]:
# Security testing - verify row counts and masking

print("=" * 60)
print("SECURITY VERIFICATION REPORT")
print("=" * 60)

# 1. Check current user access
current_user = spark.sql("SELECT current_user()").collect()[0][0]
print(f"\n1. Current User: {current_user}")

# 2. Check user's allowed regions
allowed_regions = spark.sql(f"""
    SELECT allowed_region 
    FROM {CATALOG}.{SCHEMA}.user_region_access 
    WHERE user_email = current_user()
""").collect()

print(f"   Allowed Regions: {[r[0] for r in allowed_regions]}")

# 3. Compare base table vs secure view
base_count = spark.table(f"{CATALOG}.{SCHEMA}.customers_sensitive").count()
secure_count = spark.table(f"{CATALOG}.{SCHEMA}.v_customers_secure").count()

print(f"\n2. Row Count Comparison:")
print(f"   Base Table: {base_count} rows")
print(f"   Secure View: {secure_count} rows")
print(f"   Rows filtered by RLS: {base_count - secure_count}")

# 4. Verify masking is applied
print(f"\n3. Column Masking Verification:")
sample = spark.table(f"{CATALOG}.{SCHEMA}.v_customers_secure").limit(1).collect()
if sample:
    row = sample[0]
    print(f"   Email (masked): {row['email']}")
    print(f"   Phone (masked): {row['phone']}")

print("\n" + "=" * 60)

---

## Part 5: Row Filters and Column Masks (Unity Catalog Native)

### Task 5.1: Unity Catalog Row Filters (Preview Feature)

**Objective:** Use Unity Catalog native row filter feature (if available).

**Note:** Row Filters are a Unity Catalog feature that applies security at the table level, not view level.

**Instructions:**
1. Create a row filter function
2. Apply filter to table using ALTER TABLE
3. All queries to the table automatically apply the filter
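
Because an applied row filter affects every query on the table, it helps to have the statement that removes it at hand as well. The sketch below only builds the SQL strings; the helper names are hypothetical, the catalog and schema names are placeholders, and the `DROP ROW FILTER` form is given to the best of my knowledge of Databricks SQL and should be checked against the documentation for your runtime.

```python
def set_row_filter_sql(table, function, columns):
    """Build the ALTER TABLE ... SET ROW FILTER statement as a string."""
    column_list = ", ".join(columns)
    return f"ALTER TABLE {table} SET ROW FILTER {function} ON ({column_list})"

def drop_row_filter_sql(table):
    """Build the statement that removes a row filter from a table."""
    return f"ALTER TABLE {table} DROP ROW FILTER"

# Placeholder names; in the notebook these come from CATALOG and SCHEMA.
table = "my_catalog.my_schema.customers_sensitive"
function = "my_catalog.my_schema.region_access_filter"

print(set_row_filter_sql(table, function, ["region"]))
print(drop_row_filter_sql(table))
```

In the notebook, each string would be passed to `spark.sql(...)` by a user with sufficient privileges on the table.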

In [0]:
# Unity Catalog Native Row Filters (requires UC with row filter support)
# This is a more advanced feature than view-based RLS

# Step 1: Create a row filter function
try:
    spark.sql(f"""
        CREATE OR REPLACE FUNCTION {CATALOG}.{SCHEMA}.region_access_filter(region_value STRING)
        RETURNS BOOLEAN
        RETURN EXISTS (
            SELECT 1 
            FROM {CATALOG}.{SCHEMA}.user_region_access 
            WHERE user_email = current_user() 
            AND (allowed_region = region_value OR allowed_region = 'ALL')
        )
    """)
    print("Row filter function created successfully")
    
    # Step 2: Apply row filter to table (commented - requires appropriate permissions)
    # spark.sql(f"""
    #     ALTER TABLE {CATALOG}.{SCHEMA}.customers_sensitive 
    #     SET ROW FILTER {CATALOG}.{SCHEMA}.region_access_filter ON (region)
    # """)
    # print("Row filter applied to customers_sensitive table")
    
    print("\nNote: ALTER TABLE SET ROW FILTER requires table owner or admin privileges")
    print("The function is created and ready to use when permissions are granted")
    
except Exception as e:
    print(f"Note: Row filter functions require Unity Catalog with appropriate feature flags")
    print(f"Error: {e}")

### Task 5.2: Unity Catalog Column Masks (Preview Feature)

**Objective:** Use Unity Catalog native column mask feature (if available).

**Instructions:**
1. Create a column mask function
2. Apply mask to column using ALTER TABLE
3. All queries automatically apply the mask

In [0]:
# Unity Catalog Native Column Masks (requires UC with column mask support)

# Step 1: Create column mask functions
try:
    # Email masking function
    spark.sql(f"""
        CREATE OR REPLACE FUNCTION {CATALOG}.{SCHEMA}.mask_email(email_value STRING)
        RETURNS STRING
        RETURN CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN email_value
            ELSE CONCAT('***', SUBSTRING(email_value, INSTR(email_value, '@'), 100))
        END
    """)
    print("Email mask function created")
    
    # Phone masking function
    spark.sql(f"""
        CREATE OR REPLACE FUNCTION {CATALOG}.{SCHEMA}.mask_phone(phone_value STRING)
        RETURNS STRING
        RETURN CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') 
            THEN phone_value
            ELSE CONCAT('***-', RIGHT(phone_value, 4))
        END
    """)
    print("Phone mask function created")
    
    # Step 2: Apply column masks (commented - requires appropriate permissions)
    # spark.sql(f"""
    #     ALTER TABLE {CATALOG}.{SCHEMA}.customers_sensitive 
    #     ALTER COLUMN email SET MASK {CATALOG}.{SCHEMA}.mask_email
    # """)
    # spark.sql(f"""
    #     ALTER TABLE {CATALOG}.{SCHEMA}.customers_sensitive 
    #     ALTER COLUMN phone SET MASK {CATALOG}.{SCHEMA}.mask_phone
    # """)
    
    print("\nNote: ALTER TABLE SET MASK requires table owner or admin privileges")
    print("Functions are created and ready to use when permissions are granted")
    
except Exception as e:
    print(f"Note: Column mask functions require Unity Catalog with appropriate feature flags")
    print(f"Error: {e}")

---

## Workshop Summary

### Implemented Security Mechanisms

| Mechanism | Implementation | Use Case |
|-----------|----------------|----------|
| **Simple RLS** | `WHERE owner_email = current_user()` | User-owned data |
| **Mapping RLS** | `EXISTS` lookup against access mapping table | Region-based access |
| **Group RLS** | `is_member('group')` check | Role-based access |
| **Basic Masking** | CASE WHEN with partial reveal | Email, phone masking |
| **Hash Masking** | SHA2() for analytics | Privacy-preserving joins |
| **Native Row Filters** | UC Function + ALTER TABLE | Table-level RLS |
| **Native Column Masks** | UC Function + ALTER TABLE | Table-level masking |

---

### Best Practices

1. **Principle of Least Privilege:** Start with no access, grant only what's needed
2. **Use Views for Flexibility:** Easier to modify than table-level security
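3. **Use Native Features When Possible:** Row Filters and Column Masks are attached to the table itself, so they apply to every query, not only to queries that go through a view
4. **Audit Everything:** Add audit columns to label query results, and use platform audit logs to record who actually accessed the data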
5. **Test Thoroughly:** Verify security with different user contexts
6. **Document Access Patterns:** Maintain mapping tables for traceability
7. **Consider Performance:** Complex RLS/masking can impact query performance

---

### Security Comparison: Views vs Native UC Features

| Aspect | View-Based | Native UC Row Filter/Mask |
|--------|------------|--------------------------|
| **Security Level** | View level | Table level (enforced by the engine) |
| **Bypass Risk** | Bypassed if users also have `SELECT` on the base table | Applied to all reads of the table, including direct queries |
| **Flexibility** | High (SQL logic) | Function-based |
| **Performance** | Depends on view complexity | Optimized by engine |
| **Administration** | Per-view management | Centralized on table |

---

## Solutions

Below are the complete solutions for all workshop tasks.

In [0]:
# =============================================================================
# SOLUTIONS - Workshop 3: Row-Level Security & Column Masking
# =============================================================================

# -----------------------------------------------------------------------------
# Task 1.2: Region-User Mapping (the TODO part)
# -----------------------------------------------------------------------------
# Solution: Choose a region for testing, e.g., 'EMEA'
# ('{current_user}', 'EMEA')

# -----------------------------------------------------------------------------
# Task 2.1: Simple RLS with current_user()
# -----------------------------------------------------------------------------
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_my_customers_solution AS
    SELECT 
        customer_id, customer_name, email, city, country, region, total_spent, customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
    WHERE owner_email = current_user()
""")

# -----------------------------------------------------------------------------
# Task 2.2: RLS with Region Mapping Table
# -----------------------------------------------------------------------------
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_by_region_solution AS
    SELECT 
        c.customer_id, c.customer_name, c.email, c.city, c.country, c.region, c.total_spent, c.customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive c
    WHERE EXISTS (
        SELECT 1 FROM {CATALOG}.{SCHEMA}.user_region_access a
        WHERE a.user_email = current_user()
        AND (a.allowed_region = c.region OR a.allowed_region = 'ALL')
    )
""")

# -----------------------------------------------------------------------------
# Task 2.3: RLS with is_member()
# -----------------------------------------------------------------------------
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_by_group_solution AS
    SELECT customer_id, customer_name, email, city, country, region, total_spent, customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
    WHERE 
        is_member('admins')
        OR (is_member('analysts_emea') AND region = 'EMEA')
        OR (is_member('analysts_americas') AND region = 'AMERICAS')
        OR (is_member('analysts_apac') AND region = 'APAC')
        OR is_account_group_member('admins')
""")

# -----------------------------------------------------------------------------
# Task 3.1: Basic Column Masking
# -----------------------------------------------------------------------------
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_masked_solution AS
    SELECT 
        customer_id,
        customer_name,
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') THEN email
            ELSE CONCAT('***', SUBSTRING(email, INSTR(email, '@'), 100))
        END AS email,
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') THEN phone
            ELSE CONCAT('***-', RIGHT(phone, 4))
        END AS phone,
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') THEN address
            ELSE '[MASKED]'
        END AS address,
        city, country, region, total_spent, customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
""")

# -----------------------------------------------------------------------------
# Task 3.2: Hash-based Masking
# -----------------------------------------------------------------------------
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{SCHEMA}.v_customers_hashed_solution AS
    SELECT 
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') THEN CAST(customer_id AS STRING)
            ELSE CONCAT('CUST_', SUBSTRING(SHA2(CAST(customer_id AS STRING), 256), 1, 8))
        END AS customer_id,
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') THEN customer_name
            ELSE CONCAT(LEFT(customer_name, 1), '****')
        END AS customer_name,
        CASE 
            WHEN is_member('admins') OR is_account_group_member('admins') THEN email
            ELSE SHA2(email, 256)
        END AS email_hash,
        region, country, total_spent, customer_tier
    FROM {CATALOG}.{SCHEMA}.customers_sensitive
""")

print("All solution views created successfully!")

---

## Resource Cleanup (optional)

In [0]:
# WARNING: Run only if you want to delete all created objects

# Uncomment the lines below to clean up:

# Drop views
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_my_customers")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_by_region")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_by_group")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_masked")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_hashed")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_secure")

# Drop solution views
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_my_customers_solution")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_by_region_solution")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_by_group_solution")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_masked_solution")
# spark.sql(f"DROP VIEW IF EXISTS {CATALOG}.{SCHEMA}.v_customers_hashed_solution")

# Drop tables
# spark.sql(f"DROP TABLE IF EXISTS {CATALOG}.{SCHEMA}.customers_sensitive")
# spark.sql(f"DROP TABLE IF EXISTS {CATALOG}.{SCHEMA}.user_region_access")

# Drop functions
# spark.sql(f"DROP FUNCTION IF EXISTS {CATALOG}.{SCHEMA}.region_access_filter")
# spark.sql(f"DROP FUNCTION IF EXISTS {CATALOG}.{SCHEMA}.mask_email")
# spark.sql(f"DROP FUNCTION IF EXISTS {CATALOG}.{SCHEMA}.mask_phone")

print("Resource cleanup is commented out. Uncomment to delete objects.")

---

## Additional Resources

- [Unity Catalog Row Filters](https://docs.databricks.com/en/data-governance/unity-catalog/row-filters.html)
- [Unity Catalog Column Masks](https://docs.databricks.com/en/data-governance/unity-catalog/column-masks.html)
- [Dynamic Views for RLS](https://docs.databricks.com/en/data-governance/unity-catalog/create-views.html)
- [is_member() Function](https://docs.databricks.com/en/sql/language-manual/functions/is_member.html)
- [current_user() Function](https://docs.databricks.com/en/sql/language-manual/functions/current_user.html)

---

**Workshop Complete!**