# 🧪 Manufacturing ABAC Demo - Step 4: Test ABAC Policies

## 📋 Overview
This notebook **tests all masking functions** created in the Manufacturing ABAC demo.

### What This Notebook Does:
1. **Verifies Data**: Previews sample rows from every table before any policy is applied
2. **Tests Each Function**: Creates a temporary ABAC policy for each masking function, queries the affected table, then drops the policy
3. **Validates Output**: Ensures masked data meets requirements
4. **Demonstrates Usage**: Shows how to apply functions in real queries

### Why Test Masking Functions?
Testing ensures:
- **Correctness**: Functions work as designed
- **Data Integrity**: Original data isn't corrupted
- **Performance**: Functions execute efficiently
- **Compliance**: Masking meets regulatory requirements
- **User Experience**: Masked output is appropriate for different roles

### What You'll See:
For each masking function, you'll see:
- **Original Data**: Unmasked values from tables
- **Masked Data**: Transformed values after function application
- **Side-by-Side Comparison**: Before and after for easy validation
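
The side-by-side comparison can also be produced programmatically. The following is a minimal sketch, assuming rows are available as plain dictionaries; `side_by_side` and the sample rows are hypothetical and not part of the demo.

```python
from typing import Dict, List, Tuple


def side_by_side(
    original: List[Dict[str, object]],
    masked: List[Dict[str, object]],
    key: str,
    column: str,
) -> List[Tuple[object, object, object]]:
    """Pair original and masked values of `column` by `key`.

    Returns (key, original_value, masked_value) tuples sorted by key,
    keeping only keys present in both result sets.
    """
    masked_by_key = {row[key]: row[column] for row in masked}
    pairs = [
        (row[key], row[column], masked_by_key[row[key]])
        for row in original
        if row[key] in masked_by_key
    ]
    return sorted(pairs, key=lambda pair: str(pair[0]))


# Hypothetical sample rows; real rows would come from the notebook's queries.
before = [
    {"event_id": "ME-1001", "technician_email": "ana@example.com"},
    {"event_id": "ME-1000", "technician_email": "bo@example.com"},
]
after = [
    {"event_id": "ME-1000", "technician_email": "***@example.com"},
    {"event_id": "ME-1001", "technician_email": "***@example.com"},
]

for event_id, raw, shown in side_by_side(before, after, "event_id", "technician_email"):
    print(f"{event_id}: {raw} -> {shown}")
```

In this notebook the dictionaries could come from `[row.asDict() for row in spark.sql(query).collect()]`, with the original rows captured before the policy is created.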

## 🎓 How to Use This Notebook
1. **Ensure Steps 1-3 Complete**: All functions, tables, and data must exist
2. **Run All Cells**: Execute sequentially to see all test results
3. **Review Output**: Compare original vs masked data
4. **Verify Expectations**: Check that masking behavior is appropriate
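
Running the cells in order matters because each test creates a policy, queries a table, and then drops the policy. If a query cell fails, the drop cell may never run and the policy stays attached to the schema. One way to guard against that is a small context manager. This is a sketch, assuming `run_sql` is any callable that executes a SQL string (for example `spark.sql`); `temporary_policy` is a hypothetical helper, not part of the demo.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_policy(
    run_sql: Callable[[str], object],
    create_statement: str,
    policy_name: str,
    schema: str,
) -> Iterator[None]:
    """Create a policy, yield to the test body, then always drop the policy."""
    run_sql(create_statement)
    try:
        yield
    finally:
        # Mirrors the DROP POLICY statements used throughout this notebook.
        run_sql(f"DROP POLICY {policy_name} ON SCHEMA {schema}")


# Stand-in for spark.sql so the sketch runs anywhere; it only records statements.
executed = []
with temporary_policy(executed.append, "CREATE OR REPLACE POLICY demo ...", "demo", "manufacturing"):
    executed.append("SELECT 1")
print(executed)
```

Because the drop runs in a `finally` block, it executes even when the test body raises.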

## ⚙️ Prerequisites
- ✅ **Step 1 completed**: All masking functions created
- ✅ **Step 2 completed**: Core schema with data
- ✅ **Step 3 completed**: Extended tables with data
- ✅ `SELECT` on all tables, `EXECUTE` on the masking functions, and permission to create policies on the schema (`MANAGE` or ownership)

## 📊 Expected Results
After running this notebook:
- ✅ Sample rows from each table displayed
- ✅ Each masking function tested with real data
- ✅ Before/after comparisons shown
- ✅ Confidence that ABAC setup is working correctly

## 🎯 What Comes Next?
After validating the demo policies, extend the same pattern to your own data:
1. **Create Groups/Users**: Set up groups so policies can target specific roles instead of all `account users`
2. **Apply Tags**: Tag additional columns with sensitivity classifications
3. **Create Policies**: Build permanent ABAC policies using these masking functions
4. **Test Access Control**: Verify different users see different data

---


## ⚙️ Configuration

Testing functions in:
- **Catalog**: `your_catalog_name`
- **Schema**: `manufacturing`
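
The catalog and schema names are interpolated directly into SQL strings later in this notebook, so it may be worth validating them first. The sketch below assumes plain unquoted identifiers (letters, digits, underscores), which is stricter than what Databricks accepts with backtick quoting; `require_identifier` is a hypothetical helper.

```python
import re

# Assumption: only simple unquoted identifiers are allowed here.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def require_identifier(value: str, label: str) -> str:
    """Return `value` unchanged if it is a simple identifier, else raise."""
    if not isinstance(value, str) or not _IDENTIFIER.fullmatch(value):
        raise ValueError(f"{label} is not a simple SQL identifier: {value!r}")
    return value


catalog = require_identifier("your_catalog_name", "catalog")
schema = require_identifier("manufacturing", "schema")
print(catalog, schema)
```

Names containing hyphens or spaces would be rejected by this check and would need backtick quoting instead.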


In [0]:
%pip install pyyaml

In [0]:
# 📋 Load Configuration from config.yaml
import yaml
from pathlib import Path

config_file = Path('config.yaml')
if config_file.exists():
    with open(config_file) as f:
        config = yaml.safe_load(f)
    CATALOG = config['catalog']
    SCHEMA = config['schema']
    print('✅ Configuration loaded from config.yaml')
else:
    # Fallback defaults
    CATALOG = 'your_catalog_name'
    SCHEMA = 'manufacturing'
    print('⚠️  config.yaml not found - using defaults')

# Report the values in effect, whichever branch set them
print(f'   📊 Catalog: {CATALOG}')
print(f'   📁 Schema: {SCHEMA}')

# Set catalog and schema to use in following cells
spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(f"USE SCHEMA {SCHEMA}")

In [0]:
%sql
SELECT '🧪 Testing functions in: ' || current_catalog() || '.' || current_schema() AS status;

## Query tables before applying any policies
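
The cells in this section preview five rows per table. To see row counts as well, a single query can be generated for all seven tables. This is a sketch; `build_row_count_query` is a hypothetical helper, and the expected counts depend on the data loaded in Steps 2 and 3.

```python
from typing import Iterable

# Table names taken from the preview cells in this section.
TABLES = [
    "performance_metrics",
    "employee_contacts",
    "shipments",
    "product_specs",
    "maintenance_events",
    "suppliers",
    "assets",
]


def build_row_count_query(tables: Iterable[str]) -> str:
    """Build one UNION ALL query returning (table_name, row_count) per table."""
    selects = [
        f"SELECT '{name}' AS table_name, COUNT(*) AS row_count FROM {name}"
        for name in tables
    ]
    if not selects:
        raise ValueError("at least one table name is required")
    return "\nUNION ALL\n".join(selects)


print(build_row_count_query(TABLES))
```

In a notebook cell the generated string could be passed to `spark.sql(...)` and displayed.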

In [0]:
%sql
SELECT * FROM performance_metrics LIMIT 5;

In [0]:
%sql
SELECT * FROM employee_contacts LIMIT 5;

In [0]:
%sql
SELECT * FROM shipments LIMIT 5;

In [0]:
%sql
SELECT * FROM product_specs LIMIT 5;

In [0]:
%sql
SELECT * FROM maintenance_events LIMIT 5;

In [0]:
%sql
SELECT * FROM suppliers LIMIT 5;

In [0]:
%sql
SELECT * FROM assets LIMIT 5;

## Test: EMAIL MASKING DEMO
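
The expected behavior for this test is that the local part of each address is hidden while the domain stays visible. A quick check for that could look like the sketch below; `email_mask_ok` is a hypothetical validator, and the exact output format of `mask_email` is defined in Step 1 and not assumed here.

```python
def email_mask_ok(original: str, masked: str) -> bool:
    """Check that the domain survives and the local part is no longer visible."""
    local, _, domain = original.rpartition("@")
    if not local or not domain:
        return False  # not a well-formed address, nothing to compare
    masked_local, _, masked_domain = masked.rpartition("@")
    return masked_domain == domain and local not in masked_local


# Hypothetical values for illustration only.
print(email_mask_ok("ana.diaz@example.com", "****@example.com"))
```

The check only compares an original value with its masked counterpart, so it works with whatever masking character the function uses.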


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_email_privacy
ON SCHEMA {SCHEMA}
COMMENT 'Mask email addresses for regular users'
COLUMN MASK mask_email
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('ip_sensitivity_manufacturing','Internal') AND hasTagValue('sensitive_type_manufacturing','email') AS email_cols
ON COLUMN email_cols;""")

In [0]:
%sql
SELECT 
  event_id,
  technician_name,
  technician_email
FROM maintenance_events
WHERE event_id IN ('ME-1000', 'ME-1001', 'ME-1002', 'ME-1003', 'ME-1004')
ORDER BY event_id
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_email_privacy ON SCHEMA {SCHEMA}""")

## Test: PHONE MASKING DEMO
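
This test expects only the last four digits of each phone number to remain visible. The sketch below checks that property without assuming a particular output format; `phone_mask_ok` is a hypothetical validator and the sample numbers are made up.

```python
import re


def phone_mask_ok(original: str, masked: str) -> bool:
    """Check that the only digits left in `masked` are the original's last four."""
    original_digits = re.sub(r"\D", "", original)
    masked_digits = re.sub(r"\D", "", masked)
    if len(original_digits) < 4:
        return False  # too short to apply a last-four rule
    return masked_digits == original_digits[-4:]


# Hypothetical values for illustration only.
print(phone_mask_ok("+1 (555) 010-4477", "XXX-XXX-4477"))
```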


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_phone_privacy
ON SCHEMA {SCHEMA}
COMMENT 'Mask phone numbers for regular users'
COLUMN MASK mask_phone
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('ip_sensitivity_manufacturing','Internal') AND hasTagValue('sensitive_type_manufacturing','phone') AS phone_cols
ON COLUMN phone_cols;""")

In [0]:
%sql
SELECT 
  employee_id,
  full_name,
  phone
FROM employee_contacts
WHERE employee_id IN ('EMP-1001', 'EMP-1002', 'EMP-1003', 'EMP-1004', 'EMP-1005')
ORDER BY employee_id
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_phone_privacy ON SCHEMA {SCHEMA}""")

## Test: SPEC TEXT REDACTION DEMO


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_ip_protection_specs
ON SCHEMA {SCHEMA}
COMMENT 'Mask spec_text for regular users; admins see full data'
COLUMN MASK mask_spec_text
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('ip_sensitivity_manufacturing','Trade_Secret') AND hasTagValue('sensitive_type_manufacturing','specification') AS ip_cols
ON COLUMN ip_cols;""")

In [0]:
%sql
SELECT 
  spec_id,
  product_name,
  LEFT(spec_text, 30) || '...' as spec
FROM product_specs
WHERE spec_id IN ('SPEC-001', 'SPEC-002', 'SPEC-003', 'SPEC-004', 'SPEC-005')
ORDER BY spec_id
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_ip_protection_specs ON SCHEMA {SCHEMA}""")

## Test: CAD URI HASHING DEMO
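
Hashed output is expected to be deterministic: the same input should always produce the same masked value, and the raw value should not show through. The sketch below checks both properties on (original, masked) pairs without assuming which hash algorithm `mask_cad_reference` uses; `hashing_is_consistent` is a hypothetical helper.

```python
from typing import Dict, Iterable, Tuple


def hashing_is_consistent(pairs: Iterable[Tuple[str, str]]) -> bool:
    """Check (original, masked) pairs for deterministic, non-revealing output.

    Passes when every original always maps to the same masked value and no
    masked value contains its original as a substring.
    """
    seen: Dict[str, str] = {}
    for original, masked in pairs:
        if original and original in masked:
            return False  # the raw value leaked through
        if seen.setdefault(original, masked) != masked:
            return False  # same input produced two different outputs
    return True


# Hypothetical values for illustration only.
sample = [("s3://cad/a.dwg", "9f2c"), ("s3://cad/a.dwg", "9f2c"), ("s3://cad/b.dwg", "77aa")]
print(hashing_is_consistent(sample))
```

The same check applies to any other column masked by hashing.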


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_ip_protection_cad
ON SCHEMA {SCHEMA}
COMMENT 'Hash CAD references for regular users; admins see full URIs'
COLUMN MASK mask_cad_reference
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('ip_sensitivity_manufacturing','Trade_Secret') AND hasTagValue('sensitive_type_manufacturing','CAD') AS cad_cols
ON COLUMN cad_cols;""")

In [0]:
%sql
SELECT 
  spec_id,
  product_name,
  cad_file_uri
FROM product_specs
WHERE spec_id IN ('SPEC-001', 'SPEC-006', 'SPEC-007', 'SPEC-008', 'SPEC-009')
ORDER BY spec_id
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_ip_protection_cad ON SCHEMA {SCHEMA}""")

## Test: TIMESTAMP ROUNDING DEMO
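
With the policy active, each `start_time` should land on a 15-minute boundary close to the original value. Whether `mask_timestamp_15min` rounds down or to the nearest interval is defined in Step 1, so the sketch below accepts any rounding direction; both function names are hypothetical.

```python
from datetime import datetime, timedelta


def on_15_minute_boundary(ts: datetime) -> bool:
    """True when `ts` falls exactly on :00, :15, :30, or :45."""
    return ts.minute % 15 == 0 and ts.second == 0 and ts.microsecond == 0


def timestamp_mask_ok(original: datetime, masked: datetime) -> bool:
    """Accept a masked value on a 15-minute boundary within 15 minutes of the original."""
    return on_15_minute_boundary(masked) and abs(original - masked) < timedelta(minutes=15)


# Hypothetical values for illustration only.
print(timestamp_mask_ok(datetime(2024, 1, 5, 9, 37, 12), datetime(2024, 1, 5, 9, 30)))
```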


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_timestamp_rounding
ON SCHEMA {SCHEMA}
COMMENT 'Round telemetry timestamps to 15-min intervals for regular users'
COLUMN MASK mask_timestamp_15min
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('asset_criticality_manufacturing','High') AND hasTagValue('sensitive_type_manufacturing','ts') AS timestamp_cols
ON COLUMN timestamp_cols;""")

In [0]:
%sql
SELECT 
  event_id,
  event_type,
  start_time
FROM maintenance_events
WHERE event_id IN ('ME-1000', 'ME-1001', 'ME-1002', 'ME-1003', 'ME-1004')
ORDER BY start_time
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_timestamp_rounding ON SCHEMA {SCHEMA}""")

## Test: SERIAL NUMBER MASKING DEMO


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_serial_masking
ON SCHEMA {SCHEMA}
COMMENT 'Show only last 4 of serial numbers to regular users'
COLUMN MASK mask_serial_last4
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('asset_criticality_manufacturing','Medium') AND hasTagValue('sensitive_type_manufacturing','serial_number') AS serial_cols
ON COLUMN serial_cols;""")

In [0]:
%sql
SELECT 
    tracking_number,
    carrier,
    status
FROM shipments
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_serial_masking ON SCHEMA {SCHEMA}""")

## Test: GPS PRECISION MASKING DEMO
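
Reduced precision means masked coordinates should carry fewer decimal places while staying near the original location. The sketch below checks that, assuming a limit of two decimal places; the actual precision kept by `mask_gps_precision` is set in Step 1, so `max_places` is a parameter and both helpers are hypothetical.

```python
from decimal import Decimal


def decimal_places(value: float) -> int:
    """Count digits after the decimal point, ignoring trailing zeros."""
    exponent = Decimal(repr(value)).normalize().as_tuple().exponent
    return max(0, -exponent)


def gps_mask_ok(original: float, masked: float, max_places: int = 2) -> bool:
    """Check reduced precision while staying close to the original coordinate."""
    tolerance = 10 ** (-max_places)
    return decimal_places(masked) <= max_places and abs(original - masked) <= tolerance


# Hypothetical coordinates for illustration only.
print(gps_mask_ok(40.7128, 40.71))
```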


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_gps_precision
ON SCHEMA {SCHEMA}
COMMENT 'Reduce GPS precision for regular users on critical assets'
COLUMN MASK mask_gps_precision
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('sensitive_type_manufacturing','gps') AS gps_cols
ON COLUMN gps_cols;""")

In [0]:
%sql
SELECT 
  latitude,
  longitude,
  asset_id
FROM assets
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_gps_precision ON SCHEMA {SCHEMA}""")

## Test: COST BUCKETING DEMO


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_cost_bucketing
ON SCHEMA {SCHEMA}
COMMENT 'Show cost ranges instead of exact amounts to regular users'
COLUMN MASK mask_cost_bucket
TO `account users`
FOR TABLES
MATCH COLUMNS hasTagValue('data_purpose_manufacturing','SupplyChain') AND hasTagValue('sensitive_type_manufacturing','cost') AS cost_cols
ON COLUMN cost_cols;""")

In [0]:
%sql
SELECT
  metric_id,
  downtime_hours,
  maintenance_cost
FROM performance_metrics
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_cost_bucketing ON SCHEMA {SCHEMA}""")

## Test: SUPPLIER NAME MASKING DEMO


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_supplier_privacy
ON SCHEMA {SCHEMA}
COMMENT 'Hash supplier names for regular users to protect commercial relationships'
COLUMN MASK mask_string_hash
TO `account users`
FOR TABLES
MATCH COLUMNS 
    hasTagValue('data_purpose_manufacturing','Audit') AND hasTagValue('sensitive_type_manufacturing','name') AS supplier_cols
ON COLUMN supplier_cols;""")

In [0]:
%sql
SELECT
    supplier_id,
    supplier_name,
    site_region 
FROM suppliers
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_supplier_privacy ON SCHEMA {SCHEMA}""")

## Test: BUSINESS HOURS FILTER DEMO
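
Unlike the column masks above, this policy is a row filter, so the query may return no rows at all when it runs outside the allowed window. The sketch below illustrates that kind of check. The window (08:00 to 18:00), the Monday-to-Friday rule, and the use of naive local time are all assumptions; the real logic lives in `business_hours_filter` from Step 1, and `within_business_hours` is a hypothetical stand-in.

```python
from datetime import datetime, time


def within_business_hours(
    moment: datetime,
    start: time = time(8, 0),
    end: time = time(18, 0),
) -> bool:
    """True on Monday-Friday between `start` (inclusive) and `end` (exclusive)."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start <= moment.time() < end


# 2024-01-05 was a Friday; 2024-01-06 was a Saturday.
print(within_business_hours(datetime(2024, 1, 5, 9, 30)))
print(within_business_hours(datetime(2024, 1, 6, 10, 0)))
```

If the query in this section returns an empty result, checking the current time against the filter's window is a reasonable first step before suspecting the policy itself.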


In [0]:
spark.sql(f"""CREATE OR REPLACE POLICY mfg_business_hours_access
ON SCHEMA {SCHEMA}
COMMENT 'Restrict sensitive data access to business hours for regular users'
ROW FILTER business_hours_filter
TO `account users`
FOR TABLES
WHEN hasTagValue('shift_hours_manufacturing','Day');""")

In [0]:
%sql
SELECT 
  *
FROM performance_metrics
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY mfg_business_hours_access ON SCHEMA {SCHEMA}""")

## ✅ All Tests Complete!

If every query above returned masked or filtered values while its policy was active, the Manufacturing ABAC functions are working as intended.

### What You Verified:
- ✅ All tables contain expected data
- ✅ Masking functions produce correct output
- ✅ Data transformations maintain privacy requirements
- ✅ Functions are ready for ABAC policy integration

### Test Summary:
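- **Email Masking**: ✅ Local part hidden, domain visible
- **Phone Masking**: ✅ Last 4 digits only
- **Spec Text**: ✅ Redacted
- **CAD URIs and Supplier Names**: ✅ Deterministic hashing
- **Timestamps**: ✅ Rounded to 15-minute intervals
- **Serial Numbers**: ✅ Last 4 characters only
- **GPS Coordinates**: ✅ Reduced precision
- **Costs**: ✅ Ranges instead of exact amounts
- **Business Hours Filter**: ✅ Rows restricted to business hours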

### 🎯 Next Steps:

**Clean Up Demo Resources:**

When you're done with the demo, proceed to **`5_Cleanup.ipynb`** to remove all resources:
- Drop all ABAC policies
- Delete tag policies  
- Drop schema and tables

This cleanup notebook will safely remove all demo resources while preserving your catalog.

### 📚 Additional Resources:
- [Unity Catalog ABAC Documentation](https://docs.databricks.com/aws/en/data-governance/unity-catalog/abac/)
- [Row Filters and Column Masks](https://docs.databricks.com/aws/en/data-governance/unity-catalog/filters-and-masks)

---
**🎉 Great Job!** Your Manufacturing ABAC demo foundation is complete and tested!