# 🧪 Healthcare ABAC Demo - Step 4: Test ABAC Policies

## 📋 Overview
This notebook defines ABAC policies using the functions created in Step 1 and tests them against the datasets created in Step 2. Each policy is applied to the built-in `account users` group, so it affects every user in the account, including the user running this notebook.


### What This Notebook Does:
1. **Tests Each Function**: Runs before/after examples for every masking function
2. **Validates Output**: Ensures masked data meets requirements
3. **Demonstrates Usage**: Shows how to apply functions in real queries

### Why Test Masking Functions?
Testing ensures:
- **Correctness**: Functions work as designed
- **Data Integrity**: Original data isn't corrupted
- **Performance**: Functions execute efficiently
- **Compliance**: Masking meets regulatory requirements
- **User Experience**: Masked output is appropriate for different roles

### What You'll See:
For each masking function, you'll see:
- **Original Data**: Unmasked values from tables
- **Masked Data**: Transformed values after function application
- **Side-by-Side Comparison**: Before and after for easy validation

## 🎓 How to Use This Notebook
1. **Ensure Steps 1-3 Complete**: All functions, tables, data, and governed tags must exist
2. **Run All Cells**: Execute sequentially to see all test results
3. **Review Output**: Compare original vs masked data
4. **Verify Expectations**: Check that masking behavior is appropriate

## ⚙️ Prerequisites
- ✅ **Step 1 completed**: All functions created
- ✅ **Step 2 completed**: Core tables with data
- ✅ **Step 3 completed**: Governed tags defined and assigned
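- ✅ `SELECT` permission on all tables and `EXECUTE` permission on all masking and filter functions
- ✅ Permission to create and drop policies on the schema (for example, `MANAGE` on the schema or ownership of it)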

## 📊 Expected Results
After running this notebook:
- ✅ Masking/filtering functions tested with ABAC policies on the healthcare schema
- ✅ Before/after comparisons shown
- ✅ Confidence that ABAC setup is working correctly

---


In [0]:
%pip install pyyaml

In [0]:
# 📋 Load Configuration from config.yaml
import yaml
from pathlib import Path

config_file = Path('config.yaml')
if config_file.exists():
    with open(config_file) as f:
        config = yaml.safe_load(f)
    CATALOG = config['catalog']
    SCHEMA = config['schema']
    print('✅ Configuration loaded from config.yaml')
else:
    # Fallback defaults: replace 'your_catalog_name' with a real catalog before running
    CATALOG = 'your_catalog_name'
    SCHEMA = 'healthcare'
    print('⚠️  config.yaml not found - using defaults')

# Report the values in use, whichever branch set them
print(f'   📊 Catalog: {CATALOG}')
print(f'   📁 Schema: {SCHEMA}')

# Set catalog and schema to use in following cells
spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(f"USE SCHEMA {SCHEMA}")
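The cell above interpolates `CATALOG` and `SCHEMA` directly into SQL text. A minimal sketch of a guard that could run first is shown here. The helper names `quote_identifier` and `use_statements` are hypothetical, and the pattern assumes names made only of letters, digits, and underscores.

```python
import re
from typing import List

# Assumption: catalog and schema names are plain identifiers
# (letters, digits, underscores). Adjust the pattern if yours differ.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Return the name wrapped in backticks, or raise if it is not a plain identifier."""
    if not isinstance(name, str) or not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"Unsafe identifier: {name!r}")
    return f"`{name}`"


def use_statements(catalog: str, schema: str) -> List[str]:
    """Build the two USE statements as strings, ready to pass to spark.sql."""
    return [
        f"USE CATALOG {quote_identifier(catalog)}",
        f"USE SCHEMA {quote_identifier(schema)}",
    ]


# Prints the statements instead of running them, so no Spark session is needed here.
for statement in use_statements("your_catalog_name", "healthcare"):
    print(statement)
```

In the notebook, each returned string could be passed to `spark.sql`. A malformed value in `config.yaml` then fails early with a clear error instead of producing an unexpected statement.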

## ⚙️ Configuration

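Policies are tested in the catalog and schema loaded above:
- **Catalog**: the `catalog` value from `config.yaml` (fallback: `your_catalog_name`)
- **Schema**: the `schema` value from `config.yaml` (fallback: `healthcare`)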


In [0]:
%sql
SELECT '🧪 Testing functions in: ' || current_catalog() || '.' || current_schema() AS status;

## Query tables before applying any policies

In [0]:
%sql
SELECT * FROM patients

In [0]:
%sql
SELECT * FROM providers

In [0]:
%sql
SELECT * FROM insurance

In [0]:
%sql
SELECT * FROM visits

In [0]:
%sql
SELECT * FROM lab_results

In [0]:
%sql
SELECT * FROM prescriptions

In [0]:
%sql
SELECT * FROM billing

## Test: PATIENT ID MASKING (Deterministic)


In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_patient_id_masking
    ON SCHEMA {SCHEMA}
    COLUMN MASK mask_string_hash
    TO `account users`
    FOR TABLES
    MATCH COLUMNS 
        hasTagValue('pii_type_healthcare', 'patient_id') AND hasTagValue('phi_level_healthcare', 'High') AS patient_id_cols
    ON COLUMN patient_id_cols""")
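The column mask policies in this notebook all follow the shape of the statement above. As a rough sketch, the statement text could be assembled by a small helper. `column_mask_policy_sql` is a hypothetical name, not part of the demo, and it only covers tag conditions joined with `AND`.

```python
def column_mask_policy_sql(name, schema, mask_function, tag_conditions, alias,
                           principal="account users"):
    """Assemble a CREATE OR REPLACE POLICY statement for a column mask.

    tag_conditions is a list of (tag_key, tag_value) pairs combined with AND.
    """
    if not tag_conditions:
        raise ValueError("At least one tag condition is required")
    match = " AND ".join(
        f"hasTagValue('{key}', '{value}')" for key, value in tag_conditions
    )
    return (
        f"CREATE OR REPLACE POLICY {name}\n"
        f"ON SCHEMA {schema}\n"
        f"COLUMN MASK {mask_function}\n"
        f"TO `{principal}`\n"
        f"FOR TABLES\n"
        f"MATCH COLUMNS {match} AS {alias}\n"
        f"ON COLUMN {alias}"
    )


# Rebuild the patient ID policy text from its parts and print it for review.
statement = column_mask_policy_sql(
    name="healthcare_patient_id_masking",
    schema="healthcare",
    mask_function="mask_string_hash",
    tag_conditions=[("pii_type_healthcare", "patient_id"),
                    ("phi_level_healthcare", "High")],
    alias="patient_id_cols",
)
print(statement)
```

The helper returns text only. Passing the result to `spark.sql` is left to the caller, which keeps the sketch easy to inspect before anything is created.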

In [0]:
%sql
-- =============================================
-- TEST 1: Patient ID Masking Demo
-- =============================================

SELECT 
  VisitID,
  PatientID,
  TestName,
  Status,
  TestDate
FROM lab_results 
LIMIT 5;
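The heading describes this masking as deterministic, meaning the same `PatientID` should always produce the same masked value. The sketch below shows one way to check that property on pairs of original and masked values. `check_deterministic_mask` is a hypothetical helper, the sample IDs are made up, and `stand_in_mask` uses SHA-256 purely as a placeholder; it is not the implementation of `mask_string_hash`.

```python
import hashlib


def check_deterministic_mask(pairs):
    """Return a list of problems found in (original, masked) value pairs."""
    problems = []
    seen = {}
    for original, masked in pairs:
        if masked == original:
            problems.append(f"{original!r} was returned unmasked")
        # Remember the first masked value seen for each original value.
        previous = seen.setdefault(original, masked)
        if previous != masked:
            problems.append(f"{original!r} maps to both {previous!r} and {masked!r}")
    return problems


def stand_in_mask(value):
    """Placeholder mask for the demonstration only (assumption, not mask_string_hash)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


# Hypothetical IDs; "P001" appears twice to exercise the consistency check.
sample_ids = ["P001", "P002", "P001"]
pairs = [(pid, stand_in_mask(pid)) for pid in sample_ids]
print(check_deterministic_mask(pairs))
```

In practice the pairs would come from joining the unmasked values captured before the policy was created with the values returned once it is active.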

In [0]:
spark.sql(f"""DROP POLICY healthcare_patient_id_masking ON SCHEMA {SCHEMA}""")

## Test: PARTIAL NAME MASKING DEMO


In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_name_masking
    ON SCHEMA {SCHEMA}
    COLUMN MASK mask_string_partial
    TO `account users`
    FOR TABLES
    MATCH COLUMNS 
        hasTagValue('pii_type_healthcare', 'patient_name') AS name_cols
    ON COLUMN name_cols""")

In [0]:
%sql
-- =============================================
-- TEST 2:  Partial Name Masking Demo
-- =============================================

SELECT 
  PatientID,
  FirstName,
  LastName
FROM patients
LIMIT 5;


In [0]:
spark.sql(f"""DROP POLICY healthcare_name_masking ON SCHEMA {SCHEMA}""")
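Each test drops its policy in a separate cell, so a failure in the test query can leave the policy in place. One possible safeguard is a context manager that always issues the drop. This is a sketch only: `temporary_policy` is a hypothetical helper, and `run_sql` stands in for `spark.sql` so the example runs without a Spark session.

```python
from contextlib import contextmanager


@contextmanager
def temporary_policy(run_sql, create_statement, policy_name, schema):
    """Create a policy, yield to the caller, then always drop the policy."""
    run_sql(create_statement)
    try:
        yield
    finally:
        run_sql(f"DROP POLICY {policy_name} ON SCHEMA {schema}")


# Demonstration with a stand-in for spark.sql that only records statements.
executed = []
placeholder_create = "-- CREATE OR REPLACE POLICY statement goes here"
try:
    with temporary_policy(executed.append, placeholder_create,
                          "healthcare_name_masking", "healthcare"):
        raise RuntimeError("simulated query failure")
except RuntimeError:
    pass

# The drop statement was recorded even though the body raised.
print(executed[-1])
```

In the notebook, `run_sql` would be `spark.sql` and the test query would run inside the `with` block.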

## Test: PHONE MASKING DEMO


In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_phone_masking
    ON SCHEMA {SCHEMA}
    COLUMN MASK mask_phone
    TO `account users`
    FOR TABLES
    MATCH COLUMNS 
        hasTagValue('data_sensitivity_healthcare', 'Sensitive') AND hasTagValue('pii_type_healthcare', 'phone') AS phone_cols
    ON COLUMN phone_cols""")

In [0]:
%sql
-- =============================================
-- TEST 3: Phone Masking Demo
-- =============================================

SELECT 
  PatientID,
  PhoneNumber
FROM patients
LIMIT 5;
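The phone mask is expected to show only the last 4 digits. A small check along those lines is sketched here. `shows_only_last4` is a hypothetical helper, the sample numbers are invented, and the exact output format of `mask_phone` is an assumption.

```python
import re


def shows_only_last4(original, masked):
    """True when the only digits left in masked are the last 4 digits of original."""
    original_digits = re.sub(r"\D", "", original)
    masked_digits = re.sub(r"\D", "", masked)
    return len(original_digits) >= 4 and masked_digits == original_digits[-4:]


# Hypothetical values; the real masked format depends on mask_phone from Step 1.
print(shows_only_last4("555-123-4567", "XXX-XXX-4567"))
print(shows_only_last4("555-123-4567", "555-123-4567"))
```

The check ignores separators and placeholder characters, so it does not depend on whether the mask uses `X`, `*`, or another symbol.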

In [0]:
spark.sql(f"""DROP POLICY healthcare_phone_masking ON SCHEMA {SCHEMA}""")

## Test: EMAIL MASKING DEMO

In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_email_masking
    ON SCHEMA {SCHEMA}
    COLUMN MASK mask_email
    TO `account users`
    FOR TABLES
    MATCH COLUMNS 
        hasTagValue('data_sensitivity_healthcare', 'Internal') AND hasTagValue('pii_type_healthcare', 'email') AS email_cols
    ON COLUMN email_cols""")

In [0]:
%sql
-- =============================================
-- TEST 4: Email Masking Demo
-- =============================================

SELECT 
  PatientID,
  Email
FROM patients
LIMIT 5;
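The email mask is expected to hide the local part and keep the domain visible. The sketch below checks that on one pair of values. `email_mask_ok` is a hypothetical helper, the addresses are invented, and the masked format shown is an assumption about what `mask_email` returns.

```python
def email_mask_ok(original, masked):
    """True when the domain is unchanged and the original local part is not exposed."""
    local, _, domain = original.rpartition("@")
    masked_local, _, masked_domain = masked.rpartition("@")
    if not local or not domain:
        return False  # original is not a usable address
    return masked_domain == domain and local not in masked_local


# Hypothetical values; the real masked format depends on mask_email from Step 1.
print(email_mask_ok("jane.doe@example.com", "****@example.com"))
print(email_mask_ok("jane.doe@example.com", "jane.doe@example.com"))
```

Applied to each row of the query result, a check like this turns the visual comparison into a pass or fail signal.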

In [0]:
spark.sql(f"""DROP POLICY healthcare_email_masking ON SCHEMA {SCHEMA}""")

## Test: INSURANCE MASKING DEMO


In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_insurance_masking
    ON SCHEMA {SCHEMA}
    COLUMN MASK mask_policy_number_last4
    TO `account users`
    FOR TABLES
    MATCH COLUMNS 
        hasTagValue('data_sensitivity_healthcare', 'Sensitive') OR hasTagValue('data_sensitivity_healthcare', 'Highly_Sensitive') AS insurance_cols
    ON COLUMN insurance_cols""")

In [0]:
%sql
-- =============================================
-- TEST 5: Insurance Number Masking
-- =============================================

SELECT 
  PatientID,
  PolicyNumber,
  GroupNumber
FROM insurance
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY healthcare_insurance_masking ON SCHEMA {SCHEMA}""")

## Test: DOB TO AGE GROUP DEMO


In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_age_group_masking
    ON SCHEMA {SCHEMA}
    COLUMN MASK mask_date_year_only
    TO `account users`
    FOR TABLES
    MATCH COLUMNS 
        hasTagValue('pii_type_healthcare', 'dob') AND hasTagValue('phi_level_healthcare', 'High') AS dob_cols
    ON COLUMN dob_cols""")

In [0]:
%sql
-- =============================================
-- TEST 6: Date of Birth to Age Group
-- =============================================

SELECT 
  PatientID,
  DateOfBirth
FROM patients
LIMIT 5;

In [0]:
spark.sql(f"""DROP POLICY healthcare_age_group_masking ON SCHEMA {SCHEMA}""")

## Test: BUSINESS HOURS ACCESS FILTER DEMO

In [0]:
spark.sql(f"""
    CREATE OR REPLACE POLICY healthcare_business_hours_filter
    ON SCHEMA {SCHEMA}
    ROW FILTER business_hours_filter
    TO `account users`
    FOR TABLES
    WHEN hasTagValue('shift_hours_healthcare', 'Standard_Business')""")

In [0]:
%sql
-- =============================================
-- TEST 7: Business hours lab results access
-- =============================================

SELECT 
    TestName,
    ResultValue,
    TestDate,
    PatientID
FROM lab_results 
LIMIT 10;
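The result of this query depends on when it runs, so an empty result is not necessarily a failure. The sketch below shows one way to interpret the row count. `interpret_row_count` is a hypothetical helper. It assumes the filter hides every row outside an hour-based window, and the 8 to 18 window is a placeholder; the real rule is whatever `business_hours_filter` from Step 1 defines.

```python
from datetime import datetime


def interpret_row_count(row_count, moment, start_hour=8, end_hour=18):
    """Describe whether a row count is consistent with an hour-based row filter.

    start_hour and end_hour are placeholders, not the real window.
    """
    inside = start_hour <= moment.hour < end_hour
    if inside and row_count == 0:
        return "unexpected: no rows inside the window"
    if not inside and row_count > 0:
        return "unexpected: rows visible outside the window"
    return "consistent"


# Fixed timestamps keep the demonstration reproducible.
print(interpret_row_count(10, datetime(2024, 1, 15, 10, 30)))
print(interpret_row_count(0, datetime(2024, 1, 15, 22, 0)))
```

When comparing against a real run, use the same time zone the filter function uses, since the session time zone may differ from local time.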

In [0]:
spark.sql(f"""DROP POLICY healthcare_business_hours_filter ON SCHEMA {SCHEMA}""")

## ✅ All Tests Complete!

Congratulations! If the masked and filtered results above match your expectations, the Healthcare ABAC policies are working as intended.

### What You Verified:
- ✅ All tables contain expected data
- ✅ Masking functions produce correct output
- ✅ Data transformations maintain privacy requirements
- ✅ Masking and filter functions take effect when applied through tag-based ABAC policies

### Test Summary:
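- **Email Masking**: ✅ Local part hidden, domain visible
- **Phone Masking**: ✅ Showing last 4 digits only
- **Identifiers**: ✅ Deterministic hashing working
- **Name Masking**: ✅ Partial masking applied with `mask_string_partial`
- **Sensitive Fields**: ✅ Insurance policy numbers masked with `mask_policy_number_last4`
- **Date of Birth**: ✅ Masked with `mask_date_year_only`
- **Row Filtering**: ✅ Lab results access restricted by `business_hours_filter`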

### 🎯 Next Steps:

**Clean Up Demo Resources:**

When you're done with the demo, proceed to **`5_Cleanup.ipynb`** to remove all resources:
- Drop all ABAC policies
- Delete tag policies  
- Drop schema and tables

This cleanup notebook will safely remove all demo resources while preserving your catalog.

### 📚 Additional Resources:
- [Unity Catalog ABAC Documentation](https://docs.databricks.com/aws/en/data-governance/unity-catalog/abac/)
- [Row Filters and Column Masks](https://docs.databricks.com/aws/en/data-governance/unity-catalog/filters-and-masks)

---
**🎉 Great Job!** Your Healthcare ABAC demo foundation is complete and tested!