# 🎯 Government ABAC Demo - Step 1: Create Masking Functions

## 📋 Overview
This notebook creates **masking functions** for the Government industry ABAC (Attribute-Based Access Control) demo.

### What are Masking Functions?
Masking functions are SQL user-defined functions (UDFs) that transform sensitive data to protect privacy while maintaining data utility for analytics. They are the foundation of ABAC policies in Unity Catalog.

### Why Use Masking Functions?
- **Compliance**: Meet GDPR, CCPA, HIPAA, and other privacy regulations
- **Security**: Protect sensitive data from unauthorized access
- **Flexibility**: Apply different masks based on user roles and attributes
- **Analytics**: Preserve data utility for analysis while protecting privacy
- **Audit**: Track and log all data access patterns

### What This Notebook Creates
This notebook creates the following masking functions for the Government industry:
- **Identity Protection**: SSN masking that keeps only the last four digits (`mask_ssn_last4`) and partial license plate masking (`mask_license_plate_partial`)
- **Location Data**: Address reduction to ZIP code only (`mask_address_zip_only`)
- **Financial Data**: Tax amount bucketing into ranges (`mask_tax_amount_bucket`)
- **Identifiers**: Deterministic hashing of citizen IDs for cross-table analytics (`mask_citizen_id_hash`)
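
To illustrate why deterministic hashing keeps cross-table analytics possible, here is a minimal pure-Python sketch that mirrors the `mask_citizen_id_hash` SQL function created later in this notebook. It assumes Spark's `SHA2` hashes the UTF-8 bytes of the string and returns lowercase hex; the sample rows and the `C-1001` ID are hypothetical.

```python
import hashlib


def mask_citizen_id_hash(citizen_id):
    """Pure-Python mirror of CONCAT('CIT_', SUBSTRING(SHA2(id, 256), 1, 12))."""
    if citizen_id is None:
        return None  # the SQL expression also yields NULL for a NULL input
    digest = hashlib.sha256(citizen_id.encode("utf-8")).hexdigest()
    return "CIT_" + digest[:12]


# Hypothetical rows from two tables that share a citizen ID
tax_records = [{"citizen_id": "C-1001", "amount": 42000}]
permits = [{"citizen_id": "C-1001", "permit": "building"}]

masked_tax = {mask_citizen_id_hash(r["citizen_id"]) for r in tax_records}
masked_permits = {mask_citizen_id_hash(r["citizen_id"]) for r in permits}

# The same input always yields the same token, so masked keys still join
print(masked_tax == masked_permits)
```

Because the token is a truncated hash rather than a random value, two tables masked independently still agree on the key, while the original ID is not shown.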

## 🎓 How to Use This Notebook
1. **Update Configuration**: Change the catalog name in the configuration cell below
2. **Run All Cells**: Execute cells sequentially (Shift+Enter or Run All)
3. **Verify Success**: Check for ✅ success messages after each function
4. **Proceed to Next Step**: Continue to notebook 2 to create the schema

## ⚙️ Prerequisites
- ✅ Unity Catalog enabled workspace
- ✅ CREATE FUNCTION permission in the target catalog
- ✅ SQL Warehouse or Cluster attached to this notebook
- ✅ Account admin or catalog owner role (recommended)

## 🔄 Next Steps
After completing this notebook:
1. **Step 2**: `2_Create_Schema.ipynb` - Create database schema and core tables
2. **Step 3**: `3_Create_Extended_Tables.ipynb` - Add supplementary tables
3. **Step 4**: `4_Test_Masking.ipynb` - Test all masking functions

---


In [None]:
# 📋 Load Configuration from config.yaml
import yaml
from pathlib import Path

config_file = Path('config.yaml')
if config_file.exists():
    with open(config_file) as f:
        config = yaml.safe_load(f)
    CATALOG = config['catalog']
    SCHEMA = config['schema']
    print('✅ Configuration loaded from config.yaml')
    print(f'   📊 Catalog: {CATALOG}')
    print(f'   📁 Schema: {SCHEMA}')
else:
    # Fallback defaults
    CATALOG = 'your_catalog_name'
    SCHEMA = 'government'
    print('⚠️  config.yaml not found - using defaults')
    print(f'   📊 Catalog: {CATALOG}')
    print(f'   📁 Schema: {SCHEMA}')

# Make variables available to SQL cells
spark.conf.set('catalog_name', CATALOG)
spark.conf.set('schema_name', SCHEMA)
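
Because `CATALOG` and `SCHEMA` are later interpolated into SQL strings, it can help to check them first. The sketch below is one possible guard, not part of the original notebook: `validate_identifier` is a hypothetical helper, and it assumes plain unquoted identifiers made of letters, digits, and underscores (names that need backtick quoting would be rejected).

```python
import re

# Assumption: plain (unquoted) identifiers made of letters, digits, underscores
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_identifier(name, kind):
    """Return name unchanged if it looks like a plain SQL identifier, else raise."""
    if not isinstance(name, str) or not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"Invalid {kind} name: {name!r}")
    return name


# Hypothetical values; in the notebook these would be CATALOG and SCHEMA
catalog = validate_identifier("my_catalog", "catalog")
schema = validate_identifier("government", "schema")
print(f"{catalog}.{schema}")
```

Failing early with a clear message is usually easier to debug than a SQL parse error raised several cells later.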


## ⚙️ Configuration

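### 🚨 IMPORTANT: Update Before Running!
Set **your catalog name** before running the cell below. The cell reads it from the `CATALOG` variable, which comes from `config.yaml` or from the fallback default in the configuration cell above.

### What This Does:
- Sets the target Unity Catalog
- Creates the `government` schema if it doesn't exist
- Confirms the target location

### Example:
`apscat` is the example catalog name used in this demo. If your catalog is named `my_catalog`, the statement that would otherwise run as:
```sql
USE CATALOG apscat;  -- Example catalog
```
should run as:
```sql
USE CATALOG my_catalog;  -- Your catalog name
```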


In [None]:
# Configuration - CATALOG and SCHEMA come from the configuration cell above
spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(
    f"CREATE SCHEMA IF NOT EXISTS {SCHEMA} "
    "COMMENT 'Government ABAC demo schema with masking functions'"
)
spark.sql(f"USE SCHEMA {SCHEMA}")

# Confirm the target location
spark.sql(
    "SELECT '🎯 Target: ' || current_catalog() || '.' || current_schema() AS status"
).show(truncate=False)

## Government ABAC Masking Functions


In [None]:
# Select the configured catalog, then the schema, so the functions are created there
spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(f"USE SCHEMA {SCHEMA}")

# One CREATE FUNCTION statement per entry; spark.sql() runs a single statement per call
function_ddl = [
    """CREATE OR REPLACE FUNCTION mask_ssn_last4(ssn STRING) RETURNS STRING
    COMMENT 'SSN masking' RETURN CASE WHEN ssn IS NULL THEN ssn ELSE CONCAT('XXX-XX-', RIGHT(REPLACE(ssn, '-', ''), 4)) END""",
    """CREATE OR REPLACE FUNCTION mask_license_plate_partial(plate STRING) RETURNS STRING
    COMMENT 'Plate partial' RETURN CASE WHEN plate IS NULL THEN plate ELSE CONCAT('***-', RIGHT(plate, 4)) END""",
    """CREATE OR REPLACE FUNCTION mask_address_zip_only(address STRING, zip STRING) RETURNS STRING
    COMMENT 'ZIP only' RETURN COALESCE(zip, '***')""",
    """CREATE OR REPLACE FUNCTION mask_tax_amount_bucket(amt DECIMAL(12,2)) RETURNS STRING
    COMMENT 'Tax ranges' RETURN CASE WHEN amt IS NULL THEN 'Unknown' WHEN amt < 10000 THEN '$0-$10K'
    WHEN amt < 50000 THEN '$10K-$50K' WHEN amt < 100000 THEN '$50K-$100K' ELSE '$100K+' END""",
    """CREATE OR REPLACE FUNCTION mask_citizen_id_hash(id STRING) RETURNS STRING
    COMMENT 'Deterministic' RETURN CONCAT('CIT_', SUBSTRING(SHA2(id, 256), 1, 12))""",
]
for ddl in function_ddl:
    spark.sql(ddl)

print('✅ Government functions created!')
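
To make the expected behavior of these functions easier to check, here is a pure-Python sketch that mirrors three of the SQL expressions above. It is an illustration only and does not run in Spark; the sample SSN, plate, and amounts are made-up values.

```python
def mask_ssn_last4(ssn):
    """Mirror of CONCAT('XXX-XX-', RIGHT(REPLACE(ssn, '-', ''), 4))."""
    if ssn is None:
        return None
    return "XXX-XX-" + ssn.replace("-", "")[-4:]


def mask_license_plate_partial(plate):
    """Mirror of CONCAT('***-', RIGHT(plate, 4))."""
    if plate is None:
        return None
    return "***-" + plate[-4:]


def mask_tax_amount_bucket(amt):
    """Mirror of the CASE expression; each boundary belongs to the higher bucket."""
    if amt is None:
        return "Unknown"
    if amt < 10000:
        return "$0-$10K"  # note: negative amounts also land here
    if amt < 50000:
        return "$10K-$50K"
    if amt < 100000:
        return "$50K-$100K"
    return "$100K+"


print(mask_ssn_last4("123-45-6789"))          # XXX-XX-6789
print(mask_license_plate_partial("ABC1234"))  # ***-1234
print(mask_tax_amount_bucket(10000))          # $10K-$50K
```

The boundary case is worth noting: an amount of exactly 10000 falls into `$10K-$50K` because each comparison uses `<`.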

## ✅ Success!

All Government masking functions have been created successfully!

### What You Just Created:
- ✅ Masking functions registered in Unity Catalog
- ✅ Functions available for use in SQL queries
- ✅ Foundation for ABAC policies ready

### Verify Your Functions:
You can verify the functions were created by running the following, with your own catalog name in place of `my_catalog`:
```sql
SHOW USER FUNCTIONS IN my_catalog.government;
```
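
Beyond listing the functions, a quick smoke test can call each one with a sample value. The sketch below only builds the queries; `smoke_test_queries` is a hypothetical helper, the sample inputs are made up, and `my_catalog` stands in for your catalog name.

```python
def smoke_test_queries(catalog, schema):
    """Build one SELECT per masking function using fully qualified names."""
    prefix = f"{catalog}.{schema}"
    return [
        f"SELECT {prefix}.mask_ssn_last4('123-45-6789') AS masked_ssn",
        f"SELECT {prefix}.mask_license_plate_partial('ABC1234') AS masked_plate",
        f"SELECT {prefix}.mask_address_zip_only('1 Main St', '20500') AS masked_address",
        f"SELECT {prefix}.mask_tax_amount_bucket(CAST(42000 AS DECIMAL(12,2))) AS tax_bucket",
        f"SELECT {prefix}.mask_citizen_id_hash('C-1001') AS masked_id",
    ]


for query in smoke_test_queries("my_catalog", "government"):
    print(query)
    # In the notebook, run it with: spark.sql(query).show(truncate=False)
```

Using fully qualified names means the check does not depend on which catalog and schema are currently selected in the session.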

### 🎯 Next Step:
Continue to **`2_Create_Schema.ipynb`** to create the database tables and load sample data.

---
**Note**: These functions are stored in Unity Catalog and can be used across multiple notebooks and queries.
