# 🗄️ Telco ABAC Demo - Step 2: Create Database Schema

## 📋 Overview
This notebook creates the **core database schema** for the Telco industry ABAC demo.

### What This Notebook Does:
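1. **Creates Tables**: Sets up three core tables (`subscribers`, `call_records`, `data_usage`) with a realistic Telco industry structure
2. **Loads Sample Data**: Inserts three representative rows into each table for demonstrations
3. **Defines Keys**: Declares a primary key on each table; `call_records` and `data_usage` link to `subscribers` through a shared `subscriber_id` column (no foreign key constraints are declared)
4. **Validates Schema**: Confirms each table was created with the expected row count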

### Why This Schema?
This schema represents a typical Telco industry data structure with:
- **Realistic Fields**: Common columns found in Telco databases
- **Sensitive Data**: Fields that require masking (PII, financial, etc.)
- **Relationships**: Connected tables for realistic queries
- **Test Data**: Sufficient data for meaningful demonstrations

## 🎓 How to Use This Notebook
1. **Ensure Step 1 Complete**: Masking functions must be created first
2. **Run All Cells**: Execute cells sequentially
3. **Verify Counts**: Check that the validation query reports 3 rows for each table
4. **Note Table Names**: You'll use these in testing and policy creation

## ⚙️ Prerequisites
- ✅ **Step 1 completed**: Masking functions created in your target catalog and schema (for example, `apscat.telco`)
- ✅ Unity Catalog CREATE TABLE permission
- ✅ SQL Warehouse or Cluster attached
- ✅ Schema already created (from Step 1)

## 📊 Expected Results
After running this notebook, you'll have:
- Three tables (`subscribers`, `call_records`, `data_usage`), each with a primary key
- Three rows of sample data in each table
- A validation query showing a row count of 3 per table

## 🔄 Next Steps
After completing this notebook:
1. **Step 3**: `3_Create_Extended_Tables.ipynb` - Add supplementary tables
2. **Step 4**: `4_Test_Masking.ipynb` - Test masking functions against the sample data

---


In [None]:
# 📋 Load Configuration from config.yaml
import yaml
from pathlib import Path

config_file = Path('config.yaml')
if config_file.exists():
    with open(config_file) as f:
        config = yaml.safe_load(f)
    CATALOG = config['catalog']
    SCHEMA = config['schema']
    print('✅ Configuration loaded from config.yaml')
    print(f'   📊 Catalog: {CATALOG}')
    print(f'   📁 Schema: {SCHEMA}')
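else:
    # Fallback defaults. 'your_catalog_name' is a placeholder: replace it with
    # an existing catalog, otherwise the USE CATALOG cell below will fail.
    CATALOG = 'your_catalog_name'
    SCHEMA = 'telco'
    print('⚠️  config.yaml not found - using defaults')
    print(f'   📊 Catalog: {CATALOG}')
    print(f'   📁 Schema: {SCHEMA}')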

# Make variables available to SQL cells
spark.conf.set('catalog_name', CATALOG)
spark.conf.set('schema_name', SCHEMA)


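## ⚙️ Configuration

The next cell sets the target catalog and schema for table creation, using the values loaded from `config.yaml`:
- **Catalog**: the value of `CATALOG` (for example, `apscat`)
- **Schema**: the value of `SCHEMA` (for example, `telco`)

These should match what you used in Step 1.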
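
Because the catalog and schema names are interpolated into SQL strings with f-strings, it can help to check them before use. The following is a minimal sketch, not part of the original notebook: the helper names `validate_identifier` and `qualified_name` are hypothetical, and the pattern deliberately assumes simple unquoted names (letters, digits, underscores), which is stricter than what Unity Catalog accepts with backtick quoting.

```python
import re

# Assumption: only simple, unquoted identifiers are allowed
# (letters, digits, underscores; must not start with a digit).
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_identifier(name: str, kind: str = "identifier") -> str:
    """Return `name` unchanged if it is a simple identifier, else raise ValueError."""
    if not isinstance(name, str) or not _IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"Invalid {kind}: {name!r}")
    return name


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level name such as apscat.telco.subscribers."""
    return ".".join([
        validate_identifier(catalog, "catalog"),
        validate_identifier(schema, "schema"),
        validate_identifier(table, "table"),
    ])


print(qualified_name("apscat", "telco", "subscribers"))  # → apscat.telco.subscribers
```

In the notebook, `validate_identifier(CATALOG, "catalog")` could be called right after the configuration is loaded, so a malformed value fails early instead of inside a SQL statement.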


In [None]:
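spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(f"USE SCHEMA {SCHEMA}")

# Confirm the active catalog and schema. This is a Python cell, so the SQL
# statement has to go through spark.sql rather than appear as bare SQL.
spark.sql(
    "SELECT '🗄️ Creating tables in: ' || current_catalog() || '.' || current_schema() AS status"
).show(truncate=False)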

In [None]:
DROP TABLE IF EXISTS subscribers;

## Table: `subscribers`


In [None]:
-- Primary key columns must be declared NOT NULL in Unity Catalog
CREATE TABLE subscribers (subscriber_id STRING NOT NULL, phone_number STRING, email STRING, plan_name STRING,
monthly_fee DECIMAL(8,2), status STRING, PRIMARY KEY (subscriber_id)) USING DELTA;
INSERT INTO subscribers VALUES
('S-1001', '555-1001', 'john@email.com', 'Unlimited', 79.99, 'Active'),
('S-1002', '555-1002', 'sarah@email.com', 'Premium', 99.99, 'Active'),
('S-1003', '555-1003', 'mike@email.com', 'Basic', 49.99, 'Active');

In [None]:
DROP TABLE IF EXISTS call_records;

## Table: `call_records`


In [None]:
-- Primary key columns must be declared NOT NULL in Unity Catalog
CREATE TABLE call_records (record_id STRING NOT NULL, subscriber_id STRING, call_type STRING, duration_min INT,
call_date TIMESTAMP, PRIMARY KEY (record_id)) USING DELTA;
INSERT INTO call_records VALUES
('CDR-1', 'S-1001', 'Voice', 15, timestamp('2024-03-01 10:00:00')),
('CDR-2', 'S-1002', 'Voice', 30, timestamp('2024-03-01 11:00:00')),
('CDR-3', 'S-1003', 'SMS', 0, timestamp('2024-03-01 12:00:00'));

In [None]:
DROP TABLE IF EXISTS data_usage;

## Table: `data_usage`


In [None]:
-- Primary key columns must be declared NOT NULL in Unity Catalog
CREATE TABLE data_usage (usage_id STRING NOT NULL, subscriber_id STRING, data_gb DECIMAL(10,2),
usage_date DATE, PRIMARY KEY (usage_id)) USING DELTA;
INSERT INTO data_usage VALUES
('U-1', 'S-1001', 8.5, '2024-03-01'),
('U-2', 'S-1002', 15.2, '2024-03-01'),
('U-3', 'S-1003', 2.3, '2024-03-01');

-- Validation: each table should report 3 rows
SELECT 'subscribers' AS tbl, COUNT(*) AS cnt FROM subscribers
UNION ALL SELECT 'call_records', COUNT(*) FROM call_records
UNION ALL SELECT 'data_usage', COUNT(*) FROM data_usage;

## ✅ Success!

Telco database schema has been created successfully!

### What You Just Created:
- ✅ Three core tables (`subscribers`, `call_records`, `data_usage`) with primary keys
- ✅ Sample data loaded and ready for testing
- ✅ Tables linked through a shared `subscriber_id` column
- ✅ Schema ready for masking function testing
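
The `CREATE TABLE` statements in this notebook declare primary keys but no foreign keys, so nothing prevents a row in `call_records` or `data_usage` from pointing at a subscriber that does not exist. A minimal sketch of an orphan check follows, written in plain Python over the IDs from the `INSERT` statements above. The function name `find_orphans` and the extra row `CDR-9` are hypothetical and only illustrate the idea.

```python
from typing import Iterable, List, Tuple


def find_orphans(
    child_rows: Iterable[Tuple[str, str]],
    parent_ids: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return child rows whose subscriber_id (second element) has no matching parent."""
    known = set(parent_ids)
    return [row for row in child_rows if row[1] not in known]


# IDs copied from the INSERT statements in this notebook.
subscriber_ids = ["S-1001", "S-1002", "S-1003"]
call_records = [("CDR-1", "S-1001"), ("CDR-2", "S-1002"), ("CDR-3", "S-1003")]
data_usage = [("U-1", "S-1001"), ("U-2", "S-1002"), ("U-3", "S-1003")]

print(find_orphans(call_records, subscriber_ids))  # → []
print(find_orphans(data_usage, subscriber_ids))    # → []

# Hypothetical bad row, added only to show what a failure looks like.
print(find_orphans(call_records + [("CDR-9", "S-9999")], subscriber_ids))
# → [('CDR-9', 'S-9999')]
```

Against the real tables, the same check could be expressed as a left anti join on `subscriber_id`, for example `spark.table("call_records").join(spark.table("subscribers"), "subscriber_id", "left_anti")`.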

### Verify Your Tables:
You can list all tables by running the following (substitute your own catalog and schema if they differ from `apscat.telco`):
```sql
SHOW TABLES IN apscat.telco;
```

To see table details:
```sql
DESCRIBE TABLE apscat.telco.<table_name>;
```

### 📊 Data Summary:
The row count query above shows how many records are in each table (3 per table for this sample data). This data will be used for testing masking functions in Step 4.
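
If you prefer an automated check over reading the query output, the comparison can be sketched as below. This is an illustrative helper, not part of the original notebook: `find_count_mismatches` is a hypothetical name, and the counting function is passed in so the logic can be tried without a Spark session.

```python
from typing import Callable, Dict, Tuple

# Expected counts taken from the INSERT statements in this notebook.
EXPECTED_COUNTS: Dict[str, int] = {
    "subscribers": 3,
    "call_records": 3,
    "data_usage": 3,
}


def find_count_mismatches(
    expected: Dict[str, int],
    count_fn: Callable[[str], int],
) -> Dict[str, Tuple[int, int]]:
    """Return {table: (expected, actual)} for every table whose count differs."""
    mismatches = {}
    for table, want in expected.items():
        got = count_fn(table)
        if got != want:
            mismatches[table] = (want, got)
    return mismatches


# In the notebook, pass a Spark-backed counter instead:
#   find_count_mismatches(EXPECTED_COUNTS, lambda t: spark.table(t).count())
# Here a stand-in dictionary plays the role of the database.
fake_counts = {"subscribers": 3, "call_records": 3, "data_usage": 2}
print(find_count_mismatches(EXPECTED_COUNTS, fake_counts.__getitem__))
# → {'data_usage': (3, 2)}
```

An empty dictionary means every table matched its expected count.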

### 🎯 Next Step:
Continue to **`3_Create_Extended_Tables.ipynb`** to add supplementary tables that extend this schema.

---
**Tip**: Keep note of the table names and row counts for reference during testing.
