# Unity Catalog Governance - Demo

**Training objective:** Master Unity Catalog as the governance platform for the Databricks Lakehouse: access management, data masking, lineage, and audit logging

**Topics covered:**
- Unity Catalog architecture: Metastore, Catalog, Schema, Tables/Views/Volumes
- Access management: GRANT/REVOKE privileges
- Data masking and row-level security
- Data lineage and audit logging
- Delta Sharing: secure data sharing
- Best practices for data governance

---

## Context and requirements

- **Training day**: Day 3 - Transformation, Governance & Integrations
- **Notebook type**: Demo
- **Technical requirements**:
  - Databricks Runtime 13.3 LTS+ (recommended: 14.3 LTS); the tags and Unity Catalog Python UDFs used below need 13.3 LTS or newer
  - Unity Catalog enabled (required!)
  - Privileges: CREATE CATALOG, CREATE SCHEMA, and the right to GRANT/REVOKE on the objects created here (for example as their owner)
  - Cluster: Standard with at least 2 workers
- **Duration**: 45 minutes
- **Prerequisites**: 03_databricks_jobs_orchestration.ipynb

## Theoretical introduction

**Section goal:** Understand Unity Catalog as the unified governance platform for the data lakehouse

**Core concepts:**
- **Unity Catalog**: A unified governance solution for all data assets
- **Metastore**: Region-level, top-level container for catalogs
- **Three-level namespace**: catalog.schema.table
- **Securable objects**: Tables, Views, Functions, Volumes, Models
- **Fine-grained access control**: Table-, column-, and row-level security
- **Automatic lineage**: End-to-end data flow tracking without manual instrumentation

**Unity Catalog object hierarchy:**
```
Metastore (region-level)
    ↓
Catalog (domain/environment)
    ↓
Schema (namespace/layer)
    ↓
Securable Objects:
    - Tables / Views (data)
    - Functions (UDF, stored procedures)
    - Volumes (file storage)
    - Models (ML models)
```

**Key features:**
- **Unified governance**: One platform for data, ML, and BI
- **ACID transactions**: Provided per table by Delta Lake; Unity Catalog governs access to those tables rather than adding catalog-wide transactions
- **Audit logging**: Who accessed what and when
- **Data discovery**: Metadata search and tagging
- **Delta Sharing**: Secure cross-organization sharing

**Why does it matter?**
Unity Catalog addresses fundamental governance problems in a data lake:
- No central access control
- Difficulty tracking lineage
- No audit of data access
- Compliance problems (GDPR, HIPAA)
- Data silos between teams

Unity Catalog provides enterprise-grade governance while preserving the flexibility of the data lakehouse.

## Per-user isolation

Run the initialization script that sets up per-user catalogs and schemas:

In [0]:
%run ../00_setup

## Configuration

Define the dataset paths used throughout the notebook:

In [0]:
# Paths to the data directories (subdirectories of DATASET_BASE_PATH from 00_setup)
CUSTOMERS_PATH = f"{DATASET_BASE_PATH}/customers"
ORDERS_PATH = f"{DATASET_BASE_PATH}/orders"
PRODUCTS_PATH = f"{DATASET_BASE_PATH}/products"

# Paths to specific files
CUSTOMERS_CSV = f"{CUSTOMERS_PATH}/customers.csv"
ORDERS_JSON = f"{ORDERS_PATH}/orders_batch.json"
PRODUCTS_PARQUET = f"{PRODUCTS_PATH}/products.parquet"

**Data file paths:**

- **Customers CSV**: `{DATASET_BASE_PATH}/customers/customers.csv`
- **Orders JSON**: `{DATASET_BASE_PATH}/orders/orders_batch.json`
- **Products Parquet**: `{DATASET_BASE_PATH}/products/products.parquet`

These paths are used to load the data into Unity Catalog.

## 2.1 Preparing data from the dataset

Before moving on to access management, we load real data from the dataset/ directory; the Unity Catalog examples below use it.

In [0]:
customers_df = spark.read \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .csv(CUSTOMERS_CSV)

In [None]:
customers_df.printSchema()

**customers data loaded**

Customer data was loaded from the CSV file. The schema was inferred automatically via `inferSchema=true`, which costs an extra pass over the file.

In [None]:
display(customers_df.limit(5))

In [0]:
orders_df = spark.read.json(ORDERS_JSON)

In [None]:
orders_df.printSchema()

**orders data loaded**

Order data was loaded from the JSON file. Spark inferred the schema automatically, including any nested JSON structures.

In [None]:
display(orders_df.limit(5))

In [0]:
products_df = spark.read.parquet(PRODUCTS_PARQUET)

In [None]:
products_df.printSchema()

**products data loaded**

Product data was loaded from the Parquet file. Parquet is an optimized binary format that embeds the schema in the file, so no schema inference is needed.

In [None]:
display(products_df.limit(5))

## 1️⃣ Unity Catalog Architecture

**Unity Catalog** is the unified governance solution for the Databricks Lakehouse.

### Object hierarchy:

```
Metastore (region-level)
    ↓
Catalog (domain/environment)
    ↓
Schema (namespace)
    ↓
Securable Objects:
    - Tables / Views
    - Functions (UDF, stored procedures)
    - Volumes (file storage)
    - Models (ML models)
```

### Three-level namespace:
```sql
catalog.schema.table
```

Examples:
```sql
main.sales.orders
dev.analytics.customer_metrics
prod.gold.daily_revenue
```
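Since this notebook assembles object names in Python f-strings, a small helper for the three-level name can reduce quoting mistakes. The sketch below is only an illustration: `quote_ident`, `full_name`, and `split_name` are hypothetical helpers, not part of any Databricks API, and `split_name` assumes the name contains no quoted dots.

```python
def quote_ident(part: str) -> str:
    """Wrap one identifier in backticks, doubling any embedded backtick."""
    return "`" + part.replace("`", "``") + "`"


def full_name(catalog: str, schema: str, obj: str) -> str:
    """Join three quoted parts into catalog.schema.object."""
    return ".".join(quote_ident(part) for part in (catalog, schema, obj))


def split_name(name: str) -> tuple:
    """Split an unquoted three-level name; raise if it does not have exactly three parts."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected catalog.schema.object, got {name!r}")
    return tuple(parts)


catalog, schema, table = split_name("prod.gold.daily_revenue")
print(full_name(catalog, schema, table))
```

Quoting every part is a conservative choice: it keeps names with hyphens or reserved words valid at the cost of slightly noisier SQL.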

### Key features:
- **Unified governance**: one platform for data, ML, and BI
- **Fine-grained access control**: table, column, row level
- **Automatic lineage**: end-to-end data flow tracking
- **Audit logging**: who accessed what and when
- **Data discovery**: metadata search and tagging

---

## 📋 Setup and Basic Operations

### Creating Catalogs and Schemas:

In [0]:
# Create Catalog
spark.sql(f"""
    CREATE CATALOG IF NOT EXISTS {CATALOG}
    COMMENT 'KION catalog for training data'
""")

**Verify the list of catalogs:**

In [None]:
# List catalogs
spark.sql("SHOW CATALOGS").display()

**Catalog created/verified**

The catalog is the top-level container for all schemas and tables in our training environment.

In [0]:
# Create Schemas within catalog
spark.sql(f"""
  CREATE SCHEMA IF NOT EXISTS {CATALOG}.{BRONZE_SCHEMA}
  COMMENT 'Bronze layer - raw data'
""")

spark.sql(f"""
  CREATE SCHEMA IF NOT EXISTS {CATALOG}.{SILVER_SCHEMA}
  COMMENT 'Silver layer - cleaned data'
""")

spark.sql(f"""
  CREATE SCHEMA IF NOT EXISTS {CATALOG}.{GOLD_SCHEMA}
  COMMENT 'Gold layer - business data'
""")

**Bronze, Silver, and Gold schemas created**

We implement the Medallion architecture:
- **Bronze**: Raw data (raw data ingestion)
- **Silver**: Cleaned data (data validation, normalization)
- **Gold**: Business data (aggregated, business-ready)
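The three `CREATE SCHEMA` calls above differ only in name and comment, so they can be generated from one mapping. This is a minimal sketch under assumptions: `LAYERS`, `create_schema_statements`, and the catalog name `training_catalog` are hypothetical, and in the notebook each statement would be passed to `spark.sql`.

```python
# Hypothetical layer names and comments, mirroring the Medallion layers
LAYERS = {
    "bronze": "Bronze layer - raw data",
    "silver": "Silver layer - cleaned data",
    "gold": "Gold layer - business data",
}


def create_schema_statements(catalog: str, layers: dict) -> list:
    """Return one CREATE SCHEMA IF NOT EXISTS statement per layer, in dict order."""
    return [
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{name} COMMENT '{comment}'"
        for name, comment in layers.items()
    ]


statements = create_schema_statements("training_catalog", LAYERS)
for stmt in statements:
    print(stmt)  # in a notebook: spark.sql(stmt)
```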

In [0]:
# Set the default catalog and schema
spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(f"USE SCHEMA {SILVER_SCHEMA}")

**Verify the active context:**

In [None]:
# Verify the active catalog and schema as reported by the session
ctx = spark.sql(
    "SELECT current_catalog() AS active_catalog, current_database() AS active_schema"
).first()
print(f"✓ Active catalog: {ctx['active_catalog']}")
print(f"✓ Active schema: {ctx['active_schema']}")

# List the schemas created in the catalog
schemas = spark.sql(f"SHOW SCHEMAS IN {CATALOG}").select("databaseName").collect()
schema_names = [row.databaseName for row in schemas]

print("\n✓ Schemas in the catalog:")
for schema_name in schema_names:
    print(f"  - {schema_name}")

**Active catalog and schema set**

This sets the default working context: all subsequent operations run in this catalog and schema unless a fully qualified name is given.

In [None]:
# Display the schemas in the catalog
spark.sql(f"SHOW SCHEMAS IN {CATALOG}").display()

### Creating Tables in Unity Catalog:

In [0]:
# Save the customers table in the Bronze layer
customers_df.write \
    .format("delta") \
    .mode("overwrite") \
    .option("mergeSchema", "true") \
    .saveAsTable(f"{CATALOG}.{BRONZE_SCHEMA}.customers")

**customers table saved in the Bronze layer**

The data is written in Delta Lake format. With `mergeSchema=true`, columns that appear in the DataFrame but not yet in the table are added on write; incompatible changes such as a changed column type still require `overwriteSchema`.

In [None]:
# Verify the record count
spark.sql(f"SELECT COUNT(*) as count FROM {CATALOG}.{BRONZE_SCHEMA}.customers").display()

In [0]:
# Save the orders table in the Bronze layer
orders_df.write \
    .format("delta") \
    .mode("overwrite") \
    .option("mergeSchema", "true") \
    .saveAsTable(f"{CATALOG}.{BRONZE_SCHEMA}.orders")

print(f"✓ orders table saved in {CATALOG}.{BRONZE_SCHEMA}")

**Verify the orders table:**

In [None]:
# Verify the record count
result = spark.sql(f"SELECT COUNT(*) as count FROM {CATALOG}.{BRONZE_SCHEMA}.orders").collect()[0]
# Use key access: result.count resolves to the tuple method Row.count, not the column
print(f"✓ Record count: {result['count']}")

In [None]:
# Add table properties and comments
# 'owner' is a reserved property key, so a custom key (owning_team) is used instead;
# actual ownership is changed with ALTER TABLE ... OWNER TO
spark.sql(f"""
    ALTER TABLE {CATALOG}.{BRONZE_SCHEMA}.orders
    SET TBLPROPERTIES (
        'delta.enableChangeDataFeed' = 'true',
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact' = 'true',
        'owning_team' = 'data-engineering-team',
        'department' = 'analytics',
        'pii_data' = 'true'
    )
""")

In [None]:
# Add comments to table and columns
spark.sql(f"""
    COMMENT ON TABLE {CATALOG}.{BRONZE_SCHEMA}.orders IS
    'Raw orders ingested from JSON (Bronze layer)'
""")

# ALTER COLUMN ... COMMENT works across runtime versions,
# whereas COMMENT ON COLUMN needs a recent runtime
spark.sql(f"""
    ALTER TABLE {CATALOG}.{BRONZE_SCHEMA}.orders
    ALTER COLUMN customer_id COMMENT
    'Customer identifier - PII data, access restricted'
""")

**Table properties and comments added**

We set:
- **Change Data Feed**: Tracks row-level changes to the table
- **Auto Optimize**: Optimized writes and automatic compaction
- **Metadata**: Owning team, department, PII flag (custom properties; they do not change the Unity Catalog owner)
- **Documentation**: Comments on the table and on sensitive columns

In [None]:
# Verify the orders record count
spark.sql(f"SELECT COUNT(*) as count FROM {CATALOG}.{BRONZE_SCHEMA}.orders").display()

In [0]:
# Save the products table in the Bronze layer
products_df.write \
    .format("delta") \
    .mode("overwrite") \
    .option("mergeSchema", "true") \
    .saveAsTable(f"{CATALOG}.{BRONZE_SCHEMA}.products")

print(f"✓ products table saved in {CATALOG}.{BRONZE_SCHEMA}")

In [None]:
# Verify the products record count
spark.sql(f"SELECT COUNT(*) as count FROM {CATALOG}.{BRONZE_SCHEMA}.products").display()

## 2.2 Data Classification (Tagging)

**Tagging** classifies data (e.g. PII, Sensitive, GDPR) at the table or column level.
This makes data discovery and governance easier, for example reporting every table that contains personal data.
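The cells below apply tags with `ALTER TABLE ... SET TAGS`. When many tables need tagging, the statements can be rendered from dictionaries. The helper `set_tags_sql` and the table name `demo.bronze.orders` are hypothetical, and the sketch deliberately refuses values containing single quotes rather than trying to escape them.

```python
from typing import Optional


def set_tags_sql(table: str, tags: dict, column: Optional[str] = None) -> str:
    """Render ALTER TABLE ... SET TAGS for a table, or for one column if given."""
    if not tags:
        raise ValueError("at least one tag is required")
    for key, value in tags.items():
        if "'" in key or "'" in value:
            raise ValueError("single quotes are not supported in this sketch")
    pairs = ", ".join(f"'{k}' = '{v}'" for k, v in tags.items())
    target = f"ALTER TABLE {table}"
    if column is not None:
        target += f" ALTER COLUMN {column}"
    return f"{target} SET TAGS ({pairs})"


table_stmt = set_tags_sql("demo.bronze.orders", {"sensitivity": "high", "domain": "sales"})
column_stmt = set_tags_sql("demo.bronze.orders", {"pii": "true"}, column="customer_id")
print(table_stmt)   # in a notebook: spark.sql(table_stmt)
print(column_stmt)
```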


In [None]:
# Add tags to the orders table
spark.sql(f"""
    ALTER TABLE {CATALOG}.{BRONZE_SCHEMA}.orders 
    SET TAGS ('sensitivity' = 'high', 'domain' = 'sales')
""")

# Add tags to the customer_id column
spark.sql(f"""
    ALTER TABLE {CATALOG}.{BRONZE_SCHEMA}.orders 
    ALTER COLUMN customer_id SET TAGS ('pii' = 'true')
""")

print("✓ Tags added to the table and column")

**Searching by tag (Information Schema):**

In [None]:
# Find all columns tagged as PII
pii_columns = spark.sql(f"""
    SELECT 
        catalog_name, 
        schema_name, 
        table_name, 
        column_name, 
        tag_value 
    FROM system.information_schema.column_tags
    WHERE tag_name = 'pii' AND tag_value = 'true'
      AND catalog_name = '{CATALOG}'
""")

display(pii_columns)

**All Bronze tables saved**

All three tables (customers, orders, products) are stored in the Bronze schema in Delta Lake format, ready for further transformation in the Medallion architecture.

## 4. Unity Catalog Volumes

**Volumes** are governed storage locations for files (non-tabular data) in Unity Catalog:
- **Managed Volumes**: Databricks manages the storage location and the file lifecycle
- **External Volumes**: registered on top of an external storage location

**Use cases**:
- Storing ML model files and checkpoints
- Staging area for data before ingestion
- Archive of documents, logs, and reports

In [0]:
# Create a managed volume
volume_name = "files"

spark.sql(f"""
  CREATE VOLUME IF NOT EXISTS {CATALOG}.{BRONZE_SCHEMA}.{volume_name}
  COMMENT 'Managed volume for staging files'
""")

**Volume 'files' created**

The volume was created in the Bronze schema as a managed volume for staging files, documents, and other non-tabular assets.

In [0]:
# Export customers to CSV in the volume
volume_path = f"/Volumes/{CATALOG}/{BRONZE_SCHEMA}/{volume_name}"

customers_df.coalesce(1).write \
    .mode("overwrite") \
    .option("header", "true") \
    .csv(f"{volume_path}/customers_export")

**customers data exported to the volume**

The data was written to: `/Volumes/{CATALOG}/{BRONZE_SCHEMA}/files/customers_export/`. Spark writes a directory; `coalesce(1)` makes it contain a single `part-*.csv` data file.

Volumes let you keep files next to tables in Unity Catalog under the same access control.

In [0]:
# Verify the files in the volume
dbutils.fs.ls(f"{volume_path}/customers_export")

## 5. Unity Catalog Functions (UDF)

**Functions** in Unity Catalog let you:
- Create reusable SQL/Python functions
- Manage business logic centrally
- Control access with GRANT/REVOKE
- Track lineage for functions

**Kinds of functions**:
- **Scalar functions**: return a single value per call
- **Table functions**: return a table
- **SQL functions**: body written in SQL
- **Python functions**: body written in Python (UDF)

The first two describe the return shape, the last two the implementation language.

In [0]:
# SQL function - masks customer_id
spark.sql(f"""
  CREATE OR REPLACE FUNCTION {CATALOG}.{SILVER_SCHEMA}.mask_customer_id(customer_id INT)
  RETURNS STRING
  LANGUAGE SQL
  COMMENT 'Masks customer_id, showing only the last 3 digits'
  RETURN CONCAT('****', SUBSTRING(CAST(customer_id AS STRING), -3))
""")

**mask_customer_id function created**

The SQL function masks customer_id, keeping only the last 3 digits (an ID with three or fewer digits is therefore shown in full after the `****` prefix). It can be used in views to hide sensitive identifiers.
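To see what the masking expression produces without a cluster, the same logic can be mirrored in plain Python. This is a sketch for illustration only: the Python function below is a local stand-in, not the Unity Catalog function, and it assumes non-negative integer IDs.

```python
from typing import Optional


def mask_customer_id(customer_id: Optional[int]) -> Optional[str]:
    """Python mirror of CONCAT('****', SUBSTRING(CAST(id AS STRING), -3))."""
    if customer_id is None:
        return None  # CONCAT returns NULL when an argument is NULL
    return "****" + str(customer_id)[-3:]


for cid in (1234567, 42, None):
    print(cid, "->", mask_customer_id(cid))
```

The second sample shows the short-ID case: a two-digit ID is not hidden at all, which matters if real IDs can be that short.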

In [0]:
# Test the mask_customer_id function
result_df = spark.sql(f"""
  SELECT 
    customer_id,
    {CATALOG}.{SILVER_SCHEMA}.mask_customer_id(customer_id) as masked_id,
    first_name,
    last_name
  FROM {CATALOG}.{BRONZE_SCHEMA}.customers
  LIMIT 5
""")

display(result_df)

In [0]:
# Python UDF - price categorization
spark.sql(f"""
  CREATE OR REPLACE FUNCTION {CATALOG}.{SILVER_SCHEMA}.categorize_price(price DOUBLE)
  RETURNS STRING
  LANGUAGE PYTHON
  COMMENT 'Categorizes prices: Low, Medium, High'
  AS $$
    # Guard against NULL input (passed to Python as None)
    if price is None:
        return None
    if price < 50:
        return "Low"
    elif price < 200:
        return "Medium"
    else:
        return "High"
  $$
""")

**categorize_price function created**

The Python UDF categorizes product prices:
- **Low**: price < 50
- **Medium**: 50 ≤ price < 200
- **High**: price ≥ 200

The body of a Python UDF is ordinary Python code, so branching logic like this is easy to express.
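The category boundaries are easy to get wrong in documentation, so it helps to check them locally. The sketch below repeats the UDF body as a plain Python function; the `None` guard is an assumption added here so that a NULL price maps to NULL instead of raising an error.

```python
from typing import Optional


def categorize_price(price: Optional[float]) -> Optional[str]:
    """Same branching as the UDF body: below 50, below 200, everything else."""
    if price is None:
        return None  # assumption: treat a NULL price as an unknown category
    if price < 50:
        return "Low"
    elif price < 200:
        return "Medium"
    return "High"


for price in (49.99, 50.0, 199.99, 200.0):
    print(price, "->", categorize_price(price))
```

Exactly 50 falls into Medium and exactly 200 into High, because both comparisons are strict.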

In [0]:
# Test the categorize_price function
result_df = spark.sql(f"""
  SELECT 
    product_name,
    price,
    {CATALOG}.{SILVER_SCHEMA}.categorize_price(price) as price_category
  FROM {CATALOG}.{BRONZE_SCHEMA}.products
  ORDER BY price
  LIMIT 10
""")

display(result_df)

In [0]:
# Create a view in the Silver layer - order aggregation per customer
spark.sql(f"""
  CREATE OR REPLACE VIEW {CATALOG}.{SILVER_SCHEMA}.customer_order_summary AS
  SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    c.country,
    COUNT(o.order_id) as total_orders,
    SUM(o.total_amount) as total_spent,
    AVG(o.total_amount) as avg_order_value,
    MAX(o.order_datetime) as last_order_date
  FROM {CATALOG}.{BRONZE_SCHEMA}.customers c
  LEFT JOIN {CATALOG}.{BRONZE_SCHEMA}.orders o
    ON c.customer_id = o.customer_id
  GROUP BY c.customer_id, c.first_name, c.last_name, c.country
""")

**Verify the view was created:**

In [None]:
print(f"✓ View customer_order_summary created in {CATALOG}.{SILVER_SCHEMA}")

**View customer_order_summary created**

The Silver-layer view joins customers and orders to aggregate per customer:
- **total_orders**: Number of orders
- **total_spent**: Total value of orders
- **avg_order_value**: Average order value
- **last_order_date**: Date of the most recent order

Customers without orders are kept by the LEFT JOIN, with `total_orders = 0` and NULL amounts.

Ready for consumption by the Gold layer and BI tools.

---

## 6️⃣ Access Management: GRANT / REVOKE

### Privilege hierarchy in Unity Catalog:

**Privilege levels**:
1. **Metastore-level**: CREATE CATALOG
2. **Catalog-level**: USE CATALOG, CREATE SCHEMA
3. **Schema-level**: USE SCHEMA, CREATE TABLE, CREATE FUNCTION, CREATE VOLUME
4. **Object-level**: SELECT, MODIFY (INSERT/UPDATE/DELETE/MERGE), EXECUTE

**Securable Objects - Inheritance**:
- Privileges are inherited down the hierarchy
- GRANT on a catalog → inherited by all its schemas and tables
- GRANT on a schema → inherited by all tables in that schema
- Privileges can also be granted at a specific level for fine-grained control
- Reading a table requires USE CATALOG and USE SCHEMA on its parents in addition to SELECT
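The inheritance rules above can be captured in a toy model. This is a deliberately simplified sketch, not how Unity Catalog evaluates access: it ignores ownership, `ALL PRIVILEGES`, and group membership, and the function `can_select` and the catalog name `demo` are hypothetical.

```python
def can_select(grants: set, principal: str, catalog: str, schema: str, table: str) -> bool:
    """Simplified check: USE CATALOG + USE SCHEMA + SELECT, with downward inheritance."""
    schema_name = f"{catalog}.{schema}"
    table_name = f"{schema_name}.{table}"

    def has(privilege: str, *securables: str) -> bool:
        # A privilege counts if it was granted on any of the listed securables
        return any((principal, privilege, s) in grants for s in securables)

    return (
        has("USE CATALOG", catalog)
        and has("USE SCHEMA", catalog, schema_name)
        and has("SELECT", catalog, schema_name, table_name)
    )


# Grants as (principal, privilege, securable) tuples
grants = {
    ("data-analysts", "USE CATALOG", "demo"),
    ("data-analysts", "USE SCHEMA", "demo.silver"),
    ("data-analysts", "SELECT", "demo.silver"),
}
print(can_select(grants, "data-analysts", "demo", "silver", "customer_order_summary"))
print(can_select(grants, "data-analysts", "demo", "bronze", "customers"))
```

The schema-level SELECT covers every table in `demo.silver`, while the Bronze table stays inaccessible because neither USE SCHEMA nor SELECT reaches it.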

### GRANT/REVOKE examples:

In [None]:
# Setup: create groups for demonstration purposes
# Note: CREATE GROUP creates workspace-local groups, while Unity Catalog grants and
# is_account_group_member() work with account-level groups. If the GRANT cells below fail,
# create these groups in the account console (requires account admin) or use existing groups.
try:
    spark.sql("CREATE GROUP IF NOT EXISTS `data-analysts`")
    spark.sql("CREATE GROUP IF NOT EXISTS `data-engineers`")
    spark.sql("CREATE GROUP IF NOT EXISTS `finance-team`")
    spark.sql("CREATE GROUP IF NOT EXISTS `marketing-team`")
    print("✓ Groups created or already exist")
except Exception as e:
    print(f"⚠️ Could not create groups (likely due to permissions): {e}")
    print("Please ensure these groups exist in your Databricks Account or use existing groups.")

In [0]:
# Grant catalog access to data analysts
spark.sql(f"""
    GRANT USE CATALOG ON CATALOG {CATALOG} TO `data-analysts`
""")

spark.sql(f"""
    GRANT USE SCHEMA ON SCHEMA {CATALOG}.{SILVER_SCHEMA} TO `data-analysts`
""")

spark.sql(f"""
    GRANT SELECT ON SCHEMA {CATALOG}.{SILVER_SCHEMA} TO `data-analysts`
""")

**Privileges for data-analysts set**

The `data-analysts` group received:
- **USE CATALOG**: Access to the catalog
- **USE SCHEMA**: Access to the Silver schema
- **SELECT**: Read access to data in the Silver schema

In [0]:
# Grant catalog-level privileges to data engineers (schema-level grants follow below)
spark.sql(f"""
    GRANT USE CATALOG, CREATE SCHEMA ON CATALOG {CATALOG} TO `data-engineers`
""")

**Privileges for Data Analysts (Gold Layer):**

In [None]:
# GRANT for data-analysts on the Gold schema
spark.sql(f"""
  GRANT USE SCHEMA ON SCHEMA {CATALOG}.{GOLD_SCHEMA} TO `data-analysts`
""")

spark.sql(f"""
  GRANT SELECT ON SCHEMA {CATALOG}.{GOLD_SCHEMA} TO `data-analysts`
""")

**Full privileges for Data Engineers:**

In [None]:
# GRANT ALL PRIVILEGES for data-engineers
spark.sql(f"""
  GRANT ALL PRIVILEGES ON SCHEMA {CATALOG}.{BRONZE_SCHEMA} TO `data-engineers`
""")

spark.sql(f"""
  GRANT ALL PRIVILEGES ON SCHEMA {CATALOG}.{SILVER_SCHEMA} TO `data-engineers`
""")

spark.sql(f"""
  GRANT ALL PRIVILEGES ON SCHEMA {CATALOG}.{GOLD_SCHEMA} TO `data-engineers`
""")

**Privileges for groups set**

- **data-analysts**: USE SCHEMA + SELECT on Silver and Gold (curated and aggregated data, no Bronze access)
- **data-engineers**: ALL PRIVILEGES on Bronze/Silver/Gold (full control over the data pipeline)

In [0]:
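# GRANT on specific objects
# customer_order_summary was created in the Silver schema above
spark.sql(f"""
    GRANT SELECT ON TABLE {CATALOG}.{SILVER_SCHEMA}.customer_order_summary TO `finance-team`
""")

# customers_masked is created in the Data Masking section below;
# run that cell first, otherwise this GRANT fails because the view does not exist yet
spark.sql(f"""
    GRANT SELECT ON TABLE {CATALOG}.{GOLD_SCHEMA}.customers_masked TO `marketing-team`
""")

**Table-specific access control**

Fine-grained permissions:
- **finance-team**: Access to customer_order_summary (revenue analysis)
- **marketing-team**: Access to customers_masked (customer insights with PII masking)

Both groups also need USE CATALOG and USE SCHEMA on the parent objects before these SELECT grants take effect.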

In [0]:
# GRANT EXECUTE on functions
spark.sql(f"""
  GRANT EXECUTE ON FUNCTION {CATALOG}.{SILVER_SCHEMA}.mask_customer_id TO `data-analysts`
""")

spark.sql(f"""
  GRANT EXECUTE ON FUNCTION {CATALOG}.{SILVER_SCHEMA}.categorize_price TO `data-analysts`
""")

**EXECUTE permissions on functions**

The `data-analysts` group can execute:
- **mask_customer_id**: Masking of customer identifiers
- **categorize_price**: Categorization of product prices

Functions in Unity Catalog are securables with their own privilege (EXECUTE), separate from table privileges.

In [0]:
# Verify the privileges on the table
spark.sql(f"""
    SHOW GRANTS ON TABLE {CATALOG}.{BRONZE_SCHEMA}.customers
""").display()

**Privileges on the customers table**

`SHOW GRANTS` lists all privileges on the given table:
- **Principal**: User or group holding the privilege
- **Action**: Privilege type (SELECT, MODIFY, etc.)
- **Object**: The securable the grant applies to (table, schema, catalog)

### Ownership and transfer:

---

## 3️⃣ Data Masking and Row-Level Security

### Column-level masking (Dynamic Views):

Use `is_account_group_member()` (and, for per-user rules, `current_user()`) for conditional masking:

In [0]:
# Create masked view for PII data
spark.sql(f"""
  CREATE OR REPLACE VIEW {CATALOG}.{GOLD_SCHEMA}.customers_masked AS
  SELECT 
    customer_id,
    CASE 
      WHEN is_account_group_member('pii-access-group') THEN first_name
      ELSE CONCAT(LEFT(first_name, 1), '***')
    END as first_name,
    CASE 
      WHEN is_account_group_member('pii-access-group') THEN last_name
      ELSE CONCAT(LEFT(last_name, 1), '***')
    END as last_name,
    city,
    country,
    registration_date
  FROM {CATALOG}.{BRONZE_SCHEMA}.customers
""")

**View customers_masked created**

A view with dynamic masking of PII:
- **pii-access-group**: Sees full first and last names
- **Other groups**: See only the first letter + `***`

Masking based on `is_account_group_member()` is evaluated at query time for the calling user, so no data is duplicated.
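The `CASE` expression in the view can be mirrored in plain Python to preview its output for both kinds of users. This is an illustrative sketch: `mask_name` is a hypothetical helper, and the boolean `has_pii_access` stands in for the result of `is_account_group_member('pii-access-group')`.

```python
from typing import Optional


def mask_name(name: Optional[str], has_pii_access: bool) -> Optional[str]:
    """Mirror of the view logic: full value for PII readers, first letter + *** otherwise."""
    if has_pii_access:
        return name
    if name is None:
        return None  # CONCAT with a NULL argument yields NULL
    return name[:1] + "***"


print(mask_name("Anna", has_pii_access=True))
print(mask_name("Anna", has_pii_access=False))
```

Every masked value has the same length, so the mask does not leak how long the original name was.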

In [0]:
# Test the view with masking
result_df = spark.sql(f"""
  SELECT * FROM {CATALOG}.{GOLD_SCHEMA}.customers_masked LIMIT 10
""")

display(result_df)

**Masked data displayed**

First and last names are masked for users who are not members of `pii-access-group`: only the first letter followed by `***` is visible.

In [0]:
# Create a view with a hashed customer_id
spark.sql(f"""
  CREATE OR REPLACE VIEW {CATALOG}.{GOLD_SCHEMA}.orders_hashed AS
  SELECT 
    order_id,
    SHA2(CAST(customer_id AS STRING), 256) as customer_id_hash,
    product_id,
    quantity,
    total_amount,
    order_datetime,
    status
  FROM {CATALOG}.{BRONZE_SCHEMA}.orders
""")

print(f"✓ View orders_hashed created - customer_id is hashed")
print("  - Analysts can aggregate without exposing customer_id")

**View orders_hashed created**

customer_id is hashed with SHA2-256. This gives:
- **Analysts**: Aggregation per customer without seeing the raw customer_id
- **Privacy**: The same ID always maps to the same hash, so grouping and joins still work
- **Compliance**: This is pseudonymization rather than anonymization; an unsalted hash of a short numeric ID can be recovered by hashing all candidate IDs, so privacy regulations such as GDPR may still treat the data as personal
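How strong is a plain SHA-256 of a numeric ID? The sketch below reproduces the hashing step with Python's `hashlib` and then tries every ID in a small range. It assumes IDs are positive integers below a known bound; `hash_customer_id` and `recover_id` are hypothetical helpers written for this illustration.

```python
import hashlib


def hash_customer_id(customer_id: int) -> str:
    """Hex SHA-256 of the decimal string, the same input the view passes to SHA2(..., 256)."""
    return hashlib.sha256(str(customer_id).encode("utf-8")).hexdigest()


def recover_id(target_hash: str, max_id: int):
    """Brute-force search over 1..max_id; returns the matching ID or None."""
    for candidate in range(1, max_id + 1):
        if hash_customer_id(candidate) == target_hash:
            return candidate
    return None


h = hash_customer_id(4711)
print(len(h))                 # 64 hex characters
print(recover_id(h, 10_000))  # 4711
```

Because the ID space is small, the search finishes almost instantly. Hashed IDs of this kind should therefore be treated as pseudonymized, and a secret salt or a keyed hash is worth considering when stronger protection is needed.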

### Row-Level Security (RLS):

Restrict which rows users can see based on their identity or group membership:

In [0]:
# Create an RLS view - access per country
spark.sql(f"""
    CREATE OR REPLACE VIEW {CATALOG}.{GOLD_SCHEMA}.customers_rls AS
    SELECT *
    FROM {CATALOG}.{BRONZE_SCHEMA}.customers
    WHERE 
        CASE 
            WHEN is_account_group_member('global-access') THEN TRUE
            WHEN is_account_group_member('poland-team') THEN country = 'Poland'
            WHEN is_account_group_member('germany-team') THEN country = 'Germany'
            WHEN is_account_group_member('france-team') THEN country = 'France'
            ELSE FALSE
        END
""")

print(f"✓ RLS view created - users see only customers from their own country")

**RLS view customers_rls created**

Row-level security filters rows based on group membership:
- **global-access**: Sees all customers
- **poland-team**: Only customers from Poland
- **germany-team**: Only customers from Germany
- **france-team**: Only customers from France
- **Other groups**: No access (FALSE)

Because `CASE` returns the first matching branch, a user in several groups is filtered by the first group listed. Rows are filtered automatically without duplicating data.
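The `CASE` filter in this view stops at the first matching `WHEN`, which has a consequence worth testing: a user who belongs to two country teams sees only the first team's rows. The following sketch simulates that behavior in plain Python; `RULES` and `visible_rows` are hypothetical names, and the sample rows are made up for the example.

```python
# Ordered like the WHEN branches of the view; None means "no country filter"
RULES = [
    ("global-access", None),
    ("poland-team", "Poland"),
    ("germany-team", "Germany"),
    ("france-team", "France"),
]


def visible_rows(rows: list, user_groups: set) -> list:
    """Apply the first rule whose group the user belongs to; no match means no rows."""
    for group, country in RULES:
        if group in user_groups:
            if country is None:
                return list(rows)
            return [r for r in rows if r["country"] == country]
    return []


customers = [
    {"customer_id": 1, "country": "Poland"},
    {"customer_id": 2, "country": "Germany"},
    {"customer_id": 3, "country": "France"},
]
print(visible_rows(customers, {"poland-team", "germany-team"}))
```

If multi-team users should see the union of their countries, the view would need `OR`-combined conditions instead of a single `CASE`.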

In [0]:
# RLS on orders - filtering per role
spark.sql(f"""
  CREATE OR REPLACE VIEW {CATALOG}.{GOLD_SCHEMA}.orders_rls AS
  SELECT 
    o.*
  FROM {CATALOG}.{BRONZE_SCHEMA}.orders o
  WHERE 
    is_account_group_member('admin') OR
    (is_account_group_member('finance-team') AND o.status IN ('completed', 'shipped')) OR
    (is_account_group_member('warehouse-team') AND o.status IN ('pending', 'processing', 'shipped'))
""")

**RLS view for orders created**

Role-based filtering of orders:
- **admin**: Sees all orders
- **finance-team**: Only completed and shipped (revenue-relevant)
- **warehouse-team**: pending, processing, shipped (operational orders)

Different groups see different subsets of the same table.

**Granting privileges on the RLS views:**

In [0]:
# GRANT access to customers_rls
spark.sql(f"""
  GRANT SELECT ON VIEW {CATALOG}.{GOLD_SCHEMA}.customers_rls TO `all-users`
""")

**Granting privileges on orders_rls**

In [None]:
# GRANT access to orders_rls
spark.sql(f"""
  GRANT SELECT ON VIEW {CATALOG}.{GOLD_SCHEMA}.orders_rls TO `all-users`
""")

**Revoking access to base tables (enforcement):**

In [None]:
# Revoke direct access to base table
spark.sql(f"""
    REVOKE SELECT ON TABLE {CATALOG}.{BRONZE_SCHEMA}.orders FROM `all-users`
""")

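**RLS Views - Access control setup**

Security pattern:
1. **GRANT SELECT** on the RLS views to `all-users`
2. **REVOKE SELECT** on the base tables (forces use of the views)
3. **Automatic filtering** based on group membership

Users can SELECT from the views but not from the base tables, which is what enforces RLS. REVOKE only removes a grant made directly on the table, so privileges inherited from the schema or catalog must be revoked at that level.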
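The grant-on-view, revoke-on-table pattern repeats for every protected table, so the statement pair can be generated. A minimal sketch, assuming the same statement shapes used in the cells above; `rls_enforcement_sql` and the `demo.*` object names are hypothetical.

```python
def rls_enforcement_sql(view: str, base_table: str, principal: str) -> list:
    """Return the GRANT on the view followed by the REVOKE on the base table."""
    return [
        f"GRANT SELECT ON VIEW {view} TO `{principal}`",
        f"REVOKE SELECT ON TABLE {base_table} FROM `{principal}`",
    ]


pairs = [
    ("demo.gold.customers_rls", "demo.bronze.customers"),
    ("demo.gold.orders_rls", "demo.bronze.orders"),
]
for view, table in pairs:
    for stmt in rls_enforcement_sql(view, table, "all-users"):
        print(stmt)  # in a notebook: spark.sql(stmt)
```

Generating both statements together makes it harder to protect a view while forgetting its base table.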

---

## 4️⃣ Data Lineage and Audit Logging

### Querying Data Lineage:

Unity Catalog automatically tracks lineage for:
- Table → Table (ETL transformations)
- Notebook → Table (data writes)
- Dashboard → Table (BI queries)
- ML Model → Table (training data)

**General Table Lineage**

In [0]:
# Query table lineage from system tables
# The lineage system tables expose the event timestamp as event_time
lineage_df = spark.sql(f"""
  SELECT 
    source_table_full_name,
    source_type,
    target_table_full_name,
    target_type,
    event_time,
    created_by
  FROM system.access.table_lineage
  WHERE target_table_full_name LIKE '{CATALOG}.%'
  ORDER BY event_time DESC
  LIMIT 50
""")

display(lineage_df)

**Lineage for tables in the catalog displayed**

The system tracks lineage automatically for:
- **Table → Table**: ETL transformations
- **Notebook → Table**: Data writes
- **Dashboard → Table**: BI queries
- **ML Model → Table**: Training data

Lineage is available through `system.access.table_lineage` without extra instrumentation.

**1. Upstream Lineage (Sources)**

In [0]:
# Find upstream dependencies (sources) for a table
upstream_df = spark.sql(f"""
    SELECT DISTINCT
        source_table_full_name,
        source_type
    FROM system.access.table_lineage
    WHERE target_table_full_name = '{CATALOG}.{SILVER_SCHEMA}.customer_order_summary'
""")

display(upstream_df)

**⬆️ Upstream: source tables for customer_order_summary**

Shows all tables used as data sources by the `customer_order_summary` view. Useful for impact analysis when upstream tables change.

**2. Downstream Lineage (Consumers)**

In [0]:
# Find downstream dependencies (consumers) of a table
downstream_df = spark.sql(f"""
    SELECT DISTINCT
        target_table_full_name,
        target_type
    FROM system.access.table_lineage
    WHERE source_table_full_name = '{CATALOG}.{BRONZE_SCHEMA}.customers'
""")

display(downstream_df)

**⬇️ Downstream: views/tables that read from customers**

Shows all views and tables that consume data from the `customers` table. Critical for understanding the impact of changes and for data governance.
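The upstream and downstream queries above return direct dependencies only. For impact analysis it is often useful to walk the graph transitively. The sketch below does this with a breadth-first search over `(source, target)` pairs, such as rows collected from the lineage table; `upstream` is a hypothetical helper, and the edge to `demo.gold.revenue_report` is invented for the example.

```python
from collections import deque


def upstream(edges: list, target: str) -> set:
    """Return every table that feeds `target`, directly or through other tables."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)

    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for src in parents.get(node, ()):
            if src not in seen:  # also protects against cycles
                seen.add(src)
                queue.append(src)
    return seen


edges = [
    ("demo.bronze.customers", "demo.silver.customer_order_summary"),
    ("demo.bronze.orders", "demo.silver.customer_order_summary"),
    ("demo.silver.customer_order_summary", "demo.gold.revenue_report"),  # hypothetical
]
print(sorted(upstream(edges, "demo.gold.revenue_report")))
```

Swapping source and target in each pair turns the same function into a transitive downstream search.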

**3. Column-Level Lineage**

In [0]:
# Column-level lineage (if available)
# The lineage system tables expose the event timestamp as event_time
column_lineage = spark.sql(f"""
    SELECT 
        source_table_full_name,
        source_column_name,
        target_table_full_name,
        target_column_name,
        event_time
    FROM system.access.column_lineage
    WHERE target_table_full_name = '{CATALOG}.{SILVER_SCHEMA}.customer_order_summary'
    ORDER BY target_column_name
""")

display(column_lineage)

**📊 Column-level lineage for customer_order_summary**

Unity Catalog tracks lineage at the column level: which source-table columns feed which target-table columns. This level of detail supports data governance and impact analysis.

### Audit Logging:

Unity Catalog logs all access and operations:

**1. General Audit Logs**

In [0]:
# Query audit logs
audit_df = spark.sql("""
    SELECT 
        event_time,
        user_identity.email as user_email,
        service_name,
        action_name,
        request_params.full_name_arg as table_name,
        response.status_code,
        request_id
    FROM system.access.audit
    WHERE action_name IN ('getTable', 'createTable', 'deleteTable', 'updateTable')
        AND event_date >= current_date() - INTERVAL 7 DAYS
    ORDER BY event_time DESC
    LIMIT 100
""")
audit_df.display()

**2. Sensitive Data Access**

In [0]:
# Track who accessed sensitive tables
sensitive_access = spark.sql(f"""
    SELECT 
        event_time,
        user_identity.email as user,
        action_name,
        request_params.full_name_arg as table_accessed,
        source_ip_address
    FROM system.access.audit
    WHERE request_params.full_name_arg LIKE '{CATALOG}.%.customers%'
        AND action_name = 'getTable'
        AND event_date >= current_date() - INTERVAL 7 DAYS
    ORDER BY event_time DESC
    LIMIT 100
""")

display(sensitive_access)

**🔒 Audit logs: access to the customers table (last 7 days)**

Monitoring access to sensitive tables containing PII:
- **Who**: User email
- **When**: Event time
- **What**: Table name
- **From where**: Source IP address

Key for compliance (GDPR, HIPAA) and security monitoring.
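Once audit rows are collected to the driver, a quick per-user summary helps spot unusual access patterns. This sketch works on plain dictionaries with the `user` and `table_accessed` keys from the query above. The sample events are invented, and `accesses_per_user` is a hypothetical helper.

```python
from collections import Counter


def accesses_per_user(events: list) -> Counter:
    """Count audit events per user, skipping rows without a user email."""
    return Counter(e["user"] for e in events if e.get("user"))


# Invented sample rows shaped like the query result
events = [
    {"user": "anna@example.com", "table_accessed": "demo.bronze.customers"},
    {"user": "jan@example.com", "table_accessed": "demo.bronze.customers"},
    {"user": "anna@example.com", "table_accessed": "demo.gold.customers_masked"},
    {"user": None, "table_accessed": "demo.bronze.customers"},
]
print(accesses_per_user(events).most_common(1))
```

For large audit volumes the same aggregation is better done in SQL with `GROUP BY`, so that only the summary leaves the cluster.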

**3. Privilege Changes**

In [0]:
# Grant/Revoke audit trail
# Assumption to verify against the audit log reference for your workspace:
# Unity Catalog records GRANT and REVOKE as the 'updatePermissions' action,
# with the added/removed privileges listed in request_params.changes
grant_audit = spark.sql("""
    SELECT 
        event_time,
        user_identity.email as admin_user,
        action_name,
        request_params.securable_type as object_type,
        request_params.securable_full_name as object_name,
        request_params.changes as permission_changes
    FROM system.access.audit
    WHERE service_name = 'unityCatalog'
        AND action_name = 'updatePermissions'
        AND event_date >= current_date() - INTERVAL 30 DAYS
    ORDER BY event_time DESC
""")

display(grant_audit)

**📝 Audit trail of privilege changes**

A complete audit trail of privilege changes:
- **Admin user**: Who ran the GRANT/REVOKE
- **Action**: updatePermissions
- **Object**: The securable affected (table, schema, catalog)
- **Changes**: Which privileges were added or removed, and for which principal

Essential for governance and compliance audits.

---

## 5️⃣ Delta Sharing

**Delta Sharing** = Secure data sharing protocol (cross-org, cross-cloud)

### Components:
- **Share**: a collection of tables to share
- **Recipient**: the organization/user receiving the data
- **Provider**: the data owner (you)

### Create Share:

In [0]:
# Create a Share for external partners
share_name = f"{CATALOG}_partner_share"

spark.sql(f"""
  CREATE SHARE IF NOT EXISTS {share_name}
  COMMENT 'KION data shared with business partners'
""")

**Share `{share_name}` created**

A Delta Sharing Share is a collection of tables for secure sharing with external partners:
- **Cross-org**: Between different organizations
- **Cross-cloud**: AWS ↔ Azure ↔ GCP
- **Open protocol**: An open-source standard

In [0]:
# Add the aggregated view to the Share
# customer_order_summary is a view in the Silver schema (created above), so it is added with ADD VIEW
spark.sql(f"""
  ALTER SHARE {share_name}
  ADD VIEW {CATALOG}.{SILVER_SCHEMA}.customer_order_summary
""")

**customer_order_summary added to the Share**

Best practice: share only aggregated, consumer-ready data (typically the Gold layer):
- **Security**: No access to raw data
- **Privacy**: Aggregations hide individual records
- **Stability**: Curated layers have a stable schema and structure

In [0]:
# Verify the contents of the Share
spark.sql(f"SHOW ALL IN SHARE {share_name}").display()

**Objects in the Share verified**

The Share now contains the added objects and can be granted to recipients. Recipients receive an activation link to consume the shared data via the Delta Sharing protocol.

### Create Recipient:

### Consuming shared data (as recipient):

### Best practices for Delta Sharing:

1. **Share only aggregated/gold data**: do not share raw/bronze layers
2. **Use views for masking**: create a view with masked PII before sharing
3. **Monitor access**: track who accesses shared data
4. **Version control**: use table versions for stable APIs
5. **Documentation**: provide clear documentation for recipients
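
Practice 2 could be sketched as follows. This is an illustration under assumptions, not the notebook's own implementation: the column names (`customer_id`, `total_orders`, `email`) and the view name are hypothetical, hashing with `sha2` is just one possible masking choice, and sharing views has additional platform requirements that should be checked in the Databricks documentation.

```python
def masked_share_view_statements(
    share: str,
    source_table: str,
    view_name: str,
    plain_cols: list[str],
    hashed_cols: list[str],
) -> list[str]:
    """Build SQL that creates a view with hashed PII columns and adds it to a share."""
    select_list = plain_cols + [
        f"sha2(CAST({col} AS STRING), 256) AS {col}_hash" for col in hashed_cols
    ]
    return [
        f"CREATE OR REPLACE VIEW {view_name} AS "
        f"SELECT {', '.join(select_list)} FROM {source_table}",
        f"ALTER SHARE {share} ADD VIEW {view_name}",
    ]


# Hypothetical names, used for illustration only
masked = masked_share_view_statements(
    share="demo_partner_share",
    source_table="demo_catalog.gold.customer_order_summary",
    view_name="demo_catalog.gold.customer_order_summary_masked",
    plain_cols=["customer_id", "total_orders"],
    hashed_cols=["email"],
)
for stmt in masked:
    print(stmt)
    # In a Databricks notebook: spark.sql(stmt)
```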

---

## 6️⃣ Best Practices for Data Governance

### 1. Catalog organization strategy:
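
One possible layout, sketched under the assumption of environment-based catalogs (`dev`, `test`, `prod`), each with medallion schemas (`bronze`, `silver`, `gold`). The helper only generates the DDL text. Whether catalogs are split by environment, by domain, or by both is a design decision this sketch does not settle.

```python
from itertools import product

ENVIRONMENTS = ["dev", "test", "prod"]   # one catalog per environment
LAYERS = ["bronze", "silver", "gold"]    # one schema per medallion layer


def catalog_layout_statements(environments: list[str], layers: list[str]) -> list[str]:
    """Build CREATE CATALOG / CREATE SCHEMA statements for every environment and layer."""
    statements = [f"CREATE CATALOG IF NOT EXISTS {env}" for env in environments]
    statements += [
        f"CREATE SCHEMA IF NOT EXISTS {env}.{layer}"
        for env, layer in product(environments, layers)
    ]
    return statements


layout = catalog_layout_statements(ENVIRONMENTS, LAYERS)
for stmt in layout:
    print(stmt)
    # In a Databricks notebook: spark.sql(stmt)
```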

### 2. Access control patterns:
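
A sketch of a group-based, least-privilege pattern: each group gets `USE CATALOG`, `USE SCHEMA`, and only the data privileges it needs on one schema. The group names (`data-analysts`, `data-engineers`) and the catalog name are hypothetical, and the mapping of groups to schemas is an assumption made for illustration.

```python
def group_grants(catalog: str, schema: str, group: str, privileges: list[str]) -> list[str]:
    """Build the GRANT statements a group needs to work with one schema."""
    principal = f"`{group}`"  # backticks allow group names that contain hyphens
    statements = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {principal}",
    ]
    statements += [
        f"GRANT {privilege} ON SCHEMA {catalog}.{schema} TO {principal}"
        for privilege in privileges
    ]
    return statements


# Hypothetical mapping: group -> (schema, privileges)
GROUP_ACCESS = {
    "data-analysts": ("gold", ["SELECT"]),
    "data-engineers": ("silver", ["SELECT", "MODIFY"]),
}

grant_statements = []
for group, (schema, privileges) in GROUP_ACCESS.items():
    grant_statements += group_grants("demo_catalog", schema, group, privileges)

for stmt in grant_statements:
    print(stmt)
    # In a Databricks notebook: spark.sql(stmt)
```

Granting on the schema rather than on individual tables relies on privilege inheritance, so tables added to the schema later are covered as well.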

### 3. Tagging and documentation:
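
A small sketch of tagging with `ALTER TABLE ... SET TAGS`. The tag keys (`layer`, `pii`) and the table name are hypothetical; a real deployment would agree on a tag vocabulary first.

```python
def set_tags_statement(table: str, tags: dict[str, str]) -> str:
    """Build an ALTER TABLE ... SET TAGS statement; keys are sorted for stable output."""
    pairs = ", ".join(f"'{key}' = '{value}'" for key, value in sorted(tags.items()))
    return f"ALTER TABLE {table} SET TAGS ({pairs})"


# Hypothetical table and tag values, for illustration only
tag_stmt = set_tags_statement(
    "demo_catalog.gold.customer_order_summary",
    {"pii": "false", "layer": "gold"},
)
print(tag_stmt)
# In a Databricks notebook: spark.sql(tag_stmt)
```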

### 4. Monitoring and alerts:

In [None]:
# 1. Tables without owners
unowned_tables = spark.sql(f"""
    SELECT 
        table_catalog,
        table_schema,
        table_name
    FROM system.information_schema.tables
    WHERE table_catalog = '{CATALOG}'
        AND table_owner IS NULL
""")

display(unowned_tables)

**⚠️ Tables without owners**

Every table should have an assigned owner for accountability and governance. Tables without an owner make access management and lifecycle management harder.
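
A possible follow-up, sketched under the assumption that flagged tables should be handed to a stewardship group. The group name `data-platform-admins` and the sample row are hypothetical. The helper turns `(catalog, schema, table)` tuples into `ALTER TABLE ... OWNER TO` statements.

```python
def assign_owner_statements(
    tables: list[tuple[str, str, str]], owner_group: str
) -> list[str]:
    """Build one ALTER TABLE ... OWNER TO statement per (catalog, schema, table) tuple."""
    return [
        f"ALTER TABLE {catalog}.{schema}.{table} OWNER TO `{owner_group}`"
        for catalog, schema, table in tables
    ]


# In the notebook the tuples could come from the query result, for example:
# rows = [(r.table_catalog, r.table_schema, r.table_name) for r in unowned_tables.collect()]
rows = [("demo_catalog", "gold", "customer_order_summary")]  # hypothetical sample row

owner_statements = assign_owner_statements(rows, "data-platform-admins")
for stmt in owner_statements:
    print(stmt)
    # In a Databricks notebook: spark.sql(stmt)
```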

In [None]:
# 2. Tables without documentation
undocumented = spark.sql(f"""
    SELECT 
        table_catalog,
        table_schema,
        table_name
    FROM system.information_schema.tables
    WHERE table_catalog = '{CATALOG}'
        AND (comment IS NULL OR comment = '')
""")

display(undocumented)

**📝 Tables without documentation**

Every table should have a `COMMENT` that describes:
- **Purpose**: What the table contains and what it is used for
- **Source**: Where the data comes from
- **Owner**: Who is responsible
- **Retention**: How long the data is kept
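
The four elements above can be folded into a single comment. The sketch below assumes a simple `Key: value | Key: value` layout, which is a convention invented for this example, and the table name and values are hypothetical. Values containing single quotes are rejected rather than escaped, to keep the sketch short.

```python
def table_comment_statement(
    table: str, purpose: str, source: str, owner: str, retention: str
) -> str:
    """Build a COMMENT ON TABLE statement covering purpose, source, owner, and retention."""
    comment = (
        f"Purpose: {purpose} | Source: {source} | "
        f"Owner: {owner} | Retention: {retention}"
    )
    if "'" in comment:
        raise ValueError("single quotes are not supported in this simple sketch")
    return f"COMMENT ON TABLE {table} IS '{comment}'"


# Hypothetical values, for illustration only
comment_stmt = table_comment_statement(
    "demo_catalog.gold.customer_order_summary",
    purpose="Order totals per customer",
    source="silver.orders",
    owner="data-platform-admins",
    retention="3 years",
)
print(comment_stmt)
# In a Databricks notebook: spark.sql(comment_stmt)
```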

In [None]:
# 3. Unused tables (no access recorded in the audit log in the last 90 days)
unused_tables = spark.sql(f"""
    WITH recent_access AS (
        SELECT DISTINCT request_params.full_name_arg as table_name
        FROM system.access.audit
        WHERE action_name = 'getTable'
            AND event_date >= current_date() - INTERVAL 90 DAYS
    )
    SELECT 
        t.table_catalog,
        t.table_schema,
        t.table_name,
        t.created as table_created_at
    FROM system.information_schema.tables t
    LEFT JOIN recent_access ra 
        ON CONCAT(t.table_catalog, '.', t.table_schema, '.', t.table_name) = ra.table_name
    WHERE t.table_catalog = '{CATALOG}'
        AND ra.table_name IS NULL
        AND t.created < current_date() - INTERVAL 90 DAYS
""")

display(unused_tables)

**🗑️ Unused tables (90+ days without access)**

Tables with no access in the last 90 days may be candidates for:
- **Archiving**: Moving to cold storage
- **Deprecation**: Marking as deprecated
- **Cleanup**: Deletion after verification with business stakeholders

Monitoring usage helps optimize storage costs.

---

## ✅ Summary

### What you learned:

✅ **Unity Catalog Architecture**: Metastore → Catalog → Schema → Tables  
✅ **Access Control**: GRANT/REVOKE privileges at multiple levels  
✅ **Data Masking**: Column-level masking with dynamic views  
✅ **Row-Level Security**: Filter data based on user identity  
✅ **Data Lineage**: Track data flow through system tables  
✅ **Audit Logging**: Monitor who accessed what and when  
✅ **Delta Sharing**: Secure cross-organization data sharing  

### Key Takeaways:

1. **Unified Governance**: Single platform for all data assets
2. **Fine-grained Control**: Table, column, row-level security
3. **Automatic Lineage**: No extra instrumentation needed
4. **Compliance-ready**: Audit logs for regulatory requirements
5. **Secure Sharing**: Delta Sharing for external collaboration

### Next steps:
- **Notebook 05**: BI & ML Integrations
- **Workshop 03**: Governance + Integrations hands-on

---

## 📚 Additional resources

- [Unity Catalog Documentation](https://docs.databricks.com/data-governance/unity-catalog/index.html)
- [Delta Sharing Protocol](https://delta.io/sharing/)
- [Unity Catalog Best Practices](https://docs.databricks.com/data-governance/unity-catalog/best-practices.html)

---

## ✅ Checklist - Unity Catalog Governance

After completing this notebook you should be able to:

- [ ] **UC Architecture**: Understand the hierarchy Metastore → Catalog → Schema → Objects
- [ ] **Creating objects**: Create Catalogs, Schemas, Tables, Views, Volumes, and Functions
- [ ] **GRANT/REVOKE**: Manage privileges at every level
- [ ] **Privileges**: Understand SELECT, MODIFY, CREATE TABLE, and EXECUTE
- [ ] **Data Masking**: Create Views that mask sensitive data
- [ ] **Row-Level Security**: Implement RLS based on group membership
- [ ] **Lineage**: Trace upstream/downstream dependencies
- [ ] **Audit Logging**: Query system.access.audit for user activity
- [ ] **Delta Sharing**: Create a Share and share data with external recipients
- [ ] **Best Practices**: Monitor governance health (owners, documentation, unused tables)

---

## 🔧 Troubleshooting

### Problem 1: "Table or view not found"
**Cause**: Missing USE CATALOG or USE SCHEMA privilege  
**Solution**:
```sql
GRANT USE CATALOG ON CATALOG <catalog_name> TO <principal>;
GRANT USE SCHEMA ON SCHEMA <catalog>.<schema> TO <principal>;
```

### Problem 2: "Permission denied" on SELECT
**Cause**: Missing SELECT privilege on the table  
**Solution**:
```sql
GRANT SELECT ON TABLE <catalog>.<schema>.<table> TO <principal>;
-- or on the entire schema:
GRANT SELECT ON SCHEMA <catalog>.<schema> TO <principal>;
```

### Problem 3: "Cannot execute function"
**Cause**: Missing EXECUTE privilege on the function  
**Solution**:
```sql
GRANT EXECUTE ON FUNCTION <catalog>.<schema>.<function_name> TO <principal>;
```

### Problem 4: "Volume not accessible"
**Cause**: Missing READ VOLUME / WRITE VOLUME privileges  
**Solution**:
```sql
GRANT READ VOLUME ON VOLUME <catalog>.<schema>.<volume> TO <principal>;
GRANT WRITE VOLUME ON VOLUME <catalog>.<schema>.<volume> TO <principal>;
```

### Problem 5: RLS View does not filter data
**Cause**: The user does not belong to any of the groups listed in the CASE WHEN expression  
**Solution**: Add the user to the appropriate group, or add a default fallback branch to the View
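
A sketch of what a default fallback could look like, assuming a deny-by-default policy: users outside every listed group match the `ELSE FALSE` branch and see no rows. The view, table, column, and group names are hypothetical, and the notebook's own RLS view may be structured differently.

```python
def rls_view_statement(
    view: str, source: str, filter_col: str, group_to_value: dict[str, str]
) -> str:
    """Build a row-filtering view with an explicit ELSE FALSE fallback branch."""
    branches = "\n".join(
        f"    WHEN is_account_group_member('{group}') THEN {filter_col} = '{value}'"
        for group, value in group_to_value.items()
    )
    # ELSE FALSE is the fallback: no matching group means no visible rows
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT * FROM {source}\n"
        f"WHERE CASE\n"
        f"{branches}\n"
        f"    ELSE FALSE\n"
        f"  END"
    )


# Hypothetical names, used for illustration only
rls_sql = rls_view_statement(
    "demo_catalog.gold.orders_rls",
    "demo_catalog.gold.orders",
    "region",
    {"sales-emea": "EMEA", "sales-amer": "AMER"},
)
print(rls_sql)
# In a Databricks notebook: spark.sql(rls_sql)
```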

### Problem 6: Lineage does not show dependencies
**Cause**: Lineage is captured automatically, but it can lag by a few minutes  
**Solution**: Wait 5-10 minutes and query system.access.table_lineage again

### Problem 7: Share not visible to the recipient
**Cause**: The recipient has not used the activation link  
**Solution**: Send the activation link shown by DESCRIBE RECIPIENT

---

## 🏆 Best Practices Summary

### 1. **Catalog Organization**
- ✅ Use environment-based catalogs: `dev`, `test`, `prod`
- ✅ Organize schemas by layer: `bronze`, `silver`, `gold`
- ✅ Follow naming conventions: `<catalog>.<schema>.<object>`

### 2. **Access Control**
- ✅ **Principle of Least Privilege**: Grant only the minimum privileges required
- ✅ Grant to groups, not to individual users
- ✅ Inheritance: a GRANT on a Catalog is inherited by its Schemas, and in turn by their Tables
- ✅ Audit privileges regularly (SHOW GRANTS)

### 3. **Data Masking & RLS**
- ✅ Mask PII in Views for users outside pii-access-group
- ✅ Use RLS for multi-tenant scenarios
- ✅ Always test masking with different group memberships

### 4. **Lineage & Audit**
- ✅ Use automatic lineage to trace data flow
- ✅ Review audit logs for sensitive tables regularly
- ✅ Monitor lineage after pipeline changes

### 5. **Delta Sharing**
- ✅ Share only the Gold layer (aggregated data)
- ✅ Use masked Views in a Share
- ✅ Document Share contracts for recipients

### 6. **Documentation & Governance**
- ✅ Add a COMMENT to all tables, views, and functions
- ✅ Use Table Properties for metadata (owner, PII, retention)
- ✅ Run governance health checks regularly

### 7. **Volumes & Functions**
- ✅ Use Managed Volumes for ML artifacts and staging
- ✅ Centralize business logic in UC Functions
- ✅ Control access with GRANT EXECUTE

---