<div style="display: flex; justify-content: space-between; align-items: center; padding: 8px 16px; background: #F8F9FA; border-bottom: 2px solid #E0E0E0; margin: 0; line-height: 1;">
    <div style="font-size: 14px; color: #666;">
        <span style="font-weight: bold; color: #333;">{SOURCE_PLATFORM} → Databricks Migration</span>
        <span style="margin-left: 8px; color: #999;">|</span>
        <span style="margin-left: 8px;">04 - Activate</span>
    </div>
    <div style="display: flex; align-items: center; gap: 8px;">
        <img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="24" height="24"/>
        <span style="color: #999; font-size: 16px;">→</span>
        <img src="https://cdn.simpleicons.org/databricks/FF3621" width="24" height="24"/>
    </div>
</div>


<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>

# Testing and Data Validation

## Overview

Data validation determines whether a migration can be trusted. Before cutover, verify that the migrated data matches the source for completeness, accuracy, and integrity. This lesson covers the validation strategies, testing frameworks, and reconciliation techniques that provide that evidence.

## Learning Objectives

By the end of this lesson, you will be able to:
- Design a comprehensive data validation strategy
- Implement row count, aggregate, and content validation checks
- Use testing frameworks (Great Expectations, Chispa, pytest) for automated validation
- Build reconciliation reports comparing source and target
- Establish acceptance criteria and sign-off processes

## Validation Strategy

A structured validation approach reduces the chance that data issues reach production. Validate at multiple levels, each deeper and more expensive than the last.

<br />
<div class="mermaid">
flowchart TB
    subgraph L1["<b>Level 1: Structural</b>"]
        S1["Schema Match"]
        S2["Table Existence"]
        S3["Column Names & Types"]
    end
    subgraph L2["<b>Level 2: Quantitative</b>"]
        Q1["Row Counts"]
        Q2["Null Counts"]
        Q3["Distinct Values"]
    end
    subgraph L3["<b>Level 3: Aggregate</b>"]
        A1["SUM / AVG / MIN / MAX"]
        A2["Standard Deviation"]
        A3["Date Range Bounds"]
    end
    subgraph L4["<b>Level 4: Content</b>"]
        C1["Row-by-Row Hash"]
        C2["Sample Spot Checks"]
        C3["Business Rule Validation"]
    end
    L1 --> L2 --> L3 --> L4
    style L1 fill:#e3f2fd,stroke:#1976d2
    style L2 fill:#e8f5e9,stroke:#4caf50
    style L3 fill:#fff3e0,stroke:#ff9800
    style L4 fill:#fce4ec,stroke:#e91e63
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>

### Validation Levels Explained

| Level | Focus | Checks | Tolerance |
|-------|-------|--------|----------|
| **L1: Structural** | Schema compatibility | Table exists, column names match, data types compatible | Zero tolerance |
| **L2: Quantitative** | Record completeness | Row counts, null counts, cardinality | Exact match expected |
| **L3: Aggregate** | Numeric accuracy | SUM, AVG, MIN, MAX for numeric columns | ±0.01% for floats |
| **L4: Content** | Row-level accuracy | Hash comparisons, sample verification | 100% match for critical tables |
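
As a rough illustration of the Level 1 checks, the sketch below compares two schemas represented as plain dictionaries of column name to type name. The function name `diff_schemas` and the `type_map` of source-to-target type equivalents are hypothetical; the real equivalences depend on the source platform.

```python
def diff_schemas(source, target, type_map=None):
    """Return a list of schema differences (an empty list means a match).

    source / target: dict mapping column name -> type name.
    type_map: optional dict translating upper-case source type names to the
              expected target type names (hypothetical, platform-specific).
    """
    type_map = type_map or {}
    # Compare case-insensitively: identifier casing often differs by platform.
    src = {name.lower(): dtype.upper() for name, dtype in source.items()}
    tgt = {name.lower(): dtype.upper() for name, dtype in target.items()}

    issues = []
    for col in sorted(src.keys() - tgt.keys()):
        issues.append(f"missing in target: {col}")
    for col in sorted(tgt.keys() - src.keys()):
        issues.append(f"unexpected in target: {col}")
    for col in sorted(src.keys() & tgt.keys()):
        expected = type_map.get(src[col], src[col])
        if tgt[col] != expected:
            issues.append(
                f"type mismatch on {col}: expected {expected}, found {tgt[col]}"
            )
    return issues
```

Because Level 1 has zero tolerance, any non-empty result would block the deeper levels from running.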

<div style="border-left: 4px solid #ff9800; background: #fff3e0; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">⚠️</span>
        <div>
            <strong style="color: #e65100; font-size: 1.1em;">Floating Point Precision</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                When validating FLOAT/DOUBLE columns, expect minor precision differences due to platform-specific floating-point implementations. Use threshold-based comparisons (e.g., <code>ABS(source - target) &lt; 0.0001</code>) rather than exact equality.
            </p>
        </div>
    </div>
</div>
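
A threshold comparison can be sketched in Python with the standard library's `math.isclose`. The helper name `values_match` and its default tolerances are assumptions chosen to line up with the figures in this lesson (0.01% relative, 0.0001 absolute); tune them per column.

```python
import math

def values_match(source, target, rel_tol=1e-4, abs_tol=1e-4):
    """Tolerance-based equality for FLOAT/DOUBLE values; None models SQL NULL."""
    if source is None or target is None:
        # NULL only matches NULL.
        return source is None and target is None
    # Passes when the difference is within rel_tol of the larger magnitude,
    # or within abs_tol (which handles values at or near zero).
    return math.isclose(source, target, rel_tol=rel_tol, abs_tol=abs_tol)
```

Combining a relative and an absolute tolerance avoids two failure modes: a purely absolute threshold is too strict for large totals, while a purely relative one can never pass when the source value is exactly zero.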

## Row Count Validation

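The most fundamental check is row count comparison. Every table should have matching row counts between source and target. Take both counts against the same data snapshot; otherwise writes that land between the two queries show up as false failures.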

<div class="code-block" data-language="sql">
-- Source count (run on {SOURCE_PLATFORM})
SELECT 
    'customers' AS table_name,
    COUNT(*) AS row_count,
    CURRENT_TIMESTAMP AS counted_at
FROM source_db.customers;

-- Target count (run on Databricks)
SELECT 
    'customers' AS table_name,
    COUNT(*) AS row_count,
    CURRENT_TIMESTAMP() AS counted_at
FROM catalog.silver.customers;

-- Automated comparison across all tables
WITH source_counts AS (
    SELECT table_name, row_count FROM migration_validation.source_counts
),
target_counts AS (
    SELECT table_name, row_count FROM migration_validation.target_counts
)
SELECT 
    COALESCE(s.table_name, t.table_name) AS table_name,
    s.row_count AS source_rows,
    t.row_count AS target_rows,
    t.row_count - s.row_count AS difference,
    CASE 
        WHEN s.row_count = t.row_count THEN 'PASS'
        ELSE 'FAIL'
    END AS status
FROM source_counts s
FULL OUTER JOIN target_counts t ON s.table_name = t.table_name
ORDER BY status, table_name;  -- 'FAIL' sorts before 'PASS', so failures are listed first
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-sql.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'sql';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
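
When the counts are collected programmatically, the same comparison can be done in Python. This is a minimal sketch, assuming the counts have already been gathered into two `{table_name: row_count}` dictionaries; the function name `compare_row_counts` is hypothetical.

```python
def compare_row_counts(source_counts, target_counts):
    """Mirror the FULL OUTER JOIN comparison on two {table_name: row_count} dicts."""
    results = []
    for table in sorted(source_counts.keys() | target_counts.keys()):
        src = source_counts.get(table)  # None when the table is absent in source
        tgt = target_counts.get(table)  # None when the table is absent in target
        both = src is not None and tgt is not None
        results.append({
            "table_name": table,
            "source_rows": src,
            "target_rows": tgt,
            "difference": tgt - src if both else None,
            "status": "PASS" if both and src == tgt else "FAIL",
        })
    # Failures first, then alphabetical, so problems surface at the top.
    return sorted(results, key=lambda r: (r["status"] != "FAIL", r["table_name"]))
```

A table that exists on only one side is reported as a failure with no difference value, which matches how the SQL comparison treats unmatched rows from the outer join.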

## Aggregate Validation

Beyond counts, validate that numeric aggregates match between source and target. This catches issues like truncation, rounding errors, or missing records.

<div class="code-block" data-language="sql">
-- Comprehensive aggregate validation for a table
SELECT 
    'orders' AS table_name,
    COUNT(*) AS row_count,
    COUNT(DISTINCT customer_id) AS distinct_customers,
    SUM(order_amount) AS total_amount,
    AVG(order_amount) AS avg_amount,
    MIN(order_date) AS min_date,
    MAX(order_date) AS max_date,
    SUM(CASE WHEN order_amount IS NULL THEN 1 ELSE 0 END) AS null_amounts,
    STDDEV(order_amount) AS stddev_amount
FROM catalog.silver.orders;

-- Comparison query (combine source and target results)
SELECT 
    metric_name,
    source_value,
    target_value,
    ABS(target_value - source_value) AS absolute_diff,
    CASE 
        WHEN source_value = 0 THEN NULL
        ELSE ABS(target_value - source_value) / ABS(source_value) * 100
    END AS pct_diff,
    CASE 
        -- Absolute tolerance; for large totals, compare pct_diff against a relative threshold instead
        WHEN ABS(target_value - source_value) < 0.01 THEN 'PASS'
        ELSE 'FAIL'
    END AS status
FROM migration_validation.aggregate_comparison
WHERE table_name = 'orders';
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-sql.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'sql';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
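
The first query returns one wide row per table, while the comparison query reads one row per metric from `migration_validation.aggregate_comparison`. The sketch below shows one way to bridge the two shapes, assuming each side's aggregate row has been read into a dictionary; `build_aggregate_comparison` is a hypothetical helper.

```python
def build_aggregate_comparison(table_name, source_metrics, target_metrics):
    """Pair up metrics from two wide aggregate rows into long comparison rows."""
    rows = []
    # Union of keys: a metric present on only one side still gets a row.
    for metric in sorted(source_metrics.keys() | target_metrics.keys()):
        rows.append({
            "table_name": table_name,
            "metric_name": metric,
            "source_value": source_metrics.get(metric),  # None if missing
            "target_value": target_metrics.get(metric),  # None if missing
        })
    return rows
```

Date bounds such as `min_date` and `max_date` do not fit a numeric comparison directly; they would need a numeric encoding or a separate equality check.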

## Content Validation with Row Hashing

For critical tables, validate content at the row level using hash comparisons. A hash mismatch pinpoints any row whose hashed columns differ between source and target, and a missing hash pinpoints rows that exist on only one side.

<br />
<div class="mermaid">
flowchart LR
    subgraph SOURCE["{SOURCE_PLATFORM}"]
        S1["Table Row"] --> S2["Hash Function"]
        S2 --> S3["Row Hash"]
    end
    subgraph TARGET["Databricks"]
        T1["Table Row"] --> T2["Hash Function"]
        T2 --> T3["Row Hash"]
    end
    S3 --> CMP["Compare\nHashes"]
    T3 --> CMP
    CMP --> |Match| PASS["✓ Valid"]
    CMP --> |Mismatch| FAIL["✗ Investigate"]
    style SOURCE fill:#e3f2fd,stroke:#1976d2
    style TARGET fill:#fff3e0,stroke:#ff9800
    style PASS fill:#e8f5e9,stroke:#4caf50
    style FAIL fill:#ffebee,stroke:#f44336
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>

### Row Hashing Implementation

<div class="code-block" data-language="sql">
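-- Create hash of all columns for each row
-- Use the same column order, delimiter, and null handling on both platforms
-- NULLs map to a sentinel ('#NULL#' here) so that NULL and '' hash differently
-- Timestamp-to-string casts must produce the same format on source and target
SELECT 
    customer_id AS primary_key,
    MD5(CONCAT_WS('|',
        COALESCE(CAST(customer_id AS STRING), '#NULL#'),
        COALESCE(first_name, '#NULL#'),
        COALESCE(last_name, '#NULL#'),
        COALESCE(email, '#NULL#'),
        COALESCE(CAST(created_at AS STRING), '#NULL#')
    )) AS row_hash
FROM catalog.silver.customers;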

-- Compare hashes between source and target
WITH source_hashes AS (
    SELECT primary_key, row_hash FROM migration_validation.source_customers_hash
),
target_hashes AS (
    SELECT primary_key, row_hash FROM migration_validation.target_customers_hash
)
SELECT 
    COALESCE(s.primary_key, t.primary_key) AS primary_key,
    CASE 
        WHEN s.row_hash IS NULL THEN 'MISSING_IN_SOURCE'
        WHEN t.row_hash IS NULL THEN 'MISSING_IN_TARGET'
        WHEN s.row_hash = t.row_hash THEN 'MATCH'
        ELSE 'MISMATCH'
    END AS status
FROM source_hashes s
FULL OUTER JOIN target_hashes t ON s.primary_key = t.primary_key
WHERE s.row_hash IS NULL 
   OR t.row_hash IS NULL 
   OR s.row_hash != t.row_hash;
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-sql.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'sql';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
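
The SQL above joins the column values with a delimiter and hashes the result, then compares hashes by primary key. The sketch below shows the same idea in plain Python and why the NULL replacement matters: replacing NULL with an empty string makes NULL and `''` hash identically. `NULL_TOKEN`, `row_hash`, and `diff_hashes` are hypothetical names, and `str()` is only a stand-in for the platform's string cast.

```python
import hashlib

NULL_TOKEN = "\x00NULL\x00"  # hypothetical sentinel; must not occur in real data

def row_hash(values, null_token=NULL_TOKEN):
    """MD5 over '|'-joined column values, in the given (fixed) column order."""
    parts = [null_token if v is None else str(v) for v in values]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

def diff_hashes(source, target):
    """Compare {primary_key: row_hash} dicts; return only the problem keys."""
    problems = {}
    for key in source.keys() | target.keys():
        if key not in source:
            problems[key] = "MISSING_IN_SOURCE"
        elif key not in target:
            problems[key] = "MISSING_IN_TARGET"
        elif source[key] != target[key]:
            problems[key] = "MISMATCH"
    return problems

# NULL and '' stay distinguishable with a sentinel, but collide without one.
with_sentinel = row_hash([1, "Ann", None]) != row_hash([1, "Ann", ""])
without_sentinel = row_hash([1, "Ann", None], null_token="") == row_hash([1, "Ann", ""])
```

The delimiter has a similar weakness: values that themselves contain `|` can make two different rows produce the same joined string, so pick a delimiter that does not appear in the data.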

## Testing Frameworks

Automate validation using established testing frameworks. These integrate into CI/CD pipelines and provide rich reporting.

| Framework | Language | Best For | Databricks Integration |
|-----------|----------|----------|------------------------|
| **Great Expectations** | Python | Data quality, expectation suites | Native support, DLT integration |
| **Chispa** | Python | DataFrame comparison | Spark-native, pytest compatible |
| **pytest** | Python | Unit/integration tests | Databricks Connect, notebooks |
| **dbt tests** | SQL | Schema & data tests | dbt-databricks adapter |
| **Lakebridge** | Python | Migration validation | Built for source-target comparison |

### Great Expectations Example

<div class="code-block" data-language="python">
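# Uses the legacy Dataset API (SparkDFDataset), which is only available in older
# Great Expectations releases; newer releases use a different, context-based API
from great_expectations.dataset import SparkDFDataset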

# Load target DataFrame
df = spark.table("catalog.silver.customers")
ge_df = SparkDFDataset(df)

# Define expectations
expectations = [
    ge_df.expect_table_row_count_to_equal(1000000),  # example value; use the source row count
    ge_df.expect_column_to_exist("customer_id"),
    ge_df.expect_column_values_to_not_be_null("customer_id"),
    ge_df.expect_column_values_to_be_unique("customer_id"),
    ge_df.expect_column_values_to_match_regex("email", r"^[\w.-]+@[\w.-]+\.\w+$"),
    ge_df.expect_column_values_to_be_between("age", min_value=0, max_value=150),
]

# Validate and generate report
results = ge_df.validate()
print(f"Validation passed: {results.success}")
print(f"Passed: {results.statistics['successful_expectations']}")
print(f"Failed: {results.statistics['unsuccessful_expectations']}")
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-python.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'python';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>

### Chispa DataFrame Comparison

<div class="code-block" data-language="python">
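from chispa import assert_df_equality
from pyspark.sql import SparkSession

# In a Databricks notebook `spark` already exists; getOrCreate() returns that session
spark = SparkSession.builder.getOrCreate()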

def test_customer_migration():
    """Verify migrated customers match source snapshot."""
    source_df = spark.table("validation.source_customers_snapshot")
    target_df = spark.table("catalog.silver.customers")
    
    # Select only comparable columns (exclude metadata)
    cols = ["customer_id", "first_name", "last_name", "email", "created_at"]
    
    assert_df_equality(
        source_df.select(cols).orderBy("customer_id"),
        target_df.select(cols).orderBy("customer_id"),
        ignore_nullable=True,
        ignore_column_order=True
    )

def test_order_totals():
    """Verify order totals match between source and target."""
    source_total = spark.sql("""
        SELECT SUM(order_amount) FROM validation.source_orders_snapshot
    """).collect()[0][0]
    
    target_total = spark.sql("""
        SELECT SUM(order_amount) FROM catalog.silver.orders
    """).collect()[0][0]
    
    assert abs(source_total - target_total) < 0.01, \
        f"Total mismatch: source={source_total}, target={target_total}"
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-python.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'python';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>

## Reconciliation Reports

Build comprehensive reconciliation reports that stakeholders can review before sign-off. Store results in Delta tables for audit trails.

<div class="code-block" data-language="sql">
-- Create validation results table
CREATE TABLE IF NOT EXISTS migration_validation.validation_results (
    validation_run_id STRING,
    run_timestamp TIMESTAMP,
    table_name STRING,
    validation_type STRING,
    metric_name STRING,
    source_value DOUBLE,
    target_value DOUBLE,
    difference DOUBLE,
    pct_difference DOUBLE,
    status STRING,
    notes STRING
)
USING DELTA
CLUSTER BY (table_name, validation_type);

-- Generate summary report
SELECT 
    table_name,
    COUNT(*) AS total_checks,
    SUM(CASE WHEN status = 'PASS' THEN 1 ELSE 0 END) AS passed,
    SUM(CASE WHEN status = 'FAIL' THEN 1 ELSE 0 END) AS failed,
    ROUND(SUM(CASE WHEN status = 'PASS' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) AS pass_rate
FROM migration_validation.validation_results
WHERE validation_run_id = 'RUN_20240115_001'
GROUP BY table_name
ORDER BY failed DESC, table_name;
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-sql.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'sql';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
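
To show how a single check might become a row in `migration_validation.validation_results`, here is a minimal sketch that fills the table's columns from one source/target pair. The function name `make_result`, the `rel_tol_pct` parameter, and the pass/fail rules are assumptions, not a prescribed implementation.

```python
from datetime import datetime, timezone

def make_result(run_id, table_name, validation_type, metric_name,
                source_value, target_value, rel_tol_pct=0.0, notes=None):
    """Build one row shaped like migration_validation.validation_results."""
    difference = pct = None
    if source_value is not None and target_value is not None:
        difference = target_value - source_value
        if source_value != 0:
            pct = abs(difference) / abs(source_value) * 100

    if source_value is None or target_value is None:
        # NULL on both sides counts as a match; NULL on one side never passes.
        both_null = source_value is None and target_value is None
        status = "PASS" if both_null else "FAIL"
    elif difference == 0:
        status = "PASS"
    else:
        # A non-zero difference passes only within the relative tolerance.
        status = "PASS" if pct is not None and pct <= rel_tol_pct else "FAIL"

    return {
        "validation_run_id": run_id,
        "run_timestamp": datetime.now(timezone.utc),
        "table_name": table_name,
        "validation_type": validation_type,
        "metric_name": metric_name,
        "source_value": source_value,
        "target_value": target_value,
        "difference": difference,
        "pct_difference": pct,
        "status": status,
        "notes": notes,
    }
```

With the default `rel_tol_pct=0.0` the check demands an exact match, which suits row counts; aggregate checks on floating-point columns would pass a small tolerance such as `0.01`.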

## Summary

### Validation Checklist

- [ ] Structural validation: All tables exist with correct schemas
- [ ] Row count validation: Exact match for all tables
- [ ] Aggregate validation: SUM/AVG/MIN/MAX within tolerance
- [ ] Content validation: Row hashes match for critical tables
- [ ] Business rule validation: Domain-specific checks pass
- [ ] Reconciliation report: Generated and reviewed

### Key Principles

| Principle | Implementation |
|-----------|----------------|
| **Automate everything** | Use testing frameworks, CI/CD integration |
| **Fail fast** | Run quick checks first, deep validation last |
| **Document exceptions** | Record and justify any accepted differences |
| **Preserve evidence** | Store validation results in Delta tables |
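
The "fail fast" principle can be sketched as a small runner that executes validation levels in order and stops descending once a level reports a failure. The name `run_levels` and the shape of its input are hypothetical.

```python
def run_levels(levels):
    """Run validation levels in order; stop after the first level with a failure.

    levels: list of (level_name, checks), where checks is a list of
            (check_name, zero-argument callable returning True on pass).
    """
    report = []
    for level_name, checks in levels:
        # Run every check in the level so the report lists all failures at this depth.
        outcomes = [(name, bool(check())) for name, check in checks]
        report.append((level_name, outcomes))
        if not all(passed for _, passed in outcomes):
            break  # skip the deeper, more expensive levels
    return report
```

Ordering the levels from structural to content checks means an expensive row-hash comparison never runs against a table that already failed a schema or row count check.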

### Next Steps

With validation complete, set up ongoing monitoring and prepare for cutover:

- [**4.2 - Observability and Monitoring**]($./4.2 Observability & Monitoring) - Set up Lakehouse Monitoring and alerting
- [**4.3 - Cutover Execution**]($./4.3 Cutover Execution) - Plan and execute production cutover

<div style="color: #FF3621; font-weight: bold; font-size: 2em; margin-bottom: 12px;">COURSE DEVELOPER (remove before publishing)</div>

### Template Customization

**Placeholders to replace:**
- `{SOURCE_PLATFORM}` - Source platform name

**Platform-specific additions:**
- Add platform-specific hash function syntax
- Include platform-specific data type comparison considerations
- Document known precision differences between platforms

&copy; 2026 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>
