<div style="display: flex; justify-content: space-between; align-items: center; padding: 8px 16px; background: #F8F9FA; border-bottom: 2px solid #E0E0E0; margin: 0; line-height: 1;">
    <div style="font-size: 14px; color: #666;">
        <span style="font-weight: bold; color: #333;">{SOURCE_PLATFORM} → Databricks Migration</span>
        <span style="margin-left: 8px; color: #999;">|</span>
        <span style="margin-left: 8px;">02 - Design</span>
    </div>
    <div style="display: flex; align-items: center; gap: 8px;">
        <img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="24" height="24"/>
        <span style="color: #999; font-size: 16px;">→</span>
        <img src="https://cdn.simpleicons.org/databricks/FF3621" width="24" height="24"/>
    </div>
</div>


<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>

# Target Architecture Design

## Overview

This module defines the target Databricks **Data Intelligence Platform** architecture for your migration. It combines architecture and feature mapping from Module 00 with specific design decisions for data engineering, ingestion patterns, and Bronze/Silver/Gold medallion architecture.

## Learning Objectives

By the end of this lesson, you will be able to:
- Map {SOURCE_PLATFORM} concepts to Databricks equivalents
- Design the target lakehouse architecture (Bronze/Silver/Gold)
- Select appropriate ingestion and transformation patterns
- Plan data engineering workflows using Spark Declarative Pipelines and Lakeflow Jobs
- Define lineage and CDC strategies

## Core Component Mapping

{SOURCE_PLATFORM} and third-party components and their Databricks equivalents:

| Component | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM}</span> | Third Party Tools | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | Notes |
|-----------|-------------------|-------------------|------------|-------|
| **Compute** | REPLACEME | | All-purpose clusters, Jobs clusters, SQL warehouses (classic/serverless) | |
| **Storage** | REPLACEME | | Cloud object storage (S3, ADLS Gen2, GCS) with Delta Lake format | |
| **Data Format** | REPLACEME | Parquet, Avro, ORC | Delta Lake (open source, Parquet-based) | |
| **Catalog/Metadata** | REPLACEME | Alation, Collibra, Atlan | Unity Catalog | |
| **Data Ingestion** | REPLACEME | Fivetran, Airbyte, Matillion, Qlik Replicate | Auto Loader, COPY INTO, Lakeflow Connect, Spark Declarative Pipelines | |
| **Orchestration** | REPLACEME | Airflow, Dagster, Prefect, Control-M | Lakeflow Jobs (formerly Workflows) | |
| **Transformations** | REPLACEME | dbt, Matillion, Informatica, Talend | Spark Declarative Pipelines (formerly DLT), notebooks, dbt | |
| **Security Model** | REPLACEME | Immuta, Privacera, Okta | Unity Catalog RBAC/ABAC, row/column-level security | |
| **Data Sharing** | REPLACEME | | Delta Sharing (open protocol) | |
| **BI/Reporting** | REPLACEME | Tableau, Power BI, Looker, Qlik, ThoughtSpot | Databricks SQL, AI/BI Dashboards | |

## Account Hierarchy and Namespace Mapping

Understanding how account structures and data namespaces differ between platforms is essential for planning your migration topology and governance model.

<br />
<div class="mermaid">
flowchart TB
    subgraph SF["Snowflake"]
        direction TB
        SFO["<b>Organization</b><br/><i>admin umbrella</i>"]
        SFA["<b>Account</b><br/><i>isolated environment</i>"]
        SFD["<b>Database</b>"]
        SFS["<b>Schema</b>"]
        SFOBJ["<b>Objects</b><br/><i>tables, views, stages,<br/>procedures, UDFs</i>"]
        SFO --> SFA --> SFD --> SFS --> SFOBJ
    end
    subgraph DB["Databricks"]
        direction TB
        DBA["<b>Account</b><br/><i>admin + billing</i>"]
        DBM["<b>Metastore</b><br/><i>governance (regional)</i>"]
        DBW["<b>Workspace</b><br/><i>compute environment</i>"]
        DBC["<b>Catalog</b>"]
        DBS["<b>Schema</b>"]
        DBOBJ["<b>Objects</b><br/><i>tables, views, volumes,<br/>models, functions</i>"]
        DBA --> DBM
        DBA --> DBW
        DBW -.->|attaches to| DBM
        DBM --> DBC --> DBS --> DBOBJ
    end
    style SF fill:#fff,stroke:#29B5E8,stroke-width:2px
    style DB fill:#fff,stroke:#FF3621,stroke-width:2px
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>

### Key Differences

| Aspect | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> Snowflake</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> |
|--------|-----------|------------|
| **Top-level admin** | Organization - umbrella for multiple accounts | Account - single admin boundary for billing and identity |
| **Isolation boundary** | Account - fully isolated; data sharing requires explicit grants | Workspace - compute isolation; data shared via metastore |
| **Governance scope** | Per-account (no cross-account governance without sharing) | Metastore spans workspaces - unified governance, lineage, audit |
| **Data visibility** | Accounts are "walled gardens" - zero-copy sharing between accounts | Workspaces on same metastore see same catalogs (with permissions) |
| **External sharing** | Secure Data Sharing (Snowflake-to-Snowflake) | Delta Sharing (open protocol - any platform) |

### Fully Qualified Object Names

| Platform | Pattern | Example |
|----------|---------|---------|
| <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> Snowflake</span> | `database.schema.object` | `analytics_db.sales.customers` |
| <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | `catalog.schema.object` | `prod_catalog.sales.customers` |

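The renaming rule in the table above can be sketched as a small helper. This is a minimal illustration, not a migration tool: the `DATABASE_TO_CATALOG` mapping and the function name are hypothetical, and the sketch assumes unquoted identifiers with no dots inside any name part.

```python
# Hypothetical mapping from source database names to Unity Catalog catalogs.
# The names are illustrative only; substitute your own topology.
DATABASE_TO_CATALOG = {
    "analytics_db": "prod_catalog",
}


def to_unity_catalog_name(source_name: str, mapping: dict[str, str]) -> str:
    """Translate `database.schema.object` into `catalog.schema.object`."""
    parts = source_name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected database.schema.object, got {source_name!r}")
    database, schema, obj = parts
    # Assumption: unquoted identifiers are case-insensitive, so compare in lower case.
    catalog = mapping.get(database.lower())
    if catalog is None:
        raise KeyError(f"no catalog mapped for database {database!r}")
    return f"{catalog}.{schema.lower()}.{obj.lower()}"


print(to_unity_catalog_name("analytics_db.sales.customers", DATABASE_TO_CATALOG))
# prints: prod_catalog.sales.customers
```

Raising on an unmapped database, rather than passing the name through, makes gaps in the catalog design visible early in the inventory phase.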
## Database Object Mapping

{SOURCE_PLATFORM} objects mapped to their Databricks equivalents. Unity Catalog governs all object types - including assets beyond traditional database objects.

| Object Type | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM}</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | {SOURCE_PLATFORM} Limitations | Databricks Considerations |
|-------------|-------------------|------------|-------------------------------|---------------------------|
| **Catalog** | No equivalent (database is top level) | Catalog | No unified namespace above database | Top-level container; align to environment or domain |
| **Schema** | Schema | Schema | | Direct equivalent |
| **Permanent Table** | REPLACEME | Managed Table (Delta) | | Tables stored in cloud storage you control |
| **External Table** | REPLACEME | External Table | | Supports Delta, Parquet, CSV, JSON, Avro, ORC |
| **Temporary Table** | REPLACEME | Temporary View | | Session-scoped only |
| **Transient Table** | REPLACEME | No direct equivalent | REPLACEME | Use external tables or manage retention manually |
| **Standard View** | REPLACEME | View | | |
| **Materialized View** | REPLACEME | Materialized View | REPLACEME | Serverless SQL warehouses or DLT; auto-refresh |
| **Dynamic Table** | REPLACEME | Spark Declarative Pipelines (Streaming/Materialized) | REPLACEME | Declarative; handles incremental and streaming |
| **Stored Procedure** | REPLACEME | Python/SQL in Notebooks, SQL Stored Procedures | | Notebooks preferred; procedures supported in DBSQL |
| **User-Defined Function (SQL)** | REPLACEME | SQL UDF | | Registered and governed in Unity Catalog |
| **User-Defined Function (Python/Java)** | REPLACEME | Python UDF, Pandas UDF | REPLACEME | Python UDFs can be vectorized for performance |
| **Sequence** | REPLACEME | IDENTITY columns, generated columns | | |
| **Stream (CDC)** | REPLACEME | Delta Change Data Feed, Structured Streaming | REPLACEME | CDF is table property; streaming is continuous |
| **Task** | REPLACEME | Lakeflow Jobs (Workflows) | | Full DAG orchestration; triggers; alerts |
| **Stage (Internal/External)** | REPLACEME | Volumes, External Locations | | Volumes provide governed file access |
| **File Format** | REPLACEME | Not required (Delta is default) | | Auto-detection available for ingestion |
| **Pipe (Snowpipe)** | REPLACEME | Auto Loader | | Incremental; schema evolution; exactly-once |
| **Share** | REPLACEME | Delta Share | REPLACEME | Open protocol; share outside Databricks |
| **Volume** | No equivalent | Volume (Managed or External) | Files in stages lack governance | Governed access to files - structured and unstructured |
| **Registered Model** | No native equivalent (3rd party MLOps) | Registered Model | Relies on external tools | ML models with versioning, lineage, and governance |
| **Model Version** | No native equivalent | Model Version | | Immutable model artifacts; Unity Catalog uses aliases rather than stage transitions |
| **Connection** | No equivalent | Connection | | External system connections for Lakehouse Federation |
| **Storage Credential** | Storage Integration | Storage Credential | | Cloud storage access credentials (managed centrally) |
| **External Location** | Stage (External) | External Location | | Governed cloud storage paths tied to credentials |

## Compute Architecture Mapping

Understanding the compute models is critical for migration planning - it affects cost modeling, performance tuning, and operational workflows.

| Capability | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM} Equivalent</span> | {SOURCE_PLATFORM} Limitations |
|------------|------------|-------------------|-------------------------------|
| **Serverless SQL Warehouse** | Fully managed SQL compute; instant startup; auto-scales to zero | REPLACEME | REPLACEME |
| **Classic SQL Warehouse** | Customer-managed SQL compute in your cloud account | REPLACEME | REPLACEME |
| **All-Purpose Cluster** | Interactive compute for notebooks, exploration, development | REPLACEME | REPLACEME |
| **Jobs Cluster** | Ephemeral compute for scheduled/triggered workloads; auto-terminates | REPLACEME | REPLACEME |
| **Instance Pools** | Pre-warmed instances for faster cluster startup | No equivalent | No ability to pre-warm compute resources |
| **Spot/Preemptible Instances** | Discounted cloud instances for fault-tolerant workloads | No equivalent | No access to spot pricing; fixed compute costs |
| **Cluster Policies** | Governance controls for compute configuration and costs | REPLACEME | REPLACEME |
| **Spark Declarative Pipelines (DLT)** | Managed compute for declarative ETL pipelines | REPLACEME | REPLACEME |
| **Custom Libraries/Packages** | Install any PyPI, Maven, CRAN package; custom JARs; init scripts | REPLACEME | Limited to pre-installed packages; restricted extensibility |
| **Language Runtimes** | Python, SQL, Scala, R - full runtimes with version control | REPLACEME | Limited language support; restricted runtime customization |
| **ML Runtime** | Pre-configured clusters with ML libraries (PyTorch, TensorFlow, etc.) | No equivalent | No managed ML compute environment |
| **GPU Clusters** | GPU-enabled clusters for deep learning and AI workloads | REPLACEME | REPLACEME |
| **Model Serving Endpoints** | Real-time model inference with auto-scaling | No native equivalent | Requires external deployment (SageMaker, etc.) |
| **Vector Search Endpoints** | Managed vector similarity search for RAG applications | No native equivalent | Requires external vector database |
| **Feature Serving** | Real-time feature lookup for ML inference | No native equivalent | Requires external feature store |
| **AI Gateway** | Unified endpoint for foundation models with governance | No native equivalent | Requires direct API integration to model providers |

## Governance and Security Mapping

Security and governance configurations must be carefully mapped to ensure compliance requirements are maintained throughout migration.


### Governance Comparison

| Capability | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks (Unity Catalog)</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM} Equivalent</span> | {SOURCE_PLATFORM} Limitations |
|------------|------------|-------------------|-------------------------------|
| **Unified Catalog** | Single catalog for data, AI assets, files, and models | REPLACEME | REPLACEME |
| **Three-Level Namespace** | Catalog → Schema → Object hierarchy | REPLACEME | REPLACEME |
| **Data Lineage** | Automatic column-level lineage across all workloads | REPLACEME | REPLACEME |
| **Cross-Workspace Governance** | Metastore spans multiple workspaces | REPLACEME | REPLACEME |
| **AI Asset Governance** | Models, endpoints, and features governed alongside data | No native equivalent | AI assets managed outside data governance |
| **Lakehouse Federation** | Query external catalogs (Snowflake, PostgreSQL, MySQL, etc.) with unified governance | No equivalent | Cannot federate external catalogs |
| **Open Source Catalog** | Unity Catalog open-sourced (2024) - portable governance | No equivalent | Proprietary catalog; metadata lock-in |

### Access Control Comparison

| Capability | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks (Unity Catalog)</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM} Equivalent</span> | {SOURCE_PLATFORM} Limitations |
|------------|------------|-------------------|-------------------------------|
| **Role-Based Access Control (RBAC)** | GRANT/REVOKE on securable objects | REPLACEME | |
| **Attribute-Based Access Control (ABAC)** | Dynamic policies based on user/data attributes | REPLACEME | REPLACEME |
| **Row-Level Security** | Row filters on tables | REPLACEME | REPLACEME |
| **Column-Level Security** | Column masks for dynamic data masking | REPLACEME | REPLACEME |
| **Object Ownership** | Ownership model with transferable ownership | REPLACEME | |
| **Privilege Inheritance** | Permissions inherit down the hierarchy | REPLACEME | |
| **Service Principals** | Machine identity for automation and CI/CD | REPLACEME | |
| **Managed Identity** | Cloud-native identity integration (Microsoft Entra ID, AWS IAM) | REPLACEME | |

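Because privileges inherit down the catalog → schema → object hierarchy, read access to a whole schema can be expressed with a handful of `GRANT` statements rather than one per table. The sketch below generates those statements as strings; the function name and the example catalog, schema, and group names are hypothetical, and the output should be reviewed before it is run against a workspace.

```python
def schema_read_grants(catalog: str, schema: str, principal: str) -> list[str]:
    """Build the GRANT statements a principal needs to read every table in a schema."""
    # Principals are wrapped in backticks so names with special characters stay valid.
    target = f"`{principal}`"
    return [
        # The principal must be able to use the parent containers...
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {target};",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {target};",
        # ...and SELECT granted on the schema is inherited by the tables and views inside it.
        f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO {target};",
    ]


for statement in schema_read_grants("prod_catalog", "sales", "analysts"):
    print(statement)
```

Generating grants from a role inventory like this keeps the source-to-target privilege mapping reviewable as plain text before anything is applied.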
### Identity and Authentication

| Capability | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM} Equivalent</span> | {SOURCE_PLATFORM} Limitations |
|------------|------------|-------------------|-------------------------------|
| **Single Sign-On (SSO)** | SAML 2.0, OIDC integration | REPLACEME | |
| **SCIM Provisioning** | Automated user/group sync from IdP | REPLACEME | |
| **Account-Level Identity** | Centralized identity federated to workspaces | REPLACEME | REPLACEME |
| **Multi-Factor Authentication** | MFA via identity provider | REPLACEME | |
| **OAuth / PAT Tokens** | Personal access tokens; OAuth for applications | REPLACEME | |
| **IP Access Lists** | Network-level access restrictions | REPLACEME | |
| **Private Connectivity** | Private Link (AWS/Azure), Private Service Connect (GCP) | REPLACEME | |

### Audit and Compliance

| Capability | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM} Equivalent</span> | {SOURCE_PLATFORM} Limitations |
|------------|------------|-------------------|-------------------------------|
| **Audit Logging** | Comprehensive logs for all data access and operations | REPLACEME | |
| **System Tables** | Query audit logs, billing, lineage via SQL | REPLACEME | |
| **Log Delivery** | Stream to your cloud storage or SIEM | REPLACEME | |
| **Compliance Certifications** | SOC 2, HIPAA, FedRAMP, GDPR, etc. | REPLACEME | |
| **Data Residency** | Data stays in your cloud account and region | REPLACEME | REPLACEME |
| **Encryption at Rest** | Customer-managed keys (CMK) supported | REPLACEME | |
| **Encryption in Transit** | TLS 1.2+ for all connections | REPLACEME | |

### Data Sharing Comparison

| Capability | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM} Equivalent</span> | {SOURCE_PLATFORM} Limitations |
|------------|------------|-------------------|-------------------------------|
| **Internal Sharing** | Native via Unity Catalog (same metastore) | REPLACEME | |
| **External Sharing** | Delta Sharing (open protocol) | REPLACEME | REPLACEME |
| **Share Recipients** | Any platform - no Databricks required | REPLACEME | Requires recipient to have {SOURCE_PLATFORM} account |
| **Marketplace** | Databricks Marketplace for data products | REPLACEME | |

## SQL Compatibility

Databricks uses **Spark SQL**, which is largely ANSI SQL compliant. Most analytical queries translate with minimal changes, though some platform-specific syntax will require refactoring.


### SQL Feature Comparison

| Feature | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks (Spark SQL)</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM}</span> | Migration Notes |
|---------|------------|-------------------|-----------------|
| **ANSI SQL Compliance** | Largely ANSI compliant; some extensions | REPLACEME | Most SELECT/JOIN/GROUP BY translate directly |
| **SELECT, JOIN, GROUP BY** | Full support | REPLACEME | Direct translation |
| **Window Functions** | Full support (OVER, PARTITION BY, etc.) | REPLACEME | Direct translation |
| **CTEs (WITH clause)** | Full support including recursive CTEs | REPLACEME | Direct translation |
| **MERGE (Upsert)** | Full support on Delta tables | REPLACEME | Syntax differences may apply ¹ |
| **QUALIFY Clause** | Supported | REPLACEME | Direct translation |
| **Stored Procedures** | SQL and Python stored procedures (DBSQL) | REPLACEME | May require refactoring ² |
| **User-Defined Functions** | SQL, Python, Scala UDFs | REPLACEME | Python UDFs offer more flexibility |
| **Transactions** | Single-statement ACID; multi-statement in development | REPLACEME | Use Delta Lake for transactional guarantees |
| **Semi-Structured Data (JSON)** | Native JSON functions; schema_of_json; from_json | REPLACEME | Syntax differences; VARIANT → struct/string ³ |
| **FLATTEN (JSON arrays)** | explode(), inline(), posexplode() | REPLACEME | Different function names ³ |
| **ARRAY/MAP Functions** | Full support | REPLACEME | Some function names differ |
| **Regular Expressions** | regexp_extract, regexp_replace, rlike | REPLACEME | Similar functionality; syntax may vary |
| **Date/Time Functions** | Comprehensive; date_format, date_add, datediff | REPLACEME | Some function names differ ⁴ |
| **String Functions** | Full support | REPLACEME | Most translate directly |
| **COPY INTO** | Supported for bulk loading | REPLACEME | Similar syntax |
| **CREATE TABLE AS (CTAS)** | Full support | REPLACEME | Direct translation |
| **Temporary Tables** | CREATE TEMPORARY VIEW | REPLACEME | Session-scoped views instead of tables |
| **Materialized Views** | Supported (Serverless SQL/DLT) | REPLACEME | Refresh semantics may differ |
| **Dynamic Data Masking** | Column masks in Unity Catalog | REPLACEME | Policy-based approach |
| **Row Access Policies** | Row filters in Unity Catalog | REPLACEME | Policy-based approach |
| **EXECUTE IMMEDIATE** | Supported | REPLACEME | Dynamic SQL execution |
| **Scripting (loops, variables)** | SQL variables; Python notebooks for complex logic | REPLACEME | Notebooks preferred for procedural logic ² |
| **IDENTIFIER()** | Supported for dynamic object names | REPLACEME | Direct translation |

### Data Type Mapping

| <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="20" height="20" style="vertical-align: middle;"> {SOURCE_PLATFORM}</span> | <span style="white-space: nowrap;"><img src="https://cdn.simpleicons.org/databricks/FF3621" width="20" height="20" style="vertical-align: middle;"> Databricks</span> | Notes |
|-------------------|------------|-------|
| `NUMBER` / `NUMERIC` / `DECIMAL` | `DECIMAL(p,s)` | Specify precision and scale explicitly |
| `INT` / `INTEGER` | `INT` | Direct mapping |
| `BIGINT` | `BIGINT` | Direct mapping |
| `SMALLINT` | `SMALLINT` | Direct mapping |
| `TINYINT` | `TINYINT` | Direct mapping |
| `FLOAT` / `FLOAT4` | `FLOAT` | Direct mapping by name; Snowflake stores these as 64-bit, so use `DOUBLE` where full precision must be preserved |
| `DOUBLE` / `FLOAT8` | `DOUBLE` | Direct mapping |
| `VARCHAR(n)` | `STRING` | Databricks STRING is unbounded ⁵ |
| `CHAR(n)` | `STRING` | No fixed-width char in Spark |
| `STRING` / `TEXT` | `STRING` | Direct mapping |
| `BINARY` / `VARBINARY` | `BINARY` | Direct mapping |
| `BOOLEAN` | `BOOLEAN` | Direct mapping |
| `DATE` | `DATE` | Direct mapping |
| `TIME` | `STRING` or `TIMESTAMP` | No native TIME type ⁶ |
| `DATETIME` / `TIMESTAMP` | `TIMESTAMP` | Direct mapping |
| `TIMESTAMP_LTZ` | `TIMESTAMP` | Spark TIMESTAMP is always UTC-normalized |
| `TIMESTAMP_NTZ` | `TIMESTAMP_NTZ` | Supported in Databricks Runtime 13.3+ |
| `TIMESTAMP_TZ` | `TIMESTAMP` | Timezone info may need handling ⁶ |
| `VARIANT` | `VARIANT` (Databricks Runtime 15.3+), `STRING` (JSON), or `STRUCT` | Use native VARIANT where available; otherwise parse with from_json(); schema_of_json() ³ |
| `OBJECT` | `MAP<STRING, STRING>` or `STRUCT` | Structure must be defined or parsed |
| `ARRAY` | `ARRAY<type>` | Direct mapping; element type required |
| `GEOGRAPHY` | `STRING` (WKT/GeoJSON) | Use external libraries for geo functions ⁷ |
| `GEOMETRY` | `STRING` (WKT/GeoJSON) | Use external libraries for geo functions ⁷ |

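For a schema inventory, the table above can be applied programmatically. The following is a minimal sketch that covers only a subset of the rows (parameterized numeric and character types plus the unambiguous direct mappings); the function and dictionary names are hypothetical, and any type it does not recognize is rejected so that it gets a manual review.

```python
import re

# Subset of the direct mappings from the table above; extend for your own inventory.
DIRECT = {
    "INT": "INT",
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "SMALLINT": "SMALLINT",
    "TINYINT": "TINYINT",
    "DOUBLE": "DOUBLE",
    "FLOAT8": "DOUBLE",
    "STRING": "STRING",
    "TEXT": "STRING",
    "BINARY": "BINARY",
    "VARBINARY": "BINARY",
    "BOOLEAN": "BOOLEAN",
    "DATE": "DATE",
    "TIMESTAMP_NTZ": "TIMESTAMP_NTZ",
    "TIMESTAMP_LTZ": "TIMESTAMP",
}

# Matches NAME or NAME(args), for example NUMBER(10,2) or VARCHAR(255).
TYPE_PATTERN = re.compile(r"^\s*([A-Za-z_0-9]+)\s*(?:\(([^)]*)\))?\s*$")


def map_type(source_type: str) -> str:
    """Return the Databricks type for a source column type, or raise if unmapped."""
    match = TYPE_PATTERN.match(source_type)
    if match is None:
        raise ValueError(f"unrecognized type: {source_type!r}")
    name = match.group(1).upper()
    args = [a.strip() for a in match.group(2).split(",")] if match.group(2) else []

    if name in ("NUMBER", "NUMERIC", "DECIMAL"):
        # Assumption: the source defaults to precision 38, scale 0 when both are omitted.
        precision = args[0] if len(args) >= 1 else "38"
        scale = args[1] if len(args) >= 2 else "0"
        return f"DECIMAL({precision},{scale})"
    if name in ("VARCHAR", "CHAR"):
        # The declared length is dropped because STRING is unbounded.
        return "STRING"
    if name in DIRECT:
        return DIRECT[name]
    raise ValueError(f"no mapping defined for {source_type!r}; review manually")


print(map_type("NUMBER(10,2)"))
print(map_type("VARCHAR(255)"))
```

Types such as `VARIANT`, `TIME`, and `GEOGRAPHY` are deliberately left out of the dictionary, since the table shows that each requires a design decision rather than a mechanical substitution.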
### Common Syntax Differences

| Operation | {SOURCE_PLATFORM} Syntax | Databricks Syntax |
|-----------|--------------------------|-------------------|
| **Current timestamp** | `CURRENT_TIMESTAMP()` | `current_timestamp()` |
| **Date difference** | `DATEDIFF('day', start, end)` | `datediff(end, start)` ⁴ |
| **Add days to date** | `DATEADD('day', n, date)` | `date_add(date, n)` ⁴ |
| **String concatenation** | `col1 \|\| col2` or `CONCAT()` | `concat(col1, col2)` or `\|\|` |
| **NVL / null handling** | `NVL(col, default)` | `coalesce(col, default)` or `nvl()` |
| **IFF conditional** | `IFF(cond, true_val, false_val)` | `if(cond, true_val, false_val)` or `CASE` |
| **Type casting** | `col::type` or `CAST()` | `CAST(col AS type)` or `col::type` |
| **Flatten JSON array** | `LATERAL FLATTEN(input => col)` | `explode(col)` ³ |
| **Parse JSON** | `col:field` or `GET_PATH()` | `col.field` or `get_json_object()` ³ |
| **Regex match** | `REGEXP_LIKE(col, pattern)` | `col RLIKE pattern` |
| **Listagg / array_agg** | `LISTAGG(col, ',')` | `concat_ws(',', collect_list(col))` |
| **Top N per group** | `QUALIFY ROW_NUMBER() OVER(...) <= N` | `QUALIFY ROW_NUMBER() OVER(...) <= N` |
| **Sample data** | `SAMPLE (n ROWS)` | `TABLESAMPLE (n ROWS)` or `LIMIT` |
| **Clone table** | `CREATE TABLE ... CLONE` | `CREATE TABLE ... SHALLOW CLONE` / `DEEP CLONE` |

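Three of the rewrites in the table above (`DATEDIFF`, `DATEADD`, and `IFF`) are mechanical enough to sketch with regular expressions. This is an illustration under strong assumptions, not a transpiler: it handles only the `day` date part, and each argument to the date functions must be a simple identifier, dotted column reference, or integer literal. Nested calls or expressions containing commas are not handled, and a parser-based tool is the safer choice for real workloads. The function and pattern names are hypothetical.

```python
import re

# A "simple" argument: identifier, dotted column reference, or integer literal.
ARG = r"\s*([\w.]+)\s*"

DATEDIFF_DAY = re.compile(rf"\bDATEDIFF\(\s*'?day'?\s*,{ARG},{ARG}\)", re.IGNORECASE)
DATEADD_DAY = re.compile(rf"\bDATEADD\(\s*'?day'?\s*,{ARG},{ARG}\)", re.IGNORECASE)
# The word boundary stops this pattern from matching the "iff(" inside "datediff(".
IFF_CALL = re.compile(r"\bIFF\s*\(", re.IGNORECASE)


def rewrite(sql: str) -> str:
    """Apply three source-to-Databricks syntax rewrites to a SQL string."""
    # DATEDIFF('day', start, end) -> datediff(end, start): the argument order flips.
    sql = DATEDIFF_DAY.sub(r"datediff(\2, \1)", sql)
    # DATEADD('day', n, date) -> date_add(date, n)
    sql = DATEADD_DAY.sub(r"date_add(\2, \1)", sql)
    # IFF(cond, a, b) -> if(cond, a, b): only the function name changes.
    sql = IFF_CALL.sub("if(", sql)
    return sql


print(rewrite("SELECT IFF(DATEDIFF('day', order_date, ship_date) > 3, 1, 0) FROM orders"))
# prints: SELECT if(datediff(ship_date, order_date) > 3, 1, 0) FROM orders
```

The `DATEDIFF` case is the one most worth testing after migration, because a missed argument swap still runs but silently flips the sign of the result.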

<div style="font-size: 0.85em; color: #555; line-height: 1.5;">

**Footnotes**

¹ **MERGE**: Databricks MERGE syntax is similar but may require adjustments for complex merge conditions. See [MERGE INTO documentation](https://docs.databricks.com/sql/language-manual/delta-merge-into.html).

² **Stored Procedures / Scripting**: For complex procedural logic, Databricks notebooks (Python/SQL) are preferred. SQL stored procedures are supported in Databricks SQL for simpler use cases.

³ **Semi-Structured Data**: Snowflake's VARIANT type and colon path notation (`col:field`) map to Spark's `from_json()`, `get_json_object()`, and struct access (`col.field`). FLATTEN becomes `explode()` or `inline()`.

⁴ **Date Functions**: Argument order differs for some date functions. Snowflake uses `DATEDIFF(part, start, end)`, whereas the two-argument Databricks form `datediff(end, start)` takes the end date first and returns whole days only; use `months_between()` for months.

⁵ **VARCHAR to STRING**: Databricks STRING has no length limit. If length enforcement is required, implement via CHECK constraints or application logic.

⁶ **TIME and TIMESTAMP_TZ**: Spark has no native TIME type; store as STRING or extract from TIMESTAMP. For timezone-aware timestamps, use `TIMESTAMP` with explicit timezone conversion functions.

⁷ **Geospatial**: Databricks supports geospatial via libraries like Apache Sedona, H3, and built-in H3 functions. Native GEOGRAPHY/GEOMETRY types are not directly supported.

</div>

## Summary

### What Translates Easily

- Standard SQL queries (`SELECT`, `JOIN`, `GROUP BY`)
- Basic DDL (`CREATE TABLE`, `CREATE VIEW`)
- Simple data types
- User/role definitions
- Basic access controls

### What Requires Adaptation or Redesign

- Stored procedures and complex UDFs
- Orchestration and scheduling logic
- Platform-specific or proprietary features without a direct equivalent
- External integrations
- Tightly coupled architectures


<div style="color: #FF3621; font-weight: bold; font-size: 2em; margin-bottom: 12px;">COURSE DEVELOPER (remove before publishing)</div>

### Source Specific Considerations

REPLACEME: Add comprehensive feature mapping for your specific source platform, including:

- Detailed SQL function equivalents
- Specific stored procedure migration patterns
- Platform-specific feature workarounds
- Known compatibility issues and solutions

&copy; 2026 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>
