<div style="display: flex; justify-content: space-between; align-items: center; padding: 8px 16px; background: #F8F9FA; border-bottom: 2px solid #E0E0E0; margin: 0; line-height: 1;">
    <div style="font-size: 14px; color: #666;">
        <span style="font-weight: bold; color: #333;">{SOURCE_PLATFORM} → Databricks Migration</span>
        <span style="margin-left: 8px; color: #999;">|</span>
        <span style="margin-left: 8px;">06 - Closeout</span>
    </div>
    <div style="display: flex; align-items: center; gap: 8px;">
        <img src="https://cdn.simpleicons.org/snowflake/29B5E8" width="24" height="24"/>
        <span style="color: #999; font-size: 16px;">→</span>
        <img src="https://cdn.simpleicons.org/databricks/FF3621" width="24" height="24"/>
    </div>
</div>


<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>

# Documentation and Knowledge Transfer

## Overview

Successful migration doesn't end with cutover. The final critical step is transferring knowledge and documentation to the teams who will operate and maintain the platform long-term. This lesson covers **project closure documentation** - not ongoing operational documentation, but the artifacts needed to hand off a completed migration to production operations.

**Key Distinction**: This is **project closure**, not ongoing operations. You're documenting *what was built, why decisions were made, and how to support it*. The runbooks you hand off are then owned and maintained by the operations team, not by the migration project.

## Learning Objectives

By the end of this lesson, you will be able to:
- Create Architecture Decision Records (ADRs) documenting key design choices
- Build runbooks and troubleshooting playbooks for common scenarios
- Compile a test inventory and validation query library for future use
- Execute effective ownership transfer to Operations/SRE teams
- Obtain formal stakeholder sign-off to close the project

## Knowledge Transfer Lifecycle

Knowledge transfer occurs throughout the project, not just at the end. The closeout phase consolidates and formalizes what was learned.

<br />
<div class="mermaid">
flowchart LR
    subgraph DURING["<b>During Migration</b>"]
        D1["Document<br/>Decisions"]
        D2["Capture<br/>Issues"]
        D3["Record<br/>Solutions"]
    end
    subgraph CLOSURE["<b>Project Closure</b>"]
        C1["Consolidate<br/>Documentation"]
        C2["Create<br/>Runbooks"]
        C3["Transfer<br/>Ownership"]
    end
    subgraph OPS["<b>Ongoing Operations</b>"]
        O1["Maintain<br/>Systems"]
        O2["Evolve<br/>Documentation"]
    end
    DURING --> CLOSURE --> OPS
    style DURING fill:#e3f2fd,stroke:#1976d2
    style CLOSURE fill:#fff3e0,stroke:#ff9800
    style OPS fill:#e8f5e9,stroke:#4caf50
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>

## Documentation Types

| Document Type | Purpose | Audience | Lifecycle |
|---------------|---------|----------|----------|
| **ADRs** | Record architectural decisions and rationale | Architects, engineers | Permanent, immutable |
| **Runbooks** | Operational procedures for common tasks | SRE, Ops teams | Living, maintained by Ops |
| **Troubleshooting Playbooks** | Diagnostic steps for known issues | Support, SRE | Living, updated as issues arise |
| **Test Inventory** | Catalog of validation queries and tests | QA, Data Engineers | Reference, updated as needed |
| **Handoff Documentation** | Migration summary and system overview | All stakeholders | Archived after handoff |

<div style="border-left: 4px solid #2196f3; background: #e3f2fd; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">💡</span>
        <div>
            <strong style="color: #1565c0; font-size: 1.1em;">Documentation Philosophy</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                The goal is to transfer <em>knowledge</em>, not just documents. Schedule hands-on sessions, pair programming, and shadowing opportunities alongside written documentation. The best runbook is one the Ops team helped write.
            </p>
        </div>
    </div>
</div>

## Architecture Decision Records (ADRs)

ADRs are lightweight documents that capture important architectural decisions made during the migration. They answer the question: **"Why did we build it this way?"**

### ADR Template

Use a consistent structure for all ADRs:

<div class="code-block" data-language="markdown">
# ADR-NNN: [Short Title]

## Status
[Proposed | Accepted | Deprecated | Superseded by ADR-XXX]

## Context
What is the issue we're facing? What forces are at play?
(Technical, political, social, and project-specific)

## Decision
What did we decide to do?
(State the decision clearly and concisely)

## Consequences
What are the positive and negative outcomes?
(Both benefits and tradeoffs)

## Alternatives Considered
What other options did we evaluate?
(Brief description of why they were rejected)
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-markdown.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'markdown';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>

### Example ADRs for Migration Projects

| ADR # | Title | Key Question |
|-------|-------|-------------|
| **001** | Unity Catalog vs Hive Metastore | Which catalog architecture for our use case? |
| **002** | Delta Lake vs Iceberg table format | Which open table format provides best tradeoffs? |
| **003** | Lift-and-shift vs rewrite ETL patterns | Minimize risk or modernize during migration? |
| **004** | Shared cluster vs serverless compute | Cost optimization vs developer experience? |
| **005** | External tables vs managed tables | Where to store migrated data? |
| **006** | Databricks SQL vs external BI tools | Native vs integrated analytics? |
| **007** | Bronze/Silver/Gold medallion architecture | How to structure the lakehouse? |
| **008** | Reverse sync strategy to {SOURCE_PLATFORM} | How to maintain coexistence during transition? |

<div style="border-left: 4px solid #ff9800; background: #fff3e0; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">⚠️</span>
        <div>
            <strong style="color: #e65100; font-size: 1.1em;">ADRs Are Immutable</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                Once accepted, ADRs should never be edited. If a decision changes, create a new ADR that supersedes the old one. This preserves the historical context of why decisions were made at that point in time.
            </p>
        </div>
    </div>
</div>
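
The supersede-instead-of-edit rule can be illustrated with a minimal Python sketch. The `ADR` class, the `supersede` helper, and the ADR-009 content are hypothetical names made up for this illustration; only the status wording comes from the template above.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ADR:
    """One Architecture Decision Record; frozen so fields cannot be mutated in place."""
    number: int
    title: str
    status: str
    decision: str


def supersede(old: ADR, new_title: str, new_decision: str, next_number: int) -> Tuple[ADR, ADR]:
    """Return (old ADR marked as superseded, new accepted ADR).

    The old record's title and decision are carried over unchanged;
    only its status line points at the replacement.
    """
    new = ADR(number=next_number, title=new_title, status="Accepted", decision=new_decision)
    retired = replace(old, status=f"Superseded by ADR-{new.number:03d}")
    return retired, new


# Hypothetical example: the decision text here is invented for illustration.
adr_004 = ADR(4, "Shared cluster vs serverless compute", "Accepted", "Use shared clusters")
retired, adr_009 = supersede(
    adr_004,
    new_title="Serverless compute for SQL workloads",
    new_decision="Move SQL workloads to serverless",
    next_number=9,
)
print(retired.status)    # Superseded by ADR-009
print(retired.decision)  # Use shared clusters
```

The original decision text stays readable in the retired record, which is the historical context the rule is meant to protect.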

### Example: Complete ADR

<div class="code-block" data-language="markdown">
# ADR-003: Lift-and-Shift vs Rewrite ETL Patterns

## Status
Accepted (2024-01-15)

## Context
We have 200+ ETL pipelines in {SOURCE_PLATFORM} written as stored procedures 
and scheduled tasks. We face the decision of how to migrate them:

1. Lift-and-shift: Convert SQL to Databricks SQL with minimal changes
2. Rewrite: Modernize to Spark Declarative Pipelines (formerly Delta Live Tables, DLT)

Forces:
- Tight 6-month migration deadline
- Team has limited Spark/Python experience
- Existing pipelines are stable but difficult to maintain
- Business wants improved data quality and observability

## Decision
We will use a **hybrid approach**:
- Wave 1 (60 pipelines): Lift-and-shift to Databricks SQL
- Wave 2 (80 pipelines): Modernize to DLT after team training
- Wave 3 (60 pipelines): DLT with quality constraints

## Consequences

**Positive:**
- Meets timeline by getting quick wins with lift-and-shift
- Team learns Databricks gradually, reducing risk
- Later waves benefit from improved patterns and observability
- Business gets value incrementally

**Negative:**
- Wave 1 pipelines miss DLT benefits (auto-scaling, quality checks)
- May need to revisit Wave 1 pipelines post-migration
- Inconsistent patterns across waves during transition

## Alternatives Considered

**Option A: Rewrite everything to DLT**
- Rejected: Too risky given timeline and team skills
- Would delay value delivery by 3-4 months

**Option B: Lift-and-shift everything**
- Rejected: Misses opportunity to modernize
- Doesn't address technical debt or improve data quality
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-markdown.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'markdown';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>

## Runbooks

Runbooks are step-by-step operational procedures for common tasks. Unlike ADRs, runbooks are **living documents** maintained by the operations team.

### Runbook Structure

<br />
<div class="mermaid">
flowchart TD
    A["<b>Runbook</b>"] --> B["<b>Purpose</b><br/>What task does this cover?"]
    A --> C["<b>Prerequisites</b><br/>What access/tools are needed?"]
    A --> D["<b>Steps</b><br/>Detailed procedure"]
    A --> E["<b>Validation</b><br/>How to verify success?"]
    A --> F["<b>Rollback</b><br/>How to undo if needed?"]
    A --> G["<b>Escalation</b><br/>Who to contact for help?"]
    style A fill:#ff9800,stroke:#e65100,color:#fff
    style B fill:#fff3e0,stroke:#ff9800
    style C fill:#fff3e0,stroke:#ff9800
    style D fill:#fff3e0,stroke:#ff9800
    style E fill:#fff3e0,stroke:#ff9800
    style F fill:#fff3e0,stroke:#ff9800
    style G fill:#fff3e0,stroke:#ff9800
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>
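
One way to keep runbooks consistent with this structure is a small check that runs before a runbook is merged. The Python sketch below assumes each runbook is a Markdown file whose six sections appear as `## ` headings; the function name `missing_sections` and the sample draft are hypothetical.

```python
import re
from typing import List

# The six sections from the runbook structure diagram.
REQUIRED_SECTIONS = ["Purpose", "Prerequisites", "Steps", "Validation", "Rollback", "Escalation"]


def missing_sections(runbook_markdown: str) -> List[str]:
    """Return the required sections that have no matching '## <name>' heading."""
    found = {
        match.group(1)
        for match in re.finditer(r"^##[ \t]+(.+?)[ \t]*$", runbook_markdown, flags=re.MULTILINE)
    }
    return [section for section in REQUIRED_SECTIONS if section not in found]


# Hypothetical draft that is still missing three sections.
draft = """# Runbook: Delta Table Optimization

## Purpose
Run OPTIMIZE and VACUUM on a schedule.

## Steps
### 1. Run OPTIMIZE

## Validation
Confirm file counts dropped.
"""

print(missing_sections(draft))  # ['Prerequisites', 'Rollback', 'Escalation']
```

A check like this only confirms that the headings exist; whether the content is followable still needs the walkthrough with someone unfamiliar with the system.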

### Essential Runbooks for Migration Handoff

| Runbook | Purpose | Frequency |
|---------|---------|----------|
| **Cluster Startup/Shutdown** | Manage interactive clusters for cost control | Daily |
| **Job Failure Investigation** | Diagnose and resolve pipeline failures | As needed |
| **Workspace User Management** | Add/remove users, assign permissions | Weekly |
| **Cost Analysis and Optimization** | Review DBU consumption, identify waste | Monthly |
| **Schema Evolution** | Add columns or modify table structures | As needed |
| **Unity Catalog Grant Management** | Manage data access permissions | As needed |
| **Delta Table Optimization** | Run OPTIMIZE and VACUUM operations | Weekly |
| **Secret Rotation** | Update credentials and API keys | Quarterly |
| **Emergency Rollback** | Restore previous table version using time travel | Emergency only |

<div style="border-left: 4px solid #2196f3; background: #e3f2fd; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">💡</span>
        <div>
            <strong style="color: #1565c0; font-size: 1.1em;">Runbook Best Practices</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                Include screenshots, example commands, and expected outputs. Test each runbook with someone unfamiliar with the system - if they can't follow it, rewrite it. Store runbooks in version control (Git) alongside code.
            </p>
        </div>
    </div>
</div>

### Example: Job Failure Investigation Runbook

<div class="code-block" data-language="markdown">
# Runbook: Investigate Failed Databricks Job

## Purpose
Diagnose and resolve failed Databricks job runs.

## Prerequisites
- Access to Databricks workspace
- CAN MANAGE permission on job (or workspace admin)
- Access to CloudWatch/Application Insights (for driver logs)

## Steps

### 1. Locate the Failed Job Run
- Navigate to **Workflows** → **Jobs** in Databricks workspace
- Filter by status: "Failed"
- Click on the job name to view run history

### 2. Review Run Details
- Click on the failed run ID
- Note the **failure reason** in the status banner
- Check **Duration** - did it timeout or fail quickly?

### 3. Examine Task-Level Errors
- Expand the failed task in the run graph
- Click **View Logs** → **Error Logs**
- Look for:
  - Python/SQL exceptions
  - "Table not found" errors
  - "Permission denied" errors
  - "Out of memory" errors

### 4. Common Failure Patterns

**Pattern A: Table Not Found**
```
Error: Table or view 'catalog.schema.table' not found
```
→ Check if upstream job succeeded and created the table
→ Verify catalog/schema permissions

**Pattern B: Permission Denied**
```
Error: User does not have SELECT privilege on table
```
→ Run: `SHOW GRANTS ON TABLE catalog.schema.table`
→ Grant missing permissions via Unity Catalog

**Pattern C: Out of Memory**
```
Error: java.lang.OutOfMemoryError: Java heap space
```
→ Increase cluster size or enable autoscaling
→ Optimize query (reduce shuffle, filter earlier)

## Validation
- Re-run the job after fixing the issue
- Verify downstream dependencies completed successfully
- Check data quality in target tables

## Rollback
If the fix caused issues:
```sql
RESTORE TABLE catalog.schema.table TO VERSION AS OF <previous_version>;
```

## Escalation
- For infrastructure issues: Contact SRE team (#sre-oncall)
- For data issues: Contact Data Engineering lead
- For Databricks platform issues: Open support ticket
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-markdown.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'markdown';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>

## Troubleshooting Playbooks

Playbooks differ from runbooks in that they're diagnostic guides, not step-by-step procedures. They help responders **investigate** rather than **execute**.

### Playbook vs Runbook

| Aspect | Runbook | Playbook |
|--------|---------|----------|
| **Purpose** | Perform a task | Diagnose a problem |
| **Structure** | Linear steps | Decision tree |
| **Output** | Task completed | Root cause identified |
| **Example** | "How to add a user" | "Why is this query slow?" |

### Example: Query Performance Troubleshooting Playbook

<br />
<div class="mermaid">
flowchart TD
    START["Query Running Slow"] --> CHECK1{"Check Query<br/>Profile"}
    CHECK1 --> |High shuffle time| SHUFFLE["<b>Shuffle Issue</b><br/>• Repartition data<br/>• Increase cluster size<br/>• Add broadcast hint"]
    CHECK1 --> |High scan time| SCAN["<b>Scan Issue</b><br/>• Check file sizes<br/>• Run OPTIMIZE<br/>• Add Z-ORDER<br/>• Add partition filters"]
    CHECK1 --> |High task time| TASK["<b>Task Skew</b><br/>• Repartition by key<br/>• Salting technique<br/>• Adaptive query execution"]
    CHECK1 --> |Spill to disk| MEMORY["<b>Memory Pressure</b><br/>• Increase executor memory<br/>• Reduce parallelism<br/>• Cache intermediate results"]
    SHUFFLE --> VALIDATE["Re-run and validate"]
    SCAN --> VALIDATE
    TASK --> VALIDATE
    MEMORY --> VALIDATE
    VALIDATE --> |Improved| DONE["✓ Resolved"]
    VALIDATE --> |Not improved| ESCALATE["Escalate to<br/>Databricks Support"]
    style START fill:#ffebee,stroke:#f44336
    style DONE fill:#e8f5e9,stroke:#4caf50
    style ESCALATE fill:#fff3e0,stroke:#ff9800
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>
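
Because a playbook is a decision tree rather than a linear procedure, it can also be kept as data that tooling or a chat bot can look up. The Python sketch below mirrors the branches of the diagram above; the names `PLAYBOOK`, `diagnose`, and `next_step` are hypothetical, and the symptom has to be read off the query profile by a person.

```python
from typing import Dict, List, Optional, Tuple

# Branches copied from the decision diagram: symptom -> (diagnosis, candidate actions).
PLAYBOOK: Dict[str, Tuple[str, List[str]]] = {
    "high shuffle time": ("Shuffle Issue", ["Repartition data", "Increase cluster size", "Add broadcast hint"]),
    "high scan time": ("Scan Issue", ["Check file sizes", "Run OPTIMIZE", "Add Z-ORDER", "Add partition filters"]),
    "high task time": ("Task Skew", ["Repartition by key", "Salting technique", "Adaptive query execution"]),
    "spill to disk": ("Memory Pressure", ["Increase executor memory", "Reduce parallelism", "Cache intermediate results"]),
}


def diagnose(symptom: str) -> Optional[Tuple[str, List[str]]]:
    """Look up the branch for a symptom; None means the playbook has no branch for it."""
    return PLAYBOOK.get(symptom.strip().lower())


def next_step(improved: bool) -> str:
    """After re-running the query, either close out or escalate."""
    return "Resolved" if improved else "Escalate to Databricks Support"


diagnosis, actions = diagnose("High scan time")
print(diagnosis)         # Scan Issue
print(actions[1])        # Run OPTIMIZE
print(next_step(False))  # Escalate to Databricks Support
```

Returning `None` for an unknown symptom is a deliberate choice in this sketch: a symptom the playbook does not cover is a signal to add a new branch after the incident.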

### Essential Troubleshooting Playbooks

Create playbooks for the most common issues encountered during migration:

| Playbook | Symptoms | Common Root Causes |
|----------|----------|-------------------|
| **Slow Query Performance** | Query takes >10x expected time | File sizes, skew, missing statistics |
| **Job Timeout** | Job exceeds configured timeout | Cluster size, infinite loops, stuck tasks |
| **Data Quality Failures** | DLT expectations failing | Schema drift, upstream changes, bad data |
| **Authentication Errors** | "User not authorized" errors | Token expiration, permission changes |
| **Cost Spike** | Unexpected DBU consumption | Runaway queries, misconfigured autoscaling |
| **Table Not Found** | "Table does not exist" errors | Catalog permissions, job dependencies |

<div style="border-left: 4px solid #ff9800; background: #fff3e0; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">⚠️</span>
        <div>
            <strong style="color: #e65100; font-size: 1.1em;">Update Playbooks Based on Real Incidents</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                After each production incident, update the relevant playbook with what was learned. The best playbooks evolve from real operational experience, not theoretical knowledge.
            </p>
        </div>
    </div>
</div>

## Test Inventory and Validation Query Library

Preserve the validation work done during migration for future use. This library helps operations teams verify system health and troubleshoot issues.

### Test Inventory Structure

Store test metadata in a Delta table for easy querying:

<div class="code-block" data-language="sql">
CREATE TABLE IF NOT EXISTS migration_tests.test_inventory (
    test_id STRING,
    test_name STRING,
    test_category STRING, -- structural, quantitative, aggregate, content
    table_name STRING,
    test_query STRING,
    expected_result STRING,
    severity STRING, -- critical, high, medium, low
    owner STRING,
    created_date DATE,
    last_executed TIMESTAMP,
    last_result STRING,
    notes STRING
)
USING DELTA
CLUSTER BY (table_name, test_category);

-- Example test records
INSERT INTO migration_tests.test_inventory VALUES
('TEST-001', 'Customer row count validation', 'quantitative', 
 'catalog.silver.customers', 
 'SELECT COUNT(*) FROM catalog.silver.customers', 
 '1000000', 'critical', 'data-eng-team', 
 DATE'2024-01-15', TIMESTAMP'2024-01-20 14:30:00', 'PASS',
 'Compares against source snapshot'),

('TEST-002', 'Order amount sum validation', 'aggregate',
 'catalog.silver.orders',
 'SELECT SUM(order_amount) FROM catalog.silver.orders',
 '50000000.00', 'critical', 'data-eng-team',
 DATE'2024-01-15', TIMESTAMP'2024-01-20 14:31:00', 'PASS',
 'Tolerance: ±0.01%');
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-sql.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'sql';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
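
The inventory stores `expected_result` as a string and, for aggregates, records a tolerance in `notes`. A minimal Python sketch of how a test runner might turn an actual query result into the `last_result` value follows. The `evaluate` function and its `tolerance_pct` parameter are hypothetical; in this sketch the tolerance is passed in explicitly rather than parsed from the notes column.

```python
from decimal import Decimal


def evaluate(expected: str, actual: str, tolerance_pct: str = "0") -> str:
    """Return 'PASS' if actual is within tolerance_pct percent of expected, else 'FAIL'.

    Decimal is used instead of float so that monetary sums compare exactly.
    """
    expected_value = Decimal(expected)
    actual_value = Decimal(actual)
    allowed_difference = abs(expected_value) * Decimal(tolerance_pct) / Decimal(100)
    return "PASS" if abs(actual_value - expected_value) <= allowed_difference else "FAIL"


# TEST-001 style: row counts must match exactly.
print(evaluate("1000000", "1000000"))                 # PASS
print(evaluate("1000000", "999999"))                  # FAIL

# TEST-002 style: sums may differ by up to 0.01% (5000.00 on 50000000.00).
print(evaluate("50000000.00", "50004000.00", "0.01"))  # PASS
print(evaluate("50000000.00", "50006000.00", "0.01"))  # FAIL
```

Keeping the tolerance as a separate, typed input is one design option; another is adding a numeric tolerance column to the inventory table so it does not live in free-text notes.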

### Validation Query Library

Package commonly used validation queries as SQL functions or stored procedures:

<div class="code-block" data-language="sql">
-- Create reusable validation functions
CREATE OR REPLACE FUNCTION migration_tests.row_count_check(
    table_name STRING
)
RETURNS TABLE(table_name STRING, row_count BIGINT, checked_at TIMESTAMP)
RETURN SELECT 
    table_name AS table_name,
    COUNT(*) AS row_count,
    CURRENT_TIMESTAMP() AS checked_at
FROM IDENTIFIER(table_name);

-- Usage
SELECT * FROM migration_tests.row_count_check('catalog.silver.customers');

-- Aggregate validation function
CREATE OR REPLACE FUNCTION migration_tests.numeric_aggregates(
    table_name STRING,
    column_name STRING
)
RETURNS TABLE(
    metric STRING, 
    value DOUBLE
)
RETURN SELECT 
    metric,
    value
FROM (
    SELECT 
        'count' AS metric, COUNT(*) AS value
    FROM IDENTIFIER(table_name)
    UNION ALL
    SELECT 
        'sum', SUM(IDENTIFIER(column_name))
    FROM IDENTIFIER(table_name)
    UNION ALL
    SELECT 
        'avg', AVG(IDENTIFIER(column_name))
    FROM IDENTIFIER(table_name)
    UNION ALL
    SELECT 
        'min', MIN(IDENTIFIER(column_name))
    FROM IDENTIFIER(table_name)
    UNION ALL
    SELECT 
        'max', MAX(IDENTIFIER(column_name))
    FROM IDENTIFIER(table_name)
);

-- Usage
SELECT * FROM migration_tests.numeric_aggregates(
    'catalog.silver.orders', 
    'order_amount'
);
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-sql.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'sql';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>

## Ownership Transfer to Operations

Transferring ownership from the migration team to operations is a structured process, not a single handoff meeting.

### Transfer Process

<br />
<div class="mermaid">
gantt
    title Ownership Transfer Timeline
    dateFormat YYYY-MM-DD
    section Preparation
    Documentation Complete    :a1, 2024-01-01, 14d
    Runbook Review            :a2, after a1, 7d
    section Training
    Ops Team Training         :b1, after a2, 10d
    Shadow Migration Team     :b2, after b1, 14d
    section Transition
    Joint On-Call             :c1, after b2, 21d
    Ops Primary On-Call       :c2, after c1, 30d
    Migration Team Off-Call   :milestone, after c2, 0d
</div>
<script type="module"> import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs"; mermaid.initialize({ startOnLoad: true, theme: "default" }); </script>

### Ownership Transfer Phases

| Phase | Duration | Migration Team Role | Ops Team Role | Success Criteria |
|-------|----------|--------------------|--------------|-----------------|
| **1. Documentation** | 2 weeks | Author runbooks and playbooks | Review for clarity | Ops can understand docs without help |
| **2. Training** | 2 weeks | Conduct training sessions | Attend and ask questions | Ops understands architecture |
| **3. Shadow** | 2 weeks | Perform operations | Observe and learn | Ops can anticipate actions |
| **4. Reverse Shadow** | 2 weeks | Observe and advise | Perform operations | Ops handles most issues independently |
| **5. Joint On-Call** | 3 weeks | Secondary on-call | Primary on-call | Ops resolves 80%+ incidents alone |
| **6. Ops Primary** | 4 weeks | Escalation only | Full ownership | Ops escalates <10% of incidents |

<div style="border-left: 4px solid #4caf50; background: #e8f5e9; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">✓</span>
        <div>
            <strong style="color: #2e7d32; font-size: 1.1em;">Gradual Transition Reduces Risk</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                The most successful transfers happen gradually over 8-12 weeks or more; the six phases in the table above add up to 15 weeks. Resist pressure to "throw it over the wall" immediately after cutover. Budget time for knowledge transfer in the project plan.
            </p>
        </div>
    </div>
</div>
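
To budget the transfer in a project plan, the phase durations from the table can be turned into calendar dates. The Python sketch below uses the phase names and week counts from the table; the `schedule` helper is hypothetical and the start date is only an illustrative value borrowed from the timeline diagram.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (phase name, duration in weeks), taken from the Ownership Transfer Phases table.
PHASES: List[Tuple[str, int]] = [
    ("Documentation", 2),
    ("Training", 2),
    ("Shadow", 2),
    ("Reverse Shadow", 2),
    ("Joint On-Call", 3),
    ("Ops Primary", 4),
]


def schedule(start: date) -> List[Tuple[str, date, date]]:
    """Lay the phases end to end; each row is (name, start date, end date exclusive)."""
    rows = []
    cursor = start
    for name, weeks in PHASES:
        end = cursor + timedelta(weeks=weeks)
        rows.append((name, cursor, end))
        cursor = end
    return rows


plan = schedule(date(2024, 1, 1))  # illustrative start date
for name, phase_start, phase_end in plan:
    print(f"{name:<15} {phase_start} -> {phase_end}")

total_weeks = sum(weeks for _, weeks in PHASES)
print(total_weeks)   # 15
print(plan[-1][2])   # 2024-04-15
```

Run back to back, the phases as listed take 15 weeks, which is worth checking against the window the project plan actually allows.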

### Training Topics for Operations

Ensure operations teams are trained on these key areas:

| Topic | Content | Delivery Method |
|-------|---------|----------------|
| **Databricks Platform Basics** | Workspace navigation, clusters, jobs, notebooks | Hands-on workshop (4 hrs) |
| **Architecture Overview** | Medallion architecture, catalogs, schemas, tables | Presentation + diagram review (2 hrs) |
| **Job Monitoring** | Workflow UI, logs, alerts, common failures | Live demo + exercises (3 hrs) |
| **Cost Management** | DBU consumption, cluster sizing, autoscaling | Presentation + cost analysis (2 hrs) |
| **Security & Access Control** | Unity Catalog, grants, service principals | Hands-on exercises (3 hrs) |
| **Incident Response** | Runbooks, playbooks, escalation paths | Tabletop exercise (2 hrs) |
| **Delta Table Operations** | OPTIMIZE, VACUUM, time travel, RESTORE | Hands-on exercises (2 hrs) |

**Total Training Time**: ~18 hours over 3-4 sessions

## Stakeholder Sign-Off Process

Formal sign-off closes the migration project and transitions to business-as-usual operations.

### Sign-Off Checklist

Before requesting sign-off, ensure all criteria are met:

<div class="code-block" data-language="markdown">
## Migration Completion Checklist

### Technical Deliverables
- [ ] All tables migrated and validated (100% row count match)
- [ ] All pipelines converted and tested in production
- [ ] All BI dashboards repointed to Databricks
- [ ] Parallel run completed (30 days, zero discrepancies)
- [ ] Performance benchmarks met or exceeded
- [ ] Security controls validated (authentication, authorization, audit)

### Documentation Deliverables
- [ ] Architecture Decision Records (ADRs) published
- [ ] Runbooks created and tested
- [ ] Troubleshooting playbooks documented
- [ ] Test inventory and validation query library delivered
- [ ] System architecture diagrams updated
- [ ] Data lineage documented

### Operations Deliverables
- [ ] Operations team trained
- [ ] Shadow period completed (2+ weeks)
- [ ] On-call rotation transferred
- [ ] Monitoring and alerting configured
- [ ] Incident response procedures tested
- [ ] Backup and recovery procedures validated

### Business Deliverables
- [ ] User acceptance testing (UAT) passed
- [ ] Business stakeholders trained on new platform
- [ ] Cost baseline established and optimized
- [ ] Compliance requirements met (SOC2, GDPR, HIPAA, etc.)
- [ ] Service Level Agreements (SLAs) defined and measured

### Decommissioning Prep
- [ ] {SOURCE_PLATFORM} decommissioning plan created
- [ ] Data retention requirements documented
- [ ] Contract termination timeline confirmed
- [ ] Final cost comparison report delivered
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-markdown.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'markdown';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
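
If the completion checklist is kept as a Markdown file in version control, a small script can list what is still open before anyone schedules the sign-off meeting. The sketch below assumes the checklist uses `### ` section headings and `- [ ]` / `- [x]` task items as in the template above. The function names `open_items` and `ready_for_sign_off` are hypothetical helpers, not part of any Databricks tooling.

```python
import re

# Matches task-list items such as "- [ ] text" or "- [x] text".
TASK_RE = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def open_items(checklist_md: str) -> dict[str, list[str]]:
    """Return unchecked items grouped by their '### ' section heading."""
    pending: dict[str, list[str]] = {}
    section = "Uncategorized"
    for line in checklist_md.splitlines():
        if line.startswith("### "):
            section = line[4:].strip()
            continue
        match = TASK_RE.match(line)
        # A single space inside the brackets means the item is still open.
        if match and match.group(1) == " ":
            pending.setdefault(section, []).append(match.group(2).strip())
    return pending


def ready_for_sign_off(checklist_md: str) -> bool:
    """True only when no unchecked items remain."""
    return not open_items(checklist_md)


# Illustrative excerpt of a partially completed checklist.
sample = """\
### Technical Deliverables
- [x] All tables migrated and validated (100% row count match)
- [ ] Performance benchmarks met or exceeded

### Operations Deliverables
- [x] Operations team trained
- [ ] On-call rotation transferred
"""

for section, items in open_items(sample).items():
    print(f"{section}: {len(items)} open")
    for item in items:
        print(f"  - {item}")
print("Ready for sign-off:", ready_for_sign_off(sample))
```

Grouping open items by section makes it easier to route each gap to the team that owns it.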

### Sign-Off Meeting Agenda

Conduct a formal sign-off meeting with key stakeholders:

| Agenda Item | Duration | Presenter | Deliverable |
|-------------|----------|-----------|-------------|
| **1. Migration Summary** | 15 min | Project Manager | Executive summary deck |
| **2. Success Metrics** | 15 min | Data Engineering Lead | KPI dashboard |
| **3. Validation Results** | 20 min | QA Lead | Reconciliation report |
| **4. Performance Benchmarks** | 15 min | Platform Architect | Performance comparison |
| **5. Cost Analysis** | 15 min | FinOps Lead | Cost savings report |
| **6. Documentation Review** | 10 min | Technical Writer | Document repository tour |
| **7. Operations Readiness** | 10 min | SRE Manager | Training completion report |
| **8. Lessons Learned** | 15 min | All | Retrospective summary |
| **9. Sign-Off** | 10 min | All | Formal acceptance document |

**Total Duration**: ~2 hours

### Formal Sign-Off Document Template

<div class="code-block" data-language="markdown">
# {SOURCE_PLATFORM} to Databricks Migration - Formal Acceptance

**Project Name**: {SOURCE_PLATFORM} to Databricks Migration  
**Sign-Off Date**: January 31, 2024  
**Project Duration**: September 1, 2023 - January 31, 2024 (5 months)

## Executive Summary
This document confirms the successful completion of the {SOURCE_PLATFORM} to 
Databricks migration project. All technical deliverables have been completed, 
validated, and transferred to the operations team.

## Scope Delivered
- ✓ 150 tables migrated (100% of in-scope tables)
- ✓ 85 ETL pipelines converted and validated
- ✓ 45 BI dashboards repointed to Databricks
- ✓ 30-day parallel run completed with zero discrepancies
- ✓ Operations team trained and on-call rotation established

## Success Criteria Met
| Criterion | Target | Actual | Status |
|-----------|--------|--------|--------|
| Data validation pass rate | 100% | 100% | ✓ |
| Query runtime | ≤ {SOURCE_PLATFORM} baseline | 2.3x faster | ✓ |
| Monthly cost reduction | 30% | 42% | ✓ |
| Uptime SLA | 99.9% | 99.95% | ✓ |
| Data loss incidents | 0 | 0 | ✓ |

## Stakeholder Acknowledgment
By signing below, stakeholders acknowledge acceptance of the migration 
deliverables and agree that the project is complete.

**Executive Sponsor**  
Name: _____________________  
Signature: _________________  Date: __________

**Data Engineering Manager**  
Name: _____________________  
Signature: _________________  Date: __________

**SRE Manager**  
Name: _____________________  
Signature: _________________  Date: __________

**Business Intelligence Lead**  
Name: _____________________  
Signature: _________________  Date: __________

## Next Steps
1. Archive project documentation to company knowledge base
2. Conduct post-implementation review (PIR) in 90 days
3. Begin {SOURCE_PLATFORM} decommissioning (see 6.2)
4. Transition to continuous improvement mode
</div>

<link href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/prism.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/components/prism-markdown.min.js"></script>

<script>
(function() {
    document.querySelectorAll('.code-block').forEach(function(block) {
        var lang = block.getAttribute('data-language') || 'markdown';
        var code = block.textContent.trim();
        var id = 'code-' + Math.random().toString(36).substr(2, 9);
        
        block.innerHTML = 
            '<div style="position:relative;margin:16px 0;">' +
                '<button class="copy-btn" style="position:absolute;top:8px;right:8px;padding:4px 12px;font-size:12px;background:#ddd;color:#333;border:1px solid #ccc;border-radius:4px;cursor:pointer;z-index:10;">Copy</button>' +
                '<pre style="background:#f8f8f8;border-radius:8px;padding:16px;padding-top:40px;overflow-x:auto;margin:0;border:1px solid #e0e0e0;"><code id="' + id + '" class="language-' + lang + '" style="font-family:Consolas,Monaco,monospace;font-size:14px;"></code></pre>' +
            '</div>';
        
        var codeEl = document.getElementById(id);
        codeEl.textContent = code;
        Prism.highlightElement(codeEl);
        
        block.querySelector('.copy-btn').onclick = function() {
            var t = document.createElement('textarea');
            t.value = code;
            document.body.appendChild(t);
            t.select();
            document.execCommand('copy');
            document.body.removeChild(t);
            this.textContent = '✓ Copied!';
            setTimeout(() => this.textContent = 'Copy', 2000);
        };
    });
})();
</script>
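
The success criteria table in the sign-off template compares a target with an actual value, and the status column can be derived rather than filled in by hand. The following is a minimal sketch using the example figures from the template. The `Criterion` class and the `higher_is_better` flag are illustrative names, and expressing "2.3x faster" as a runtime ratio of 1/2.3 against the source baseline is an assumption about how that metric would be measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    target: float
    actual: float
    higher_is_better: bool = True

    def met(self) -> bool:
        """Compare actual against target in the direction that counts as success."""
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


# Example values from the template; replace with measured results.
criteria = [
    Criterion("Data validation pass rate (%)", 100.0, 100.0),
    # Assumption: runtime ratio vs. source baseline, where 1.0 means parity.
    Criterion("Query runtime ratio vs. source", 1.0, 1 / 2.3, higher_is_better=False),
    Criterion("Monthly cost reduction (%)", 30.0, 42.0),
    Criterion("Uptime (%)", 99.9, 99.95),
    Criterion("Data loss incidents", 0, 0, higher_is_better=False),
]

failed = [c.name for c in criteria if not c.met()]
for c in criteria:
    status = "PASS" if c.met() else "FAIL"
    print(f"{status}  {c.name}: target {c.target}, actual {c.actual:.2f}")
print("All criteria met:", not failed)
```

Recording the comparison direction next to each criterion avoids ambiguity for metrics such as runtime or incident counts, where lower values are better.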

## Summary and Key Takeaways

### Documentation Artifacts

| Artifact | Purpose | Owner After Handoff |
|----------|---------|--------------------|
| **ADRs** | Explain architectural decisions | Architecture team (read-only) |
| **Runbooks** | Execute operational tasks | SRE/Ops team (living) |
| **Playbooks** | Diagnose and troubleshoot issues | SRE/Ops team (living) |
| **Test Library** | Validate system health | QA/Data Engineering (reference) |
| **Training Materials** | Onboard new team members | Learning & Development |

### Knowledge Transfer Best Practices

- **Start early**: Begin documentation during migration, not at the end
- **Transfer knowledge, not just documents**: Hands-on training beats written docs
- **Gradual transition**: 8-12 week handoff reduces risk
- **Test the docs**: Have someone unfamiliar execute runbooks
- **Make docs discoverable**: Use consistent naming and a searchable repository
- **Keep docs alive**: Ops team owns updates, not migration team

### Project Closure Checklist

✓ ADRs published and archived  
✓ Runbooks created and tested  
✓ Troubleshooting playbooks documented  
✓ Test inventory delivered  
✓ Operations team trained  
✓ On-call rotation transferred  
✓ Stakeholder sign-off obtained  
✓ Post-implementation review scheduled

<div style="border-left: 4px solid #4caf50; background: #e8f5e9; padding: 16px 20px; border-radius: 4px; margin: 16px 0;">
    <div style="display: flex; align-items: flex-start; gap: 12px;">
        <span style="font-size: 24px;">✓</span>
        <div>
            <strong style="color: #2e7d32; font-size: 1.1em;">The Best Migrations Don't End</strong>
            <p style="margin: 8px 0 0 0; color: #333;">
                The most successful migrations transition from "project mode" to "continuous improvement mode." The operations team should feel empowered to evolve the platform beyond what the migration team delivered.
            </p>
        </div>
    </div>
</div>

### Next Steps

With documentation complete and ownership transferred, proceed to decommission the source platform:

- [**6.2 - Decommissioning and Retirement**]($./6.2 - Decommissioning and Retirement) - Safely retire {SOURCE_PLATFORM} infrastructure

<div style="color: #FF3621; font-weight: bold; font-size: 2em; margin-bottom: 12px;">COURSE DEVELOPER (remove before publishing)</div>

### Template Customization

**Placeholders to replace:**
- `{SOURCE_PLATFORM}` - Source platform name

**Platform-specific additions:**
- Update ADR examples with platform-specific architectural decisions (e.g., Redshift Spectrum vs external tables, BigQuery slots vs Databricks compute)
- Add runbooks for platform-specific coexistence patterns (e.g., reverse sync mechanisms)
- Include troubleshooting playbooks for known migration issues from that platform
- Update success criteria examples with realistic metrics for that platform

**Content notes:**
- Emphasize that this is **project closure** documentation, not ongoing operational documentation
- ADRs should be stored in version control (Git) alongside code, not in wikis or shared drives
- Runbooks should include screenshots and example outputs
- Test inventory should be stored as Delta tables for easy querying
- The knowledge transfer timeline (8-12 weeks) should be included in the initial project plan

&copy; 2026 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>
