# Databricks Jobs Orchestration

---

## Agenda

1. Introduction to Databricks Jobs
2. Multi-task Jobs and dependencies
3. Task Types: Notebook, DLT, SQL, dbt
4. Parameterization and Widgets
5. Monitoring, Alerting and Retry Logic
6. Best Practices for production Jobs

---

## Learning objectives

After this module you will be able to:
- Create multi-task Databricks Jobs
- Configure dependencies between tasks
- Parameterize workflows
- Monitor and debug Jobs
- Implement retry logic and alerting

---

## 1. Introduction to Lakeflow Jobs

**Lakeflow Jobs** (formerly Databricks Jobs) is a managed orchestration service for:
- ETL/ELT pipelines
- Machine Learning workflows
- Scheduled reports
- Data quality checks

### Key features:
- **Multi-task workflows**: DAG (Directed Acyclic Graph)
- **Task types**: Notebook, Lakeflow SDP (formerly DLT), SQL, dbt, Python wheel, JAR
- **Scheduling**: cron, continuous, triggered
- **Retry logic**: automatic retries on failure
- **Alerting**: email, webhooks, integrations
- **Cost optimization**: Serverless, spot instances, autoscaling

### Serverless Jobs (recommended since 2024)

**Serverless compute for Jobs** removes the need to configure clusters:
- Near-zero startup time
- Automatic optimization by Databricks
- Pay-per-use billing
- Autoscaling and Photon enabled by default
- Performance Mode (since April 2025): a choice between performance and cost

### Jobs vs Lakeflow SDP:

| Feature | Lakeflow Jobs | Lakeflow SDP (DLT) |
|---------|---------------|-------------------|
| Use Case | General orchestration | ETL pipelines |
| Task Types | Notebook, SQL, dbt, SDP | SDP only |
| Dependencies | Manual configuration | Automatic (DAG) |
| Data Quality | Custom code | Built-in expectations |
| Flexibility | High | Opinionated |

---

## 2. Multi-task Jobs and dependencies

### Basic Job structure:

```
Job: Daily_ETL_Pipeline
├── Task 1: ingest_raw_data (Notebook)
├── Task 2: validate_data (SQL) → depends_on: Task 1
├── Task 3: transform_silver (Notebook) → depends_on: Task 2
└── Task 4: aggregate_gold (DLT) → depends_on: Task 3
```

### Example Job configuration (JSON):

In [None]:
# Job configuration for the Databricks REST API (the same settings are available in the UI)

job_config = {
    "name": "KION_Daily_Orders_ETL",
    "email_notifications": {
        "on_failure": ["data-team@kion.com"],
        "on_success": ["data-team@kion.com"]
    },
    "timeout_seconds": 7200,  # 2 hours
    "max_concurrent_runs": 1,
    "format": "MULTI_TASK",
    "job_clusters": [
        {
            # Shared job cluster: tasks that reference this key reuse the same cluster
            "job_cluster_key": "etl_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                # Use either a fixed num_workers or autoscale, not both
                "autoscale": {
                    "min_workers": 1,
                    "max_workers": 4
                }
            }
        }
    ],
    "tasks": [
        {
            "task_key": "ingest_bronze",
            "notebook_task": {
                "notebook_path": "/Workspace/KION/notebooks/01_ingest_orders",
                "base_parameters": {
                    "source_path": "/Volumes/main/default/kion_data/orders",
                    "target_table": "bronze_orders"
                }
            },
            "job_cluster_key": "etl_cluster",
            "timeout_seconds": 3600
        },
        {
            "task_key": "transform_silver",
            "depends_on": [{"task_key": "ingest_bronze"}],
            "notebook_task": {
                "notebook_path": "/Workspace/KION/notebooks/02_transform_silver",
                "base_parameters": {
                    "source_table": "bronze_orders",
                    "target_table": "silver_orders"
                }
            },
            # Reuses the cluster started for ingest_bronze
            "job_cluster_key": "etl_cluster"
        },
        {
            "task_key": "aggregate_gold",
            "depends_on": [{"task_key": "transform_silver"}],
            "sql_task": {
                "warehouse_id": "abc123def456",
                "query": {
                    "query_id": "gold_aggregations_query_id"
                }
            }
        }
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # Daily at 2 AM
        "timezone_id": "Europe/Warsaw"
    }
}

print("Job configuration ready!")
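
A configuration dictionary like the one above still has to be sent to the workspace. The sketch below only builds the HTTP request for the Jobs API `jobs/create` endpoint and does not send it. The workspace URL and token are hypothetical placeholders, and `build_create_job_request` is an illustrative helper, not part of any Databricks library.

```python
import json
import urllib.request


def build_create_job_request(host: str, token: str, job_config: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Jobs API create endpoint."""
    url = f"{host.rstrip('/')}/api/2.1/jobs/create"
    body = json.dumps(job_config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical workspace URL and token, for illustration only.
request = build_create_job_request(
    "https://example.cloud.databricks.com/",
    "dapi-EXAMPLE-TOKEN",
    {"name": "KION_Daily_Orders_ETL", "tasks": []},
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

In practice the Databricks CLI, SDK, or Terraform provider is usually a more convenient way to deploy the same settings. Keep the token in a secret store rather than in the notebook.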

### Dependency patterns:

#### 1. Linear pipeline (sequential):
```
Task A → Task B → Task C → Task D
```

#### 2. Fan-out (parallel processing):
```
      Task A
     /  |  \
    B   C   D
```

#### 3. Fan-in (merge results):
```
    A   B   C
     \  |  /
      Task D
```

#### 4. Diamond (complex):
```
      Task A
      /    \
     B      C
      \    /
      Task D
```
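
All four patterns are expressed with the same `depends_on` lists. As a rough illustration of how such a graph resolves into an execution order, the sketch below groups tasks into "waves" that could run in parallel. It uses Python's standard `graphlib` module; `execution_waves` is a hypothetical helper and only approximates what the Jobs scheduler does.

```python
from graphlib import TopologicalSorter


def execution_waves(tasks: list[dict]) -> list[list[str]]:
    """Group task keys into waves; tasks in one wave have no unmet dependencies."""
    # Map each task to the set of tasks it depends on
    graph = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in tasks
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies form a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# The diamond pattern: A fans out to B and C, which fan in to D
diamond = [
    {"task_key": "A"},
    {"task_key": "B", "depends_on": [{"task_key": "A"}]},
    {"task_key": "C", "depends_on": [{"task_key": "A"}]},
    {"task_key": "D", "depends_on": [{"task_key": "B"}, {"task_key": "C"}]},
]
print(execution_waves(diamond))  # [['A'], ['B', 'C'], ['D']]
```

A cyclic definition fails in `prepare()`, which mirrors the requirement that a Job's tasks form an acyclic graph.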

---

## 3. Task Types: Notebook, DLT, SQL, dbt

### 1. Notebook Task

**Use case**: Python/Scala/R processing, custom logic

**Example notebook** (`01_ingest_orders.ipynb`):

In [None]:
# Notebook Task Example
# File: 01_ingest_orders.ipynb

from datetime import datetime

# Declare widgets so the Job can pass parameters (the defaults apply to interactive runs)
dbutils.widgets.text("source_path", "/Volumes/main/default/kion_data/orders")
dbutils.widgets.text("target_table", "bronze_orders")
dbutils.widgets.text("run_date", "")

source_path = dbutils.widgets.get("source_path")
target_table = dbutils.widgets.get("target_table")
# Fall back to today's date when the Job does not supply run_date
run_date = dbutils.widgets.get("run_date") or datetime.now().strftime("%Y-%m-%d")

print(f"Starting ingestion from {source_path} to {target_table}")
print(f"Run date: {run_date}")

In [None]:
import json

from pyspark.sql.functions import col, current_timestamp, lit

# Load data
df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load(f"{source_path}/*.csv")
)

# Add audit columns
df_with_audit = (
    df
    .withColumn("ingestion_timestamp", current_timestamp())
    # _metadata.file_path replaces input_file_name(), which is deprecated on recent runtimes
    .withColumn("source_file", col("_metadata.file_path"))
    .withColumn("run_date", lit(run_date))
)

# Write to Delta
df_with_audit.write.format("delta").mode("append").saveAsTable(target_table)

# Count the ingested rows (this re-reads the source files)
row_count = df_with_audit.count()
print(f"Ingested {row_count} rows to {target_table}")

# Return metrics to the Job. exit() expects a string and ends the notebook,
# so it has to be the last statement in the cell.
dbutils.notebook.exit(json.dumps({
    "status": "success",
    "rows_ingested": row_count,
    "target_table": target_table
}))

### 2. SQL Task

**Use case**: SQL-based transformations, aggregations

**Example SQL query**:

In [None]:
# SQL Task - can be saved as a SQL Query in Databricks SQL

sql_query = """
-- Gold Layer Aggregation
CREATE OR REPLACE TABLE gold_daily_sales AS
SELECT 
    order_date,
    COUNT(DISTINCT customer_id) as unique_customers,
    COUNT(order_id) as total_orders,
    SUM(amount) as total_revenue,
    AVG(amount) as avg_order_value,
    MAX(amount) as max_order_value,
    CURRENT_TIMESTAMP() as calculated_at
FROM silver_orders
WHERE status = 'completed'
GROUP BY order_date
ORDER BY order_date DESC
"""

# In the Job configuration:
# "sql_task": {
#     "warehouse_id": "abc123",
#     "query": {"query_id": "saved_query_id"}
# }

### 3. DLT Task

**Use case**: Delta Live Tables pipeline execution

In [None]:
# DLT Task configuration
dlt_task = {
    "task_key": "run_dlt_pipeline",
    "depends_on": [{"task_key": "validate_source_data"}],
    "pipeline_task": {
        "pipeline_id": "abc123-dlt-pipeline-id",
        "full_refresh": False  # Incremental by default
    }
}

# The DLT pipeline runs as one task within a larger Job workflow

### 4. dbt Task

**Use case**: dbt models execution

In [None]:
# dbt Task configuration
dbt_task = {
    "task_key": "run_dbt_models",
    "dbt_task": {
        "project_directory": "/Workspace/KION/dbt",
        "commands": [
            "dbt deps",
            "dbt seed",
            "dbt run --select tag:daily",  # --select replaces the deprecated --models flag
            "dbt test"
        ],
        "profiles_directory": "/Workspace/KION/dbt",
        "warehouse_id": "abc123"
    }
}

# dbt models can run as tasks within a Job workflow

---

## 4. Parameterization and Widgets

### Passing parameters between tasks:

#### Method 1: Job-level parameters

In [None]:
# Job configuration with parameters
job_with_params = {
    "name": "Parameterized_ETL_Job",
    # Job-level parameters are passed to every task in the Job
    "parameters": [
        {"name": "environment", "default": "production"}
    ],
    "tasks": [
        {
            "task_key": "task_1",
            "notebook_task": {
                "notebook_path": "/Workspace/KION/task1",
                # base_parameters apply to this task only
                "base_parameters": {
                    "run_date": "{{job.start_time.iso_date}}",  # Dynamic value reference
                    "source_path": "/Volumes/main/default/kion_data"
                }
            }
        }
    ]
}

# In the notebook, read the parameters through widgets:
# dbutils.widgets.text("environment", "dev")
# environment = dbutils.widgets.get("environment")

#### Method 2: Task output → next task input

In [None]:
# Task 1 - publishes values for downstream tasks
# File: task1_ingest.ipynb

# ... ingestion logic ...

# Task values are the supported way to pass small values between tasks.
# The string passed to dbutils.notebook.exit() cannot be read with taskValues.get().
dbutils.jobs.taskValues.set(key="rows_ingested", value=1000)
dbutils.jobs.taskValues.set(key="target_table", value="bronze_orders")
dbutils.jobs.taskValues.set(key="max_order_date", value="2024-01-15")

In [None]:
# Task 2 - reads the values set by Task 1
# File: task2_transform.ipynb

# taskKey must match the task_key of the upstream task in the Job.
# debugValue is returned when the notebook runs outside a Job.
source_table = dbutils.jobs.taskValues.get(
    taskKey="task_1",
    key="target_table",
    debugValue="bronze_orders"
)
max_date = dbutils.jobs.taskValues.get(
    taskKey="task_1",
    key="max_order_date",
    debugValue="2024-01-15"
)

print(f"Processing data from {source_table} up to {max_date}")

# Use in transformation
df = spark.table(source_table).filter(f"order_date <= '{max_date}'")
# ... rest of transformation ...

#### Method 3: Dynamic value references

In [None]:
# Databricks resolves dynamic value references in parameters at run time:

dynamic_params = {
    "run_date": "{{job.start_time.iso_date}}",           # YYYY-MM-DD
    "run_timestamp": "{{job.start_time.iso_datetime}}",  # Full ISO timestamp
    "job_id": "{{job.id}}",
    "job_run_id": "{{job.run_id}}",    # ID of the whole Job run
    "task_run_id": "{{task.run_id}}"   # ID of the current task run
}

# Example usage:
# base_parameters: {
#     "processing_date": "{{job.start_time.iso_date}}",
#     "job_run_id": "{{job.run_id}}"
# }
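
Parameters always arrive in the notebook as strings, so a date parameter is worth validating before it is used in a filter. The sketch below is one way to do that with the standard library; `resolve_run_date` is a hypothetical helper name, and the fallback date is passed in explicitly so the function stays easy to test.

```python
from datetime import date


def resolve_run_date(raw_value: str, fallback: date) -> date:
    """Parse a YYYY-MM-DD parameter; use the fallback for an empty value."""
    cleaned = raw_value.strip()
    if not cleaned:
        return fallback
    # Raises ValueError on malformed input, e.g. an unresolved "{{...}}" reference
    return date.fromisoformat(cleaned)


print(resolve_run_date("2024-01-15", date(2000, 1, 1)))  # 2024-01-15
print(resolve_run_date("", date(2000, 1, 1)))            # 2000-01-01
```

In a notebook task the call might look like `resolve_run_date(dbutils.widgets.get("run_date"), date.today())`. Failing fast on a malformed value avoids silently processing the wrong partition.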

---

## 5. Monitoring, Alerting and Retry Logic

### Retry Logic:

In [None]:
# Task-level retry configuration
task_with_retry = {
    "task_key": "data_ingestion",
    "notebook_task": {
        "notebook_path": "/Workspace/KION/ingest"
    },
    "max_retries": 3,  # Retry up to 3 times
    "min_retry_interval_millis": 60000,  # Wait 1 minute between retries
    "retry_on_timeout": True
}

# Best practice: use retries for transient errors (network, API rate limits)
# Do NOT use retries for data quality issues: re-running will not fix bad data
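
Task-level retries restart the whole task. For a single flaky call inside a notebook, a narrower retry can be cheaper. The sketch below retries only errors marked as transient and doubles the wait after each attempt. `TransientError` and `call_with_retry` are hypothetical names introduced for this example; in real code you would catch the specific exception types your client library raises.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def call_with_retry(func, max_retries=3, base_delay_seconds=1.0, sleep=time.sleep):
    """Call func; retry only TransientError, doubling the delay after each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: let the task fail
            sleep(base_delay_seconds * 2 ** attempt)


attempts = []


def flaky_fetch():
    """Fails twice with a transient error, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("rate limited")
    return "payload"


delays = []  # record the delays instead of actually sleeping
result = call_with_retry(flaky_fetch, sleep=delays.append)
print(result, delays)  # payload [1.0, 2.0]
```

Any other exception type, such as a data validation error, propagates on the first attempt, which matches the guidance above.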

### Email Alerting:

In [None]:
# Email notifications configuration
email_config = {
    "email_notifications": {
        "on_start": ["ops@kion.com"],
        "on_success": ["data-team@kion.com"],
        "on_failure": ["data-team@kion.com", "on-call@kion.com"],
        "on_duration_warning_threshold_exceeded": ["ops@kion.com"],
        "no_alert_for_skipped_runs": True
    },
    "health": {
        "rules": [
            {
                "metric": "RUN_DURATION_SECONDS",
                "op": "GREATER_THAN",
                "value": 3600  # Alert if job runs > 1 hour
            }
        ]
    }
}

### Webhook Integration (Slack, Teams, PagerDuty):

In [None]:
# Webhook configuration for Slack
webhook_config = {
    "webhook_notifications": {
        "on_failure": [
            {
                "id": "slack-data-alerts"
            }
        ],
        "on_success": [
            {
                "id": "slack-data-alerts"
            }
        ]
    }
}

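# The webhook must be created first as a notification destination in the Databricks UI
# (workspace admin settings → Notifications); "id" is the ID of that destination,
# and "slack-data-alerts" above is only a placeholder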

### Custom monitoring in a notebook:

In [None]:
# Custom monitoring logic in a task notebook

from datetime import datetime

from pyspark.sql.functions import col

# Assumption: the Job passes its run ID as a parameter, e.g. "job_run_id": "{{job.run_id}}"
dbutils.widgets.text("job_run_id", "")
job_run_id = dbutils.widgets.get("job_run_id")

# Load data
df = spark.table("bronze_orders")

# Data quality checks
total_rows = df.count()
if total_rows == 0:
    # Guard against division by zero below
    raise ValueError("Data quality issue: bronze_orders is empty")

null_order_ids = df.filter(col("order_id").isNull()).count()
null_percentage = (null_order_ids / total_rows) * 100

# Alert if quality threshold exceeded
if null_percentage > 5:
    error_msg = f"Data quality issue: {null_percentage:.2f}% null order_ids"
    print(error_msg)

    # Log to monitoring table
    monitoring_df = spark.createDataFrame([
        {
            "job_run_id": job_run_id,
            "check_type": "null_check",
            "metric_name": "null_order_id_percentage",
            "metric_value": null_percentage,
            "threshold": 5.0,
            "status": "FAILED",
            # A Python datetime: current_timestamp() returns a Column and cannot be used as a row value
            "timestamp": datetime.now()
        }
    ])
    monitoring_df.write.format("delta").mode("append").saveAsTable("job_monitoring_metrics")

    # Fail the job
    raise Exception(error_msg)

print(f"Quality check passed: {null_percentage:.2f}% null order_ids")
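
The threshold decision itself does not need Spark, so it can be pulled into a small function and unit-tested outside a cluster. The sketch below is one possible shape; `null_rate_check` and its return format are assumptions made for illustration.

```python
def null_rate_check(total_rows: int, null_rows: int, threshold_pct: float = 5.0) -> dict:
    """Compare the share of null rows against a percentage threshold."""
    if total_rows == 0:
        # An empty input is treated as a failure instead of dividing by zero
        return {"status": "FAILED", "null_pct": None, "reason": "empty input"}
    null_pct = null_rows / total_rows * 100
    status = "FAILED" if null_pct > threshold_pct else "PASSED"
    return {"status": status, "null_pct": round(null_pct, 2), "reason": ""}


print(null_rate_check(1000, 20)["status"])  # PASSED
print(null_rate_check(1000, 80)["status"])  # FAILED
print(null_rate_check(0, 0)["reason"])      # empty input
```

The notebook would then only compute the two counts with Spark and pass them in.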

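### Monitoring Job runs through system tables:

In [None]:
# Query system tables for job run history.
# Table and column names follow the documented system.lakeflow schema; verify them in your workspace.
job_history = spark.sql("""
    WITH latest_jobs AS (
        -- system.lakeflow.jobs keeps one row per change; keep the latest row per job
        SELECT workspace_id, job_id, name
        FROM system.lakeflow.jobs
        QUALIFY ROW_NUMBER() OVER (
            PARTITION BY workspace_id, job_id ORDER BY change_time DESC
        ) = 1
    )
    SELECT
        t.job_id,
        j.name AS job_name,
        t.run_id,
        MIN(t.period_start_time) AS start_time,
        MAX(t.period_end_time) AS end_time,
        TIMESTAMPDIFF(SECOND, MIN(t.period_start_time), MAX(t.period_end_time)) AS duration_seconds,
        MAX(t.result_state) AS result_state
    FROM system.lakeflow.job_run_timeline t
    JOIN latest_jobs j
        ON t.workspace_id = j.workspace_id AND t.job_id = j.job_id
    WHERE j.name = 'KION_Daily_Orders_ETL'
        AND t.period_start_time >= current_date() - INTERVAL 7 DAYS
    GROUP BY t.job_id, j.name, t.run_id
    ORDER BY start_time DESC
""")
job_history.display()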

In [None]:
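# SLA monitoring - track on-time completion
# Table and column names follow the documented system.lakeflow schema; verify them in your workspace.
sla_monitoring = spark.sql("""
    WITH latest_jobs AS (
        SELECT workspace_id, job_id, name
        FROM system.lakeflow.jobs
        QUALIFY ROW_NUMBER() OVER (
            PARTITION BY workspace_id, job_id ORDER BY change_time DESC
        ) = 1
    ),
    runs AS (
        -- One row per run: a long run can span several timeline rows
        SELECT
            t.run_id,
            MIN(t.period_start_time) AS start_time,
            MAX(t.period_end_time) AS end_time,
            MAX(t.result_state) AS result_state
        FROM system.lakeflow.job_run_timeline t
        JOIN latest_jobs j
            ON t.workspace_id = j.workspace_id AND t.job_id = j.job_id
        WHERE j.name = 'KION_Daily_Orders_ETL'
            AND t.period_start_time >= current_date() - INTERVAL 30 DAYS
        GROUP BY t.run_id
    ),
    daily_runs AS (
        SELECT
            DATE(start_time) AS run_date,
            COUNT(*) AS total_runs,
            SUM(CASE WHEN result_state = 'SUCCEEDED' THEN 1 ELSE 0 END) AS successful_runs,
            SUM(CASE WHEN result_state = 'FAILED' THEN 1 ELSE 0 END) AS failed_runs,
            AVG(TIMESTAMPDIFF(SECOND, start_time, end_time)) AS avg_duration_seconds
        FROM runs
        GROUP BY DATE(start_time)
    )
    SELECT
        run_date,
        total_runs,
        successful_runs,
        failed_runs,
        ROUND(successful_runs * 100.0 / total_runs, 2) AS success_rate_pct,
        ROUND(avg_duration_seconds / 60, 2) AS avg_duration_minutes
    FROM daily_runs
    ORDER BY run_date DESC
""")
sla_monitoring.display()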

---

## 6. Best Practices for production Jobs

### 1. Cluster Strategy:

In [None]:
# Best Practice: Job Clusters (new cluster per run)
job_cluster_config = {
    "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "autoscale": {
            "min_workers": 1,
            "max_workers": 8
        },
        "spark_conf": {
            "spark.databricks.delta.optimizeWrite.enabled": "true",
            "spark.databricks.delta.autoCompact.enabled": "true"
        },
        # Standard_DS3_v2 is an Azure node type, so spot settings go in azure_attributes
        # (on AWS, use "aws_attributes" with "availability": "SPOT_WITH_FALLBACK")
        "azure_attributes": {
            "availability": "SPOT_WITH_FALLBACK_AZURE",  # Cost optimization
            "spot_bid_max_price": -1  # -1: pay up to the on-demand price
        }
    }
}

# Alternative: cluster reuse for linked tasks. Define the cluster once in the
# job-level "job_clusters" list and reference it from each task:
# "job_cluster_key": "etl_cluster"

### 2. Error Handling in Notebooks:

In [None]:
# Robust error handling in notebook tasks

import json
import traceback

from pyspark.sql.functions import col

try:
    # Main processing logic
    df = spark.table("bronze_orders")

    # Validation
    assert df.count() > 0, "Source table is empty"

    # Transformation
    result_df = df.filter(col("order_id").isNotNull())

    # Write result
    result_df.write.format("delta").mode("overwrite").saveAsTable("silver_orders")
    rows_processed = spark.table("silver_orders").count()

except AssertionError as e:
    # Data validation failure: log it, then re-raise so the task is marked FAILED
    print(f"Validation Error: {e}")
    raise

except Exception as e:
    # Unexpected error: log the traceback, then re-raise
    print(f"Unexpected Error: {e}")
    traceback.print_exc()
    raise

# Reached only on success. Calling dbutils.notebook.exit() from an except block
# would end the task as successful, so retries and failure alerts would not fire.
dbutils.notebook.exit(json.dumps({
    "status": "SUCCESS",
    "rows_processed": rows_processed
}))

### 3. Idempotency:

In [None]:
# Best Practice: Idempotent operations
# A Job should be safe to run multiple times without side effects

# BAD: Non-idempotent
# df.write.format("delta").mode("append").saveAsTable("target")
# ^ Re-running duplicates data

In [None]:
# GOOD: Partition-specific overwrite
df.write.format("delta").mode("overwrite") \
    .option("replaceWhere", "order_date = '2024-01-15'") \
    .saveAsTable("target")

In [None]:
# GOOD: Idempotent with MERGE
from delta.tables import DeltaTable

target_table = DeltaTable.forName(spark, "target")
target_table.alias("target").merge(
    df.alias("source"),
    "target.order_id = source.order_id"
).whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()

In [None]:
# GOOD: Idempotent with overwrite
df.write.format("delta").mode("overwrite").saveAsTable("target")
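
The difference between the append and MERGE variants can be shown without Spark. The sketch below simulates a retried load against in-memory stand-ins for a table: a list for append and a dictionary keyed by `order_id` for merge. The function names and sample rows are invented for illustration.

```python
def append_load(target: list[dict], batch: list[dict]) -> None:
    """Non-idempotent: every run adds the batch again."""
    target.extend(batch)


def merge_load(target: dict[int, dict], batch: list[dict]) -> None:
    """Idempotent: rows are upserted by key, as with MERGE on order_id."""
    for row in batch:
        target[row["order_id"]] = row


batch = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 25.0}]
appended: list[dict] = []
merged: dict[int, dict] = {}

for _ in range(2):  # simulate the original run plus one retry
    append_load(appended, batch)
    merge_load(merged, batch)

print(len(appended), len(merged))  # 4 2
```

After the retry the appended target holds duplicates, while the merged target is in the same state as after the first run.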

### 4. Logging and Observability:

In [None]:
# Structured logging
import logging
from datetime import datetime

# Setup logger
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

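# Assumption: run_date arrives as a Job parameter, as in the ingestion notebook
run_date = dbutils.widgets.get("run_date")

# Log key events
logger.info(f"Job started at {datetime.now()}")
logger.info(f"Processing partition: {run_date}")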

# Log metrics
start_time = datetime.now()
df = spark.table("bronze_orders")
row_count = df.count()
end_time = datetime.now()
duration = (end_time - start_time).total_seconds()

logger.info(f"Processed {row_count} rows in {duration}s")

# Persist metrics to monitoring table
metrics_df = spark.createDataFrame([{
    "job_name": "KION_Daily_ETL",
    "run_date": run_date,
    "metric_name": "rows_processed",
    "metric_value": row_count,
    "execution_time_seconds": duration,
    "timestamp": datetime.now()
}])
metrics_df.write.format("delta").mode("append").saveAsTable("job_metrics")

### 5. Cost Optimization:

In [None]:
# Cost optimization strategies:

cost_optimized_config = {
    # 1. Use Spot instances (Azure shown to match the Standard_DS3_v2 node type;
    #    on AWS use "aws_attributes" with "SPOT_WITH_FALLBACK")
    "azure_attributes": {
        "availability": "SPOT_WITH_FALLBACK_AZURE"
    },
    
    # 2. Enable autoscaling
    "autoscale": {
        "min_workers": 1,
        "max_workers": 8
    },
    
    # 3. Set timeouts (a task- or job-level setting, not part of the cluster spec)
    "timeout_seconds": 3600,
    
    # 4. Cluster reuse for linked tasks: share one entry from "job_clusters"
    # "job_cluster_key": "etl_cluster",
    
    # 5. Photon acceleration
    "runtime_engine": "PHOTON",
    
    # 6. Right-size clusters
    "node_type_id": "Standard_DS3_v2",  # Choose appropriate size
}

# 7. Schedule during off-peak hours
schedule_config = {
    "quartz_cron_expression": "0 0 2 * * ?",  # 2 AM
    "timezone_id": "Europe/Warsaw"
}

---

## Complete example: Production-ready Multi-task Job

### Scenario: Daily Orders ETL Pipeline
```
1. validate_source → 2. ingest_bronze → 3. transform_silver
                                              ├─→ 4a. aggregate_gold_daily ─────┐
                                              └─→ 4b. aggregate_gold_customers ─┴─→ 5. data_quality_check
```

In [None]:
# Complete Job configuration
production_job = {
    "name": "KION_Daily_Orders_ETL_Production",
    "email_notifications": {
        "on_failure": ["data-team@kion.com", "on-call@kion.com"],
        "on_success": ["data-team@kion.com"]
    },
    "webhook_notifications": {
        "on_failure": [{"id": "slack-alerts"}]
    },
    "timeout_seconds": 7200,
    "max_concurrent_runs": 1,
    "format": "MULTI_TASK",
    "job_clusters": [
        {
            # Shared by ingest_bronze and transform_silver
            "job_cluster_key": "etl_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "autoscale": {"min_workers": 2, "max_workers": 4},
                # Azure spot settings, matching the Azure node type
                "azure_attributes": {"availability": "SPOT_WITH_FALLBACK_AZURE"}
            }
        }
    ],
    "tasks": [
        # Task 1: Validate source data availability
        {
            "task_key": "validate_source",
            "notebook_task": {
                "notebook_path": "/Workspace/KION/jobs/00_validate_source",
                "base_parameters": {
                    "source_path": "/Volumes/main/default/kion_data/orders",
                    "run_date": "{{job.start_time.iso_date}}"
                }
            },
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 1
            },
            "timeout_seconds": 600,
            "max_retries": 2
        },
        # Task 2: Ingest to Bronze
        {
            "task_key": "ingest_bronze",
            "depends_on": [{"task_key": "validate_source"}],
            "notebook_task": {
                "notebook_path": "/Workspace/KION/jobs/01_ingest_bronze",
                "base_parameters": {
                    "source_path": "/Volumes/main/default/kion_data/orders",
                    "target_table": "bronze_orders",
                    "run_date": "{{job.start_time.iso_date}}"
                }
            },
            "job_cluster_key": "etl_cluster",
            "timeout_seconds": 1800,
            "max_retries": 3,
            "min_retry_interval_millis": 120000
        },
        # Task 3: Transform to Silver
        {
            "task_key": "transform_silver",
            "depends_on": [{"task_key": "ingest_bronze"}],
            "notebook_task": {
                "notebook_path": "/Workspace/KION/jobs/02_transform_silver",
                "base_parameters": {
                    "source_table": "bronze_orders",
                    "target_table": "silver_orders",
                    "run_date": "{{job.start_time.iso_date}}"
                }
            },
            "job_cluster_key": "etl_cluster",  # Reuse the shared job cluster
            "timeout_seconds": 1800
        },
        # Task 4a: Aggregate to Gold (daily sales)
        {
            "task_key": "aggregate_gold_daily",
            "depends_on": [{"task_key": "transform_silver"}],
            "sql_task": {
                "warehouse_id": "abc123warehouse",
                "query": {"query_id": "gold_daily_sales_query"}
            }
        },
        # Task 4b: Customer metrics (parallel with 4a)
        {
            "task_key": "aggregate_gold_customers",
            "depends_on": [{"task_key": "transform_silver"}],
            "sql_task": {
                "warehouse_id": "abc123warehouse",
                "query": {"query_id": "gold_customer_ltv_query"}
            }
        },
        # Task 5: Data quality checks
        {
            "task_key": "data_quality_check",
            "depends_on": [
                {"task_key": "aggregate_gold_daily"},
                {"task_key": "aggregate_gold_customers"}
            ],
            "notebook_task": {
                "notebook_path": "/Workspace/KION/jobs/99_quality_checks",
                "base_parameters": {
                    "tables_to_check": "silver_orders,gold_daily_sales,gold_customer_ltv",
                    "run_date": "{{job.start_time.iso_date}}"
                }
            },
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 1
            }
        }
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # Daily at 2 AM
        "timezone_id": "Europe/Warsaw",
        "pause_status": "UNPAUSED"
    },
    "health": {
        "rules": [
            {
                "metric": "RUN_DURATION_SECONDS",
                "op": "GREATER_THAN",
                "value": 5400  # Alert if > 90 minutes
            }
        ]
    }
}

print("Production Job configuration complete!")
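
Larger configurations are easy to break with a typo in a `task_key`. The sketch below is a small pre-deployment check that looks for a few common mistakes: unknown dependencies, an undefined `job_cluster_key`, and a cluster spec that sets both `num_workers` and `autoscale`. `lint_job_config` is a hypothetical helper, and the checks are illustrative rather than a full validation of the Jobs API schema.

```python
def lint_job_config(job: dict) -> list[str]:
    """Return human-readable problems found in a Job configuration dict."""
    problems = []
    tasks = job.get("tasks", [])
    task_keys = [task["task_key"] for task in tasks]
    cluster_keys = {c["job_cluster_key"] for c in job.get("job_clusters", [])}

    if len(task_keys) != len(set(task_keys)):
        problems.append("duplicate task_key values")

    for task in tasks:
        key = task["task_key"]
        # Every dependency must point at a task defined in the same Job
        for dep in task.get("depends_on", []):
            if dep["task_key"] not in task_keys:
                problems.append(f"{key}: unknown dependency {dep['task_key']}")
        # A referenced shared cluster must exist in job_clusters
        cluster_key = task.get("job_cluster_key")
        if cluster_key is not None and cluster_key not in cluster_keys:
            problems.append(f"{key}: undefined job_cluster_key {cluster_key}")
        # Fixed size and autoscaling are alternatives
        cluster = task.get("new_cluster", {})
        if "num_workers" in cluster and "autoscale" in cluster:
            problems.append(f"{key}: set num_workers or autoscale, not both")
    return problems


broken_job = {
    "job_clusters": [{"job_cluster_key": "etl_cluster"}],
    "tasks": [
        {"task_key": "ingest", "job_cluster_key": "etl_cluster"},
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingets"}],  # typo on purpose
            "new_cluster": {"num_workers": 2, "autoscale": {"min_workers": 1, "max_workers": 4}},
        },
    ],
}
for problem in lint_job_config(broken_job):
    print(problem)
```

Running a check like this in CI before deployment catches such mistakes earlier than a rejected API call would.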

---

## Summary

### What you learned:

- **Multi-task Jobs**: DAG-based workflow orchestration
- **Task Types**: Notebook, SQL, DLT, dbt tasks
- **Dependencies**: Sequential, parallel, fan-out/fan-in patterns
- **Parameterization**: Job parameters, task values, dynamic values
- **Monitoring**: Email, webhooks, custom metrics
- **Retry Logic**: Automatic retry for transient errors
- **Best Practices**: Idempotency, error handling, cost optimization

### Key Takeaways:

1. **Jobs = Orchestration**: Coordinate complex workflows
2. **Flexibility**: Mix notebook, SQL, DLT, dbt tasks
3. **Resilience**: Retry logic + alerting = robust pipelines
4. **Cost-aware**: Spot instances + autoscaling + cluster reuse
5. **Observability**: System tables + custom monitoring

### Next steps:
- **Notebook 04**: Unity Catalog Governance
- **Workshop 02**: Hands-on Lakeflow + Jobs Orchestration

---

## Additional resources

- [Databricks Jobs Documentation](https://docs.databricks.com/workflows/jobs/jobs.html)
- [Jobs API Reference](https://docs.databricks.com/api/workspace/jobs)
- [Orchestration Best Practices](https://docs.databricks.com/workflows/jobs/jobs-best-practices.html)

---