# Databricks Bundles Walkthrough

## 1. Authenticate Databricks CLI

```sh
databricks auth login --host <workspace-url>
```

---

## 2. Initialize Databricks Bundle

```sh
databricks bundle init
```

At the prompts, select:
- template: `default-python`
- project name: a `name` of your choice
- sample notebook: no
- sample DLT pipeline: no
- sample Python package: no
- serverless compute: **yes**

---

## 3. Review Bundle Structure & Create `src` Folder

- Explain the generated structure.
- Create a `src/` folder and add your source files.

---

## 4. Walk Through the Code

- Go through the logic of each script (e.g., `load_data`, `train_model`) and explain what it does.

---

## 5. Review `databricks.yml`

- Go through the auto-generated configuration.
- Explain how resources, variables, and jobs are referenced.
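
For orientation, a generated `databricks.yml` from the `default-python` template looks roughly like the sketch below. Exact contents vary by CLI and template version; the bundle name `iris_bundle` is a hypothetical placeholder, and `<workspace-url>` stands for the host used in step 1.

```yaml
# Sketch only: a trimmed view of what `databricks bundle init` typically generates.
bundle:
  name: iris_bundle  # hypothetical; the name entered during `bundle init`

include:
  - resources/*.yml  # resource files matching this glob are merged into the config

targets:
  dev:
    mode: development  # prefixes resource names and pauses schedules
    default: true      # used when no -t/--target flag is given
    workspace:
      host: <workspace-url>
  prod:
    mode: production
    workspace:
      host: <workspace-url>
```

Values defined here are available elsewhere through substitutions such as `${bundle.name}`, `${bundle.target}`, and `${var.<name>}`.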

---

## 6. Create `variables.yml` and Link in `databricks.yml`

```yaml
variables:
  catalog:
    description: Unity Catalog name
    default: ml
  schema:
    description: Schema/database name
    default: iris_demo
  model_name:
    description: Registered model name (Unity Catalog)
    default: ${var.catalog}.${var.schema}.iris_model
  table_name:
    description: Target table for Iris data
    default: ${var.catalog}.${var.schema}.iris_raw
```
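
To link the file, one option is to add it to the `include` list in `databricks.yml`. The sketch below assumes `variables.yml` sits next to `databricks.yml` at the bundle root; adjust the path if it lives elsewhere.

```yaml
# databricks.yml (excerpt)
include:
  - resources/*.yml  # already present in the generated file
  - variables.yml    # assumed location: bundle root, next to databricks.yml
```

Running `databricks bundle validate` afterwards is a quick way to confirm that the file is picked up and that the `${var.…}` references resolve with your CLI version.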

---

## 7. Create Catalog in Databricks

- Create the catalog and schema in Unity Catalog, then set the `catalog` and `schema` variables to match.
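
If a target needs different values than the defaults, variables can be overridden per target. The following is a minimal sketch; the catalog name `ml_dev` is hypothetical and only illustrates the syntax.

```yaml
# databricks.yml (excerpt)
targets:
  dev:
    variables:
      catalog: ml_dev  # hypothetical catalog name; overrides the default `ml`
```

The same override can be passed on the command line, for example `databricks bundle deploy --var="catalog=ml_dev"`.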

---

## 8. Define a Job in `databricks.yml`

### 8.1 Base Structure

```yaml
resources:
  jobs:
    iris_job:
      name: ${bundle.name}-${bundle.target}-job
      tags:
        project: ${bundle.name}
        env: ${bundle.target}
      tasks:
```

### 8.2 First Task: Load Data

```yaml
      - task_key: load_data
        description: Load Iris & save as table
        notebook_task:
          notebook_path: ${workspace.root_path}/files/src/load_data
          base_parameters:
            table: ${var.table_name}_${bundle.target}
```

### 8.3 Second Task: Train Model

```yaml
      - task_key: train_model
        depends_on:
          - task_key: load_data
        description: Train a simple model & log to MLflow
        notebook_task:
          notebook_path: ${workspace.root_path}/files/src/train_model
          base_parameters:
            table: ${var.table_name}_${bundle.target}
            model: ${var.model_name}_${bundle.target}
```
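
Put together, the three fragments from 8.1 to 8.3 form a single job definition. The block below is simply their concatenation, shown so the nesting is visible in one place; YAML allows list items to sit at the same indentation as their parent key, which is why the tasks line up with `tasks:`.

```yaml
resources:
  jobs:
    iris_job:
      name: ${bundle.name}-${bundle.target}-job
      tags:
        project: ${bundle.name}
        env: ${bundle.target}
      tasks:
      # 8.2: first task
      - task_key: load_data
        description: Load Iris & save as table
        notebook_task:
          notebook_path: ${workspace.root_path}/files/src/load_data
          base_parameters:
            table: ${var.table_name}_${bundle.target}
      # 8.3: second task, runs only after load_data succeeds
      - task_key: train_model
        depends_on:
          - task_key: load_data
        description: Train a simple model & log to MLflow
        notebook_task:
          notebook_path: ${workspace.root_path}/files/src/train_model
          base_parameters:
            table: ${var.table_name}_${bundle.target}
            model: ${var.model_name}_${bundle.target}
```

With the defaults from step 6 and a target named `staging`, `${var.table_name}_${bundle.target}` resolves to `ml.iris_demo.iris_raw_staging`, so each target reads and writes its own table and model.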

---

## 9. First Deployment

```sh
databricks bundle deploy
```

---

## 10. Create Service Principal

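- Grant **your user account** the `Service Principal User` role on the service principal, so that you can deploy jobs that run as it.
- Grant the service principal privileges on the **catalog** (for example `USE CATALOG` and `USE SCHEMA`, plus the rights to create tables and models in the schema).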

---

## 11. Configure Staging Environment & Run Job

```yaml
targets:
  staging:
    presets:
      name_prefix: "[staging SP]"
    workspace:
      host: https://dbc-2d7e8d9a-a6fc.cloud.databricks.com
      root_path: /Workspace/Shared/.bundle/${bundle.name}/${bundle.target}
    run_as:
      # Application ID of the service principal created in step 10
      service_principal_name: 29836a0e-b654-4aa7-8b15-844311a7ddc9
    permissions:
      - user_name: stefan.helm.99@icloud.com
        level: CAN_MANAGE
      - group_name: users
        level: CAN_MANAGE

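- Deploy to the `staging` target with `databricks bundle deploy -t staging`, then run the job there as a **staging check** with `databricks bundle run -t staging iris_job`.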

---

## 12. Adjust Production Environment

```yaml
targets:
  prod:
    mode: production
    presets:
      name_prefix: "[prod SP]"
    workspace:
      host: https://dbc-2d7e8d9a-a6fc.cloud.databricks.com
      root_path: /Workspace/Shared/.bundle/${bundle.name}/${bundle.target}
    run_as:
      # Application ID of the service principal created in step 10
      service_principal_name: 29836a0e-b654-4aa7-8b15-844311a7ddc9
    permissions:
      - user_name: stefan.helm.99@icloud.com
        level: CAN_MANAGE
      - group_name: admins
        level: CAN_MANAGE

---

## 13. Run Production Deployment

```sh
databricks bundle deploy -t prod
```
