%md
# Databricks Asset Bundle

## Project

### Requirements

The Databricks CLI must be installed.
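
A quick way to confirm this requirement is to look for the executable on the `PATH`. The sketch below assumes the CLI is installed under the name `databricks`; the helper name `find_cli` is hypothetical.

```python
import shutil
from typing import Optional


def find_cli(name: str = "databricks") -> Optional[str]:
    """Return the full path of the executable `name`, or None if it is not on PATH."""
    return shutil.which(name)


path = find_cli()
if path is None:
    print("Databricks CLI not found on PATH; install it before continuing.")
else:
    print(f"Databricks CLI found at {path}")
```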

### Create the project

We will create a new project by answering a short interactive form:

```sh
cd databricks-project
databricks bundle init -p [profile]
Search: █
? Template to use: 
>  default-python (The default Python template for Notebooks and Lakeflow)
  default-sql
  dbt-sql
  mlops-stacks
↓ experimental-jobs-as-code
Unique name for this project [my_project]:
Include a stub (sample) notebook in 'my_project/src': 
  ▸ yes
    no
Include a stub (sample) Lakeflow Declarative Pipeline in 'my_project/src': 
  ▸ yes
    no
Include a stub (sample) Python package in 'my_project/src': 
  ▸ yes
    no
Use serverless compute: 
  ▸ yes
    no
```

Inside the project we will build a workflow for the development (dev) target.

In the file my_project/databricks.yml, add the variables my_catalog and my_schema:

```yml
# This is a Databricks asset bundle definition for bookstore.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
# Sets the name of the project
bundle:
  name: bookstore
  uuid: 380d75e8-7c19-4055-8252-4914843e8441

artifacts:
  python_artifact:
    type: whl
    build: uv build --wheel

# Lists the resources to include in the bundle, such as jobs and pipelines
include:
  - resources/*.yml
  - resources/*/*.yml

# >> NEW: define variables to reuse in the job
variables:
  my_catalog:
    description: Default catalog
    default: workspace
  my_schema:
    description: Default schema
    default: bookstore_dev

# Defines the deployment targets
targets:
  dev:
    # The default target uses 'mode: development' to create a development copy.
    # - Deployed resources get prefixed with '[dev my_user_name]'
    # - Any job schedules and triggers are paused by default.
    # See also https://docs.databricks.com/dev-tools/bundles/deployment-modes.html.
    mode: development
    default: true
    workspace:
      host: https://dbc-f23c1e0b-86c8.cloud.databricks.com

  prod:
    mode: production
    workspace:
      host: https://dbc-f23c1e0b-86c8.cloud.databricks.com
      # We explicitly deploy to /Workspace/Users/agustin.martinez@bigdataybi.com to make sure we only have a single copy.
      root_path: /Workspace/Users/agustin.martinez@bigdataybi.com/.bundle/${bundle.name}/${bundle.target}
    permissions:
      - user_name: agustin.martinez@bigdataybi.com
        level: CAN_MANAGE
```
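
To make the role of the `variables` block concrete, the following sketch mimics how a `${var.<name>}` reference falls back to its `default` unless an override is supplied (the CLI accepts overrides such as `--var="my_schema=..."`). It only illustrates the substitution idea and is not the CLI's actual implementation; `resolve`, `DEFAULTS`, and the `bookstore_test` value are hypothetical.

```python
import re

# Defaults copied from the `variables` block in databricks.yml.
DEFAULTS = {"my_catalog": "workspace", "my_schema": "bookstore_dev"}

# Matches references of the form ${var.some_name}.
VAR_REF = re.compile(r"\$\{var\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(text, overrides=None):
    """Replace each ${var.name} in `text` with its override or default value."""
    values = {**DEFAULTS, **(overrides or {})}

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"undefined bundle variable: {name}")
        return values[name]

    return VAR_REF.sub(substitute, text)


print(resolve("${var.my_catalog}.${var.my_schema}"))
print(resolve("${var.my_schema}", {"my_schema": "bookstore_test"}))
```

The first call uses both defaults; the second shows an override taking precedence over the default.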

Now edit the notebook my_project/src/notebook.ipynb and add this cell:

```python
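# Declare the widgets that receive the job parameters (empty by default).
dbutils.widgets.text("catalog", "")
dbutils.widgets.text("schema", "")

# Read the values passed in by the job.
catalog = dbutils.widgets.get("catalog")
schema = dbutils.widgets.get("schema")
print(f"Catalog: {catalog}, Schema: {schema}")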
```
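
Once the widget values are read, a typical next step is to build a fully qualified table name from them. The sketch below is an assumption about how the values might be used, not part of the generated template; `qualified_name` and the `books` table are hypothetical, and the function is plain Python so it can be tried outside a notebook.

```python
import re

# Conservative check: letters, digits and underscores only.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified_name(catalog, schema, table):
    """Build a backtick-quoted catalog.schema.table name, rejecting empty or unsafe parts."""
    parts = (catalog, schema, table)
    for part in parts:
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"invalid identifier: {part!r}")
    return ".".join(f"`{part}`" for part in parts)


print(qualified_name("workspace", "bookstore_dev", "books"))
```

Because the widgets default to empty strings, running the notebook without job parameters would raise `ValueError` here, which surfaces a missing parameter early.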

Then pass the variables as parameters to the job in my_project/resources/my_project.job.yml:

```yml
# The main job for my_project.
resources:
  jobs:
    my_project_job:
      name: my_project_job

      trigger:
        # Run this job every day, exactly one day from the last run; see https://docs.databricks.com/api/workspace/jobs/create#trigger
        periodic:
          interval: 1
          unit: DAYS

      #email_notifications:
      #  on_failure:
      #    - your_email@example.com

      tasks:
        - task_key: notebook_task
          notebook_task:
            notebook_path: ../src/notebook.ipynb
            base_parameters:
              catalog: ${var.my_catalog}
              schema: ${var.my_schema}

        - task_key: refresh_pipeline
          depends_on:
            - task_key: notebook_task
          pipeline_task:
            pipeline_id: ${resources.pipelines.my_project_pipeline.id}

        - task_key: main_task
          depends_on:
            - task_key: refresh_pipeline
          environment_key: default
          python_wheel_task:
            package_name: my_project
            entry_point: main

      # A list of task execution environment specifications that can be referenced by tasks of this job.
      environments:
        - environment_key: default

          # Full documentation of this spec can be found at:
          # https://docs.databricks.com/api/workspace/jobs/create#environments-spec
          spec:
            environment_version: "2"
            dependencies:
              - ../dist/*.whl

```
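
The `depends_on` entries in the job define a linear chain of tasks. As an illustration (not how the Jobs scheduler is implemented), the execution order they imply can be derived with a topological sort from Python's standard library; the `DEPENDS_ON` mapping is a hand-written copy of the task keys above.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (its `depends_on` entries).
DEPENDS_ON = {
    "notebook_task": set(),
    "refresh_pipeline": {"notebook_task"},
    "main_task": {"refresh_pipeline"},
}

# static_order() yields every task only after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)  # ['notebook_task', 'refresh_pipeline', 'main_task']
```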

### Deploy the bundle

```sh
cd my_project
databricks bundle deploy --target dev
```
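
If the deployment is scripted, it can help to validate the bundle before deploying it. The sketch below builds the command lines for `databricks bundle validate` followed by `databricks bundle deploy` and prints them; `run_all` executes them only when called explicitly. The helper names and the optional `profile` argument are illustrative.

```python
import subprocess


def bundle_commands(target, profile=None):
    """Return the validate and deploy commands for `target` as argument lists."""
    extra = ["-p", profile] if profile else []
    return [
        ["databricks", "bundle", "validate", "--target", target, *extra],
        ["databricks", "bundle", "deploy", "--target", target, *extra],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Print the commands without executing them.
for command in bundle_commands("dev"):
    print(" ".join(command))
```

Calling `run_all(bundle_commands("dev"))` from the project directory would then run both steps, and `check=True` stops the sequence if validation fails.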

### Run the job

```sh
# List the jobs to find the job ID
databricks jobs list -p [profile]
# Trigger a run of the job by its ID
databricks jobs run-now [job_id] -p [profile]
```
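
Because development-mode deployments prefix resource names with `[dev my_user_name]`, finding the right job ID in a long listing can be tedious. The sketch below filters a JSON listing by name suffix. It assumes that `databricks jobs list --output json` returns a list of objects with `job_id` and `settings.name` fields; that shape, the helper `find_job_ids`, and the sample data are assumptions to check against the actual output.

```python
import json


def find_job_ids(jobs_json, job_name):
    """Return the IDs of jobs whose name ends with `job_name`."""
    jobs = json.loads(jobs_json)
    return [
        job["job_id"]
        for job in jobs
        if job.get("settings", {}).get("name", "").endswith(job_name)
    ]


# Hypothetical sample shaped like the assumed output; IDs and user name are made up.
sample = json.dumps([
    {"job_id": 101, "settings": {"name": "[dev my_user_name] my_project_job"}},
    {"job_id": 202, "settings": {"name": "unrelated_job"}},
])
print(find_job_ids(sample, "my_project_job"))  # [101]
```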