dbt task in asset bundle deployment, errors if artifacts included and git_source missing, inaccurate location if artifacts missing and git_source missing #1246

Open
NodeJSmith opened this issue Feb 29, 2024 · 0 comments
Labels: Bug (Something isn't working), DABs (DABs related issues)

Describe the issue

The more I troubleshoot this, the less sure I am about what is a bug and what is by design but confusing.

I originally had an issue because I added a dbt task to my pipeline and forgot to add the git source for the dbt task.

When I attempted to deploy the updated asset bundle, I got the error message: build failed <package_name>: error chdir <bundle_path>: no such file or directory, output .
image

While troubleshooting this, I found that if I remove the artifacts section from my asset bundle, the deployment succeeds, but the dbt task assumes that the project directory is the asset bundle deployment location, e.g. /Shared/.bundle/dbx_data_quality/dev/files. I assume that using this location as the project directory for the dbt task is the reason for the error and the failed deployment, but this still seems like a bug because the path did already exist.
Deploying this way results in a task that has these arguments:
image

I solved the issue by adding the git_source section to my job in the asset bundle, which keeps the project directory from being set at all on the dbt task.

          git_source:
            git_branch: develop
            git_provider: azureDevOpsServices
            git_url: https://<organization>@dev.azure.com/<organization>/<project>/_git/dbx-dbt-legacy

image

Configuration

Please provide a minimal reproducible configuration for the issue

Steps to reproduce the behavior

To reproduce the "Error: build failed dbx_data_quality, ..." error, you need an asset bundle that contains a Python wheel task and a dbt task, with an artifacts section in the YAML that uses a relative path. The job must not have a git_source section.

bundle:
  name: dbx_data_quality

artifacts:
  dbx_pipeline_legacy:
    path: .
    type: whl

targets:
  dev:
    mode: development
    resources:
      jobs:
        dbx_data_quality:
          name: dbx_data_quality (dev)
          tasks:
          - job_cluster_key: basic_cluster
            libraries:
            - whl: ./dist/dbx_data_quality-*.whl
            python_wheel_task:
              entry_point: setup
              package_name: dbx_data_quality
            task_key: setup
          - dbt_task:
              catalog: dev
              commands:
              - dbt deps
              - dbt test
              schema: corrections
            depends_on:
            - task_key: setup
            job_cluster_key: basic_cluster
            libraries:
            - pypi:
                package: dbt-databricks==1.7.8
            run_if: ALL_DONE
            task_key: dbt_tests
    workspace:
      profile: dev

Expected Behavior

I'm not sure. The example of a dbt task in the docs shows a git_source section, so that seems to be the expected way of using a dbt task. I think we would want to either require a git_source section or ensure that, when there is none and the Python wheel artifact uses a relative path, the dbt task does not cause a deployment failure.

Actual Behavior

With the artifacts section and no git_source section, the deployment fails with a confusing error message. With the git_source section, the deployment succeeds. Without either the artifacts or the git_source section, the deployment succeeds, but the dbt task uses the bundle deployment location (e.g. /Shared/.bundle/dbx_data_quality/dev/files) as its project directory.
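
For reference, the combination that deploys successfully (artifacts section plus git_source) might look roughly like the sketch below. It is assembled from the two snippets in this issue rather than copied from a tested file. Placing git_source at the job level, next to name and tasks, is an assumption based on the workaround description and the snippet's indentation. The URL placeholders are left as reported.

```yaml
# Sketch only: reproduction config with the git_source workaround merged in.
bundle:
  name: dbx_data_quality

artifacts:
  dbx_pipeline_legacy:
    path: .
    type: whl

targets:
  dev:
    mode: development
    resources:
      jobs:
        dbx_data_quality:
          name: dbx_data_quality (dev)
          # Assumed placement: job-level key, sibling of name and tasks.
          git_source:
            git_branch: develop
            git_provider: azureDevOpsServices
            git_url: https://<organization>@dev.azure.com/<organization>/<project>/_git/dbx-dbt-legacy
          tasks:
          - job_cluster_key: basic_cluster
            libraries:
            - whl: ./dist/dbx_data_quality-*.whl
            python_wheel_task:
              entry_point: setup
              package_name: dbx_data_quality
            task_key: setup
          # No project directory is set here; per the report, adding
          # git_source keeps one from being filled in on deploy.
          - dbt_task:
              catalog: dev
              commands:
              - dbt deps
              - dbt test
              schema: corrections
            depends_on:
            - task_key: setup
            job_cluster_key: basic_cluster
            libraries:
            - pypi:
                package: dbt-databricks==1.7.8
            run_if: ALL_DONE
            task_key: dbt_tests
    workspace:
      profile: dev
```

Removing only the git_source block from this sketch gives back the failing configuration from the steps to reproduce.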

OS and CLI version

OS: Ubuntu 22.04 on WSL2 via Windows 11
CLI Version: Databricks CLI v0.214.1

Is this a regression?

I tried this in 0.213.0 and it did not work in that version either.

Debug Logs

Output logs if you run the command with debug logs enabled. Example: databricks bundle deploy --log-level=debug. Redact if needed
with_artifacts_section_and_git_source_section.txt
no_artifacts_section_no_git_source_section.txt
with_artifacts_section_no_git_source_section.txt

@NodeJSmith NodeJSmith added the DABs DABs related issues label Feb 29, 2024
@andrewnester andrewnester added the Bug Something isn't working label Mar 1, 2024