Databricks Asset Bundles bundle validate "no file in location" error for jobs #1330

Closed
SophieBlum opened this issue Apr 2, 2024 · 4 comments · Fixed by #1333
Labels
Bug Something isn't working DABs DABs related issues

SophieBlum commented Apr 2, 2024

Describe the issue

Running databricks bundle validate fails with the error "unable to determine directory for job jobA: no file in location". This error did not appear in any previous version; our bundle definition validates and deploys without problems on v0.215.0 and earlier.

Configuration

Job configuration:

resources:
  jobs:
    jobA:
      name: Job A
      tasks:
        - task_key: notebook-task
          existing_cluster_id: ${var.cluster_id_json_stream}
          notebook_task:
            notebook_path: ${var.json-to-bronze-path}

Bundle definition:

targets:
  dev_deploy:
    workspace:
      host: https://***.azuredatabricks.net/
      root_path: /Shared
      file_path: /Shared
    run_as:
      service_principal_name: ***
    variables:
      dbx_host_name: ${workspace.host}

Steps to reproduce the behavior

  1. Run databricks bundle validate -t dev_deploy --var="cluster_id_json_stream=***" --log-level=debug
  2. See error

Expected Behavior

Bundle validation goes smoothly without any errors as it did before.

Actual Behavior

Bundle validation fails with an error related to a job definition. The error message makes little sense to us, and the same bundle definition and setup worked on the previous CLI version.

OS and CLI version

Databricks CLI v0.216.0 on macOS Sonoma 14.4.1

Is this a regression?

This works with Databricks CLI v0.215.0 without problems.

Debug Logs

14:05:07 INFO start pid=30240 version=0.216.0 args="databricks, bundle, validate, -t, dev_deploy, --var=cluster_id_json_stream=***, --log-level=debug"
14:05:07 DEBUG Loading bundle configuration from: ***/databricks.yml pid=30240
14:05:07 DEBUG Apply pid=30240 mutator=seq
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=scripts.preinit
14:05:07 DEBUG No script defined for preinit, skipping pid=30240 mutator=seq mutator=scripts.preinit
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=ProcessRootIncludes
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=ProcessRootIncludes mutator=seq
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=ProcessRootIncludes mutator=seq mutator=ProcessInclude(error_test.yml)
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=ProcessRootIncludes mutator=seq mutator=ProcessInclude(asset_bundles/orchestration/variables.yml)
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=EnvironmentsToTargets
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=InitializeVariables
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=DefineDefaultTarget(default)
14:05:07 DEBUG Apply pid=30240 mutator=seq mutator=LoadGitDetails
14:05:07 DEBUG Apply pid=30240 mutator=SelectTarget(dev_deploy)
14:05:07 DEBUG Apply pid=30240 mutator=
14:05:07 DEBUG Apply pid=30240 mutator=initialize
14:05:07 INFO Phase: initialize pid=30240 mutator=initialize
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=RewriteSyncPaths
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=MergeJobClusters
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=MergeJobTasks
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=MergePipelineClusters
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=InitializeWorkspaceClient
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=PopulateCurrentUser
14:05:07 DEBUG GET /api/2.0/preview/scim/v2/Me
< HTTP/2.0 200 OK
< {
< "active": true,
< REDACTED
< } pid=30240 mutator=initialize mutator=seq mutator=PopulateCurrentUser sdk=true
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=DefineDefaultWorkspaceRoot
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=ExpandWorkspaceRoot
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=DefaultWorkspacePaths
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=SetVariables
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=ResolveResourceReferences
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=ResolveVariableReferences
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=SetRunAs
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=OverrideCompute
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=ProcessTargetMode
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=ExpandPipelineGlobPaths
14:05:07 DEBUG Apply pid=30240 mutator=initialize mutator=seq mutator=TranslatePaths
14:05:07 ERROR Error: unable to determine directory for job jobA: no file in location pid=30240 mutator=initialize mutator=seq mutator=TranslatePaths
14:05:07 ERROR Error: unable to determine directory for job jobA: no file in location pid=30240 mutator=initialize mutator=seq
14:05:07 ERROR Error: unable to determine directory for job jobA: no file in location pid=30240 mutator=initialize
Error: unable to determine directory for job jobA: no file in location
14:05:07 ERROR failed execution pid=30240 exit_code=1 error="unable to determine directory for job jobA: no file in location"

@SophieBlum SophieBlum added the DABs DABs related issues label Apr 2, 2024
Contributor

pietern commented Apr 2, 2024

Thanks for reporting the issue.

Version v0.216.0 included a change to how we perform local relative path resolution. It is possible to have a configuration where you have the base definition of a job in one file and specify target overrides in another file in another directory. Paths should always be relative to the directory of the configuration file where they are used, and before this version, this wasn't the case. The relevant PR is at #1273, in case you're interested.
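
For illustration, a minimal sketch of the kind of setup described above. The file names, directory names, and notebook paths are hypothetical and do not come from this issue; the point is only that each `notebook_path` is meant to resolve relative to the directory of the file that contains it.

```yaml
# File: resources/job.yml (hypothetical), the base definition of the job
resources:
  jobs:
    jobA:
      name: Job A
      tasks:
        - task_key: notebook-task
          notebook_task:
            # Intended to resolve relative to resources/, the directory of this file
            notebook_path: ./notebooks/base.py
---
# File: targets/dev/override.yml (hypothetical), a target override in another directory
targets:
  dev_deploy:
    resources:
      jobs:
        jobA:
          tasks:
            - task_key: notebook-task
              notebook_task:
                # Intended to resolve relative to targets/dev/, the directory of this file
                notebook_path: ./notebooks/dev.py
```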

The issue seems to be that we do not have location information for this path because it is specified as a variable. The previous approach assumed all paths to be relative to the location where the job was first defined. Could you let me know what you specify for json-to-bronze-path? Is it an absolute path or a relative path, and if the latter, relative to which directory?

@pietern pietern added the Bug Something isn't working label Apr 2, 2024
Author

It is a relative path, relative to the location where the job is defined.

Author

SophieBlum commented Apr 2, 2024

I can add to that: if I put the exact same path directly into the job definition, the error does not occur.

Contributor

pietern commented Apr 3, 2024

Thanks for confirming.

You could work around this for the time being by including a prefix:

          notebook_task:
            notebook_path: "./${var.json-to-bronze-path}"

The result is the same but the location information will persist. We'll work on a fix for the pure reference as well.
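
Applied to the job configuration from the report, the workaround would look roughly like the sketch below. Everything except the quoted `notebook_path` value is copied from the original job definition; the comment paraphrases the explanation given in this thread and is not a description of the CLI internals.

```yaml
resources:
  jobs:
    jobA:
      name: Job A
      tasks:
        - task_key: notebook-task
          existing_cluster_id: ${var.cluster_id_json_stream}
          notebook_task:
            # With the "./" prefix the value is no longer a pure variable
            # reference, so (per the comment above) its location information
            # persists and the relative path can be resolved.
            notebook_path: "./${var.json-to-bronze-path}"
```

This sketch assumes the variable holds a relative path, as confirmed earlier in the thread.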

github-merge-queue bot pushed a commit that referenced this issue Apr 3, 2024
## Changes

Variable substitution works as if the variable reference is literally
replaced with its contents.

The following fields should be interpreted in the same way regardless of
where the variable is defined:
```yaml
foo: ${var.some_path}
bar: "./${var.some_path}"
```

Before this change, `foo` would inherit the location information of the
variable definition. After this change, it uses the location information
of the variable reference, making the behavior for `foo` and `bar`
identical.

Fixes #1330.

## Tests

The new test passes only with the fix.