Processing and completion of positional args to bundle run #1120

Merged: pietern merged 12 commits into main from run-additional-args on Apr 22, 2024

@pietern pietern commented Jan 11, 2024

Changes

With this change, both job parameters and task parameters can be specified as positional arguments to bundle run. How the positional arguments are interpreted depends on the configuration of the job.

Examples:

For a job that has job parameters configured, a user can specify:

databricks bundle run my_job -- --param1=value1 --param2=value2

And the run is kicked off with job parameters set to:

{
  "param1": "value1",
  "param2": "value2"
}
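
To illustrate the mapping above, here is a minimal sketch of turning `--key=value` arguments into a parameter map. The helper name `parseKeyValueArgs` and its error messages are hypothetical and are not taken from this PR's implementation; it only shows the general idea under the assumption that every argument after `--` has the form `--key=value`.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKeyValueArgs is a hypothetical helper (not the CLI's actual code).
// It converts arguments of the form --key=value into a map of parameters.
func parseKeyValueArgs(args []string) (map[string]string, error) {
	out := make(map[string]string, len(args))
	for _, arg := range args {
		if !strings.HasPrefix(arg, "--") {
			return nil, fmt.Errorf("expected --key=value, got %q", arg)
		}
		// Split on the first "=" only, so values may contain "=".
		key, value, ok := strings.Cut(strings.TrimPrefix(arg, "--"), "=")
		if !ok || key == "" {
			return nil, fmt.Errorf("expected --key=value, got %q", arg)
		}
		out[key] = value
	}
	return out, nil
}

func main() {
	params, err := parseKeyValueArgs([]string{"--param1=value1", "--param2=value2"})
	if err != nil {
		panic(err)
	}
	fmt.Println(params) // prints map[param1:value1 param2:value2]
}
```

Splitting on the first `=` only is a design choice of this sketch; how the real command treats such values is not stated here.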

Similarly, for a job that doesn't use job parameters and only has notebook_task tasks, a user can specify:

databricks bundle run my_notebook_job -- --param1=value1 --param2=value2

And the run is kicked off with task level notebook_params configured as:

{
  "param1": "value1",
  "param2": "value2"
}

For a job that doesn't use job parameters and only has either spark_python_task or python_wheel_task tasks, a user can specify:

databricks bundle run my_python_file_job -- --flag=value other arguments

And the run is kicked off with task level python_params configured as:

[
  "--flag=value",
  "other",
  "arguments"
]

The same is applied to jobs with only spark_jar_task or spark_submit_task tasks.
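
The task-type-dependent interpretation described above could be sketched roughly as follows. The type `taskParams` and the function `interpretArgs` are hypothetical names introduced for illustration, and the sketch assumes the job's tasks are all of one type; it is not the implementation from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// taskParams is a hypothetical container for this sketch.
// Named holds key/value parameters (as for notebook_params);
// List holds positional parameters passed through as-is (as for python_params).
type taskParams struct {
	Named map[string]string
	List  []string
}

// interpretArgs is a hypothetical helper that picks an interpretation
// of the positional arguments based on the task type of the job.
func interpretArgs(taskType string, args []string) (taskParams, error) {
	switch taskType {
	case "notebook_task":
		named := make(map[string]string, len(args))
		for _, arg := range args {
			key, value, ok := strings.Cut(strings.TrimPrefix(arg, "--"), "=")
			if !strings.HasPrefix(arg, "--") || !ok || key == "" {
				return taskParams{}, fmt.Errorf("expected --key=value, got %q", arg)
			}
			named[key] = value
		}
		return taskParams{Named: named}, nil
	case "spark_python_task", "python_wheel_task", "spark_jar_task", "spark_submit_task":
		// Copy the slice so the caller's arguments are not aliased.
		return taskParams{List: append([]string(nil), args...)}, nil
	default:
		return taskParams{}, fmt.Errorf("unsupported task type %q", taskType)
	}
}

func main() {
	p, err := interpretArgs("spark_python_task", []string{"--flag=value", "other", "arguments"})
	if err != nil {
		panic(err)
	}
	fmt.Println(p.List) // prints [--flag=value other arguments]
}
```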

Tests

Unit tests. Tested the completions manually.

This change adds support for job parameters. If job parameters are specified
for a job that doesn't define job parameters it returns an error. Conversely,
if task parameters are specified for a job that defines job parameters, it also
returns an error.
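
The two error cases described above can be sketched as a small validation function. The function `validateParams`, its boolean inputs, and the error messages are assumptions made for illustration; the actual checks and wording in the CLI may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values; the real CLI's messages are not shown in this PR description.
var (
	errJobParamsNotDefined = errors.New("job parameters specified, but the job does not define job parameters")
	errTaskParamsNotAllowed = errors.New("task parameters specified, but the job defines job parameters")
)

// validateParams is a hypothetical helper that rejects the two mismatches:
// job parameters for a job without them, and task parameters for a job with them.
func validateParams(jobDefinesParams, haveJobParams, haveTaskParams bool) error {
	if haveJobParams && !jobDefinesParams {
		return errJobParamsNotDefined
	}
	if haveTaskParams && jobDefinesParams {
		return errTaskParamsNotAllowed
	}
	return nil
}

func main() {
	// A job that defines job parameters but is run with task parameters.
	fmt.Println(validateParams(true, false, true))
}
```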

This change moves the options structs and their functions to separate files
and backfills test coverage for them.

Job parameters can now be specified with `--param foo=bar,bar=qux`.
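
As a rough sketch of how a value such as `foo=bar,bar=qux` maps to job parameters, the following splits on commas and then on the first `=`. The helper `parseParamFlag` is hypothetical; the CLI's actual flag handling is not shown here and may treat quoting or commas inside values differently.

```go
package main

import (
	"fmt"
	"strings"
)

// parseParamFlag is a hypothetical helper (not the CLI's flag parser).
// It splits a comma-separated list of key=value pairs into a map.
// Assumption: neither keys nor values contain commas.
func parseParamFlag(value string) (map[string]string, error) {
	out := make(map[string]string)
	for _, pair := range strings.Split(value, ",") {
		key, val, ok := strings.Cut(pair, "=")
		if !ok || key == "" {
			return nil, fmt.Errorf("expected key=value, got %q", pair)
		}
		out[key] = val
	}
	return out, nil
}

func main() {
	params, err := parseParamFlag("foo=bar,bar=qux")
	if err != nil {
		panic(err)
	}
	fmt.Println(params) // prints map[bar:qux foo:bar]
}
```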
@codecov-commenter

Codecov Report

Attention: 64 lines in your changes are missing coverage. Please review.

Comparison is base (0b22965) 49.52% compared to head (8cec15f) 49.43%.

| Files | Patch % | Lines |
|---|---|---|
| bundle/run/job_args.go | 40.50% | 47 Missing ⚠️ |
| cmd/bundle/run.go | 0.00% | 11 Missing ⚠️ |
| bundle/run/pipeline.go | 0.00% | 4 Missing ⚠️ |
| bundle/run/job.go | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@                Coverage Diff                 @@
##           run-job-params    #1120      +/-   ##
==================================================
- Coverage           49.52%   49.43%   -0.10%     
==================================================
  Files                 275      274       -1     
  Lines               10591    10704     +113     
==================================================
+ Hits                 5245     5291      +46     
- Misses               4790     4858      +68     
+ Partials              556      555       -1     


Base automatically changed from run-job-params to main January 15, 2024 07:47
Contributor Author

@pietern pietern left a comment

Thanks, @juliacrawf-db !

@pietern pietern added this pull request to the merge queue Apr 22, 2024
Merged via the queue into main with commit 3108883 Apr 22, 2024
5 checks passed
@pietern pietern deleted the run-additional-args branch April 22, 2024 11:56
@pietern pietern mentioned this pull request Apr 23, 2024
pietern added a commit that referenced this pull request Apr 23, 2024
This release marks the general availability of Databricks Asset Bundles.

CLI:
 * Publish Docker images ([#1353](#1353)).
 * Add support for multi-arch Docker images ([#1362](#1362)).
 * Do not prefill https:// in prompt for Databricks Host ([#1364](#1364)).
 * Add better documentation for the `auth login` command ([#1366](#1366)).
 * Add URLs for authentication documentation to the auth command help ([#1365](#1365)).

Bundles:
 * Fix compute override for foreach tasks ([#1357](#1357)).
 * Transform artifact files source patterns in build not upload stage ([#1359](#1359)).
 * Convert between integer and float in normalization ([#1371](#1371)).
 * Disable locking for development mode ([#1302](#1302)).
 * Resolve variable references inside variable lookup fields ([#1368](#1368)).
 * Added validate mutator to surface additional bundle warnings ([#1352](#1352)).
 * Upgrade terraform-provider-databricks to 1.40.0 ([#1376](#1376)).
 * Print host in `bundle validate` when passed via profile or environment variables ([#1378](#1378)).
 * Cleanup remote file path on bundle destroy ([#1374](#1374)).
 * Add docs URL for `run_as` in error message ([#1381](#1381)).
 * Enable job queueing by default ([#1385](#1385)).
 * Added support for job environments ([#1379](#1379)).
 * Processing and completion of positional args to bundle run ([#1120](#1120)).
 * Add legacy option for `run_as` ([#1384](#1384)).

API Changes:
 * Changed `databricks lakehouse-monitors cancel-refresh` command with new required argument order.
 * Changed `databricks lakehouse-monitors create` command with new required argument order.
 * Changed `databricks lakehouse-monitors delete` command with new required argument order.
 * Changed `databricks lakehouse-monitors get` command with new required argument order.
 * Changed `databricks lakehouse-monitors get-refresh` command with new required argument order.
 * Changed `databricks lakehouse-monitors list-refreshes` command with new required argument order.
 * Changed `databricks lakehouse-monitors run-refresh` command with new required argument order.
 * Changed `databricks lakehouse-monitors update` command with new required argument order.
 * Changed `databricks account workspace-assignment update` command to return response.

OpenAPI commit 94684175b8bd65f8701f89729351f8069e8309c9 (2024-04-11)

Dependency updates:
 * Bump github.com/databricks/databricks-sdk-go from 0.37.0 to 0.38.0 ([#1361](#1361)).
 * Bump golang.org/x/net from 0.22.0 to 0.23.0 ([#1380](#1380)).
github-merge-queue bot pushed a commit that referenced this pull request Apr 23, 2024