Add pipelines 'get' and 'reset' commands#273
Add pipelines 'get' and 'reset' commands#273mukulmurthy merged 4 commits intodatabricks:pipelines-clientfrom
Conversation
|
@anew and @arulajmani as potential reviewers too |
| @click.command(context_settings=CONTEXT_SETTINGS, | ||
| short_help='Gets a delta pipeline\'s current spec and status') | ||
| @click.argument('spec_arg', default=None, required=False) | ||
| @click.option('--spec', default=None, type=PipelineSpecClickType(), help=PipelineSpecClickType.help) |
There was a problem hiding this comment.
Does it make sense to provide the spec to an API that retrieves the spec? Perhaps we should only accept a pipeline id as argument here.
There was a problem hiding this comment.
It also gets the status. Arul and Michael and I debated a bunch when we first did this for delete, but we figured it was probably best to support all options there. I think we should do the same here, but check in with early customers and see if this is making it easier or harder to use.
There was a problem hiding this comment.
Classic REST will assign a unique ID on resource creation and then use that ID for subsequent requests.
Accepting the entire spec for a delete() or similar call may seem harmless. But I have always believed that a good API has exactly one way to do a specific thing. Allowing to do it in multiple ways creates ambiguity, it can cause confusion, and it multiplies the amount of code that needs to be maintained (and kept backwards-compatible) in the future.
One drawback of accepting a spec for get or delete etc. is that the user may think the entire spec must match, when in reality everything but ID is ignored. For example, a user may expect that the spec returned is equal to the spec passed in - that would not be the case. Or, on delete, the user may expect that it only deletes resources that match the whole spec, including, say, the name. But it will delete a pipeline with a matching ID, regardless of name.
This kind of ambiguities can easily be avoided.
| @click.command(context_settings=CONTEXT_SETTINGS, | ||
| short_help='Resets a delta pipeline so data can be reprocessed from scratch') | ||
| @click.argument('spec_arg', default=None, required=False) | ||
| @click.option('--spec', default=None, type=PipelineSpecClickType(), help=PipelineSpecClickType.help) |
There was a problem hiding this comment.
same here - I think it the pipeline id is the only argument we should accept here.
And the same applies to delete...
| def delete(self, pipeline_id=None, credentials=None, headers=None): | ||
| def delete(self, pipeline_id=None, headers=None): | ||
| _data = {} | ||
| if pipeline_id is not None: |
There was a problem hiding this comment.
I'd expect that pipeline_id must not be None.
| def reset(self, pipeline_id=None, headers=None): | ||
| _data = {} | ||
| if pipeline_id is not None: | ||
| _data['pipeline_id'] = pipeline_id |
|
|
||
| @provide_conf | ||
| def test_get_cli_spec_option(pipelines_api_mock, tmpdir): | ||
| path = tmpdir.join('/spec.json').strpath |
There was a problem hiding this comment.
some of these tests become unnecessary when you allow only the pipeline id as argument
| path = '{}/{}.{}'.format(base_pipelines_dir, file_hash, extension) | ||
| return path | ||
|
|
||
| @staticmethod |
There was a problem hiding this comment.
this change seems unrelated - how come we don't need this any more?
There was a problem hiding this comment.
This is for the
Stop double-passing credentials to the Delta Pipelines service
part of the change. This method pulled CLI credentials from the config file so they could be included in the body of the request. Now that we don't need to include them in the body, we don't need this method. Arul wrote it nicely so that we could just whack this method once we fixed the credentials server-side situation.
There was a problem hiding this comment.
ok. I think it is better to make these kind of refactorings in a separate PR, but no biggie.
mukulmurthy
left a comment
There was a problem hiding this comment.
Responded to Andreas's comments
| path = '{}/{}.{}'.format(base_pipelines_dir, file_hash, extension) | ||
| return path | ||
|
|
||
| @staticmethod |
There was a problem hiding this comment.
This is for the
Stop double-passing credentials to the Delta Pipelines service
part of the change. This method pulled CLI credentials from the config file so they could be included in the body of the request. Now that we don't need to include them in the body, we don't need this method. Arul wrote it nicely so that we could just whack this method once we fixed the credentials server-side situation.
| @click.command(context_settings=CONTEXT_SETTINGS, | ||
| short_help='Gets a delta pipeline\'s current spec and status') | ||
| @click.argument('spec_arg', default=None, required=False) | ||
| @click.option('--spec', default=None, type=PipelineSpecClickType(), help=PipelineSpecClickType.help) |
There was a problem hiding this comment.
It also gets the status. Arul and Michael and I debated a bunch when we first did this for delete, but we figured it was probably best to support all options there. I think we should do the same here, but check in with early customers and see if this is making it easier or harder to use.
anew
left a comment
There was a problem hiding this comment.
Let's defer the API discussion (whether to take a spec or a pipeline id as arguments for get/delete/reset) and get this merged.
LGTM
* Add the commands `databricks pipelines get` and `databricks pipelines reset` to get and reset Delta Pipelines * Stop double-passing credentials to the Delta Pipelines service * Improve CLI documentation. Tested with new and additional unit tests.
* Add the commands `databricks pipelines get` and `databricks pipelines reset` to get and reset Delta Pipelines * Stop double-passing credentials to the Delta Pipelines service * Improve CLI documentation. Tested with new and additional unit tests.
databricks pipelines getanddatabricks pipelines resetto get and reset Delta PipelinesTested with new and additional unit tests.