
feat(sdk): Support loading pipeline from yaml #8209

Merged
merged 4 commits into kubeflow:master from load-pipeline on Aug 31, 2022

Conversation

@chensun (Member) commented Aug 30, 2022

Description of your changes:

  • Support loading pipeline from yaml to use as a component (a usage sketch follows this list).
  • Fix bug where InputSpec ignores artifacts inputs from IR.
  • Fix bug where executor label and component name may not be merged correctly due to name collision.

Checklist:

@chensun (Member, Author) commented Aug 30, 2022

/test kubeflow-pipelines-samples-v2

@zichuan-scott-xu (Contributor) left a comment

Thanks Chen! /lgtm

@chensun (Member, Author) commented Aug 30, 2022

/test kubeflow-pipelines-samples-v2

@chensun (Member, Author) commented Aug 30, 2022

/test kubeflow-pipelines-samples-v2

1 similar comment

@chensun (Member, Author) commented Aug 30, 2022

/test kubeflow-pipelines-samples-v2

@connor-mccarthy (Member) commented

Reviewing this 👀

@connor-mccarthy (Member) left a comment

Thanks, @chensun. I think there may be an idempotency issue, or at least something I have a question about.

This implementation changes the PipelineSpec when you reload the pipeline. Try appending the following to one of the pipeline files, running it, and then inspecting the one.yaml and two.yaml files; they are different.

if __name__ == '__main__':
    compiler.Compiler().compile(
        pipeline_func=my_pipeline,
        package_path='one.yaml')

    from kfp import components
    with open('one.yaml', 'r') as f:
        reloaded_pipeline = components.load_component_from_text(f.read())
    compiler.Compiler().compile(
        reloaded_pipeline, package_path='two.yaml')

I don't think this is desired, correct? Even if the underlying pipeline logic is itself correct. WDYT?

If I am correct, the read/write tests will be very helpful for validating correctness across a diversity of pipelines if you switch the .pipelines.config.read flag to True, plus make a few other changes (see this commit for the additional changes needed: connor-mccarthy@746d787). If I am not correct, we probably still want to use the read/write tests to validate correctness, but they will need more attention in order to do so.

@chensun (Member, Author) commented Aug 31, 2022

> Thanks, @chensun. I think there may be an idempotency issue, or at least something I have a question about.
>
> This implementation changes the PipelineSpec when you reload the pipeline. Try appending the following to one of the pipeline files, running it, and then inspecting the one.yaml and two.yaml files; they are different.
>
> if __name__ == '__main__':
>     compiler.Compiler().compile(
>         pipeline_func=my_pipeline,
>         package_path='one.yaml')
>
>     from kfp import components
>     with open('one.yaml', 'r') as f:
>         reloaded_pipeline = components.load_component_from_text(f.read())
>     compiler.Compiler().compile(
>         reloaded_pipeline, package_path='two.yaml')
>
> I don't think this is desired, correct? Even if the underlying pipeline logic is itself correct. WDYT?
>
> If I am correct, the read/write tests will be very helpful for validating correctness across a diversity of pipelines if you switch the .pipelines.config.read flag to True, plus make a few other changes (see this commit for the additional changes needed: connor-mccarthy@746d787). If I am not correct, we probably still want to use the read/write tests to validate correctness, but they will need more attention in order to do so.

That's a good point. Although the changed PipelineSpec is still correct, I agree that it is ideal to maintain consistency across read and write. Made the change.

@chensun (Member, Author) commented Aug 31, 2022

All tests passed. Merging this to unblock my next PR.
@connor-mccarthy, please feel free to add additional comments; I'll address them in a follow-up PR if needed. Thanks!

@google-oss-prow (bot) commented

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 410e99c into kubeflow:master Aug 31, 2022
@chensun chensun deleted the load-pipeline branch August 31, 2022 12:10
@connor-mccarthy (Member) left a comment

This looks great, @chensun. Thank you for updating the tests!

/lgtm

-    def from_deployment_spec_dict(cls, deployment_spec_dict: Dict[str, Any],
-                                  component_name: str) -> 'Implementation':
+    def from_pipeline_spec_dict(cls, pipeline_spec_dict: Dict[str, Any],
+                                component_name: str) -> 'Implementation':
         """Creates an Implmentation object from a deployment spec message in
Review comment (Member):

I think docstrings are out of date

    else:
        pipeline_spec = json_format.ParseDict(
            pipeline_spec_dict, pipeline_spec_pb2.PipelineSpec())
        return Implementation(graph=pipeline_spec)
Review comment (Member):

👍🏻



 class YamlComponent(components.BaseComponent):
     """A component loaded from a YAML file.

     **Note:** ``YamlComponent`` is not intended to be used to construct components directly. Use ``kfp.components.load_component_from_*()`` instead.

-    Args:
+    Artribute:
Review comment (Member):

Suggested change:
-    Artribute:
+    Attributes:

@@ -511,18 +523,16 @@ def transform_outputs(self) -> None:
     def validate_placeholders(self):
         """Validates that input/output placeholders refer to an existing
         input/output."""
-        implementation = self.implementation
-        if getattr(implementation, 'container', None) is None:
+        if self.implementation.container is None:
Review comment (Member):

👍🏻

@@ -1470,12 +1488,11 @@ def _rename_component_refs(
         sub_pipeline_spec.root)


-def create_pipeline_spec_and_deployment_config(
+def create_pipeline_spec(
Review comment (Member):

I like this change

jlyaoyuli pushed a commit to jlyaoyuli/pipelines that referenced this pull request Jan 5, 2023
* support loading pipeline from yaml

* release note

* cleanup

* maintain read/write consistency with component/pipeline compilation.