
Add dataproc component yaml files #956

Merged: 3 commits merged into kubeflow:master on Mar 12, 2019

Conversation

@hongye-sun (Contributor, Author) commented on Mar 11, 2019

This change is Reviewable

@hongye-sun (Contributor, Author):

/retest

@hongye-sun (Contributor, Author):

/retest

@animeshsingh (Contributor) left a comment:

Hi @hongye-sun, I am trying to understand how these YAML files describing components are used in the overall pipelines system.

@hongye-sun (Contributor, Author):

@animeshsingh We want to use these YAML files to share components across pipelines. Basically, a pipeline author should be able to load a component from a YAML file, and the descriptions in the YAML serve as documentation for the loaded component. Here is an example of how to use one in a notebook: https://github.com/kubeflow/pipelines/tree/master/components/gcp/bigquery/query.

It's still at an early stage, and the YAML format is going to change in the future; e.g., it will be extended to support DAGs and other types of resources.
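(Editor's note: a minimal sketch of the loading flow described here, using the KFP SDK. The file path and the input names are illustrative placeholders, not the actual spec added in this PR.)

```python
# A minimal sketch of loading a component from a YAML file with the KFP SDK.
# The file path and input names are illustrative placeholders.
from kfp import components, dsl

# Load the component definition shipped as a component.yaml file.
dataproc_op = components.load_component_from_file('component.yaml')

# The name, description, and inputs from the YAML become the factory's
# docstring, so they serve as documentation for pipeline authors.
help(dataproc_op)

@dsl.pipeline(name='example', description='Uses the loaded component')
def example_pipeline(project_id: str, region: str):
    # Instantiate the loaded component as a step in the pipeline.
    dataproc_op(project_id=project_id, region=region)
```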

@animeshsingh (Contributor):

"It's still in early state and format in the yaml are going to be changed in the future. E.g. it will be extended to support DAG and other types of resources." - if we support DAG here, wouldnt it start going in the same territory as Argo yaml?

@hongye-sun (Contributor, Author):

True. We are likely to replace the implementation section in the YAML with the Argo spec and keep the inputs and outputs metadata for documentation and type information. Ideally, the load-component API should be able to load any compiled pipeline YAML as a DAG component.
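(Editor's note: for context on the sections mentioned above, here is a minimal sketch of a component spec with inputs/outputs metadata and a container implementation section, loaded via the KFP SDK's load_component_from_text. Every name and value in it is illustrative rather than taken from this PR.)

```python
# Illustrative component spec: inputs/outputs metadata plus an
# implementation section. Names and values are placeholders only.
from kfp import components

echo_op = components.load_component_from_text('''
name: Echo
description: Writes its input message to the output path.
inputs:
- {name: message, type: String, description: Text to echo}
outputs:
- {name: output_path, type: String, description: Where the text is written}
implementation:
  container:
    image: alpine:3.9
    command:
    - sh
    - -c
    - 'mkdir -p "$(dirname "$1")" && echo "$0" > "$1"'
    - {inputValue: message}
    - {outputPath: output_path}
''')
```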

@gaoning777 (Contributor):

/lgtm

@Ark-kun (Contributor) commented on Mar 12, 2019:

> Hi @hongye-sun, I am trying to understand how these YAML files describing components are used in the overall pipelines system.

The component.yaml files are needed for efficient component sharing. Currently, many pipeline authors just copy/paste code between pipeline files, which is an anti-pattern and error-prone. It's much easier to write `train_op = kfp.components.load_component_from_url("https://..../component.yaml")` to load the component and immediately use it to compose a pipeline.
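(Editor's note: a sketch of that sharing pattern end to end, using the KFP SDK v1 API. The URL and the training_data input are placeholders, not a real shared component.)

```python
# Load a shared component straight from a URL instead of copy/pasting code.
from kfp import compiler, components, dsl

train_op = components.load_component_from_url(
    'https://example.com/components/train/component.yaml')  # placeholder URL

@dsl.pipeline(name='train-pipeline', description='Composes a shared component')
def train_pipeline(training_data: str):
    # The loaded factory creates a pipeline task when called.
    train_op(training_data=training_data)

# Compile to a package that can be uploaded to Kubeflow Pipelines.
compiler.Compiler().compile(train_pipeline, 'train_pipeline.tar.gz')
```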

@hongye-sun (Contributor, Author):

/approve

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongye-sun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@Ark-kun (Contributor) commented on Mar 12, 2019:

> We are likely to

@hongye-sun AFAIK, we have a policy about not disclosing any future plans that are not part of our roadmap document, especially when the plans are not finalized and do not have any planning CUJs or ETAs. It would be best to edit your comment to remove any potential planning information that is not part of the roadmap. Previously, Pascal was very strict about this.

Thanks.

@k8s-ci-robot merged commit 5868158 into kubeflow:master on Mar 12, 2019
cheyang pushed a commit to alibaba/pipelines that referenced this pull request Mar 28, 2019
* Add dataproc component yaml files

* Update license to 2019

* Remove unused parameter
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
* Create PRESENTATIONS.md

* hyperlink from main README
HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this pull request Mar 11, 2024
* [test] tryout kind on github

Signed-off-by: Yihong Wang <yh.wang@ibm.com>

* build images

build and use the images inside the kind cluster

Signed-off-by: Yihong Wang <yh.wang@ibm.com>

* remove unnecessary step

Signed-off-by: Yihong Wang <yh.wang@ibm.com>

* build multiple images in a script

Signed-off-by: Yihong Wang <yh.wang@ibm.com>

* check if any change for backend files

check changes for backend files and trigger the integration
testing if any.

Signed-off-by: Yihong Wang <yh.wang@ibm.com>