Generate Tekton Pipeline with Tasks #17

Merged Mar 12, 2020 (1 commit)

Member

@ckadner ckadner commented Mar 11, 2020

Replace steps with tasks to allow parallel task execution.
Use runAfter to support sequential task execution.

The compiler now generates multiple documents in one YAML file: one per Task and one for the Pipeline.
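A minimal sketch of the idea, not the actual compiler code: each op becomes its own Task document, and a single Pipeline document references them, with `runAfter` added only where an op has dependencies. The `build_documents` helper and the op dictionaries (`name`, `image`, `command`, `after`) are hypothetical names chosen for illustration.

```python
from typing import Dict, List

API_VERSION = "tekton.dev/v1alpha1"  # the API version discussed in this thread


def build_documents(pipeline_name: str, ops: List[Dict]) -> List[Dict]:
    """Return one Task document per op, followed by one Pipeline document.

    Each op is a hypothetical dict with "name", "image", "command" and an
    optional "after" list naming the ops it depends on.
    """
    tasks = []
    pipeline_tasks = []
    for op in ops:
        tasks.append({
            "apiVersion": API_VERSION,
            "kind": "Task",
            "metadata": {"name": op["name"]},
            "spec": {"steps": [{
                "name": op["name"],
                "image": op["image"],
                "command": op["command"],
            }]},
        })
        entry = {"name": op["name"], "taskRef": {"name": op["name"]}}
        # Ops without dependencies get no runAfter, so they may run in parallel.
        if op.get("after"):
            entry["runAfter"] = sorted(op["after"])
        pipeline_tasks.append(entry)
    pipeline = {
        "apiVersion": API_VERSION,
        "kind": "Pipeline",
        "metadata": {"name": pipeline_name},
        "spec": {"tasks": pipeline_tasks},
    }
    return tasks + [pipeline]


ops = [
    {"name": "download-1", "image": "busybox", "command": ["echo", "one"]},
    {"name": "download-2", "image": "busybox", "command": ["echo", "two"]},
    {"name": "join", "image": "busybox", "command": ["echo", "done"],
     "after": ["download-1", "download-2"]},
]
documents = build_documents("parallel-pipeline", ops)
```

With PyYAML installed, `yaml.safe_dump_all(documents)` would serialize the list into a single multi-document YAML string, with the documents separated by `---`.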



@ckadner
Member Author

ckadner commented Mar 11, 2020

/assign @animeshsingh

@Tomcli
Member

Tomcli commented Mar 11, 2020

Hi @ckadner, do you know which version of Tekton you tested this on? We may want to add the supported Tekton versions to the README.

@ckadner
Member Author

ckadner commented Mar 11, 2020

Hi @ckadner, do you know which version of Tekton you tested this on? We may want to add the supported Tekton versions to the README.

@Tomcli -- I am using tekton.dev/release: v0.10.0. This would be the "tested version" but for the README I guess it would be nice to describe the range of "supported versions" which may be harder to ascertain.

@Tomcli
Member

Tomcli commented Mar 11, 2020

Sure, can we put the tested version in the README? The reason I bring it up is that Tekton just released its new beta API in 0.11, and our current YAML is based on the alpha API.

@ckadner
Member Author

ckadner commented Mar 11, 2020

Thanks @Tomcli for raising the topic of tested/supported versions. I started a section in the README.md, which we can enhance in the future.

Replace Steps with Tasks to allow parallel task execution.
Use 'runAfter' to support sequential task execution.
@animeshsingh
Collaborator

Just on the release point: Tekton beta, from 0.11 onwards, will guarantee some sort of API stability, so we should migrate to it as and when the schedule permits.
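As a rough illustration of what such a migration could involve for the generated Tasks, the sketch below moves `spec.inputs.params` to `spec.params` and rewrites `$(inputs.params.<name>)` references to `$(params.<name>)`, which is how `tekton.dev/v1beta1` declares and references Task parameters. The `task_to_v1beta1` helper is a hypothetical name; it handles parameters only and leaves any other `inputs` content untouched, so it is not a complete migration.

```python
import copy
import re
from typing import Any, Dict

# Matches v1alpha1-style references such as $(inputs.params.url1).
_INPUT_PARAM = re.compile(r"\$\(inputs\.params\.([^)]+)\)")


def _rewrite(value: Any) -> Any:
    """Recursively rewrite parameter references inside strings, lists, dicts."""
    if isinstance(value, str):
        return _INPUT_PARAM.sub(r"$(params.\1)", value)
    if isinstance(value, list):
        return [_rewrite(v) for v in value]
    if isinstance(value, dict):
        return {k: _rewrite(v) for k, v in value.items()}
    return value


def task_to_v1beta1(task: Dict) -> Dict:
    """Return a copy of a v1alpha1 Task dict with its params moved to v1beta1 form."""
    task = copy.deepcopy(task)  # leave the caller's dict unchanged
    task["apiVersion"] = "tekton.dev/v1beta1"
    spec = task["spec"]
    inputs = spec.get("inputs", {})
    if "params" in inputs:
        spec["params"] = inputs.pop("params")
    if not inputs:
        # Drop the now-empty inputs section; anything else is left as-is.
        spec.pop("inputs", None)
    spec["steps"] = _rewrite(spec.get("steps", []))
    return task


alpha_task = {
    "apiVersion": "tekton.dev/v1alpha1",
    "kind": "Task",
    "metadata": {"name": "gcs-download"},
    "spec": {
        "inputs": {"params": [{"name": "url1"}]},
        "steps": [{
            "name": "gcs-download",
            "image": "google/cloud-sdk:279.0.0",
            "command": ["sh", "-c"],
            "args": ["gsutil cat $0 | tee $1",
                     "$(inputs.params.url1)",
                     "/tmp/results.txt"],
        }],
    },
}
converted = task_to_v1beta1(alpha_task)
```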

@Tomcli
Member

Tomcli commented Mar 11, 2020

Hi Christian, how do you pass parameters between tasks? I was trying to use KFP's parallel join example and got the following errors:
https://github.com/kubeflow/pipelines/blob/master/samples/core/parallel_join/parallel_join.py

generated pipeline.yaml

apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: gcs-download
spec:
  inputs:
    params:
    - name: url1
  steps:
  - args:
    - gsutil cat $0 | tee $1
    - $(inputs.params.url1)
    - /tmp/results.txt
    command:
    - sh
    - -c
    image: google/cloud-sdk:279.0.0
    name: gcs-download
---
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: gcs-download-2
spec:
  inputs:
    params:
    - name: url2
  steps:
  - args:
    - gsutil cat $0 | tee $1
    - $(inputs.params.url2)
    - /tmp/results.txt
    command:
    - sh
    - -c
    image: google/cloud-sdk:279.0.0
    name: gcs-download-2
---
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: echo
spec:
  inputs:
    params:
    - name: gcs-download-2-data
    - name: gcs-download-data
  steps:
  - args:
    - 'echo "Text 1: $0"; echo "Text 2: $1"'
    - $(inputs.params.gcs-download-data)
    - $(inputs.params.gcs-download-2-data)
    command:
    - sh
    - -c
    image: library/bash:4.4.23
    name: echo
---
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  annotations:
    pipelines.kubeflow.org/pipeline_spec: '{"description": "Download two messages
      in parallel and prints the concatenated result.", "inputs": [{"default": "gs://ml-pipeline-playground/shakespeare1.txt",
      "name": "url1", "optional": true}, {"default": "gs://ml-pipeline-playground/shakespeare2.txt",
      "name": "url2", "optional": true}], "name": "Parallel pipeline"}'
  name: parallel-pipeline
spec:
  params:
  - default: gs://ml-pipeline-playground/shakespeare1.txt
    name: url1
  - default: gs://ml-pipeline-playground/shakespeare2.txt
    name: url2
  tasks:
  - name: gcs-download
    params:
    - name: url1
      value: $(params.url1)
    taskRef:
      name: gcs-download
  - name: gcs-download-2
    params:
    - name: url2
      value: $(params.url2)
    taskRef:
      name: gcs-download-2
  - name: echo
    params:
    - name: gcs-download-2-data
      value: $(params.gcs-download-2-data)
    - name: gcs-download-data
      value: $(params.gcs-download-data)
    taskRef:
      name: echo

errors

tommyli$ kubectl apply -f pipeline.yaml
task.tekton.dev/gcs-download configured
task.tekton.dev/gcs-download-2 configured
task.tekton.dev/echo configured
Error from server (BadRequest): error when creating "pipeline.yaml": admission webhook "webhook.tekton.dev" denied the request: mutation failed: non-existent variable in "$(params.gcs-download-2-data)" for task parameter param[gcs-download-2-data]: pipelinespec.params.param[gcs-download-2-data]

Collaborator

@animeshsingh animeshsingh left a comment

thanks @ckadner for this. Looks great!

Apart from ContainerOps and ResourceOps, there are other operations for Volume, Secret, etc. I assume those are part of the plan for the future?

}
}

elif isinstance(op, dsl.ResourceOp):
Collaborator

What is the significance of ResourceOp in the DSL?

Member Author

ResourceOp represents a pipeline task (op) which lets you directly manipulate Kubernetes resources (create, get, apply, …)

... KFP has an example using ResourceOp for PVC creation

import tarfile
import yaml
import zipfile
from typing import Callable, Set, List, Text, Dict, Tuple, Any, Union, Optional
Collaborator

can we import the whole thing?

Member Author

Agreed, it's a long line of an import statement, but this is the recommended Python practice (https://www.python.org/dev/peps/pep-0008/#imports) -- to make imports as narrow and explicit as possible. Wild-card imports like from typing import * are strongly discouraged.
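For illustration only, a small example in the explicit-import style: only the names that are actually used are imported, so a reader can see where `Dict` and `List` come from. The `group_by_kind` function is a hypothetical helper, not part of the compiler.

```python
from typing import Dict, List  # explicit names instead of "from typing import *"


def group_by_kind(documents: List[Dict]) -> Dict[str, List[str]]:
    """Map each document kind to the names of the documents of that kind."""
    grouped: Dict[str, List[str]] = {}
    for doc in documents:
        grouped.setdefault(doc["kind"], []).append(doc["metadata"]["name"])
    return grouped


example = group_by_kind([
    {"kind": "Task", "metadata": {"name": "gcs-download"}},
    {"kind": "Task", "metadata": {"name": "echo"}},
    {"kind": "Pipeline", "metadata": {"name": "parallel-pipeline"}},
])
```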
And since this was copied down from the KFP compiler I was trying to keep changes to a minimum to make it easier to trace actual code changes in the future.

@ckadner
Member Author

ckadner commented Mar 11, 2020

Hi Christian, how do you pass parameters between tasks? I was trying to use KFP's parallel join example and got the following errors:
https://github.com/kubeflow/pipelines/blob/master/samples/core/parallel_join/parallel_join.py

...

tommyli$ kubectl apply -f pipeline.yaml
task.tekton.dev/gcs-download configured
task.tekton.dev/gcs-download-2 configured
task.tekton.dev/echo configured
Error from server (BadRequest): error when creating "pipeline.yaml": admission webhook "webhook.tekton.dev" denied the request: mutation failed: non-existent variable in "$(params.gcs-download-2-data)" for task parameter param[gcs-download-2-data]: pipelinespec.params.param[gcs-download-2-data]

Hi @Tomcli, passing parameters between Tasks is not implemented yet. I was planning to do that in a subsequent PR (issue #19) in order to keep the size and scope of this PR to a minimum. This PR only migrates the compiler from generating a single Tekton Task with a sequence of steps to generating a Tekton Pipeline with a set of Tasks that can be executed in parallel or sequentially.
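The webhook error arises because the `echo` task references `$(params.gcs-download-2-data)` and `$(params.gcs-download-data)`, which are not declared under the Pipeline's `spec.params`. A minimal sketch of a local check that approximates this rule is shown below; `undeclared_param_refs` is a hypothetical helper and is not how Tekton itself implements the validation.

```python
import re
from typing import Dict, List

# Matches pipeline-level references such as $(params.url1).
_PARAM_REF = re.compile(r"\$\(params\.([^)]+)\)")


def undeclared_param_refs(pipeline: Dict) -> List[str]:
    """List the $(params.X) names used by tasks but missing from spec.params."""
    spec = pipeline["spec"]
    declared = {p["name"] for p in spec.get("params", [])}
    missing: List[str] = []
    for task in spec.get("tasks", []):
        for param in task.get("params", []):
            for ref in _PARAM_REF.findall(str(param.get("value", ""))):
                if ref not in declared and ref not in missing:
                    missing.append(ref)
    return missing


# Trimmed-down version of the Pipeline document from this thread.
pipeline = {
    "kind": "Pipeline",
    "spec": {
        "params": [{"name": "url1"}, {"name": "url2"}],
        "tasks": [
            {"name": "gcs-download",
             "params": [{"name": "url1", "value": "$(params.url1)"}]},
            {"name": "gcs-download-2",
             "params": [{"name": "url2", "value": "$(params.url2)"}]},
            {"name": "echo",
             "params": [
                 {"name": "gcs-download-2-data",
                  "value": "$(params.gcs-download-2-data)"},
                 {"name": "gcs-download-data",
                  "value": "$(params.gcs-download-data)"},
             ]},
        ],
    },
}
missing = undeclared_param_refs(pipeline)
```

Here `missing` contains the two `echo` inputs, which are task outputs rather than pipeline parameters, so they cannot be resolved until output passing between Tasks is implemented.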

@ckadner ckadner mentioned this pull request Mar 11, 2020
27 tasks
@ckadner
Member Author

ckadner commented Mar 11, 2020

Apart from ContainerOps and ResourceOps, there are other operations for Volume, Secret, etc. I assume those are part of the plan for the future?

@animeshsingh -- yes, we should list and prioritize next steps. I started this umbrella issue Compiler work items #18

@ckadner
Member Author

ckadner commented Mar 11, 2020

Just on the release point: Tekton beta, from 0.11 onwards, will guarantee some sort of API stability, so we should migrate to it as and when the schedule permits.

added to umbrella issue Compiler work items #18

@animeshsingh
Collaborator

/lgtm
/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: animeshsingh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 3167232 into kubeflow:master Mar 12, 2020
@ckadner
Member Author

ckadner commented Mar 12, 2020

thanks @animeshsingh and @Tomcli

ckadner added a commit to ckadner/kfp-tekton that referenced this pull request Mar 25, 2020
Replace Steps with Tasks to allow parallel task execution.
Use 'runAfter' to support sequential task execution.

4 participants