0061-allow-custom-task-to-be-embedded-in-pipeline.md

status: implemented
title: Allow custom task to be embedded in pipeline
creation-date: 2021-03-18
last-updated: 2021-05-26
authors: @Tomcli, @litong01, @ScrapCodes

TEP-0061: Allow custom task to be embedded in pipeline

Summary

Tektoncd/Pipeline currently allows a custom task to be referenced in a pipeline resource specification file using taskRef. This TEP discusses the various aspects of embedding the custom task in the TaskSpec for the Tekton Pipeline CRD and in the RunSpec for the Tekton Run CRD. Just as a regular task can be either referenced or embedded in a pipelineRun, once this TEP is implemented, similar support will be available for custom task controllers as well.

Motivation

A custom task reference needs to be submitted to Kubernetes along with the submission of the Tektoncd/Pipeline. To run the pipeline, the custom task resource object is created through a separate request to Kubernetes. If multiple custom task resource objects are created with the same name, both Kubernetes and Tektoncd/Pipeline treat them as the same task. This behavior can have unintended consequences when Tektoncd/Pipeline is used as a backend with multiple users. The problem becomes even greater when new users follow documents such as Get started, where each user may end up with the same name for a task and a pipeline. In this environment multiple users will step on each other's toes and produce unintended results.

Another motivation for this TEP is a reduction in the number of API calls needed to get all the pipeline information. A case in point: in Kubeflow Pipeline (KFP), we need all the templates and task specs to live in each pipeline. Currently, having all the custom task templates living in the Kubernetes namespace scope means that we have to make multiple API calls to Kubernetes in order to get all the pipeline information to render in our API/UI. For example, when we create a pipelineRun with custom tasks, the KFP client first needs to make multiple API calls to Kubernetes to create all the custom task resource objects in the same namespace before creating the pipelineRun. Having all the specs inside a single pipelineRun can simplify task/pipeline submission for the KFP client and reduce the number of API calls to the Kubernetes cluster.

Currently Tektoncd/Pipeline supports embedding task specifications in a pipeline for regular tasks, but not for custom tasks. If Tektoncd/Pipeline also allows a custom task specification to be embedded in a pipeline specification, the behavior will be unified with that of regular tasks, while retaining the existing behavior of taskRef. Most importantly, embedding the spec avoids naming conflicts when multiple users in the same namespace create resources. Related issue: tektoncd/pipeline#3682

Goals

  1. Allow custom tasks to be embedded in a pipeline specification.
  2. Custom taskSpec should be submitted as part of the runSpec.
  3. Document general advice on validation/verification of custom tasks for custom task controller developers.

Non-Goals

  1. Validation of the custom task specification by Tektoncd/Pipeline webhooks. Custom task controllers are developed by other parties, and validating the custom task specification is left to those controllers.

Use Cases (optional)

Use cases from Kubeflow Pipeline (KFP), where tektoncd is used as a backend for running pipelines:

  • The KFP compiler can put all the information in one pipelineRun object. The KFP client then doesn't need to create any Kubernetes resource before running the pipelineRun.
  • KFP doesn't manage the lifecycle of the custom task resource objects associated with each pipeline. Since many custom task resource objects are namespace-scoped, multiple users in the same namespace will have conflicts when creating custom task resource objects with the same name but different specs.

Requirements

  • The Tekton controller is responsible for adding the custom task spec to the Run spec. Validation of the custom task is delegated to the custom controller.

Proposal

Add support for Run.RunSpec.Spec.

Currently, Run.RunSpec.Spec is not supported, and there are validations across the codebase to ensure that only Run.RunSpec.Ref is specified. As part of this TEP, in addition to adding support for Run.RunSpec.Spec, the validations will be changed so that exactly one of Run.RunSpec.Spec or Run.RunSpec.Ref, and not both, is accepted in a single API request to Kubernetes.
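
The "exactly one of Ref or Spec" rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the Tektoncd/Pipeline implementation: the types are trimmed stand-ins for the real ones, and validateRefOrSpec is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
)

// TaskRef and EmbeddedRunSpec are simplified stand-ins for the Tekton types;
// only the fields needed for this illustration are kept.
type TaskRef struct {
	APIVersion string
	Kind       string
	Name       string
}

type EmbeddedRunSpec struct {
	APIVersion string
	Kind       string
	Raw        []byte // stand-in for runtime.RawExtension
}

type RunSpec struct {
	Ref  *TaskRef
	Spec *EmbeddedRunSpec
}

// validateRefOrSpec (hypothetical helper) accepts a RunSpec only when
// exactly one of Ref or Spec is set.
func validateRefOrSpec(rs RunSpec) error {
	switch {
	case rs.Ref != nil && rs.Spec != nil:
		return errors.New("expected exactly one of spec.ref or spec.spec, got both")
	case rs.Ref == nil && rs.Spec == nil:
		return errors.New("expected exactly one of spec.ref or spec.spec, got neither")
	}
	return nil
}

func main() {
	refOnly := RunSpec{Ref: &TaskRef{Name: "simpletask"}}
	both := RunSpec{Ref: &TaskRef{Name: "simpletask"}, Spec: &EmbeddedRunSpec{Kind: "TaskLoop"}}
	fmt.Println("ref only:", validateRefOrSpec(refOnly))
	fmt.Println("both:", validateRefOrSpec(both))
}
```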

Introducing a new type v1alpha1.EmbeddedRunSpec:

// EmbeddedRunSpec allows custom task definitions to be embedded
type EmbeddedRunSpec struct {
	runtime.TypeMeta `json:",inline"`

	// +optional
	Metadata v1beta1.PipelineTaskMetadata `json:"metadata,omitempty"`

	// Spec is a specification of a custom task
	// +optional
	Spec runtime.RawExtension `json:"spec,omitempty"`
}

Structure of RunSpec after adding the field Spec of type EmbeddedRunSpec:

// RunSpec defines the desired state of Run
type RunSpec struct {
	// +optional
	Ref *TaskRef `json:"ref,omitempty"`

	// Spec is a specification of a custom task
	// +optional
	Spec *EmbeddedRunSpec `json:"spec,omitempty"`

	// +optional
	Params []v1beta1.Param `json:"params,omitempty"`

	// Used for cancelling a run (and maybe more later on)
	// +optional
	Status RunSpecStatus `json:"status,omitempty"`

	// +optional
	ServiceAccountName string `json:"serviceAccountName"`

	// PodTemplate holds pod specific configuration
	// +optional
	PodTemplate *PodTemplate `json:"podTemplate,omitempty"`

	// Workspaces is a list of WorkspaceBindings from volumes to workspaces.
	// +optional
	Workspaces []v1beta1.WorkspaceBinding `json:"workspaces,omitempty"`
}

An embedded task will accept new fields, i.e. Spec of type runtime.RawExtension, and the APIVersion and Kind fields of type string (as part of runtime.TypeMeta):

type EmbeddedTask struct {
	// +optional
	runtime.TypeMeta `json:",inline,omitempty"`

	// +optional
	Spec runtime.RawExtension `json:"spec,omitempty"`

	// +optional
	Metadata PipelineTaskMetadata `json:"metadata,omitempty"`

	// TaskSpec is a specification of a task
	// +optional
	TaskSpec `json:",inline,omitempty"`
}

An example Run spec, based on the Tektoncd/experimental/task-loop controller, will look like:

apiVersion: tekton.dev/v1alpha1
kind: Run
metadata:
  name: simpletasklooprun
spec:
  params:
    - name: word
      value:
        - jump
        - land
        - roll
    - name: suffix
      value: ing
  spec:
    apiVersion: custom.tekton.dev/v1alpha1
    kind: TaskLoop
    spec:
      # Task to run (inline taskSpec also works)
      taskRef:
        name: simpletask
      # Parameter that contains the values to iterate
      iterateParam: word
      # Timeout (defaults to global default timeout, usually 1h00m; use "0" for no timeout)
      timeout: 60s
      # Retries for task failure
      retries: 2

Another example, based on a PipelineRun spec, will look like:

Note that spec.pipelineSpec.tasks.taskSpec.spec holds the custom task spec.

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pr-loop-example
spec:
  pipelineSpec:
    tasks:
      - name: first-task
        taskSpec:
          steps:
            - name: echo
              image: ubuntu
              imagePullPolicy: IfNotPresent
              script: |
                #!/usr/bin/env bash
                echo "I am the first task before the loop task"
      - name: loop-task
        runAfter:
          - first-task
        params:
          - name: message
            value:
              - I am the first one
              - I am the second one
              - I am the third one
        taskSpec:
          apiVersion: custom.tekton.dev/v1alpha1
          kind: PipelineLoop
          spec:
            iterateParam: message
            pipelineSpec:
              params:
                - name: message
                  type: string
              tasks:
                - name: echo-loop-task
                  params:
                    - name: message
                      value: $(params.message)
                  taskSpec:
                    params:
                      - name: message
                        type: string
                    steps:
                      - name: echo
                        image: ubuntu
                        imagePullPolicy: IfNotPresent
                        script: |
                          #!/usr/bin/env bash
                          echo "$(params.message)"
      - name: last-task
        runAfter:
          - loop-task
        taskSpec:
          steps:
            - name: echo
              image: ubuntu
              imagePullPolicy: IfNotPresent
              script: |
                #!/usr/bin/env bash
                echo "I am the last task after the loop task"

Tektoncd/pipeline can only validate the structure and fields it knows about; validation of the custom task spec field(s) is delegated to the custom task controller.

A custom controller may still choose not to support a Spec-based Run or PipelineRun specification. This can be done by implementing validations on the custom controller side. If the custom controller does not respond in any way, i.e. it neither reports validation errors nor reconciles the resource, the PipelineRun or Run will wait until the timeout and then be marked as Failed.

What is the fate of an existing custom controller developed prior to the implementation of this TEP? If the custom controller implemented a validation for a missing Ref, then a PipelineRun or Run missing a Ref will fail immediately with the configured error. If, however, no validation was implemented for a missing Ref, the request can even lead to nil dereference errors, or meet the same fate as with a custom controller that does not respond to a missing Spec or Ref.
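
A controller that only supports Ref can avoid both the nil dereference and the silent timeout by checking the Run before touching Ref. The sketch below is only an assumption about how such a guard might look: Run is a trimmed stand-in for the real type, and resolveTaskName and errSpecUnsupported are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed stand-ins for the Tekton types; not the real definitions.
type TaskRef struct {
	APIVersion, Kind, Name string
}

type EmbeddedRunSpec struct {
	APIVersion, Kind string
	Raw              []byte
}

type Run struct {
	Ref  *TaskRef
	Spec *EmbeddedRunSpec
}

// errSpecUnsupported is returned by a controller that has not yet been
// upgraded to handle embedded specs.
var errSpecUnsupported = errors.New("embedded spec is not supported by this controller; use ref")

// resolveTaskName (hypothetical) returns the referenced task name, and
// reports an error instead of dereferencing a nil Ref.
func resolveTaskName(run Run) (string, error) {
	if run.Spec != nil {
		return "", errSpecUnsupported
	}
	if run.Ref == nil {
		return "", errors.New("run has neither ref nor spec")
	}
	return run.Ref.Name, nil
}

func main() {
	name, err := resolveTaskName(Run{Ref: &TaskRef{Name: "simpletask"}})
	fmt.Println(name, err)

	_, err = resolveTaskName(Run{Spec: &EmbeddedRunSpec{Kind: "TaskLoop"}})
	fmt.Println(err)
}
```

With a check of this kind, the controller can mark the Run as failed right away with a clear message rather than leaving it to time out.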

Notes/Caveats (optional)

A poorly implemented custom task controller might neglect validation or exhibit erroneous behaviour beyond the control of tektoncd/pipeline. This is true of any custom task implementation, whether it uses Spec or Ref.

Risks and Mitigations

User Experience (optional)

With the embedded taskSpec for the custom task, all Tekton clients can create a pipeline or pipelineRun using a single API call to Kubernetes. Downstream systems that employ tektoncd, e.g. Kubeflow Pipelines, will not need to manage the lifecycle of all the custom task resource objects (e.g. generating unique names) or their versioning.

Users are already familiar with this pattern from Kubernetes, where a PodTemplateSpec embeds the pod definition in a Deployment, ReplicaSet, or StatefulSet. Tektoncd/Pipeline with custom tasks embedded will offer a similar, familiar experience.

Performance (optional)

The performance improvement is a consequence of the reduction in the number of API requests needed to create the custom resources accompanying a pipeline. In pipelines where the number of custom task resource objects is large, this can make a significant difference.

For end users, rendering the custom task resource details on the UI dashboard can be a much smoother experience if all the details can be fetched in fewer API requests.

Design Details

The actual code changes needed to implement this TEP are minimal.

Broad categories are:

  1. Add the relevant APIs. Already covered in Proposal section.

  2. Change validation logic to accept the newly added API fields. Currently tektoncd/pipeline rejects any request for a Run that does not include a Run.RunSpec.Ref. This validation is changed so that exactly one of Ref or Spec must be present, but not both.

    Next, whether it is a Ref or a Spec, the validation logic will ensure that it has non-empty values for APIVersion and Kind.

    Lastly, document advice for downstream custom controllers to implement their own validation logic. This aspect is covered in full detail in the Upgrade & Migration Strategy section of this TEP.
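
The APIVersion and Kind check described in item 2 could look roughly like the sketch below. It is an illustration only: typeMeta is a stand-in for the apiVersion/kind pair carried by both a Ref and an embedded Spec, and validateTypeMeta is a hypothetical helper, not a function from the Tekton codebase.

```go
package main

import "fmt"

// typeMeta is a stand-in for the apiVersion/kind pair that both TaskRef
// and the embedded spec (via runtime.TypeMeta) carry.
type typeMeta struct {
	APIVersion string
	Kind       string
}

// validateTypeMeta (hypothetical helper) rejects a Ref or Spec whose
// apiVersion or kind is empty. The field argument only labels the error.
func validateTypeMeta(field string, tm typeMeta) error {
	if tm.APIVersion == "" {
		return fmt.Errorf("missing field: %s.apiVersion", field)
	}
	if tm.Kind == "" {
		return fmt.Errorf("missing field: %s.kind", field)
	}
	return nil
}

func main() {
	complete := typeMeta{APIVersion: "custom.tekton.dev/v1alpha1", Kind: "TaskLoop"}
	fmt.Println(validateTypeMeta("spec.spec", complete))
	fmt.Println(validateTypeMeta("spec.ref", typeMeta{Kind: "TaskLoop"}))
}
```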

This TEP does not change the existing flow of creating the Run object. It populates the Run object with the content of RunSpec.Spec by marshalling the field Spec runtime.RawExtension to JSON and embedding it in the spec before creating the Run object.
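
As a rough illustration of that marshalling step, the sketch below embeds a custom task spec as raw JSON inside a Run-like spec. It uses only the standard library: json.RawMessage stands in for runtime.RawExtension, the struct types are trimmed stand-ins, and embedCustomSpec and renderRunSpec are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddedRunSpec mirrors the shape of EmbeddedRunSpec, with
// json.RawMessage standing in for runtime.RawExtension.
type embeddedRunSpec struct {
	APIVersion string          `json:"apiVersion,omitempty"`
	Kind       string          `json:"kind,omitempty"`
	Spec       json.RawMessage `json:"spec,omitempty"`
}

// runSpec keeps only the field relevant to this illustration.
type runSpec struct {
	Spec *embeddedRunSpec `json:"spec,omitempty"`
}

// embedCustomSpec (hypothetical) marshals the custom task spec to JSON and
// stores the raw bytes in the run spec, without interpreting their content.
func embedCustomSpec(apiVersion, kind string, customSpec interface{}) (runSpec, error) {
	raw, err := json.Marshal(customSpec)
	if err != nil {
		return runSpec{}, fmt.Errorf("marshalling custom task spec: %w", err)
	}
	return runSpec{Spec: &embeddedRunSpec{APIVersion: apiVersion, Kind: kind, Spec: raw}}, nil
}

// renderRunSpec returns the JSON form of the run spec.
func renderRunSpec(rs runSpec) (string, error) {
	out, err := json.Marshal(rs)
	return string(out), err
}

func main() {
	custom := map[string]interface{}{"iterateParam": "word", "retries": 2}
	rs, err := embedCustomSpec("custom.tekton.dev/v1alpha1", "TaskLoop", custom)
	if err != nil {
		panic(err)
	}
	out, err := renderRunSpec(rs)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Keeping the custom spec as opaque bytes matches the delegation described earlier: the spec is carried along unchanged, and only the custom task controller decodes it.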

Test Plan

We can reuse the current custom task e2e tests, which simulate a custom task controller updating the Run status, and then verify whether the controller can handle the custom task taskSpec as well.

Design Evaluation

Before the implementation of this TEP, i.e. without support for embedding a custom task spec in the PipelineRun resource, a user has to issue multiple API requests to the API server. The user also has to ensure unique names to avoid conflicts with custom task resource objects created by someone else.

Embedding the custom task spec avoids the problems related to name collisions and also improves performance by reducing the number of API requests needed to create custom task resource objects. The performance benefit of reducing the number of API requests is more evident when using a web UI dashboard to display pipeline details (e.g. in Kubeflow Pipelines with Tekton as the backend).

Lastly, it is aesthetically nicer and coherent with the existing regular task: the whole custom task spec takes fewer lines of YAML and is present in one place.

Drawbacks

Alternatives

Use v1beta1.EmbeddedTask as RunSpec.Spec so that we don't have to introduce a new embedded Spec type for runs.

Cons:

  • brings some PipelineTask-specific fields (like PipelineResources) that don't have a use case in Runs yet.

Infrastructure Needed (optional)

Upgrade & Migration Strategy (optional)

  • Existing custom controllers need to upgrade their validation logic:

    Rationale: Previously, there was only one possibility for the structure of Run objects, i.e. they had the path Run.RunSpec.Ref. A custom controller could do fine even without validating input requests that miss a Ref, because this was already validated by tektoncd/pipeline. After the implementation of this TEP, this is no longer the case: a Run.RunSpec may contain either a Ref or a Spec. So a request with a Spec, sent to a controller that does not properly validate a missing Ref and does not yet support a Spec, may leave the controller in an unstable state, e.g. due to nil dereference errors, or fail due to timeout.

  • Support spec or taskSpec in the existing custom controller:

    With the implementation of this TEP, users can supply a custom task spec embedded in a PipelineRun or Run. Existing custom controllers need to be upgraded to provide support.

  • Unmarshalling the JSON of the custom task object embedded as Spec:

    Run.RunSpec.Spec objects are marshalled to JSON bytes using json.Marshal from Go's encoding/json package. A custom controller may therefore unmarshal these objects with the corresponding function, e.g. json.Unmarshal(run.Spec.Spec.Spec.Raw, &customObjectSpec). In the future, a custom task SDK will do a better job of handling this and make it easier for developers to work on custom task controllers. TODO: Add a reference to an example custom task controller e.g. TaskLoop, once the changes are merged.
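
The unmarshalling step in the last item might be sketched as below. This is an assumption-laden illustration: rawExtension mimics only the Raw field of runtime.RawExtension, taskLoopSpec is a hypothetical, trimmed-down spec loosely based on the TaskLoop example above, and decodeTaskLoopSpec is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawExtension mimics the Raw []byte field of runtime.RawExtension.
type rawExtension struct {
	Raw []byte
}

// taskLoopSpec is a hypothetical, trimmed-down custom task spec; a real
// controller would use its own spec type here.
type taskLoopSpec struct {
	IterateParam string `json:"iterateParam"`
	Timeout      string `json:"timeout,omitempty"`
	Retries      int    `json:"retries,omitempty"`
}

// decodeTaskLoopSpec (hypothetical) decodes the embedded raw JSON into the
// controller's own spec type, rejecting empty or malformed input.
func decodeTaskLoopSpec(ext rawExtension) (taskLoopSpec, error) {
	var spec taskLoopSpec
	if len(ext.Raw) == 0 {
		return spec, fmt.Errorf("embedded spec is empty")
	}
	if err := json.Unmarshal(ext.Raw, &spec); err != nil {
		return spec, fmt.Errorf("decoding embedded spec: %w", err)
	}
	return spec, nil
}

func main() {
	ext := rawExtension{Raw: []byte(`{"iterateParam":"word","timeout":"60s","retries":2}`)}
	spec, err := decodeTaskLoopSpec(ext)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", spec)
}
```

By default json.Unmarshal ignores fields the target type does not declare, so a controller that wants to reject unknown fields would need stricter decoding of its own.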

Implementation Pull request(s)

  1. API Changes, docs and e2e tests
  2. Followup fix
  3. Followup fix 2

References (optional)

  1. tektoncd/pipeline#3682
  2. TEP-0002 Custom tasks