Add "volume" input/output resource #1062

Closed
abayer opened this issue Jul 10, 2019 · 1 comment
Labels: design, kind/design, kind/feature, priority/important-soon

Comments

abayer (Contributor) commented Jul 10, 2019

Related to but not exactly the same as #924. =)

In Jenkins X, we're going to be running two PipelineRuns of different Pipelines for each build - the first will dynamically generate the second Pipeline from the jenkins-x.yml and build packs etc in the repo and potentially make changes to other files in the workspace, while the second will run the actual build. To do this, we need to be able to pass the modified files from the first PipelineRun's workspace to the second PipelineRun (in the case of pull requests, we also merge the pull request into master locally in both PipelineRuns currently, and it would be nice to avoid doing that in the second one). We considered using a gcs resource for this, but that would break Jenkins X running on anything but GCP, which isn't an option, and would require bucket configuration, etc.

So we've got jenkins-x/jx#4660 - we'd like to take the workspace at the end of the first PipelineRun and use that as the input to the second one. I thought about trying to just reuse the first PipelineRun's artifact storage PVC, but decided it was better to make a new PVC and use that, so that we don't have to mess with the cleanup logic for artifact storage PVCs. Anyway, I've got some experimental work on this and will hopefully have a proof-of-concept PR up this week.

abayer added the kind/feature, design, priority/important-soon, and kind/design labels Jul 10, 2019
abayer self-assigned this Jul 10, 2019
abayer added a commit to abayer/tektoncd-pipeline that referenced this issue Jul 16, 2019
fixes tektoncd#1062

This will allow copying content either into or out of a `TaskRun`,
either to an existing volume or a newly created volume. The immediate
use case is for copying a pipeline's workspace to be made available as
the input for another pipeline's workspace without needing to deal
with uploading everything to a bucket. The volume, whether already
existing or created, will not be deleted at the end of the
`PipelineRun`, unlike the artifact storage PVC.

This is just the initial work - the unit tests are not complete, and
there need to be e2e tests, obviously, but I just wanted to get this
initial work up for evaluation.

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
dlorenc pushed a commit to dlorenc/build-pipeline that referenced this issue Aug 12, 2019
bobcatfish added this to the Pipelines 0.7 🐱 milestone Sep 6, 2019
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Sep 14, 2019
In tektoncd#1109 we will be removing support for git as an output. In the
current implementation, git as an output is just a volume that holds the
data from the git repo, and copies it between Tasks (when git as an
output is linked to git as an input). As discussed in tektoncd#1076, the model
we want for PipelineResources is for them to take the outside world and
represent it on disk when used as an input, and when used as an output,
to update the outside world. To do this, what we actually want is for a
git output to create a commit in the repo it references.
However, up until this point folks have been using git resources in the
way that we want Volume Resources to behave (tektoncd#1062), so we want to
transition folks to Volume Resources and away from using git outputs.

Fixes tektoncd#1283
bobcatfish added a commit to dlorenc/build-pipeline that referenced this issue Sep 17, 2019
The Volume resource is a sub-type of the general Storage resource.

Since this type will require the creation of a PVC to function (to be
configurable later), this commit adds a Setup interface that
PipelineResources can implement if they need to do setup that involves
instantiating objects in Kube. This could be a place to later add
features like caching, and also is the sort of design we'd expect once
PipelineResources are extensible (PipelineResources will be free to do
whatever setup they need).

fixes tektoncd#1062

Co-authored-by: dlorenc <lorenc.d@gmail.com>
Co-authored-by: Christie Wilson <bobcatfish@gmail.com>
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Sep 17, 2019
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Sep 19, 2019
bobcatfish added a commit to dlorenc/build-pipeline that referenced this issue Oct 1, 2019
The behavior of this volume resource is:
1. For inputs, copy data _from_ the PVC to the workspace path
2. For outputs, copy data _to_ the PVC from the workspace path
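
The two copy directions above can be sketched as follows. This is a minimal illustration rather than Tekton's implementation: the function name `sync_volume_resource`, the `direction` values, and the use of plain directories to stand in for the mounted PVC and the workspace path are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def sync_volume_resource(pvc_mount: Path, workspace_path: Path, direction: str) -> Path:
    """Copy data between a mounted PVC and a workspace path.

    Hypothetical helper: "input" copies from the PVC to the workspace,
    "output" copies from the workspace to the PVC.
    """
    if direction == "input":
        src, dst = pvc_mount, workspace_path
    elif direction == "output":
        src, dst = workspace_path, pvc_mount
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Calling it with `"input"` before a Task's steps and with `"output"` after them mirrors the behavior listed above.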

If a user does want to control where the data is copied from, they can:
1. Add a step that copies from the location they want to copy from on
   disk to /workspace/whatever
2. Use the "targetPath" argument in the PipelineResource to control the
   location the data is copied to (still relative to targetPath
   https://github.com/tektoncd/pipeline/blob/master/docs/resources.md#controlling-where-resources-are-mounted)
3. Use `path` https://github.com/tektoncd/pipeline/blob/master/docs/resources.md#overriding-where-resources-are-copied-from
   (tbd if we want to keep this feature post PVC)

The underlying PVC will need to be created by the TaskRun reconciler if
only a TaskRun is being used, or by the PipelineRun reconciler if a
Pipeline is being used. The PipelineRun reconciler cannot delegate this
to the TaskRun reconciler because, when two different reconcilers create PVCs
and Tekton is running on a regional GKE cluster, the PVCs can be created in
different zones, leaving a pod that tries to use both
unschedulable.
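
The scheduling problem described above can be illustrated with a small check. This is a sketch under a stated assumption: each PVC is backed by a volume that can only be attached in certain zones, so a pod can only run in a zone where every one of its volumes is available. The `Pvc` type and `schedulable_zones` function are hypothetical names for this example, not Tekton or Kubernetes APIs.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Pvc:
    name: str
    zones: FrozenSet[str]  # zones in which the backing volume can be attached


def schedulable_zones(cluster_zones: Iterable[str], pvcs: Iterable[Pvc]) -> FrozenSet[str]:
    """Return the zones where a pod mounting all of the given PVCs could run.

    An empty result means no zone satisfies every volume, so the pod
    would be unschedulable.
    """
    zones = frozenset(cluster_zones)
    for pvc in pvcs:
        zones = zones & pvc.zones
    return zones
```

With two PVCs pinned to different zones the result is empty, which corresponds to the unschedulable pod described in the commit message.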

fixes tektoncd#1062

Co-authored-by: Dan Lorenc <lorenc.d@gmail.com>
Co-authored-by: Christie Wilson <bobcatfish@gmail.com>
bobcatfish added a commit to dlorenc/build-pipeline that referenced this issue Oct 10, 2019
In order to actually schedule a pod using two volume resources, we had
to:
- Use a storage class that can be scheduled in a GKE regional cluster
  https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/regional-pd
- Either use the same storage class for the PVC attached automatically
  for input/output linking or don't use the PVC (chose the latter!)

This commit removes automatic PVC copying for input/output linking of
the VolumeResource: because the resource is itself a PVC, there is no need to
copy through an intermediate PVC. This makes it simpler to make a Task
using the VolumeResource schedulable, removes redundant copying, and
removes a side effect where, if a VolumeResource's output was linked to an
input, the Task with the input would see _only_ the changes made by the
output and none of the other contents of the PVC.

This commit also removes the docs on the `paths` param (i.e. "overriding where
resources are copied from") because it was implemented such that it
only works in the output -> input linking PVC case and can't
actually be used by users; it will be removed in tektoncd#1284.

fixes tektoncd#1062

Co-authored-by: Dan Lorenc <lorenc.d@gmail.com>
Co-authored-by: Christie Wilson <bobcatfish@gmail.com>
bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Oct 10, 2019
bobcatfish (Collaborator) commented Oct 25, 2019

After implementing this (har har) we realized that we might not need it after all, which I started to discuss in #1417 (comment)

We were creating the VolumeResource to solve several problems:

@sbwsg and @dlorenc have a new design to propose for #1272 and #1076 which will make it so that:

  • PipelineResources can declare whether they need to be backed by underlying storage such as a PVC or bucket
  • Tekton admins can configure what underlying storage mechanism is used; if none is configured, PVCs will be automatically created and destroyed for this
  • PipelineRuns can override the storage mechanism and provide their own PVC if desired <-- this should handle the use case described by @abayer, though it would mean the user (or Jenkins X in this case) would need to create the PVC before the first PipelineRun
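
The precedence implied by the bullets above could look like the following sketch. The design was only a teaser at this point, so everything here is an assumption for illustration: the function `resolve_storage`, its parameters, and the `"auto-pvc"` label are hypothetical and do not correspond to any Tekton API.

```python
from typing import Optional


def resolve_storage(run_override: Optional[str], admin_default: Optional[str]) -> str:
    """Pick the storage backing a PipelineResource, most specific setting first."""
    if run_override is not None:
        # A PipelineRun supplies its own storage, e.g. a pre-created PVC.
        return run_override
    if admin_default is not None:
        # Otherwise fall back to what the Tekton admin configured.
        return admin_default
    # With nothing configured, a PVC is created and destroyed automatically.
    return "auto-pvc"
```

In the Jenkins X case from the original report, the first PipelineRun would pass the name of a PVC created ahead of time as `run_override`.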

So long story short, between the new design (above is a teaser, watch #1272 and #1076 for more details!) and @skaegi's proposal in #1438, we think we can meet all the use cases we know of without a Volume Resource, but please add comments if you disagree and we can re-open!

(+@ravikiranbukka)

3 participants