provide a pipeline concurrency limit #1305

Closed
jstrachan opened this issue Sep 13, 2019 · 15 comments
Labels
area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) kind/feature Categorizes issue or PR as related to a new feature. kind/question Issues or PRs that are questions around the project or a particular feature lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@jstrachan

jstrachan commented Sep 13, 2019

Expected Behavior

It's common in CI systems to limit the number of concurrent pipelines that can execute on a given repository and branch, e.g. process PRs concurrently but allow at most one release at a time in case releases clash with each other. This avoids race conditions between concurrent pipeline steps that operate on shared git repositories, buckets, or Kubernetes clusters.

e.g. imagine a simple pipeline of

  • get the next incrementing version number
  • spend some time building artifacts/images
  • use kubectl to deploy some resources using this version number or update the website/changelog

If you run this pipeline concurrently, all kinds of things could happen due to the wonders of concurrency (e.g. seeing the version number go forwards, then backwards).
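To make the failure mode concrete, here is a minimal sketch that replays the three steps above for two runs in a fixed interleaving. It is not Tekton code; `SharedState`, `take_version`, and `deploy` are hypothetical names used only for illustration, and the interleaving is scripted by hand so the outcome is deterministic.

```python
class SharedState:
    """Hypothetical shared resources: a version counter and a deploy history."""

    def __init__(self):
        self.next_version = 0
        self.deployed = []  # versions in the order they reached the cluster


def take_version(state):
    """Step 1: get the next incrementing version number."""
    state.next_version += 1
    return state.next_version


def deploy(state, version):
    """Step 3: deploy using the version obtained in step 1."""
    state.deployed.append(version)


def run_interleaved():
    """Two runs overlap; run A's build (step 2) takes longer than run B's."""
    state = SharedState()
    a = take_version(state)  # run A takes version 1
    b = take_version(state)  # run B takes version 2
    deploy(state, b)         # B finishes building first and deploys
    deploy(state, a)         # A deploys afterwards: the version goes backwards
    return state.deployed


def run_serialized():
    """The same two runs with a concurrency limit of 1."""
    state = SharedState()
    for _ in range(2):
        deploy(state, take_version(state))
    return state.deployed
```

In the interleaved case the deploy history ends on the older version, which is the "forwards, then backwards" symptom; serializing the runs keeps the history monotonic.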

When working on separate PRs, concurrency is not usually an issue; but when working on shared resources (e.g. producing a sequential stream of artifacts or updating a shared cluster) we often want to force a clean ordering on the pipelines to avoid confusion or worse.

Actual Behavior

There's currently no way to hold a pipeline back until all other pipelines for that repository + branch have completed, short of writing some kind of leader election step.

We're pondering writing a little leader election step as a workaround (which would be Jenkins X specific: jenkins-x/jx#5471), but figure it would be nice to add this kind of capability to the Tekton controller.

If you squint, it's a little like the Tekton controller acting as the ReplicaSet controller: if replicas = 1 for a unique string (e.g. the git repository URL + branch name), only start a new Pod for a PipelineRun when no others are running for that string.

Steps to Reproduce the Problem

  1. run lots of pipelines like the above example and watch the version number go up and down

Additional Info

If there were some kind of MaximumConcurrency for a specific source repository and branch, we could modify the Tekton controller to only create a new Pod when it knows there are no other running pods for that source repository + branch.
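A rough sketch of the check such a controller might perform before creating a Pod is shown below. Nothing here is an existing Tekton API: the `Run` type, the `can_start` helper, and the key format (repository URL plus branch) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical view of a PipelineRun as seen by the controller."""
    name: str
    key: str       # assumed concurrency key, e.g. "<repo URL>@<branch>"
    running: bool


def can_start(candidate_key, runs, max_concurrency=1):
    """Return True if another run with this key may start now.

    Counts the runs that are currently executing under the same key and
    compares that count with the limit (MaximumConcurrency in the text above).
    """
    active = sum(1 for r in runs if r.running and r.key == candidate_key)
    return active < max_concurrency
```

For example, with a release already running for `repo@master`, `can_start("repo@master", runs)` is false, while a run keyed on a PR branch is unaffected.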

@vdemeester
Member

/kind feature
/kind question

I wonder if this should be a feature for pipeline (core) or some integration tooling (like jenkins-x)

@tekton-robot tekton-robot added kind/feature Categorizes issue or PR as related to a new feature. kind/question Issues or PRs that are questions around the project or a particular feature labels Nov 27, 2019
@bobcatfish
Collaborator

This sounds like it could be pretty cool!! I agree with @vdemeester in that I'm not 100% sure where it would best belong - but there have been some related ideas that have come up lately (but are slightly more complicated) such as:

  • Keeping track of how much $ a Pipeline is costing (how many resources it's consuming) and limiting how frequently it can run as a result

I think it could be pretty cool to think about what it would be like to apply generic policies to Pipeline execution, maybe in an admission controller, so it could be decoupled from the Tekton Pipelines codebase but could be ultra flexible.

@assertion
Contributor

Any updates about this issue? @bobcatfish @vdemeester
We are also facing this situation when switching repo-based CI/CD pipelines to Tekton.

@bobcatfish
Collaborator

Hey @assertion ! I don't think there has been any movement but if you (or anyone else in the community) wants to take this on I think we'd be happy to see it! Let me know if you want any pointers re. next steps to work on this.

@Fabian-K
Contributor

I looked into the approach using an admission controller. To ensure a pipeline has no concurrent executions, I registered a validating admission controller for the creation of taskruns. As the incoming AdmissionReview contains the taskrun definition, I can query for other running pipelines and, based on the result, either allow or reject the request.

This works fine as the tekton controller retries creating the taskrun after it was rejected.
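The decision logic of such a webhook could look roughly like the sketch below. It is not the code used in this comment: the `count_other_running` callable is a hypothetical stand-in for the cluster query, the use of the `tekton.dev/pipeline` and `tekton.dev/pipelineRun` labels to identify the owning pipeline is an assumption, and the response is shaped as an `admission.k8s.io/v1` AdmissionReview.

```python
def review_taskrun(admission_review, count_other_running):
    """Build an AdmissionReview response for a TaskRun CREATE request.

    count_other_running(pipeline, own_run) is a hypothetical callable that
    returns how many *other* runs of the same pipeline are executing.
    """
    request = admission_review["request"]
    labels = request["object"].get("metadata", {}).get("labels", {})
    # Assumption: these labels identify the pipeline and the owning run.
    pipeline = labels.get("tekton.dev/pipeline")
    own_run = labels.get("tekton.dev/pipelineRun")

    # TaskRuns that do not belong to a pipeline are always admitted.
    allowed = pipeline is None or count_other_running(pipeline, own_run) == 0

    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {
            "message": f"pipeline {pipeline} already has a run in progress"
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Excluding the run's own TaskRuns from the count matters: otherwise the second task of an admitted pipeline run would be rejected by its own first task.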

As the retry seems to follow some exponential backoff pattern, I'm a bit worried that if the creation is rejected multiple times, a lot of time might be wasted. Any ideas about that? Do you know where I can find details about the backoff pattern?
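To get a feel for how much time could be lost, the sketch below models a generic exponential backoff. The controller's real retry parameters are not stated in this thread, so `base`, `factor`, and `cap` are placeholder assumptions, and `wasted_wait` is a hypothetical helper rather than anything from Tekton.

```python
def wasted_wait(blocker_seconds, base=1.0, factor=2.0, cap=None):
    """Estimate idle time caused by retrying with exponential backoff.

    The first creation attempt happens at t=0 and is rejected while the
    blocking run (which ends at t=blocker_seconds) is still executing.
    Each retry waits `base * factor**n` seconds, optionally capped at `cap`.

    Returns (attempts, idle_seconds): the total number of creation attempts,
    and the time between the blocker finishing and the retry that succeeds.
    """
    elapsed = 0.0
    delay = base
    attempts = 1
    while elapsed < blocker_seconds:
        elapsed += delay
        attempts += 1
        delay *= factor
        if cap is not None:
            delay = min(delay, cap)
    return attempts, elapsed - blocker_seconds


# With these assumed parameters and a blocker that runs for 100 seconds:
print(wasted_wait(100))            # (8, 27.0): uncapped, 27 s idle
print(wasted_wait(100, cap=10.0))  # (14, 5.0): capped at 10 s, 5 s idle
```

Under this model, uncapped backoff can leave the pipeline idle for roughly as long as the previous delay, so a cap on the delay bounds the wasted time at the cost of more rejected attempts.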

@bobcatfish bobcatfish added this to Needs triage in Tekton Pipelines Feb 26, 2020
@dibyom dibyom moved this from Needs triage to Backlog in Tekton Pipelines Mar 30, 2020
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 13, 2020
@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Tekton Pipelines automation moved this from Unsure what to do next to Closed Aug 13, 2020
@bobcatfish
Collaborator

I'm gonna reopen this one, I wonder if we should add it to the roadmap also 🤔

/reopen

@tekton-robot tekton-robot reopened this Aug 13, 2020
Tekton Pipelines automation moved this from Closed to Needs triage Aug 13, 2020
@tekton-robot
Collaborator

@bobcatfish: Reopened this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bobcatfish bobcatfish added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Aug 13, 2020
@dibyom dibyom added area/roadmap Issues that are part of the project (or organization) roadmap (usually an epic) lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Aug 24, 2020
@project-bot project-bot bot moved this from Needs triage to High priority in Tekton Pipelines Aug 24, 2020
@bobcatfish bobcatfish changed the title provide a pipeline concurrency limit for a given repository + branch provide a pipeline concurrency limit Aug 24, 2020
@bobcatfish
Collaborator

We're adding this to our roadmap. I changed the name a bit to reflect that we feel like we'd want to address this in general, but maybe not specifically for branch + repo only.

@imjasonh
Member

Some thoughts on how this could possibly work, feel free to propose alternatives:

  • introduce a "concurrency bucket" CRD with a cap on task runs and/or resources, and/or something else

  • have PipelineRuns and TaskRuns state what bucket they're counting against; possibly using an annotation?

  • triggers could populate with some key based on repo+branch (or just repo, or org, etc.)

  • PipelineRun controller holds runs in a concurrency-limited state until there's room in the bucket's cap

Open Questions:

  • should items be unblocked FIFO? At random? Scheduled based on requests?
  • how should this interact with existing K8s features for limiting resource usage in a namespace? Can operators use these features effectively today as a stopgap?
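As one possible reading of the bucket idea, the sketch below holds runs in a pending queue until there is room under the cap and unblocks them FIFO. This is only an illustration of the proposal, not an existing CRD or controller: `ConcurrencyBucket` and its methods are hypothetical, and FIFO is just one of the orderings raised in the open questions.

```python
from collections import deque


class ConcurrencyBucket:
    """Hypothetical in-memory model of a concurrency bucket with a cap."""

    def __init__(self, cap):
        if cap < 1:
            raise ValueError("cap must be at least 1")
        self.cap = cap
        self.running = set()    # names of runs currently counted against the cap
        self.pending = deque()  # runs held back, oldest first

    def submit(self, run_name):
        """Return True if the run may start now, False if it is held."""
        if len(self.running) < self.cap:
            self.running.add(run_name)
            return True
        self.pending.append(run_name)
        return False

    def finish(self, run_name):
        """Release a slot and return the runs unblocked by it, in FIFO order."""
        self.running.discard(run_name)
        started = []
        while self.pending and len(self.running) < self.cap:
            nxt = self.pending.popleft()
            self.running.add(nxt)
            started.append(nxt)
        return started
```

A trigger could then map each incoming event to a bucket (for example one per repo+branch) and call `submit`; the controller would call `finish` when a run completes and start whatever it returns.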

@bobcatfish
Collaborator

Related issue: #2591

@afrittoli
Member

@jstrachan work on this is happening in the experimental repo (tektoncd/experimental#699); please continue the investigation / discussion there. Closing this one for now.

9 participants