Skip to content

TEP-0114: Support Both V1alpha1 and V1beta1 for Custom Task in the Initial Release#867

Merged
tekton-robot merged 1 commit intotektoncd:mainfrom
XinruZhang:TEP-0114-retries
Nov 28, 2022
Merged

TEP-0114: Support Both V1alpha1 and V1beta1 for Custom Task in the Initial Release#867
tekton-robot merged 1 commit intotektoncd:mainfrom
XinruZhang:TEP-0114-retries

Conversation

@XinruZhang
Copy link
Copy Markdown
Member

This PR serves as a follow-up PR of #862 to reflect @lbernick's comment here #862 (comment)

We will support both v1alpha1.Run and v1beta1.CustomRun in the initial release until the design decision is made in TEP-0121.

cc @jerop, @lbernick, @abayer

@tekton-robot tekton-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 27, 2022
@XinruZhang XinruZhang changed the title Support Both V1alpha1 and V1beta1 for Custom Task in the Initial Release TEP-0114: Support Both V1alpha1 and V1beta1 for Custom Task in the Initial Release Oct 27, 2022
Copy link
Copy Markdown
Member

@jerop jerop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @XinruZhang

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 27, 2022
@lbernick
Copy link
Copy Markdown
Member

/approve cancel
/hold

I need to think more about this-- if we introduce customruns without retries/retriesstatus at first and then add it in later, that's not really a backwards compatible change because we would expect all customruns to support retries, so it forces all controllers to update.

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 27, 2022
@vdemeester
Copy link
Copy Markdown
Member

What does it mean to support both ? 🤔

@jerop
Copy link
Copy Markdown
Member

jerop commented Nov 3, 2022

What does it mean to support both ? 🤔

understood it to mean that both v1alpha1.Run and v1beta1.CustomRun will be supported in the initial releases, that is, we won't remove v1alpha1.Run when v1beta1.CustomRun is released

cc @XinruZhang

@XinruZhang
Copy link
Copy Markdown
Member Author

XinruZhang commented Nov 3, 2022

Thanks for your comment @vdemeester! And thanks @jerop for your reply!

As far as I understand, to support both, we have two options to go:

  1. Use the feature flag enable-api-field to controll which version to use when creating a new Custom Task Run in pipeline Controller. But this requires users to migrate all custom tasks to v1beta1 all at once.
  2. Create a new field (and remove that field later on) in PipelineTask to control which version to use when creating a Custom Task Run out of a PipelineTask. But removing that field is backward incompatible.

cc @pritidesai @ScrapCodes

@abayer
Copy link
Copy Markdown
Contributor

abayer commented Nov 3, 2022

My thoughts on that are along the lines of a feature flag you can set to have the PipelineRun reconciler create v1alpha1.Runs instead of v1beta1.CustomRuns, with the feature flag deprecated from introduction to be removed in 6 months.

I think it's important to have some way of supporting continuing to use v1alpha1.Run because if we don't, we're going to break literally every existing custom task usage in the wild.

@pritidesai
Copy link
Copy Markdown
Member

As long as v1beta1.CustomRun is compatible with v1alpha1.Run, choosing feature flag route with enable-api-field sounds reasonable. Six months grace period (or something similar) is sufficient for the users to test their custom tasks controller before migrating in production.

@XinruZhang
Copy link
Copy Markdown
Member Author

Oh actually, if we go with the feature flag solution, do we want to define a new feature flag instead of enable-api-fields for the version switch? A potential use case is: what if a customer wants to use both alpha features and v1beta1.CustomRun.

@abayer
Copy link
Copy Markdown
Contributor

abayer commented Nov 3, 2022

Oh actually, if we go with the feature flag solution, do we want to define a new feature flag instead of enable-api-fields for the version switch? A potential use case is: what if a customer wants to use both alpha features and v1beta1.CustomRun.

Oh, definitely a new feature flag. I'm thinking of this similarly to the embedded-status feature flag - where the purpose isn't to gate a new feature or behavior, but to support a smooth transition to a reworking of existing behavior. In my mind, it's not exactly the same, in that we defaulted embedded-status to the "existing" behavior of fully embedding TaskRun/Run statuses in the PipelineRun status and will be switching that early next year, while here, I'd go with defaulting legacy-custom-tasks (or whatever we call the flag) to "false". That would allow for someone who's already using custom tasks to opt in to using v1alpha1.Run, but would ensure that anyone who isn't already using custom tasks would get v1beta1.CustomRuns going forward.

@pritidesai
Copy link
Copy Markdown
Member

Thanks @abayer, I am just thinking aloud here:

I'd go with defaulting legacy-custom-tasks (or whatever we call the flag) to "false". That would allow for someone who's already using custom tasks to opt in to using v1alpha1.Run

Wouldn't that cause two changes for the operator?

  1. Right now with the current production deployment, to support existing custom tasks.
  2. In a few months, when ready to switch to v1beta1.CustomRun.

Defaulting to v1alpha1.Run will require a single change from the operator:

  1. In a few months, when ready to transition to v1beta1.CustomRun

The users can choose to set the flag to v1beta1.CustomRun to test in their staging cluster.

@abayer
Copy link
Copy Markdown
Contributor

abayer commented Nov 3, 2022

That's a fair point. I think the right choice depends on how many users of custom tasks there are out in the wild right now - if having the default be v1beta1.CustomRun would only impact a small number of users who are currently using custom tasks, that feels like the right choice to me.

@jerop
Copy link
Copy Markdown
Member

jerop commented Nov 7, 2022

/kind tep

@tekton-robot tekton-robot added kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 7, 2022
@XinruZhang
Copy link
Copy Markdown
Member Author

/unhold

As we've decided to support both versions gated by a new feature flag, this PR is ready for review now. PTAL!

@tekton-robot tekton-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 8, 2022
@vdemeester
Copy link
Copy Markdown
Member

2. Create a new field (and remove that field later on) in PipelineTask to control which version to use when creating a Custom Task Run out of a PipelineTask. But removing that field is backward incompatible.

I would definitely go for that one then, a feature flag specific to this particular feature. Having it tied to enable-api-fields would be problematic. What if a user wants to use Matrix but still use v1alpha1.Run Custom Task ?

For the sake of it, this is what we say in our api compatibility policy for alpha feature and API.

These features may be dropped at any time, though you will be given at least one release worth of warning.

Run and CustomTask being alpha, this is something authors of CustomTask have to know and should assume. I am not saying we shouldn't have a 6 month period, etc..

I know we discussed it multiple times during the API working group, and in different PRs, and I'll go with whatever we collectively decide, but…

I think one of the reason we are doing this (feature-flag, support both, …) is because our "views" on what the v1beta1 Custom Task API should look like is not clear — aka on how to support retries and timeouts. Doesn't that just mean we are not ready for "promoting" the Custom Task API ? What we are doing today, is adding extra complexity in the code (feature-flag, multiple types, …), extra work on the user (operator) side of things (set this flag to use CustomTask new API that is not complete, temporarly) and on the Custom Task authors as well (depending on the featureflag, you might need to support both, etc…). Shouldn't we focus on wether we want to support retries in CustomTask, and if we do, how ?

On the same vein, any reason to do a v1beta1 for CustomRun and not a v1alpha1 for them ? Having v1alpha1.CustomRun would at least still "state" that it's alpha feature, can be dropped at any time, etc.. And when we feel v1alpha1.CustomRun is ready (has all the feature we need, etc.), we promote it to v1beta1 ?

@ScrapCodes
Copy link
Copy Markdown
Contributor

ScrapCodes commented Nov 9, 2022

I think one of the reason we are doing this (feature-flag, support both, …) is because our "views" on what the v1beta1 Custom Task API should look like is not clear — aka on how to support retries and timeouts. Doesn't that just mean we are not ready for "promoting" the Custom Task API ? What we are doing today, is adding extra complexity in the code (feature-flag, multiple types, …), extra work on the user (operator) side of things (set this flag to use CustomTask new API that is not complete, temporarly) and on the Custom Task authors as well (depending on the featureflag, you might need to support both, etc…). Shouldn't we focus on wether we want to support retries in CustomTask, and if we do, how ?

👍 I agree ! Adding two source paths -> extra complexity to the code and also for the custom tasks. Infact, my team was discussing we will build our own layer similar to tekton i.e. the ability to switch b/w 2 API using feature-flag for custom task controllers as well.

On the same vein, any reason to do a v1beta1 for CustomRun and not a v1alpha1 for them ? Having v1alpha1.CustomRun would at least still "state" that it's alpha feature, can be dropped at any time, etc.. And when we feel v1alpha1.CustomRun is ready (has all the feature we need, etc.), we promote it to v1beta1 ?

Even after v1beta1 for CustomRun we can keep some alpha features as alpha behind feature-flag.

@vdemeester
Copy link
Copy Markdown
Member

Even after v1beta1 for CustomRun we can keep some alpha features as alpha behind feature-flag.

Right, but a v1beta1 API is more constraint in what we can do. For example, we cannot remove field quickly, we have to add new fields in a backward compatible way, etc.. With an alpha API, we do not have to do that (and that's the main benefit of using an alpha API until we are ready).

@lbernick
Copy link
Copy Markdown
Member

lbernick commented Nov 9, 2022

In terms of promotion readiness: It seems to me like the best path forward would be to resolve TEP-0121 before releasing custom tasks beta.

In terms of what the migration plan should be: even if there were no changes between v1alpha1 Run and v1beta1 CustomRun, we would still want to have a feature flag IMO, because you need to update your controllers to use v1beta1, and having the flag makes the migration easier. (This would still be true even if we didn't rename from Run -> CustomRun.) IMO we should default to alpha and swap to beta after 1 release, and remove the flag after a few more releases.

I'm confused about how gating custom task fields behind feature flags would work, although that's probably a discussion for the future. One potential compatibility policy that would make sense for CustomRun is that we can introduce new "optional" fields, but must give 9 months of warning if we want to make these fields required for all custom task controllers to support. (In general I think we should hesitate before adding anything to CustomRun spec or status, and ask ourselves if it would make sense in the custom spec or custom status instead.)

@vdemeester
Copy link
Copy Markdown
Member

In terms of what the migration plan should be: even if there were no changes between v1alpha1 Run and v1beta1 CustomRun, we would still want to have a feature flag IMO, because you need to update your controllers to use v1beta1, and having the flag makes the migration easier. (This would still be true even if we didn't rename from Run -> CustomRun.) IMO we should default to alpha and swap to beta after 1 release, and remove the flag after a few more releases.

Imho it does not 😅. The feature-flags controls what the "user" (operator) wants to use as API. The fact that a Custom Task controller listen to an api version, or another (or both), is outside its control. The only thing it allows is to be able to use a non-updated Custom Task a little bit longer (the time we support the feature flag). Supporting v1beta1.CustomRun is entirely on the Custom Task author side of things.

@lbernick
Copy link
Copy Markdown
Member

lbernick commented Nov 9, 2022

In terms of what the migration plan should be: even if there were no changes between v1alpha1 Run and v1beta1 CustomRun, we would still want to have a feature flag IMO, because you need to update your controllers to use v1beta1, and having the flag makes the migration easier. (This would still be true even if we didn't rename from Run -> CustomRun.) IMO we should default to alpha and swap to beta after 1 release, and remove the flag after a few more releases.

Imho it does not 😅. The feature-flags controls what the "user" (operator) wants to use as API. The fact that a Custom Task controller listen to an api version, or another (or both), is outside its control. The only thing it allows is to be able to use a non-updated Custom Task a little bit longer (the time we support the feature flag). Supporting v1beta1.CustomRun is entirely on the Custom Task author side of things.

oh ok interesting. Imagine we are not going to make any changes to v1beta1 CustomRun compared to v1alpha1 Run. What would be your preferred migration plan in this case? Would you still want the feature flag, and what would its defaults/timeline be?

@vdemeester
Copy link
Copy Markdown
Member

oh ok interesting. Imagine we are not going to make any changes to v1beta1 CustomRun compared to v1alpha1 Run. What would be your preferred migration plan in this case? Would you still want the feature flag, and what would its defaults/timeline be?

I think I would announce ahead of time that in X release, tektoncd/pipeline will switch using v1beta1.CustomRun, to give time to Custom Task author to add support for it. Then, it becomes a choice of the user (operator) to keep using Custom Task not up-to-date by not upgrading to that version of pipeline (or we could add a feature flag to still create v1alpha1.Run but.. in the end, if the Custom Task author do not upgrade their task, users will have to move away from those or create their own).

@XinruZhang
Copy link
Copy Markdown
Member Author

XinruZhang commented Nov 9, 2022

IMHO, the biggest benefit that feature flag provides is to give custom task authors and end users enough time to test out the new Beta API, and at the same time, they are also able to use other new features or security patch etc. introduced after beta release. Otherwise, they have to make sure custom task beta works before updating to newer releases.

@XinruZhang
Copy link
Copy Markdown
Member Author

Having it tied to enable-api-fields would be problematic. What if a user wants to use Matrix but still use v1alpha1.Run Custom Task ?

@vdemeester I'm not sure If I uderstand correctly here: do you mean after making the new feature flag (say custom-task-version) default to beta, using alpha Matrix together with v1alpha1.Run is problematic?

@lbernick
Copy link
Copy Markdown
Member

lbernick commented Nov 9, 2022

Having it tied to enable-api-fields would be problematic. What if a user wants to use Matrix but still use v1alpha1.Run Custom Task ?

@vdemeester I'm not sure If I uderstand correctly here: do you mean after making the new feature flag (say custom-task-version) default to beta, using alpha Matrix together with v1alpha1.Run is problematic?

I think what vincent is saying is that we don't want to use enable-api-fields to determine what custom task version to use, because e.g. let's say that setting enable-api-fields to alpha enables alpha features and also alpha custom runs, but maybe you want to use an alpha feature and beta custom runs. Using a separate feature flag solves this problem.

@XinruZhang
Copy link
Copy Markdown
Member Author

Closing this PR as we decided to make decisions on TEP-0121 before promoting Custom Task Beta.

@XinruZhang XinruZhang closed this Nov 14, 2022
@lbernick
Copy link
Copy Markdown
Member

Even though we have made a decision on TEP-0121, I still think it's valuable for the TEP to capture the discussion around how we will introduce v1beta1 CustomRuns (i.e. feature flag vs new field in the api vs just swapping to creating CustomRuns instead of Runs). It's not clear that we made a decision on that

@jerop
Copy link
Copy Markdown
Member

jerop commented Nov 14, 2022

@lbernick will open a PR addressing that

@XinruZhang
Copy link
Copy Markdown
Member Author

Oh sorry I missed this thread notification, and yes this is related to migration Plan, thanks for capturing this! Since @lbernick is going to open a PR to address it. I'll leave this as closed.

@XinruZhang XinruZhang mentioned this pull request Nov 18, 2022
6 tasks
@XinruZhang XinruZhang reopened this Nov 18, 2022
@tekton-robot tekton-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 18, 2022
@pritidesai
Copy link
Copy Markdown
Member

/assign @jerop
/assign @vdemeester
/assign @pritidesai

Waiting on the approval

@tekton-robot tekton-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 21, 2022
Comment thread teps/0114-custom-tasks-beta.md Outdated
Support both v1alpha1.Run and v1beta1.CustomRun for 4 LTS releases
after Beta release so that custom task end users and developers are
able to fully test and migrate Custom Task from alpha to beta

To achieve this, this commit introduces a new feature flag to control
the version switch between alpha and beta when defining a custom task
within a Pipeline.
@tekton-robot tekton-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 24, 2022
@tekton-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jerop, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pritidesai
Copy link
Copy Markdown
Member

API WG - ready to merge

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 28, 2022
@tekton-robot tekton-robot merged commit 524ca13 into tektoncd:main Nov 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

Status: UnAssigned

Development

Successfully merging this pull request may close these issues.

8 participants