Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Option for transformer annotations #4267

Conversation

natasha41575
Copy link
Contributor

This is a proposal to extend the buildMetadata.originAnnotations feature that we recently introduced. We would like to add the annotations to generated resources and also have an option to capture information about which transformers have processed each resource.

/cc @KnVerey
/cc @yuwenma
/cc @monopole

/hold to allow everyone to get their reviews in before this is merged

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 4, 2021
@k8s-ci-robot
Copy link
Contributor

@natasha41575: GitHub didn't allow me to request PR reviews from the following users: yuwenma.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

This is a proposal to extend the buildMetadata.originAnnotations feature that we recently introduced. We would like to add the annotations to generated resources and also have an option to capture information about which transformers have processed each resource.

/cc @KnVerey
/cc @yuwenma
/cc @monopole

/hold to allow everyone to get their reviews in before this is merged

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Nov 4, 2021
@natasha41575 natasha41575 changed the title proposal for transformer annotations option Proposal: Option for transformer annotations Nov 4, 2021
@yuwenma
Copy link
Contributor

yuwenma commented Nov 4, 2021

/lgtm

@k8s-ci-robot
Copy link
Contributor

@yuwenma: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

proposals/21-11-transformer-annotations.md Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
kustomize doesn't currently support comments.

We also considered making this a flag to `kustomize build` rather than additional field, but we would like to align
with the kustomize way of avoiding build-time side effects and have everything declared explicitly in the kustomization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the solution you proposed here is more consistent with Kustomize than either alternative. If this truly is a debug-only feature, something else we could do is a separate command like kustomize cfg explain . --resource=Deployment/default/foo (would print meta, origin and transformer info instead of the resource bodies). That would probably require the same annotations under the hood. It could also be used to prove out the usefulness/completeness of the data before we add it to the build command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to extend the motivations of this feature a little bit:

  1. This feature is an extension of the previous originAnnotations. It may make less sense to split the feature to a separate command and not provide an equivalence declarative approach.
  2. The kustomize cfg approach is harder to scale for a large set of resources, which is how the debugging can mostly help with. For example, when it's hard to triage the origin and transformer of 500 deployments from 1000 resource files, users would be unwillingly to run the kustomize cfg cmd 500 times I guess?
  3. "annotations" is designed to attach metadata, no size-limitations. It is ideal for this kind of tracing info and many other tools or libs use it the same way. e.g. Enterprise level: Google Config Sync, OSS: tekton, skaffold.
  4. When thinking about this approach in the bigger picture, say working in a gitOps CICD approach, the git diff of different kustomize output will not only shows the config change, but the buildMetadata origin changes. It will greatly benefit those users who apply kustomize in their enterprise CICD workflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to all points Yuwen described above. While I think something like kustomize cfg explain . --resource=Deployment/default/foo could be useful in some scenarios, I particularly agree with Yuwen's points 2 and 3 above - scalability for a user needing to inspect a large set of resources, and that this metadata is useful for other tools and "annotations" is where they'd expect to find them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the proposed implementation clearly builds on originAnnotations, the primary use case set out sounds pretty different, so I think it's worth considering what the best way to achieve the end user goal in question would be rather than presupposing the implementaiton.

In case (2), what is the ideal user workflow in your opinion? Surely they aren't actually looking through all 500 deployments, but rather targeting one or two that have the problem they're investigating. If there's a use case where they really need to delve through all of the large dataset, let's add it.

Re (3), storage isn't actually free, no matter what API server validations will allow. I'm aware of cases of extremely verbose annotations causing significant scalability problems. So if we're designing this feature with the intent that end users will be committing and deploying the data we write, I think we should indeed be careful about how verbose we're being. On the contrary, if the 99% use case is to debug something before commit, we should be super verbose and as helpful as possible.

Further to that point: if the purpose is debugging, it would be most useful to use absolute paths, not paths relative to the parent Kustomization or the current Kustomize root. If we're writing something to be committed, we obviously cannot do that, and in fact should aim to make the annotation content as stable as possible.

Can you give a more detailed example of why (4) is desirable? By embedding what are effectively implementation details of the generation pipeline in the output, we will make it inherently less stable--not generally a good thing for gitOps. For example, renaming a file will cause a diff even though it does not affect the resources' content. Back to my point above, I think optimizing for content that will be committed is in tension with optimizing for debuggability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add on to Yuwen's point 3, one of the use cases is Google's Config Sync, which takes a kustomization repo, automatically runs kustomize build, and syncs it to a cluster. For these users, it is useful to have these annotations available in the cluster for all deployed resources. If a user notices that something is wrong in the cluster, Config Sync can quickly and easily show the user how the resource was created. An open source tool with a very similar requirement is skaffold, which can run kustomize and deploy the output to a cluster via skaffold deploy. kubectl apply -k also falls under this category.

For a user using kustomize CLI directly, a separate command to debug the output of kustomize makes sense. But for users using other tools that automatically run kustomize and deploy, it is better for them if the annotations persist to the cluster. Many of these users may not even have the kustomize CLI installed to their local machines, and so allowing other tools to add these annotations as part of kustomize build is a huge benefit.

Edited to add: I am also open to adding both an option to buildMetadata to annotate resources and a separate kustomize command to provide users some flexbility in how they use the feature. Since both would be driven by the same annotations under the hood, I think this is a maintainable option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Natasha. Yeah, the actual use case is based on the fact that kustomize is integrated into with some other automated tools and users do not directly interact with kustomize.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the proposal to include a user story as described for our use cases, and added adding a separate kustomize command as a considered alternative. Let me know if you feel strongly about having both the command and the built option; otherwise I believe all of the feedback has been addressed and incorporated into the proposal.

proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
kustomize doesn't currently support comments.

We also considered making this a flag to `kustomize build` rather than additional field, but we would like to align
with the kustomize way of avoiding build-time side effects and have everything declared explicitly in the kustomization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the proposed implementation clearly builds on originAnnotations, the primary use case set out sounds pretty different, so I think it's worth considering what the best way to achieve the end user goal in question would be rather than presupposing the implementaiton.

In case (2), what is the ideal user workflow in your opinion? Surely they aren't actually looking through all 500 deployments, but rather targeting one or two that have the problem they're investigating. If there's a use case where they really need to delve through all of the large dataset, let's add it.

Re (3), storage isn't actually free, no matter what API server validations will allow. I'm aware of cases of extremely verbose annotations causing significant scalability problems. So if we're designing this feature with the intent that end users will be committing and deploying the data we write, I think we should indeed be careful about how verbose we're being. On the contrary, if the 99% use case is to debug something before commit, we should be super verbose and as helpful as possible.

Further to that point: if the purpose is debugging, it would be most useful to use absolute paths, not paths relative to the parent Kustomization or the current Kustomize root. If we're writing something to be committed, we obviously cannot do that, and in fact should aim to make the annotation content as stable as possible.

Can you give a more detailed example of why (4) is desirable? By embedding what are effectively implementation details of the generation pipeline in the output, we will make it inherently less stable--not generally a good thing for gitOps. For example, renaming a file will cause a diff even though it does not affect the resources' content. Back to my point above, I think optimizing for content that will be committed is in tension with optimizing for debuggability.

proposals/21-11-transformer-annotations.md Outdated Show resolved Hide resolved
@natasha41575 natasha41575 force-pushed the TransformerAnnotationsProposal branch 3 times, most recently from 2a02b4c to 07daa8b Compare December 3, 2021 21:55
@natasha41575 natasha41575 removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 10, 2021
@KnVerey
Copy link
Contributor

KnVerey commented Dec 13, 2021

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KnVerey, natasha41575

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [KnVerey,natasha41575]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants