Declaring preprocessor functions #2385

karlkfi · 2021-07-10T04:22:27Z

I want to be able to use a function in a pipeline to perform processing of values AFTER they have been set by a parent package's apply-setters.

The following cluster package works intuitively as a single package, but it doesn't work when it's inside a parent package that also uses apply-setters, because the parent package's mutators are executed after the child package's mutators...

Parent Kptfile:

apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: cluster
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/apply-setters:v0.1
      configPath: setters.yaml

Child Kptfile:

apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: nodepool
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/apply-setters:v0.1
      configPath: setters.yaml
    - image: gcr.io/kpt-fn/starlark:v0.1
      configPath: truncate-service-accounts.yaml

The workaround I know of is to copy the function to the parent package's Kptfile, but if I do that, it will affect all resources in the parent package, not just the resources in the child package. And even if I could scope it to the right package, it would be a hassle to have to copy function config from child packages to parent packages.

One way to solve this might be to have early and late mutators.

Order of execution:

parent "early mutators"
child "early mutators"
child "late mutators"
parent "late mutators"
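
The ordering above can be read as a single recursive walk: run a package's early mutators on the way down, recurse into its children, then run its late mutators on the way back up. Below is a minimal sketch of that reading in Python. The `Package` type and its `early`/`late` fields are hypothetical and exist only for illustration; they are not kpt types or Kptfile fields. How siblings would be interleaved is also an assumption, since the proposal only lists one parent and one child.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Package:
    """Hypothetical stand-in for a kpt package; not a kpt type."""
    name: str
    early: List[str] = field(default_factory=list)   # proposed "early mutators"
    late: List[str] = field(default_factory=list)    # proposed "late mutators"
    children: List["Package"] = field(default_factory=list)


def execution_order(pkg: Package) -> List[str]:
    """Return the order in which functions would run under the proposal."""
    # Early mutators run top-down, before any child is visited.
    order = [f"{pkg.name}: {fn}" for fn in pkg.early]
    for child in pkg.children:
        order.extend(execution_order(child))
    # Late mutators run bottom-up, after all children have finished.
    order.extend(f"{pkg.name}: {fn}" for fn in pkg.late)
    return order


nodepool = Package("nodepool", early=["apply-setters"], late=["starlark"])
cluster = Package(
    "cluster", early=["apply-setters"], late=["set-labels"], children=[nodepool]
)

for step in execution_order(cluster):
    print(step)
```

With this shape, the child's starlark step always sees values already written by the parent's apply-setters, and the parent's late step always sees the child's final output.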

This would kill two birds with one stone, because it would also allow for parent packages to modify the setter values of child packages. I could put apply-setters as an "early mutator" and the starlark function in a "late mutator" and it would affect the setter values as passed from parent to child without affecting the parent or sibling packages.

For example...

Parent setters:

apiVersion: v1
kind: ConfigMap
metadata:
  name: setters
  annotations:
    config.kubernetes.io/local-config: "true"
data:
  name: cluster-1
  project-id: example-1234

Child setters:

apiVersion: v1
kind: ConfigMap
metadata:
  name: setters
  annotations:
    config.kubernetes.io/local-config: "true"
data:
  name: pool-1
  cluster-name: cluster-1 # kpt-set: ${name}
  project-id: example-1234 # kpt-set: ${project-id}

Child resource:

apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMServiceAccount
metadata:
  name: gke-cluster-1-pool-1 # kpt-set: gke-${cluster-name}-${name}

Service account names are frequently too long when concatenating other names together. With this configuration the starlark function could truncate it after the setters have been applied. And with this pattern, the name setter means "node pool name" in the child package and "cluster name" in the parent package, allowing for hierarchical setter namespacing.
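
To make the truncation step concrete, here is a rough Python sketch of what such a late mutator might do once the setters have been applied. It is not the truncate-service-accounts.yaml starlark config referenced above, which is not shown in this thread. The `render` and `truncate` helpers are hypothetical, and the limit of 30 is an assumption based on the 30-character limit for Google Cloud service account IDs.

```python
import re

# Assumption: Google Cloud service account IDs are limited to 30 characters.
MAX_LEN = 30


def render(pattern: str, setters: dict) -> str:
    """Substitute ${key} references in a kpt-set style pattern."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: setters[m.group(1)], pattern)


def truncate(name: str, max_len: int = MAX_LEN) -> str:
    """Cut the name to max_len and drop any trailing hyphen left by the cut."""
    return name[:max_len].rstrip("-")


# Child setters after the parent has supplied cluster-name.
setters = {"name": "pool-1", "cluster-name": "cluster-1"}
short_name = truncate(render("gke-${cluster-name}-${name}", setters))
print(short_name)

# Longer setter values push the concatenated name past the limit.
long_setters = {"name": "high-memory-pool-1", "cluster-name": "production-cluster-1"}
long_name = truncate(render("gke-${cluster-name}-${name}", long_setters))
print(long_name)
```

Plain truncation can make two different names collide, so a real function would probably want to append a short hash or fail validation instead. That choice is outside the scope of this sketch.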


frankfarzan · 2021-07-12T17:08:35Z

This is a use case that has been discussed, referred to as "pre-processors" as opposed to current execution order where parent package functions run as "post-processors". This is something that can be added incrementally as it is purely additive (i.e. via a preprocessors field in the pipeline section) so was left out of v1 scope. We should look at this in v1.1 time frame.

morgante · 2021-07-20T02:49:21Z

Please prioritize this. The lack of this basically makes setter inheritance unworkable as currently implemented. Let's say I want to apply a setter (ex. organization-id) but some functions in lower packages depend on that setter value. The functions in subpackages will always run before the parent apply-setters function, meaning my only option is to manually copy the setter value down into my subpackage.

phanimarupaka · 2021-07-22T22:27:34Z

@karlkfi Not proposing a solution here, just trying to understand the problem better: does it help if we make the order of hydration configurable? Currently, the default order is bottom-up, where children are hydrated before parents. What if we make it configurable, so that parents can be hydrated first and then children (top-down)?

morgante · 2021-07-22T22:37:07Z

@phanimarupaka That would not solve the problem, as we still have cases where parents need to be applied first (ex. setter inheritance, the one I mentioned). Shifting the burden to users via configuration isn't sufficient.

phanimarupaka · 2021-07-22T22:54:20Z

So for this case, having early-mutators and early-validators, which are applied in top-down fashion (an additive change), alongside mutators and validators, which are applied in bottom-up fashion (the current behavior), should help, I guess. Advanced users can pick and choose where to include the functions.

phanimarupaka · 2021-07-23T00:20:05Z

@morgante Another way is to use a target selector for functions. The starlark function can be moved to the parent package, with its targets specified as mentioned in #2015.

morgante · 2021-07-23T01:29:46Z

@phanimarupaka No, that is not a solution. The entire point is that the parent package should not need to know the details of child packages. Once you start tightly coupling them like that, you've lost the package boundary.

We need either a form of pre and post hooks, or a way to assign weights to functions.

karlkfi · 2021-07-23T03:37:27Z

Morgante is right. The point is to be able to re-use packages by wrapping them in a parent package and have the parent be able to mutate the inputs and outputs of the child package, without needing the user to manually modify the child package directly.

In the common case, the setters are inputs and the resulting yaml is the output. Early mutators allow modifying the inputs. Late mutators allow modifying the outputs.

I prefer this to weights because any new parent package can be added as a wrapper without the child being able to “escape” the parent’s control. It’s encapsulation, like a function calling another function and being able to modify arguments and return values.
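
The function-call analogy can be sketched directly in Python. Everything here is illustrative: `child_package`, `wrap`, and the `prod-` prefix are hypothetical names standing in for a child package, a wrapping parent package, and a late mutator. None of them are kpt APIs.

```python
from typing import Callable, Dict

Setters = Dict[str, str]
Render = Callable[[Setters], str]


def child_package(setters: Setters) -> str:
    """The child renders a resource name from its own setters."""
    return f"gke-{setters['cluster-name']}-{setters['name']}"


def wrap(child: Render, overrides: Setters, late: Callable[[str], str]) -> Render:
    """Parent wrapper: adjust the child's inputs, then post-process its output."""
    def wrapped(setters: Setters) -> str:
        inputs = {**setters, **overrides}   # early mutator: change the arguments
        return late(child(inputs))          # late mutator: change the return value
    return wrapped


parent = wrap(
    child_package,
    overrides={"cluster-name": "cluster-1"},
    late=lambda name: f"prod-{name}",
)
result = parent({"name": "pool-1", "cluster-name": "placeholder"})
print(result)
```

The child never sees the wrapper and cannot reorder itself relative to it, which is the property that weights would not guarantee.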

droot · 2021-07-23T19:01:51Z

@phanimarupaka about:

> does it help if we make the order of hydration configurable?

That won't help technically, because we don't allow operating on meta resources during render, whereas in this case the primary objective of top-down processing is to customize the input to the pipeline.

> The point is to be able to re-use packages by wrapping them in a parent package and have the parent be able to mutate the inputs and outputs of the child package, without needing the user to manually modify the child package directly.
> In the common case, the setters are inputs and the resulting yaml is the output. Early mutators allow modifying the inputs. Late mutators allow modifying the outputs.

Thanks @karlkfi. This articulates the use case so well conceptually.

@bgrant0607 also touched on this in a comment on #1280.

I think we should design this together with the filter functionality, because the two seem to interact here.

phanimarupaka · 2021-07-23T19:13:22Z

> Morgante is right. The point is to be able to re-use packages by wrapping them in a parent package and have the parent be able to mutate the inputs and outputs of the child package, without needing the user to manually modify the child package directly.

So the early mutators should act only on the inputs, and early validators should validate the inputs to solve this problem. The functions declared in the pre-processors section (which holds early mutators and validators) should act only on the Kptfile and function config files (meta resources), not on actual resources. I feel that this is pretty powerful in terms of input value mutation and validation. This would also solve issues like #2041.

Some prior art in this space is the PersistentPreRun hook in Cobra: https://github.com/spf13/cobra/blob/v1.2.1/user_guide.md#prerun-and-postrun-hooks

> In the common case, the setters are inputs and the resulting yaml is the output. Early mutators allow modifying the inputs. Late mutators allow modifying the outputs.

> I prefer this to weights because any new parent package can be added as a wrapper without the child being able to “escape” the parent’s control. It’s encapsulation, like a function calling another function and being able to modify arguments and return values.

droot · 2021-07-23T19:36:17Z

> The functions declared in pre-processors section (which holds early mutators and validators) should act only on the Kptfile/function config (meta resources) files and not actual resources.

Not sure we need such a restriction. Being able to modify a resource in a pre-processor is also a way to configure the input to a package's pipeline. For example, changing the replica count of a Kafka cluster in the resource yaml, and then having a reconcile function change the DNS names based on the new replica count, is a simple use case.
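
A rough Python sketch of that Kafka use case follows. The resource shape, the `set_replicas` and `reconcile_dns` helpers, and the `example.internal` domain are all hypothetical and chosen only to illustrate the idea that a pre-processor can edit a resource that the package's own pipeline then consumes.

```python
from typing import Dict, List


def set_replicas(resource: Dict, replicas: int) -> Dict:
    """Pre-processor step: change an input field on the resource itself."""
    updated = dict(resource)
    updated["spec"] = {**resource["spec"], "replicas": replicas}
    return updated


def reconcile_dns(resource: Dict) -> List[str]:
    """Later step in the package's pipeline: derive DNS names from replicas."""
    name = resource["metadata"]["name"]
    count = resource["spec"]["replicas"]
    return [f"{name}-{i}.{name}.example.internal" for i in range(count)]


kafka = {"metadata": {"name": "kafka"}, "spec": {"replicas": 1}}
scaled = set_replicas(kafka, 3)     # parent pre-processor edits the resource
dns_names = reconcile_dns(scaled)   # child function reacts to the new value
print(dns_names)
```

Here the resource itself, not a setter or function config, is the input being customized.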

phanimarupaka · 2021-07-23T19:52:14Z

> > The functions declared in pre-processors section (which holds early mutators and validators) should act only on the Kptfile/function config (meta resources) files and not actual resources.

> Not sure we need such a restriction. Being able to modify a resource in a pre-processor is also a way to configure the input to a package's pipeline. For example, changing the replica count of a Kafka cluster in the resource yaml, and then having a reconcile function change the DNS names based on the new replica count, is a simple use case.

Agree. I think we got good input regarding the problem statement. We can switch to an internal design doc to finalize the design.

karlkfi · 2021-07-23T20:20:28Z

> Not sure we need such a restriction. Being able to modify a resource in a pre-processor is also a way to configure the input to a package's pipeline.

Agreed. Because kpt uses in-place hydration, all of the outputs may be used as inputs.

RafalMaleska · 2022-12-07T11:05:49Z

Stumbled upon this as well. In our case it makes using child-parent Kptfiles difficult.

Current workaround: a wrapper script.

karlkfi added the enhancement (New feature or request) label Jul 10, 2021

frankfarzan changed the title from "Early & Late Mutator Functions" to "Declaring preprocessor functions" Jul 12, 2021

frankfarzan added the area/hydrate label Jul 12, 2021

frankfarzan added this to To do in kpt kanban board via automation Jul 12, 2021

frankfarzan added the triaged (Issue has been triaged by adding an `area/` label) label Jul 12, 2021

frankfarzan added this to the v1.1 milestone Jul 12, 2021

mikebz removed this from the v1.1 milestone Jul 14, 2021

karlkfi mentioned this issue Jul 20, 2021: "Upgrade GKE package to kpt v1" GoogleCloudPlatform/blueprints#40 (Merged)

morgante mentioned this issue Jul 21, 2021: "org-id quirk in hierarchy from fn eval order" GoogleCloudPlatform/blueprints#43 (Closed)

phanimarupaka self-assigned this Jul 21, 2021

phanimarupaka added this to ToDo in kpt kanban board Jul 26, 2021

phanimarupaka moved this from ToDo to In progress in kpt kanban board Jul 26, 2021

droot mentioned this issue Aug 2, 2021: "Adds configmap injector function" GoogleContainerTools/kpt-functions-catalog#508 (Closed)

phanimarupaka mentioned this issue Aug 2, 2021: "[POC] Early-mutators and early-validators for render" #2424 (Closed)

phanimarupaka moved this from In progress to In Review in kpt kanban board Aug 11, 2021

phanimarupaka moved this from In Review to In progress in kpt kanban board Aug 11, 2021

phanimarupaka added the customer deep engagement label Aug 21, 2021

phanimarupaka added this to the Q3-2021 milestone Aug 24, 2021

phanimarupaka moved this from In progress to ToDo in kpt kanban board Sep 7, 2021

phanimarupaka modified the milestones: Q3-2021, Q4-2021 Sep 22, 2021

droot removed this from ToDo in kpt kanban board Jan 19, 2022

droot assigned droot and unassigned droot and phanimarupaka Mar 21, 2022

bgrant0607 mentioned this issue May 10, 2022: "Could we just use kustomize for transformations?" #3121 (Open)

droot added the design-doc label May 27, 2022
