Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include-file function #2350

Open
frankfarzan opened this issue Jun 29, 2021 · 7 comments
Open

include-file function #2350

frankfarzan opened this issue Jun 29, 2021 · 7 comments
Labels
area/hydrate enhancement New feature or request p1 triaged Issue has been triaged by adding an `area/` label

Comments

@frankfarzan
Copy link
Contributor

frankfarzan commented Jun 29, 2021

A weakness of execution configuration pattern is that it interleaves configuration and code. For example, gatekeeper constraint templates wrap Rego code in KRM. Similarly, starlark code is wrapped in StarlarkRun resource. We can improve the development UX by providing an imperative pre-processor function that injects the content of a file into a particular field of a KRM resource. Note that this is useful for gatekeeper admission controller, even when not using the gatekeeper kpt function.

Strawman:

Assume one or more resources (which may be deeply nested) in the current directory have a comment directive pointing to a file containing code, e.g.:

apiVersion: fn.kpt.dev/v1alpha1
kind: StarlarkRun
metadata:
  name: set-namespace-to-prod
# kpt-include: myscript.star
source: |
...

This script references myscript.star in the same directory:

def setnamespace(resources, namespace):
  for resource in resources:
    # mutate the resource
    resource["metadata"]["namespace"] = namespace
setnamespace(ctx.resource_list["items"], "prod")

And we have a gatekeeper template:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata: # kpt-merge: /k8sbannedconfigmapkeysv1
  name: k8sbannedconfigmapkeysv1
spec:
  crd:
    spec:
      names:
        kind: K8sBannedConfigMapKeysV1
        validation:
          openAPIV3Schema:
            properties:
              keys:
                type: array
                items:
                  type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      # kpt-include: myscript.rego
      rego: |
...

That references myscript.rego:

package ban_keys

violation[{"msg": sprintf("%v", [val])}] {
  keys = {key | input.review.object.data[key]}
  banned = {key | input.parameters.keys[_] = key}
  overlap = keys & banned
  count(overlap) > 0
  val := sprintf("The following banned keys are being used in the ConfigMap: %v", [overlap])
}

After running the imperative function:

$ kpt fn eval -i gcr.io/kpt-fn/include-file --mount-current

(See #2351)

All resources containing the kpt-include comment directive are expanded:

apiVersion: fn.kpt.dev/v1alpha1
kind: StarlarkRun
metadata:
  name: set-namespace-to-prod
# kpt-include: myscript.star
source: |
  # set the namespace on all resources
  def setnamespace(resources, namespace):
    for resource in resources:
      # mutate the resource
      resource["metadata"]["namespace"] = namespace
  setnamespace(ctx.resource_list["items"], "prod")
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata: # kpt-merge: /k8sbannedconfigmapkeysv1
  name: k8sbannedconfigmapkeysv1
spec:
  crd:
    spec:
      names:
        kind: K8sBannedConfigMapKeysV1
        validation:
          openAPIV3Schema:
            properties:
              keys:
                type: array
                items:
                  type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
     # kpt-include: myscript.rego
      rego: |-
        package ban_keys

        violation[{"msg": sprintf("%v", [val])}] {
          keys = {key | input.review.object.data[key]}
          banned = {key | input.parameters.keys[_] = key}
          overlap = keys & banned
          count(overlap) > 0
          val := sprintf("The following banned keys are being used in the ConfigMap: %v", [overlap])
        }

Considerations:

  • Potential for abuse of this function. We don't want to encourage its use for maintaining partial resource configuration as a DRY technique (i.e. injecting KRM into KRM). Injecting into a single field of type string mitigates this. Should also consider a more descriptive name.
  • Whether to support relative paths or only allow the file in the same directory as the KRM. The latter is fine to start with to keep things simple.
@frankfarzan frankfarzan added this to To do in kpt kanban board via automation Jun 29, 2021
@frankfarzan frankfarzan added this to the v1.1 milestone Jun 29, 2021
@frankfarzan frankfarzan added the triaged Issue has been triaged by adding an `area/` label label Jun 29, 2021
@bgrant0607
Copy link
Contributor

I agree with the injection idea in order to reduce boilerplate and facilitate separation of code and data, and possibly other preprocessing. This could also facilitate testing.

A non-kpt ConstraintTemplate generation example:
https://github.com/plexsystems/konstraint/tree/main/examples/container-deny-latest-tag

Possibly another issue would be warranted, but functions are currently too verbose. For example, setting an annotation should only require a couple of lines of code. The current function contains 150.
https://github.com/GoogleContainerTools/kpt-functions-catalog/blob/master/functions/go/set-annotations/main.go

Other Go implementations that set annotations are also fairly long, largely because they include a lot of code that could be in a more generic harness:
https://github.com/kubernetes-sigs/kustomize/blob/master/plugin/builtin/annotationstransformer/AnnotationsTransformer.go
https://github.com/kubernetes-sigs/kustomize/blob/master/api/filters/annotations/annotations.go
https://github.com/jenkins-x-plugins/jx-gitops/blob/main/pkg/cmd/annotate/annotate.go
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/kubectl/pkg/cmd/annotate/annotate.go

Interpreted languages especially have the potential for simplification, but even our typescript functions are much longer than is ideal and would benefit from kernel injection or composition:
https://github.com/GoogleContainerTools/kpt-functions-catalog/blob/master/functions/ts/kubeval/src/kubeval.ts

Metacontroller did a decent job of facilitating simple transformation kernels:
https://github.com/GoogleCloudPlatform/metacontroller/blob/master/examples/catset/sync.js

In particular, transformers and validators would benefit from a simple filter kernel framework, with filtering implemented in the orchestrator, as discussed in #2364.

@karlkfi @droot

@frankfarzan frankfarzan changed the title include-file funnction include-file function Jul 9, 2021
@mgoltzsche
Copy link

mgoltzsche commented Jul 11, 2021

Coming from #1151 (comment) supporting such an include comment directive sounds interesting.

My use cases that could benefit from such an include functionality are as follows:

  • Loading a kustomization including JSON patches as input of a kustomize pipeline function.
  • Loading a local Helm chart's templates as input of a khelm function.

In both cases the function could define a custom resource kind (e.g. FileBlob) used as wrapper resource for non-KRM contents and the comment directive used to add a file's contents to it.
However creating a separate resource for each required non-KRM file could be very cumbersome. Ideally users should be able to just point to a directory (within a Kptfile?) whose files should be wrapped into FileBlob resources (defining the resource kind within the kpt project) - in that case the include comment directive wouldn't be involved.
The Helm chart use case comes with an additional problem: Helm templates need to be maintained outside the kpt function source directory since they contain *.yaml files that actually aren't yaml which makes the kpt source yaml parser fail otherwise.
Therefore one would either need to be able to let the comment directive refer to files outside the function source directory (which should be avoided due to security reasons) or prevent kpt from parsing yaml files that are included as wrapped KRM resources.

@linde
Copy link
Contributor

linde commented Dec 1, 2021

@maxsmythe and @julianKatz and @willbeason -- just FYI, not sure if this is on your radar

@maxsmythe
Copy link

Would this function only modify files in-place, or could it redirect the output to a separate directory, leaving the original YAML untouched?

@maxsmythe
Copy link

Also, at what granularity could/does this operate? If I point it to a directory tree, will it perform all substitutions in that directory tree?

If I'm not modifying the file in-place, will the structure of the output's dirtree be maintained?

@bgrant0607
Copy link
Contributor

bgrant0607 commented Jul 14, 2022

Rather than mounting the current working directory, another approach could be including non-KRM files in the ResourceList by wrapping them with KRM: #3118. That's probably the approach we'll need for porch. Like the FileBlob proprosal above.

@marniks7
Copy link

Hi,
few more cases for this feature.

Include non-yaml files (sh scripts, python scripts, etc)

Tekton allows to add content of the script to yaml file

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: kubectl-run
spec:
  taskRef:
    name: kubernetes-actions
  params:
    - name: script
      value: |
        kubectl get pods 
        echo "-----------"
        kubectl get deploy

This is fine when the script is small, but when it become bigger you may want to move it to a separate file to get IDE features

e.g.
my-script.sh

#!/bin/bash
kubectl get pods 
echo "-----------"
kubectl get deploy

Include Markdown files

In order to describe Pipeline in Tekton can be used description field

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: cep0001
  namespace: tekton-namespace
spec:
  description: |
    TBD. include-file @cep0001.md

I would like to store description separately from the pipeline and include it into Pipeline with kpt
cep0001.md

Chaos Scenario CEP0001

Link: TBD

Flow:
1. Get pods
2. Kill master redis pod
3. Get pods

Include YAML files

I can have PipelineRun where my input parameter is YAML as well and I want IDE features to be able to change that yaml separately

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: cep0002-run
spec:
  params:
    - name: chaos-mesh-yaml
      value: |
         TBD include pod-kill.yaml

pod-kill.yaml

kind: PodChaos
apiVersion: chaos-mesh.org/v1alpha1
metadata:
  namespace: target-namespace
  name: pod-kill
  annotations:
    config.kubernetes.io/local-config: "true"
spec:
  selector:
    namespaces:
      -  target-namespace
    labelSelectors:
      app.kubernetes.io/component: master
      app.kubernetes.io/name: redis
  mode: one
  action: pod-kill
  gracePeriod: 0

Note: I would like to modify pod-kill.yaml by kpt before including into file

Partial YAMLs included into one YAML

https://github.com/argoproj/argo-workflows/blob/master/examples/handle-large-output-results.yaml
This huge yaml can be split into different parts with the functions where to upsert it

templates:
    - name: echo
      inputs:
        parameters:
          - name: index
        artifacts:
          - name: items
            path: /tmp/items
      container:
        image: stedolan/jq:latest
        command: [sh, -c]
        # jq is just one way to get the appropriate item.
        args: ["cat /tmp/items | jq '.[{{inputs.parameters.index}}]'"]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/hydrate enhancement New feature or request p1 triaged Issue has been triaged by adding an `area/` label
Projects
None yet
Development

No branches or pull requests

8 participants