Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pulumi fails to upgrade chart when SSA enabled #2215

Closed
hagaibarel opened this issue Oct 26, 2022 · 8 comments · Fixed by #2266
Closed

Pulumi fails to upgrade chart when SSA enabled #2215

hagaibarel opened this issue Oct 26, 2022 · 8 comments · Fixed by #2266
Labels
area/server-side-apply impact/usability Something that impacts users' ability to use the product easily and intuitively kind/bug Some behavior is incorrect or out of spec kind/question Questions about existing features resolution/fixed This issue was fixed

Comments

@hagaibarel
Copy link

hagaibarel commented Oct 26, 2022

What happened?

After initial installation of a chart, following actions (up/preview) fail because some fields on resources are manged by the installed components.

Steps to reproduce

Consider the following pulumi program:

package main

import (
	"github.com/pulumi/pulumi-kubernetes/sdk/v3/go/kubernetes/helm/v3"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := helm.NewChart(ctx, "kruise", helm.ChartArgs{
			Chart:   pulumi.String("kruise"),
			Version: pulumi.String("1.3.0"),
			FetchArgs: helm.FetchArgs{
				Repo: pulumi.String("https://openkruise.github.io/charts/"),
			},
			Values: pulumi.Map{
				"manager": pulumi.Map{
					"replicas": pulumi.Int(1),
				},
			},
		})
		if err != nil {
			return err
		}

		return nil
	})
}

Initial installation works fine, but when trying to run the next operation (e.g. preivew), the follow error occuers:

Previewing update (dev):
     Type                                                                             Name                                     Plan       Info
     pulumi:pulumi:Stack                                                              kruise-ssa-dev                                      1 error
     └─ kubernetes:helm.sh/v3:Chart                                                   kruise                                              
 ~      └─ kubernetes:admissionregistration.k8s.io/v1:ValidatingWebhookConfiguration  kruise-validating-webhook-configuration  update     [diff: ~metadata,webhooks]; 1 error
 
Diagnostics:
  kubernetes:admissionregistration.k8s.io/v1:ValidatingWebhookConfiguration (kruise-validating-webhook-configuration):
    error: Preview failed: 1 error occurred:
    	* the Kubernetes API server reported that "kruise-validating-webhook-configuration" failed to fully initialize or become live: use `pulumi.com/patchForce` to override the conflict: Apply failed with 20 conflicts: conflicts with "kruise-manager" using admissionregistration.k8s.io/v1:
    - .metadata.annotations.template
    - .webhooks[name="vadvancedcronjob.kb.io"].clientConfig.caBundle
    - .webhooks[name="vbroadcastjob.kb.io"].clientConfig.caBundle
    - .webhooks[name="vbuiltindeployment.kb.io"].clientConfig.caBundle
    - .webhooks[name="vbuiltinreplicaset.kb.io"].clientConfig.caBundle
    - .webhooks[name="vbuiltinstatefulset.kb.io"].clientConfig.caBundle
    - .webhooks[name="vcloneset.kb.io"].clientConfig.caBundle
    - .webhooks[name="vcustomresourcedefinition.kb.io"].clientConfig.caBundle
    - .webhooks[name="vdaemonset.kb.io"].clientConfig.caBundle
    - .webhooks[name="vimagepulljob.kb.io"].clientConfig.caBundle
    - .webhooks[name="vnamespace.kb.io"].clientConfig.caBundle
    - .webhooks[name="vnodeimage.kb.io"].clientConfig.caBundle
    - .webhooks[name="vpodprobemarker.kb.io"].clientConfig.service.name
    - .webhooks[name="vpodprobemarker.kb.io"].clientConfig.service.namespace
    - .webhooks[name="vpodunavailablebudget.kb.io"].clientConfig.caBundle
    - .webhooks[name="vresourcedistribution.kb.io"].clientConfig.caBundle
    - .webhooks[name="vsidecarset.kb.io"].clientConfig.caBundle
    - .webhooks[name="vstatefulset.kb.io"].clientConfig.caBundle
    - .webhooks[name="vuniteddeployment.kb.io"].clientConfig.caBundle
    - .webhooks[name="vworkloadspread.kb.io"].clientConfig.caBundle
 
  pulumi:pulumi:Stack (kruise-ssa-dev):
    error: preview failed

From what I understand, since the caBunlde is set to some dummy value in the chart, and the controller rewrites them once it starts it's now the owner of those fields, yet pulumi is still trying to override them. There's no way to pass pulumi.com/patchForce=true to those resources, nor would I like to as it can break the webhook ca configuration.

Expected Behavior

pulumi should not try to update fields owned by another entity and opeartions should work

Actual Behavior

See repro above

Output of pulumi about

CLI          
Version      3.44.0
Go Version   go1.19.2
Go Compiler  gc

Plugins
NAME        VERSION
go          unknown
kubernetes  3.22.0

Host     
OS       ubuntu
Version  22.04
Arch     x86_64

This project is written in go: executable='/usr/local/go/bin/go' version='go version go1.19.2 linux/amd64'

(redacted stack and backend info)

Dependencies:
NAME                                        VERSION
github.com/pulumi/pulumi-kubernetes/sdk/v3  3.22.0
github.com/pulumi/pulumi/sdk/v3             3.44.1

Pulumi locates its logs in /tmp by default

Additional context

I've seeing this happening in other charts as well (e.g. kube-state-metrics) where other controllers update fields and are set as owners but in those cases there was a way to pass forcePatch

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

@hagaibarel hagaibarel added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Oct 26, 2022
@roothorp
Copy link

Hi @hagaibarel, thanks for the issue. I'm sorry to hear you're having issues with server-side apply. There are a couple of possible remedies to this issue that we suggest:

@roothorp roothorp added impact/usability Something that impacts users' ability to use the product easily and intuitively kind/question Questions about existing features and removed needs-triage Needs attention from the triage team labels Oct 26, 2022
@hagaibarel
Copy link
Author

Thanks for the reply,

We changed SSA to false for now until we'll have a solution for this.

Setting ignore changes doesn't have any affect on helm chart resources if I'm not missing anything, and from experience in other resources, the pulumi.com/patchForce=true has to be set on the resource for ignore changes to work. I guess the check for managed field ownership happens before the ignore changes option, but that's not possible in this use case.

@lukehoban
Copy link
Member

lukehoban commented Oct 26, 2022

Two other reports of this at #2206 (comment) and #2206 (comment).

@lblackstone @roothorp Beyond disabling this, what do users need to do to get the default configuration of the provider to work in these cases?

@lblackstone
Copy link
Member

We are reverting the default SSA-mode change in #2216, and will follow up with additional testing and documentation for Helm charts. It seems like certain charts always cause conflicts by default, and we'll need to figure out how to address that.

@hagaibarel
Copy link
Author

Another use case where we're seeing issues around helm charts and SSA:

  1. Install and configure registry-creds
  2. Install kube-state-metrics
  3. The next operation will always try to set the service account's imagePullSecret to [] as that's what's the default value in chart, but it has already been updated by registry-creds to contain a ref to a pull secret

In this case, since the sa allows passing annotations and has more flexibility, we're able to work around it but it's basically the same issue

@squaremo
Copy link
Contributor

Just a note to say that the patch release v3.22.1 reverts the change of server-side apply being the default. This doesn't solve the problem with server-side apply with Helm, so I'll leave this issue open. You may want to keep the explicit enableServerSideApply: false as a kind of sentinel, in the meantime.

@hagaibarel
Copy link
Author

Just a note to say that the patch release v3.22.1 reverts the change of server-side apply being the default. This doesn't solve the problem with server-side apply with Helm, so I'll leave this issue open. You may want to keep the explicit enableServerSideApply: false as a kind of sentinel, in the meantime.

Yes, we upgraded and explicitly set the option to false on all of our stack configs. Honestly this has taken us by surprise, it broke all of our pulumi programs (3 different ones, ~20 stacks, managing a total of ~800 k8s resources)

@lblackstone
Copy link
Member

Sorry for the bumpy rollout on this feature. We did test Helm with SSA mode, but missed the case where the Chart conflicts with a controller setting. I've tested the kruise example that you provided, and found that you can work around the problem using the ignoreChanges resource option. Since helm.Chart is provisioning these resources in a Component, you'd need to use transformations to accomplish this. Here's a test case in TypeScript that works as expected:

import * as k8s from "@pulumi/kubernetes";

// Create a Provider with SSA enabled.
const provider = new k8s.Provider("k8s", {
    enableServerSideApply: true,
});

new k8s.helm.v3.Chart("test", {
    chart: "kruise",
    version: "1.3.0",
    fetchOpts: {
        repo: "https://openkruise.github.io/charts/"
    },
    values: {
        manager: {
            replicas: 1
        }
    },
    transformations: [
        (obj: any, opts: pulumi.CustomResourceOptions) => {
            if (obj.kind === "ValidatingWebhookConfiguration" || obj.kind === "MutatingWebhookConfiguration") {
                opts.ignoreChanges = [
                    "metadata.annotations.template",
                    "webhooks[*].clientConfig",
                ];
            }
        }
    ]
}, {provider})

It's unfortunate that this requires more fiddling than it does with Client-side Apply, but it is surfacing a legitimate conflict on those fields compared to what the Pulumi program is setting. I think that the additional rigor will generally be a good thing, but will cause some friction initially as users need to resolve conflicts that were previously undetected.

We will also update the conflict error message to be clearer about all of the resolution options, and when each one should be used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/server-side-apply impact/usability Something that impacts users' ability to use the product easily and intuitively kind/bug Some behavior is incorrect or out of spec kind/question Questions about existing features resolution/fixed This issue was fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants