
increasing minMasterVersion of google container cluster previews replacement instead of update #88

Closed
geekflyer opened this issue Feb 2, 2019 · 20 comments

@geekflyer

geekflyer commented Feb 2, 2019

We have a bunch of GKE clusters.
I was just planning to upgrade one of their masters by bumping minMasterVersion.
Unfortunately, when doing so, pulumi preview says it will replace the entire cluster instead of updating it. I'm not sure if this is just an issue with preview, but I'm pretty sure bumping the master version shouldn't replace the entire cluster, and replacing a cluster is a pretty scary operation. As an intermediate workaround I upgraded the cluster via GCP's UI and left the pulumi minMasterVersion param untouched.

Example:

existing cluster with:

export const cluster = new gcp.container.Cluster('api-cluster', {
  name: 'foo',
  initialNodeCount: 1,
  minMasterVersion: '1.10.6-gke.1',
});

When changing minMasterVersion to 1.10.12-gke.1, pulumi preview shows the following:

Previewing update (acme/api-cluster-prod-b):

     Type                                                           Name                                   Plan        Info
     pulumi:pulumi:Stack                                            api-cluster-prod-b-api-cluster-prod-b
 +-  ├─ gcp:container:Cluster                                       api-cluster                            replace     [diff: ~minMasterVersion]
@lukehoban
Member

I just tried this with the latest GCP provider, and I see an update, not a replace, when using the same code as above.

Previewing update (dev):

     Type                      Name         Plan       Info
     pulumi:pulumi:Stack       gcp88-dev
 ~   └─ gcp:container:Cluster  api-cluster  update     [diff: ~minMasterVersion]

Resources:
    ~ 1 to update
    1 unchanged

The one change I had to make was to use 1.10.12-gke.1 as the before and 1.11.6-gke.3 as the after, because the version you list above is no longer supported for new clusters. It's unclear to me how that could affect the results, though.

Just to make sure - was this the only property change you made? Is your real use case a more complex configuration of the Cluster?

@lukehoban lukehoban self-assigned this Feb 2, 2019
@lukehoban lukehoban added this to the 0.21 milestone Feb 2, 2019
@lukehoban
Member

@geekflyer I wasn't able to reproduce this after following the steps you describe. If there were any other properties set on the resource, or changes made, that would help reproduce, let me know and we'll reopen. It's very possible this was hitting something like the issue in pulumi/pulumi-azure#182, but I can't confirm that without a reproduction of the issue unfortunately.

@RichardWLaub

I am seeing the same issue while attempting to update from 1.11.7-gke.12 to 1.12.5-gke.5:

+-  └─ gcp:container:Cluster         local-dev              replace     [diff: ~minMasterVersion]

Here are all the properties I am setting on the resource:

    const k8sCluster = new gcp.container.Cluster(name, {
        name,
        masterAuthorizedNetworksConfig: {
            cidrBlocks: k8sAllowedCidrBlocks
        },
        minMasterVersion,
        project,
        region,
        privateClusterConfig: {
            enablePrivateNodes: true,
            masterIpv4CidrBlock: '172.16.2.0/28'
        },
        ipAllocationPolicy: {
            createSubnetwork: true,
            subnetworkName: subnetName
        },
        network: network.selfLink,
        removeDefaultNodePool: true,
        nodePools: [
            {
                name: 'default-pool',
                nodeCount: 0
            }
        ]
    });

@lukehoban let me know if there is any additional information I can provide to help troubleshoot this.
The resources were originally created on version 0.16.9. The update doesn't work on that version nor does it work on the latest (0.18.0).

@lukehoban
Member

I was able to reproduce this with the program below.

The output here is misleading due to pulumi/pulumi#2453, but verbose logging shows that the properties forcing replacement are actually:

  • nodePools
  • ipAllocationPolicy

Indeed, if I comment out the single element of the nodePools array, that one goes away and only ipAllocationPolicy remains. This is similar to hashicorp/terraform-provider-google#2115, and I suspect it is an unfortunate interaction with the awkward API design of nodePools + removeDefaultNodePool.

I am not yet clear on exactly why that is triggering a replacement even though there is no change. This is almost certainly related to pulumi/pulumi-terraform#329. But as noted in that issue, every other case of this has been a bug in the upstream Terraform provider. I'll need to spend some more time looking into this to nail down what is triggering it (and whether the same problem exists in Terraform).

import * as gcp from "@pulumi/gcp";

const minMasterVersion = "1.12.5-gke.5";

const name = "hello";
const cidrBlock = "10.0.0.0/16"; 
const project = "pulumi-development";
const region = "us-central1";
const subnetName = "mysubnet";

const network = new gcp.compute.Network("network");

const k8sCluster = new gcp.container.Cluster(name, {
    name,
    masterAuthorizedNetworksConfig: {
        cidrBlocks: [{ cidrBlock, displayName: "mycidr"}],
    },
    minMasterVersion,
    project,
    region,
    privateClusterConfig: {
        enablePrivateNodes: true,
        masterIpv4CidrBlock: '172.16.2.0/28'
    },
    ipAllocationPolicy: {
        createSubnetwork: true,
        subnetworkName: subnetName
    },
    network: network.selfLink,
    removeDefaultNodePool: true,
    nodePools: [
        {
            name: 'default-pool',
            nodeCount: 0
        }
    ]
});

export const endpoint = k8sCluster.endpoint;

@lukehoban lukehoban reopened this Mar 28, 2019
@lukehoban lukehoban modified the milestones: 0.21, 0.22 Mar 28, 2019
@RichardWLaub

hashicorp/terraform-provider-google#3319
I think this PR should fix the ipAllocationPolicy forcing a replacement.

@lukehoban
Member

hashicorp/terraform-provider-google#3319. I think this PR should fix the ipAllocationPolicy forcing a replacement.

Indeed - that's exactly what I was looking for but couldn't find last night. Looks like we'll pull down that fix with the next release of the upstream Terraform provider.

@lukehoban
Member

The nodePools part of this can be addressed by either doing this instead of the dummy inline nodePools:

    removeDefaultNodePool: true,
    initialNodeCount: 1,

or by removing the dummy nodePool entry after the initial update (since you had asked for it to be removed).
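
For reference, a minimal sketch of the first variant (the cluster name and version below are placeholders, not taken from any program in this thread):

import * as gcp from "@pulumi/gcp";

// Sketch only: let GKE create the default pool via initialNodeCount and have
// the provider remove it, instead of declaring a dummy inline nodePools entry.
const cluster = new gcp.container.Cluster("example-cluster", {
    initialNodeCount: 1,
    removeDefaultNodePool: true,
    minMasterVersion: "1.12.5-gke.5",
});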

@geekflyer
Author

geekflyer commented Apr 8, 2019

@lukehoban I'm facing this issue now again with one of our clusters (it actually is our most important production cluster, so this is kinda scary).

Here's a subset of the pulumi program of that cluster:

export const cluster = new gcp.container.Cluster(
  'api-cluster',
  {
    name: clusterName,
    initialNodeCount: 1,
    ipAllocationPolicy: {
      clusterSecondaryRangeName: 'gke-api-cluster-prod-b-pods-a28cc097',
      servicesSecondaryRangeName: 'gke-api-cluster-prod-b-services-a28cc097'
    },
    network: 'default',
    enableKubernetesAlpha: false,
    enableLegacyAbac: false,
    additionalZones,
    loggingService: 'logging.googleapis.com/kubernetes',
    minMasterVersion: gkeMasterVersion,
    monitoringService: 'monitoring.googleapis.com/kubernetes',
    removeDefaultNodePool: true,
    zone: primaryZone
  }
);

What I attempted to do is upgrade the master from 1.10.6-gke.2 to 1.11.8-gke.6 by changing minMasterVersion (and that is literally the only change). Also, coming back to your original question: in the real setup there are several node pools (separate gcp.container.NodePool instances) attached to that cluster (I haven't pasted them here for brevity).

The preview shows it is planning to replace the cluster due to a change in ~minMasterVersion.

versions are:

 "@pulumi/gcp": "0.17.1",
  "@pulumi/kubernetes": "0.22.0",
  "@pulumi/pulumi": "0.17.4",
  "@solvvy/pulumi-util": "0.4.0"

This is kind of blocking me from doing some required maintenance on that cluster, and since it's a high-value prod cluster I can't easily "replace" it :)

In one of the comments above you said "verbose logging shows that the properties forcing replacement are actually ...". Is there a way I can verify this myself? I tried -v 5 --logtostderr, which shows a lot more logs, but nothing obvious tells me what's actually causing the replacement (besides ~minMasterVersion).

I don't really have a separate shareable program for a reproduction but I'm more than happy to jump on a screenshare to debug the issue.

@lukehoban
Member

In one of the comments above you said "verbose logging shows that the properties forcing replacement are actually ...". Is there a way I can verify this myself? I tried -v 5 --logtostderr, which shows a lot more logs, but nothing obvious tells me what's actually causing the replacement (besides ~minMasterVersion).

You can do:

pulumi preview --logtostderr -v=9 2> out.txt

And then in out.txt look for replaces=. If you share all the lines that include that, we can diagnose which field change is triggering replacement.

I am so far aware of upstream Terraform provider issues related to the following:

However, I do not see any of those in your example above.

@geekflyer
Author

geekflyer commented Apr 8, 2019

Ok cool, looks like initialNodeCount is triggering the replacement even though I haven't changed that.

Here's a subset of the logs which I think captures the most interesting parts:

I0408 10:52:04.404414   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: initialNodeCount={1}
...
I0408 10:52:04.404755   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: initialNodeCount={0}
...
I0408 10:52:04.405431   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: initialNodeCount={6}
...
I0408 10:52:04.405734   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: initialNodeCount={1}
...
I0408 10:52:04.406029   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: removeDefaultNodePool={true}
I0408 10:52:04.406037   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: resourceLabels={map[]}
I0408 10:52:04.406046   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: subnetwork={projects/redacted/regions/redacted/subnetworks/default}
I0408 10:52:04.406058   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).olds]: zone={redacted}
I0408 10:52:04.406072   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: additionalZones={[{redacted}]}
I0408 10:52:04.406081   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: enableKubernetesAlpha={false}
I0408 10:52:04.406090   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: enableLegacyAbac={false}
I0408 10:52:04.406098   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: initialNodeCount={1}
I0408 10:52:04.406106   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: ipAllocationPolicy={map[clusterSecondaryRangeName:{gke-api-cluster-prod-b-pods-a28cc097} servicesSecondaryRangeName:{gke-api-cluster-prod-b-services-a28cc097}]}
I0408 10:52:04.406120   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: clusterSecondaryRangeName={gke-api-cluster-prod-b-pods-a28cc097}
I0408 10:52:04.406129   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: servicesSecondaryRangeName={gke-api-cluster-prod-b-services-a28cc097}
I0408 10:52:04.406137   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: loggingService={logging.googleapis.com/kubernetes}
I0408 10:52:04.406145   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: minMasterVersion={1.11.8-gke.6}
I0408 10:52:04.406153   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: monitoringService={monitoring.googleapis.com/kubernetes}
I0408 10:52:04.406161   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: name={api-cluster-prod-b}
I0408 10:52:04.406170   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: network={default}
I0408 10:52:04.406178   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: removeDefaultNodePool={true}
I0408 10:52:04.406188   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b).news]: zone={redacted}
I0408 10:52:04.412154   85701 provider_plugin.go:308] Provider[gcp, 0xc000d64640].Diff(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster,api-cluster-prod-b) success: changes=2 #replaces=[initialNodeCount] #stables=[enableKubernetesAlpha description ipAllocationPolicy clusterIpv4Cidr project subnetwork name zone privateCluster enableTpu network masterIpv4CidrBlock nodePools region nodeConfig] delbefrepl=false, diffs=#[]
I0408 10:52:04.412209   85701 provider_plugin.go:187] Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster) executing (#olds=0,#news=12
I0408 10:52:04.412242   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: additionalZones={[{redacted}]}
I0408 10:52:04.412262   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: enableKubernetesAlpha={false}
I0408 10:52:04.412275   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: enableLegacyAbac={false}
I0408 10:52:04.412288   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: initialNodeCount={1}
I0408 10:52:04.412303   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: ipAllocationPolicy={map[clusterSecondaryRangeName:{gke-api-cluster-prod-b-pods-a28cc097} servicesSecondaryRangeName:{gke-api-cluster-prod-b-services-a28cc097}]}
I0408 10:52:04.412323   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: clusterSecondaryRangeName={gke-api-cluster-prod-b-pods-a28cc097}
I0408 10:52:04.412337   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: servicesSecondaryRangeName={gke-api-cluster-prod-b-services-a28cc097}
I0408 10:52:04.412351   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: loggingService={logging.googleapis.com/kubernetes}
I0408 10:52:04.412378   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: minMasterVersion={1.11.8-gke.6}
I0408 10:52:04.412406   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: monitoringService={monitoring.googleapis.com/kubernetes}
I0408 10:52:04.412416   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: name={api-cluster-prod-b}
I0408 10:52:04.412433   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: network={default}
I0408 10:52:04.412443   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: removeDefaultNodePool={true}
I0408 10:52:04.412452   85701 rpc.go:68] Marshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).news]: zone={redacted}
I0408 10:52:04.413023   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: additionalZones={[{redacted}]}
I0408 10:52:04.413049   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: enableKubernetesAlpha={false}
I0408 10:52:04.413062   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: enableLegacyAbac={false}
I0408 10:52:04.413071   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: initialNodeCount={1}
I0408 10:52:04.413082   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: clusterSecondaryRangeName={gke-api-cluster-prod-b-pods-a28cc097}
I0408 10:52:04.413091   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: servicesSecondaryRangeName={gke-api-cluster-prod-b-services-a28cc097}
I0408 10:52:04.413102   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: ipAllocationPolicy={map[clusterSecondaryRangeName:{gke-api-cluster-prod-b-pods-a28cc097} servicesSecondaryRangeName:{gke-api-cluster-prod-b-services-a28cc097}]}
I0408 10:52:04.413115   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: loggingService={logging.googleapis.com/kubernetes}
I0408 10:52:04.413123   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: minMasterVersion={1.11.8-gke.6}
I0408 10:52:04.413132   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: monitoringService={monitoring.googleapis.com/kubernetes}
I0408 10:52:04.413140   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: name={api-cluster-prod-b}
I0408 10:52:04.413149   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: network={default}
I0408 10:52:04.413157   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: removeDefaultNodePool={true}
I0408 10:52:04.413165   85701 rpc.go:204] Unmarshaling property for RPC[Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster).inputs]: zone={redacted}
I0408 10:52:04.413174   85701 provider_plugin.go:239] Provider[gcp, 0xc000d64640].Check(urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster) success: inputs=#12 failures=#0
I0408 10:52:04.413185   85701 step_generator.go:307] Planner decided to replace 'urn:pulumi:api-cluster-prod-b::api-cluster-prod-b::gcp:container/cluster:Cluster::api-cluster' (oldprops=map[enableLegacyAbac:{false} initialNodeCount:{1} minMasterVersion:{1.10.6-gke.1} name:{api-cluster-prod-b} ipAllocationPolicy:{map[clusterSecondaryRangeName:{gke-api-cluster-prod-b-pods-a28cc097} servicesSecondaryRangeName:{gke-api-cluster-prod-b-services-a28cc097}]} monitoringService:{monitoring.googleapis.com/kubernetes} network:{default} loggingService:{logging.googleapis.com/kubernetes} additionalZones:{[{redacted}]} enableKubernetesAlpha:{false} zone:{redacted} removeDefaultNodePool:{true}] inputs=map[enableLegacyAbac:{false} ipAllocationPolicy:{map[clusterSecondaryRangeName:{gke-api-cluster-prod-b-pods-a28cc097} servicesSecondaryRangeName:{gke-api-cluster-prod-b-services-a28cc097}]} monitoringService:{monitoring.googleapis.com/kubernetes} network:{default} removeDefaultNodePool:{true} zone:{redacted} additionalZones:{[{redacted}]} enableKubernetesAlpha:{false} initialNodeCount:{1} loggingService:{logging.googleapis.com/kubernetes} minMasterVersion:{1.11.8-gke.6} name:{api-cluster-prod-b}])

I can share the entire log with you in private if that's necessary.

@lukehoban
Member

Talked with @geekflyer offline, and established that in his case he had used initialNodeCount: 1 and removeDefaultNodePool: true, and then his outputs included initialNodeCount: 0. So when he tried to do an update, the provider saw a change of initialNodeCount from 1 to 0, which led it to think a replacement was needed.

This feels like yet-another issue in the upstream provider. I'll see if I can repro it independently.

@lukehoban
Member

lukehoban commented Apr 9, 2019

The core issue with nodePools + removeDefaultNodePool that several folks have seen is a bug in the upstream Terraform provider. Here is a repro of the issue in Terraform:

resource "google_compute_network" "this" {
  name    = "mynet"
  project = "pulumi-development"
}

resource "google_container_cluster" "this" {
  name                     = "mycluster"
  
  min_master_version       = "latest"
  network                  = "${google_compute_network.this.self_link}"
  project                  = "pulumi-development"
  zone                     = "us-west1-a"

  remove_default_node_pool = true
  node_pool {
    name       = "default-pool"
    node_count = 0
  }

  master_auth {
    username = ""
    password = ""

    client_certificate_config {
      issue_client_certificate = false
    }
  }

  network_policy {
    provider = "CALICO"
    enabled  = true
  }
  addons_config {
    http_load_balancing {
      disabled = false
    }

    horizontal_pod_autoscaling {
      disabled = false
    }
  }
}

Changing network_policy.enabled to false leads to proposing a replacement due to changes in node_pools:

-/+ google_container_cluster.this (new resource required)
      id:                                                                 "mycluster" => <computed> (forces new resource)
      additional_zones.#:                                                 "0" => <computed>
      addons_config.#:                                                    "1" => "1"
      addons_config.0.horizontal_pod_autoscaling.#:                       "1" => "1"
      addons_config.0.horizontal_pod_autoscaling.0.disabled:              "false" => "false"
      addons_config.0.http_load_balancing.#:                              "1" => "1"
      addons_config.0.http_load_balancing.0.disabled:                     "false" => "false"
      addons_config.0.kubernetes_dashboard.#:                             "1" => <computed>
      addons_config.0.network_policy_config.#:                            "1" => <computed>
      cluster_autoscaling.#:                                              "0" => <computed>
      cluster_ipv4_cidr:                                                  "10.48.0.0/14" => <computed>
      enable_binary_authorization:                                        "" => <computed>
      enable_kubernetes_alpha:                                            "false" => "false"
      enable_legacy_abac:                                                 "false" => "false"
      enable_tpu:                                                         "" => <computed>
      endpoint:                                                           "104.196.224.160" => <computed>
      instance_group_urls.#:                                              "0" => <computed>
      ip_allocation_policy.#:                                             "0" => <computed>
      location:                                                           "us-west1-a" => <computed>
      logging_service:                                                    "logging.googleapis.com" => <computed>
      master_auth.#:                                                      "1" => "1"
      master_auth.0.client_certificate:                                   "" => <computed>
      master_auth.0.client_certificate_config.#:                          "1" => "1"
      master_auth.0.client_certificate_config.0.issue_client_certificate: "false" => "false"
      master_auth.0.client_key:                                           <sensitive> => <computed> (attribute changed)
      master_auth.0.cluster_ca_certificate:                               "LS0tLS1CRUdJTiBDRVJUsnip" => <computed>
      master_ipv4_cidr_block:                                             "" => <computed>
      master_version:                                                     "1.12.6-gke.10" => <computed>
      min_master_version:                                                 "latest" => "latest"
      monitoring_service:                                                 "monitoring.googleapis.com" => <computed>
      name:                                                               "mycluster" => "mycluster"
      network:                                                            "projects/pulumi-development/global/networks/mynet" => "https://www.googleapis.com/compute/v1/projects/pulumi-development/global/networks/mynet"
      network_policy.#:                                                   "1" => "1"
      network_policy.0.enabled:                                           "false" => "true"
      network_policy.0.provider:                                          "" => "CALICO"
      node_config.#:                                                      "0" => <computed>
      node_locations.#:                                                   "0" => <computed>
      node_pool.#:                                                        "0" => "1" (forces new resource)
      node_pool.0.initial_node_count:                                     "" => <computed>
      node_pool.0.instance_group_urls.#:                                  "" => <computed>
      node_pool.0.management.#:                                           "" => <computed>
      node_pool.0.max_pods_per_node:                                      "" => <computed>
      node_pool.0.name:                                                   "" => "default-pool" (forces new resource)
      node_pool.0.name_prefix:                                            "" => <computed>
      node_pool.0.node_config.#:                                          "" => <computed> (forces new resource)
      node_pool.0.node_count:                                             "" => "0"
      node_pool.0.version:                                                "" => <computed>
      node_version:                                                       "1.12.6-gke.10" => <computed>
      private_cluster:                                                    "" => <computed>
      project:                                                            "pulumi-development" => "pulumi-development"
      region:                                                             "" => <computed>
      remove_default_node_pool:                                           "true" => "true"
      zone:                                                               "us-west1-a" => "us-west1-a"

@naveensrinivasan

@lukehoban This is a blocker for us, since it would delete the cluster. What is the workaround for this?

@lukehoban
Member

For anyone hitting the nodePools + removeDefaultNodePool issue here - the workaround is to remove the nodePool from the Pulumi program since it has been removed via the removeDefaultNodePool option.
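
A minimal sketch of what that looks like, assuming the worker nodes are then managed as a separate gcp.container.NodePool resource (resource names and counts here are placeholders):

import * as gcp from "@pulumi/gcp";

// The cluster itself no longer declares an inline nodePools entry; the default
// pool is created and then removed by the provider.
const cluster = new gcp.container.Cluster("example-cluster", {
    initialNodeCount: 1,
    removeDefaultNodePool: true,
});

// Node pools are managed as separate resources instead of inline entries.
const pool = new gcp.container.NodePool("example-pool", {
    cluster: cluster.name,
    nodeCount: 3,
});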

@casey-robertson

casey-robertson commented Apr 11, 2019

Tried it 3 times now and it's still an issue. Brand new stack with minMasterVersion: '1.11.7-gke.12'. The stack comes up. Then I remove the nodePools section from the GKE code:

nodePools: [
            {
                name: 'default-pool',
                nodeCount: 0
            }
        ]

Then I bump the master version to minMasterVersion: '1.12.6-gke.10'.

Results in:

Previewing update (MINDBODY-Platform/casey-robertson):

     Type                             Name                      Plan        Info
     pulumi:pulumi:Stack              viserion-casey-robertson
 >   ├─ pulumi:pulumi:StackReference  identityStack             read
 +-  └─ gcp:container:Cluster         local-dev                 replace     [diff: ~minMasterVersion]

Resources:
    +-1 to replace
    21 unchanged

From the verbose logging:

I0411 16:13:40.211594   28175 provider_plugin.go:308] Provider[gcp, 0xc0001172c0].Diff(urn:pulumi:casey-robertson::viserion::gcp:container/cluster:Cluster::local-dev,local-dev) success: changes=2 #replaces=[nodePool.0.name] #stables=[initialNodeCount network zone subnetwork privateCluster masterIpv4CidrBlock project region nodePools clusterIpv4Cidr description ipAllocationPolicy name enableTpu nodeConfig enableKubernetesAlpha] delbefrepl=false, diffs=#[]

@geekflyer
Author

geekflyer commented Apr 11, 2019

I never put anything in the nodePools section. We always create our clusters with removeDefaultNodePool: true.

The workaround that helped (found with Luke's help) when the replacement issue occurred was to change initialNodeCount from 1 to 0.
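
In other words, the declared input was brought in line with what the API reports once the default pool is gone. A minimal sketch of the adjusted declaration for the already-existing cluster (other properties omitted, same identifiers as the earlier snippet):

export const cluster = new gcp.container.Cluster('api-cluster', {
  name: clusterName,
  // Matches the value reported back after the default pool was removed, so the
  // provider no longer sees an initialNodeCount change that forces replacement.
  initialNodeCount: 0,
  removeDefaultNodePool: true,
  minMasterVersion: gkeMasterVersion,
  // ...remaining properties unchanged from the snippet above
});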

@casey-robertson

Ours has always been zero.

@lukehoban
Member

@casey-robertson You appear to be hitting exactly what is described above in the workaround at #88 (comment).

@casey-robertson

casey-robertson commented Apr 11, 2019

ok - but do you mean that the workaround actually works? Because at least in my testing it doesn't fix it, and the cluster still wants to be replaced. Sorry if I'm being dense :-)

@lukehoban
Member

I've debugged variants of this with 5 users. Ultimately, all of the issues here have boiled down to issues in the upstream Terraform provider. So far, this has been caused by one or more of the following:

As these fixes get released in the upstream provider, they will get pulled into @pulumi/gcp. For now though, I believe there is no specific issue to track here on the Pulumi side, beyond pulumi/pulumi#2453, which obfuscates the real underlying issue in these cases.

@infin8x infin8x added the p1 A bug severe enough to be the next item assigned to an engineer label Jul 10, 2021