
Update cluster attempts to create resources that already exist. #10596

Closed
myspotontheweb opened this issue Jan 16, 2021 · 3 comments · Fixed by #10599
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

myspotontheweb commented Jan 16, 2021

1. What kops version are you running? The command kops version will display
this information.

$ kops version
Version 1.19.0-beta.3 (git-e43f1cc6e3c77d093935d1706042861095d75eb7)

Note:

  • We first noticed this problem after upgrading to kops 1.17

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1", GitCommit:"c4d752765b3bbac2237bf87cf0b1c2e307844666", GitTreeState:"clean", BuildDate:"2020-12-18T12:09:25Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.6", GitCommit:"fbf646b339dc52336b55d8ec85c181981b86331a", GitTreeState:"clean", BuildDate:"2020-12-18T12:01:36Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
$ kubectl version --short
Client Version: v1.20.1
Server Version: v1.19.6

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

Generate a cluster spec

export NAME=kops-staging-test2.k8s.local

kops create cluster --name $NAME \
   --cloud aws \
   --master-zones eu-west-1b \
   --master-size r4.large \
   --zones eu-west-1a,eu-west-1b \
   --node-size r4.xlarge \
   --node-count 2 \
   --cloud-labels Name=Kubernetes,TWApp=CrossProduct \
   --authorization RBAC \
   --topology private   \
   --networking calico    \
   --dry-run \
   -o yaml > $NAME.clusterspec.yaml 

Create the cluster

# Register cluster
kops create -f kops-staging-test2.k8s.local.clusterspec.yaml

# Assign SSH key
kops create secret --name kops-staging-test2.k8s.local sshpublickey admin -i ~/.ssh/mykey.pem.pub

# Create cloud resources
kops update cluster kops-staging-test2.k8s.local --yes

# Setup admin client context
kops export kubecfg kops-staging-test2.k8s.local --admin 

5. What happened after the commands executed?

After allowing the cluster to come to a steady state (kops validate cluster returns success), run a cluster update operation:

$ kops update cluster --name kops-staging-test2.k8s.local
I0116 13:16:46.975338   14898 apply_cluster.go:465] Gossip DNS: skipping DNS validation
I0116 13:16:47.114344   14898 executor.go:103] Tasks: 0 done / 104 total; 45 can run
I0116 13:16:47.973442   14898 executor.go:103] Tasks: 45 done / 104 total; 22 can run
I0116 13:16:48.493120   14898 executor.go:103] Tasks: 67 done / 104 total; 24 can run
I0116 13:16:48.844410   14898 executor.go:103] Tasks: 91 done / 104 total; 5 can run
I0116 13:16:48.968126   14898 executor.go:103] Tasks: 96 done / 104 total; 5 can run
I0116 13:16:49.553751   14898 executor.go:103] Tasks: 101 done / 104 total; 3 can run
I0116 13:16:49.870945   14898 executor.go:103] Tasks: 104 done / 104 total; 0 can run
Will create resources:
  DHCPOptions/kops-staging-test2.k8s.local
  	DomainName          	eu-west-1.compute.internal
  	DomainNameServers   	AmazonProvidedDNS
  	Shared              	false
  	Tags                	{KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct, Name: Kubernetes}

  InternetGateway/kops-staging-test2.k8s.local
  	VPC                 	name:kops-staging-test2.k8s.local
  	Shared              	false
  	Tags                	{Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct}

  RouteTableAssociation/private-eu-west-1a.kops-staging-test2.k8s.local
  	RouteTable          	name:private-eu-west-1a.kops-staging-test2.k8s.local id:rtb-00934d41b8cc17771
  	Subnet              	name:eu-west-1a.kops-staging-test2.k8s.local

  RouteTableAssociation/private-eu-west-1b.kops-staging-test2.k8s.local
  	RouteTable          	name:private-eu-west-1b.kops-staging-test2.k8s.local id:rtb-058f6f3ce81878aff
  	Subnet              	name:eu-west-1b.kops-staging-test2.k8s.local

  RouteTableAssociation/utility-eu-west-1a.kops-staging-test2.k8s.local
  	RouteTable          	name:kops-staging-test2.k8s.local id:rtb-016184b429493f985
  	Subnet              	name:utility-eu-west-1a.kops-staging-test2.k8s.local

  RouteTableAssociation/utility-eu-west-1b.kops-staging-test2.k8s.local
  	RouteTable          	name:kops-staging-test2.k8s.local id:rtb-016184b429493f985
  	Subnet              	name:utility-eu-west-1b.kops-staging-test2.k8s.local

  SecurityGroup/api-elb.kops-staging-test2.k8s.local
  	Description         	Security group for api ELB
  	VPC                 	name:kops-staging-test2.k8s.local
  	RemoveExtraRules    	[port=443]
  	Tags                	{TWApp: CrossProduct, Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned}

  SecurityGroup/masters.kops-staging-test2.k8s.local
  	Description         	Security group for masters
  	VPC                 	name:kops-staging-test2.k8s.local
  	RemoveExtraRules    	[port=22, port=443, port=2380, port=2381, port=4001, port=4002, port=4789, port=179, port=8443]
  	Tags                	{KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct, Name: Kubernetes}

  SecurityGroup/nodes.kops-staging-test2.k8s.local
  	Description         	Security group for nodes
  	VPC                 	name:kops-staging-test2.k8s.local
  	RemoveExtraRules    	[port=22]
  	Tags                	{TWApp: CrossProduct, Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned}

  SecurityGroupRule/all-master-to-master
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	SourceGroup         	name:masters.kops-staging-test2.k8s.local

  SecurityGroupRule/all-master-to-node
  	SecurityGroup       	name:nodes.kops-staging-test2.k8s.local
  	SourceGroup         	name:masters.kops-staging-test2.k8s.local

  SecurityGroupRule/all-node-to-node
  	SecurityGroup       	name:nodes.kops-staging-test2.k8s.local
  	SourceGroup         	name:nodes.kops-staging-test2.k8s.local

  SecurityGroupRule/api-elb-egress
  	SecurityGroup       	name:api-elb.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Egress              	true

  SecurityGroupRule/https-api-elb-0.0.0.0/0
  	SecurityGroup       	name:api-elb.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Protocol            	tcp
  	FromPort            	443
  	ToPort              	443

  SecurityGroupRule/https-elb-to-master
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	Protocol            	tcp
  	FromPort            	443
  	ToPort              	443
  	SourceGroup         	name:api-elb.kops-staging-test2.k8s.local

  SecurityGroupRule/icmp-pmtu-api-elb-0.0.0.0/0
  	SecurityGroup       	name:api-elb.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Protocol            	icmp
  	FromPort            	3
  	ToPort              	4

  SecurityGroupRule/master-egress
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Egress              	true

  SecurityGroupRule/node-egress
  	SecurityGroup       	name:nodes.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Egress              	true

  SecurityGroupRule/node-to-master-protocol-ipip
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	Protocol            	4
  	SourceGroup         	name:nodes.kops-staging-test2.k8s.local

  SecurityGroupRule/node-to-master-tcp-1-2379
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	Protocol            	tcp
  	FromPort            	1
  	ToPort              	2379
  	SourceGroup         	name:nodes.kops-staging-test2.k8s.local

  SecurityGroupRule/node-to-master-tcp-2382-4000
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	Protocol            	tcp
  	FromPort            	2382
  	ToPort              	4000
  	SourceGroup         	name:nodes.kops-staging-test2.k8s.local

  SecurityGroupRule/node-to-master-tcp-4003-65535
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	Protocol            	tcp
  	FromPort            	4003
  	ToPort              	65535
  	SourceGroup         	name:nodes.kops-staging-test2.k8s.local

  SecurityGroupRule/node-to-master-udp-1-65535
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	Protocol            	udp
  	FromPort            	1
  	ToPort              	65535
  	SourceGroup         	name:nodes.kops-staging-test2.k8s.local

  SecurityGroupRule/ssh-external-to-master-0.0.0.0/0
  	SecurityGroup       	name:masters.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Protocol            	tcp
  	FromPort            	22
  	ToPort              	22

  SecurityGroupRule/ssh-external-to-node-0.0.0.0/0
  	SecurityGroup       	name:nodes.kops-staging-test2.k8s.local
  	CIDR                	0.0.0.0/0
  	Protocol            	tcp
  	FromPort            	22
  	ToPort              	22

  Subnet/eu-west-1a.kops-staging-test2.k8s.local
  	ShortName           	eu-west-1a
  	VPC                 	name:kops-staging-test2.k8s.local
  	AvailabilityZone    	eu-west-1a
  	CIDR                	172.20.32.0/19
  	Shared              	false
  	Tags                	{kubernetes.io/role/internal-elb: 1, Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct, SubnetType: Private}

  Subnet/eu-west-1b.kops-staging-test2.k8s.local
  	ShortName           	eu-west-1b
  	VPC                 	name:kops-staging-test2.k8s.local
  	AvailabilityZone    	eu-west-1b
  	CIDR                	172.20.64.0/19
  	Shared              	false
  	Tags                	{Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct, SubnetType: Private, kubernetes.io/role/internal-elb: 1}

  Subnet/utility-eu-west-1a.kops-staging-test2.k8s.local
  	ShortName           	utility-eu-west-1a
  	VPC                 	name:kops-staging-test2.k8s.local
  	AvailabilityZone    	eu-west-1a
  	CIDR                	172.20.0.0/22
  	Shared              	false
  	Tags                	{Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct, SubnetType: Utility, kubernetes.io/role/elb: 1}

  Subnet/utility-eu-west-1b.kops-staging-test2.k8s.local
  	ShortName           	utility-eu-west-1b
  	VPC                 	name:kops-staging-test2.k8s.local
  	AvailabilityZone    	eu-west-1b
  	CIDR                	172.20.4.0/22
  	Shared              	false
  	Tags                	{kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct, SubnetType: Utility, kubernetes.io/role/elb: 1, Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local}

  VPC/kops-staging-test2.k8s.local
  	CIDR                	172.20.0.0/16
  	EnableDNSHostnames  	true
  	EnableDNSSupport    	true
  	Shared              	false
  	Tags                	{Name: Kubernetes, KubernetesCluster: kops-staging-test2.k8s.local, kubernetes.io/cluster/kops-staging-test2.k8s.local: owned, TWApp: CrossProduct}

  VPCDHCPOptionsAssociation/kops-staging-test2.k8s.local
  	VPC                 	name:kops-staging-test2.k8s.local
  	DHCPOptions         	name:kops-staging-test2.k8s.local

Will modify resources:
  AutoscalingGroup/master-eu-west-1b.masters.kops-staging-test2.k8s.local
  	Subnets             	 [id:subnet-0374bed40e060c920] -> [name:eu-west-1b.kops-staging-test2.k8s.local]

  AutoscalingGroup/nodes-eu-west-1a.kops-staging-test2.k8s.local
  	Subnets             	 [id:subnet-0ae3ba5725f688600] -> [name:eu-west-1a.kops-staging-test2.k8s.local]

  AutoscalingGroup/nodes-eu-west-1b.kops-staging-test2.k8s.local
  	Subnets             	 [id:subnet-0374bed40e060c920] -> [name:eu-west-1b.kops-staging-test2.k8s.local]

  ClassicLoadBalancer/api.kops-staging-test2.k8s.local
  	Subnets             	 [id:subnet-01e89047a3a9d0634, id:subnet-0706d880f67dedee8] -> [name:utility-eu-west-1a.kops-staging-test2.k8s.local, name:utility-eu-west-1b.kops-staging-test2.k8s.local]
  	SecurityGroups      	 [id:sg-0965a782af37afa5d] -> [name:api-elb.kops-staging-test2.k8s.local]
  	ForAPIServer        	 true -> true

  LaunchTemplate/master-eu-west-1b.masters.kops-staging-test2.k8s.local
  	SecurityGroups      	 [id:sg-032530ef7fa784f96] -> [name:masters.kops-staging-test2.k8s.local]

  LaunchTemplate/nodes-eu-west-1a.kops-staging-test2.k8s.local
  	SecurityGroups      	 [id:sg-0f0ae0098416c91b9] -> [name:nodes.kops-staging-test2.k8s.local]

  LaunchTemplate/nodes-eu-west-1b.kops-staging-test2.k8s.local
  	SecurityGroups      	 [id:sg-0f0ae0098416c91b9] -> [name:nodes.kops-staging-test2.k8s.local]

  NatGateway/eu-west-1a.kops-staging-test2.k8s.local
  	Name                	 Kubernetes -> eu-west-1a.kops-staging-test2.k8s.local

  NatGateway/eu-west-1b.kops-staging-test2.k8s.local
  	Name                	 Kubernetes -> eu-west-1b.kops-staging-test2.k8s.local

  Route/0.0.0.0/0
  	InternetGateway     	 id:igw-013ed16fc2764b32d -> name:kops-staging-test2.k8s.local

  RouteTable/kops-staging-test2.k8s.local
  	VPC                 	 id:vpc-0578455ddd28a5e75 -> name:kops-staging-test2.k8s.local

  RouteTable/private-eu-west-1a.kops-staging-test2.k8s.local
  	VPC                 	 id:vpc-0578455ddd28a5e75 -> name:kops-staging-test2.k8s.local

  RouteTable/private-eu-west-1b.kops-staging-test2.k8s.local
  	VPC                 	 id:vpc-0578455ddd28a5e75 -> name:kops-staging-test2.k8s.local

6. What did you expect to happen?

Expected the cluster update to report "No changes need to be applied".

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

$ kops get --name kops-staging-test2.k8s.local -oyaml
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2021-01-16T13:04:51Z"
  name: kops-staging-test2.k8s.local
spec:
  api:
    loadBalancer:
      class: Classic
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudLabels:
    Name: Kubernetes
    TWApp: CrossProduct
  cloudProvider: aws
  configBase: s3://teamwork-staging-kubernetes/kops-staging-test2.k8s.local
  containerRuntime: docker
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-eu-west-1b
      name: b
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-eu-west-1b
      name: b
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.19.6
  masterPublicName: api.kops-staging-test2.k8s.local
  networkCIDR: 172.20.0.0/16
  networking:
    calico:
      majorVersion: v3
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 172.20.32.0/19
    name: eu-west-1a
    type: Private
    zone: eu-west-1a
  - cidr: 172.20.64.0/19
    name: eu-west-1b
    type: Private
    zone: eu-west-1b
  - cidr: 172.20.0.0/22
    name: utility-eu-west-1a
    type: Utility
    zone: eu-west-1a
  - cidr: 172.20.4.0/22
    name: utility-eu-west-1b
    type: Utility
    zone: eu-west-1b
  topology:
    dns:
      type: Public
    masters: private
    nodes: private

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2021-01-16T13:04:53Z"
  labels:
    kops.k8s.io/cluster: kops-staging-test2.k8s.local
  name: master-eu-west-1b
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20201112.1
  machineType: r4.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-eu-west-1b
  role: Master
  subnets:
  - eu-west-1b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2021-01-16T13:04:53Z"
  labels:
    kops.k8s.io/cluster: kops-staging-test2.k8s.local
  name: nodes-eu-west-1a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20201112.1
  machineType: r4.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-eu-west-1a
  role: Node
  subnets:
  - eu-west-1a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2021-01-16T13:04:53Z"
  labels:
    kops.k8s.io/cluster: kops-staging-test2.k8s.local
  name: nodes-eu-west-1b
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20201112.1
  machineType: r4.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-eu-west-1b
  role: Node
  subnets:
  - eu-west-1b

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

kops update cluster --name kops-staging-test2.k8s.local -v 10 > trace.log 2>&1

trace.log

9. Anything else we need to know?

Observation 1

Creating a cluster while omitting the --cloud-labels parameter works fine, which is very odd.

export NAME=kops-staging-test1.k8s.local
kops create cluster --name $NAME \
   --cloud aws \
   --master-zones eu-west-1b \
   --master-size r4.large \
   --zones eu-west-1a,eu-west-1b \
   --node-size r4.xlarge \
   --node-count 2 \
   --authorization RBAC \
   --topology private   \
   --networking calico    \
   --dry-run \
   -o yaml > $NAME.clusterspec.yaml

Observation 2

We first noticed this problem after upgrading our production cluster from v1.15 -> v1.17

kops v1.16.4 appears to work fine.

Observation 3

There is no known workaround for afflicted clusters. This blocks us from making cluster changes in production.

@olemarkus
Member

Thanks for the report. I can confirm the bug.

@rifelpet added the kind/bug label on Jan 16, 2021
@olemarkus
Member

This happens because you set the Name label. It propagates to resources and overrides the tags kOps uses to identify resources it owns.

The quick fix here is to remove or rename the tags you are setting. You may also have to manually change the Name tag of the affected resources back to the cluster name (kops-staging-test2.k8s.local in the example above).

We will add some validation on reserved tags.
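Such a validation could be sketched as follows. This is a hypothetical illustration, not the actual kOps implementation; the reserved key names are assumptions based on the tags kOps applies in the diff output above (Name, KubernetesCluster, and the kubernetes.io/cluster/... ownership tag).

```python
# Sketch: reject user-supplied cloud labels whose keys clash with tag
# keys that kOps itself relies on to identify resources it owns.
# Hypothetical helper, assumed reserved-key list -- not real kOps code.

RESERVED_LABEL_KEYS = {"Name", "KubernetesCluster"}
RESERVED_LABEL_PREFIXES = ("kubernetes.io/cluster/",)


def reserved_label_clashes(cloud_labels):
    """Return the label keys that collide with kOps-managed tags."""
    return [
        key
        for key in cloud_labels
        if key in RESERVED_LABEL_KEYS or key.startswith(RESERVED_LABEL_PREFIXES)
    ]


# The labels from this report: "Name" clashes, "TWApp" is fine.
print(reserved_label_clashes({"Name": "Kubernetes", "TWApp": "CrossProduct"}))
```

With a check like this, the --cloud-labels Name=Kubernetes flag used in step 4 would have been rejected at create time instead of silently breaking resource discovery later.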

@myspotontheweb
Author

I appreciate the timely diagnosis and fix. Thank you.
