Horizontal Pod Autoscaler Label clash #10862

Closed
timothyclarke opened this issue Feb 17, 2021 · 20 comments · Fixed by #10910
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@timothyclarke
Contributor

timothyclarke commented Feb 17, 2021

1. What kops version are you running? The command kops version, will display
this information.

Version 1.20.0-alpha.2
I'm working on adding features under #10706; however, I think this bug is independent.

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.7", GitCommit:"1dd5338295409edcfff11505e7bb246f0d325d15", GitTreeState:"clean", BuildDate:"2021-01-13T13:23:52Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
I do not believe the client version is applicable, as the error occurs while I am performing kops edit cluster.

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

kops create cluster with your normal options plus --cloud-labels "k8s.io/cluster-autoscaler/${KUBE_NAME}=owned"
Edit the cluster and replace the Classic Load Balancer with a Network Load Balancer.

In my case

#!/bin/bash
K8S_ZONES="eu-west-1b,eu-west-1c"
K8S_VPC_ID="vpc-0XXXXXXX1"
K8S_CLUSTER_NAME="common-dev-euw1-cluster"
K8S_SUBNETS='subnet-01aXXXXXb0c,subnet-0418XXXXXXX1f7'
K8S_UTILITY_SUBNETS="subnet-04dXXXXX417,subnet-0a4XXXXX0b"
K8S_BASTION="ssh-rsa   REMOVED=="
K8S_VPC_CIDR="172.19.192.0/20"


./kops create cluster --state "s3://${S3_BUCKET}" \
            --cloud aws \
            --vpc ${K8S_VPC_ID} \
            --subnets ${K8S_SUBNETS} \
            --zones ${K8S_ZONES} \
            --utility-subnets ${K8S_UTILITY_SUBNETS} \
            --topology private \
            --master-count 1 \
            --master-size t3a.large \
            --node-count 2 \
            --node-size r5.xlarge \
            --networking calico \
            --ssh-public-key="${ENV}_key.pub" \
            --bastion=false \
            --dns-zone ${KUBE_NAME} \
            --encrypt-etcd-storage \
            --cloud-labels "HostType=k8s,TargetEnvironment=${ENV},k8s.io/cluster-autoscaler/${KUBE_NAME}=owned" \
            --admin-access ${CORP_IP_CIDR} \
            --admin-access ${K8S_VPC_CIDR} \
            --admin-access ${NAT_GW_1}/32 \
            --admin-access ${NAT_GW_2}/32 \
            --dry-run \
            -o yaml ${KUBE_NAME} > cluster-${KUBE_NAME}.yaml

I'm editing the YAML to add additional instance groups, set the auth, etc.

Followed by

kops --state "s3://${S3_BUCKET}" create -f ${KUBE_NAME}.config.yaml

After the cluster was up I edited the cluster to change the loadbalancer
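The change itself sits in the cluster spec's API load balancer section; a minimal sketch of what I switched, assuming a public API endpoint:

spec:
  api:
    loadBalancer:
      class: Network
      type: Public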

5. What happened after the commands executed?
I received the error message

spec.cloudLabels.k8s.io/cluster-autoscaler/test.example.com: Forbidden: "k8s.io/" is a reserved label prefix and cannot be used as a custom label

6. What did you expect to happen?
The cluster edit should not complain; I'd then apply the changes with kops update cluster --yes.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else do we need to know?

@timothyclarke
Contributor Author

/kind bug
cc @rifelpet

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 17, 2021
@bharath-123
Contributor

bharath-123 commented Feb 21, 2021

Hi @timothyclarke, I think kOps does not want you to provide labels with the k8s.io prefix, based on this code: https://github.com/kubernetes/kops/blob/master/pkg/apis/kops/validation/cluster.go#L128. I don't think changing the load balancer type is the issue here.

@olemarkus
Member

The validation is probably a bit overeager. The intention is to prevent people from setting labels that kOps depends on being set to a specific value. I do think people should be able to enable CAS auto-discovery, though.

@timothyclarke would you be able to do a PR against the validation logic? I believe we probably want this in before kOps 1.20 GA.
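For context, cluster-autoscaler's ASG auto-discovery looks for tags with exactly this prefix; a hedged example of the discovery flag (the exact value depends on how CAS is deployed):

--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>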

@timothyclarke
Contributor Author

timothyclarke commented Feb 22, 2021

@olemarkus I'm sorry, I'm not skilled enough as a developer to review and fix this issue.

I've been able to simplify the steps to replicate the issue; they are completely independent of my PR.

  1. Download the current 1.20.0-alpha.2 build
  2. Perform a dry-run create of the cluster with the cloud labels arg and redirect to a file
    I've only kept the args I believe are pertinent
export KUBE_NAME="test.example.com"
kops create cluster --cloud aws --cloud-labels "k8s.io/cluster-autoscaler/${KUBE_NAME}=owned" --dry-run -o yaml ${KUBE_NAME} > cluster-${KUBE_NAME}.yaml
  3. Create (but do not update) the cluster from the above file: kops create -f cluster-${KUBE_NAME}.yaml
  4. Edit the cluster and change the loadbalancer type from Classic to Network, e.g. kops-linux-amd64 edit cluster --name ${KUBE_NAME}

There's probably a simpler method to reproduce, but the above doesn't need to create any of the resources in AWS to generate the error; it only stages the changes.
In step 4 note that the following is defined

spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/test.example.com: owned

If that label is removed then the changes will be applied (however, the cluster autoscaler will no longer pick up the instance groups).
I do note that I can apply the above spec.cloudLabels directly to an instance group without generating errors (kops edit ig). While this would work around the issue, it means there is more admin overhead to apply it to every IG instead of it cascading down from the cluster config.
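For reference, the per-instance-group workaround looks roughly like this in the InstanceGroup manifest (a sketch; the group name and cluster name are illustrative):

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: test.example.com
  name: nodes
spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/test.example.com: owned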

@olemarkus
Member

Thanks for the info. That is what I need to fix this. PR incoming.

@dmcnaught
Contributor

@olemarkus
I am seeing a similar error with kops 1.20.0
I have this in my cluster config:

cloudLabels:
  kubernetes.io/cluster/<myclustername>: owned

and when I finish kops edit cluster to move from 1.19 to 1.20, I get this error:

spec.cloudLabels.kubernetes.io/cluster/<myclustername>: Forbidden: "kubernetes.io/cluster/" is a reserved label prefix and cannot be used as a custom label

@olemarkus
Member

What is the reason for setting this label explicitly?

@dmcnaught
Contributor

It's for the AWS Ingress controller @olemarkus - following kops instructions here:
https://github.com/kubernetes/kops/blob/97869057129b30ea284c3ed1bdf1db36e752701d/addons/kube-ingress-aws-controller/README.md#modify-an-existing-cluster
"Cloud Labels are required to make Kube AWS Ingress Controller work, because it has to find the AWS Application Load Balancers it manages by AWS Tags, which are called cloud Labels in Kops."

@olemarkus
Member

Ouch. That looks dated and should be removed. https://kops.sigs.k8s.io/addons/#aws-load-balancer-controller should be used instead.

@dmcnaught
Contributor

This is a tight spot: kops 1.19 wasn't recognizing

awsLoadBalancerController:
    enabled: true

for me, so I am planning to revisit the migration to the AWS Load Balancer Controller in 1.20, but I'm blocked from upgrading the cluster spec to kops 1.20 because of this reserved label issue.

@dmcnaught
Contributor

We need to leave some time to deprecate the AWS Ingress Controller. The link you sent says the AWS Load Balancer Controller was introduced to kops in 1.20.

@olemarkus
Member

You don't actually need the cloud labels. This is handled by kOps and k8s, which is why it is marked as reserved; it is rather dangerous to add it like that. Those docs are quite old and I'd consider the content deprecated already.

@dmcnaught
Contributor

You mean I can just remove that label and it won't affect the ingress controller?

@olemarkus
Member

Yes. That should work.

@prasanththorati

Hi @olemarkus. The Kubernetes cluster that I am running is 1.18.14; we are running Airflow, Postgres, and dbt on it. We want to upgrade it to the latest version using the kops tool. My manager suggested that such a huge leap from 1.18.14 to 1.23.5 might break our applications, and that we should upgrade to 1.18.17 first and proceed step by step. When I tried to edit the cluster I got the below error:
Forbidden: "kubernetes.io/cluster/" is a reserved label prefix and cannot be used as a custom label

Please let me know how to resolve this, and also please advise whether upgrading Kubernetes from 1.18.14 to 1.23.5 is a good option or whether it will break our applications.

@olemarkus
Member

Hey. The recommendation is:

  • Always use the latest kops supporting the given kubernetes version you are running
  • Always upgrade kubernetes minor version by minor version

Doing a jump from k8s 1.18 to 1.23 will surely break things.

@prasanththorati

Hi olemarkus,

When I try to edit the cluster file in kops it says it is forbidden, so I am not able to change the version from 1.18.14 to 1.18.17.

@olemarkus
Member

Can you file a new issue following the issue template?

@timothyclarke
Contributor Author

timothyclarke commented Apr 27, 2022

When I try to edit the cluster file in kops it says it is forbidden, so I am not able to change the version from 1.18.14 to 1.18.17.

@prasanththorati
Were you using the appropriate kops binary or just trying to edit the text?
Typically the upgrade process would be

  1. Download the latest patch release of your current minor version, e.g. 1.18.17
  2. Upgrade to that version, e.g. kops upgrade && kops rolling-update
  3. Download the latest patch release of the next minor version
  4. Upgrade to that version

Repeat steps 3 & 4 until you're on the desired version (e.g. the latest); a rough sketch of the commands is below.
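What each upgrade round looks like on the command line, roughly (a hedged sketch; the state store and cluster name come from your own environment):

export KOPS_STATE_STORE="s3://${S3_BUCKET}"
kops upgrade cluster --name ${KUBE_NAME} --yes        # bumps kubernetesVersion in the cluster spec
kops update cluster --name ${KUBE_NAME} --yes         # applies the new configuration
kops rolling-update cluster --name ${KUBE_NAME} --yes # rolls the nodes onto the new version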

@olemarkus
Member

As mentioned above, the appropriate kops version is always the latest that supports the current k8s version. For k8s 1.18 that is kops 1.23.1.
