
IG: "kops.k8s.io/instancegroup" property missing under "nodeLabels" for instance groups created via "kops create cluster" command #16378

Closed
salavessa opened this issue Feb 24, 2024 · 7 comments
Labels: kind/bug, lifecycle/rotten

Comments


/kind bug

1. What kops version are you running? The command kops version will display this information.

Tested with Client version: 1.28.4 (git-v1.28.4) and Client version: 1.27.3 (git-v1.27.3)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

N/A

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

$ kops create cluster --cloud=aws --dns=private --zones=us-west-2a --name kops.example.com --dry-run -o yaml
# [...]
---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: control-plane-us-west-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - us-west-2a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: nodes-us-west-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - us-west-2a
$ kops --name kops.example.com create instancegroup zzz --dry-run -oyaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: zzz
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  kubelet:
    anonymousAuth: false
    nodeLabels:
      node-role.kubernetes.io/node: ""
  machineType: t3.medium
  manager: CloudGroup
  maxSize: 2
  minSize: 2
  nodeLabels:
    kops.k8s.io/instancegroup: zzz
  role: Node
  subnets:
  - us-west-2a

5. What happened after the commands executed?
Check Answer 4.

6. What did you expect to happen?
I would expect all properties, especially kops.k8s.io/instancegroup under nodeLabels, to also be created when using the kops create cluster command, the same way kops create instancegroup does.
The whole kubelet property is missing when creating a cluster as well, so ideally all "default" properties would be aligned between the create cluster and create instancegroup commands.
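
For illustration only (not actual kops output): combining the nodes-us-west-2a group from point 4 with the defaults that kops create instancegroup injects, the expected manifest would look roughly like this:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: nodes-us-west-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  kubelet:
    anonymousAuth: false
    nodeLabels:
      node-role.kubernetes.io/node: ""
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-us-west-2a
  role: Node
  subnets:
  - us-west-2a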

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

Check Answer 4.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else do we need to know?
I have many existing clusters (k8s v1.27, upgraded many times) with the kops.k8s.io/instancegroup node label set, so this may have been working before, or the label may have been set as part of a previous kOps upgrade.

k8s-ci-robot added the kind/bug label on Feb 24, 2024
teocns commented Mar 2, 2024

Is this a bug?

Without much understanding of the codebase design, I notice there's a fallback to the "node" role type. If I'm not off-track, this is more of an enhancement request to fix a missing-null-check anti-pattern in a hotspot of the codebase, so perhaps nothing to worry about.


salavessa commented Mar 2, 2024

@teocns Not sure I understand your comment, but the issue is not related to the actual node role type; that works just fine.

The issue is that the node-level label identifying the kOps instance group for each specific node is missing when using the kops create cluster command. We can manually add it afterwards, but I wouldn't expect to have to do that, especially because the nodeLabels property is automatically injected when you create a new instance group (as described in point 4), and because this was "working" at some point before (nodes from clusters created with older kOps versions do contain the label).

For our environments this was a breaking change (we had to manually update and roll out the IGs), because we actively use Kubernetes affinity/anti-affinity rules and Prometheus metrics that rely on the value of the kops.k8s.io/instancegroup node label.
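
To illustrate that dependency (a hypothetical pod spec fragment, not one of our actual workloads), a node affinity rule keyed on that label looks something like this:

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kops.k8s.io/instancegroup
            operator: In
            values:
            - nodes-us-west-2a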

The actual YAML property that I would expect to be present for each IG created via the kops create cluster command is:

  nodeLabels:
    kops.k8s.io/instancegroup: <IG_NAME>
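
For reference, the manual workaround we had to apply boils down to something like this (a sketch only; the IG name is the example one from point 4):

$ kops edit instancegroup nodes-us-west-2a --name kops.example.com
# add under spec:
#   nodeLabels:
#     kops.k8s.io/instancegroup: nodes-us-west-2a
$ kops update cluster --name kops.example.com --yes
$ kops rolling-update cluster --name kops.example.com --yes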


teocns commented Mar 2, 2024

Gotcha, you rely on the label as an affinity selector within your own workflow, while my observation was oriented more towards kops' own functional integrity. Thanks for clarifying.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on May 31, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jun 30, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot closed this as not planned on Jul 30, 2024
@k8s-ci-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to the /close not-planned command in the k8s-triage-robot comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
