Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: . is removed in nodeName #2373

Closed
wants to merge 4 commits into from

Conversation

CatherineF-dev
Copy link
Contributor

@CatherineF-dev CatherineF-dev commented Apr 17, 2024

What this PR does / why we need it:

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality) No

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes # an issue raised in slack chat We see the fieldSelector get created with the node-name … but it’s missing the .

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 17, 2024
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Apr 17, 2024
@CatherineF-dev CatherineF-dev changed the title Fix bug that . is removed in nodeName Fix . is removed in nodeName Apr 17, 2024
@CatherineF-dev CatherineF-dev changed the title Fix . is removed in nodeName fix: . is removed in nodeName Apr 17, 2024
@diranged
Copy link

This should fix #2374. Thank you, I hope this can get merged quickly!

Copy link
Contributor

@LaikaN57 LaikaN57 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Alternatively, it looks like the , in the pattern might have been a typo. We could just replace that with a ..

@CatherineF-dev CatherineF-dev mentioned this pull request Apr 18, 2024
@logicalhan
Copy link
Member

/assign @dgrisonnet
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 18, 2024
@mrueg
Copy link
Member

mrueg commented Apr 18, 2024

There was a conversation here: https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1356982673 https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1357024577

Do we still need this to differentiate between "" and nil?

@CatherineF-dev
Copy link
Contributor Author

Good point. Updated to only add \\.

Tested: https://go.dev/play/p/K5GtruNWjO0

@CatherineF-dev
Copy link
Contributor Author

@CatherineF-dev
Copy link
Contributor Author

Updated. Need review again.

@dgrisonnet
Copy link
Member

dgrisonnet commented Apr 19, 2024

I don't get why we have this NodeType structure from #2217. Pods in scheduling state could be handled separately, even maybe with a dedicated CLI option. I think that the code could be simplified.

Also one thing I am curious about is what would happen once the pod is scheduled on the node. Do we continue to produce a stale metrics with the state of the pod when it wasn't schedule? Or do we have a mechanism to check that this pod shouldn't be watched by this shard anymore?

Copy link

@diranged diranged left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve - but I'm not sure that means anything. :)

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: CatherineF-dev, diranged, LaikaN57
Once this PR has been reviewed and has the lgtm label, please ask for approval from dgrisonnet. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@CatherineF-dev
Copy link
Contributor Author

Do we continue to produce a stale metrics with the state of the pod when it wasn't schedule?

No. I tested for both daemonset (watch scheduled pods) and deployment (watch not scheduled pods).

# Test for the deployment

# 1. deploy a pod
kubectl apply -f https://raw.githubusercontent.com/CatherineF-dev/k8s-hello-world/main/unschedule_pod.yaml

kubectl get pods --field-selector spec.nodeName= -A
NAMESPACE   NAME                          READY   STATUS    RESTARTS   AGE
default     hello-node-7d78fd84d9-rqbjh   0/1     Pending   0          38s

# 2. can see metrics when this pod is not scheduled

curl localhost:8080/metrics | grep hello-node | grep schedule

kube_pod_status_scheduled{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",condition="true"} 0
kube_pod_status_scheduled{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",condition="false"} 1
kube_pod_status_scheduled{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",condition="unknown"} 0
kube_pod_scheduler{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",name="default-scheduler"} 1

# 3. can't see metrics when this pod is scheduled 
curl localhost:8080/metrics | grep hello-node | grep schedule

@CatherineF-dev
Copy link
Contributor Author

@dgrisonnet need review again.

@mrueg
Copy link
Member

mrueg commented May 16, 2024

Closing this in favor of #2388

Feel free to reopoen if anything changes.

@mrueg mrueg closed this May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants