Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch node status conditions and unschedulable state from API server rather than KSM #194

Merged
merged 10 commits into from
Sep 1, 2021

Conversation

roobre
Copy link
Contributor

@roobre roobre commented Aug 20, 2021

Replaces implementation in #155.

Previous implementation fetched this information from KSM metrics. This caused those metrics to be reported for one node by a different one on multi-node clusters, since there is only one instance of the DaemonSet fetching data for all the cluster objects. In turn, this caused some attributes to have unexpected values, such as hostname not matching nodeName.

This implementation fetches this information from the local node, as an additional step to parsing data from the kubelet.

Additionally, the unschedulable metric is also fetched this way.

@roobre roobre changed the title WIP: kubelete/group: fetch conditions from API server WIP: Fetch node status conditions from API server rather than KSM Aug 20, 2021
@roobre roobre force-pushed the simpler-node-status-conditions branch 3 times, most recently from d700bfb to c390404 Compare August 20, 2021 11:53
@roobre roobre changed the title WIP: Fetch node status conditions from API server rather than KSM Fetch node status conditions from API server rather than KSM Aug 20, 2021
@roobre roobre marked this pull request as ready for review August 20, 2021 12:06
@roobre roobre requested review from invidian and a team August 20, 2021 12:06
@roobre roobre force-pushed the simpler-node-status-conditions branch from c390404 to 2345229 Compare August 20, 2021 12:09
@roobre roobre self-assigned this Aug 20, 2021
Copy link
Contributor

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks mostly good, except some nits like refactoring, imports, dots and typos.

I also left some suggestions how commits could be improved to optimize for nice git history and for reviewing.

continue
}

// Since conditions is a list, there could be duplicate conditions. Check ff we have added this condition before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the condition really be duplicate? I was not able to observe that in my tests, though I couldn't find code preventing it either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the API should disallow that, but took the defensive programming approach here and since the input is a list, there should be possible for that to happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would make sense to log when this happen? oh I see no log here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I thought the same but definitely not worth passing a logger this deep down :(

Rumbles about global loggers

src/apiserver/client.go Show resolved Hide resolved
src/kubelet/group.go Show resolved Hide resolved
src/kubelet/group.go Outdated Show resolved Hide resolved
src/apiserver/client.go Show resolved Hide resolved
src/kubelet/metric/transform.go Outdated Show resolved Hide resolved
src/kubelet/metric/transform.go Outdated Show resolved Hide resolved
src/kubelet/group_test.go Show resolved Hide resolved
@roobre roobre changed the title Fetch node status conditions from API server rather than KSM Fetch node status conditions and unschedulable state from API server rather than KSM Aug 24, 2021
@roobre roobre force-pushed the simpler-node-status-conditions branch from 2345229 to f58ddb4 Compare August 24, 2021 11:09
Copy link
Contributor

@gsanchezgavier gsanchezgavier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

src/kubelet/metric/pods.go Outdated Show resolved Hide resolved
continue
}

// Since conditions is a list, there could be duplicate conditions. Check ff we have added this condition before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would make sense to log when this happen? oh I see no log here

invidian and others added 7 commits August 25, 2021 19:45
Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
So metrics can be generated using API server data instead of KSM.
@invidian invidian force-pushed the simpler-node-status-conditions branch from f58ddb4 to cd66ac7 Compare August 26, 2021 08:42
Copy link
Contributor

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

invidian and others added 3 commits August 26, 2021 11:03
To be used for kubelet metrics.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
So all transform functions are in a single file.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
@invidian invidian force-pushed the simpler-node-status-conditions branch from cd66ac7 to 7ff306f Compare August 26, 2021 09:03
@roobre roobre merged commit 17cf002 into main Sep 1, 2021
@roobre roobre deleted the simpler-node-status-conditions branch September 1, 2021 15:29
roobre added a commit that referenced this pull request Sep 6, 2021
…rather than KSM (#194)

* src/kubelet/group_test.go: fix imports

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet/group_test.go: fix formatting

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet/metric/testdata: fix imports

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet/metric/testdata: fix comments formatting

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet/group: fetch conditions and unschedulable from API server

So metrics can be generated using API server data instead of KSM.

* cmd/kubernetes-static: add conditions to mocked API server

* src/metric: remove node condition fetching from KSM metrics

* src/kubelet/metric: add PrefixFromMapInt function

To be used for kubelet metrics.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/metric: add condition and unschedulable metrics to kubelet definitions

* src/kubelet/metric: move OneMetricPerLabel to separate file

So all transform functions are in a single file.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

Co-authored-by: Mateusz Gozdek <mgozdek@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants