kubelet service on Windows is paused / kubeadm join is failing #123173

Closed
yahbouss opened this issue Feb 7, 2024 · 5 comments
Labels: kind/bug, kind/support, needs-sig, needs-triage

Comments

yahbouss commented Feb 7, 2024

What happened?

Hello community.

I am trying to deploy an on-prem Kubernetes cluster with Ubuntu and Windows machines.
The setup is the following:

Kubernetes version: 1.28

containerd version: 1.7 (Windows) and 1.6 (Linux)

container runtime: containerd

Windows worker: Windows Server 2022 21H2

Control node: Ubuntu 22.04

Linux worker: Ubuntu 22.04

Initialization by kubeadm

The issue starts here:
After installing containerd using the PowerShell script provided by ContainerD Setup, and installing the Kubernetes tools using the sig-windows-tools PowerShell script, I ran the join command provided by my control node:

kubeadm join 192.168.x.x:6443 --token TOKEN --discovery-token-ca-cert-hash sha256:HASH --cri-socket="npipe:////./pipe/containerd-containerd"

Here is the output:

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0207 14:55:30.736204    3388 initconfiguration.go:120] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "npipe" to the "criSocket" with value "unix:///var/run/unknown.sock". Please update your configuration!
W0207 14:55:30.740074    3388 utils.go:69] The recommended value for "authentication.x509.clientCAFile" in "KubeletConfiguration" is: \etc\kubernetes\pki\ca.crt; the provided value is: /etc/kubernetes/pki/ca.crt
[kubelet-start] Writing kubelet configuration to file "\\var\\lib\\kubelet\\config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "\\var\\lib\\kubelet\\kubeadm-flags.env"
[kubelet-start] Starting the kubelet
W0207 14:55:41.047667    3388 kubelet.go:43] [kubelet-start] WARNING: unable to start the kubelet service: [couldn't start service kubelet: timeout waiting for kubelet service to start]
[kubelet-start] Please ensure kubelet is reloaded and running manually.
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connectex: No connection could be made because the target machine actively refused it..
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connectex: No connection could be made because the target machine actively refused it..
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connectex: No connection could be made because the target machine actively refused it..
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connectex: No connection could be made because the target machine actively refused it..
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connectex: No connection could be made because the target machine actively refused it..

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
error execution phase kubelet-start: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
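
Editor's note: the repeated [kubelet-check] lines above describe an HTTP GET against http://localhost:10248/healthz. The following is a minimal sketch of an equivalent probe that can be run by hand on the node. The function name `kubelet_healthy` and the 2-second timeout are assumptions made for illustration; this is not kubeadm's actual implementation.

```python
import urllib.error
import urllib.request


def kubelet_healthy(url: str = "http://localhost:10248/healthz",
                    timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers HTTP 200, False otherwise.

    Hypothetical helper: the name and the timeout are illustrative only.
    """
    # Ignore proxy settings from the environment; the probe targets the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused (as in the log above), timeouts, and HTTP error
        # statuses all end up here.
        return False
```

Calling `kubelet_healthy()` on a node in the state shown above would be expected to return False, because nothing is listening on port 10248 while the kubelet service is not running.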

Investigation:
After investigating a bit, I found that the kubelet service was paused. I tried to get the error output by running the kubelet command directly, and this is the error I found:

E0207 15:09:40.843756    7096 run.go:74] "command failed" err="failed to validate kubelet configuration, error: [invalid configuration: CgroupsPerQOS (--cgroups-per-qos) true is not supported on Windows, invalid configuration: EnforceNodeAllocatable (--enforce-node-allocatable) [pods] is not supported on Windows], path: &TypeMeta{Kind:,APIVersion:,}"
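
Editor's note: the error above packs two separate validation failures into one line. The sketch below pulls out each rejected option so it is easier to see what the kubelet is objecting to. The regular expression is derived only from the message shape in this log, and the name `unsupported_options` is hypothetical; other kubelet versions may word the message differently.

```python
import re

# Message shape taken from the log line above:
#   invalid configuration: <Field> (--<flag>) <value> is not supported on Windows
_UNSUPPORTED = re.compile(
    r"invalid configuration: (?P<field>\w+) \((?P<flag>--[\w-]+)\) "
    r"(?P<value>.+?) is not supported on Windows"
)


def unsupported_options(message: str) -> list[dict[str, str]]:
    """Return one dict (field, flag, value) per option reported as unsupported."""
    return [m.groupdict() for m in _UNSUPPORTED.finditer(message)]
```

Applied to the line above, this yields two entries: `CgroupsPerQOS` (`--cgroups-per-qos`) with value `true`, and `EnforceNodeAllocatable` (`--enforce-node-allocatable`) with value `[pods]`.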

What did you expect to happen?

I expected the kubelet to start and the node to join seamlessly, since I followed the best practices.

How can we reproduce it (as minimally and precisely as possible)?

kubelet

E0207 15:09:40.843756    7096 run.go:74] "command failed" err="failed to validate kubelet configuration, error: [invalid configuration: CgroupsPerQOS (--cgroups-per-qos) true is not supported on Windows, invalid configuration: EnforceNodeAllocatable (--enforce-node-allocatable) [pods] is not supported on Windows], path: &TypeMeta{Kind:,APIVersion:,}"

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Unable to connect to the server: dial tcp 127.0.0.1:6443: connectex: No connection could be made because the target machine actively refused it.

Cloud provider

On prem

OS version

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
BuildNumber  Caption                                   OSArchitecture  Version
20348        Microsoft Windows Server 2022 Datacenter  64-bit          10.0.20348

Install tools

Kubeadm

Container runtime (CRI) and version (if applicable)

containerd v1.6

Related plugins (CNI, CSI, ...) and versions (if applicable)

yahbouss added the kind/bug label Feb 7, 2024
k8s-ci-robot added the needs-sig label Feb 7, 2024
@k8s-ci-robot
Contributor

There are no sig labels on this issue. Please add an appropriate label by using one of the following commands:

  • /sig <group-name>
  • /wg <group-name>
  • /committee <group-name>

Please see the group list for a listing of the SIGs, working groups, and committees available.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

k8s-ci-robot added the needs-triage label Feb 7, 2024
@neolit123
Member

sig-windows-tools tickets should be logged in that repo, not here.
also try asking for help in the #sig-windows slack channel of k8s.

generally, we don't provide support on github.
https://github.com/kubernetes/kubernetes/blob/master/SUPPORT.md

/close
/kind support

k8s-ci-robot added the kind/support label Feb 7, 2024
@k8s-ci-robot
Contributor

@neolit123: Closing this issue.


@neolit123
Member

neolit123 commented Feb 7, 2024

E0207 15:09:40.843756 7096 run.go:74] "command failed" err="failed to validate kubelet configuration, error: [invalid configuration: CgroupsPerQOS (--cgroups-per-qos) true is not supported on Windows, invalid configuration: EnforceNodeAllocatable (--enforce-node-allocatable) [pods] is not supported on Windows], path: &TypeMeta{Kind:,APIVersion:,}"

FWIW, that PowerShell script should work, as it has these options configured correctly on Windows. or maybe it's a different script
https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/hostprocess/PrepareNode.ps1#L73

also i have a PR for the kubelet to stop erroring on these exact options
#123137
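
Editor's note: the thread does not quote what "configured correctly" means, so the following is only a sketch under assumptions. The flag names come from the error message itself; the values (`false` and an empty list) are inferred from that message, which rejects `true` and `[pods]`. The helper name `with_windows_overrides` is hypothetical, and the sketch only handles arguments written as `--flag=value`.

```python
def with_windows_overrides(args: list[str]) -> list[str]:
    """Return kubelet arguments with the two rejected options forced to
    values the validation error suggests are acceptable on Windows.

    Assumption: values are inferred from the error text, not taken from
    the sig-windows-tools script.
    """
    overrides = {
        "--cgroups-per-qos": "false",      # the error rejected: true
        "--enforce-node-allocatable": "",  # the error rejected: [pods]
    }
    # Drop existing occurrences of the overridden flags (only the
    # --flag=value form is recognized), then append the overrides.
    kept = [a for a in args if a.split("=", 1)[0] not in overrides]
    return kept + [f"{flag}={value}" for flag, value in overrides.items()]
```

For example, `with_windows_overrides(["--config=/var/lib/kubelet/config.yaml", "--cgroups-per-qos=true"])` keeps the `--config` argument and replaces the QoS flag with `--cgroups-per-qos=false`, followed by `--enforce-node-allocatable=`. Whether these flags are the right fix for a given node should be checked against the script linked above and the kubelet version in use.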
