Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check Kubelet is running with correct Windows Permissions #96616

Merged

Conversation

perithompson
Copy link
Contributor

@perithompson perithompson commented Nov 16, 2020

What type of PR is this?
/kind bug

What this PR does / why we need it:
This PR adds a Windows-specific permissions check on the kubelet startup. Currently, the log file outputs an error on windows stating that the UID is -1 as UID does not exist in a windows context.
Which issue(s) this PR fixes:

Fixes #96512

Special notes for your reviewer:
As part of this PR I have moved the existing checkPermissions function to server_others.go to replicate similar behaviour in kube-proxy as the checkPermissions is not linux specific it should be build whenever not running on windows.

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

The token.IsMember(sid) returns TRUE if the caller's process is a member of the Administrators local group. Caller is NOT
expected to be impersonating anyone and is expected to be able to open its own process and process token. In order for the process to "Run as administrator" the excuting user must be in the BUILTIN\Administrators Group so I have added a check for that too.

More information about this can be found in in the c++ implementation routine and this go issue

Release Notes

Kubelet.exe on Windows now checks that the process running as administrator and the executing user account is listed in the built-in administrators group.  This is the equivalent to checking the process is running as uid 0.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress kind/bug size/L do-not-merge/release-note-label-needed do-not-merge/needs-sig cncf-cla: yes needs-triage labels Nov 16, 2020
@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Nov 16, 2020

Welcome @perithompson!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test label Nov 16, 2020
@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Nov 16, 2020

Hi @perithompson. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority label Nov 16, 2020
@k8s-ci-robot k8s-ci-robot requested review from vishh and yujuhong Nov 16, 2020
@jayunit100
Copy link
Member

@jayunit100 jayunit100 commented Nov 16, 2020

@perithompson ok, next steps get your CLAs sorted internally (i.e. at vmware and sign the cla contributor stuff)

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node size/M and removed do-not-merge/needs-sig size/L labels Nov 16, 2020
@perithompson perithompson force-pushed the windows-check-permissions branch from 5d4a15f to af6a2ad Compare Nov 16, 2020
@k8s-ci-robot k8s-ci-robot added size/L and removed size/M labels Nov 17, 2020
@perithompson perithompson force-pushed the windows-check-permissions branch from bf096c6 to 0c73106 Compare Nov 17, 2020
@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Nov 17, 2020

/sig windows

@k8s-ci-robot k8s-ci-robot added the sig/windows label Nov 17, 2020
@perithompson perithompson marked this pull request as ready for review Nov 17, 2020
@perithompson perithompson changed the title (WIP) Check Kubelet is running with correct Windows Permissions Check Kubelet is running with correct Windows Permissions Nov 17, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress label Nov 17, 2020
@ehashman ehashman added this to Triage in SIG Node PR Triage Jan 6, 2021
@@ -135,7 +137,7 @@ go_library(
],
"@io_bazel_rules_go//go/platform:windows": [
"//pkg/windows/service:go_default_library",
"//vendor/golang.org/x/sys/windows:go_default_library",
"//vendor/golang.org/x/sys/windows:go_default_library",
Copy link
Member

@pacoxu pacoxu Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments here: Extra space.

@marosset
Copy link
Contributor

@marosset marosset commented Mar 9, 2021

/test pull-kubernetes-e2e-aks-engine-azure-windows
/test pull-kubernetes-e2e-aks-engine-windows-containerd

@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Mar 9, 2021

@perithompson can you rebase the PR?

Not sure how you did it @marosset but it suddenly realised it was ok 🙂

@marosset
Copy link
Contributor

@marosset marosset commented Mar 9, 2021

/retest

1 similar comment
@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Mar 9, 2021

/retest

@ehashman ehashman moved this from Needs Reviewer to Needs Approver in SIG Node PR Triage Mar 9, 2021
@marosset
Copy link
Contributor

@marosset marosset commented Mar 10, 2021

/retest

cmd/kubelet/app/server_windows.go Outdated Show resolved Hide resolved
cmd/kubelet/app/server_windows.go Outdated Show resolved Hide resolved
@adisky
Copy link
Contributor

@adisky adisky commented Mar 10, 2021

@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Mar 10, 2021

@perithompson The e2e test failures looks legit failures https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/96616/pull-kubernetes-e2e-aks-engine-azure-windows/1369474116045770752

From talking with some of the MSFT guys this was test infra related, but agree it does always seem like the same problem and I was thinking it was something to do with CSI that was causing it. I wouldn't have expected the kubelet to even start with where the checkPermissions function is called, if there was an error in that code then kubelet would fail to start and all the e2es would go red

@perithompson perithompson force-pushed the windows-check-permissions branch from ad8de38 to 23eac3c Compare Mar 10, 2021
@k8s-ci-robot k8s-ci-robot removed the lgtm label Mar 10, 2021
@perithompson perithompson force-pushed the windows-check-permissions branch from 23eac3c to f1e2041 Compare Mar 10, 2021
@perithompson perithompson force-pushed the windows-check-permissions branch from f1e2041 to 46738b7 Compare Mar 10, 2021
@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Mar 10, 2021

/retest

@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Mar 10, 2021

/test pull-kubernetes-e2e-kind

@perithompson
Copy link
Contributor Author

@perithompson perithompson commented Mar 10, 2021

@jsturtevant guessing/hoping these are still the same failures?

Mar 10 14:32:59.730: INFO: 
Latency metrics for node k8s-master-36380154-0
Mar 10 14:32:59.730: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "azuredisk-9336" for this suite.
• Failure [413.600 seconds]
Dynamic Provisioning
/home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/dynamic_provisioning_test.go:38
  [single-az]
  /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/dynamic_provisioning_test.go:41
    should create a statefulset object, write and read to it, delete the pod and write and read to it again [kubernetes.io/azure-disk] [disk.csi.azure.com] [Windows] [It]
    /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/dynamic_provisioning_test.go:702
    Unexpected error:
      
        <*errors.errorString | 0xc0003a22b0>: {
            s: "timed out waiting for the condition",
        }
        timed out waiting for the condition
    occurred
    /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/testsuites/testsuites.go:634
------------------------------
SSSSSSSSSSSSSSSSSSS
[Fail] Dynamic Provisioning [single-az] [It] should create a statefulset object, write and read to it, delete the pod and write and read to it again [kubernetes.io/azure-disk] [disk.csi.azure.com] [Windows] 
/home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/testsuites/testsuites.go:634
Ran 9 of 44 Specs in 1796.376 seconds
FAIL! -- 8 Passed | 1 Failed | 0 Pending | 35 Skipped
--- FAIL: TestE2E (1796.38s)
FAIL
FAIL	sigs.k8s.io/azuredisk-csi-driver/test/e2e	1796.452s
FAIL
make: *** [Makefile:112: e2e-test] Error 1
2021/03/10 14:33:06 process.go:155: Step 'make e2e-test' finished in 31m43.364986236s
2021/03/10 14:33:06 aksengine_helpers.go:424: downloading /root/tmp863472555/log-dump.sh from https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:33:06 util.go:68: curl https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:33:06 process.go:153: Running: chmod +x /root/tmp863472555/log-dump.sh
 Mar 10 14:16:26.559: INFO: 
Latency metrics for node k8s-master-23042803-0
Mar 10 14:16:26.559: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "azurefile-5802" for this suite.
• Failure [305.802 seconds]
Dynamic Provisioning
/home/prow/go/src/sigs.k8s.io/azurefile-csi-driver/test/e2e/dynamic_provisioning_test.go:40
  should create a volume on demand with useDataPlaneAPI [kubernetes.io/azure-file] [file.csi.azure.com] [Windows] [It]
  /home/prow/go/src/sigs.k8s.io/azurefile-csi-driver/test/e2e/dynamic_provisioning_test.go:646
[Fail] Dynamic Provisioning [It] should create a volume on demand with useDataPlaneAPI [kubernetes.io/azure-file] [file.csi.azure.com] [Windows] 
/home/prow/go/src/sigs.k8s.io/azurefile-csi-driver/test/e2e/testsuites/testsuites.go:219
Ran 7 of 28 Specs in 709.103 seconds
FAIL! -- 6 Passed | 1 Failed | 0 Pending | 21 Skipped
--- FAIL: TestE2E (709.10s)
FAIL
FAIL	sigs.k8s.io/azurefile-csi-driver/test/e2e	709.155s
FAIL
make: *** [Makefile:84: e2e-test] Error 1
2021/03/10 14:16:38 process.go:155: Step 'make e2e-test' finished in 13m7.41104087s
2021/03/10 14:16:38 aksengine_helpers.go:424: downloading /root/tmp863250335/log-dump.sh from https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:16:38 util.go:68: curl https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:16:39 process.go:153: Running: chmod +x /root/tmp863250335/log-dump.sh 

@jayunit100
Copy link
Member

@jayunit100 jayunit100 commented Mar 10, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 10, 2021
@marosset
Copy link
Contributor

@marosset marosset commented Mar 10, 2021

/retest

@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Mar 10, 2021

@perithompson: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-e2e-aks-engine-azure-windows ad8de38 link /test pull-kubernetes-e2e-aks-engine-azure-windows
pull-kubernetes-e2e-azure-file-windows 46738b7 link /test pull-kubernetes-e2e-azure-file-windows
pull-kubernetes-e2e-azure-disk-windows 46738b7 link /test pull-kubernetes-e2e-azure-disk-windows

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mrunalp
Copy link
Contributor

@mrunalp mrunalp commented Mar 11, 2021

/approve

@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Mar 11, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jayunit100, mrunalp, perithompson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Mar 11, 2021
@mrunalp mrunalp moved this from Needs Approver to Done in SIG Node PR Triage Mar 11, 2021
@k8s-ci-robot k8s-ci-robot merged commit a8ac0c9 into kubernetes:master Mar 12, 2021
13 of 16 checks passed
SIG-Windows automation moved this from In Review (v1.21) to Done (v1.21) Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved area/kubelet cncf-cla: yes kind/bug lgtm ok-to-test priority/important-longterm release-note sig/node sig/windows size/L triage/accepted
Projects
SIG-Windows
  
Done (v1.21)
Development

Successfully merging this pull request may close these issues.