
migration: match SELinux level of source pod on target pod #9246

Merged
1 commit merged into kubevirt:main on Jun 6, 2023

Conversation

Contributor

@jean-edouard jean-edouard commented Feb 14, 2023

What this PR does / why we need it:
When VMIs use RWX disks based on some storage classes, like cephfs, both the source and target pods of a migration need to have access to the files.
There are 2 ways to achieve that:
- relabel the shared files with a plain level (s0, no categories), so that any pod on the node can access them for the duration of the migration, or
- give the target pod the same SELinux level (and therefore the same categories) as the source pod, which is what this PR does.

Both solutions have negative security implications, but they are the only ways to deal with shared resources.
I believe this is the best approach, as it doesn't mess with the disk and doesn't expose files to the entire node/cluster for the duration of the migration.
The only downside here is that the target node could already have created, or could later create, another pod with the same categories.
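
For illustration, matching the level comes down to copying the source virt-launcher pod's MCS level (sensitivity plus categories, e.g. "s0:c123,c456") into the target pod's SecurityContext. A minimal standalone sketch using the upstream Kubernetes core/v1 types; the variable names and the literal level are illustrative, not the PR's actual code:

```go
package main

import (
	"fmt"

	k8sv1 "k8s.io/api/core/v1"
)

func main() {
	// Level read from the source pod; the categories (c123,c456 here) are what
	// grant both pods access to the same relabeled files on the RWX volume.
	sourceLevel := "s0:c123,c456"

	// The nil guard mirrors the usual pattern: a pod template may or may not
	// already carry a SecurityContext.
	targetPod := &k8sv1.Pod{}
	if targetPod.Spec.SecurityContext == nil {
		targetPod.Spec.SecurityContext = &k8sv1.PodSecurityContext{}
	}
	targetPod.Spec.SecurityContext.SELinuxOptions = &k8sv1.SELinuxOptions{
		Level: sourceLevel,
	}

	fmt.Println(targetPod.Spec.SecurityContext.SELinuxOptions.Level) // "s0:c123,c456"
}
```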

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

Fixed migration issue for VMIs that have RWX disks backed by filesystem storage classes.

@kubevirt-bot kubevirt-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/S labels Feb 14, 2023
@jean-edouard
Contributor Author

/retest

@jean-edouard
Contributor Author

/retest-required

@jean-edouard jean-edouard changed the title WIP: migration: match SELinux level of source pod on target pod migration: match SELinux level of source pod on target pod Feb 22, 2023
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 22, 2023
@jean-edouard
Contributor Author

A functest was added and this is now ready for reviews!

@jean-edouard
Contributor Author

/retest

Contributor

@akalenyu akalenyu left a comment


Looks good!

This does not happen in the RWX Block case, right?
I am just trying to understand why this was never an issue with our CI lanes running ceph rbd migrations.

	templatePod.Spec.SecurityContext = &k8sv1.PodSecurityContext{}
}
templatePod.Spec.SecurityContext.SELinuxOptions = &k8sv1.SELinuxOptions{
	Level: seFields[3],
Contributor

@akalenyu akalenyu Mar 1, 2023


Just a suggestion, but it may be worth trying to construct the context and grab the level out of it?

func newContext(label string) (Context, error) {
	c := make(Context)

	if len(label) != 0 {
		con := strings.SplitN(label, ":", 4)
		if len(con) < 3 {
			return c, InvalidLabel
		}
		c["user"] = con[0]
		c["role"] = con[1]
		c["type"] = con[2]
		if len(con) > 3 {
			c["level"] = con[3]
		}
	}
	return c, nil
}
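
For illustration, a minimal standalone sketch of grabbing the level by splitting the label the same way; the `levelFromLabel` helper name and the sample label are made up for this example, not the PR's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// levelFromLabel extracts the SELinux level (sensitivity plus optional categories)
// from a full label like "system_u:system_r:container_t:s0:c123,c456" by splitting
// on ":" the same way newContext does, instead of indexing seFields[3] directly.
func levelFromLabel(label string) (string, error) {
	con := strings.SplitN(label, ":", 4)
	if len(con) < 4 {
		return "", fmt.Errorf("label %q has no level component", label)
	}
	return con[3], nil
}

func main() {
	level, err := levelFromLabel("system_u:system_r:container_t:s0:c123,c456")
	if err != nil {
		panic(err)
	}
	fmt.Println(level) // "s0:c123,c456"
}
```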

Contributor Author


Good point, will do!

@jean-edouard
Contributor Author

> Looks good!
>
> This does not happen in the RWX Block case, right? I am just trying to understand why this was never an issue with our CI lanes running ceph rbd migrations.

This is actually specifically for RWX, but it's needed only for FS CSIs (at least from what I've seen so far). Ceph rbd shouldn't need this.

@akalenyu
Contributor

akalenyu commented Mar 1, 2023

> Looks good!
> This does not happen in the RWX Block case, right? I am just trying to understand why this was never an issue with our CI lanes running ceph rbd migrations.
>
> This is actually specifically for RWX, but it's needed only for FS CSIs (at least from what I've seen so far). Ceph rbd shouldn't need this.

Yeah, I'm just trying to understand why this isn't a problem for ceph rbd; interesting.

@jean-edouard
Contributor Author

> Looks good!
> This does not happen in the RWX Block case, right? I am just trying to understand why this was never an issue with our CI lanes running ceph rbd migrations.
>
> This is actually specifically for RWX, but it's needed only for FS CSIs (at least from what I've seen so far). Ceph rbd shouldn't need this.
>
> Yeah, I'm just trying to understand why this isn't a problem for ceph rbd; interesting.

Don't quote me on that, but I think device nodes for block volumes get their own label inside the pod mount namespace, thanks to overlayfs.
However, for FS volumes, the labels are handled at the filesystem level inside each volume, and there's not much k8s/overlayfs can do there.
As you can probably tell, I'm no k8s storage expert, but hopefully that helps!

@akalenyu
Contributor

akalenyu commented Mar 5, 2023

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 5, 2023
@jean-edouard
Contributor Author

/cc @xpivarc @vladikr
/retest

@xpivarc
Member

xpivarc commented Mar 24, 2023

Hey @jean-edouard,
Did you have a chance to explore how this works in Kubernetes? I mean, what happens if I want a ReplicaSet with a shared PVC? Will only one of those Pods get to read and write to the PVC?

Also, how is it possible that we only hit this now? Is it possible this is a problem specific to a subset of CSIs?

Sorry for the delay; I should be more available moving forward.

@jean-edouard
Contributor Author

> Hey @jean-edouard, did you have a chance to explore how this works in Kubernetes? I mean, what happens if I want a ReplicaSet with a shared PVC? Will only one of those Pods get to read and write to the PVC?

With cephfs, I believe so, yes. Unless some privileged entity auto-relabels new files with the level s0 (removing the categories).

> Also, how is it possible that we only hit this now? Is it possible this is a problem specific to a subset of CSIs?

Yes, so far this issue has only been seen with cephfs. It is discussed here:
ceph/ceph-csi#3562

@jean-edouard
Contributor Author

/retest

@jean-edouard
Contributor Author

@akalenyu could you please re-review and maybe re-lgtm? Thanks!
@xpivarc can we unhold?

@@ -631,6 +633,32 @@ func (c *MigrationController) createTargetPod(migration *virtv1.VirtualMachineIn
		}
	}

	matchLevelOnTarget := c.clusterConfig.GetMigrationConfiguration().MatchSELinuxLevelOnMigration
	if matchLevelOnTarget == nil || *matchLevelOnTarget {
Contributor


Is this condition correct? If the tunable for this is omitted, is it an opt-in?
Also, just nitpicking, it may be nice to return early if the tunable is false, avoiding the extra indentation.
That would probably mean this chunk needs to be extracted into its own func.
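
For illustration, a minimal sketch of that suggestion, based only on the fields visible in the diff above; the helper names and signatures are hypothetical, not the PR's actual code:

```go
// Hypothetical helper: the tunable check moves into a small method so callers
// can return early instead of indenting the whole block.
func (c *MigrationController) shouldMatchSELinuxLevelOnMigration() bool {
	matchLevelOnTarget := c.clusterConfig.GetMigrationConfiguration().MatchSELinuxLevelOnMigration
	// nil (tunable omitted) means enabled by default, same as the condition above.
	return matchLevelOnTarget == nil || *matchLevelOnTarget
}

// The extracted chunk could then start with an early return, e.g.:
//
//	func (c *MigrationController) matchSELinuxLevelOnTargetPod(templatePod *k8sv1.Pod, sourcePodSELinuxContext string) error {
//		if !c.shouldMatchSELinuxLevelOnMigration() {
//			return nil
//		}
//		// ... copy the level onto templatePod.Spec.SecurityContext.SELinuxOptions ...
//		return nil
//	}
```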

Contributor Author


Yeah, it's true by default, as discussed here: #9246 (comment)
I'll address the second part ASAP, thank you!

Contributor Author


Done, PTAL!

@xpivarc
Member

xpivarc commented Jun 1, 2023

/hold cancel

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 1, 2023
@akalenyu
Contributor

akalenyu commented Jun 1, 2023

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 1, 2023
@kubevirt-commenter-bot

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.


@brianmcarey
Member

@jean-edouard - just a heads up. It looks like the SRIOV lane has never passed on this PR so there might be an issue there.
https://prow.ci.kubevirt.io/pr-history/?org=kubevirt&repo=kubevirt&pr=9246

Not sure if there is a point in leaving this to just retest.

Member

@xpivarc xpivarc left a comment


/hold

// Therefore, it needs to share the same SELinux categories to inherit the same permissions
// Note: there is a small probability that the target pod will share the same categories as another pod on its node.
// It is a slight security concern, but not as bad as removing categories on all shared objects for the duration of the migration.
if vmiSeContext == "" {
Member


It seems we do "selinuxContext": "none" rather than an empty string.

Contributor Author


Really nice catch, thank you! Should be fixed now.

Member


I agree @brianmcarey

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 2, 2023
Signed-off-by: Jed Lejosne <jed@redhat.com>
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 2, 2023
@xpivarc
Member

xpivarc commented Jun 2, 2023

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 2, 2023
@xpivarc
Member

xpivarc commented Jun 2, 2023

@jean-edouard maybe we can introduce helpers for this so we don't get this wrong again in the future.

@jean-edouard
Contributor Author

> @jean-edouard maybe we can introduce helpers for this so we don't get this wrong again in the future.

@xpivarc good idea, OK to do that in a later PR and unhold this one? Thank you!

@xpivarc
Member

xpivarc commented Jun 5, 2023

/hold cancel
@jean-edouard Sorry, I missed the hold!

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 5, 2023
@jean-edouard
Contributor Author

/test pull-kubevirt-e2e-kind-1.27-sriov

@kubevirt-bot
Contributor

@jean-edouard: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubevirt-e2e-kind-1.23-sriov | 7bbf98d | link | true | /test pull-kubevirt-e2e-kind-1.23-sriov |
| pull-kubevirt-fossa | cae7333 | link | false | /test pull-kubevirt-fossa |
| pull-kubevirt-e2e-k8s-1.24-sig-compute | 80254d4 | link | true | /test pull-kubevirt-e2e-k8s-1.24-sig-compute |
| pull-kubevirt-e2e-k8s-1.26-sig-compute | 8c11802 | link | unknown | /test pull-kubevirt-e2e-k8s-1.26-sig-compute |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jean-edouard
Contributor Author

/retest

@kubevirt-bot kubevirt-bot merged commit 28a3e16 into kubevirt:main Jun 6, 2023
35 of 36 checks passed
Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
dco-signoff: yes - Indicates the PR's author has DCO signed all their commits.
kind/api-change - Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
lgtm - Indicates that a PR is ready to be merged.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
size/L