
Allow specifying custom Ceph user and secret name for mounting #2216

Merged: 1 commit into rook:master on Dec 4, 2018

Conversation

@galexrt (Member) commented Oct 14, 2018

Description of your changes:

Introduce mount security mode for basic multi tenancy

This adds three new parameters/options to the StorageClass/flexvolume entry (an illustrative sketch follows the list):

  • mountUser
  • mountSecret
  • mountSecretNamespace
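
For illustration, a minimal sketch of how these options could look on a Rook block StorageClass. The class name, pool, and secret name below are placeholders, and everything other than the three new options is assumed from the Rook v0.9 block storage class example, so verify the exact keys against cluster/examples/kubernetes/ceph/storageclass.yaml:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block                   # placeholder name
provisioner: ceph.rook.io/block           # assumed Rook v0.9 flex provisioner
parameters:
  blockPool: replicapool                  # placeholder pool
  clusterNamespace: rook-ceph
  fstype: xfs
  # New options from this PR:
  mountUser: user1                        # existing Ceph user to mount with
  mountSecret: rook-ceph-mount-secret     # Kubernetes Secret holding that user's Ceph key
  mountSecretNamespace: rook-ceph         # added later in this PR; per the discussion below it defaults to the pod's namespace if unset
```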

Will test in a minute or two, but I'm currently "rebuilding" my vendor directory, which takes quite some time.

Which issue is resolved by this Pull Request:
Resolves #2164.

Checklist:

  • Documentation has been updated, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.
  • Comments have been added or updated based on the standards set in CONTRIBUTING.md

/cc @dimm0 It took me a bit longer than I said in the community meeting, but here it is.

@galexrt (Member, Author) commented Oct 14, 2018

One thing to add here: this change affects not only filesystem mounting but also block storage. To put it differently, I thought, why stop at the filesystem?

There seems to be one more quirk I have to fix in the code to get the tests green. Will look on Monday, but in general this is ready for review.

@galexrt requested a review from travisn on October 15, 2018 08:54
@travisn (Member) left a comment

How much simpler would this be if it was only supported for CephFS? If there isn't a request for block to support this, it seems like we should keep it simple. The doc example also only discusses how to use it for the file system, right?

Since mounting CephFS requires using flex directly instead of a storage class, it seems like this would simplify a number of places.

  • The username and secret would be set on the flexvolume entry directly (see the sketch after this list)
  • If they are not set, the admin account would be used, which is the behavior today and would be the new default
  • No need for an env var in the operator.yaml that controls the policy
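
For context, a rough sketch of what specifying the user and secret directly on a CephFS flexvolume could look like. This is illustrative only: the driver name and the options other than mountUser/mountSecret are assumptions based on the Rook v0.9 filesystem examples, and the actual keys are defined by the docs and flexvolume code changed in this PR:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cephfs-demo                       # placeholder pod
spec:
  containers:
  - name: app
    image: busybox                        # placeholder image
    command: ["sleep", "3600"]
    volumeMounts:
    - name: cephfs
      mountPath: /data
  volumes:
  - name: cephfs
    flexVolume:
      driver: ceph.rook.io/rook           # assumed Rook flex driver name (v0.9)
      fsType: ceph
      options:
        fsName: myfs                      # placeholder filesystem name
        clusterNamespace: rook-ceph
        mountUser: user1                  # the new options from this PR
        mountSecret: rook-ceph-mount-secret
```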

@@ -155,14 +174,30 @@ func (vm *VolumeManager) Detach(image, pool, clusterNamespace string, force bool
return nil
}

if id == "" && key == "" {
	return fmt.Errorf("no id nor keyring given, can't unmount without credentials")
Member:

unmapping requires the keyring?

@galexrt (Member, Author):

I need to verify; the unmap command has the keyring flag set on it, so I added it to make sure the credentials are set the same way as in the map command.

@travisn (Member) left a comment

Per discussion in huddle, if nobody needs the feature for block, let's keep it simple and only support it for cephfs.

Documentation/advanced-configuration.md:
# (Optional) Specify an existing Ceph user that will be used for mounting storage with this StorageClass.
#mountUser: user1
# (Optional) Specify an existing Kubernetes Secret name containing just one key holding the Ceph user secret.
# The secret must exist in each namespace(s) where the storage will be consumed!
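
A sketch of what such a secret could look like. The secret name, key name, and type below are illustrative; the only requirement quoted above is that the secret contains a single key holding the Ceph user's key, and that it exists in the consuming namespace:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rook-ceph-mount-secret            # must match mountSecret in the StorageClass/flexvolume options
  namespace: app-namespace                # a namespace where the storage is consumed
type: kubernetes.io/rook                  # assumed type; check the docs added in this PR
data:
  # exactly one key; the value is the base64-encoded Ceph key for the mountUser
  # (e.g. the output of `ceph auth get-key client.user1 | base64`)
  key: QVFEcGxhY2Vob2xkZXIK               # placeholder value
```
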
@bassam (Member) commented Oct 17, 2018

should there also be a secret namespace?

@galexrt (Member, Author):

@bassam No, the namespace of the Pod mounting the storage will be used to look up the secret. This is in "conformance" with some existing in-tree plugins, which look up the secret in the namespace of the pod using the storage.

Member:

All storage classes that have a secret also specify the namespace. Why is this different? https://kubernetes.io/docs/concepts/storage/storage-classes/#ceph-rbd

@galexrt (Member, Author):

That seems to be new for the user-provided mount secret:
https://v1-11.docs.kubernetes.io/docs/concepts/storage/storage-classes/#ceph-rbd
Sure, I'll add that too, though I would default it to the pod namespace.

@galexrt force-pushed the fix_2164 branch 3 times, most recently from 548d0c2 to 5688c1d on October 18, 2018 19:28
@galexrt (Member, Author) commented Oct 18, 2018

@bassam @travisn I updated the PR with the doc changes and the mountSecretNamespace parameter/option. PTAL

@galexrt (Member, Author) commented Oct 24, 2018

@bassam I just realized a security problem with the mountSecretNamespace option.
Example:

  • Namespace ABC has a Secret my-ceph-secret.
  • User A has only access to namespace XYZ.
  • Because the agent needs to have access to "all" namespaces in which a mount secret will be, the agent has access to namespace ABC and XYZ.
  • If User A wants to use the secret from namespace ABC, the user just needs to set mountSecretNamespace: ABC and would be able to use the secret from that namespace.

I would be for removing the mountSecretNamespace option as it is too much of a security issue.

@travisn (Member) commented Oct 26, 2018

@galexrt what would it take to update a filesystem integration test to generate and mount with the user creds? It seems like it wouldn't be too big a work item.

@galexrt force-pushed the fix_2164 branch 5 times, most recently from e374797 to 76a074c on October 29, 2018 05:37
@dimm0 (Contributor) commented Oct 31, 2018

Came here to whine about authentication...

@travisn (Member) commented Nov 6, 2018

@galexrt Looking at the integration tests, here are a few suggestions. It's not clear yet what the underlying issue is with deleting the mds pod. I am not able to repro the issue locally, similar to your findings; only Jenkins seems to hit the issue whenever the mds pods are deleted. It doesn't seem to be an interaction between the integration tests, because each test uses labels to look only for the mds pods that it created.

To troubleshoot further, I would suggest printing the output of pod describe for the mds pod(s). For example, add the following line at the end of the method where we see the message about timing out waiting for the pod to be deleted, around line 518 in k8s_helper.go. Another idea is to print the logs for an mds pod.

	k8sh.PrintPodDescribe(namespace, "-l", label)

Perhaps on an unrelated note, the operator log shows we are calling an obsolete command to deactivate the mds.

2018-11-06 23:09:06.859733 I | exec: Running command: ceph mds deactivate smoke-test-fs:1 --cluster=smoke-ns --conf=/var/lib/rook/smoke-ns/smoke-ns.config --keyring=/var/lib/rook/smoke-ns/client.admin.keyring --format json --out-file /tmp/994480105
2018-11-06 23:09:07.158089 I | exec: Error ENOTSUP: command is obsolete; please check usage and/or man page

@galexrt force-pushed the fix_2164 branch 2 times, most recently from 2081085 to d2b2700 on November 10, 2018 22:18
@jjgraham commented:

Any chance we will have mount security for basic multi tenancy soon?

@galexrt (Member, Author) commented Dec 2, 2018

@travisn I just had three green runs; restarted the CI:

https://jenkins.rook.io/blue/organizations/jenkins/rook%2Frook/detail/PR-2216/60/pipeline/

If it fails another time on 1.8 and 1.9, I'll take a closer look at why it failed.
To document it here already:

@galexrt (Member, Author) commented Dec 2, 2018

Oh well, now I have 4 of 5 green.

aws 1.9 failed with `Expected nil, but got: &errors.errorString{s:"rgw did not start via crd. Giving up waiting for pod with label rook_object_store=default in namespace helm-ns to be running"}`; I'll look into whether I can get some clues from the logs about what causes this error right now.

@@ -154,7 +154,7 @@ def RunIntegrationTest(k, v) {
export PATH="/tmp/rook-tests-scripts-helm/linux-amd64:$PATH" \
KUBECONFIG=$HOME/admin.conf
kubectl config view
-_output/tests/linux_amd64/integration -test.v -test.timeout 2400s --host_type '''+"${k}"+''' --helm /tmp/rook-tests-scripts-helm/linux-amd64/helm 2>&1 | tee _output/tests/integrationTests.log'''
+_output/tests/linux_amd64/integration -test.v -test.timeout 7200s --host_type '''+"${k}"+''' --helm /tmp/rook-tests-scripts-helm/linux-amd64/helm 2>&1 | tee _output/tests/integrationTests.log'''
Member:

a timeout of 2h seems too long. how about 60 minutes for now?

@galexrt (Member, Author) commented Dec 3, 2018:

I think I had problems with a timeout of "just" 60 minutes in the CI. As written below, I just "cranked" it up.

Member:

The typical time for the integration tests is currently 35 minutes. If it goes over 60 minutes, it seems like Jenkins should kill it, because the tests are surely failing. The longer we wait, the longer the Jenkins backlog can get. Hopefully we can fix the resource issue with Jenkins soon...

-@CGO_ENABLED=0 $(GOHOST) test -v -i $(GO_STATIC_FLAGS) $(GO_INTEGRATION_TEST_PACKAGES)
-@CGO_ENABLED=0 $(GOHOST) test -v $(GO_TEST_FLAGS) $(GO_STATIC_FLAGS) $(GO_INTEGRATION_TEST_PACKAGES) $(TEST_FILTER_PARAM) 2>&1 | tee $(GO_TEST_OUTPUT)/integration-tests.log
+CGO_ENABLED=0 $(GOHOST) test -v -i $(GO_STATIC_FLAGS) $(GO_INTEGRATION_TEST_PACKAGES)
+CGO_ENABLED=0 $(GOHOST) test -v -timeout 7200s $(GO_TEST_FLAGS) $(GO_STATIC_FLAGS) $(GO_INTEGRATION_TEST_PACKAGES) $(TEST_FILTER_PARAM) 2>&1 | tee $(GO_TEST_OUTPUT)/integration-tests.log
Member:

why do we need this timeout? if someone is running the integration tests locally, they could just cancel them.

@galexrt (Member, Author):

@travisn Well.. during local testing when I ran the tests I got test timeouts, so I just "cranked" it up.

Introduce mount security mode for basic multi tenancy

Fixes rook#2164.

This adds three new parameters/options to StorageClass/flexvolume entry:
* `mountUser`
* `mountSecret`
* `mountSecretNamespace`

Signed-off-by: Alexander Trost <galexrt@googlemail.com>
@galexrt (Member, Author) commented Dec 4, 2018

@travisn Please merge the PR when the CI is green, as I'm heading to bed now and want to finally get the PR merged.

@dimm0 (Contributor) commented Dec 4, 2018

Crossed all the fingers I have

@galexrt (Member, Author) commented Dec 4, 2018

@travisn it's green! Finally.

@galexrt merged commit 18b2da5 into rook:master on Dec 4, 2018
@galexrt deleted the fix_2164 branch on December 4, 2018 22:54
@baseyou commented Dec 13, 2018

So, if the node reboots, will the security mode env be lost?

@galexrt (Member, Author) commented Dec 13, 2018

@planeo1105 No. You are probably running a master image, which is a moving tag and can (and at some point will) cause issues. If you switch to a release-tagged image, e.g. v0.9.0, the env var should be there and understood by all Rook Ceph Pods.

Please create a new issue with your problem next time.

Labels: ceph (main ceph tag), operator
Development: Successfully merging this pull request may close these issues: Shared FS client path restriction (#2164)
6 participants