-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mds: check relevant caps for fs include root_squash #57192
Conversation
This PR is under test in https://tracker.ceph.com/issues/65771. |
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A speedy turnaround with this one, great job, Patrick!
As usual, I'm cautious about adding a managed cache to the internal state where const calculation is cheap. We can afford to scan a list of a few tens of sessions once in 5 seconds and thus avoid inflation of the managed state and potential bugs this can bring.
src/mds/mdstypes.cc
Outdated
n += bits_per_block; | ||
n /= bits_per_block; | ||
_vec.resize(n, 0); | ||
n += bits_per_block; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: something's wrong with the indentation in this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
and a client upgrade is required. | ||
|
||
This is a HEALTH_ERR warning because of the danger of inconsistency and lost | ||
data. It is recommended to either upgrade your clients, discontinue using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lost data
What sort of scenario would this happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually: https://tracker.ceph.com/issues/56067
@batrick I know it's going to be squashed so it doesn't matter, but sign-off in last commit is missing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
self.update_attrs(**kwargs) | ||
|
||
retval = self.mount(mntopts=mntopts, check_status=check_status) | ||
retval = self.mount(mntopts=mntopts, check_status=check_status, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing kwargs
is redundant because each k-v pair in kwargs are set object attributes by update_attrs()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See commit message in qa: pass kwargs to mount from remount
.
# should succeed | ||
with self.assert_cluster_log("with broken root_squash implementation"): | ||
keyring_path = self.mount_a.client_remote.mktemp(data=keyring) | ||
self.mount_a.remount(client_id=self.client_id, client_keyring_path=keyring_path, mntargs=mntargs, cephfs_name=self.fs.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same, will be better if wrap such lines before 81 chars.
|
||
CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK = 21 | ||
# all but CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK | ||
features = ",".join([str(i) for i in range(CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same, will be better if wrap such lines before 81 chars.
qa/tasks/cephfs/test_admin.py
Outdated
|
||
captester = CapTester(self.mount_a, '/') | ||
captester.conduct_pos_test_for_read_caps() | ||
captester.conduct_pos_test_for_open_caps() | ||
|
||
def test_rootsquash_nofeature(self): | ||
""" | ||
That having root_squash on an fs without the feature bit raises an HEALTH_ERR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How this line would currently look in teuthology.log
-
2024-05-01T05:49:57.520 INFO:tasks.cephfs_test_runner:Starting test: test_add_already_in_use_data_pool (tasks.cephfs.test_admin.TestAddDataPool)
That command try to add data pool which is already in use with another fs.
So I suggest slight modification -
That having root_squash on an fs without the feature bit raises an HEALTH_ERR. | |
Test that having root_squash on an fs without the feature bit raises an HEALTH_ERR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Test ..." is implicit in this construction. I'm using the style begun by John Spray.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, but it's not printed in teuthology.log. I've copied an example from recent run.
Anyways, I don't really mind. Since this PR is urgent, I've approved it regardless of these nits.
@@ -1507,13 +1507,72 @@ def test_multifs_rootsquash_nofeature(self): | |||
mntargs = [f"--client_debug_inject_features={features}"] | |||
|
|||
# should succeed | |||
keyring_path = self.mount_a.client_remote.mktemp(data=keyring) | |||
self.mount_a.remount(client_id=self.client_id, client_keyring_path=keyring_path, mntargs=mntargs, cephfs_name=self.fs1.name) | |||
with self.assert_cluster_log("report clients with broken root_squash", present=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same, will be better if wrap such lines before 81 chars.
self.mount_a.remount(client_id=self.client_id, client_keyring_path=keyring_path, mntargs=mntargs, cephfs_name=self.fs1.name) | ||
with self.assert_cluster_log("report clients with broken root_squash", present=False): | ||
keyring_path = self.mount_a.client_remote.mktemp(data=keyring) | ||
self.mount_a.remount(client_id=self.client_id, client_keyring_path=keyring_path, mntargs=mntargs, cephfs_name=self.fs1.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same, will be better if wrap such lines before 81 chars.
Folks, I'm not considering further nits on this PR in the interest of time as this PR is blocking v18.2.3. |
I will be rebasing and squashing fixes now. |
For testing purposes. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/57192/head: PendingReleaseNotes: add note on the client incompatibility health warning and feature bit doc/cephfs: add client_mds_auth_caps client feature bit doc/cephfs: add missing client feature bits doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH mds: raise health warning if client lacks feature for root_squash mon/MDSMonitor: add note about missing metadata inclusion mds: check relevant caps for fs include root_squash mds: refactor out fs_name match in MDSAuthCaps qa: test for root_squash with multiple caps qa: pass kwargs to mount from remount qa: simplify update_attrs and only update relevant keys client: allow overriding client features
Agreed and +1 for this. |
So we can just pass the caller's kwargs to update_attrs. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
So we can pass mntargs. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Where the client has root_squash for one cap but not for another. The fs without root_squash should not necessarily reject the client. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
When denying client reconnects because the MDS caps include root_squash and the client features do not include CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK, ensure those caps are only for the file system the MDS is joined to. Fixes: https://tracker.ceph.com/issues/65733 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
There is a "client_count" metadata on the health warning that apparently was intended to be used for aggregating warnings but never was. Add a TODO item for that. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Rather than evict all clients lacking this feature bit, raise a health error that pushes the administrator to address it. This avoids the surprise of having all affected clients suddenly evicted in the cluster. Fixes: https://tracker.ceph.com/issues/65733 Fixes: 954ed30 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
…rning and feature bit Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
jenkins test make check arm64 |
jenkins test make check |
jenkins test make check arm64 |
I'm not satisfied with this approach completely:
Will think about this more tomorrow and start some discussions with the team.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e