mgr/nfs: use object_format decorators to simplify response handling #46209
Conversation
CC: @adk3798 @rkachach @epuertat FWIW I still need to test parts of this. I'll leave it in draft until that's been done. Note: I picked the nfs module because I recently worked in this area and was somewhat familiar with the module. If people have suggestions for other modules that would benefit from being updated to use object_format, please let me know and we can discuss that. I find that only when the "rubber hits the road" do we find out stuff that could be missing from the general approach of #45467.
Seems pretty great. Even the conversion process from returning the old tuple to using the decorators looks fairly straightforward. Just a few minor things and questions from me.
self._delete_export(cluster_id=cluster_id, pseudo_path=None,
                    export_obj=export)
except Exception as e:
    raise NFSException(f"Failed to delete export {export.export_id}: {e}")
Is the fact that we're not raising an ErrorResponse wrapping the other exception here, like in a lot of other cases, related to the fact that we're called by a function using the EmptyResponder decorator rather than the regular Responder?
I think you should find most (all?) of the ErrorResponse exceptions are being added where the code used to return error-condition tuples. In this case the code was already raising an NFSException. AFAICT, the only place that calls delete_all_exports is delete_nfs_cluster
in cluster.py, which will end up catching all exceptions and using ErrorResponse.wrap on the caught exception.
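For readers unfamiliar with the pattern being discussed, here is a rough sketch of how an outer handler can wrap a caught lower-level exception with `ErrorResponse.wrap`. Both classes below are simplified stand-ins written for illustration, not the actual object_format or nfs module implementations:

```python
import errno


class NFSException(Exception):
    """Stand-in for the nfs module's internal exception type."""


class ErrorResponse(Exception):
    """Simplified stand-in for object_format.ErrorResponse."""

    def __init__(self, msg, return_value=-errno.EINVAL):
        super().__init__(msg)
        self.return_value = return_value

    @classmethod
    def wrap(cls, exc):
        # Preserve the original exception's message in the error response.
        return cls(str(exc))


def delete_nfs_cluster(cluster_id):
    try:
        # delete_all_exports raises NFSException on failure (per the
        # discussion above); here we simulate such a failure.
        raise NFSException("Failed to delete export 1: backend down")
    except Exception as e:
        # The outer layer converts any caught exception into an
        # ErrorResponse that the decorator machinery can report.
        raise ErrorResponse.wrap(e) from e
```

The point of the pattern is that inner code keeps raising its natural exception types, and only the outermost command handler decides how errors surface to the mgr.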
Until I looked just now I assumed that some other code might call delete_all_exports and catch NFSException. I still think it's worth changing as little as possible, though.
src/pybind/mgr/nfs/cluster.py
Outdated
return 0, "", ""
return 0, "", "Cluster does not exist"
return
raise ErrorResponse("Cluster does not exist",
I see that this previously returned a return code of 0 even while including an error message. Is the change to now return -errno.ENOENT as the return code instead on purpose?
Hmm. That was probably an oversight, but I'm not sure what's best: retaining the somewhat strange old behavior of sending a success return code when this is clearly an error, or trying to avoid any behavior change here. Would be happy to hear your opinion.
I guess it depends if we think it's more important for the command to be idempotent or to make sure users are aware that the cluster they deleted isn't there anymore. We actually had a command like this in cephadm that would just print success if you tried to delete a nonexistent service, but eventually somebody requested we have it print an error when you do that. So I guess maybe raising here is the right thing.
Yeah, my subconscious was probably encouraging the idea that errors should be errors. Otherwise it is not worth reporting at all. :-)
@ajarr flagged this in our standup today. See my review comment. :)
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
I've applied a number of small changes that I figured would be noncontroversial. There are a few open sub-threads that I have not acted on until I hear back from the reviewers. Threads I've acted upon should be marked as such.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
…ator Signed-off-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
The "export apply" functionality is unusual in that it allows either one or multiple nested requests to change or create an export. The previous implementation would concatenate the results of multiple change operations into a single string. It also would continue making changes if one case failed, adding the error to the string and setting a non-zero error code. The updated version keeps the general behavior but returns structured JSON (or other formatted data) with one entry per change request. In order to accomplish this and match the old behavior as closely as possible we add an intermediate type (AppliedExportResults) that can return both the response data (the `to_simplified` method) and track if there was a failure mixed in with the various updates (the `mgr_return_value` method). Signed-off-by: John Mulligan <jmulligan@redhat.com>
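A minimal sketch of how such an intermediate type could look. The method names `to_simplified` and `mgr_return_value` come from the commit message above, but the body (including the choice of `-errno.EIO` as the failure code) is an illustrative assumption, not the actual nfs module code:

```python
import errno


class AppliedExportResults:
    """Collect the outcome of each nested export change request."""

    def __init__(self):
        self.changes = []
        self.has_error = False

    def append(self, result):
        # Keep processing after a failure, matching the old behavior,
        # but remember that at least one change went wrong.
        self.changes.append(result)
        if result.get("state") == "error":
            self.has_error = True

    def to_simplified(self):
        # One structured entry per change request, ready for JSON (or
        # other formatted) output by the Responder decorator.
        return self.changes

    def mgr_return_value(self):
        # Non-zero only if any individual update failed; -errno.EIO is
        # an assumed code chosen for this sketch.
        return -errno.EIO if self.has_error else 0
```

Because the decorator machinery consults both methods, a single object can carry the structured per-change results and the overall success/failure status at the same time.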
Signed-off-by: John Mulligan <jmulligan@redhat.com>
…rator Signed-off-by: John Mulligan <jmulligan@redhat.com>
…decorator Signed-off-by: John Mulligan <jmulligan@redhat.com>
This decorator is no longer needed as equivalent functionality is handled internally by the class. Signed-off-by: John Mulligan <jmulligan@redhat.com>
This function is now unused as we no longer need to coerce exceptions into response tuples at this layer in the code. Signed-off-by: John Mulligan <jmulligan@redhat.com>
These formatting changes are made by autopep8 when running tox. Signed-off-by: John Mulligan <jmulligan@redhat.com>
The patches that add object formatting / decorators to the nfs module also made error handling more generic when accessing an nfs cluster, and it now returns a nonzero exit code. A test added after the PR introducing object format support only checked an error message. Update the test to match the new nfs module behavior and fix a typo. Signed-off-by: John Mulligan <jmulligan@redhat.com>
@adk3798 when do you think we can next retry merging this?
As soon as CentOS 8 builds are possible again, which I think is when pushing to and pulling from the quay.ceph.io container registry works again. I tried testing it last weekend, but that's when the builds broke. Since this seemed to cause a test failure in the last run, we can't really merge it without checking that the patch you added for the test works.
Lots of failures (13) but all accounted for
Overall, PRs in the run should be okay to merge other than the mon crush location one causing failures. Will start to try and clean up the test suite now that we're able to make builds and run tests again.
@phlogistonjohn were we planning to backport this to quincy? I know we did backport the original PR that added the decorator.
No, IIRC we discussed this in one of the weekly meetings and opted not to backport it. If we find out that not backporting it means we have difficulties backporting other things, we can "easily" change course and backport this series later.
When you create an nfs export from the dashboard it leaves this traceback and error:
```
Feb 09 14:15:54 ceph-node-00 ceph-mgr[3235]: [dashboard ERROR taskexec] Error while calling Task(ns=nfs/create, md={'path': 'e2e.nfs.bucket', 'fsal': 'RGW', 'cluster_id': 'testnfs'})
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/tools.py", line 550, in _run
    val = self.task.fn(*self.task.fn_args, **self.task.fn_kwargs)  # type: ignore
  File "/usr/share/ceph/mgr/dashboard/controllers/nfs.py", line 148, in create
    ret, _, err = export_mgr.apply_export(cluster_id, json.dumps(raw_ex))
TypeError: 'AppliedExportResults' object is not iterable
Feb 09 14:15:54 ceph-node-00 ceph-mgr[3235]: [dashboard INFO taskmgr] finished Task(ns=nfs/create, md={'path': 'e2e.nfs.bucket', 'fsal': 'RGW', 'cluster_id': 'testnfs'})
Feb 09 14:15:54 ceph-node-00 ceph-mgr[3235]: [dashboard INFO request] [::ffff:192.168.100.1:43896] [POST] [500] [0.767s] [admin] [172.0B] /api/nfs-ganesha/export
```
This started after ceph#46209, so dashboard code needs to be adapted.
Fixes: https://tracker.ceph.com/issues/58681
Signed-off-by: Nizamudeen A <nia@redhat.com>
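The root cause is that the dashboard controller still unpacks a (retval, out, err) tuple, while apply_export now returns an AppliedExportResults object. A hedged sketch of what the adapted controller logic could look like; the function name and error handling here are illustrative, and the actual dashboard fix may differ:

```python
# Before (broken after the object_format change), the controller did:
#   ret, _, err = export_mgr.apply_export(cluster_id, json.dumps(raw_ex))
# apply_export now returns an AppliedExportResults object, not a tuple.


def create_export(export_mgr, cluster_id, raw_ex_json):
    result = export_mgr.apply_export(cluster_id, raw_ex_json)
    # mgr_return_value() reports a non-zero code if any update failed.
    if result.mgr_return_value() != 0:
        raise RuntimeError(f"export apply failed: {result.to_simplified()}")
    # to_simplified() yields one structured entry per change request.
    return result.to_simplified()
```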
Depends on: #45467
The changes above add a new python module to the mgr - a general way to handle mgr command responses and formatting. As yet it is unused. This series of changes converts the nfs mgr module to use the object_format decorators. This serves as both a demonstration of how the decorators can be used and how they benefit a mgr module - the code within the module becomes simpler and more pythonic. Only at the "outermost" layer do we concern ourselves with creating mgr response tuples. Because the nfs module is one of sufficient complexity and history not all existing APIs can use the simplest object_format.Responder decorator. A few edge cases (discussed below) were found and this required the addition of two simpler decorators "EmptyResponder" and "ErrorResponseHandler".
There are approximately three kinds of conversion that were done:
I am filing this PR in draft state until the predecessor PR is merged. While it's in draft, discussion of things like the general approach and naming is quite welcome.
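To make the conversion concrete, here is a rough before/after sketch of a handler. `Responder` below is a simplified stand-in written for illustration, not the real object_format decorator, and `list_exports` is a hypothetical handler:

```python
import json
from functools import wraps


def Responder(func):
    """Simplified stand-in: turn a handler's return value into the
    (retval, stdout, stderr) tuple the mgr expects, serialized as JSON."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Objects may customize their serializable form via to_simplified().
        data = result.to_simplified() if hasattr(result, "to_simplified") else result
        return 0, json.dumps(data), ""
    return wrapper


# Before: every handler built the mgr response tuple by hand.
def list_exports_old(cluster_id):
    exports = [{"export_id": 1, "cluster_id": cluster_id}]
    return 0, json.dumps(exports), ""


# After: handlers return plain Python objects; only the decorator (the
# "outermost" layer) knows about response tuples.
@Responder
def list_exports_new(cluster_id):
    return [{"export_id": 1, "cluster_id": cluster_id}]
```

The inner function becomes ordinary Python that is easy to call and test directly, which is the simplification the series aims for.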