Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr/dashboard: Directories Menu Can't Use on Ceph File System Dashboard #44849

Merged

Conversation

Sarthak0702
Copy link
Member

@Sarthak0702 Sarthak0702 commented Jan 31, 2022

Added exception handling to opendir() in cephfs.py for directories with no execute permission.

To reproduce the error:

  • docker exec -it ceph /bin/bash in ceph folder
  • pip install cmd2
  • python3 src/tools/cephfs/cephfs-shell to access the cephfs-shell
  • mkdir testfs -m 600 to create a directory (testfs) without execute (x bit) permission
  • Try accessing the directory in DashBoard -> File System

Before

image

After

image

Fixes: https://tracker.ceph.com/issues/51611
Signed-off-by: Sarthak0702 sarthak.0702@gmail.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@Sarthak0702 Sarthak0702 requested a review from a team as a code owner January 31, 2022 22:15
@Sarthak0702 Sarthak0702 requested review from aaSharma14 and nizamial09 and removed request for a team January 31, 2022 22:15
@Sarthak0702 Sarthak0702 added this to In progress in Dashboard via automation Jan 31, 2022
@Sarthak0702 Sarthak0702 changed the title Directories Menu Can't Use on Ceph File System Dashboard mgr/dashboard: Directories Menu Can't Use on Ceph File System Dashboard Jan 31, 2022
@Sarthak0702
Copy link
Member Author

jenkins test api

Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want this error to persist in the notification drawer, we'll need to modify the dashboard_exception_handler (in dashboard/services/exceptions.py) and send the error to the a NotificationQueue (tools.py)? I'm not sure if that will be enough...

Comment on lines 59 to 64
except cephfs.OSError:
dirpath = dirpath.decode().replace("/.snap", "")
raise DashboardException(code='invalid_dir_permission',
msg="Unable to open directory: {} due to invalid permissions.\
Try adding the +x permission to the directory".format(dirpath),
component='cephfs')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you grep the cephfs Python bindings (src/pybind/cephfs) for OSError exceptions you'll get:

class OSError(Error):
        super(OSError, self).__init__(errno, strerror)
class PermissionError(OSError):
class ObjectNotFound(OSError):
class NoData(OSError):
class ObjectExists(OSError):
class IOError(OSError):
class NoSpace(OSError):
class InvalidValue(OSError):
class OperationNotSupported(OSError):
class WouldBlock(OSError):
class OutOfRange(OSError):
class ObjectNotEmpty(OSError):
class NotDirectory(OSError):
class DiskQuotaExceeded(OSError):
        return OSError(ret, msg)

Therefore, you cannot assume that all OSError exceptions are permission or directory related. However, you can capture PermissionError instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my concern too. FYI, the error on logging was: cephfs.OSError: opendir failed: Permission denied [Errno 13]. As except cephfs.PermissionError: does not seem to catch the error how can I verify that an error is a PermissionError? Should I add an If else inside except cephfs.OSError: block and raise different exceptions based on Errno code (13 in this case)?

Comment on lines 61 to 64
raise DashboardException(code='invalid_dir_permission',
msg="Unable to open directory: {} due to invalid permissions.\
Try adding the +x permission to the directory".format(dirpath),
component='cephfs')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to enforce ourselves/future-Dashboard-devels to follow the "title-what happened-how to resolve" error reporting pattern, perhaps we need to extend the DashboardException or derive a new exception (NotifyError()?) with a couple of new fields (msg could be the what happened): DashboardException(..., title=None, resolution=None)=

Dashboard automation moved this from In progress to Review in progress Feb 1, 2022
@Sarthak0702
Copy link
Member Author

Sarthak0702 commented Feb 2, 2022

https://github.com/rhcs-dashboard/ceph/blob/c6aaceb44926d646f9d3f8e68fe3db8a41b8cd6b/src/pybind/mgr/dashboard/frontend/src/app/shared/services/api-interceptor.service.ts#L54-L69

@epuertat @nizamial09 @avanthakkar Non of the 400 Errors are saved to Tasks and Notifications. Either we can start saving 400 errors considering 500 errors are being saved( which have no relevant information for the user) or we can write a different logic for a new error code (probably in this case 405: Method Not Allowed is the closest and most appropriate). Please let me know what you think.

@nizamial09
Copy link
Member

nizamial09 commented Feb 3, 2022

Either we can start saving 400 errors considering 500 errors are being saved

As @epuertat once mentioned, is it possible to save the error on demand only. I think we often get a 400 code and if we starts saving all the generic error messages, it'll only increase the annoyance. So its the more better if we can find a way to store only the notification we chose to be saved on the Notification bar.

405: Method Not Allowed is the closed and most appropriate).

Not sure! 403 looks good to me. I don't know if its inappropriate here. If it is, then a new error code makes sense.

@Sarthak0702
Copy link
Member Author

Sarthak0702 commented Feb 3, 2022

Not sure! 403 looks good to me. I don't know if its inappropriate here. If it is, then a new error code makes sense.

@nizamial09 But 403:Forbidden will give access denied and that not the case here.
403Error

405:Error
405Error

And saving the error on demand only wold mean sending a flag (saveToTaskBar) in api response, right?

@nizamial09
Copy link
Member

And saving the error on demand only wold mean sending a flag (saveToTaskBar) in api response, right

yeah, something like that.

@Sarthak0702 Sarthak0702 force-pushed the bugfix-cephfs-invalid-dir-permission branch from b255736 to 61e48fc Compare February 6, 2022 17:53
@Sarthak0702 Sarthak0702 requested a review from a team February 6, 2022 17:53
@Sarthak0702 Sarthak0702 force-pushed the bugfix-cephfs-invalid-dir-permission branch 2 times, most recently from ed8cc4c to c1e696f Compare February 6, 2022 18:23
Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better now, but there are still some things that could be improved/discussed. What do you think?

Comment on lines 386 to 390
location = f' at {err.args[2]}' if len(err.args) > 2 else ''
message = f'{type(err).__name__} : {str(err)}{location}'
raise DashboardException(http_status_code=400,
msg=message,
component='cephfs')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is perfectly valid Python code, but there's this Python motto ("better ask forgiveness than permission"), so this could be rather written like this way:

Suggested change
location = f' at {err.args[2]}' if len(err.args) > 2 else ''
message = f'{type(err).__name__} : {str(err)}{location}'
raise DashboardException(http_status_code=400,
msg=message,
component='cephfs')
try:
message = f'{type(err).__name__} : {str(err)} at {err.args[2]}'
except IndexError:
message = f'{type(err).__name__} : {str(err)}'
raise DashboardException(http_status_code=400,
msg=message,
component='cephfs')

@@ -382,6 +382,12 @@ def ls_dir(self, fs_id, path=None, depth=1):
paths = cfs.ls_dir(path, depth)
except (cephfs.PermissionError, cephfs.ObjectNotFound): # pragma: no cover
paths = []
except cephfs.OSError as err:
location = f' at {err.args[2]}' if len(err.args) > 2 else ''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seems that Python OSError actually accepts a filename as constructor argument, but the cephfs.OSError doesn't seems to (only errno and strerror).

So I'd stick to the existing interface (errno and strerror):

except cephfs.OSError as e:
    raise DashboardException(http_status_code=400, msg=e.strerror, component='cephfs')

While checking for the args property is valid (exception inspection), given not all OSError exceptions will include filenames, it'll force you to do some conditional inspection (as you're doing here), and probably we're not gaining so much and we're increasing complexity/error-proneness...

BTW, every time you raise an exception from another exception (exception chaining), you can use the syntax raise new_exception from another_exception to tell Python that one exception is triggering the other ("The above exception was the direct cause of the following exception", otherwise you'll see this "During handling of the above exception, another exception occurred:")

If we really want to include more information, and since you have to duplicate the new code in 2 different places, I'd go for:

  • New derived exception DashboardCephfsError(DashboardException) (actually all derived exceptions are named *Errors, I don't know why we called it DashboardException) that already sets http_status_code=400, component="cephfs"
  • Or some class method (factory-like): raise DashboardException.from_cephfs_error(err)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are not re rasing the OSError in opendir() services/cephfs.py then this might just work:

#in src/pybind/mgr/dashboard/services/cephfs.py -> opendir()
except cephfs.OSError as e:
    raise DashboardException(http_status_code=400, msg=f'{err.strerror} at {dirpath}' component='cephfs') from e

And then raising this DashBoradException again in controller/cephfs.py to get a toasty like this:

toasty

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's bad. It fulfills all (or almost) needed for an error message: what's failing (CephFS opendir /testfs), why (permission denied). The only (bonus) thing missing would be the "how to fix it", but the cost-benefit of adding that here does not make it worth.

So I'm fine with that message. Alternatively, you can create a higher-level (more fine-grained) exception that allows specifying the filename, as the Python OSError one.

BTW, do you need to capture and reraise the exception twice, is it not enough with letting that surface from the inner service?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"how to fix it" cannot be added as of now because the back-end is returning a generic cephfs.OSError (cephfs.OSError: opendir failed: Permission denied [Errno 13]) instead of cephfs.PermissionError. A work arround (for now) would be:

except cephfs.OSError as e:
    fix=''
    if e.errno=='13':
        fix=' Try setting the excute permission (+x bit) for the directory.' 
    raise DashboardException(code=err.errno, http_status_code=400, msg=f'{err.strerror} at {dirpath}.{fix}', component='cephfs') from e

Capturing the exception in controller was needed in order to save the toasty in the Notification tab. But as we have decided not to show 400 error in Notification, so for now re raising the exception is not required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general in OOP languages it's always desirable to encapsulate/hide logic inside classes, rather than spreading it. Thus my suggestion to use a factory class method DashboardCephfsError.from_ceph_error(e).

Comment on lines 59 to 64
except cephfs.OSError as err:
if isinstance(dirpath, bytes):
dirpath = dirpath.decode()
dirpath = dirpath.replace("/.snap", "")
err.args = (*err.args, dirpath)
raise err
Copy link
Member

@epuertat epuertat Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While Python is extremely liberal regarding what you can do with an object, I'd suggest that we treat object as immutable entities (I'm not advocating here for functional programming, I'm just saying that an object should not contain properties/members that are not declared in their originating class).

In this very case, while Python OSError allows you to define the filename, cephfs.OSError does not, hence we're extending implicitly the interface of the exception at object level (not class level), and a golden rule in Python (and many other languages) is "explicit over implicit". If we want to have a OSError class let's have a new exception inspired in DashboardExcetpion and Python OSError:

# dashboard.exceptions file
class DashboardCephfsError(DashboardException):
  def __init__(self, errno, strerror, filename=None, **kwargs):
    super().__init__(code=errno, component='cephfs', http_status_code=400, msg=strerror)
    self.filename = filename

  @classmethod
  from_cephfs_error(cls, e):
    return cls(errno=e.errno, strerror=e.strerror, filename=e.filename)

@@ -133,7 +140,7 @@ def dir_exists(self, path):
try:
with self.opendir(path):
return True
except cephfs.ObjectNotFound:
except (cephfs.OSError, cephfs.ObjectNotFound):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want this behaviour? When there's an error we'll tell the user that the directory doesn't exist... Isn't that a bit misleading?

Suggested change
except (cephfs.OSError, cephfs.ObjectNotFound):
except cephfs.ObjectNotFound:
return False
except cephfs.OSError as e:
raise DashboardCephfsError.from_cephfs_error(e) from e

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, this is misleading. will fix this. Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function call stack for this is as follows( in reverse order): opendir() ->dir_exists() -> mk_dirs()/rm_dir() -> mk_tree/rm_tree.[API]
So here two possibilities can arise while checking if a a directory exists (done by checking if opendir() is successful) while created or removing directories :

  • directory exists but has wrong permission set (+x bit unset) -> thus giving cephfs.OSError: opendir failed: Permission denied [Errno 13] in opendir() -> dir_exists() should raise an error
  • directory doesn't exist -> giving cephfs.OSError: opendir failed: No such file or directory [Errno 2] in opendir() -> dir_exists() should return false. As cephfs.ObjectNotFound is derived from cephfs.OSError, it will get caught by the except cephfs.OSError as err: block in opendir()

(Both are supposed to be handled as suggested here: #44849 (comment) in opendir())

This can be tackled in any of the two ways:

  • adding a check by errno code in dir_exists()
except DashboardException as err:
         #for ObjectNotFound
          if(err.code == '2'):
              return False
          else:
              raise err
  • OR, add except block for cephfs.ObjectNotFound before except cephfs.OSError
except cephfs.ObjectNotFound as err:
         raise err

What do you think about this?. Both solution look far from ideal to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last one is better IMHO. Exceptions should be handled the closest to where they happen. At that level you'd treat differently the Not Found from the rest of the exceptions, which you can raise as DashboardException. How does that sound?

Dashboard automation moved this from In progress to Review in progress Feb 7, 2022
@Sarthak0702
Copy link
Member Author

Sarthak0702 commented Feb 8, 2022

@epuertat A better solution to this problem could be

in src/pybind/mgr/dashboard/services/cephfs.py -> opendir()
except cephfs.OSError as err:
    # TODO: Expected cephfs.PremissionError # pylint: disable=fixme
    if(err.errno == 13):
        if isinstance(dirpath, bytes):
            dirpath = dirpath.decode()
        dirpath = dirpath.replace("/.snap", "")
        fix=' Try setting the excute permission (+x bit) for the directory.' 
        raise DashboardException(code=err.errno, http_status_code=400, msg=f'{err.strerror} at {dirpath}.{fix}', component='cephfs') from err 
    else:
        raise err 
        
in src/pybind/mgr/dashboard/services/cephfs.py ->  dir_exists()
except cephfs.ObjectNotFound:
    return False
except DashboardException as err:
    raise err        
        

This will still work even after the backed starts returning cephfs.PermissionError

And when we start getting cephfs.PermissionError from the back end, the else block of if else can be dropped and except cephfs.OSError can be replaced with except cephfs.Permission as err:.

What do think about this?

@epuertat
Copy link
Member

epuertat commented Feb 8, 2022

@Sarthak0702 let's not overcomplicate it... I think we're starting to derail here (don't worry this happens). So let's get back at the initial bug report: we are reporting as 500 (Dashboard unexpected error) when the error is expectable and attributable to the user, so 400 (tried to access a directory whose permissions were wrong).

Please send your proposal for the simplest code that give us this (later we can talk about specific code improvements):
image

@Sarthak0702 Sarthak0702 force-pushed the bugfix-cephfs-invalid-dir-permission branch from c1e696f to eadf813 Compare February 9, 2022 11:57
Added exception handling to opendir() in cephfs.py for directories with no execute permission.

Fixes: https://tracker.ceph.com/issues/51611
Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
@Sarthak0702 Sarthak0702 force-pushed the bugfix-cephfs-invalid-dir-permission branch from eadf813 to ea1af54 Compare February 9, 2022 11:59
@Sarthak0702
Copy link
Member Author

jenkins test api

1 similar comment
@Sarthak0702
Copy link
Member Author

jenkins test api

@Sarthak0702
Copy link
Member Author

jenkins test dashboard

Dashboard automation moved this from Review in progress to Reviewer approved Feb 10, 2022
Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @Sarthak0702 ! Great, now it looks aligned to what other components do!

@epuertat
Copy link
Member

jenkins test dashboard

@nizamial09 nizamial09 moved this from Reviewer approved to Ready-to-merge in Dashboard Feb 11, 2022
@Sarthak0702 Sarthak0702 merged commit db3eb10 into ceph:master Feb 11, 2022
13 checks passed
Dashboard automation moved this from Ready-to-merge to Done Feb 11, 2022
@epuertat epuertat added needs-quincy-backport backport required for quincy and removed needs-quincy-backport backport required for quincy labels Feb 11, 2022
@epuertat epuertat deleted the bugfix-cephfs-invalid-dir-permission branch February 16, 2022 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Dashboard
  
Done
3 participants