Skip to content

Conversation

@neils-dev
Copy link
Contributor

@neils-dev neils-dev commented Feb 4, 2023

What changes were proposed in this pull request?

To make ofs symlink delete behavior similar to posix for ozone fs -rm command requests. Specifically to have,
i. $ ozonefs -rm -skipTrash ofs://<omserviceid>/volume/<linked bucket symlink>
delete symlink
ii. $ ozonefs -rm --R -f skipTrash ofs://<omserviceid>/volume/<linked bucket symlink>/
recursively delete the contents of the source bucket leaving the symlink and source bucket intact.

Currently, i.) results in an error indicating cannot remove directory and ii.) results in deleting the symlink.

To add posix like symlink -rm command behavior an ozone specific -rm command handler, OzoneDelete, is implemented and binded with the fs shell commands.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7828

How was this patch tested?

Unit tests, for -rm command handler. Integration tests for symlink delete behavior supporting -rm removing symlink and -rm -R with trailing slash to remove symlink resolved contents.

Unit tests:
TestOzoneFsShell#testOzoneFsShellRegisterDeleteCmd
Integration tests:
TestRootedOzoneFileSystem.#testSymlinkPosixDelete
TestRootedOzoneFileSystem.#testSymlinkList (added test when testing symlink rm behavior)

Manual tests:
docker dev cluster -

# setup src bucket and linked bucket with key
bash-4.2$ ozone sh volume create /vol1
bash-4.2$ ozone sh bucket create /vol1/abc
bash-4.2$ ozone sh bucket link /vol1/abc /s3v/abc
bash-4.2$ ozone sh key put /vol1/abc/key1 ./README.md

# delete just symlink with -rm command
bash-4.2$ ozone fs -rm -f -skipTrash ofs://om/s3v/abc
# check with -ls that no linked bucket
bash-4.2$ ozone fs -ls ofs://om/s3v
# create symlink again
bash-4.2$ ozone sh bucket link /vol1/abc /s3v/abc
# delete contents of src bucket with -rm with trailing slash
bash-4.2$ ozone fs -rm -R -f -skipTrash ofs://om/s3v/abc/
# check symlink exists
bash-4.2$ ozone fs -ls ofs://om/s3v

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @neils-dev for working on this. The change in BasicRootedOzoneFileSystem looks good. I wonder if we can avoid duplicating the Delete handler from Hadoop by fixing the symlink argument being passed to FileStatusAdapter (currently hard-coded to null) for link buckets.

return new FileStatusAdapter(0L, 0L, path, true, (short)0, 0L,
ozoneBucket.getCreationTime().getEpochSecond() * 1000, 0L,
FsPermission.getDirDefault().toShort(),
owner, group, null, new BlockLocation[0],

@neils-dev
Copy link
Contributor Author

Thanks @adoroszlai for your comment and for reviewing this.

I wonder if we can avoid duplicating the Delete handler from Hadoop by fixing the symlink argument being passed to FileStatusAdapter

Unfortunately, supporting filestatus symlink does not let us avoid binding in an Ozone Delete handler. The handler is used to set the delete behavior for -rm (rm symlink) and -rm with trailing slash (to remove contents of symlink dir, through the call to the FileSystem delete with trailing slash).

I too wanted to enable symlink in the filestatus, currently null, however it wasn't directly possible due to FileStatus on create only allowing an item to be either a directory or a symlink (asserts if both). This causes problems with the PathData that expects directories for recursing paths, used in listStatus and other commands.

@neils-dev neils-dev added the gr label Feb 15, 2023
@prashantpogde prashantpogde self-requested a review March 9, 2023 17:58
Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good. nice test additions. minor comments inline.

Comment on lines 674 to 675
// if link, don't delete contents
if (isBucketLink) {
if (isBucketLink && !handleTrailingSlash) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update the comment above accordingly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smengcl for your review comments and for following up the PR. Latest commit to address review comments.

comment updated.

Comment on lines 681 to 682
// Handle delete bucket
if (ofsPath.isBucket()) {
if (ofsPath.isBucket() && !handleTrailingSlash) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment updated.

// commands, and then this method can be abstract
if (this.getClass().equals(OzoneFsShell.class)) {
factory.registerCommands(FsCommand.class);
// ozone delete rm command registration superscedes fs delete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
// ozone delete rm command registration superscedes fs delete
// ozone delete rm command registration. supercedes fs delete

@InterfaceAudience.Private
@InterfaceStability.Evolving

public final class OzoneDelete {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about OzoneFsDelete?

Suggested change
public final class OzoneDelete {
public final class OzoneFsDelete {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, renamed. Updated the class name and all references to OzoneFsDelete

Comment on lines 135 to 141
// TODO: if the user wants the trash to be used but there is any
// problem (ie. creating the trash dir, moving the item to be deleted,
// etc), then the path will just be deleted because moveToTrash returns
// false and it falls thru to fs.delete. this doesn't seem right
if (moveToTrash(item) || !canBeSafelyDeleted(item)) {
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand this TODO here.

I think in this case (where skipTrash == false but moving to trash has issues) we should just return error to the user. So that they either need to deal with that trash issue first (maybe some ACL issue), or set -skipTrash.

I don't think falling back to to skipTrash would be the intended user behavior, cause they could lose keys when they think those would be moved to the trash.

Also it looks like moveToTrash() is already throwing exception appropriately? Do we need extra handling here? If not we can just fix the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked into the moveToTrash() comment.

just return error to the user. So that they either need to deal with that trash issue first (maybe some ACL issue), or set -skipTrash.

I suspect, like you that errors from creating the trash directory or renaming to trash will be handled properly with the exception thrown by trash - similar to

bash-4.2$ ozone fs -rm ofs://om/tmp/tmpfile
rm: Failed to move to trash: ofs://om/tmp/tmpfile: User testuser2/scm@EXAMPLE.COM 
doesn't have DELETE permission to access key Volume:tmp Bucket:tmp Key:tmpfile. 
Consider using -skipTrash option

Here errors occurring from trash unable to create trash dir or renaming file are handled by exception handling and the file is not outright deleted by fs.delete. File remains intact leaving the user to resolve the issue through permission settings or -skipTrash.

I'm looking to update the comment to reflect this.

Assert.assertEquals(0, res);

try {
objectStore.getVolume(destVolume).getBucket(destVolume);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Suggested change
objectStore.getVolume(destVolume).getBucket(destVolume);
objectStore.getVolume(destVolume).getBucket(srcBucket);

Comment on lines +1668 to +1677
// test symlink -rm -R -f destVol/destBucket -> srcVol/srcBucket
// should delete only link
// run toolrunner ozone fs shell commands
linkPathStr = rootPath + destVolume + OZONE_URI_DELIMITER + srcBucket;
res = ToolRunner.run(shell, new String[]{"-rm", "-skipTrash",
"-f", "-R", linkPathStr});
Assert.assertEquals(0, res);

Assert.assertEquals(srcBucket, objectStore.getVolume(srcVolume)
.getBucket(srcBucket).getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also check /srcVolume/srcBucket/key existence

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the test condition for

/srcVolume/srcBucket/key existence

thanks!

…ete, minor fixes and comment fixes to unit tests, minor comment fixes to BasicRootedOzoneFilesystem. Ammended comment on handling trash withing OzoneFsDelete command handler.
@adoroszlai adoroszlai requested review from ChenSammi and smengcl April 17, 2023 10:30
@ChenSammi
Copy link
Contributor

ChenSammi commented Apr 18, 2023

@neils-dev , thanks for working on this. I will take a close look later. I tried 'rm -R ' in Linux on a symbolic link to a directory. It looks like that not only the files under the source directory are deleted, symbolic link is also deleted. Shall we follow this behavior too?

@neils-dev
Copy link
Contributor Author

@ChenSammi thanks. With the example you mentioned, are you using a trailing slash with the command ie. rm -R -f path_to_symlink_with_slash/ with trailing slash?

'rm -R ' in Linux on a symbolic link to a directory. It looks like that not only the files under the source directory are deleted, symbolic link is also deleted.

That behavior looks to be supported - delete contents of src directory leaving symlink intact.
$ rm -R -f ~/work/tmp/tmp2/
$ ls -alt ~/work/tmp
lrwxrwxrwx 1 user user 7 Apr 18 12:13 tmp2 -> ../tmp1

@ChenSammi
Copy link
Contributor

ChenSammi commented Apr 19, 2023

@neils-dev, I used the command "rm -R path-to-symbolic" without ending slash "/". So the path-to-symbolic is deleted.

You are correct, "rm -R path-to-symbolic/" will delete the content in source directory only.

Besides the "-rm" command, @neils-dev have you try the "-rmdir" command on the link bucket too? I'm not sure if it already works for link bucket now or not.

}

/** remove non-directory paths. */
public static class Rm extends FsCommand {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this Rm extend Delete.Rm? Except expandArgument and processPath are different, other parts of the implementation are the same as Delete.Rm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this Rm extend Delete.Rm?

Thanks. Due to the interface, I found that none of the Delete class or any of the command handlers for the fs shells can be extended outside of the hadoop fs shell package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neils-dev We can put the subclass in the org.apache.hadoop.fs.shell package if necessary. Similar solution is employed for reusing other classes from Hadoop (e.g. for EC).

Copy link
Contributor Author

@neils-dev neils-dev Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this Rm extend Delete.Rm?

After repackaging under the hadoop fs shell package, org.apache.hadoop.fs.shell, it is possible to extend Delete.Rm to a certain extent, however many of the properties and methods used in the subclass overriden methods are private in the base class. The private props and methods include all of the parsed command arguments (private variables in base class), and the garbage collection trash routines needed in the subclass override for processPath , The resulting subclass redefines those private methods to an extent and limits code reuse. It does not appear to be advantageous. This FsCommand does not lend itself to subclassing. Our ozone command handler OzoneFsDelete appears to be flexible and cleaner. We can open a jira to follow-up on looking into this further if it looks worth pursuing, Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @neils-dev for checking. We'll have to live with duplication then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neils-dev , thanks for the checking. Then let's keep it.

The patch LGTM +1.

@ChenSammi
Copy link
Contributor

@neils-dev , the failed test looks like relevant.

…links. Fixes CI build failure as a result of the recent symlink orphan node patch.
@ChenSammi ChenSammi merged commit 6a7c070 into apache:master Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants