-
Notifications
You must be signed in to change notification settings - Fork 502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-3814. Drop a column family through debug cli tool #1083
Conversation
@sadanand48 Please take a look at this PR. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @maobaolong. I have a question though: To let OM start successfully do we really need to drop the S3Table from the db?
List<byte[]> cfList = RocksDB.listColumnFamilies(new Options(), | ||
parent.getDbPath()); | ||
byte[] nameByte = name.getBytes(StandardCharsets.UTF_8); | ||
int deleteIndex = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the ArrayList.indexOf() method can be used here.
int deleteIndex = cfList.indexOf(nameByte)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@sadanand48 Thanks for your review.
HDDS-3614. Remove S3Table from OmMetadataManager. After this tickets merged, the S3Table has been removed from Ozone code, but it cannot startup successfully with the old om rocksdb, because the old om rocksdb contains S3Table, i just use the ldb of rocksdb to remove the S3Table from the om rocksdb, and start the OM successfully. That's doesn't matter, but the drop table cli command is useful i think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to create @maobaolong. Overall it looks good to me. Two comments:
-
Can we please use consistent naming for the column family parameters? In
DBScanner
, it's `--column_family. I would prefer to use the same naming for all the subcommands (if you need a short version, feel free to add it, but can we have the same long version here, too?) -
Other usability opinion: Until now we copied the interface of the rocksdb
ldb
command. Seems to be reasonable to do the same here and call the subcommand asdrop_column_family
to avoid some confusion (it drops the whole column family not just one key)
List<ColumnFamilyDescriptor> cfs = new ArrayList<>(); | ||
final List<ColumnFamilyHandle> columnFamilyHandleList = | ||
new ArrayList<>(); | ||
List<byte[]> cfList = RocksDB.listColumnFamilies(new Options(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Some part of this code is duplicated vs. DBScanner
. I am not sure if it's better to move to a common method (as it's a real open
not just a readOnlyOpen
) but might be...
@maobaolong If OM cannot start successfully due to removal of a column family, then it is a bug and needs to be fixed. Does your cluster have the fix for HDDS-3668? IMO, adding capability to create/delete tables using a debug tool is risky. |
Agree with @avijayanhwx this looks risky to delete column families from OM/SCM DB. |
@elek Thanks for your review and suggestion, i have changed the name of the argument to --column_family to keep consistent with DBScanner, and i create a common method to avoid code duplicated.
I don't catch your point? If you mean to change the open to readOnlyOpen here, i have tested that it cannot work, because drop operation is a write operation for the rocksdb. Please take another look, thanks. |
@avijayanhwx @bharatviswa504 Thank you for telling me HDDS-3668 can help me to resolve my problem.
|
@maobaolong Ya agree, for now, this has caused trouble for existing deployments.
I don't think if someone goes to the metadata folder and deletes it directly, I am not sure there is a way to safeguard such a situation, and user is risking his data on the cluster. Because now we are exposing such a CLI tool to end-user, this seems risky.
Do you think having an admin check here would safeguard any user from doing this? |
Thank you for the explanation @maobaolong.
Removing useless column family is a schema change, and schema changes across versions are generally handled by Upgrade handlers in systems.
Yes, these types of destructive operations can be done possibly by naive users. But at least in that case, Ozone did not provide a direct way for the user to mess up its own DB. As @bharatviswa504 said, we need to put appropriate safeguards beforee this operation. Is it possible to move this to an "ozone admin" command, with an explicit confirmation by printing a warning message (with the table name), and the user to enter something like "delete" or "yes" to confirm? Also, in the case of OM HA, doesn't the user have to do it in every OM individually? |
@bharatviswa504 Thank you for discuss with me.
Another truth, without this PR, i can still create/delete the rocksdb by rocksdb/ldb, with this PR, i don't have to intall the rocksdb/ldb only. |
@avijayanhwx Thanks for your question, it let me think more.
In fact, the column family S3Table is a good example, the schema is removed in OM, but it still exist in the rocksdb, so it doesn't be removed by upgrade handlers, i have to remote it manually though rocksdb/ldb, i thinks this PR is nothing but only the substitute of rocksdb/ldb.
I just did the drop column family operation for S3Table last week on our Ozone cluster.
In our Ozone cluster, nobody can login with the admin user to the OM node, even someone can login with the developer user, they can only read the log and rocksdb file, no write permission at all, so they cannot do the drop table, create table, remove/move rocksdb. In this sense, i think it is safety.
|
@maobaolong So, just exposing such a tool, and dropping a table from DB and upgrade the OM, will not work. As currently, Ozone does has not have complete support for upgrades, once upgrade work is done, if any of these kinds of changes are performed, it is the developer's responsibility to handle these kinds of breaking changes. As said before, @avijayanhwx has discussed this in the upgrade design document in the section " Extending Layout Features for OM" which needs to be taken care of by LayoutFeature upgrade handlers. So, in my view exposing such a tool from Ozone is not a good idea.
If an admin is doing these kinds of steps, it means he is risking data on the cluster. Finally, I want to say, any kind of incompatible changes should be taken care of by upgrade handlers to handle properly, and it should not require an admin to perform these kinds of operations. If it requires, that means our upgrade path is not correct. |
@bharatviswa504 Ok, thank you for your clarification. Do you mind to move these tool into the admin subcommand? |
I would like to add my own opinion about the size of the risks: I wouldn't like to convince anybody, but couldn't see very high risk. Actually I think there is a higher risk to use a wrong
This is what we have now. If this were the only concern, we can solve it by removing this ldb subcommand when upgrade is implemented. Today we have incompatibilities where I can accept that we need tool to help admins. |
I agree with @elek here. |
My point here is just deleting a column family and assuming everything works with any incompatible change is wrong.
Yes, for now with an incompatible change this tool may help. But using this tool to solve such issues will not solve the problem completely, let's take an example of S3 incompatible change, we need other changes like HDDS-3563. So, I am not sure just by exposing such tool, and when we see a column family is dropped and thinking that dropping a column family will resolve my upgrade issue. So, I am not sure just using this tool will help how much here. I have explained the same in one of my above comment in detail regarding this issue. |
It was not a goal to solve all the upgrade issues, IMHO. Just provide a tool which can help to solve one, current issues. Where this tool can help. I agree, to solve all the issues we need the upgrade work but that's a bigger effort and independent of this patch. |
I am OK with this if it moves into admin command since it is a cluster "update" operation rather than a "debug" operation. |
HDDS-3668 has already solved this issue, but the only thing it has not taken care of is the deleting column family, but even after the upgrade it will open the column family which does not exist anymore in the newer version after the upgrade. Why can't we extend that to delete column families also, instead of providing this tool? (And we can remove this change, once after proper upgrade support is available). In this way, we don't need any admin intervention in deleting these column families during the upgrade. |
We agreed that this is a dangerous tool. I think it's better to do it explicit rather than implicit in the background. But this is just my opinion. |
Some more arguments. In fact, with this tool, in some situation, as an administrator of Ozone cluster, if i decided to do some operation with the rocksdb, i always have approach to modify the rocksdb, i can use the rocksdb/ldb to achieve my goal. What i want to say is this PR just give me convince, without this tool, cannot prevent me to do this risk things. So, I think, what we should do is to verify the role of user, and doing operation under the permission of this role. |
But if we do the backup of DB, and i don't see any issue. But if we want to be conservative or cautious it might be not to do this.
If admin, wants to do some modifications like this, and ozone is not providing such a tool, and admin still want to build ldb tool and then need to perform these operations which is a long process. So, in this case ozone is not providing such a tool, admin wants to still perform these, not sure in which cases we might need this when code can already handle these things internally. But it looks like as everyone is more interested in getting this tool, even if code can handle by doing minor change, I am okay to move this tool under admin, currently for admin commands, we don't perform any checks is this operation by admin is this missing? |
Reading through the comments and code, I'm OK with other readonly OM DB tools. However, I share the same concern with @bharatviswa504 and @avijayanhwx on exposing tool that can directly modify OM db. If there is a bug in Ozone that needs drop tables, we would better fix it via official code fixes over asking admin do perform hack workaround as it usually will not be approved in most customer production cluster. |
So what is the conclusion here. Majority seems to have concerns about providing generic R/W RocksDB tools. I created a related patch: #1276. It makes it possible to register sub-commands runtime: anybody (with special needs) can register sub-commands to existing CLI and provide additional functionality (it should be registered). What do you suggest with this PR?
Any opinion? |
/pending Any opinion? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking this issue as un-mergeable as requested.
Please use /ready
comment when it's resolved.
Any opinion?
@elek Thanks for your conclusion and suggested options, and thank @bharatviswa504 @sadanand48 @avijayanhwx @bshashikant @xiaoyuyao for the discussion, I will go to check #1276 and go back soon. |
@elek Thanks for the ➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ bin/ozone debug ldb
Missing required option: '--db=<dbPath>'
Usage: ozone debug ldb --db=<dbPath> [COMMAND]
Parse rocksdb file content
--db=<dbPath> Database File Path
Commands:
scan Parse specified metadataTable
list_column_families, ls list all column families in db.
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ cp -r /Users/mbl/projects/github/hadoop-ozone/hadoop-ozone/tools/target/classes/META-INF etc/hadoop/
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ bin/ozone debug ldb
Missing required option: '--db=<dbPath>'
Usage: ozone debug ldb --db=<dbPath> [COMMAND]
Parse rocksdb file content
--db=<dbPath> Database File Path
Commands:
scan Parse specified metadataTable
list_column_families, ls list all column families in db.
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ echo "org.apache.hadoop.ozone.debug.DropTable" >> etc/hadoop/META-INF/services/org.apache.hadoop.hdds.cli.SubcommandWithParent
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ bin/ozone debug ldb
Missing required option: '--db=<dbPath>'
Usage: ozone debug ldb --db=<dbPath> [COMMAND]
Parse rocksdb file content
--db=<dbPath> Database File Path
Commands:
scan Parse specified metadataTable
list_column_families, ls list all column families in db.
drop_column_family drop column family in db.
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ bin/ozone debug ldb --db=/tmp/metadata/scm.db/ ls
default
validCerts
deletedBlocks
pipelines
revokedCerts
containers
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ bin/ozone debug ldb --db=/tmp/metadata/scm.db/ drop_column_family --column_family=deletedBlocks
➜ ozone-1.1.0-SNAPSHOT git:(HDDS-3814) ✗ bin/ozone debug ldb --db=/tmp/metadata/scm.db/ ls
default
validCerts
pipelines
revokedCerts
containers
It's awesome, for the default, we don't need to add the @elek @bharatviswa504 @sadanand48 @avijayanhwx @bshashikant @xiaoyuyao thanks for your review and discussion, please take another look. |
/ready |
Blocking review request is removed.
@bharatviswa504 @avijayanhwx Are you fine with this approach? Command won't be visible unless you put a magic file to the right place (META-INF/services). It's a hidden helper class. Would like to merge if you have no objectinos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for getting back to this late. I am happy with the guards around the delete operation. LGTM +1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for getting back late.
I have one comment.
break; | ||
} | ||
} | ||
rocksDB.dropColumnFamily(toBeDeletedCf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question:
If tablename is not a columnfamily in DB, we pass null for dropColumnFamily, do we need to print some error message if table is not in DB?
And also for the success scenario, do we want to add some message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM.
Merging it now. Thanks the review @bharatviswa504 and @avijayanhwx and the patch @maobaolong |
* master: (23 commits) HDDS-4122. Implement OM Delete Expired Open Key Request and Response (apache#1435) HDDS-4336. ContainerInfo does not persist BCSID (sequenceId) leading to failed replica reports (apache#1488) Remove extra serialization from getBlockID (apache#1470) HDDS-4262. Use ClientID and CallID from Rpc Client to detect retry requests (apache#1436) HDDS-4285. Read is slow due to frequent calls to UGI.getCurrentUser() and getTokens() (apache#1454) HDDS-4312. findbugs check succeeds despite compile error (apache#1476) HDDS-4311. Type-safe config design doc points to OM HA (apache#1477) HDDS-3814. Drop a column family through debug cli tool (apache#1083) HDDS-3728. Bucket space: check quotaUsageInBytes when write key and allocate block. (apache#1458) HDDS-4316. Upgrade to angular 1.8.0 due to CVE-2020-7676 (apache#1481) HDDS-4325. Incompatible return codes from Ozone getconf -confKey (apache#1485). Contributed by Doroszlai, Attila. HDDS-4309. Fix inconsistency in recon config keys starting with recon and not ozone (apache#1478) HDDS-4310: Ozone getconf broke the compatibility (apache#1475) HDDS-4298. Use an interface in Ozone client instead of XceiverClientManager (apache#1460) HDDS-4280. Document notable configurations for Recon. (apache#1448) HDDS-4156. add hierarchical layout to Chinese doc (apache#1368) HDDS-4242. Copy PrefixInfo proto to new project hadoop-ozone/interface-storage (apache#1444) HDDS-4264. Uniform naming conventions of Ozone Shell Options. (apache#1447) HDDS-4271. Avoid logging chunk content in Ozone Insight (apache#1466) HDDS-4299. Display Ratis version with ozone version (apache#1464) ...
What changes were proposed in this pull request?
As it could involve an incompatible commit like HDDS-3614, we have to drop some table like
S3Table
to let OM can start successfully.Now, we have to build and install a ldb of rocksdb, use that tools, we can drop the table
S3Table
, it is not an easy job to build and install ldb.fortunately, we have debug tools which come in Ozone by HDDS-3622, so we can create a drop command to instead of ldb in rocksdb.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3814
How was this patch tested?
An example as follow. We can verify the xxx table is dropped.