Skip to content

HDDS-15155. Add ozone debug ldb drop-column-family subcommand#10167

Closed
smengcl wants to merge 2 commits into
apache:masterfrom
smengcl:HDDS-15155-drop-cf
Closed

HDDS-15155. Add ozone debug ldb drop-column-family subcommand#10167
smengcl wants to merge 2 commits into
apache:masterfrom
smengcl:HDDS-15155-drop-cf

Conversation

@smengcl
Copy link
Copy Markdown
Contributor

@smengcl smengcl commented May 1, 2026

Generated-by: GPT-5.5

What changes were proposed in this pull request?

Add drop_column_family subcommand under ozone debug ldb.

e.g.

ozone debug ldb --db /path/to/om.db drop_column_family --cf snapshotInfoTable -y

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15155

How was this patch tested?

  • Added unit test and integration test

@smengcl smengcl force-pushed the HDDS-15155-drop-cf branch from c789942 to a2721f7 Compare May 1, 2026 01:54
@smengcl smengcl added the tools Tools that helps with debugging label May 1, 2026
Copy link
Copy Markdown
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smengcl for the patch.

@jojochuang
Copy link
Copy Markdown
Contributor

jojochuang commented May 1, 2026

looks good

@smengcl smengcl marked this pull request as ready for review May 1, 2026 18:49
Copilot AI review requested due to automatic review settings May 1, 2026 18:49
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new ozone debug ldb subcommand to drop a RocksDB column family, along with unit/integration tests and user documentation to support safe usage.

Changes:

  • Add drop-column-family subcommand to ozone debug ldb for dropping a specified column family (with interactive confirmation by default).
  • Add unit tests covering success, error cases, confirmation behavior, and failure when DB is already open.
  • Add an integration test validating dropping snapshotInfoTable from an OM DB checkpoint, plus documentation updates.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/shell/TestOzoneDebugShell.java Adds integration coverage for dropping snapshotInfoTable from an OM checkpoint DB.
hadoop-ozone/cli-debug/src/test/java/org/apache/hadoop/ozone/debug/ldb/TestDropColumnFamily.java Adds unit tests for the new drop-column-family command (including confirmations and DB-open failure).
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/ldb/RDBParser.java Registers DropColumnFamily as an ozone debug ldb subcommand.
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/ldb/DropColumnFamily.java Implements the new drop-column-family command.
hadoop-hdds/docs/content/tools/debug/Ldb.md Documents the new command and its confirmation behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@adoroszlai adoroszlai changed the title HDDS-15155. Add ozone debug ldb drop_column_family subcommand HDDS-15155. Add ozone debug ldb drop-column-family subcommand May 3, 2026
@errose28 errose28 self-requested a review May 4, 2026 16:17
Copy link
Copy Markdown
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is doing writes to the DB, therefore it should be under ozone repair, not ozone debug.

What is the use case for adding this? It is a dangerous command to have available on all column families when many have no valid reason to be dropped. If there is a specific issue that this is aiming to recover from, we should create a dedicated repair tool for that issue instead opening up deletion of all RocksDB data.

@smengcl
Copy link
Copy Markdown
Contributor Author

smengcl commented May 4, 2026

This is doing writes to the DB, therefore it should be under ozone repair, not ozone debug.

What is the use case for adding this? It is a dangerous command to have available on all column families when many have no valid reason to be dropped. If there is a specific issue that this is aiming to recover from, we should create a dedicated repair tool for that issue instead opening up deletion of all RocksDB data.

Hey @errose28 , this is not intended to be a repair tool. It belongs to troubleshooting when we are trying to load a RocksDB from another cluster but we don't have the snapshot DBs. Most of the times we won't have rocksdb tools binary handy for that cluster. rocksdbjni already has it but doesn't expose it.

This is no more dangerous than admin deleting the ozone metadata by accident TBH.

@errose28
Copy link
Copy Markdown
Contributor

errose28 commented May 4, 2026

Regardless of how dangerous it is, debug commands are supposed to be read only so if we add this it needs to be under ozone repair.

Regarding the use case, I'm not sure this is common enough to warrant a dangerous command like this being generally added to upstream.

This is no more dangerous than admin deleting the ozone metadata by accident TBH.

It's generally understood that doing manual filesystem operations on Ozone's internal state is dangerous and may result in corruption. It is not as well understood that commands added and endorsed by maintainers can do the same.

@smengcl
Copy link
Copy Markdown
Contributor Author

smengcl commented May 4, 2026

Regardless of how dangerous it is, debug commands are supposed to be read only so if we add this it needs to be under ozone repair.

Regarding the use case, I'm not sure this is common enough to warrant a dangerous command like this being generally added to upstream.

This is no more dangerous than admin deleting the ozone metadata by accident TBH.

It's generally understood that doing manual filesystem operations on Ozone's internal state is dangerous and may result in corruption. It is not as well understood that commands added and endorsed by maintainers can do the same.

Fair. I will close the PR and it will live as a patch file in the JIRA.

@smengcl smengcl closed this May 4, 2026
@smengcl
Copy link
Copy Markdown
Contributor Author

smengcl commented May 4, 2026

Thanks @adoroszlai , @jojochuang and @errose28 for the reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI-gen tools Tools that helps with debugging

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants