HDDS-8039. Allow container inspector to run from ozone debug. #4337

Merged
merged 7 commits into from Apr 5, 2023

@szetszwo szetszwo commented Mar 2, 2023

What changes were proposed in this pull request?

The KeyValueContainerMetadataInspector can currently be run by passing the option
-Dozone.datanode.container.metadata.inspector=inspect on datanode startup.

In this JIRA, we will allow it to be run with the ozone debug command.
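
For illustration, here is a minimal sketch of how a startup system property like the one above might be mapped to an inspector mode. It is not the actual Ozone code: the class InspectorModeSketch, the Mode enum, and the parse helper are hypothetical names, and treating repair as a second accepted value is an assumption based on the repair mode discussed later in this PR.

```java
import java.util.Locale;

public class InspectorModeSketch {
    // Property name taken from the PR description.
    static final String PROPERTY = "ozone.datanode.container.metadata.inspector";

    // Hypothetical modes for illustration; the real inspector may define them differently.
    enum Mode { OFF, INSPECT, REPAIR }

    // Map a raw property value to a mode; unset or unrecognized values mean OFF.
    static Mode parse(String value) {
        if (value == null) {
            return Mode.OFF;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "inspect":
                return Mode.INSPECT;
            case "repair": // assumed value, not confirmed by the PR description
                return Mode.REPAIR;
            default:
                return Mode.OFF;
        }
    }

    public static void main(String[] args) {
        // Example: java -Dozone.datanode.container.metadata.inspector=inspect InspectorModeSketch
        Mode mode = parse(System.getProperty(PROPERTY));
        System.out.println("inspector mode: " + mode);
    }
}
```

Defaulting to OFF for anything unrecognized keeps a mistyped property from triggering an unintended mode, which seems the safer choice for a tool that can also repair metadata.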

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-8039

How was this patch tested?

Will add some new tests.

@swagle swagle requested a review from errose28 March 3, 2023 04:38

@errose28 errose28 left a comment

Thanks for improving this tool @szetszwo. Can you add unit tests to TestKeyValueContainerMetadataInspector for the new pending delete byte count?

I think in a follow up we might want to remove the code that can run this on datanode startup and just have it run from CLI for simplicity. Also, it looks like this CLI version only supports inspect mode. What's the plan for supporting repair?

szetszwo commented Mar 4, 2023

@errose28, thanks a lot for reviewing this! Yes, added some new tests.

szetszwo commented Mar 4, 2023

> I think in a follow up we might want to remove the code that can run this on datanode startup and just have it run from CLI for simplicity. Also, it looks like this CLI version only supports inspect mode. What's the plan for supporting repair?

Do you mean removing the option -Dozone.datanode.container.metadata.inspector=inspect for running the KeyValueContainerMetadataInspector on datanode startup? Sounds good.

We may add a new command

ozone repair <subcommands>

for running it in the repair mode.

kerneltime commented

@tanvipenumudy can you please take a look as well?

@duongkame duongkame left a comment

LGTM.

@sumitagrawl sumitagrawl left a comment

LGTM +1

tanvipenumudy commented

Thank you @szetszwo, LGTM +1

@errose28 errose28 left a comment

Thanks for the updates @szetszwo. I think there's still one bug the tests aren't catching, but the rest looks good.

@errose28 errose28 left a comment

Thanks for the update @szetszwo, LGTM. The failed test is unrelated and documented in https://issues.apache.org/jira/browse/HDDS-8089. I'll retrigger and then merge this.

@adoroszlai adoroszlai merged commit d4d90c7 into apache:master Apr 5, 2023
27 checks passed
adoroszlai commented

Thanks @szetszwo for the patch, @duongkame, @errose28, @sumitagrawl, @tanvipenumudy for the review.

errose28 added a commit to errose28/ozone that referenced this pull request Apr 6, 2023
* master: (155 commits)
  update readme (apache#4535)
  HDDS-8374. Disable flaky unit test: TestContainerStateCounts
  HDDS-8016. updated the ozone doc for linked bucket and deletion async limitation (apache#4526)
  HDDS-8237. [Snapshot] loadDb() used by SstFiltering service creates extraneous directories. (apache#4446)
  HDDS-8035. Intermittent timeout in TestOzoneManagerHAWithData.testOMHAMetrics (apache#4362)
  HDDS-8039. Allow container inspector to run from ozone debug. (apache#4337)
  HDDS-8304. [Snapshot] Reduce flakiness in testSkipTrackingWithZeroSnapshot (apache#4487)
  HDDS-7974. [Snapshot] KeyDeletingService to be aware of Ozone snapshots (apache#4486)
  HDDS-8368. ReplicationManager: Create ContainerReplicaOp with correct target Datanode (apache#4532)
  HDDS-8358. Fix the space usage comparator in ContainerBalancerSelectionCriteria (apache#4527)
  HDDS-8359. ReplicationManager: Fix getContainerReplicationHealth() so that it builds ContainerCheckRequest correctly (apache#4528)
  HDDS-8361. Useless object in TestOzoneBlockTokenIdentifier (apache#4517)
  HDDS-8325. Consolidate and refine RocksDB metrics of services (apache#4506)
  HDDS-8135. Incorrect synchronization during certificate renewal in DefaultCertificateClient. (apache#4381)
  HDDS-8127. Exclude deleted containers from Recon container count (apache#4440)
  HDDS-8364. ReadReplicas may give wrong results with topology-aware read enabled (apache#4522)
  HDDS-8354. Avoid WARNING about ObjectEndpoint#get (apache#4515)
  HDDS-8324. DN data cache gets removed randomly asking for data from disk (apache#4499)
  HDDS-8291. Upgrade to Hadoop 3.3.5 (apache#4484)
  HDDS-8355. Mark TestOMRatisSnapshots#testInstallSnapshot as flaky
  ...
7 participants