HDDS-11285. cli to trigger quota repair and status #7104

Merged: 7 commits merged into apache:master on Sep 5, 2024

@sumitagrawl sumitagrawl commented Aug 21, 2024

What changes were proposed in this pull request?

Adds CLI options to trigger a quota repair and to check its status:

ozone repair quota status [--service-host=<omHost>] [--service-id, --om-service-id=<omServiceId>]

ozone repair quota trigger [--buckets=<buckets>] [--service-host=<omHost>] [--service-id=<omServiceId>]

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11285

How was this patch tested?

  • Unit test case is added
./ozone repair quota trigger --service-host=localhost
ATTENTION: Running as user sumitagrawal. Make sure this is the same user used to run the Ozone process. Are you sure you want to continue (y/N)? y
Run as user: <supressed>
{"bucketCountDiffMap":{},"lastRunFinishedTime":"","lastRunStartTime":1725511849369,"taskId":1,"errorMsg":""}


./ozone repair quota status --service-host=localhost 
ATTENTION: Running as user sumitagrawal. Make sure this is the same user used to run the Ozone process. Are you sure you want to continue (y/N)? y
Run as user: <supressed>
 {"bucketCountDiffMap":{},"lastRunFinishedTime":"Thu Sep 05 10:29:00 IST 2024","lastRunStartTime":"Thu Sep 05 10:28:59 IST 2024","taskId":1,"errorMsg":"BUCKET_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: no matching buckets"}


./ozone repair quota status --service-host=localhost 
ATTENTION: Running as user sumitagrawal. Make sure this is the same user used to run the Ozone process. Are you sure you want to continue (y/N)? y
Run as user: <supressed>
{"bucketCountDiffMap":{},"lastRunFinishedTime":"Thu Sep 05 10:29:00 IST 2024","lastRunStartTime":"Thu Sep 05 10:28:59 IST 2024","taskId":1,"errorMsg":""}

@sumitagrawl sumitagrawl marked this pull request as ready for review August 21, 2024 10:50

ChenSammi commented Aug 27, 2024

@sumitagrawl, is it allowed to submit multiple quota repair commands at the same time for different buckets?

sumitagrawl commented Aug 27, 2024

@sumitagrawl, is it allowed to submit multiple quota repair commands at the same time for different buckets?

No, it's not allowed. Because this operation is resource intensive, only one quota repair task is allowed at a time. Since this is not a routine operation, handling one request at a time is acceptable.
Alternatively, the user can trigger a repair for multiple buckets at once by passing a list of buckets.

@@ -279,6 +279,8 @@ public static boolean isReadOnly(
case PrintCompactionLogDag:
case GetSnapshotInfo:
case GetServerDefaults:
case QuotaRepairStatus:
case QuotaRepairTrigger:

@ChenSammi ChenSammi Sep 4, 2024

QuotaRepairTrigger -> startQuotaRepair
QuotaRepairStatus -> getQuotaRepairStatus

Contributor Author

Updated

@@ -151,6 +151,8 @@ enum Type {
ListOpenFiles = 132;
QuotaRepair = 133;
GetServerDefaults = 134;
QuotaRepairStatus = 135;

@ChenSammi ChenSammi Sep 4, 2024

QuotaRepairStatus -> getQuotaRepairStatus
QuotaRepairTrigger -> startQuotaRepair

Most command enum names follow the "action" + "target" pattern. Let's keep the same pattern. Please also rename the request and response message names accordingly.

Contributor Author

Updated

  // lock in progress operation and reject any other
  if (!IN_PROGRESS.compareAndSet(false, true)) {
    LOG.info("quota repair task already running");
-   return CompletableFuture.supplyAsync(() -> false);
+   throw new OMException("Operation in progress", OMException.ResultCodes.QUOTA_ERROR);

@ChenSammi ChenSammi Sep 4, 2024

Can you change the exception message to something like "There is a quota repair task already running. Please try it later"

Contributor Author

Updated as "Quota repair is already running"
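
The single-task guard discussed in this thread can be sketched as follows. This is a minimal, self-contained illustration and not the Ozone code: the class name `QuotaRepairGuardSketch`, the method `startRepair`, the placeholder bucket names, and the use of `IllegalStateException` in place of `OMException` are all assumptions made for the example. It also assumes the flag is released when the task completes, which the excerpt above does not show.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical sketch of a "one repair at a time" guard; not the Ozone code.
final class QuotaRepairGuardSketch {
  // True while a repair task is running.
  private static final AtomicBoolean IN_PROGRESS = new AtomicBoolean(false);

  // Starts the repair asynchronously, or rejects the call if one is running.
  static CompletableFuture<Boolean> startRepair(
      List<String> buckets, Consumer<List<String>> repairWork) {
    // Atomically claim the flag; only one caller can flip false -> true.
    if (!IN_PROGRESS.compareAndSet(false, true)) {
      throw new IllegalStateException("Quota repair is already running");
    }
    return CompletableFuture
        .supplyAsync(() -> {
          repairWork.accept(buckets);
          return true;
        })
        // Assumption: release the flag when the task ends, even on failure.
        .whenComplete((ok, err) -> IN_PROGRESS.set(false));
  }

  public static void main(String[] args) {
    // Bucket names below are placeholders.
    CompletableFuture<Boolean> run =
        startRepair(List.of("/vol1/bucket1", "/vol1/bucket2"),
            b -> System.out.println("repairing " + b));
    System.out.println("finished: " + run.join());
  }
}
```

Throwing instead of returning a future that resolves to `false` makes the rejection visible to the caller immediately, which is what lets a CLI print a clear message such as the one agreed on above.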

@@ -279,6 +279,8 @@ public static boolean isReadOnly(
case PrintCompactionLogDag:
case GetSnapshotInfo:
case GetServerDefaults:
case GetQuotaRepairStatus:
case StartQuotaRepair:

Contributor

This StartQuotaRepair is a write operation.

Contributor Author

It triggers a repair activity but is not itself a direct write operation to Ratis. As part of the repair, it later triggers write operations to Ratis, depending on what needs to be repaired.
Since the request has no direct write action and so cannot be submitted to Ratis, it follows the read path.
This is similar to "RangerBGSync" and "SnapshotDiff", where the request only triggers the work.

Contributor

I see, thanks for the explanation.
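
The routing decision explained in this thread can be pictured with a small sketch. It is illustrative only: the enum `RequestType` lists a few names from this review plus `CreateKey` as an example write request, and both `isReadOnly` and `route` here are simplified, hypothetical stand-ins rather than the real OM code. The sketch assumes, per the explanation above, that requests classified as read-only are handled directly instead of being submitted through Ratis.

```java
// Hypothetical sketch of read-only classification; not the Ozone code.
final class ReadOnlyRoutingSketch {
  // A few request types; CreateKey stands in for an ordinary write request.
  enum RequestType { GetSnapshotInfo, GetQuotaRepairStatus, StartQuotaRepair, CreateKey }

  // Returns true for requests that do not themselves write through Ratis.
  static boolean isReadOnly(RequestType type) {
    switch (type) {
      case GetSnapshotInfo:
      case GetQuotaRepairStatus:
      case StartQuotaRepair: // only triggers a background task
        return true;
      default:
        return false;
    }
  }

  // Chooses the handling path based on the classification.
  static String route(RequestType type) {
    return isReadOnly(type) ? "direct" : "ratis";
  }

  public static void main(String[] args) {
    for (RequestType t : RequestType.values()) {
      System.out.println(t + " -> " + route(t));
    }
  }
}
```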

@ChenSammi

The last patch LGTM. Thanks @sumitagrawl. One further suggestion: can we display lastRunFinishedTime and lastRunStartTime in a human-readable format?

@sumitagrawl

The last patch LGTM. Thanks @sumitagrawl. One further suggestion: can we display lastRunFinishedTime and lastRunStartTime in a human-readable format?

Handled. Sample output now:
{"bucketCountDiffMap":{},"lastRunFinishedTime":"","lastRunStartTime":"Thu Sep 05 10:28:59 IST 2024","taskId":1,"errorMsg":""}

{"bucketCountDiffMap":{},"lastRunFinishedTime":"Thu Sep 05 10:29:00 IST 2024","lastRunStartTime":"Thu Sep 05 10:28:59 IST 2024","taskId":1,"errorMsg":"BUCKET_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: no matching buckets"}
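
The timestamps in the sample output are consistent with the layout that `java.util.Date.toString()` produces (day of week, month, day, time, zone, year). The sketch below shows one way an epoch-millisecond value could be rendered in that layout. The class name `RepairTimeFormatSketch`, the helper `formatTime`, and the convention that a non-positive value is rendered as an empty string are assumptions for illustration, not taken from the patch.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch of rendering epoch millis as readable text; not the Ozone code.
final class RepairTimeFormatSketch {
  // Same field order as java.util.Date.toString().
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss zzz yyyy", Locale.US);

  // Assumption: a value <= 0 means "no time recorded" and maps to "".
  static String formatTime(long epochMillis, ZoneId zone) {
    if (epochMillis <= 0) {
      return "";
    }
    return FORMAT.format(Instant.ofEpochMilli(epochMillis).atZone(zone));
  }

  public static void main(String[] args) {
    // 1725511849369 is the raw start time from the first sample run in this PR.
    System.out.println(formatTime(1725511849369L, ZoneId.systemDefault()));
    System.out.println("[" + formatTime(0L, ZoneId.systemDefault()) + "]");
  }
}
```

Passing the zone explicitly keeps the helper deterministic in tests; a CLI would typically use the system default zone so the output matches the operator's local time.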

@sumitagrawl sumitagrawl merged commit 3e1188a into apache:master Sep 5, 2024
39 checks passed