Skip to content

Conversation

@devmadhuu
Copy link
Contributor

Understand the age of data ( age of buckets, age of objects, age of  volumes) 

  1. This requirement will be limited by the number of buckets (say top 10) to show with their age information as a given volume can have a large number of buckets. Same holds true for showing a limited number of volumes (say top 10) with their age information if root path is given. 
  2. Top 10 criteria can be determined based on following factors:
  3. Age of Top 10 buckets can be listed based on either count of keys or size of buckets, if path is upto a specific volume.
  4. Age of top 10 volumes can be listed based on either count of buckets or size of volumes if path is given as root .
  5. If the path given by the user is upto bucket level,  then top 10 keys based on size will be given with their age details.
  6. If the path given by the user is upto key or leaf level, then only that key age details will be returned along with size.

https://issues.apache.org/jira/browse/HDDS-7650

How was this patch tested?

This patch is tested using postman by verifying backend responses:
API details:
If Path is Bucket:

Keys
TopN Keys with Age, Size details
By Count - Give user option to select 10, 20, 30 etc
By Size - Give user option to select With or Without Replica
Dirs
TopN Dirs with Age, Size, dirCount, keycount details
By Count - Give user option to select 10, 20, 30 etc
By Size - Give user option to select With or Without Replica

If Path is Volume:

Buckets
TopN Buckets with Age, Size, dirCount, keycount details
By Count - Give user option to select 10, 20, 30 etc
By Size - Give user option to select With or Without Replica

If Path is Directory:

Keys
TopN Keys with Age, Size details
By Count - Give user option to select 10, 20, 30 etc
By Size - Give user option to select With or Without Replica
Dirs
TopN Dirs with Age, Size, dirCount, keycount details
By Count - Give user option to select 10, 20, 30 etc
By Size - Give user option to select With or Without Replica

If Path is Key:

Display: Size, Age details

API End Point :
Request Type: GET
API Path: /api/v1/namespace/entitymetrics?path=<path of volume/bucket/dir/key>
Sample Response:
{
"entityInfo":

{ "type": "DIRECTORY", "numVolume": -1, "numBucket": -1, "numDir": 2, "numKey": 5, "status": "OK" }

,
"childEntityMetrics": [

{ "key": false, "path": "/volume1/bucket1/dir1/dir2", "size": 86060, "sizeWithReplica": 258180, "isKey": false, "keyCount": 4, "bucketCount": -1, "dirCount": 1, "volumeCount": -1, "type": "DIRECTORY", "creationTime": 1671801625290, "modificationTime": 1671801625290 }

,

{ "key": true, "path": "/volume1/bucket1/dir1/LICENSE.txt", "size": 21515, "sizeWithReplica": 64545, "isKey": true, "keyCount": 0, "bucketCount": -1, "dirCount": -1, "volumeCount": -1, "type": "KEY", "creationTime": 1671801625290, "modificationTime": 1671801625290 }

],
"status": "OK"
}

@devmadhuu
Copy link
Contributor Author

@sumitagrawl @dombizita @smengcl , pls review

/** Age of entity in
* human-readable format. */
@JsonProperty("age")
private long age;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can have same name as createTime, UI can calculate age and display name accordingly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl - Thanks for reviewing. It has been fixed. Pls re-review.

@GET
@Path("/entitymetrics")
public Response getEntityMetrics(@QueryParam("path") String path,
@DefaultValue("size")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orderby should include "age" also a filter and as default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl - Thanks for reviewing. It has been fixed. Pls re-review.

}
}
entityMetricsResponse.setNsSummaryResponse(nsSummaryData);
entityMetricsResponse.setChildMetricsListMap(childMetricsListMap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

childMetricsList should be part of NsSummary as it belongs to it.

}

@GET
@Path("/entitymetrics")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uri name can be entityInfo or other better name, not a metrics as its not a metric data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl - Thanks for reviewing. It has been fixed. Pls re-review.

@kerneltime kerneltime requested a review from fapifta January 9, 2023 17:26
@kerneltime
Copy link
Contributor

@duongkame can you please take a look?

@devmadhuu
Copy link
Contributor Author

@ChenSammi , @duongkame pls review

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please add Unit cases

@devmadhuu
Copy link
Contributor Author

LGTM, please add Unit cases

@sumitagrawl - Added test cases. Pls review.

Copy link
Contributor

@fapifta fapifta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some doubts with the getEntityMetrics endpoint that we are defining in this patch, and also I am not sure why we need the EntityInfoResponse... So let's discuss this first, depending on the answers I might have some suggestions for the patch itself besides what I noted inline.

Probably this is just my ignorance, but it is unclear where we would like to show this data...

The full dataset that the EntityInfoResponse seem to provide can be calculated from the results of /summary and /du if I'm correct.
If during rendering the UI we call both endpoints that we call from getEntityMetrics as well, then it is pointless to have a separate endboint, as that just pushes calculation burden on Recon, while client side should be able to calculate the data in theory.

If the calculation can not be done on the client side for any reason...
The thing is that EntityInfoRespone, DUResponse and CountStats seems to be redundant, and it feels like they are storing the same information, what is the reason behind these three?
Also the getEntityMetrics method does not use the EntityHandlers, but it has a big if - elseif - else statement within each a fairly similar switch - case statement, should that be done in the EntityHandler implementations which would help to simplify this newly added method as I feel when I look at it. Is there a reason to keep these if-switch blocks?

@AdminOnly
public class NSSummaryEndpoint {

public static final String COUNT = "count";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we have createTime here? Also shouldn't that be cTime or creationTime?
As these three belongs together, wouldn't it worth it to create an enum for the values we accept, and validate and parse the parameters based on the enum?


@GET
@Path("/entityinfo")
public Response getEntityMetrics(@QueryParam("path") String path,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add API documentation for this method if we keep it?


private DUResponse getDuResponse(String path, boolean listFile,
boolean withReplica) throws IOException {
AtomicBoolean isError = new AtomicBoolean(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can use a simple boolean here, even if the forEach runs in parallel mode, the only change to the value is to set it to true, and we just check after execution is done, so we don't need atomicity on it as it does not matter i one or 3k threads sets it to true at once, but correct me if I am wrong.

if (path == null || path.length() == 0) {
return Response.status(Response.Status.BAD_REQUEST).build();
}
return Response.ok(getDuResponse(path, listFile, withReplica)).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After extracting the method here, but even before that, we can remove the @SuppressWarnings for the /du endpoint method in line 287

@smitajoshi12
Copy link
Contributor

smitajoshi12 commented Feb 2, 2023

Can we get mutiple Parent entityInfo records { "type": "DIRECTORY", "numVolume": -1, "numBucket": -1, "numDir": 2, "numKey": 5, "status": "OK" } ?

@DaveTeng0
Copy link
Contributor

Hey @devmadhuu ~ there are some new comments. Please help address, thanks!

@adoroszlai
Copy link
Contributor

/pending

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this issue as un-mergeable as requested.

Please use /ready comment when it's resolved.

Please note that the PR will be closed after 21 days of inactivity from now. (But can be re-opened anytime later...)

/pending

@github-actions
Copy link

github-actions bot commented Feb 5, 2024

Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.

It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.

It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.

If you need ANY help to finish this PR, please contact the community on the mailing list or the slack channel."

@github-actions github-actions bot closed this Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants