Conversation

@ArafatKhan2198
Contributor

@ArafatKhan2198 ArafatKhan2198 commented Jan 17, 2023

What changes were proposed in this pull request?

The ContainerEndpoint and ContainerKeyMapperTask have been updated to support both legacy and file-system optimized (FSO) buckets. Previously, only the KeyTable for legacy buckets was referenced; now both the KeyTable and the FileTable are used to fetch metadata.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-5463

How was this patch tested?

Tested the API manually and with unit tests.

@ArafatKhan2198 ArafatKhan2198 marked this pull request as ready for review January 17, 2023 07:12
@ArafatKhan2198
Contributor Author

@aryangupta1998 @devmadhuu @dombizita @sadanand48 Can you please take a look at this?

Contributor

@aryangupta1998 aryangupta1998 left a comment

I think we don't have any usage of the 'getBucketLayout()' function as we are directly passing the bucket layout in the 'getKeyTable()' function in ContainerEndpoint.java. Can we remove 'getBucketLayout()'?

@kerneltime
Contributor

@GeorgeJahad can you please take a look as well.

Contributor

@aryangupta1998 aryangupta1998 left a comment

+1, LGTM!

@GeorgeJahad
Contributor

Is there a reason why no tests have been added to confirm this fix?

@GeorgeJahad
Contributor

What about object store buckets? I know those are similar to legacy buckets but the way recon is coded they won't get handled, will they? Is that a separate PR?

@DaveTeng0
Contributor

Hey @ArafatKhan2198 ~ there were some new comments, please help take a look.
Thanks!

@jojochuang
Contributor

Actually, getKeyTable() returns the fileTable RocksDB column family if the bucket layout is FSO and the keyTable otherwise, so there is no need to distinguish between OBJECT_STORE and LEGACY.

But good point on the test case. We need that.
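
The layout-based table selection described in this comment can be sketched as follows. This is a minimal, self-contained illustration and not the Ozone implementation: the `Layout` enum, the two in-memory maps, and this `getKeyTable` signature are hypothetical stand-ins for the real bucket layout type and RocksDB column families.

```java
import java.util.Map;
import java.util.TreeMap;

public class TableSelectionSketch {
    // Hypothetical stand-in for the bucket layouts named in this thread.
    enum Layout { FILE_SYSTEM_OPTIMIZED, OBJECT_STORE, LEGACY }

    // Hypothetical in-memory stand-ins for the two column families.
    static final Map<String, String> KEY_TABLE = new TreeMap<>();
    static final Map<String, String> FILE_TABLE = new TreeMap<>();

    // Only FSO maps to the file table; OBJECT_STORE and LEGACY share the key table.
    static Map<String, String> getKeyTable(Layout layout) {
        return layout == Layout.FILE_SYSTEM_OPTIMIZED ? FILE_TABLE : KEY_TABLE;
    }

    public static void main(String[] args) {
        for (Layout layout : Layout.values()) {
            String table = getKeyTable(layout) == FILE_TABLE ? "fileTable" : "keyTable";
            System.out.println(layout + " -> " + table);
        }
    }
}
```

Because the branch is only on "FSO or not", a caller that already knows the bucket layout does not need separate handling for OBJECT_STORE and LEGACY.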

@ArafatKhan2198
Contributor Author

@GeorgeJahad @jojochuang can you please take a look!

Contributor

@dombizita dombizita left a comment

Thanks for working on this, @ArafatKhan2198. Overall it looks good to me; I added an inline comment.

@GeorgeJahad
Contributor

It would be good to add an fso test here as well:

Contributor

@sumitagrawl sumitagrawl left a comment

LGTM. @ArafatKhan2198, please check the comments below.

@GeorgeJahad
Contributor

I don't see any smoketests for this api here: https://github.com/apache/ozone/tree/master/hadoop-ozone/dist/src/main/smoketest/recon

@sumitagrawl
Contributor

IMO, in pseudocode:

  1. Get all keys (FSO/OBS) with limit, using prevKey as the starting point.
  2. For each key:
  • Get the bucket type (OBS/FSO).
  • If OBS, as in the old code, get the KeyInfo and extract the metadata.
  • If FSO, generate the key based on the path, then use the generated key to get the KeyInfo and extract the metadata.

One possible optimization: when generating the key based on the path, avoid fetching the parentId again and again.
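
The FSO branch and the parentId optimization in this pseudocode could look roughly like the sketch below. It is an illustration under stated assumptions, not Recon code: the `directoryIds` lookup map, the `buildKey` and `resolveParentId` helpers, and the numeric IDs are all hypothetical. The `/volumeId/bucketId/parentId/fileName` shape, with the bucket ID as parent for files at the bucket root, follows the example ContainerKey rows shown later in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class FsoKeySketch {
    // Hypothetical directory lookup: "parentId/dirName" -> that directory's object ID.
    private final Map<String, Long> directoryIds;
    // Cache from a directory path (for example "a/b") to its resolved object ID.
    private final Map<String, Long> parentIdCache = new HashMap<>();
    // Counts uncached directory lookups, for illustration only.
    int lookups = 0;

    FsoKeySketch(Map<String, Long> directoryIds) {
        this.directoryIds = directoryIds;
    }

    // Builds "/volumeId/bucketId/parentId/fileName" for a path inside a bucket.
    String buildKey(long volumeId, long bucketId, String path) {
        int slash = path.lastIndexOf('/');
        String dirPath = slash < 0 ? "" : path.substring(0, slash);
        String fileName = path.substring(slash + 1);
        long parentId = resolveParentId(bucketId, dirPath);
        return "/" + volumeId + "/" + bucketId + "/" + parentId + "/" + fileName;
    }

    // Walks the directory components once, then serves repeats from the cache.
    private long resolveParentId(long bucketId, String dirPath) {
        if (dirPath.isEmpty()) {
            return bucketId; // files at the bucket root use the bucket ID as parent
        }
        Long cached = parentIdCache.get(dirPath);
        if (cached != null) {
            return cached;
        }
        long parentId = bucketId;
        for (String dir : dirPath.split("/")) {
            lookups++;
            Long id = directoryIds.get(parentId + "/" + dir);
            if (id == null) {
                throw new IllegalArgumentException("Unknown directory: " + dir);
            }
            parentId = id;
        }
        parentIdCache.put(dirPath, parentId);
        return parentId;
    }

    public static void main(String[] args) {
        Map<String, Long> dirs = new HashMap<>();
        dirs.put("20/a", 30L); // directory "a" directly under bucket 20
        dirs.put("30/b", 40L); // directory "b" under "a"
        FsoKeySketch sketch = new FsoKeySketch(dirs);
        System.out.println(sketch.buildKey(10L, 20L, "a/b/file1"));
        System.out.println(sketch.buildKey(10L, 20L, "a/b/file2"));
        System.out.println("directory lookups: " + sketch.lookups);
    }
}
```

The second call for a file in the same directory is answered from the cache, which is the saving the optimization note is after.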

@ArafatKhan2198
Contributor Author

@GeorgeJahad @ChenSammi @jojochuang Could you kindly review this patch? It has been open for some time. Please let me know if there are any additional changes required to expedite the merge process. Thank you.

Contributor

@sumitagrawl sumitagrawl left a comment

@ArafatKhan2198 we need an implementation that supports pagination, as used by the Recon UI.

@ArafatKhan2198
Contributor Author

ArafatKhan2198 commented Mar 27, 2023

While working on this patch, I had a question about the endpoint GET /api/v1/containers/:id/keys, which is handled by the getKeysForContainer() method in the ContainerEndpoint class. This endpoint supports two optional query parameters, prevKey and limit.

Initially, we assumed that prevKey was a prefix filter: only keys whose names start with the given prefix would be returned, and all other keys would be excluded from the response. That assumption turned out to be wrong.

After a discussion, this is our understanding of the prevKey parameter of the ContainerEndpoint in Recon:

  • The Recon UI can use getKeysForContainer() for pagination through the prevKey and limit query parameters.
  • The prevKey parameter specifies the last key seen on the previous page. The method returns the keys that come after prevKey, up to the number given by limit.
  • For example, to retrieve the first 10 keys in a container, the UI calls this method with a limit of 10 and no prevKey. The method returns the first 10 keys in the container along with the last key seen (the 10th key).
  • To retrieve the next 10 keys, the UI calls the method again with the same limit of 10 and prevKey set to the last key seen on the previous page (the 10th key). The method returns the next 10 keys, starting from the key after the 10th, along with the last key seen (the 20th key).
  • This process can be repeated for as many pages as the UI wants to retrieve. The UI can also change limit to retrieve more or fewer keys per page.
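
The paging behaviour described in the list above can be sketched with a sorted set. This is a minimal illustration, not the Recon implementation: `getPage` is a hypothetical helper, and a `TreeSet` of strings stands in for the sorted ContainerKey table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class PrevKeyPaginationSketch {
    // Returns up to `limit` keys that sort strictly after prevKey.
    // A null or empty prevKey means "start from the first key".
    static List<String> getPage(NavigableSet<String> keys, String prevKey, int limit) {
        NavigableSet<String> remaining =
            (prevKey == null || prevKey.isEmpty()) ? keys : keys.tailSet(prevKey, false);
        List<String> page = new ArrayList<>();
        for (String key : remaining) {
            if (page.size() >= limit) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>();
        for (int i = 1; i <= 25; i++) {
            keys.add(String.format("key%02d", i)); // zero-padded so string order matches numeric order
        }
        String prevKey = null;
        while (true) {
            List<String> page = getPage(keys, prevKey, 10);
            if (page.isEmpty()) {
                break;
            }
            String lastKey = page.get(page.size() - 1);
            System.out.println(page.size() + " keys, last key = " + lastKey);
            prevKey = lastKey; // the caller passes this back to fetch the next page
        }
    }
}
```

With 25 keys and a limit of 10, the loop fetches pages of 10, 10, and 5 keys. Note that prevKey acts as an exclusive cursor, not as a prefix filter.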

@ArafatKhan2198
Contributor Author

To conclude, there is no need for any conversion from object ID to volume/bucket name. Our goal should be to ensure that both the FSO and OBS keys are stored in the ContainerKey table, as it is the source of information for this endpoint. For example:

ozone sh bucket create --layout FILE_SYSTEM_OPTIMIZED /s3v/fso-bucket
ozone sh key put s3v/fso-bucket/key1-fso NOTICE.txt
ozone sh key put s3v/fso-bucket/key2-fso NOTICE.txt

ozone sh bucket create --layout OBJECT_STORE /s3v/obs-bucket
ozone sh key put s3v/obs-bucket/key1-obs NOTICE.txt
ozone sh key put s3v/obs-bucket/key2-obs NOTICE.txt

ozone sh bucket create --layout LEGACY s3v/legacy-bucket
ozone sh key put s3v/legacy-bucket/key1-legacy NOTICE.txt
ozone sh key put s3v/legacy-bucket/key2-legacy NOTICE.txt

The ContainerKey table generated for this data is shown below. The FSO key information appears in the form of object IDs:

| KeyPrefix | Container ID |
|---|---|
| /-4611686018427388160/-9223372036854774528/-9223372036854774528/key1-fso | 1 |
| /s3v/obs-bucket/key2-obs | 1 |
| /-4611686018427388160/-9223372036854774528/-9223372036854774528/key2-fso | 2 |
| /s3v/legacy-bucket/key1-legacy | 2 |
| /s3v/legacy-bucket/key2-legacy | 3 |
| /s3v/obs-bucket/key1-obs | 3 |

Contributor

@sumitagrawl sumitagrawl left a comment

@ArafatKhan2198 LGTM +1
