HDDS-12835. Recon - Improve NSSummary tasks execution using custom lightweight codec. #8541
devmadhuu wants to merge 21 commits into
Conversation
|
Thanks @devmadhuu for fixing the javadoc. Note there are several test failures due to: |
Yeah @adoroszlai , I am aware of those and looking into it already. |
| import org.apache.hadoop.ozone.protocolPB.OMPBHelper; | ||
|
|
||
| /** | ||
| * OM database definitions. |
There was a problem hiding this comment.
Please don't duplicate javadoc from OMDBDefinition. Instead, add a short explanation of how this is different and why it is needed.
| /** volumeTable: /volume :- VolumeInfo. */ | ||
| public static final DBColumnFamilyDefinition<String, OmVolumeArgs> VOLUME_TABLE_DEF | ||
| = new DBColumnFamilyDefinition<>(VOLUME_TABLE, | ||
| StringCodec.get(), | ||
| OmVolumeArgs.getCodec()); |
There was a problem hiding this comment.
Can remove and use from OMDBDefinition?
| * Parses BucketInfo protobuf and creates OmBucketInfo without deserializing ACL list. | ||
| * @param bucketInfo | ||
| * @return instance of OmBucketInfo | ||
| */ | ||
| public static OmBucketInfo getOmBucketInfoFromProtobuf(OzoneManagerProtocolProtos.BucketInfo bucketInfo) { | ||
| OmBucketInfo.Builder obib = OmBucketInfo.newBuilder() | ||
| .setVolumeName(bucketInfo.getVolumeName()) | ||
| .setBucketName(bucketInfo.getBucketName()) | ||
| .setIsVersionEnabled(bucketInfo.getIsVersionEnabled()) | ||
| .setStorageType(StorageType.valueOf(bucketInfo.getStorageType())) | ||
| .setCreationTime(bucketInfo.getCreationTime()) | ||
| .setUsedBytes(bucketInfo.getUsedBytes()) | ||
| .setModificationTime(bucketInfo.getModificationTime()) | ||
| .setQuotaInBytes(bucketInfo.getQuotaInBytes()) | ||
| .setUsedNamespace(bucketInfo.getUsedNamespace()) | ||
| .setQuotaInNamespace(bucketInfo.getQuotaInNamespace()); |
There was a problem hiding this comment.
Instead of re-implementing most of the conversion logic without ACL, please:
- move this to OmBucketInfo as a helper that returns OmBucketInfo.Builder
- let both "full" and "reduced" conversion call the helper and then build(), with "full" adding further logic between the two steps
Apply the same to OmKeyInfo, etc.
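The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the actual Ozone code: `BucketProto`, `Bucket`, `Builder`, and the method names `newBuilderWithoutAcls`, `fromProtoFull`, and `fromProtoReduced` are hypothetical stand-ins for the protobuf message, `OmBucketInfo`, and its builder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified stand-ins; these are NOT the real Ozone classes. */
final class BucketConversionSketch {

  /** Stand-in for the BucketInfo protobuf message. */
  static final class BucketProto {
    final String volumeName;
    final String bucketName;
    final long usedBytes;
    final List<String> acls;

    BucketProto(String volumeName, String bucketName, long usedBytes, List<String> acls) {
      this.volumeName = volumeName;
      this.bucketName = bucketName;
      this.usedBytes = usedBytes;
      this.acls = acls;
    }
  }

  /** Stand-in for the in-memory bucket object. */
  static final class Bucket {
    final String volumeName;
    final String bucketName;
    final long usedBytes;
    final List<String> acls;

    private Bucket(Builder b) {
      this.volumeName = b.volumeName;
      this.bucketName = b.bucketName;
      this.usedBytes = b.usedBytes;
      this.acls = Collections.unmodifiableList(new ArrayList<>(b.acls));
    }
  }

  static final class Builder {
    private String volumeName;
    private String bucketName;
    private long usedBytes;
    private List<String> acls = new ArrayList<>();

    Builder setVolumeName(String v) { this.volumeName = v; return this; }
    Builder setBucketName(String v) { this.bucketName = v; return this; }
    Builder setUsedBytes(long v) { this.usedBytes = v; return this; }
    Builder setAcls(List<String> v) { this.acls = new ArrayList<>(v); return this; }
    Bucket build() { return new Bucket(this); }
  }

  /** Shared helper: copies only the cheap scalar fields and returns the builder. */
  static Builder newBuilderWithoutAcls(BucketProto proto) {
    return new Builder()
        .setVolumeName(proto.volumeName)
        .setBucketName(proto.bucketName)
        .setUsedBytes(proto.usedBytes);
  }

  /** "Full" conversion: helper first, then the ACL step, then build(). */
  static Bucket fromProtoFull(BucketProto proto) {
    return newBuilderWithoutAcls(proto).setAcls(proto.acls).build();
  }

  /** "Reduced" conversion: helper, then build() directly; ACLs are never touched. */
  static Bucket fromProtoReduced(BucketProto proto) {
    return newBuilderWithoutAcls(proto).build();
  }

  public static void main(String[] args) {
    BucketProto proto = new BucketProto("vol1", "bucket1", 1024L,
        Collections.singletonList("user:alice:rw"));
    System.out.println("full ACL count: " + fromProtoFull(proto).acls.size());
    System.out.println("reduced ACL count: " + fromProtoReduced(proto).acls.size());
  }
}
```

With this shape, a new scalar field only has to be added in the shared helper, so the full and reduced conversions cannot drift apart.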
There was a problem hiding this comment.
@adoroszlai These definitions are specific to the use case and keep only the minimum that is required. There can be multiple variations of building the object, e.g.:
- one use case may need to decode only parentId and usedSpace
- another may need the blockList information to be decoded
So I think keeping all of them in OmBucketInfo is not correct, as they can keep changing, and they are meant to be used within that context; otherwise there can be issues.
Handling the code duplication by providing a partial method is confusing, as it does not define the use case of the method. So we need to keep it closer to where it is being used, like Recon.
There was a problem hiding this comment.
Which makes it trivial to forget about Recon's implementation when changing OMBucketInfo...
Instead of moving to Recon, define the use case in OMBucketInfo instead of just a generic name "...Partial".
There was a problem hiding this comment.
Pls review again. I have tried to refactor by reusing as much as possible.
|
Thanks @devmadhuu for updating the patch. |
| = new DBColumnFamilyDefinition<>(DELETED_DIR_TABLE, StringCodec.get(), CUSTOM_CODEC_FOR_KEY_TABLE); | ||
|
|
||
| //--------------------------------------------------------------------------- | ||
| public static final Map<String, DBColumnFamilyDefinition<?, ?>> COLUMN_FAMILIES |
There was a problem hiding this comment.
We should avoid creating another map of all column families: if anything new is added, the Recon map also needs to be updated, and the chance of missing that is high. Instead, any get operation should look in this map first and then fall back to OM's map.
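A minimal sketch of the lookup order suggested above, assuming plain maps keyed by table name. The class `LayeredDefinitionLookupSketch` and the string values standing in for column family definitions are hypothetical; the real definitions are `DBColumnFamilyDefinition` objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: Recon-specific overrides consulted before OM's definitions. */
final class LayeredDefinitionLookupSketch {
  private final Map<String, String> reconOverrides;
  private final Map<String, String> omDefinitions;

  LayeredDefinitionLookupSketch(Map<String, String> reconOverrides,
                                Map<String, String> omDefinitions) {
    this.reconOverrides = reconOverrides;
    this.omDefinitions = omDefinitions;
  }

  /** Returns the Recon override if one exists, otherwise OM's definition (or null). */
  String get(String tableName) {
    String override = reconOverrides.get(tableName);
    return override != null ? override : omDefinitions.get(tableName);
  }

  public static void main(String[] args) {
    Map<String, String> om = new HashMap<>();
    om.put("volumeTable", "full volume codec");
    om.put("bucketTable", "full bucket codec");

    // Only the tables that need a lightweight codec are listed here.
    Map<String, String> recon = new HashMap<>();
    recon.put("bucketTable", "lightweight bucket codec");

    LayeredDefinitionLookupSketch lookup = new LayeredDefinitionLookupSketch(recon, om);
    System.out.println(lookup.get("bucketTable"));
    System.out.println(lookup.get("volumeTable"));
  }
}
```

Because the override map lists only the tables that differ, a table added later on the OM side is picked up through the fallback without touching Recon.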
szetszwo
left a comment
There was a problem hiding this comment.
@devmadhuu , thanks for working on this! Please see the comments inlined.
| public static OmBucketInfo getFromProtobuf(BucketInfo bucketInfo, | ||
| BucketLayout buckLayout) { | ||
| Builder obib = OmBucketInfo.newBuilder() | ||
| public static Builder newBuilderFromProtobufPartial(BucketInfo bucketInfo) { |
There was a problem hiding this comment.
- Handle all fields in one method, following the same order as in the proto.
- Add a boolean to include/exclude the large fields.
- Add javadoc.
/**
* Create a builder from the given proto.
* @param includeLargeFields whether to include the large fields: acl, metadata, beinfo.
*/
public static Builder newBuilder(BucketInfo bucketInfo, boolean includeLargeFields) {
final Builder builder = OmBucketInfo.newBuilder()
.setVolumeName(bucketInfo.getVolumeName())
.setBucketName(bucketInfo.getBucketName());
if (includeLargeFields) {
// acl
builder.setAcls(bucketInfo.getAclsList().stream().map(
OzoneAcl::fromProtobuf).collect(Collectors.toList()));
}
builder.setIsVersionEnabled(bucketInfo.getIsVersionEnabled())
.setStorageType(StorageType.valueOf(bucketInfo.getStorageType()))
.setCreationTime(bucketInfo.getCreationTime());
if (includeLargeFields) {
// metadata
builder.addAllMetadata(KeyValueUtil.getFromProtobuf(bucketInfo.getMetadataList()));
if (bucketInfo.hasBeinfo()) {
// beinfo
builder.setBucketEncryptionKey(OMPBHelper.convert(bucketInfo.getBeinfo()));
}
}
if (bucketInfo.hasObjectID()) {
builder.setObjectID(bucketInfo.getObjectID());
}
if (bucketInfo.hasUpdateID()) {
builder.setUpdateID(bucketInfo.getUpdateID());
}
builder.setModificationTime(bucketInfo.getModificationTime());
if (bucketInfo.hasSourceVolume()) {
builder.setSourceVolume(bucketInfo.getSourceVolume());
}
if (bucketInfo.hasSourceBucket()) {
builder.setSourceBucket(bucketInfo.getSourceBucket());
}
builder.setUsedBytes(bucketInfo.getUsedBytes())
.setQuotaInBytes(bucketInfo.getQuotaInBytes())
.setQuotaInNamespace(bucketInfo.getQuotaInNamespace())
.setUsedNamespace(bucketInfo.getUsedNamespace());
if (bucketInfo.hasBucketLayout()) {
builder.setBucketLayout(BucketLayout.fromProto(bucketInfo.getBucketLayout()));
}
if (bucketInfo.hasOwner()) {
builder.setOwner(bucketInfo.getOwner());
}
if (bucketInfo.hasDefaultReplicationConfig()) {
builder.setDefaultReplicationConfig(
DefaultReplicationConfig.fromProto(bucketInfo.getDefaultReplicationConfig()));
}
return builder;
}
| * @return instance of OmDirectoryInfo | ||
| */ | ||
| public static OmDirectoryInfo getFromProtobuf(DirectoryInfo dirInfo) { | ||
| public static Builder newBuilderFromProtobufPartial(DirectoryInfo dirInfo) { |
There was a problem hiding this comment.
Change it similarly as OmBucketInfo, i.e. use one method to include/exclude large fields.
| .setName(dirInfo.getName()) | ||
| .setCreationTime(dirInfo.getCreationTime()) | ||
| .setModificationTime(dirInfo.getModificationTime()); | ||
| if (dirInfo.getMetadataList() != null) { |
There was a problem hiding this comment.
Do we want to include metadata in this case?
| opib.addAllMetadata(KeyValueUtil | ||
| .getFromProtobuf(dirInfo.getMetadataList())); | ||
| .getFromProtobuf(dirInfo.getMetadataList())); |
| for (KeyLocationList keyLocationList : keyInfo.getKeyLocationListList()) { | ||
| for (OzoneManagerProtocolProtos.KeyLocationList keyLocationList : keyInfo.getKeyLocationListList()) { |
| /** | ||
| * Parses BucketInfo protobuf and creates OmBucketInfo without deserializing ACL list. | ||
| * @param bucketInfo | ||
| * @return instance of OmBucketInfo | ||
| */ |
There was a problem hiding this comment.
This javadoc does not provide any useful information and generates a "tag description is missing" warning. Please remove it or rewrite it.
| /** | ||
| * Parses DirectoryInfo protobuf and creates OmPrefixInfo. | ||
| * @param dirInfo | ||
| * @return instance of OmDirectoryInfo | ||
| */ |
There was a problem hiding this comment.
This javadoc does not provide any useful information and generates a "tag description is missing" warning. Please remove it or rewrite it.
| private static final Codec<OmBucketInfo> CUSTOM_CODEC_FOR_BUCKET_TABLE = new DelegatedCodec<>( | ||
| Proto2Codec.get(OzoneManagerProtocolProtos.BucketInfo.getDefaultInstance()), | ||
| ReconOMDBDefinition::getOmBucketInfoFromProtobuf, | ||
| null, |
There was a problem hiding this comment.
Use OmBucketInfo::getProtobuf ?
There was a problem hiding this comment.
Recon does not write or serialize data to the bucket table, so only the forward conversion is needed.
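To illustrate the trade-off in this exchange, here is a minimal sketch of a read-only codec whose backward function is absent. `OneWayCodec`, `decode`, and `encode` are hypothetical names; this is not the real `DelegatedCodec`, and its actual behavior with a null backward function is not shown in this conversation.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Hypothetical sketch of a codec that only supports the forward (read) direction. */
final class ReadOnlyCodecSketch {

  static final class OneWayCodec<T> {
    private final Function<byte[], T> forward;
    private final Function<T, byte[]> backward; // may be null for read-only use

    OneWayCodec(Function<byte[], T> forward, Function<T, byte[]> backward) {
      this.forward = forward;
      this.backward = backward;
    }

    /** Forward direction: persisted bytes to object. */
    T decode(byte[] raw) {
      return forward.apply(raw);
    }

    /** Backward direction: fails explicitly instead of hitting a NullPointerException. */
    byte[] encode(T value) {
      if (backward == null) {
        throw new UnsupportedOperationException("read-only codec: encode is not supported");
      }
      return backward.apply(value);
    }
  }

  public static void main(String[] args) {
    OneWayCodec<String> codec =
        new OneWayCodec<>(bytes -> new String(bytes, StandardCharsets.UTF_8), null);

    System.out.println(codec.decode("bucket1".getBytes(StandardCharsets.UTF_8)));
    try {
      codec.encode("bucket1");
    } catch (UnsupportedOperationException e) {
      System.out.println("encode rejected: " + e.getMessage());
    }
  }
}
```

Supplying a real backward function, as the reviewer suggests, avoids the failure path entirely at the cost of keeping a serializer that Recon never calls.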
| public static final DBColumnFamilyDefinition<String, OmBucketInfo> BUCKET_TABLE_DEF | ||
| = new DBColumnFamilyDefinition<>(BUCKET_TABLE, StringCodec.get(), CUSTOM_CODEC_FOR_BUCKET_TABLE); | ||
|
|
||
| public static final Codec<OmKeyInfo> CUSTOM_CODEC_FOR_KEY_TABLE = new DelegatedCodec<>( |
There was a problem hiding this comment.
It is not only for KEY_TABLE. Let's rename it to CUSTOM_OM_KEY_INFO_CODEC.
| Table<String, OmDirectoryInfo> dirTable = omMetadataManager.getStore() | ||
| .getTable(DIRECTORY_TABLE, StringCodec.get(), CUSTOM_CODEC_FOR_DIR_TABLE, TableCache.CacheType.NO_CACHE); |
There was a problem hiding this comment.
Use DEF:
final Table<String, OmDirectoryInfo> dirTable = ReconOMDBDefinition.DIRECTORY_TABLE_DEF
.getTable(omMetadataManager.getStore());
Test failures in this PR don't seem related to the PR change.
sumitagrawl
left a comment
There was a problem hiding this comment.
IMO, this PR is no longer needed, as:
- the ACL issue is fixed, with the most probable ACL list size being only 1
- the metadata size is mostly 0, except for the eTag case, where the size is normally 1
So this may not provide a significant improvement in performance.
Ok, then we can close this PR with this understanding.
What changes were proposed in this pull request?
This PR adds custom lightweight codecs for the Recon OM DB to improve the performance of Recon OM tasks, including NSSummary task execution. Most underlying Recon OM tasks, such as OMDBInsightTask and the NSSummary tasks, need only a few fields from KeyInfoTable, such as name, object ID, size, and parent object ID. The lightweight custom codecs skip the deserialization of ACLs altogether, as Recon OM tasks don't use them.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-12835
How was this patch tested?
This PR is being tested with the existing JUnit and integration tests. Development of a few additional test cases is in progress.