[#305][#306] improvement(core): Adjust id name mapping to support true deletion and modification of entities #309

Merged
merged 14 commits into from
Sep 5, 2023

Conversation

yuqi1129
Contributor

@yuqi1129 yuqi1129 commented Sep 1, 2023

What changes were proposed in this pull request?

This PR changes the key used in name-id mapping pairs.

Why are the changes needed?

The current name-id mapping contains no namespace information, so it cannot readily handle the following scenarios:

  • Renaming an entity when an entity with the same name exists under a different namespace.
  • Deleting entities recursively.

Fix: #305 #306
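
To illustrate the first scenario, here is a minimal sketch (not the PR's actual code) of why a key without namespace information collides. The class name, the `bareKey`/`qualifiedKey` helpers, the sample namespaces, and the `"/"` separator value are all assumptions made for illustration; only the general "context + separator + name" shape mirrors a snippet quoted later in this review.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: all names and the separator value are illustrative assumptions.
final class NameMappingKeySketch {
  static final String NAMESPACE_SEPARATOR = "/"; // assumed value

  // Key that ignores the namespace: same-named entities collide.
  static String bareKey(String name) {
    return name;
  }

  // Key qualified by the parent context, so same-named entities stay distinct.
  static String qualifiedKey(String context, String name) {
    boolean blank = context == null || context.trim().isEmpty();
    return blank ? name : context + NAMESPACE_SEPARATOR + name;
  }

  // Registers a table named "t" under two different schemas and
  // returns how many distinct mapping entries survive.
  static int distinctKeys(boolean qualified) {
    Map<String, Long> nameToId = new HashMap<>();
    String[][] entities = {{"catalog1.schema1", "t"}, {"catalog1.schema2", "t"}};
    long id = 0;
    for (String[] e : entities) {
      String key = qualified ? qualifiedKey(e[0], e[1]) : bareKey(e[1]);
      nameToId.put(key, ++id); // a colliding key silently replaces the earlier id
    }
    return nameToId.size();
  }
}
```

With bare keys the second table overwrites the first mapping entry, which is the kind of ambiguity that makes rename and recursive delete hard to get right.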

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests cover this change.

@github-actions

github-actions bot commented Sep 1, 2023

Code Coverage Report

Overall Project: 63.02% (-0.17%) 🟢
Files changed: 94.27% 🟢

| Module | Coverage |
|---|---|
| api | 71.54% (-0.34%) 🔴 |
| core | 71.43% (-0.42%) 🟢 |
| catalog-hive | 58.43% 🟢 |

| Module | File | Coverage |
|---|---|---|
| api | NonEmptyEntityException.java | 0% 🔴 |
| core | EntityKeyEncoder.java | 100% 🟢 |
| core | EntityStore.java | 100% 🟢 |
| core | BinaryEntityKeyEncoder.java | 99.47% 🟢 |
| core | KvEntityStore.java | 93.78% (-1.28%) 🟢 |
| core | BaseTable.java | 79.52% (-1.51%) 🟢 |
| core | BaseSchema.java | 79.23% (-1.6%) 🟢 |
| core | RocksDBKvBackend.java | 68.63% (-3.99%) 🟢 |
| core | KvBackend.java | 0% 🟢 |
| catalog-hive | HiveCatalogOperations.java | 66.17% 🟢 |

@jerryshao
Contributor

It would be better to add more tests to cover different scenarios.

@jerryshao jerryshao changed the title [#305] improvement: Adjust id name mapping to support true deletion and modification of entities [#305] improvement(core): Adjust id name mapping to support true deletion and modification of entities Sep 4, 2023
@yuqi1129 yuqi1129 changed the title [#305] improvement(core): Adjust id name mapping to support true deletion and modification of entities [#305][#306] improvement(core): Adjust id name mapping to support true deletion and modification of entities Sep 4, 2023
@yuqi1129 yuqi1129 self-assigned this Sep 4, 2023
* @return
* @throws IOException
*/
String generateIdNameMappingKey(NameIdentifier nameIdentifier) throws IOException;
Contributor

Should it be exposed to outside users?

Contributor Author

generateIdNameMappingKey is called by KvEntityStore for now. What do you mean by outside users?

Contributor

I mean other classes that use this EntityKeyEncoder.

Basically, exposing this method looks a little odd to me, since it seems like something users should not need to know about.

Besides, the names of this method and the one below are confusing.

Contributor Author

OK, let me take a look.

Contributor Author

I was going to move this to the NameMappingService interface, but the method is strongly tied to the KV implementation, so it didn't seem right to put it in that interface.

So I moved it to KvEntityStore as well. Do you have any advice?

* @return
* @throws IOException
*/
List<T> encodeSubEntityPrefix(NameIdentifier identifier, EntityType type) throws IOException;
Contributor

Should it be exposed to outside users?

Contributor

Basically, what does this method do? The doc written here is confusing.

Contributor Author

> Basically, what does this method do? The doc written here is confusing.

I want to get the key prefixes of all sub-entities of the entity with the given name identifier. For example, assuming the id of a metalake is 1, the prefixes of all its sub-entities (catalog, schema, table) are:

ca_{metalake_id} is the prefix of a catalog entity in metalake 1
sc_{metalake_id} is the prefix of a schema entity in metalake 1
ta_{metalake_id} is the prefix of a table entity in metalake 1

> Should it be exposed to outside users?

KvEntityStore needs this function to remove all sub-entities.
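
As a rough illustration of the idea described above (not the actual encoder, which works on binary keys), the sketch below builds text prefixes for a metalake id and removes every key under them from a sorted in-memory map. The class and method names, the string key layout, and the trailing `_` appended to each prefix (so that metalake 1 does not also match metalake 10) are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: a TreeMap stands in for the sorted KV backend.
final class SubEntityPrefixSketch {

  // Prefixes of catalog, schema, and table keys under one metalake.
  static List<String> subEntityPrefixes(long metalakeId) {
    List<String> prefixes = new ArrayList<>();
    for (String tag : new String[] {"ca", "sc", "ta"}) {
      // Trailing "_" is an assumption of this text encoding, see lead-in.
      prefixes.add(tag + "_" + metalakeId + "_");
    }
    return prefixes;
  }

  // Deletes every key starting with one of the prefixes; returns the count removed.
  static int deleteByPrefixes(TreeMap<String, String> store, List<String> prefixes) {
    int removed = 0;
    for (String prefix : prefixes) {
      // In a sorted map, keys sharing a prefix form one contiguous range.
      NavigableMap<String, String> range =
          store.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
      removed += range.size();
      range.clear(); // clearing the view removes the entries from the backing map
    }
    return removed;
  }
}
```

Because the range view is backed by the original map, a recursive delete becomes one range removal per sub-entity type rather than a lookup per child.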

Contributor

Can we redesign the interface so that it does not expose so many details to the classes that use it?

Contributor Author

The method encodeSubEntityPrefix seems closely tied to the KV implementation (it gets all the prefixes), so I moved it to KvEntityStore.

if (subEntityPrefix.isEmpty()) {
// has no sub-entities
return backend.delete(dataKey);
}
Contributor

Why don't you delete the name-id mappings as well?

Contributor Author

I don't think we need to remove name-id mappings eagerly, for the following reasons:

  1. Name-id mappings can be reused later.
  2. Name-id mappings can be deleted offline for better performance.
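
A minimal sketch of the lazy approach, under the assumption that "reused" means a re-created entity with the same name gets its old id back. The class, its two in-memory maps, and the `purgeStaleMappings` cleanup step are hypothetical stand-ins, not code from this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: two maps stand in for the mapping and data key spaces.
final class LazyNameMappingSketch {
  private final Map<String, Long> nameToId = new HashMap<>();
  private final Map<Long, String> data = new HashMap<>();
  private long nextId = 1;

  // Stores an entity, reusing an existing name-id mapping when one is present.
  long put(String name, String payload) {
    long id = nameToId.computeIfAbsent(name, n -> nextId++);
    data.put(id, payload);
    return id;
  }

  // Deletes only the entity data; the name-id mapping is left in place.
  boolean delete(String name) {
    Long id = nameToId.get(name);
    return id != null && data.remove(id) != null;
  }

  // Offline cleanup: drops mappings whose entity data no longer exists.
  int purgeStaleMappings() {
    int before = nameToId.size();
    nameToId.values().removeIf(id -> !data.containsKey(id));
    return before - nameToId.size();
  }
}
```

The delete path touches a single key, and the cost of cleaning up mappings is deferred to a batch step that can run outside the request path.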

*/
private byte[] encodeEntity(NameIdentifier identifier, EntityType entityType) throws IOException {
private byte[] encodeEntity(
NameIdentifier identifier, EntityType entityType, boolean returnNullIfEntityNotFound)
Contributor

We should also shorten the name "returnNullIfEntityNotFound".

Contributor Author

changed

generateIdNameMappingKey(ident), generateIdNameMappingKey(updatedE.nameIdentifier()));

// Update the entity to store
backend.put(key, serDe.serialize(updatedE), true);
Contributor

Why don't we directly call put?

Contributor Author

The scenario here is renaming an entity when the key already exists, so we should use the overwrite option. If we call put without overwrite, an AlreadyExistsException will be thrown.
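
The put semantics described here can be sketched as follows. This is an assumed in-memory stand-in, not the real backend: the class name, the `get` helper, and the nested exception type (named after the one mentioned above) are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a KV put with an overwrite flag.
final class OverwritePutSketch {

  // Stand-in for the AlreadyExistsException mentioned in the comment above.
  static final class AlreadyExistsException extends RuntimeException {
    AlreadyExistsException(String key) {
      super("Key already exists: " + key);
    }
  }

  private final Map<String, String> store = new HashMap<>();

  // Rejects an existing key unless the caller explicitly asks to overwrite it.
  void put(String key, String value, boolean overwrite) {
    if (!overwrite && store.containsKey(key)) {
      throw new AlreadyExistsException(key);
    }
    store.put(key, value);
  }

  String get(String key) {
    return store.get(key);
  }
}
```

An update writes to a key that is expected to exist already, so it passes `true`; a plain create would pass `false` and let the exception surface accidental duplicates.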

return StringUtils.isBlank(context) ? name : context + NAMESPACE_SEPARATOR + name;
}

public String generateIdNameMappingKey(NameIdentifier nameIdentifier) throws IOException {
Contributor

The two names generateIdNameMappingKey and generateKeyForNameMapping are a bit long and confusing. Can you simplify them?

Contributor Author

changed

@jerryshao
Contributor

You need to update the rfc-2 markdown file also.

@jerryshao jerryshao merged commit 44d041e into apache:main Sep 5, 2023
2 checks passed
Successfully merging this pull request may close these issues.

[Subtask] Adjust id name mapping to support true deletion and modification of entiies
3 participants