Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#412] feat(core): Generic property system support #441

Merged
merged 13 commits into from
Sep 27, 2023

Conversation

mchades
Copy link
Contributor

@mchades mchades commented Sep 22, 2023

What changes were proposed in this pull request?

  1. Add the HasPropertyMetadata interface to BaseCatalog and CatalogOperations for acquiring PropertyMetadata
  2. Add PropertyMetadata interface for tableProperty, schemaProperty and catalogProperty can also implement it later on
  3. validate table properties when CatalogOperationDispatcher operates tables

Why are the changes needed?

Under the property framework, every catalog only needs to care about the table property metadata definition in itself, the work of property validation will be done by the framework.

Fix: #412

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT added in TestCatalogOperationDispatcher

@github-actions
Copy link

github-actions bot commented Sep 22, 2023

Code Coverage Report

Overall Project 62.92% -0.2% 🟢
Files changed 90.42% 🟢

Module Coverage
core 77.02% -0.57% 🟢
catalog-lakehouse-iceberg 75.95% -0.23% 🟢
catalog-hive 60.7% -0.12% 🟢
Files
Module File Coverage
core TablePropertiesMetadata.java 98.57% -1.43% 🟢
PropertiesMetadata.java 97.1% -2.9% 🟢
EntityCombinedTable.java 92.52% 🟢
PropertyEntry.java 91.73% -8.27% 🟢
CatalogOperationDispatcher.java 90.82% -3.06% 🟢
CatalogManager.java 71.93% 🟢
BaseCatalog.java 71.2% -3.2% 🔴
catalog-lakehouse-iceberg IcebergCatalogOperations.java 71.73% -0.63% 🟢
IcebergTablePropertiesMetadata.java 60% -40% 🟢
catalog-hive HiveCatalogOperations.java 67.73% -0.22% 🟢
HiveTablePropertiesMetadata.java 60% -40% 🟢

@mchades mchades marked this pull request as ready for review September 25, 2023 13:05
@mchades mchades self-assigned this Sep 26, 2023
boolean required,
boolean immutable,
Class<T> javaType,
T defaultValue,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems missing defaultValue process logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaultValue is handled by each catalog separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why defaultValue is handled by each catalog? seems it's common

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because catalog maybe want to know has the user set this property, If we set defaultValue for all absent non-required properties in advance, the catalog can't tell if it was set by the user

Copy link
Contributor

@FANNG1 FANNG1 Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any property meets the scene you described? why should catalog differences it's set by user or graviton?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, the location is a non-required property for a Hive table. If the user does not specify a location, Hive will automatically assign different values to different tables. However, if null is set as the default value for the location, it will result in a NPE in HMS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

location is special and shouldn't have a default value. HiveCatalog should assign a table location. For most cases, such as table format, which have a fixed default value, I think it's more suitable to be processed in common code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HiveCatalog should assign a table location.

It's not necessary. For external table, location can also be assigned by user.

As you said, here does have some special properties that cannot be processed in common code, so I still think the defaultValue should be processed in catalog themselves

() ->
mergeTablePropertySpecs(
basicCapability.tablePropertySpecs(),
acquireCatalogCapability().tablePropertySpecs());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here, we mixed CatalogCapabity and TableCapabity, should we separate them? Take Iceberg for example,
catalogType is required when creating IcebergCatalog, while when creating IcebergTable, I don't want to specify catalogType again, it becomes reserved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HiveCatalog. metastore.uri maybe have same logic

Copy link
Contributor Author

@mchades mchades Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here, we mixed CatalogCapabity and TableCapabity

Here only process tablePropertySpecs, does not include CatalogPropertySpecs

@@ -87,6 +87,10 @@ public <R> R doWithTableOps(ThrowableFunction<TableCatalog, R> fn) throws Except
});
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when Create&Alter&Load catalog, should we check properties?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

catalog property validation work in another pr #452

@mchades
Copy link
Contributor Author

mchades commented Sep 26, 2023

I have refactored the HasPropertyMetadata. Can you please help to review it again when you have time? thanks.
@jerryshao @yuqi1129

@jerryshao
Copy link
Contributor

@mchades you need to update the PR description.

* @param hidden Whether this property is hidden from user, such as password
* @param reserved This property is reserved and cannot be set by user
*/
private PropertyEntry(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to change to builder pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Comment on lines 84 to 153
public static boolean isReservedProperty(
String propertyName, Map<String, PropertyEntry<?>> propertySpecs) {
return propertySpecs.containsKey(propertyName) && propertySpecs.get(propertyName).isReserved();
}

public static boolean isRequiredProperty(
String propertyName, Map<String, PropertyEntry<?>> propertySpecs) {
return propertySpecs.containsKey(propertyName) && propertySpecs.get(propertyName).isRequired();
}

public static boolean isImmutableProperty(
String propertyName, Map<String, PropertyEntry<?>> propertySpecs) {
return propertySpecs.containsKey(propertyName) && propertySpecs.get(propertyName).isImmutable();
}

public static boolean isHiddenProperty(
String propertyName, Map<String, PropertyEntry<?>> propertySpecs) {
return propertySpecs.containsKey(propertyName) && propertySpecs.get(propertyName).isHidden();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these better to move to PropertyMetadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

}

// need remove?
public static Map<String, String> encodeProperties(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not used, then we can remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@@ -31,7 +32,7 @@ public String shortName() {
*/
@Override
protected CatalogOperations newOps(Map<String, String> config) {
HiveCatalogOperations ops = new HiveCatalogOperations(entity());
HiveCatalogOperations ops = new HiveCatalogOperations(entity(), tableProperty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't have to create xxxProperty() here and pass in to the specific HiveCatalogOperations. Instead, you can let HiveCatalogOperations implements HasPropertyMetadata interface, and in catalog, we can change to:

  @Override
  public PropertyMetadata tableProperty() throws UnsupportedOperationException {
    return ops.tableProperty();
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I have let CatalogOperations implements HasPropertyMetadata interface to make sure catalog developers implement the interface.

BTW, the PR description already updated.

fix Iceberg

rename ability to capability

fix typo

add illegal combination validation in PropertySpec

remove alterTable and loadTable wrapper

remove generics V

refact

split HasPropertyMetadata interface

PropertyEntry use builder pattern

move static method to PropertyMetadata
return decoder.apply(value);
}

public static PropertyEntry<String> stringProperty(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better changing to stringPropertyEntry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think you can add several help methods like stringRequiredProperty, stringReservedProperty to simplify the use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added stringRequiredProperty, stringReservedProperty and stringImmutablePropertyEntry method for current usage and test, and will add more help methods in catalog development later on, since there are many flag combinations.

return table;
}

private Table loadTableWithAllProperties(NameIdentifier ident) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to split to a new method, I think changing the loadTable should be enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the original loadTable method will verify the ID_KEY property, but the property framework will remove hidden property after loadTable, so I renamed the original loadTable to loadTableWithAllProperties for reuse the logic.

for (TableChange change : changes) {
if (change instanceof TableChange.SetProperty) {
String propertyName = ((TableChange.SetProperty) change).getProperty();
if (isReservedProperty(propertyName, tablePropertySpecs)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very clear here: Why here is not the same as line 514, as both SetProperty and RemoveProperty will change the property value. If the value of property is unchanging, it should fail here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImmutableProperty means the property cannot be changed by user, but if the property has not been set before, user can set it.

@mchades mchades closed this Sep 27, 2023
@mchades mchades reopened this Sep 27, 2023
@mchades mchades closed this Sep 27, 2023
@mchades mchades reopened this Sep 27, 2023
@jerryshao jerryshao merged commit 33a7d2b into apache:main Sep 27, 2023
2 checks passed
Clearvive pushed a commit that referenced this pull request Oct 17, 2023
### What changes were proposed in this pull request?

1. Add the `HasPropertyMetadata` interface to `BaseCatalog` and
`CatalogOperations` for acquiring `PropertyMetadata`
2. Add `PropertyMetadata` interface for `tableProperty`,
`schemaProperty` and `catalogProperty` can also implement it later on
3. validate table properties when `CatalogOperationDispatcher` operates
tables

### Why are the changes needed?

Under the property framework, every catalog only needs to care about the
table property metadata definition in itself, the work of property
validation will be done by the framework.

Fix: #412 

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT added in `TestCatalogOperationDispatcher`

---------

Co-authored-by: yuqi <yuqi@datastrato.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement] Property meta is needed for table properties
4 participants