[#8838] feat(catalogs): Support create/load/list table operation for lance table.#8879
Conversation
|
#8847 should be merged first. |
| } | ||
| } | ||
|
|
||
| private org.apache.arrow.vector.types.pojo.Schema convertColumnsToSchema(Column[] columns) { |
There was a problem hiding this comment.
Why use a fully qualified name?
There was a problem hiding this comment.
We already have classes Schema in Gravitino
| } catch (IOException ioe) { | ||
| throw new RuntimeException("Failed to load table entity " + ident, ioe); | ||
| } | ||
| } |
There was a problem hiding this comment.
I think you should place the table operations of GenericLakehouseCatalog in the corresponding catalog, rather than scattering them in TableOperationDispatcher.
There was a problem hiding this comment.
Let me check, there should be many places that need to be adjusted
There was a problem hiding this comment.
Check how ManagedSchemaOperations is implemented, don't do lots of special if..else check.
| .build(); | ||
| } | ||
|
|
||
| /** Custom JSON serializer for Index objects. */ |
There was a problem hiding this comment.
It's not proper to add the JSON serde in the API module. Should if be in the class like IndexDTO, and then convert the IndexDTO to the IndexImpl here?
| default boolean external() { | ||
| return false; | ||
| } | ||
| } |
There was a problem hiding this comment.
We can extend the current table interface, it's not good to add a new interface.
| public NameIdentifier[] listTables(Namespace namespace) throws NoSuchSchemaException { | ||
| return new NameIdentifier[0]; | ||
| } | ||
|
|
||
| @Override | ||
| public Table loadTable(NameIdentifier ident) throws NoSuchTableException { | ||
| // Should not come here. | ||
| return null; | ||
| } |
There was a problem hiding this comment.
If this will not be happened, we'd better throw an exception here, to avoid an unexpected call.
| return builder | ||
| .withName(ident.name()) | ||
| .withColumns(columns) | ||
| .withComment(comment) | ||
| .withProperties(properties) | ||
| .withDistribution(distribution) | ||
| .withIndexes(indexes) | ||
| .withAuditInfo( | ||
| AuditInfo.builder() | ||
| .withCreator(PrincipalUtils.getCurrentUserName()) | ||
| .withCreateTime(Instant.now()) | ||
| .build()) | ||
| .withPartitioning(partitions) | ||
| .withSortOrders(sortOrders) | ||
| .withFormat("lance") | ||
| .build(); |
There was a problem hiding this comment.
We can move this to the general table operation, and in this class we could mainly focus on the lance related operations.
| @@ -43,6 +43,7 @@ dependencies { | |||
| implementation(libs.commons.lang3) | |||
| implementation(libs.guava) | |||
| implementation(libs.hadoop3.client.api) | |||
There was a problem hiding this comment.
I think hadoop3 is not required.
| import org.apache.gravitino.rel.indexes.Index; | ||
|
|
||
| @Getter | ||
| public class GenericTableEntity extends TableEntity { |
There was a problem hiding this comment.
We don't have to create a new GenericTableEntity, we can extend the current TableEntity.
| col.name(), | ||
| org.apache.arrow.vector.types.pojo.FieldType.nullable( | ||
| converter.fromGravitino(col.dataType())), | ||
| null); |
There was a problem hiding this comment.
What's meaning of null here?
| LakehouseCatalogOperations lakehouseCatalogOperations = | ||
| SUPPORTED_FORMATS.compute( | ||
| format, | ||
| (k, v) -> | ||
| v == null | ||
| ? createLakehouseCatalogOperations( | ||
| format, properties, catalogInfo, propertiesMetadata) | ||
| : v); |
There was a problem hiding this comment.
This map seems have threading issue if the request coming in concurrently.
| HasPropertyMetadata propertiesMetadata) { | ||
| LakehouseCatalogOperations operations; | ||
| switch (format.toLowerCase()) { | ||
| case "lance": |
There was a problem hiding this comment.
We'd better not hardcode "lance" here and above, using a constant and enum value is better.
| import org.apache.gravitino.connector.CatalogOperations; | ||
| import org.apache.gravitino.rel.TableCatalog; | ||
|
|
||
| public interface LakehouseCatalogOperations extends CatalogOperations, TableCatalog {} |
There was a problem hiding this comment.
What's the difference between LakehouseCatalogOperations and GenericLakehouseCatalogOperations?
d093824
into
apache:branch-lance-namepspace-dev
### What changes were proposed in this pull request? This PR trys to resolve the comments that have not been addressed in #8879 ### Why are the changes needed? It's an improvement. Fix: #8915 ### Does this PR introduce _any_ user-facing change? N/A. ### How was this patch tested? Test locally, and we will add ITs in #8921
…n for lance table. (apache#8879) ### What changes were proposed in this pull request? Add support create and load table operations for lance table. ### Why are the changes needed? It's a need. Fix: apache#8838 Fix: apache#8837 ### Does this PR introduce _any_ user-facing change? N/A. ### How was this patch tested? Currently, I have only tested it locally ```shell ➜ [/Users/yuqi/Downloads] curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ -H "Content-Type: application/json" -d '{ "name": "lance_table14", "comment": "This is an example table", "columns": [ { "name": "id", "type": "integer", "comment": "id column comment", "nullable": false, "autoIncrement": true, "defaultValue": { "type": "literal", "dataType": "integer", "value": "-1" } } ], "indexes": [ { "indexType": "primary_key", "name": "PRIMARY", "fieldNames": [["id"]] } ], "properties": { "format": "lance", "location": "/tmp/lance_catalog/schema/lance_table14" } }' http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables {"code":0,"table":{"name":"lance_table14","comment":"This is an example table","columns":[{"name":"id","type":"integer","comment":"id column comment","nullable":false,"autoIncrement":true,"defaultValue":{"type":"literal","dataType":"integer","value":"-1"}}],"properties":{"format":"lance","location":"/tmp/lance_catalog/schema/lance_table14/"},"audit":{"creator":"anonymous","createTime":"2025-10-23T03:18:39.123151Z"},"distribution":{"strategy":"none","number":0,"funcArgs":[]},"sortOrders":[],"partitioning":[],"indexes":[{"indexType":"PRIMARY_KEY","name":"PRIMARY","fieldNames":[["id"]]}]}} ➜ [/Users/yuqi/Downloads] curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ -H "Content-Type: application/json" http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables/lance_table14 {"code":0,"table":{"name":"lance_table14","comment":"This is an example table","columns":[{"name":"id","type":"integer","comment":"id column comment","nullable":false,"autoIncrement":false,"defaultValue":{"type":"literal","dataType":"integer","value":"-1"}}],"properties":{"format":"lance","location":"/tmp/lance_catalog/schema/lance_table14/"},"audit":{"creator":"anonymous","createTime":"2025-10-23T03:18:39.123151Z"},"distribution":{"strategy":"none","number":0,"funcArgs":[]},"sortOrders":[],"partitioning":[],"indexes":[{"indexType":"PRIMARY_KEY","name":"PRIMARY","fieldNames":[["id"]]}]}} ➜ [/Users/yuqi/Downloads] curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ -H "Content-Type: application/json" http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables {"code":0,"identifiers":[{"namespace":["test","lance_catalog","schema"],"name":"lance_table10"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table11"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table12"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table13"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table14"}]} ➜ [/Users/yuqi/Downloads] ``` And the lance location ```shell ➜ [/tmp/lance_catalog/schema] ls lance_table10 lance_table11 lance_table12 lance_table13 lance_table14 ➜ [/tmp/lance_catalog/schema] cd lance_table14 ➜ [/tmp/lance_catalog/schema/lance_table14] ls -al total 0 drwxr-xr-x@ 4 yuqi wheel 128 10 23 11:18 . drwxr-xr-x@ 7 yuqi wheel 224 10 23 11:18 .. drwxr-xr-x@ 3 yuqi wheel 96 10 23 11:18 _transactions drwxr-xr-x@ 3 yuqi wheel 96 10 23 11:18 _versions ➜ [/tmp/lance_catalog/schema/lance_table14] ls -al _versions total 8 drwxr-xr-x@ 3 yuqi wheel 96 10 23 11:18 . drwxr-xr-x@ 4 yuqi wheel 128 10 23 11:18 .. -rw-r--r--@ 1 yuqi wheel 225 10 23 11:18 1.manifest ➜ [/tmp/lance_catalog/schema/lance_table14] ```
…pache#8922) ### What changes were proposed in this pull request? This PR trys to resolve the comments that have not been addressed in apache#8879 ### Why are the changes needed? It's an improvement. Fix: apache#8915 ### Does this PR introduce _any_ user-facing change? N/A. ### How was this patch tested? Test locally, and we will add ITs in apache#8921
…n for lance table. (apache#8879) Add support create and load table operations for lance table. It's a need. Fix: apache#8838 Fix: apache#8837 N/A. Currently, I have only tested it locally ```shell ➜ [/Users/yuqi/Downloads] curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ -H "Content-Type: application/json" -d '{ "name": "lance_table14", "comment": "This is an example table", "columns": [ { "name": "id", "type": "integer", "comment": "id column comment", "nullable": false, "autoIncrement": true, "defaultValue": { "type": "literal", "dataType": "integer", "value": "-1" } } ], "indexes": [ { "indexType": "primary_key", "name": "PRIMARY", "fieldNames": [["id"]] } ], "properties": { "format": "lance", "location": "/tmp/lance_catalog/schema/lance_table14" } }' http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables {"code":0,"table":{"name":"lance_table14","comment":"This is an example table","columns":[{"name":"id","type":"integer","comment":"id column comment","nullable":false,"autoIncrement":true,"defaultValue":{"type":"literal","dataType":"integer","value":"-1"}}],"properties":{"format":"lance","location":"/tmp/lance_catalog/schema/lance_table14/"},"audit":{"creator":"anonymous","createTime":"2025-10-23T03:18:39.123151Z"},"distribution":{"strategy":"none","number":0,"funcArgs":[]},"sortOrders":[],"partitioning":[],"indexes":[{"indexType":"PRIMARY_KEY","name":"PRIMARY","fieldNames":[["id"]]}]}} ➜ [/Users/yuqi/Downloads] curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ -H "Content-Type: application/json" http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables/lance_table14 {"code":0,"table":{"name":"lance_table14","comment":"This is an example table","columns":[{"name":"id","type":"integer","comment":"id column comment","nullable":false,"autoIncrement":false,"defaultValue":{"type":"literal","dataType":"integer","value":"-1"}}],"properties":{"format":"lance","location":"/tmp/lance_catalog/schema/lance_table14/"},"audit":{"creator":"anonymous","createTime":"2025-10-23T03:18:39.123151Z"},"distribution":{"strategy":"none","number":0,"funcArgs":[]},"sortOrders":[],"partitioning":[],"indexes":[{"indexType":"PRIMARY_KEY","name":"PRIMARY","fieldNames":[["id"]]}]}} ➜ [/Users/yuqi/Downloads] curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ -H "Content-Type: application/json" http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables {"code":0,"identifiers":[{"namespace":["test","lance_catalog","schema"],"name":"lance_table10"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table11"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table12"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table13"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table14"}]} ➜ [/Users/yuqi/Downloads] ``` And the lance location ```shell ➜ [/tmp/lance_catalog/schema] ls lance_table10 lance_table11 lance_table12 lance_table13 lance_table14 ➜ [/tmp/lance_catalog/schema] cd lance_table14 ➜ [/tmp/lance_catalog/schema/lance_table14] ls -al total 0 drwxr-xr-x@ 4 yuqi wheel 128 10 23 11:18 . drwxr-xr-x@ 7 yuqi wheel 224 10 23 11:18 .. drwxr-xr-x@ 3 yuqi wheel 96 10 23 11:18 _transactions drwxr-xr-x@ 3 yuqi wheel 96 10 23 11:18 _versions ➜ [/tmp/lance_catalog/schema/lance_table14] ls -al _versions total 8 drwxr-xr-x@ 3 yuqi wheel 96 10 23 11:18 . drwxr-xr-x@ 4 yuqi wheel 128 10 23 11:18 .. -rw-r--r--@ 1 yuqi wheel 225 10 23 11:18 1.manifest ➜ [/tmp/lance_catalog/schema/lance_table14] ```
…pache#8922) This PR trys to resolve the comments that have not been addressed in It's an improvement. Fix: apache#8915 N/A. Test locally, and we will add ITs in apache#8921
…pache#8922) This PR trys to resolve the comments that have not been addressed in It's an improvement. Fix: apache#8915 N/A. Test locally, and we will add ITs in apache#8921
What changes were proposed in this pull request?
Add support create and load table operations for lance table.
Why are the changes needed?
It's a need.
Fix: #8838
Fix: #8837
Does this PR introduce any user-facing change?
N/A.
How was this patch tested?
Currently, I have only tested it locally
And the lance location