
[#8838] feat(catalogs): Support create/load/list table operation for lance table. #8879

Merged
jerryshao merged 8 commits into apache:branch-lance-namepspace-dev from yuqi1129:issue_8838
Oct 24, 2025

@yuqi1129 yuqi1129 commented Oct 22, 2025

What changes were proposed in this pull request?

Add support for create, load, and list table operations on Lance tables.

Why are the changes needed?

Lance tables need create, load, and list support in the catalog; see the linked issues below.

Fix: #8838
Fix: #8837

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

Currently, I have only tested it locally:

➜  [/Users/yuqi/Downloads] curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
  "name": "lance_table14",
  "comment": "This is an example table",
  "columns": [
    {
      "name": "id",
      "type": "integer",
      "comment": "id column comment",
      "nullable": false,
      "autoIncrement": true,
      "defaultValue": {
        "type": "literal",
        "dataType": "integer",
        "value": "-1"
      }
    }
  ],
  "indexes": [
    {
      "indexType": "primary_key",
      "name": "PRIMARY",
      "fieldNames": [["id"]]
    }
  ],
  "properties": {
    "format": "lance",
    "location": "/tmp/lance_catalog/schema/lance_table14"
  }
}' http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables
{"code":0,"table":{"name":"lance_table14","comment":"This is an example table","columns":[{"name":"id","type":"integer","comment":"id column comment","nullable":false,"autoIncrement":true,"defaultValue":{"type":"literal","dataType":"integer","value":"-1"}}],"properties":{"format":"lance","location":"/tmp/lance_catalog/schema/lance_table14/"},"audit":{"creator":"anonymous","createTime":"2025-10-23T03:18:39.123151Z"},"distribution":{"strategy":"none","number":0,"funcArgs":[]},"sortOrders":[],"partitioning":[],"indexes":[{"indexType":"PRIMARY_KEY","name":"PRIMARY","fieldNames":[["id"]]}]}}
➜  [/Users/yuqi/Downloads] curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables/lance_table14
{"code":0,"table":{"name":"lance_table14","comment":"This is an example table","columns":[{"name":"id","type":"integer","comment":"id column comment","nullable":false,"autoIncrement":false,"defaultValue":{"type":"literal","dataType":"integer","value":"-1"}}],"properties":{"format":"lance","location":"/tmp/lance_catalog/schema/lance_table14/"},"audit":{"creator":"anonymous","createTime":"2025-10-23T03:18:39.123151Z"},"distribution":{"strategy":"none","number":0,"funcArgs":[]},"sortOrders":[],"partitioning":[],"indexes":[{"indexType":"PRIMARY_KEY","name":"PRIMARY","fieldNames":[["id"]]}]}}
➜  [/Users/yuqi/Downloads] curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables
{"code":0,"identifiers":[{"namespace":["test","lance_catalog","schema"],"name":"lance_table10"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table11"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table12"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table13"},{"namespace":["test","lance_catalog","schema"],"name":"lance_table14"}]}                      
➜  [/Users/yuqi/Downloads]

And the Lance table location on disk:

➜  [/tmp/lance_catalog/schema] ls
lance_table10 lance_table11 lance_table12 lance_table13 lance_table14
➜  [/tmp/lance_catalog/schema] cd lance_table14
➜  [/tmp/lance_catalog/schema/lance_table14] ls -al
total 0
drwxr-xr-x@ 4 yuqi  wheel  128 10 23 11:18 .
drwxr-xr-x@ 7 yuqi  wheel  224 10 23 11:18 ..
drwxr-xr-x@ 3 yuqi  wheel   96 10 23 11:18 _transactions
drwxr-xr-x@ 3 yuqi  wheel   96 10 23 11:18 _versions
➜  [/tmp/lance_catalog/schema/lance_table14] ls -al _versions
total 8
drwxr-xr-x@ 3 yuqi  wheel   96 10 23 11:18 .
drwxr-xr-x@ 4 yuqi  wheel  128 10 23 11:18 ..
-rw-r--r--@ 1 yuqi  wheel  225 10 23 11:18 1.manifest
➜  [/tmp/lance_catalog/schema/lance_table14]
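
For reference, the create-table call from the local test could also be issued from Java. The following is a minimal sketch using the JDK's `java.net.http` API (Java 11+). The class and method names (`CreateLanceTableRequest`, `buildRequest`) are hypothetical, the endpoint and headers are copied from the curl command, and the payload is trimmed to a subset of the fields used there. The sketch only builds the request; sending it assumes a Gravitino server listening on localhost:8090 as in the local test.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class CreateLanceTableRequest {
  // Endpoint copied from the curl example: metalake "test", catalog "lance_catalog", schema "schema".
  static final String TABLES_URL =
      "http://localhost:8090/api/metalakes/test/catalogs/lance_catalog/schemas/schema/tables";

  // Trimmed version of the curl payload (indexes, default value, and autoIncrement omitted).
  static final String BODY =
      "{\"name\":\"lance_table14\","
          + "\"comment\":\"This is an example table\","
          + "\"columns\":[{\"name\":\"id\",\"type\":\"integer\",\"nullable\":false}],"
          + "\"properties\":{\"format\":\"lance\","
          + "\"location\":\"/tmp/lance_catalog/schema/lance_table14\"}}";

  // Builds the POST request with the same headers the curl command sets.
  static HttpRequest buildRequest() {
    return HttpRequest.newBuilder(URI.create(TABLES_URL))
        .header("Accept", "application/vnd.gravitino.v1+json")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(BODY))
        .build();
  }

  public static void main(String[] args) {
    HttpRequest request = buildRequest();
    System.out.println(request.method() + " " + request.uri());
    // To actually send it (requires a running server):
    // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
  }
}
```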

@yuqi1129
Contributor Author

#8847 should be merged first.

@yuqi1129 yuqi1129 changed the title from "[#8833] feat(catalogs): Support create/load table operation for lance table." to "[#8833] feat(catalogs): Support create/load/list table operation for lance table." Oct 23, 2025
@yuqi1129 yuqi1129 changed the title from "[#8833] feat(catalogs): Support create/load/list table operation for lance table." to "[#8838] feat(catalogs): Support create/load/list table operation for lance table." Oct 23, 2025
@yuqi1129 yuqi1129 requested review from jerryshao and mchades October 23, 2025 09:07
}
}

private org.apache.arrow.vector.types.pojo.Schema convertColumnsToSchema(Column[] columns) {
Contributor

Why use a fully qualified name?

Contributor Author

We already have a class named Schema in Gravitino, so the Arrow class is fully qualified to avoid a name clash.
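
The clash can be illustrated with two JDK classes that share a simple name. This is only a sketch: `java.util.Date` and `java.sql.Date` stand in for Gravitino's `Schema` and Arrow's `org.apache.arrow.vector.types.pojo.Schema`, and `NameClashExample` is a hypothetical class, not code from this PR.

```java
import java.util.Date; // plays the role of Gravitino's Schema, imported by its simple name

final class NameClashExample {
  // Only one class per simple name can be imported in a file, so the second
  // Date has to be written with its package, as the PR does for Arrow's Schema.
  static java.sql.Date toSqlDate(Date utilDate) {
    return new java.sql.Date(utilDate.getTime());
  }

  public static void main(String[] args) {
    Date utilDate = new Date(0L);
    java.sql.Date sqlDate = toSqlDate(utilDate);
    System.out.println(sqlDate.getTime());
  }
}
```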

} catch (IOException ioe) {
throw new RuntimeException("Failed to load table entity " + ident, ioe);
}
}
Contributor

I think you should place the table operations of GenericLakehouseCatalog in the corresponding catalog, rather than scattering them in TableOperationDispatcher.

Contributor Author

Let me check; there are likely many places that need to be adjusted.

Contributor

Check how ManagedSchemaOperations is implemented, and avoid adding lots of special-case if/else checks.

.build();
}

/** Custom JSON serializer for Index objects. */
Contributor

It's not proper to add the JSON serde in the API module. Should it be in a class like IndexDTO, with the IndexDTO then converted to the IndexImpl here?

default boolean external() {
return false;
}
}
Contributor

We can extend the current table interface; it's not good to add a new interface.

Comment on lines +80 to +88
  public NameIdentifier[] listTables(Namespace namespace) throws NoSuchSchemaException {
    return new NameIdentifier[0];
  }

  @Override
  public Table loadTable(NameIdentifier ident) throws NoSuchTableException {
    // Should not come here.
    return null;
  }
Contributor

If this should never happen, we'd better throw an exception here so that an unexpected call is caught.
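
One way to follow this suggestion is sketched below with stand-in types. `TableLookup` and `UnreachableTableLookup` are hypothetical and not Gravitino classes, table identifiers are simplified to strings, and the choice of `UnsupportedOperationException` is an assumption rather than what the follow-up change necessarily uses.

```java
final class UnreachableLoadExample {
  /** Hypothetical stand-in for the table-loading part of a catalog interface. */
  interface TableLookup {
    String loadTable(String ident);
  }

  /** Implementation whose loadTable is never expected to be called. */
  static final class UnreachableTableLookup implements TableLookup {
    @Override
    public String loadTable(String ident) {
      // Fail fast instead of returning null, so an unexpected call is visible immediately
      // rather than surfacing later as a NullPointerException somewhere else.
      throw new UnsupportedOperationException(
          "loadTable should not be called here; table: " + ident);
    }
  }

  public static void main(String[] args) {
    try {
      new UnreachableTableLookup().loadTable("schema.lance_table14");
    } catch (UnsupportedOperationException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```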

Comment on lines +110 to +125
    return builder
        .withName(ident.name())
        .withColumns(columns)
        .withComment(comment)
        .withProperties(properties)
        .withDistribution(distribution)
        .withIndexes(indexes)
        .withAuditInfo(
            AuditInfo.builder()
                .withCreator(PrincipalUtils.getCurrentUserName())
                .withCreateTime(Instant.now())
                .build())
        .withPartitioning(partitions)
        .withSortOrders(sortOrders)
        .withFormat("lance")
        .build();
Contributor

We can move this to the general table operation, so that this class can focus mainly on the Lance-related operations.

@@ -43,6 +43,7 @@ dependencies {
implementation(libs.commons.lang3)
implementation(libs.guava)
implementation(libs.hadoop3.client.api)
Contributor

I think hadoop3 is not required.

import org.apache.gravitino.rel.indexes.Index;

@Getter
public class GenericTableEntity extends TableEntity {
Contributor

We don't have to create a new GenericTableEntity, we can extend the current TableEntity.

col.name(),
org.apache.arrow.vector.types.pojo.FieldType.nullable(
converter.fromGravitino(col.dataType())),
null);
Contributor

What's the meaning of null here?

Comment on lines +201 to +208
    LakehouseCatalogOperations lakehouseCatalogOperations =
        SUPPORTED_FORMATS.compute(
            format,
            (k, v) ->
                v == null
                    ? createLakehouseCatalogOperations(
                        format, properties, catalogInfo, propertiesMetadata)
                    : v);
Contributor

@jerryshao jerryshao Oct 24, 2025

This map seems to have a threading issue if requests come in concurrently.
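
If the map is a plain `HashMap` (the snippet does not show its type, so this is an assumption), one common remedy is a `ConcurrentHashMap` with `computeIfAbsent`, which applies the factory at most once per key even under concurrent calls. The sketch below uses hypothetical names (`FormatOperationsCache`, `operationsFor`, `create`) and a plain `Object` in place of `LakehouseCatalogOperations`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class FormatOperationsCache {
  // Counts factory invocations; present only to demonstrate single creation.
  static final AtomicInteger CREATED = new AtomicInteger();

  // ConcurrentHashMap makes computeIfAbsent atomic per key.
  private static final Map<String, Object> OPERATIONS = new ConcurrentHashMap<>();

  // Returns the cached operations object for a format, creating it on first use.
  static Object operationsFor(String format) {
    return OPERATIONS.computeIfAbsent(format, FormatOperationsCache::create);
  }

  // Hypothetical factory standing in for createLakehouseCatalogOperations(...).
  private static Object create(String format) {
    CREATED.incrementAndGet();
    return new Object();
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] threads = new Thread[8];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(() -> operationsFor("lance"));
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("factory invocations: " + CREATED.get());
  }
}
```

The mapping function should stay short and must not touch the same map, since other updates to that key block while it runs.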

HasPropertyMetadata propertiesMetadata) {
LakehouseCatalogOperations operations;
switch (format.toLowerCase()) {
case "lance":
Contributor

We'd better not hardcode "lance" here and above; using a constant or an enum value is better.
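
A possible shape for this, as a sketch only: the enum `TableFormat`, its `fromString` helper, and the `describe` method are hypothetical names, and `LANCE` is the only format that appears in this PR.

```java
import java.util.Locale;

final class TableFormatExample {
  /** Hypothetical enum of supported lakehouse table formats. */
  enum TableFormat {
    LANCE("lance");

    private final String formatName;

    TableFormat(String formatName) {
      this.formatName = formatName;
    }

    String formatName() {
      return formatName;
    }

    // Case-insensitive lookup, so the string literal lives in exactly one place.
    static TableFormat fromString(String value) {
      String normalized = value.toLowerCase(Locale.ROOT);
      for (TableFormat format : values()) {
        if (format.formatName.equals(normalized)) {
          return format;
        }
      }
      throw new IllegalArgumentException("Unsupported table format: " + value);
    }
  }

  // Switching on the enum replaces `switch (format.toLowerCase()) { case "lance": ... }`.
  static String describe(String format) {
    switch (TableFormat.fromString(format)) {
      case LANCE:
        return "lance operations";
      default:
        throw new IllegalStateException("Unhandled format: " + format);
    }
  }

  public static void main(String[] args) {
    System.out.println(describe("Lance"));
  }
}
```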

import org.apache.gravitino.connector.CatalogOperations;
import org.apache.gravitino.rel.TableCatalog;

public interface LakehouseCatalogOperations extends CatalogOperations, TableCatalog {}
Contributor

What's the difference between LakehouseCatalogOperations and GenericLakehouseCatalogOperations?

@jerryshao jerryshao merged commit d093824 into apache:branch-lance-namepspace-dev Oct 24, 2025
31 of 32 checks passed
yuqi1129 added a commit to yuqi1129/gravitino that referenced this pull request Oct 27, 2025
mchades pushed a commit that referenced this pull request Oct 30, 2025
### What changes were proposed in this pull request?

This PR tries to resolve the comments that have not been addressed in
#8879

### Why are the changes needed?

It's an improvement.

Fix: #8915

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

Tested locally; integration tests (ITs) will be added in
#8921
jerryshao pushed a commit to jerryshao/gravitino that referenced this pull request Nov 11, 2025
…n for lance table. (apache#8879)

jerryshao pushed a commit to jerryshao/gravitino that referenced this pull request Nov 11, 2025
…pache#8922)

youngyjd pushed a commit to youngyjd/gravitino that referenced this pull request Nov 19, 2025
…n for lance table. (apache#8879)

youngyjd pushed a commit to youngyjd/gravitino that referenced this pull request Nov 21, 2025
…pache#8922)

youngyjd pushed a commit to youngyjd/gravitino that referenced this pull request Nov 21, 2025
…pache#8922)

3 participants