[Improvement] Fix Lance partition stats drop filter construction for quoted partition/statistic names #10273

@justinmclean

Description

What would you like to be improved?

LancePartitionStatisticStorage.dropStatisticsImpl builds the dataset.delete(...) predicate by concatenating partitionName and statisticNames directly into SQL-style quoted string literals. If a name contains a single quote (for example part'ition0), the quote terminates the literal early and the generated predicate becomes invalid. This path is reachable from the production REST flow for drop-partition-statistics requests when Lance partition stats storage is enabled.
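
To make the failure concrete, here is a minimal sketch of what naive concatenation produces for such a name. The class, the method names, and the column name partition_name are hypothetical and used only for illustration; this is not the actual code in LancePartitionStatisticStorage.

```java
final class NaiveFilterDemo {
  // Hypothetical column name; mirrors the pattern of wrapping a raw value in single quotes.
  static String naivePartitionFilter(String partitionName) {
    return "partition_name = '" + partitionName + "'";
  }

  // Counts single-quote characters in the predicate. With SQL-style literals
  // (where an embedded quote is written as ''), a well-formed predicate always
  // contains an even number of quote characters.
  static long countQuotes(String predicate) {
    return predicate.chars().filter(c -> c == '\'').count();
  }
}
```

For the input part'ition0 this yields partition_name = 'part'ition0', which contains three quote characters: the literal ends after part and the rest of the name is left dangling.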

How should we improve?

Stop concatenating raw, unescaped names into the predicate. Either escape single quotes in every string literal before composing the filter (for SQL-style literals, replace ' with ''), or switch to a safe API that supports parameterized or structured predicates, if the Lance Java API provides one.
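
A minimal sketch of the escaping approach, assuming the filter is a SQL-style expression in which '' inside a literal denotes one single quote. The class, the helper names (quoteLiteral, buildDropFilter), and the column names (table_id, partition_name, statistic_name) are hypothetical; the real implementation may use different names and structure.

```java
import java.util.List;
import java.util.stream.Collectors;

final class LanceFilterSketch {
  // Wraps a value in single quotes, doubling any embedded single quote
  // so that it cannot terminate the literal early.
  static String quoteLiteral(String value) {
    return "'" + value.replace("'", "''") + "'";
  }

  // Builds a delete predicate for one partition and a list of statistic names.
  // Column names here are assumptions made for this example.
  static String buildDropFilter(
      long tableId, String partitionName, List<String> statisticNames) {
    String inList =
        statisticNames.stream()
            .map(LanceFilterSketch::quoteLiteral)
            .collect(Collectors.joining(", "));
    return "table_id = "
        + tableId
        + " AND partition_name = "
        + quoteLiteral(partitionName)
        + " AND statistic_name IN ("
        + inList
        + ")";
  }
}
```

With this helper, part'ition0 is emitted as 'part''ition0'. Routing every literal through one helper, for partition names and statistic names alike, avoids fixing one call site and missing the other.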

Here's a test to help:

@Test
public void testDropPartitionStatisticsWithSingleQuotePartitionName() throws Exception {
  PartitionStatisticStorageFactory factory = new LancePartitionStatisticStorageFactory();

  String metalakeName = "metalake";
  MetadataObject metadataObject =
      MetadataObjects.of(Lists.newArrayList("catalog", "schema", "table"), MetadataObject.Type.TABLE);

  EntityStore entityStore = mock(EntityStore.class);
  TableEntity tableEntity = mock(TableEntity.class);
  when(entityStore.get(any(), any(), any())).thenReturn(tableEntity);
  when(tableEntity.id()).thenReturn(202L);
  FieldUtils.writeField(GravitinoEnv.getInstance(), "entityStore", entityStore, true);

  String location = Files.createTempDirectory("lance_stats_quote_drop_test").toString();
  Map<String, String> properties = Maps.newHashMap();
  properties.put("location", location);
  LancePartitionStatisticStorage storage =
      (LancePartitionStatisticStorage) factory.create(properties);

  try {
    Map<String, StatisticValue<?>> statistics = Maps.newHashMap();
    statistics.put("statistic0", StatisticValues.longValue(1L));

    List<MetadataObjectStatisticsUpdate> updates =
        Lists.newArrayList(
            MetadataObjectStatisticsUpdate.of(
                metadataObject,
                Lists.newArrayList(
                    PartitionStatisticsModification.update("part'ition0", statistics))));
    storage.updateStatistics(metalakeName, updates);

    List<MetadataObjectStatisticsDrop> drops =
        Lists.newArrayList(
            MetadataObjectStatisticsDrop.of(
                metadataObject,
                Lists.newArrayList(
                    PartitionStatisticsModification.drop(
                        "part'ition0", Lists.newArrayList("statistic0")))));

    Assertions.assertDoesNotThrow(() -> storage.dropStatistics(metalakeName, drops));
  } finally {
    FileUtils.deleteDirectory(new File(location + "/" + tableEntity.id() + ".lance"));
    storage.close();
  }
}
