[Vector Index] File-group mapping function for cluster-to-file-group routing

Part of #18676. RFC-104 / [design PR](https://github.com/chrevanthreddy/hudi/pull/1).

## Scope

Records belonging to the same cluster must land in the same contiguous range of MDT file groups (think of a cluster as a folder containing N files). This sub-task adds the mapping function used by the MDT writer.

## Tasks

- Add `getVectorKeyToFileGroupMappingFunction(numClusters, fgPerCluster)` in `hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java`.
- Key encoding: prefix the record key with the cluster ID, e.g. `C<hex(clusterId)>|<recordKey>`. This allows prefix scans per cluster at read time.
- Mapping: `fileGroupIndex = (clusterId * fgPerCluster) + (hash(recordKey) % fgPerCluster)`.
- Override `getFileGroupMappingFunction(HoodieIndexVersion)` on the `VECTOR_INDEX` enum constant in `MetadataPartitionType` so that the MDT routes records to the right file group.
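
A minimal, standalone sketch of the key encoding and mapping described above. It is not the Hudi implementation: the class and helper names (`VectorKeyMappingSketch`, `encodeKey`, `parseClusterId`, `parseRecordKey`, `fileGroupIndex`) are hypothetical, `String.hashCode()` stands in for whichever hash the real function uses, and cluster IDs are assumed to be non-negative. `Math.floorMod` is used because Java's `%` can return a negative value for a negative hash, which would push a record outside its cluster's range.

```java
class VectorKeyMappingSketch {

  // Hypothetical encoder for the C<hex(clusterId)>|<recordKey> layout.
  // Assumes clusterId >= 0 so the hex form round-trips through parseInt.
  static String encodeKey(int clusterId, String recordKey) {
    return "C" + Integer.toHexString(clusterId) + "|" + recordKey;
  }

  // Reads the hex cluster ID between the leading 'C' and the first '|'.
  static int parseClusterId(String encodedKey) {
    int sep = encodedKey.indexOf('|');
    return Integer.parseInt(encodedKey.substring(1, sep), 16);
  }

  // Everything after the first '|' is the original record key.
  static String parseRecordKey(String encodedKey) {
    return encodedKey.substring(encodedKey.indexOf('|') + 1);
  }

  // fileGroupIndex = (clusterId * fgPerCluster) + (hash(recordKey) % fgPerCluster)
  static int fileGroupIndex(String encodedKey, int numClusters, int fgPerCluster) {
    int clusterId = parseClusterId(encodedKey);
    if (clusterId < 0 || clusterId >= numClusters) {
      throw new IllegalArgumentException("clusterId out of range: " + clusterId);
    }
    // String.hashCode() is an assumption standing in for the real hash.
    // floorMod keeps the slot in [0, fgPerCluster) even for negative hashes.
    int slot = Math.floorMod(parseRecordKey(encodedKey).hashCode(), fgPerCluster);
    return clusterId * fgPerCluster + slot;
  }

  public static void main(String[] args) {
    String key = encodeKey(255, "row-1");
    System.out.println(key); // prints "Cff|row-1"
    // With numClusters = 256 and fgPerCluster = 4, cluster 255 owns [1020, 1024).
    System.out.println(fileGroupIndex(key, 256, 4));
  }
}
```

Because the `|` separator is part of the prefix, a scan for `C1|` does not pick up keys of cluster `C10|`, even though the hex cluster ID is variable-width in this sketch.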

## Tests

- Unit test: insert many synthetic `(recordKey, clusterId)` tuples; assert all records for cluster `c` land in file groups `[c*fgPerCluster, (c+1)*fgPerCluster)`.
- Unit test: for varying `fgPerCluster` (1, 4, 16), assert that the distribution of records within a cluster is roughly uniform across that cluster's file groups.
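
The two checks above could be sketched roughly as follows, in plain Java without a test framework so that it runs standalone. Everything here is illustrative: `ClusterRoutingCheckSketch`, `route`, and `roughlyUniform` are hypothetical names, the local `fileGroupIndex` is a stand-in for the real mapping function (again assuming `String.hashCode()` as the hash), and the 50% tolerance is an arbitrary, deliberately loose bound.

```java
import java.util.Random;

class ClusterRoutingCheckSketch {

  // Stand-in for the mapping under test; the real function lives in Hudi.
  static int fileGroupIndex(int clusterId, String recordKey, int fgPerCluster) {
    return clusterId * fgPerCluster + Math.floorMod(recordKey.hashCode(), fgPerCluster);
  }

  // Routes synthetic (recordKey, clusterId) tuples and returns per-file-group counts.
  // Throws if any record lands outside [c * fgPerCluster, (c + 1) * fgPerCluster).
  static int[] route(int numClusters, int fgPerCluster, int recordsPerCluster, long seed) {
    Random rnd = new Random(seed); // seeded so the run is reproducible
    int[] counts = new int[numClusters * fgPerCluster];
    for (int c = 0; c < numClusters; c++) {
      for (int i = 0; i < recordsPerCluster; i++) {
        String recordKey = "rk-" + rnd.nextLong();
        int fg = fileGroupIndex(c, recordKey, fgPerCluster);
        if (fg < c * fgPerCluster || fg >= (c + 1) * fgPerCluster) {
          throw new AssertionError("cluster " + c + " routed to file group " + fg);
        }
        counts[fg]++;
      }
    }
    return counts;
  }

  // True if every file group is within tolerance (a fraction) of the expected count.
  static boolean roughlyUniform(int[] counts, int expectedPerGroup, double tolerance) {
    for (int n : counts) {
      if (Math.abs(n - expectedPerGroup) > tolerance * expectedPerGroup) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    for (int fgPerCluster : new int[] {1, 4, 16}) {
      // Scale the sample so each file group expects about 1000 records.
      int[] counts = route(8, fgPerCluster, 1000 * fgPerCluster, 42L);
      System.out.println("fgPerCluster=" + fgPerCluster
          + " roughlyUniform=" + roughlyUniform(counts, 1000, 0.5));
    }
  }
}
```

In an actual unit test, the range check and the uniformity check would likely be separate test cases, and the tolerance would be chosen to match the sample size.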

## Depends on

- Sub-issue 1 (partition type registration)

## Out of scope

Actual writing into the file groups, which happens in sub-issue 5.


[Vector Index] File-group mapping function for cluster-to-file-group routing #18852
