
feat(banyandb/measure): add validation to enforce ShardingKey compatibility with Entity #13814

@hanahmily

Background

Measure defines two fields that affect data routing:

  • entity (Entity.tag_names): determines the series and shard for a data point
  • sharding_key (ShardingKey.tag_names): intended to enhance TopN streaming performance by overriding shard routing

message Measure {
  // Other fields omitted; only the two routing-related fields are shown.
  Entity entity = 4;
  ShardingKey sharding_key = 8;
}

Problem

Although ShardingKey was designed to augment TopN performance while preserving entity locality, there is no enforcement of this contract at the schema level. A user could configure:

entity.tag_names       = ["service_id"]
sharding_key.tag_names = ["instance_id"]

With this configuration, data points sharing the same service_id (same entity) but different instance_id values would be routed to different shards/nodes. Each node would then hold only a partial view of that entity, producing incorrect TopN aggregation results and breaking query correctness.
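The split can be reproduced with a minimal sketch. It assumes a hypothetical router that hashes the values of the routing tags with FNV-1a and takes the result modulo the shard count. The `point` type, the `shardFor` helper, and the hash choice are illustrative assumptions, not BanyanDB's actual routing code.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// point is a simplified data point: tag name -> tag value (hypothetical type).
type point map[string]string

// shardFor hashes the values of the given tags and maps the hash to a shard.
// FNV-1a modulo shardNum is an illustrative stand-in, not BanyanDB's routing.
func shardFor(p point, tagNames []string, shardNum uint32) uint32 {
	vals := make([]string, 0, len(tagNames))
	for _, name := range tagNames {
		vals = append(vals, p[name])
	}
	h := fnv.New32a()
	h.Write([]byte(strings.Join(vals, "|")))
	return h.Sum32() % shardNum
}

func main() {
	entity := []string{"service_id"}
	shardingKey := []string{"instance_id"}

	// Two data points of the same entity (same service_id).
	a := point{"service_id": "svc-a", "instance_id": "instance-0"}
	b := point{"service_id": "svc-a", "instance_id": "instance-1"}

	fmt.Println("by entity:      ", shardFor(a, entity, 4), shardFor(b, entity, 4))
	fmt.Println("by sharding key:", shardFor(a, shardingKey, 4), shardFor(b, shardingKey, 4))
}
```

Routed by entity, both points land on the same shard. Routed by the sharding key alone, they land on different shards, so no single shard sees every data point for `svc-a`.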

The rule that the same entity must always map to the same node is currently implicit and relies entirely on the caller (e.g., OAP server) following the convention correctly.

Proposed Fix

Add a server-side validation step when a Measure schema is created or updated.

Rule: ShardingKey.tag_names must be a superset of Entity.tag_names — it may add extra tags for finer-grained routing, but it must include all entity tags to preserve locality.

Validation pseudocode:

if sharding_key is set:
    for each tag in entity.tag_names:
        assert tag ∈ sharding_key.tag_names,
            "ShardingKey must contain all Entity tags to guarantee entity locality"
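In Go, the check in the pseudocode above might look like the following sketch. The function name `validateShardingKey`, its plain string-slice parameters, and the error wording are assumptions for illustration. A real implementation would read the tag names from the generated `Measure` message and run in the schema create/update path.

```go
package main

import (
	"fmt"
	"strings"
)

// validateShardingKey returns an error when shardingKeyTags is non-empty and
// does not contain every tag in entityTags. An empty sharding key is treated
// as "not set" and always passes. (Hypothetical helper, not BanyanDB's API.)
func validateShardingKey(entityTags, shardingKeyTags []string) error {
	if len(shardingKeyTags) == 0 {
		return nil
	}
	inKey := make(map[string]struct{}, len(shardingKeyTags))
	for _, t := range shardingKeyTags {
		inKey[t] = struct{}{}
	}
	// Collect every entity tag that the sharding key omits, so the error
	// can name all of them at once.
	var missing []string
	for _, t := range entityTags {
		if _, ok := inKey[t]; !ok {
			missing = append(missing, t)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf(
			"sharding_key.tag_names must contain all entity.tag_names; missing: %s",
			strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// Rejected: the sharding key omits the entity tag service_id.
	fmt.Println(validateShardingKey([]string{"service_id"}, []string{"instance_id"}))
	// Accepted: the sharding key includes every entity tag plus one extra tag.
	fmt.Println(validateShardingKey([]string{"service_id"}, []string{"service_id", "instance_id"}))
}
```

Listing the missing tags in the error is one way to give the user actionable guidance. The sketch implements the superset rule exactly as stated and does not check that the tags exist in the schema's tag families.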

Acceptance Criteria

  • Schema creation/update for Measure returns a validation error when sharding_key.tag_names does not contain all tags in entity.tag_names
  • The error message clearly explains the rule and provides guidance
  • Existing schemas that already satisfy the rule (e.g., OAP-generated ones) are unaffected
  • Unit tests cover both valid and invalid configurations

References

  • Proto definition: api/proto/banyandb/database/v1/schema.proto

Labels: database; BanyanDB - SkyWalking native database
