Use of generated columns with JSON flattening violates benchmark rules #92

@xzx0xzx

Description

Hi, the StarRocks table may violate the benchmark rules: its DDL uses a generated column that extracts four fields from the JSON and stores them as non-JSON column data at insertion time.

CREATE TABLE bluesky (
    `id` BIGINT AUTO_INCREMENT,
    `data` JSON NOT NULL COMMENT "Primary JSON object, optimized for field access using FlatJSON",

    sort_key VARBINARY AS encode_sort_key(
        get_json_string(data, 'kind'),
        get_json_string(data, 'commit.operation'),
        get_json_string(data, 'commit.collection'),
        get_json_string(data, 'did')
    )
)
ORDER BY (sort_key);

As shown in the CREATE TABLE statement, sort_key is a generated column built with encode_sort_key, and its get_json_string calls flatten the JSON by pulling individual fields out of it, which violates the benchmark rules.
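
To make the concern concrete, here is a minimal Python sketch of what "computed at insertion time" means. It is not StarRocks code and does not model how StarRocks stores or encodes the key. The helpers `get_path` and `insert_row` and the two sample records are hypothetical, chosen only for illustration. It shows a value derived from the same four JSON paths being stored next to the raw JSON when a row is inserted, so later ordering never has to parse the document again.

```python
import json

# The four JSON paths used by the generated column in the DDL above.
SORT_PATHS = ("kind", "commit.operation", "commit.collection", "did")


def get_path(doc, path):
    """Hypothetical helper: resolve a dotted path such as 'commit.operation'."""
    cur = doc
    for key in path.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return None
        cur = cur[key]
    return cur


def insert_row(table, raw_json):
    """Store the raw JSON plus a key extracted from it once, at insert time."""
    doc = json.loads(raw_json)
    # Missing fields become empty strings so the tuples stay comparable.
    sort_key = tuple(get_path(doc, p) or "" for p in SORT_PATHS)
    table.append({"data": raw_json, "sort_key": sort_key})


table = []
insert_row(
    table,
    '{"kind": "commit", "did": "did:plc:b", '
    '"commit": {"operation": "create", "collection": "app.bsky.feed.post"}}',
)
insert_row(table, '{"kind": "account", "did": "did:plc:a"}')

# Ordering uses only the precomputed, non-JSON key.
table.sort(key=lambda row: row["sort_key"])
```

The point of the sketch is that the extraction work moves from query time to load time, which is the behavior this issue questions.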
