Schema resolution at query time #4610

@haibow

Description

Right now in BrokerReduceService, a basic check verifies whether there are conflicting schemas across servers. If the column names don't match, or a column fails the type-compatibility check, it will simply drop those segments from the result.
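To make the surprising behavior concrete, here is a simplified, hypothetical sketch (not actual Pinot code) of the kind of check described above: responses whose schema does not match the first one seen are silently dropped. The class and method names are illustrative only.

```java
import java.util.*;

// Simplified sketch of a broker-side schema conflict check: keep only
// server responses whose schema matches the first schema encountered.
public class SchemaConflictCheck {
  // A "schema" here is just a column-name -> type map.
  public static List<Map<String, String>> keepMatching(List<Map<String, String>> serverSchemas) {
    List<Map<String, String>> kept = new ArrayList<>();
    Map<String, String> reference = null;
    for (Map<String, String> schema : serverSchemas) {
      if (reference == null) {
        reference = schema;
      }
      if (schema.equals(reference)) {
        kept.add(schema);  // schema matches: keep this server's rows
      }
      // otherwise the response is silently dropped, which is the
      // surprising behavior this issue describes
    }
    return kept;
  }

  public static void main(String[] args) {
    Map<String, String> v1 = Map.of("userId", "LONG");
    Map<String, String> v2 = Map.of("userId", "LONG", "country", "STRING");
    List<Map<String, String>> kept = keepMatching(List.of(v1, v2, v1));
    System.out.println(kept.size());  // the v2 response is dropped; prints 2
  }
}
```

Note that the v2 response is dropped even though it is strictly richer than v1, which is why schema evolution triggers the problems below.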

This has caused segments to be dropped unexpectedly due to known bugs or operational errors, especially during schema evolution:

  • When updating the schema (e.g. adding a new column) of a REALTIME table, even if we call RELOAD on the table, the segment being consumed at the time of the schema update ends up with the old schema while other old and new segments have the new schema, so that single segment is dropped at query time due to the schema mismatch (Make Pinot schema evolution easier #4225 (comment))
  • When updating the schema of an OFFLINE table with a daily append cadence, if the user does not backfill old data after changing the schema and devs do not RELOAD the table/old segments, the segments end up with a mix of old and new schemas, and some segments are likewise dropped at query time.

Possible solutions:

  1. Resolve the schema at query time. When segments have different but potentially compatible schemas (e.g. schemas v1 and v2 are found in dataTableMap, and v2 has one extra column compared to v1), try to merge all segments in the result, e.g. fill in default values for the new column on the fly for segments with schema v1. This adds query latency overhead and must be done on every query, since the schema inconsistency in the underlying segments is still not fixed.
  2. Resolve the schema in an async manner. Keep the existing behavior of dropping segments with conflicting schemas, but when this happens, call the controller to trigger RELOAD on the affected segments or the whole table, in an attempt to fix potential schema issues for future queries. Trigger conditions could be:
    • when not all segments have the same schema.
    • when the schema of some segments differs from the table schema. (I may have missed something, but it seems that at query time the schema is inferred from the segments, and the table schema in ZK is not referenced at all.)
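Option 1 could look roughly like the following sketch, which merges rows under the wider schema by filling defaults for missing columns. All names here (mergeRows, DEFAULT_VALUE) are illustrative, not Pinot APIs; a real implementation would use the column's configured default value rather than null.

```java
import java.util.*;

// Hypothetical sketch of option 1: project every row onto the merged
// (wider) column list, filling a default for columns the segment's
// schema did not have.
public class QueryTimeSchemaMerge {
  static final Object DEFAULT_VALUE = null;  // stand-in for the column's configured default

  // mergedColumns is the union of columns across schemas (e.g. v2's columns);
  // each row is a column -> value map produced by a segment.
  public static List<Map<String, Object>> mergeRows(
      List<Map<String, Object>> rows, List<String> mergedColumns) {
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      Map<String, Object> merged = new LinkedHashMap<>();
      for (String col : mergedColumns) {
        // keep the row's value if the segment had the column,
        // otherwise fill in the default on the fly
        merged.put(col, row.getOrDefault(col, DEFAULT_VALUE));
      }
      out.add(merged);
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> v1Row = new LinkedHashMap<>();
    v1Row.put("userId", 42L);  // segment with old schema: no "country" column
    List<Map<String, Object>> merged =
        mergeRows(List.of(v1Row), List.of("userId", "country"));
    System.out.println(merged.get(0).keySet());  // prints [userId, country]
  }
}
```

As the issue notes, this work is repeated on every query until the underlying segments are actually fixed.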

For option 2, we might need some extra checks to make sure reloading unnecessarily or too frequently does not cause memory pressure, e.g. only trigger a reload in scenarios where the differing schemas are compatible (backward compatible, type compatible) and we are confident that reloading would resolve the schema inconsistencies as a one-time effort.
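The trigger condition for option 2 could be sketched as follows. This is an assumed design, not existing Pinot code: a reload is requested only when every segment schema is a backward-compatible subset of the table schema (so a single reload should converge), and at least one segment is stale. The names isBackwardCompatible and shouldTriggerReload are hypothetical.

```java
import java.util.*;

// Hypothetical sketch of option 2's guard: only ask the controller to
// RELOAD when a reload can plausibly fix the inconsistency.
public class AsyncReloadTrigger {
  // A segment schema is backward compatible if every column it has also
  // exists in the table schema with the same type (the table may add columns).
  static boolean isBackwardCompatible(Map<String, String> segmentSchema,
                                      Map<String, String> tableSchema) {
    for (Map.Entry<String, String> e : segmentSchema.entrySet()) {
      if (!e.getValue().equals(tableSchema.get(e.getKey()))) {
        return false;
      }
    }
    return true;
  }

  static boolean shouldTriggerReload(List<Map<String, String>> segmentSchemas,
                                     Map<String, String> tableSchema) {
    boolean anyStale = false;
    for (Map<String, String> s : segmentSchemas) {
      if (!isBackwardCompatible(s, tableSchema)) {
        return false;  // incompatible change: reloading would not fix it
      }
      if (!s.equals(tableSchema)) {
        anyStale = true;  // stale but compatible: a reload should fix it
      }
    }
    return anyStale;
  }

  public static void main(String[] args) {
    Map<String, String> table = Map.of("userId", "LONG", "country", "STRING");
    Map<String, String> stale = Map.of("userId", "LONG");
    System.out.println(shouldTriggerReload(List.of(stale, table), table));  // prints true
  }
}
```

Returning false for incompatible changes is what keeps the reload a one-time effort rather than a retry loop on segments that can never converge.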
