[multistage] On Multistage Engine Query Parallelism #15842

@ankitsultana

Query parallelism in the multistage engine (MSE), as it stands, is quite complicated for users to understand. It has other issues too:

  • Options like stageParallelism may or may not apply depending on runtime conditions that are hard for users to reason about.
  • Responsibilities overlap, e.g. partition_parallelism is largely the same as stageParallelism.
  • Names are confusing, e.g. "partition_size" controls the number of workers in the leaf stage.
  • ... etc.

This is covered in detail in the following doc, which also proposes the behavior that the new optimizer will follow: https://docs.google.com/document/d/1h_IgCiUU4u0xQQ6lz4ZhzH0htd6NaJV8lFBE59MeQZA/edit?tab=t.0#heading=h.aq4kaj522uyn

Labels: multi-stage (related to the multi-stage query engine), stale (no activity for an extended period)
