Skip to content

refactor(query): use graphs to build pipelines #18184

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 19, 2025

Conversation

zhang2014
Copy link
Member

@zhang2014 zhang2014 commented Jun 18, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

refactor(query): use graphs to build pipelines

  • Use digraph to describe the pipeline for quick visualization.
  • Allow a single processor to include multiple inputs and outputs in scheduling, which will enhance the orchestration of union and join operations.

example

root@localhost:8000/test_explain_window> explain pipeline select number, lead(number,1, 0) over (partition by number % 3 order by number+ 1), lead(number,2, 0) over (partition by number % 3 order by number + 1) from numbers(50);

-[ EXPLAIN ]-----------------------------------
digraph {
    0 [ label = "NumbersSourceTransform" ]
    1 [ label = "CompoundBlockOperator(Map)" ]
    2 [ label = "ScatterTransform" ]
    3 [ label = "TransformScatterExchangeSerializer" ]
    4 [ label = "ExchangeShuffleTransform" ]
    5 [ label = "ExchangeWriterSink" ]
    6 [ label = "Resize" ]
    7 [ label = "ExchangeWriterSink" ]
    8 [ label = "ExchangeSourceReader" ]
    9 [ label = "ExchangeSourceReader" ]
    10 [ label = "Resize" ]
    11 [ label = "TransformExchangeDeserializer" ]
    12 [ label = "TransformExchangeDeserializer" ]
    13 [ label = "TransformExchangeDeserializer" ]
    14 [ label = "TransformExchangeDeserializer" ]
    15 [ label = "ShufflePartition(Window)" ]
    16 [ label = "ShufflePartition(Window)" ]
    17 [ label = "ShufflePartition(Window)" ]
    18 [ label = "ShufflePartition(Window)" ]
    19 [ label = "ShuffleMergePartition(Window)" ]
    20 [ label = "ShuffleMergePartition(Window)" ]
    21 [ label = "ShuffleMergePartition(Window)" ]
    22 [ label = "ShuffleMergePartition(Window)" ]
    23 [ label = "TransformWindowPartitionCollect(Sort)" ]
    24 [ label = "TransformWindowPartitionCollect(Sort)" ]
    25 [ label = "TransformWindowPartitionCollect(Sort)" ]
    26 [ label = "TransformWindowPartitionCollect(Sort)" ]
    27 [ label = "Transform Window" ]
    28 [ label = "Transform Window" ]
    29 [ label = "Transform Window" ]
    30 [ label = "Transform Window" ]
    31 [ label = "CompoundBlockOperator(Map)" ]
    32 [ label = "CompoundBlockOperator(Map)" ]
    33 [ label = "CompoundBlockOperator(Map)" ]
    34 [ label = "CompoundBlockOperator(Map)" ]
    35 [ label = "ShufflePartition(Window)" ]
    36 [ label = "ShufflePartition(Window)" ]
    37 [ label = "ShufflePartition(Window)" ]
    38 [ label = "ShufflePartition(Window)" ]
    39 [ label = "ShuffleMergePartition(Window)" ]
    40 [ label = "ShuffleMergePartition(Window)" ]
    41 [ label = "ShuffleMergePartition(Window)" ]
    42 [ label = "ShuffleMergePartition(Window)" ]
    43 [ label = "TransformWindowPartitionCollect(Sort)" ]
    44 [ label = "TransformWindowPartitionCollect(Sort)" ]
    45 [ label = "TransformWindowPartitionCollect(Sort)" ]
    46 [ label = "TransformWindowPartitionCollect(Sort)" ]
    47 [ label = "Transform Window" ]
    48 [ label = "Transform Window" ]
    49 [ label = "Transform Window" ]
    50 [ label = "Transform Window" ]
    51 [ label = "DummyTransform" ]
    52 [ label = "DummyTransform" ]
    53 [ label = "DummyTransform" ]
    54 [ label = "DummyTransform" ]
    55 [ label = "ExchangeSourceReader" ]
    56 [ label = "ExchangeSourceReader" ]
    57 [ label = "Resize" ]
    58 [ label = "TransformExchangeDeserializer" ]
    59 [ label = "TransformExchangeDeserializer" ]
    60 [ label = "TransformExchangeDeserializer" ]
    61 [ label = "TransformExchangeDeserializer" ]
    62 [ label = "CompoundBlockOperator(Project)" ]
    63 [ label = "CompoundBlockOperator(Project)" ]
    64 [ label = "CompoundBlockOperator(Project)" ]
    65 [ label = "CompoundBlockOperator(Project)" ]
    0 -> 1 [ label = "" ]
    1 -> 2 [ label = "" ]
    2 -> 3 [ label = "" ]
    3 -> 4 [ label = "" ]
    4 -> 5 [ label = "from: 0, to: 0" ]
    4 -> 6 [ label = "from: 1, to: 0" ]
    4 -> 7 [ label = "from: 2, to: 0" ]
    6 -> 10 [ label = "from: 0, to: 0" ]
    6 -> 10 [ label = "from: 1, to: 1" ]
    6 -> 10 [ label = "from: 2, to: 2" ]
    6 -> 10 [ label = "from: 3, to: 3" ]
    8 -> 10 [ label = "from: 0, to: 4" ]
    9 -> 10 [ label = "from: 0, to: 5" ]
    10 -> 11 [ label = "from: 0, to: 0" ]
    10 -> 12 [ label = "from: 1, to: 0" ]
    10 -> 13 [ label = "from: 2, to: 0" ]
    10 -> 14 [ label = "from: 3, to: 0" ]
    11 -> 15 [ label = "" ]
    12 -> 16 [ label = "" ]
    13 -> 17 [ label = "" ]
    14 -> 18 [ label = "" ]
    15 -> 19 [ label = "from: 0, to: 0" ]
    16 -> 19 [ label = "from: 0, to: 1" ]
    17 -> 19 [ label = "from: 0, to: 2" ]
    18 -> 19 [ label = "from: 0, to: 3" ]
    15 -> 20 [ label = "from: 1, to: 0" ]
    16 -> 20 [ label = "from: 1, to: 1" ]
    17 -> 20 [ label = "from: 1, to: 2" ]
    18 -> 20 [ label = "from: 1, to: 3" ]
    15 -> 21 [ label = "from: 2, to: 0" ]
    16 -> 21 [ label = "from: 2, to: 1" ]
    17 -> 21 [ label = "from: 2, to: 2" ]
    18 -> 21 [ label = "from: 2, to: 3" ]
    15 -> 22 [ label = "from: 3, to: 0" ]
    16 -> 22 [ label = "from: 3, to: 1" ]
    17 -> 22 [ label = "from: 3, to: 2" ]
    18 -> 22 [ label = "from: 3, to: 3" ]
    19 -> 23 [ label = "" ]
    20 -> 24 [ label = "" ]
    21 -> 25 [ label = "" ]
    22 -> 26 [ label = "" ]
    23 -> 27 [ label = "" ]
    24 -> 28 [ label = "" ]
    25 -> 29 [ label = "" ]
    26 -> 30 [ label = "" ]
    27 -> 31 [ label = "" ]
    28 -> 32 [ label = "" ]
    29 -> 33 [ label = "" ]
    30 -> 34 [ label = "" ]
    31 -> 35 [ label = "" ]
    32 -> 36 [ label = "" ]
    33 -> 37 [ label = "" ]
    34 -> 38 [ label = "" ]
    35 -> 39 [ label = "from: 0, to: 0" ]
    36 -> 39 [ label = "from: 0, to: 1" ]
    37 -> 39 [ label = "from: 0, to: 2" ]
    38 -> 39 [ label = "from: 0, to: 3" ]
    35 -> 40 [ label = "from: 1, to: 0" ]
    36 -> 40 [ label = "from: 1, to: 1" ]
    37 -> 40 [ label = "from: 1, to: 2" ]
    38 -> 40 [ label = "from: 1, to: 3" ]
    35 -> 41 [ label = "from: 2, to: 0" ]
    36 -> 41 [ label = "from: 2, to: 1" ]
    37 -> 41 [ label = "from: 2, to: 2" ]
    38 -> 41 [ label = "from: 2, to: 3" ]
    35 -> 42 [ label = "from: 3, to: 0" ]
    36 -> 42 [ label = "from: 3, to: 1" ]
    37 -> 42 [ label = "from: 3, to: 2" ]
    38 -> 42 [ label = "from: 3, to: 3" ]
    39 -> 43 [ label = "" ]
    40 -> 44 [ label = "" ]
    41 -> 45 [ label = "" ]
    42 -> 46 [ label = "" ]
    43 -> 47 [ label = "" ]
    44 -> 48 [ label = "" ]
    45 -> 49 [ label = "" ]
    46 -> 50 [ label = "" ]
    47 -> 51 [ label = "" ]
    48 -> 52 [ label = "" ]
    49 -> 53 [ label = "" ]
    50 -> 54 [ label = "" ]
    51 -> 57 [ label = "from: 0, to: 0" ]
    52 -> 57 [ label = "from: 0, to: 1" ]
    53 -> 57 [ label = "from: 0, to: 2" ]
    54 -> 57 [ label = "from: 0, to: 3" ]
    55 -> 57 [ label = "from: 0, to: 4" ]
    56 -> 57 [ label = "from: 0, to: 5" ]
    57 -> 58 [ label = "from: 0, to: 0" ]
    57 -> 59 [ label = "from: 1, to: 0" ]
    57 -> 60 [ label = "from: 2, to: 0" ]
    57 -> 61 [ label = "from: 3, to: 0" ]
    58 -> 62 [ label = "" ]
    59 -> 63 [ label = "" ]
    60 -> 64 [ label = "" ]
    61 -> 65 [ label = "" ]
}

Visualization tools: https://magjac.com/graphviz-visual-editor/
example online

image

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Jun 18, 2025
@zhang2014 zhang2014 requested review from BohuTANG and dqhl76 June 18, 2025 13:26
@zhang2014 zhang2014 marked this pull request as ready for review June 18, 2025 13:27
Copy link
Collaborator

@dqhl76 dqhl76 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others LGTM

@zhang2014 zhang2014 merged commit a18d1f0 into databendlabs:main Jun 19, 2025
113 of 120 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-refactor this PR changes the code base without new features or bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants