[ST-Engine][Design] The Design of LogicalPlan to PhysicalPlan #2269

Closed
3 tasks done
Hisoka-X opened this issue Jul 26, 2022 · 2 comments


@Hisoka-X
Member

Hisoka-X commented Jul 26, 2022

Search before asking

  • I have searched the existing feature requests and found no similar feature requirement.

Description

The SeaTunnel engine receives the logical plan sent by the client and needs to convert it into a physical plan that can be executed directly. The logical plan therefore has to be processed and converted step by step. The specific process is as follows:

  1. Logical Plan
    [image: logical plan]
    After receiving the logical plan, we need to remove redundant Actions and verify the Schema (Transform2 and Transform5 should be the same).
  2. Execution Plan
    [image: execution plan]
    While converting to an execution plan:
      • Transforms need to be merged; the basis for merging is whether the data will be split after the Transform.
      • Convert each Shuffle Action to a Queue.
      • Convert the plan into multiple pipelines.
  3. Physical Plan
    [image: physical plan]
    We split each Pipeline into separate executable tasks according to its degree of parallelism, and also add the SourceSplitEnumerator and SinkAggregatedCommitter tasks.
    After this, the tasks can be sent to the task execution service, where they run normally.
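
The last two steps can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the engine's implementation: the plan is a linear chain (so no Transform fans out and everything between two shuffles can be merged into one pipeline), every Action inside a pipeline shares the same parallelism, and all names here (`PhysicalPlanSketch`, `Action`, `Kind`, `toPipelines`, `toTasks`, the task name strings) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these class or method names come from SeaTunnel.
class PhysicalPlanSketch {

    enum Kind { SOURCE, TRANSFORM, SHUFFLE, SINK }

    /** A node of the plan: a name, a kind, and a degree of parallelism. */
    static final class Action {
        final String name;
        final Kind kind;
        final int parallelism;

        Action(String name, Kind kind, int parallelism) {
            this.name = name;
            this.kind = kind;
            this.parallelism = parallelism;
        }
    }

    /** Execution plan step: cut a linear chain into pipelines at every shuffle. */
    static List<List<Action>> toPipelines(List<Action> chain) {
        List<List<Action>> pipelines = new ArrayList<>();
        List<Action> current = new ArrayList<>();
        for (Action action : chain) {
            if (action.kind == Kind.SHUFFLE) {
                // The shuffle is dropped here; it stands for the queue between two pipelines.
                if (!current.isEmpty()) {
                    pipelines.add(current);
                }
                current = new ArrayList<>();
            } else {
                current.add(action);
            }
        }
        if (!current.isEmpty()) {
            pipelines.add(current);
        }
        return pipelines;
    }

    /** Physical plan step: one subtask per degree of parallelism, plus coordinator tasks. */
    static List<String> toTasks(List<List<Action>> pipelines) {
        List<String> tasks = new ArrayList<>();
        for (int p = 0; p < pipelines.size(); p++) {
            List<Action> pipeline = pipelines.get(p);
            for (Action action : pipeline) {
                if (action.kind == Kind.SOURCE) {
                    tasks.add("pipeline-" + p + "/SourceSplitEnumerator[" + action.name + "]");
                } else if (action.kind == Kind.SINK) {
                    tasks.add("pipeline-" + p + "/SinkAggregatedCommitter[" + action.name + "]");
                }
            }
            // Assumption: all actions in one pipeline share the first action's parallelism.
            int parallelism = pipeline.get(0).parallelism;
            for (int i = 0; i < parallelism; i++) {
                tasks.add("pipeline-" + p + "/subtask-" + i);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<Action> chain = Arrays.asList(
                new Action("Source1", Kind.SOURCE, 2),
                new Action("Transform1", Kind.TRANSFORM, 2),
                new Action("Shuffle", Kind.SHUFFLE, 2),
                new Action("Transform2", Kind.TRANSFORM, 3),
                new Action("Sink1", Kind.SINK, 3));
        for (String task : toTasks(toPipelines(chain))) {
            System.out.println(task);
        }
    }
}
```

For the five-action chain in `main`, the sketch produces two pipelines and seven tasks: one enumerator task and two subtasks for the first pipeline, one committer task and three subtasks for the second.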

Usage Scenario

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@2013650523
Contributor

2013650523 commented Aug 10, 2022

Hi, excuse me, let me ask a question: why is the data queue designed to decouple source and sink? Is the data cached in the queue when the source fails?

@Hisoka-X
Member Author

> Hi, excuse me, let me ask a question: why is the data queue designed to decouple source and sink? Is the data cached in the queue when the source fails?

  1. The Queue is created when a shuffle transform (now called a partition transform) is used. It is used to shuffle data and to change parallelism.
  2. The cache feature you mentioned will be added later.
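
To illustrate how a queue lets the two sides run at different parallelism, here is a minimal sketch using a bounded `java.util.concurrent.BlockingQueue`. It is not how the engine implements its Queue; the class `ShuffleQueueSketch`, the method `run`, the capacity of 16, and the end-of-stream marker are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a queue between an upstream and a downstream stage.
class ShuffleQueueSketch {

    // End-of-stream marker; real records in this sketch are never negative.
    private static final int END = -1;

    /** Runs upstream writers and downstream readers and returns the number of records read. */
    static int run(int upstream, int downstream, int recordsPerUpstream) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        AtomicInteger consumed = new AtomicInteger();
        List<Thread> writers = new ArrayList<>();
        List<Thread> readers = new ArrayList<>();

        for (int u = 0; u < upstream; u++) {
            Thread writer = new Thread(() -> {
                try {
                    for (int r = 0; r < recordsPerUpstream; r++) {
                        queue.put(r); // blocks while the queue is full
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            writers.add(writer);
            writer.start();
        }

        for (int d = 0; d < downstream; d++) {
            Thread reader = new Thread(() -> {
                try {
                    while (queue.take() != END) { // blocks while the queue is empty
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            readers.add(reader);
            reader.start();
        }

        try {
            for (Thread writer : writers) {
                writer.join();
            }
            // One end marker per reader, sent only after every writer has finished.
            for (int d = 0; d < downstream; d++) {
                queue.put(END);
            }
            for (Thread reader : readers) {
                reader.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the stages", e);
        }
        return consumed.get();
    }
}
```

Calling `ShuffleQueueSketch.run(2, 3, 100)` starts two writers and three readers; neither side needs to know how many threads the other side uses, which is the decoupling the queue provides.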

Hisoka-X closed this as completed Nov 1, 2022

2 participants