[Feature][Sort] Adjust sort resources according to data scale #7056

Open
2 tasks done
Tracked by #2154
featzhang opened this issue Dec 24, 2022 · 1 comment · May be fixed by #8822
Labels
stage/stale Issues or PRs that had no activity for a long time type/feature

Comments

@featzhang
Member

featzhang commented Dec 24, 2022

Description

Currently, the total resources for a Flink Sort job come from the configuration file flink-sort-plugin.properties, so every submitted Sort job uses the same amount of resources. When the data scale is large, the resources are insufficient; when it is small, they are wasted.

# Flink parallelism
flink.parallelism=1

Therefore, dynamically adjusting resources according to the amount of data is an urgently needed feature.
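To make the limitation concrete, here is a minimal sketch of letting a per-job value take precedence over the shared default. Only the key `flink.parallelism` and its default of 1 come from the snippet above; the helper names `load_properties` and `resolve_parallelism` and the idea of a per-job override map are assumptions for illustration, not existing InLong APIs.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve_parallelism(defaults, job_overrides):
    """Prefer a per-job value; fall back to the shared default (hypothetical helper)."""
    merged = {**defaults, **job_overrides}
    return int(merged.get("flink.parallelism", "1"))


# Shared defaults, mirroring the properties snippet above.
DEFAULTS = load_properties("# Flink parallelism\nflink.parallelism=1\n")

small_job = resolve_parallelism(DEFAULTS, {})
large_job = resolve_parallelism(DEFAULTS, {"flink.parallelism": "8"})
```

With no override the job keeps the file default; a job that supplies its own value gets that value instead.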

Resource Adaptive Adjustment

In theory, Flink can process about 1,000 records per second per core, although the actual figure depends on factors such as the state backend.

Influencing factors:

  • Data scale:
  • Storage IO bottleneck:
    When a single client connection to external storage becomes the bottleneck, increasing the parallelism or the number of threads helps.
  • Transformation computational complexity:
    For a fixed LoadNode, this is a deterministic factor.
  • Advanced factors:
    cores per task manager, parallelism per core, and so on.
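As a rough illustration of how data scale could drive parallelism, the sketch below divides an expected input rate by a per-core throughput and clamps the result. The 1,000 per second per core default is the theoretical figure quoted above; the function name `estimate_parallelism` and the clamp bounds are assumptions, and the other factors in the list (storage IO, transformation complexity) are not modeled.

```python
import math


def estimate_parallelism(records_per_second,
                         per_core_throughput=1000.0,
                         min_parallelism=1,
                         max_parallelism=64):
    """Estimate parallelism from input rate (hypothetical helper).

    per_core_throughput defaults to the theoretical figure from this issue;
    min_parallelism and max_parallelism are illustrative bounds.
    """
    if records_per_second < 0:
        raise ValueError("records_per_second must be non-negative")
    if per_core_throughput <= 0:
        raise ValueError("per_core_throughput must be positive")
    # Round up so the estimated capacity is never below the input rate.
    needed = math.ceil(records_per_second / per_core_throughput)
    # Clamp so tiny jobs still get one slot and huge jobs stay bounded.
    return max(min_parallelism, min(max_parallelism, needed))
```

For example, `estimate_parallelism(2500)` rounds 2.5 cores up to 3, while a slower pipeline measured at 500 per second per core would need `estimate_parallelism(2500, per_core_throughput=500)`, that is, 5. In practice the per-core figure would have to be measured per job rather than assumed.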

Use case

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@featzhang featzhang changed the title [Feature][Sort] Adjust sort resources according to the amount of data [Feature][Sort] Adjust sort resources according to data scale Dec 24, 2022
@github-actions

This issue is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stage/stale Issues or PRs that had no activity for a long time label Feb 25, 2023
@dockerzhang dockerzhang mentioned this issue Jul 26, 2023
6 tasks
NorthDataEngineer pushed a commit to NorthDataEngineer/inlong that referenced this issue Aug 30, 2023
…et flink config from Map before getFlinkConfig,not only get default flink config from flink-sort-plugin.properties