@swuferhong
Contributor

What is the purpose of the change

In our test environment, we set the default parallelism to 1 and found that the most suitable default value for table.exec.local-hash-agg.adaptive.sampling-threshold was 5000000. However, for batch jobs with high parallelism in production environments, the amount of data handled by a single parallel subtask is almost always less than 5000000 records. After further testing, we found that setting the threshold to 500000 gives better results.
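The reasoning above can be illustrated with a little arithmetic. The sketch below is not Flink code: the class and method names are hypothetical, the input size and parallelism are made-up figures, and it assumes both that rows are spread evenly across subtasks and that the adaptive decision is made only once a subtask has processed at least the threshold number of records.

```java
public class SamplingThresholdSketch {
    // Old and new default values, as stated in the PR description.
    static final long OLD_THRESHOLD = 5_000_000L;
    static final long NEW_THRESHOLD = 500_000L;

    /** Rows seen by one subtask, assuming a perfectly even distribution (an idealization). */
    static long rowsPerSubtask(long totalRows, int parallelism) {
        return totalRows / parallelism;
    }

    /** True if one subtask receives at least `threshold` rows, so sampling can complete. */
    static boolean reachesThreshold(long totalRows, int parallelism, long threshold) {
        return rowsPerSubtask(totalRows, parallelism) >= threshold;
    }

    public static void main(String[] args) {
        long totalRows = 100_000_000L; // hypothetical input size
        int highParallelism = 100;     // hypothetical production parallelism

        // With parallelism 1, a single subtask sees every row.
        System.out.println("parallelism 1, old threshold: "
                + reachesThreshold(totalRows, 1, OLD_THRESHOLD));
        // With high parallelism, each subtask sees only a fraction of the input.
        System.out.println("parallelism 100, old threshold: "
                + reachesThreshold(totalRows, highParallelism, OLD_THRESHOLD));
        System.out.println("parallelism 100, new threshold: "
                + reachesThreshold(totalRows, highParallelism, NEW_THRESHOLD));
    }
}
```

Under these assumptions, a subtask in the high-parallelism job receives 1000000 rows, which is below the old default but above the new one.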

Brief change log

Change the default value of table.exec.local-hash-agg.adaptive.sampling-threshold from 5000000 to 500000.

Verifying this change

No tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? no

@flinkbot
Collaborator

flinkbot commented Feb 9, 2023

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@swuferhong
Contributor Author

@flinkbot run azure

@swuferhong
Contributor Author

@flinkbot run azure

…able.exec.local-hash-agg.adaptive.sampling-threshold'
@swuferhong
Contributor Author

@flinkbot run azure

Contributor

@godfreyhe godfreyhe left a comment

LGTM

@godfreyhe godfreyhe merged commit 55b927b into apache:master Feb 17, 2023
mohsenrezaeithe pushed a commit to mohsenrezaeithe/flink that referenced this pull request Feb 21, 2023
…able.exec.local-hash-agg.adaptive.sampling-threshold'

This closes apache#21900
godfreyhe pushed a commit that referenced this pull request Feb 28, 2023
…able.exec.local-hash-agg.adaptive.sampling-threshold'

This closes #21900

(cherry picked from commit 55b927b)