Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-15145][config] Change TM memory configuration default values for FLIP-49. #10860

Closed

Conversation

xintongsong
Copy link
Contributor

What is the purpose of the change

This PR updates FLIP-49 TM memory sizes configuration default values according to the outcome of tuning with real jobs.

Brief change log

  • JVM overhead min: 128MB -> 192MB
  • JVM metaspace: 128MB -> 96MB
  • Total process size (in default flink-conf.yaml): 1024MB -> 1568MB

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 5f9cd22 (Wed Jan 15 09:40:04 UTC 2020)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Jan 15, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@azagrebin azagrebin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @xintongsong
It looks good to me, just added suggestions for a bit more details in comments
and reorganising process and Flink memory sizes.

EDIT: we probably want to mention the legacy option taskmanager.heap.size only in migration guide.

# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usages of a TaskManager process, including JVM metaspace and other overheads.
# To exclude JVM metaspace and other overheads, please use total flink memory size (taskmanager.memory.flink.size) instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# To exclude JVM metaspace and other overheads, please use total flink memory size (taskmanager.memory.flink.size) instead.

@@ -42,9 +42,12 @@ jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m


# The heap size for the TaskManager JVM
# The total process memory size for the TaskManager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The total process memory size for the TaskManager.
# The total process memory size for the TaskExecutor.

I think this the most recent term for task manager after FLIP-6

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code base it is correct that we call it TaskExecutor. However, we never fully transitioned from TaskManager to TaskExecutor in our documentation. Given that the memory config options are prefixed with taskmanager, I would suggest to stick with TaskManager for the time being.

# The heap size for the TaskManager JVM
# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usages of a TaskManager process, including JVM metaspace and other overheads.
Copy link
Contributor

@azagrebin azagrebin Jan 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Note this accounts for all memory usages of a TaskManager process, including JVM metaspace and other overheads.
# Note this accounts for all memory usage within the TaskExecutor process, including JVM metaspace and other overhead.
taskmanager.memory.process.size: 1568m

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with TaskExecutor -> TaskManager.

…in default flink-conf.yaml.

- Correct the description of process memory size.
- Provide the alternative flink memory size.
@xintongsong
Copy link
Contributor Author

Thanks for the review, @azagrebin.
Comments addressed.

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for providing this PR @xintongsong. The changes look good to me. The only thing I would suggest to change is the term TaskExecutor into TaskManager because that is also the prefix of the newly introduced memory configuration options.

@@ -42,10 +42,16 @@ jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m


# The heap size for the TaskManager JVM
# The total process memory size for the TaskExecutor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to use the term TaskExecutor for time being because we have never really changed the name in the documetation.

# The heap size for the TaskManager JVM
# The total process memory size for the TaskExecutor.
#
# Note this accounts for all memory usage within the TaskExecutor process, including JVM metaspace and other overhead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with TaskExecutor and TaskManager.

@xintongsong
Copy link
Contributor Author

Thank you @tillrohrmann.
I've addressed your comments.

@tillrohrmann
Copy link
Contributor

Thanks @xintongsong for addressing my comments so quickly. Merging this PR now.

tillrohrmann pushed a commit that referenced this pull request Jan 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants