Skip to content

Conversation

@gaoyunhaii
Copy link
Contributor

@gaoyunhaii gaoyunhaii commented Mar 17, 2020

What is the purpose of the change

Decrease the default Netty chunk size from 16MB to 4MB. Since Netty allocates memory by chunks, a smaller chunk size would decrease the amount of unused direct memory, thus decrease the total direct memory required.

To allow users changes the default behaviors, we also makes the page size and max order properties of NettyBufferPool configurable.

Brief change log

  • ce674a7 introduce the configuration options and change the default chunk size to 4MB.

Verifying this change

This change added tests and can be verified as follows:

  • Unit tests for the configuration checking logic.
  • Decrease the required direct memory in E2E test NettyShuffleMemoryControlTest from 20M to 7M and verifies that without the change there will be direct memory OOM exception, and with the change there will be no exception.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector:no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit ce674a7 (Tue Mar 17 00:56:18 UTC 2020)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 17, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@gaoyunhaii
Copy link
Contributor Author

@flinkbot run travis

@gaoyunhaii
Copy link
Contributor Author

@flinkbot run azure

Copy link
Contributor

@zhijiangW zhijiangW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this following improvement after #7368.

I am a bit concern of introducing two new configuration options related to internal netty stack. My assumption is that most users are lack of expertise to understand these options, and it is really not necessary to adjust them in practice. The additional potential trouble is for maintaining compatible if these options might be changed future.

My suggestion is to only adjust the default value inside NettyBufferPool ATM. We can consider exposing some options if feeling really necessary in future. WDYT?

@gaoyunhaii
Copy link
Contributor Author

Hi @zhijiangW very thanks for the review! I agree with that we could postpone adding the options. I updated the PR and change the max-order directly, and re-run the e2e test.

@zhijiangW
Copy link
Contributor

Thanks for the updates, @gaoyunhaii !

The code is generally good to me. I have only one concern. As we known, the default chunk size in netty is 16MB. If we adjust it to a smaller value, it is better to give some reasons why we consider 4MB, not 1MB or other values as default.

Another thought is that it might be possible to integrate another NettyBufferPool instance used in queryable state now, not only for the chunk size issue. But it would be a separate ticket future.

@gaoyunhaii
Copy link
Contributor Author

Hi @zhijiangW very thanks for the review! I have updated the comments. I will continue checking the NettyBufferPool used in the Queryable State.

@gaoyunhaii
Copy link
Contributor Author

@flinkbot run travis

@gaoyunhaii
Copy link
Contributor Author

@flinkbot run azure

Copy link
Contributor

@zhijiangW zhijiangW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @gaoyunhaii ! LGTM!

@zhijiangW zhijiangW merged commit 0bc38ba into apache:master Mar 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants