[FLINK-14566] Enable to get/set whether an operator uses managed memory #10427

Contributor

@zhuzhurk zhuzhurk commented Dec 5, 2019

What is the purpose of the change

To calculate the managed memory fraction for an operator with UNKNOWN resources, we need to know whether the operator will use managed memory, so that memory can be utilized better and performance improves, according to FLINK-14062.

To achieve this, we need an interface to set/get whether an operator uses managed memory.

Brief change log

  • Enable getting/setting the managed memory weight in Transformation
  • Propagate the managed memory weight of a Transformation to the corresponding StreamNode
  • Adjust the managed memory fraction calculation to take managed memory weights into account
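
As a rough illustration of the last item, the sketch below splits managed memory among operators in proportion to their weights. It is a minimal sketch under assumptions, not the code in this PR: the class `ManagedMemoryFractions`, the method `fractionsByWeight`, and the operator names are hypothetical, and the sketch assumes all listed operators have UNKNOWN resources and share the same pool of managed memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not Flink's actual implementation. */
final class ManagedMemoryFractions {

    /**
     * Assigns each operator a fraction equal to its weight divided by the sum
     * of all weights. An operator with weight 0 gets fraction 0.
     */
    static Map<String, Double> fractionsByWeight(Map<String, Integer> weights) {
        int total = 0;
        for (int weight : weights.values()) {
            if (weight < 0) {
                throw new IllegalArgumentException("Weight must be non-negative: " + weight);
            }
            total += weight;
        }
        Map<String, Double> fractions = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            // If no operator asks for managed memory, every fraction is 0.
            double fraction = total == 0 ? 0.0 : (double) entry.getValue() / total;
            fractions.put(entry.getKey(), fraction);
        }
        return fractions;
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("source", 0); // explicitly opts out of managed memory
        weights.put("join", 1);
        weights.put("agg", 1);
        System.out.println(fractionsByWeight(weights));
    }
}
```

With this linear scheme, doubling one operator's weight doubles its share relative to the others, which matches the "linear association" wording in the field's Javadoc.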

Verifying this change

This change added tests and can be verified as follows:

  • Added unit tests for changes in Transformation
  • Added unit tests for changes in StreamGraph/StreamNode
  • Adjusted unit tests for managed memory fraction calculation changes

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

Collaborator

flinkbot commented Dec 5, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 22ccb31 (Thu Dec 05 04:38:52 UTC 2019)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

Collaborator

flinkbot commented Dec 5, 2019

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build

* memory in runtime (linear association). Note that it only works in cases of UNKNOWN
* resources.
*/
private int managedMemoryWeight = DEFAULT_MANAGED_MEMORY_WEIGHT;
Contributor

How do I express that an operator doesn't need managed memory?

Contributor Author

@zhuzhurk zhuzhurk Dec 5, 2019

One can explicitly set the weight to 0.

Contributor

How about making "doesn't need managed memory" the default?

Contributor Author

This is to stay aligned with other resources like CPU and heap memory. Operators with UNKNOWN resources are currently always able to acquire all kinds of available resources.
I'd prefer not to make managed memory a special case.

Contributor

So what kind of ResourceSpec should I use when an operator doesn't need managed memory? Should I first set UNKNOWN resources and then set managed memory to 0 explicitly?

Contributor Author

@zhuzhurk zhuzhurk Dec 5, 2019

So what kind of ResourceSpec should I use when an operator doesn't need managed memory? Should I first set UNKNOWN resources and then set managed memory to 0 explicitly?

Yes. Weights only work in cases of UNKNOWN resources.
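
The rule described in this exchange can be sketched as follows. This is only an illustration under assumptions: `OperatorSketch` and `weightForFractionCalculation` are hypothetical names and not part of Flink, and the default weight of 1 is assumed rather than taken from the PR.

```java
import java.util.OptionalInt;

/** Hypothetical stand-in for an operator description; not Flink's Transformation class. */
final class OperatorSketch {

    /** Assumed default: operators take part in managed memory sharing unless they opt out. */
    static final int DEFAULT_MANAGED_MEMORY_WEIGHT = 1;

    private final boolean resourcesUnknown;
    private int managedMemoryWeight = DEFAULT_MANAGED_MEMORY_WEIGHT;

    OperatorSketch(boolean resourcesUnknown) {
        this.resourcesUnknown = resourcesUnknown;
    }

    /** Setting the weight to 0 expresses "this operator does not need managed memory". */
    void setManagedMemoryWeight(int weight) {
        if (weight < 0) {
            throw new IllegalArgumentException("Weight must be non-negative: " + weight);
        }
        this.managedMemoryWeight = weight;
    }

    /**
     * Returns the weight only when resources are UNKNOWN. With specified
     * resources the weight plays no role, so the result is empty.
     */
    OptionalInt weightForFractionCalculation() {
        return resourcesUnknown ? OptionalInt.of(managedMemoryWeight) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        OperatorSketch operator = new OperatorSketch(true);
        operator.setManagedMemoryWeight(0); // opt out of managed memory
        System.out.println(operator.weightForFractionCalculation());
    }
}
```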

Contributor

I am +1 for making "doesn't need managed memory" the default.
Note that this is managed memory: the only way to use it is to request memory from the memory manager explicitly. That makes it different from other resources.

Also, imagine a user writes an operator that uses managed memory and believes it is the only operator doing so. If they use the DataStream/DataSet API, all the other operators, including map, source, and so on, will take memory away from it.

Contributor Author

@zhuzhurk zhuzhurk Dec 5, 2019

Note that this is managed memory: the only way to use it is to request memory from the memory manager explicitly. That makes it different from other resources.

How much managed memory an operator can acquire actually depends on the declared resources rather than the needed resources. For example, one can specify the task_heap_memory/task_offheap_memory/managed_memory of an operator to be a large number even if the operator does not use that much, or does not use that kind of resource at all.
The framework should respect those settings, and I don't see managed memory as special here.

Contributor Author

@zhuzhurk zhuzhurk Dec 5, 2019

Also, imagine a user writes an operator that uses managed memory and believes it is the only operator doing so. If they use the DataStream/DataSet API, all the other operators, including map, source, and so on, will take memory away from it.

The weight is not a public interface and users cannot set it. So if the user wants to use the fraction, he will always get a 0 managed memory fraction if the default weight is 0, even if the operator requires managed memory.

Contributor

I agree with @zhuzhurk that the default value should not be 0. Otherwise we have a problem if a user writes a stateful DataStream program using RocksDB, since he cannot set the weight value. And even if he could, he would need to remember to set it; otherwise his operator wouldn't get any managed memory.

@tillrohrmann tillrohrmann self-assigned this Dec 5, 2019
Contributor

@tillrohrmann tillrohrmann left a comment

Thanks for this PR @zhuzhurk. LGTM. I'll be addressing my comment while merging this PR.

Comment on lines 475 to 476
assertEquals(resources, iterationPair.f0.getMinResources());
assertEquals(ResourceSpec.ZERO, iterationPair.f1.getMinResources());
Contributor

Here we are testing for implementation details. Wouldn't it be better to test that the sum of the source and sink resources equals resources?

Contributor Author

That's right. Testing the merged resources would be better since the way to split the resources should make no difference at the moment and we actually do not care about it.
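
A check along those lines might look like the sketch below. It is an assumption-laden illustration: `Res` is a hypothetical, simplified resource value with only two fields, standing in for a real resource spec, and `sumMatches` is an invented helper rather than anything from the PR's test code.

```java
/** Hypothetical sketch of asserting on merged resources instead of on the split. */
final class MergedResourcesCheck {

    /** Simplified stand-in for a resource spec; only CPU cores and heap memory. */
    static final class Res {
        final double cpuCores;
        final int heapMb;

        Res(double cpuCores, int heapMb) {
            this.cpuCores = cpuCores;
            this.heapMb = heapMb;
        }

        /** Component-wise sum of two resource values. */
        Res merge(Res other) {
            return new Res(cpuCores + other.cpuCores, heapMb + other.heapMb);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Res)) {
                return false;
            }
            Res that = (Res) o;
            return Double.compare(cpuCores, that.cpuCores) == 0 && heapMb == that.heapMb;
        }

        @Override
        public int hashCode() {
            return 31 * Double.hashCode(cpuCores) + heapMb;
        }
    }

    /** True if the iteration source and sink together hold exactly the expected resources. */
    static boolean sumMatches(Res expected, Res source, Res sink) {
        return expected.equals(source.merge(sink));
    }

    public static void main(String[] args) {
        Res resources = new Res(1.0, 100);
        // Both splits pass, so the test no longer depends on how resources are divided.
        System.out.println(sumMatches(resources, resources, new Res(0.0, 0)));
        System.out.println(sumMatches(resources, new Res(0.5, 40), new Res(0.5, 60)));
    }
}
```

Checking only the sum keeps the test valid if the split between the two iteration nodes changes later.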

zhuzhurk and others added 7 commits December 5, 2019 16:57
Currently resources are only validated in DataStream. But the table planner may set resources on a Transformation directly via Transformation#setResources, which is a public interface.
We must validate the resource parameters in Transformation#setResources.
…urces when the head node has specified resources
…arding managed memory weights

This only applies to vertices with UNKNOWN resources.

This closes apache#10427.
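
The first commit message above moves validation into the setter itself, so that every caller is covered. A minimal sketch of that idea follows, under stated assumptions: `ResourceSettingSketch` is hypothetical, resources are reduced to a single memory value in MB, and the two checks shown (non-negative values, minimum not above preferred) are assumed examples rather than the checks Flink actually performs.

```java
/** Hypothetical sketch of validating arguments inside a public setter. */
final class ResourceSettingSketch {

    private int minMemoryMb;
    private int preferredMemoryMb;

    /**
     * Validates before storing, so callers that bypass a higher-level API
     * cannot put the object into an inconsistent state.
     */
    void setResources(int minMemoryMb, int preferredMemoryMb) {
        if (minMemoryMb < 0 || preferredMemoryMb < 0) {
            throw new IllegalArgumentException("Resources must be non-negative.");
        }
        if (minMemoryMb > preferredMemoryMb) {
            throw new IllegalArgumentException(
                "Minimum resources must not exceed preferred resources.");
        }
        this.minMemoryMb = minMemoryMb;
        this.preferredMemoryMb = preferredMemoryMb;
    }

    int getMinMemoryMb() {
        return minMemoryMb;
    }

    int getPreferredMemoryMb() {
        return preferredMemoryMb;
    }
}
```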
@tillrohrmann tillrohrmann force-pushed the FLINK_14566_operator_managed_memory_weight branch from 80562d5 to 976039e Compare December 5, 2019 15:58

6 participants