Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-15143] Docs for FLIP-49 TM memory model and configuration guide #10999

Merged
merged 3 commits into from Feb 10, 2020

Conversation

azagrebin
Copy link
Contributor

What is the purpose of the change

In release 1.10, with FLIP-49, we introduced significant changes to the TaskExecutor memory model and it's related configuration options / logics.

It is very important that we clearly state the changes and potential effects, and guide our users to tune their clusters with the new configuration for both new setups and migrations of previous setups.

Brief change log

  • Add memory model description and configuration guide
  • Add tuning guide for various features
  • Add troubleshooting guide
  • Add migration guide from pre-FLIP-49 configuration

Verifying this change

cd flink/docs/docker
./run.sh
./build_docs.sh -p

open localhost:4000

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (docs)

@azagrebin
Copy link
Contributor Author

azagrebin commented Feb 3, 2020

I will duplicate files for the later Chinese translation before merging and reference old model (#11004).
cc @xintongsong @sjwiesman

@flinkbot
Copy link
Collaborator

flinkbot commented Feb 3, 2020

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 3ed9750 (Mon Feb 03 13:34:16 UTC 2020)

✅no warnings

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Feb 3, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@tillrohrmann tillrohrmann self-assigned this Feb 3, 2020
Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @azagrebin. I finished the first part of my review (general documentation) and had some comments. I will continue with the remaining parts.

docs/ops/memory/mem_setup.md Outdated Show resolved Hide resolved
The further described memory configuration is applicable starting with the release version 1.10. If you upgrade Flink
from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.

<strong>Note: This memory setup guide is relevant only for Task Executors!</strong>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent casing. Sometimes Task Executor is written with capital letters and sometimes with lower case letters. I'd suggest to stick to one way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I wanted to emphasise that it is for TM not JM. I can try to use closer to TM.

from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.

<strong>Note: This memory setup guide is relevant only for Task Executors!</strong>
Check Job Manager [related configuration options](../config.html#jobmanager) for the memory setup of Job Manager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for "Job Manager"

from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.

<strong>Note: This memory setup guide is relevant only for Task Executors!</strong>
Check Job Manager [related configuration options](../config.html#jobmanager) for the memory setup of Job Manager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Check Job Manager [related configuration options](../config.html#jobmanager) for the memory setup of Job Manager.
Check [Job Manager related configuration options](../config.html#jobmanager) for the memory setup of Job Manager.

docs/ops/memory/mem_setup.md Outdated Show resolved Hide resolved
docs/ops/memory/mem_setup.md Outdated Show resolved Hide resolved
docs/ops/memory/mem_setup.md Outdated Show resolved Hide resolved
docs/ops/memory/mem_setup.md Outdated Show resolved Hide resolved
a bigger JVM limit in this case.

<strong>Note:</strong> The *network memory* is also part of JVM *direct memory* but it is managed by Flink and guaranteed
that it is always allocated within its configured size. Therefore, resizing network memory will not necessarily help in this situation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
that it is always allocated within its configured size. Therefore, resizing network memory will not necessarily help in this situation.
to never exceed its configured size. Therefore, resizing network memory will not necessarily help in this situation.


See also [detailed Memory Model](#detailed-memory-model).

## Detailed Memory Model
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to make this a separate page. This will shorten the page a bit and make it easier to digest for the reader.

under the License.
-->

* toc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you move the TOC below the first paragraph, I think it looks nicer

-->

* toc
{:toc}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, please move to line 30 below the first paragraph

docs/ops/memory/index.md Outdated Show resolved Hide resolved
Comment on lines 80 to 87
[Managed memory](../memory/mem_setup.html#managed-memory) helps Flink to run the batch operators efficiently.
Therefore, the size of the [managed memory](mem_setup.html#managed-memory) can affect the performance of [batch jobs](../../dev/batch).
If it makes sense, Flink will try to allocate and use as much *managed memory* as configured for batch jobs but not beyond its limit.
It prevents OutOfMemoryExceptions because Flink knows how much memory it can use to execute operations.
If Flink runs out of [managed memory](../memory/mem_setup.html#managed-memory), it utilizes disk space.
Using [managed memory](../memory/mem_setup.html#managed-memory), some operations can be performed directly
on the raw data without having to deserialize the data to convert it into Java objects. All in all,
[managed memory](../memory/mem_setup.html#managed-memory) improves the robustness and speed of the system.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flink's batch operators leverage managed memory to run more efficiently. In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects. This means that managed memory configurations have practical effects on the performance of your applications. Flink will attempt to allocate and use as much managed memory as configured for batch jobs but not go beyond its limits. This prevents OutOfMemoryException's because Flink knows precisely how much memory it has to leverage. If the managed memory is not sufficient, Flink will gracefully spill to disk.

@sjwiesman
Copy link
Contributor

@azagrebin I have a few small comments on the placement of the TOC and other minutiae. I'm relying on Till and others to validate this for correctness but overall I think this looks good.

@azagrebin
Copy link
Contributor Author

Thanks for the review @sjwiesman ! I have addressed the comments.

@sjwiesman
Copy link
Contributor

ok, please tag me if anything is dramatically changed and you’d like me to check the language otherwise tentative +1

Copy link
Contributor

@xintongsong xintongsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening the PR, @azagrebin.
I've also left some comments, most of them related to the links.
In addition, since state_backends.zh.md is already translated into Chinese, I translated the new added contents in this file in my comments. Maybe @carp84 or @Myasuka can help verify my translation.

from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.

<strong>Note:</strong> This memory setup guide is relevant <strong>only for task executors</strong>!
Check [job manager related configuration options](../config.html#jobmanager) for the memory setup of job manager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the linked page does not have the anchor #jobmanager. This link jumps to the beginning of the configuration page.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the docs have been recently restructured. I will find a new link

Comment on lines 54 to 55
* Total Flink memory ([taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1))
* Total process memory ([taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two links do not work for me. Why do we have -1 in the urls?
Is it a mistake or something wrong with my building?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The options, which have -1 at the end of their links, were duplicated in the main configuration page. They were listed as common options and then as TM or TM mem specific. After docs restructure, they got deduplicated. I will remove -1 for them.

[Here](#detailed-memory-model) are more details about the other memory components.

Configuring *total Flink memory* is better suited for standalone deployments where you want to declare how much memory
is given to Flink itself. The *total Flink memory* splits up into [managed memory size](#managed-memory) and JVM heap.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
is given to Flink itself. The *total Flink memory* splits up into [managed memory size](#managed-memory) and JVM heap.
is given to Flink itself. The *total Flink memory* splits up into JVM heap, [managed memory](#managed-memory), and direct memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to align the description with the figure above, to avoid potential confusing.

Comment on lines 74 to 75
* [taskmanager.memory.flink.size](../config.html#taskmanager-memory-flink-size-1)
* [taskmanager.memory.process.size](../config.html#taskmanager-memory-process-size-1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem of '-1' in urls for these two lines.

This more fine-grained approach is described in more detail [here](#configure-heap-and-managed-memory).

<strong>Note:</strong> One of the three mentioned ways has to be used to configure Flink’s memory (except for local execution), or the Flink startup will fail.
This means that one of the following option subsets, which do not have default values, have to be configured explicitly in *flink-conf.yaml*:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily in flink-conf.yaml. It can also be '-D' parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I will just remove in *flink-conf.yaml*

@@ -74,6 +74,8 @@ The MemoryStateBackend is encouraged for:
- Local development and debugging
- Jobs that do hold little state, such as jobs that consist only of record-at-a-time functions (Map, FlatMap, Filter, ...). The Kafka Consumer requires very little state.

It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero.
It is also recommended to set [managed memory](../memory/mem_setup.html#managed-memory) to zero.

Comment on lines 74 to 75
It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero.
This will ensure that the maximum amount of memory is allocated for user code on the JVM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Translation:
建议同时将 [managed memory](../memory/mem_setup.html#managed-memory) 设为0,以保证将最大限度的内存分配给 JVM 上的用户代码。

@@ -92,6 +94,9 @@ The FsStateBackend is encouraged for:
- Jobs with large state, long windows, large key/value states.
- All high-availability setups.

It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero.
It is also recommended to set [managed memory](../memory/mem_setup.html#managed-memory) to zero.

Comment on lines 96 to 97
It is also recommended to set [managed memory](mem_setup.html#managed-memory) to zero.
This will ensure that the maximum amount of memory is allocated for user code on the JVM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Translation:
建议同时将 [managed memory](../memory/mem_setup.html#managed-memory) 设为0,以保证将最大限度的内存分配给 JVM 上的用户代码。

@@ -115,6 +120,8 @@ RocksDBStateBackend 的适用场景:
然而,这也意味着使用 RocksDBStateBackend 将会使应用程序的最大吞吐量降低。
所有的读写都必须序列化、反序列化操作,这个比基于堆内存的 state backend 的效率要低很多。

Check also recommendations about the [task executor memory configuration](../memory/mem_tuning.html#rocksdb-state-backend) for the RocksDBStateBackend.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Translation:
请同时参考 [Task Executor 内存配置](../memory/mem_tuning.html#rocksdb-state-backend) 中关于 RocksDBStateBackend 的建议。

In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects.
This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryException`'s because Flink knows precisely
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class name is OutOfMemoryError

@azagrebin
Copy link
Contributor Author

Thanks for the review @tillrohrmann
I addressed the comments

Copy link
Contributor

@xintongsong xintongsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments, @azagrebin. LGTM.

In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects.
This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryError`'s because Flink knows precisely
Copy link
Member

@GJL GJL Feb 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plural is without ' just s
Maybe it's easier to just write OutOfMemoryErrors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjwiesman what's your opinion? is it typo?


This section describes the changes of the default ‘flink-conf.yaml’ shipped with Flink.

The total memory (‘taskmanager.heap.size’) is replaced by [`taskmanager.memory.process.size`](../config.html#taskmanager-memory-process-size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no backticks here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I overlooked old options w/o links

This section describes the changes of the default ‘flink-conf.yaml’ shipped with Flink.

The total memory (‘taskmanager.heap.size’) is replaced by [`taskmanager.memory.process.size`](../config.html#taskmanager-memory-process-size)
in the default ‘flink-conf.yaml’. The value is also increased from 1024Mb to 1568Mb.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well, why no backticks? There are a couple of these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this PR @azagrebin. I had some additional comments.

Comment on lines 33 to 34
The further described memory configuration is applicable starting with the release version *1.10*. If you upgrade Flink
from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: In the first sentence we use italics for the release version and in the next sentence we don't use italics.

from earlier versions, check the [migration guide](mem_migration.html) because many changes were introduced with the 1.10 release.

<span class="label label-info">Note</span> This memory setup guide is relevant <strong>only for task executors</strong>!
Check [job manager related configuration options](../config.html#jobmanager-heap-size) for the memory setup of job manager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Check [job manager related configuration options](../config.html#jobmanager-heap-size) for the memory setup of job manager.
Check [job manager related configuration options](../config.html#jobmanager-heap-size) for the memory setup of a job manager.


The *total process memory* of Flink JVM processes consists of memory consumed by Flink application (*total Flink memory*)
and by the JVM to run the process. The *total Flink memory* consumption includes usage of JVM heap,
*managed memory* (managed by Flink) and other direct (or native) memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should direct memory be italic similar to the other components?

<br />

If you run FIink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration
options are relevant, see also [local execution](mem_detail.html#local-execution) for more details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
options are relevant, see also [local execution](mem_detail.html#local-execution) for more details.
options are relevant. See [local execution](mem_detail.html#local-execution) for more details.

If you run FIink locally (e.g. from your IDE) without creating a cluster, then only a subset of the memory configuration
options are relevant, see also [local execution](mem_detail.html#local-execution) for more details.

Otherwise, the simplest way to setup memory in Flink is to configure either of the two following options:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Otherwise, the simplest way to setup memory in Flink is to configure either of the two following options:
Otherwise, the simplest way to setup memory in Flink is to configure either of the following two options:

<br/>

All of the components listed above can be but do not have to be explicitly configured for the local execution.
If they are not configured they are set to their default values. [Task heap memory](mem_setup.html#task-operator-heap-memory) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If they are not configured they are set to their default values. [Task heap memory](mem_setup.html#task-operator-heap-memory) and
If they are not configured they are set to their default values. [*Task heap memory*](mem_setup.html#task-operator-heap-memory) and


All of the components listed above can be but do not have to be explicitly configured for the local execution.
If they are not configured they are set to their default values. [Task heap memory](mem_setup.html#task-operator-heap-memory) and
*task off-heap memory* are considered to be infinite (*Long.MAX_VALUE* bytes) and [managed memory](mem_setup.html#managed-memory)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*task off-heap memory* are considered to be infinite (*Long.MAX_VALUE* bytes) and [managed memory](mem_setup.html#managed-memory)
*task off-heap memory* are considered to be infinite (*Long.MAX_VALUE* bytes) and [*managed memory*](mem_setup.html#managed-memory)

has a default value of 128Mb only for the local execution mode.

<span class="label label-info">Note</span> The task heap size is not related in any way to the real heap size in this case.
It can become relevant for future optimizations coming with next releases. The actual JVM heap size of the started
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It can become relevant for future optimizations coming with next releases. The actual JVM heap size of the started
The actual JVM heap size of the started

<span class="label label-info">Note</span> The task heap size is not related in any way to the real heap size in this case.
It can become relevant for future optimizations coming with next releases. The actual JVM heap size of the started
local process is not controlled by Flink and depends on how you start the process.
If you want to control the JVM heap size you have to explicitly pass the corresponding JVM arguments, e.g. *-Xmx*, *-Xms*.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If you want to control the JVM heap size you have to explicitly pass the corresponding JVM arguments, e.g. *-Xmx*, *-Xms*.
If you want to control the JVM heap size, then you have to explicitly pass the corresponding arguments *-Xmx* and *-Xms* to the JVM.

@@ -0,0 +1,234 @@
---
title: "Migration from old configuration (before 1.10 release)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not consistent capitalization wrt the other titles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
title: "Migration from old configuration (before 1.10 release)"
title: "Migration Guide"

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments.


## IllegalConfigurationException

If you see an *IllegalConfigurationException* thrown from *TaskExecutorProcessUtils*, it usually indicates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If you see an *IllegalConfigurationException* thrown from *TaskExecutorProcessUtils*, it usually indicates
If you see an `IllegalConfigurationException` thrown from `TaskExecutorProcessUtils`, it usually indicates

@@ -0,0 +1,86 @@
---
title: "Memory tuning guide"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
title: "Memory tuning guide"
title: "Memory Tuning Guide"

under the License.
-->

In addition to the [main memory setup guide](mem_setup.html), this section explains how to setup memory of task executors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In addition to the [main memory setup guide](mem_setup.html), this section explains how to setup memory of task executors
In addition to the [main memory setup guide](mem_setup.html), this section explains how to configure the task executor's memory for different workloads and deployments.

-->

In addition to the [main memory setup guide](mem_setup.html), this section explains how to setup memory of task executors
depending on the use case and which options are important in which case.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
depending on the use case and which options are important in which case.

* toc
{:toc}

## Configure memory for standalone deployment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalization of heading is not consistent with the headings of the other pages.


Flink's batch operators leverage [managed memory](../memory/mem_setup.html#managed-memory) to run more efficiently.
In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects.
This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
This means that the [*managed memory*](../memory/mem_setup.html#managed-memory) configuration affects

Flink's batch operators leverage [managed memory](../memory/mem_setup.html#managed-memory) to run more efficiently.
In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects.
This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
the performance of your applications. Flink will use as much [managed memory](../memory/mem_setup.html#managed-memory)

In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects.
This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryError`'s because Flink knows precisely
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryError`'s because Flink knows precisely
as configured for the execution of batch jobs. This prevents `OutOfMemoryErrors` because Flink knows precisely

This means that [managed memory](../memory/mem_setup.html#managed-memory) configurations have practical effects
on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryError`'s because Flink knows precisely
how much memory it has to leverage. If the [managed memory](../memory/mem_setup.html#managed-memory) is not sufficient,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
how much memory it has to leverage. If the [managed memory](../memory/mem_setup.html#managed-memory) is not sufficient,
how much memory it can use. If the [*managed memory*](../memory/mem_setup.html#managed-memory) is not large enough,

on the performance of your applications. Flink will attempt to allocate and use as much [managed memory](../memory/mem_setup.html#managed-memory)
as configured for batch jobs but not go beyond its limits. This prevents `OutOfMemoryError`'s because Flink knows precisely
how much memory it has to leverage. If the [managed memory](../memory/mem_setup.html#managed-memory) is not sufficient,
Flink will gracefully spill to disk.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Flink will gracefully spill to disk.
then Flink will gracefully spill to disk.

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last round of review. Thanks a lot for creating this documentation @azagrebin. I had a couple of comments which we can resolve before merging this PR. Once this is done +1 for merging.

## IllegalConfigurationException

If you see an *IllegalConfigurationException* thrown from *TaskExecutorProcessUtils*, it usually indicates
that there is either an invalid configuration value (e.g., negative memory size, fraction that is greater than 1, etc.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
that there is either an invalid configuration value (e.g., negative memory size, fraction that is greater than 1, etc.)
that there is either an invalid configuration value (e.g. negative memory size, fraction that is greater than 1, etc.)


## OutOfMemoryError: Java heap space

The exception usually indicates that JVM heap is not configured with enough size. You can try to increase the JVM heap size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The exception usually indicates that JVM heap is not configured with enough size. You can try to increase the JVM heap size
The exception usually indicates that the JVM heap is too small. You can try to increase the JVM heap size

## OutOfMemoryError: Java heap space

The exception usually indicates that JVM heap is not configured with enough size. You can try to increase the JVM heap size
by configuring larger [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
by configuring larger [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory).
by increasing [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory).

The exception usually indicates that JVM heap is not configured with enough size. You can try to increase the JVM heap size
by configuring larger [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory).

<span class="label label-info">Note</span> You can also increase [framework heap memory](mem_setup.html#framework-memory) but this option
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<span class="label label-info">Note</span> You can also increase [framework heap memory](mem_setup.html#framework-memory) but this option
<span class="label label-info">Note</span> You can also increase the [framework heap memory](mem_setup.html#framework-memory) but this option

by configuring larger [total memory](mem_setup.html#configure-total-memory) or [task heap memory](mem_setup.html#task-operator-heap-memory).

<span class="label label-info">Note</span> You can also increase [framework heap memory](mem_setup.html#framework-memory) but this option
is advanced and recommended to be changed if you are sure that the Flink framework itself needs more memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
is advanced and recommended to be changed if you are sure that the Flink framework itself needs more memory.
is advanced and should only be changed if you are sure that the Flink framework itself needs more memory.

Additionally, you can now have more direct control over the JVM heap assigned to the operator tasks
([`taskmanager.memory.task.heap.size`](../config.html#taskmanager-memory-task-heap-size)),
see also [Task (Operator) Heap Memory](mem_setup.html#task-operator-heap-memory).
The same memory has to be accounted for the heap state backend ([MemoryStateBackend](../state/state_backends.html#the-memorystatebackend)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The same memory has to be accounted for the heap state backend ([MemoryStateBackend](../state/state_backends.html#the-memorystatebackend)
The JVM heap memory is also used by the heap state backends ([MemoryStateBackend](../state/state_backends.html#the-memorystatebackend)


See also [how to configure managed memory now](mem_setup.html#managed-memory).

### Explicit size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalization


If the [RocksDBStateBackend](../state/state_backends.html#the-rocksdbstatebackend) is chosen for a streaming job,
its native memory consumption should now be accounted for in [managed memory](mem_setup.html#managed-memory).
The RocksDB memory allocation is limited by the [managed memory](mem_setup.html#managed-memory) size.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could add that one could turn this behaviour off if one wants the old behaviour.

## Container Cut-Off Memory

For containerized deployments, you could previously specify a cut-off memory. This memory could accommodate for unaccounted memory allocations.
Dependencies which were not directly controlled by Flink were the main source of those allocations, e.g. RocksDB, internals of JVM etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Dependencies which were not directly controlled by Flink were the main source of those allocations, e.g. RocksDB, internals of JVM etc.
Dependencies which were not directly controlled by Flink were the main source of those allocations, e.g. RocksDB, internals of JVM, etc.

The RocksDB memory allocation is also limited by the configured size of the [managed memory](mem_setup.html#managed-memory).
See also [migrating managed memory](#managed-memory) and [how to configure managed memory now](mem_setup.html#managed-memory).

The other specific consumption of direct or native off-heap memory can be now addressed by the following new configuration options:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The other specific consumption of direct or native off-heap memory can be now addressed by the following new configuration options:
The other direct or native off-heap memory consumers can now be addressed by the following new configuration options:

@azagrebin
Copy link
Contributor Author

Thanks for the review @tillrohrmann
I addressed the comments
merged into master by 01a9ff9 - 0cb8834
merged into 1.10 by ff7e25c - 1212f66

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants