[FLINK-15368][e2e] Add end-to-end test for controlling RocksDB memory usage #10930

Myasuka · 2020-01-23T07:51:18Z

What is the purpose of the change

Add an end-to-end test for controlling RocksDB memory usage. The test job has 4 states in 4 different operators, and all the operators share one slot.

NOTE: This end-to-end test could be unstable when there are too many unflushed immutable mem-tables. I wrote a doc to explain how the write buffer manager works in RocksDB. In that doc I explain that, in the worst case, the total memory usage could be much higher than expected.

Below are the general test results:
1GB TM with 2 slots, without memory control. For a fair comparison, I also cache index and filter blocks in the block cache, but do not change any other RocksDB configuration.
When we do not control memory usage across RocksDB instances, the total memory is the sum of block-cache-usage + total-mem-table over all 4 states. As you can see, the total memory usage in one slot could be 400MB+.
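The summation described above can be sketched as follows. This is only an illustration of the arithmetic, not code from the PR: the `StateMemory` type, the function name, and the per-state byte values are hypothetical placeholders chosen so the total lands in the "400MB+" range, and they are not measurements from the test run.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class StateMemory:
    """Memory reported for one RocksDB-backed state (all values in bytes)."""
    name: str
    block_cache_usage: int  # bytes held in this state's own block cache
    mem_table_size: int     # bytes held in this state's mem-tables


def total_unmanaged_memory(states):
    """Without memory control, every state has its own cache and mem-tables,
    so the slot-wide total is the sum of both figures over all states."""
    return sum(s.block_cache_usage + s.mem_table_size for s in states)


# Placeholder numbers for 4 states (assumption, NOT measured values).
states = [StateMemory(f"state-{i}", 64 * MIB, 40 * MIB) for i in range(4)]
total = total_unmanaged_memory(states)
print(f"total memory in slot: {total / MIB:.0f} MiB")
```

The point of the sketch is that the total grows with the number of states, which is why an unmanaged slot can far exceed any single instance's configured cache size.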

1GB TM with 2 slots, each with 161061276 bytes of managed off-heap memory
Since all RocksDB instances share the same cache, the total memory usage is the block cache usage. As you can see, the memory usage stays close to 161061276 bytes.
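A check of this kind could look like the sketch below. It is a hedged illustration, not the assertion used by the test script: the function name and the 10% tolerance are assumptions made here, and the tolerance in particular is arbitrary. Only the 161061276-byte budget comes from the description above.

```python
# Managed off-heap memory per slot, taken from the PR description.
MANAGED_BYTES_PER_SLOT = 161061276


def within_budget(block_cache_usage, budget=MANAGED_BYTES_PER_SLOT, tolerance=0.10):
    """Return True if the shared block cache usage stays near the budget.

    With a shared cache, the block cache usage is treated as the total
    RocksDB memory of the slot. The tolerance is a hypothetical allowance
    for overshoot, e.g. from unflushed immutable mem-tables.
    """
    return block_cache_usage <= budget * (1 + tolerance)


print(within_budget(160_000_000))        # usage close to the budget
print(within_budget(400 * 1024 * 1024))  # usage typical of the unmanaged case
```

Because the worst case noted above can push usage well beyond the budget, a strict comparison without any allowance would likely make such a check flaky.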

Brief change log

Add end-to-end test for controlling RocksDB memory usage.

Verifying this change

This change added tests and can be verified as follows:

Added RocksDBStateMemoryControlTestProgram to verify end-to-end.

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @Public(Evolving): no
The serializers: no
The runtime per-record code paths (performance sensitive): no
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
The S3 file system connector: no

Documentation

Does this pull request introduce a new feature? no
If yes, how is the feature documented? not applicable


flinkbot · 2020-01-23T07:53:33Z

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit abfc351 (Thu Jan 23 07:53:33 UTC 2020)

Warnings:

2 pom.xml files were touched: Check for build and licensing issues.
No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

❓ 1. The [description] looks good.
❓ 2. There is [consensus] that the contribution should go into Flink.
❓ 3. Needs [attention] from.
❓ 4. The change fits into the overall [architecture].
❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details

The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
@flinkbot approve all to approve all aspects
@flinkbot approve-until architecture to approve everything until architecture
@flinkbot attention @username1 [@username2 ..] to require somebody's attention
@flinkbot disapprove architecture to remove an approval you gave earlier

carp84

The design of the test looks good to me, please check my inline comments.

Could you also trigger the e2e tests in Travis after resolving the comments, to confirm that the newly added tests pass, @Myasuka? Thanks.

flink-end-to-end-tests/test-scripts/test_rocksdb_state_memory_control.sh

flink-end-to-end-tests/run-nightly-tests.sh

flinkbot · 2020-01-23T08:17:15Z

CI report:

abfc351 Travis: FAILURE Azure: SUCCESS

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot run travis re-run the last Travis build
@flinkbot run azure re-run the last Azure build

StephanEwen · 2020-01-23T10:49:21Z

Nice work, @Myasuka and @carp84 .

With Chinese New Year happening now, I can take this over from here and address the remaining comments.

[FLINK-15368][e2e] Add end-to-end test for controlling RocksDB memory…

abfc351

… usage

rmetzger added the review=description? label Jan 23, 2020

carp84 requested changes Jan 23, 2020

flink-end-to-end-tests/test-scripts/test_rocksdb_state_memory_control.sh

flink-end-to-end-tests/run-nightly-tests.sh

rmetzger added the component=Runtime/StateBackends label Jan 23, 2020

asfgit closed this in 6128028 Jan 23, 2020

StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Jan 23, 2020

[FLINK-15368][e2e] Add end-to-end test for controlling RocksDB memory…

b349c2d

… usage This closes apache#10930

asfgit pushed a commit that referenced this pull request Jan 23, 2020

[FLINK-15368][e2e] Add end-to-end test for controlling RocksDB memory…

ab16448

… usage This closes #10930

JTaky pushed a commit to JTaky/flink that referenced this pull request Feb 20, 2020

[FLINK-15368][e2e] Add end-to-end test for controlling RocksDB memory…

e426152

… usage This closes apache#10930
