-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-17511][tests] Rerun RocksDB memory control tests if necessary #14744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit 0a3a317 (Thu Sep 23 18:00:38 UTC 2021) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. DetailsThe Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
|
@flinkbot run azure |
1 similar comment
|
@flinkbot run azure |
|
@StephanEwen could you please take a look at this PR? |
|
@flinkbot run azure |
|
This looks like a fair way to stabilize the tests, but it ultimately hides a real problem: The fact that the RocksDB memory footprint is not controllable. I am fine with merging this, but we need to keep working on the other issue, to make the experience in containerized environments more smooth. |
|
@flinkbot run azure |
|
@StephanEwen , yes I agree with your concern. Current memory control mechanism of RocksDB cannot handle any malloced memory not inserted into block cache or write buffer, those just uncompressed data block could have OOM risk.
|
|
Do you think this PR is still relevant after the RocksDB version upgrade and using strict mode? If the answer to the above question is "yes", then I would suggest to not merge this PR, because it may hide problems if there is a regression in the memory management code. |
|
@StephanEwen , sorry for missing this reply during my spring festival holidays. Flink-1.13 did not bump the RocksDB version finally and it seems there is no test failure reported during this time. |
|
@StephanEwen @Myasuka Could we close the PR? I closed the JIRA ticket as it did not appear in a while. |
|
This PR is being marked as stale since it has not had any activity in the last 180 days. If you are having difficulty finding a reviewer, please reach out to the [community](https://flink.apache.org/what-is-flink/community/). If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 90 days, it will be automatically closed. |
|
This PR has been closed since it has not had any activity in 120 days. |
What is the purpose of the change
Currently, RocksDB memory control end-to-end tests failed with very small probability due to the limitation of RocksDB itself. In general, we think RocksDB memory control should take effect in most cases and cannot ensure it behaves perfectly. Thus, if the chance of exceeding the limit is really low, we can retry up to 3 times.
Brief change log
Run the tests with at most three times and failed the end-to-end test case if all failed.
Verifying this change
This change added tests and can be verified as follows:
Run the tests with at most three times and failed the end-to-end test case if all failed.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): noDocumentation