[Bug] MR Client may lose data or throw an exception when rss.storage.type does not include MEMORY. #886
Comments
@jerqi I will use MEMORY_LOCALFILE in the production environment. LOCALFILE_HDFS is only used in my development environment.
For Hadoop version 3.1, you can add an extra profile for it.
…torage.type without MEMORY. (#887)
### What changes were proposed in this pull request?
Make sure finishShuffle is called only after all shuffle data has been sent.
### Why are the changes needed?
If the storage type does not include MEMORY, some data will never be flushed.
### How was this patch tested?
Tested in two modes:
* Tez local debug mode
* MR on YARN mode
Added a new UT.
Co-authored-by: zhengchenyu001 <zhengchenyu001@ke.com>
…ion when rss.storage.type without MEMORY. (apache#887)" This reverts commit 4423b43.
Describe the bug
1 Bug description
When rss.storage.type does not include MEMORY, client-mr may raise an exception like the one below:
In fact, the problem first appeared in the client-tez module of our internal version. Below is the Tez error stack:
2 Reason
When the bug occurs, the value of "expect committed" in the log below is a random value. We know that shuffleWriteClient.sendShuffleData runs in an asynchronous thread. When finishShuffle is called, sendShuffleData may not have run yet, so some data will never be flushed on the shuffle server.
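The ordering problem can be illustrated with a minimal, self-contained Java sketch. It is not Uniffle code: `FakeShuffleServer`, `receiveBlock`, and `sendAllThenFinish` are hypothetical names invented for illustration, and the server is simulated with a counter. It only shows the general idea of waiting for every asynchronous send to complete before the finish call is issued, so that the committed count matches the expected count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class FinishAfterSendSketch {

    /** Hypothetical stand-in for a shuffle server; not Uniffle's real API. */
    static class FakeShuffleServer {
        private final AtomicInteger received = new AtomicInteger();

        /** Simulates the server accepting one block of shuffle data. */
        void receiveBlock() {
            received.incrementAndGet();
        }

        /** Returns how many blocks had arrived when finish was requested. */
        int finishShuffle() {
            return received.get();
        }
    }

    /** Sends blockCount blocks asynchronously, waits for all of them, then finishes. */
    public static int sendAllThenFinish(int blockCount) {
        FakeShuffleServer server = new FakeShuffleServer();
        ExecutorService sender = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < blockCount; i++) {
                // Each send runs on a background thread, like an async send call.
                pending.add(CompletableFuture.runAsync(server::receiveBlock, sender));
            }
            // Block until every asynchronous send has completed.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            // Only now is it safe to ask the server to finish the shuffle.
            return server.finishShuffle();
        } finally {
            sender.shutdown();
        }
    }

    public static void main(String[] args) {
        int committed = sendAllThenFinish(100);
        System.out.println("expected 100, committed " + committed);
    }
}
```

If the `join()` call were removed, `finishShuffle` could observe any count between 0 and `blockCount`, which mirrors the random "expect committed" value described above.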
Affects Version(s)
master
Uniffle Server Log Output
No response
Uniffle Engine Log Output
No response
Uniffle Server Configurations
No response
Uniffle Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?