[Experimental Feature] MR Supports Remote Spill #55
Because this PR introduces a user-facing change, we should update the doc.
08077bb to 4537e5a
Add RssInMemoryMerger
We need to write in-memory data to HDFS.
Yes
UT
Co-authored-by: roryqi <roryqi@tencent.com>
Please update your description and the documentation. This PR introduces another configuration option.
4537e5a to f4540ad
LGTM except for the PR's description and documentation.
The doc is updated.
LGTM
What changes were proposed in this pull request?
Rewrite MapReduce's MergeManager to spill sorted segments to HDFS.
It returns a merge-sorted iterator that reads these HDFS segments.
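For readers unfamiliar with the idea, here is a minimal sketch of a merge-sorted iterator over already-sorted segments. It is not the code in this PR: the class `MergeSortedSegments`, the `Cursor` helper, and the `merge` method are hypothetical names, and plain in-memory iterators stand in for the readers that would stream each spilled segment from HDFS.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical illustration only; not the PR's RssRemoteMergeManagerImpl. */
final class MergeSortedSegments {

  /** Tracks one sorted segment: its current head element and the remaining elements. */
  private static final class Cursor<T> {
    T head;
    final Iterator<T> rest;

    Cursor(T head, Iterator<T> rest) {
      this.head = head;
      this.rest = rest;
    }
  }

  /**
   * Lazily merges segments that are each already sorted into one sorted iterator.
   * Only one element per segment is held in memory at a time.
   */
  static <T extends Comparable<T>> Iterator<T> merge(List<Iterator<T>> segments) {
    // Min-heap ordered by each segment's current head element.
    PriorityQueue<Cursor<T>> heap =
        new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
    for (Iterator<T> segment : segments) {
      if (segment.hasNext()) {
        heap.add(new Cursor<>(segment.next(), segment));
      }
    }
    return new Iterator<T>() {
      @Override
      public boolean hasNext() {
        return !heap.isEmpty();
      }

      @Override
      public T next() {
        Cursor<T> smallest = heap.poll();
        if (smallest == null) {
          throw new NoSuchElementException();
        }
        T result = smallest.head;
        // Advance the segment we just consumed from and re-insert it if not exhausted.
        if (smallest.rest.hasNext()) {
          smallest.head = smallest.rest.next();
          heap.add(smallest);
        }
        return result;
      }
    };
  }
}
```

For example, merging the segments [1, 4, 9] and [2, 3, 10] yields 1, 2, 3, 4, 9, 10. In a remote-spill setting, each input iterator would presumably wrap a reader over one spilled segment file, so the merge never needs a whole segment on local disk.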
Why are the changes needed?
In the cloud, machines may have very limited disk space and disk performance.
This PR allows spilling data to remote storage (e.g., HDFS).
Does this PR introduce any user-facing change?
Yes.
How was this patch tested?
New UT and IT with remote spill.
Co-authored-by: roryqi <roryqi@tencent.com>