[CELEBORN-1530] support MPU for S3 #2830
Thanks for this PR. Are there any test results?
FMX left a comment
Thanks for this PR, but there are some points to polish.
</dependency>
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-1.2-api</artifactId>
This dependency is duplicated.
    <name>aws-mpu-deps</name>
  </property>
</activation>
<dependencies>
I think these dependencies can be moved to the main dependencies section, because this module is only loaded when the aws-mpu profile is activated.
<profiles>
  <profile>
    <id>aws-mpu</id>
The profile name can be changed to aws.
package org.apache.celeborn.server.common.service.mpu.bean;

public class AWSCredentials {
This class should not be in the common module.
  <property>
    <name>aws-mpu-deps</name>
  </property>
</activation>
This segment is not needed.
<activation>
<property>
<name>aws-mpu-deps</name>
</property>
</activation>
DynConstructors.builder()
    .impl(
        "org.apache.celeborn.S3MultipartUploadHandler",
        awsCredentials.getClass(),
Passing the arguments to S3MultipartUploadHandler should be enough for this scenario.
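One possible reading of this comment is that the handler's constructor could take plain values directly instead of a wrapper bean. The sketch below illustrates that idea with `java.lang.reflect` only; it does not use Celeborn's `DynConstructors`, and `ReflectiveHandlerSketch`, `Handler`, `bucket`, and `key` are hypothetical names chosen for illustration.

```java
import java.lang.reflect.Constructor;

/** Illustration only: none of these names come from the PR. */
final class ReflectiveHandlerSketch {

  /** Hypothetical stand-in for an upload handler that takes plain arguments. */
  static final class Handler {
    final String bucket;
    final String key;

    Handler(String bucket, String key) {
      this.bucket = bucket;
      this.key = key;
    }
  }

  /** Loads a class by name and calls its (String, String) constructor. */
  static Object newHandler(String className, String bucket, String key) {
    try {
      Class<?> clazz = Class.forName(className);
      Constructor<?> ctor = clazz.getDeclaredConstructor(String.class, String.class);
      ctor.setAccessible(true); // the sketch's constructor is not public
      return ctor.newInstance(bucket, key);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not create " + className, e);
    }
  }

  public static void main(String[] args) {
    Handler h = (Handler) newHandler(Handler.class.getName(), "my-bucket", "shuffle/part-0");
    System.out.println(h.bucket + "/" + h.key);
  }
}
```

With this shape the caller needs no shared bean type on its classpath, only the handler's class name and the argument values.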
task = new S3FlushTask(flushBuffer, diskFileInfo.getDfsPath(), notifier, true);
task =
    new S3FlushTask(
        flushBuffer, notifier, true, s3MultipartUploadHandler, partNumber);
Suggested change:
- flushBuffer, notifier, true, s3MultipartUploadHandler, partNumber);
+ flushBuffer, notifier, true, s3MultipartUploadHandler, partNumber++);
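The suggestion relies on Java's post-increment semantics: the current value is passed as the argument, and the counter advances afterwards. A minimal sketch of that behavior follows; `PartNumberSketch`, `submit`, and `flushOnce` are hypothetical names, and starting the counter at 1 is an assumption (S3 part numbers start at 1, but the PR's initial value is not shown here).

```java
/** Illustration only: a counter that hands out part numbers with post-increment. */
final class PartNumberSketch {
  private int partNumber = 1; // assumption: numbering starts at 1, as S3 part numbers do

  /** Stand-in for constructing a flush task; it just returns the number it was given. */
  static int submit(int assignedPartNumber) {
    return assignedPartNumber;
  }

  /** Passes the current value to submit(), then advances the counter. */
  int flushOnce() {
    return submit(partNumber++);
  }

  public static void main(String[] args) {
    PartNumberSketch writer = new PartNumberSketch();
    System.out.println(writer.flushOnce());
    System.out.println(writer.flushOnce());
  }
}
```

If the increment is folded into the call like this, a separate `partNumber++;` statement elsewhere would advance the counter a second time.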
if (task != null) {
  addTask(task);
  flushBuffer = null;
  partNumber++;

s3MultipartUploadHandler.complete();
}

if (notifier.hasException()) {
These two if blocks can be merged.
import java.lang.{Long => JLong}
import java.util.{List => JList}

case class MultipartUploadRequestParam(
@zhaohehuhu @FMX @WillemJiang
I still need more time to fully test it, as S3 has some limitations related to MPU.
Every PR should be production-ready before it is merged.
FMX left a comment
LGTM. Thanks. Merged into main (v0.6.0).
What changes were proposed in this pull request?
As title.
Why are the changes needed?
AWS S3 doesn't support append, so Celeborn had to copy the historical data from S3 to the worker and write it back to S3 on every flush, which heavily amplifies the write volume. This PR implements a better solution using multipart upload (MPU) to avoid the copy-and-write.
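To make the difference concrete, here is a rough in-memory model that counts the bytes sent to the object store under each approach. It makes no AWS SDK calls, it ignores the download half of the copy, and `MpuWriteCostSketch` and its method names are hypothetical; treat it as a sketch of the reasoning, not of the PR's implementation.

```java
/** In-memory model only; no AWS SDK calls are made. */
final class MpuWriteCostSketch {

  /** Bytes uploaded when every flush re-uploads all earlier data plus the new buffer. */
  static long copyAndRewriteBytes(int[] flushSizes) {
    long objectSize = 0;
    long bytesSent = 0;
    for (int size : flushSizes) {
      objectSize += size;      // the object grows by the new buffer
      bytesSent += objectSize; // but the whole object is uploaded again
    }
    return bytesSent;
  }

  /** Bytes uploaded when every flush is sent once as its own numbered part. */
  static long multipartBytes(int[] flushSizes) {
    long bytesSent = 0;
    for (int size : flushSizes) {
      bytesSent += size; // each buffer is uploaded exactly once
    }
    return bytesSent;
  }

  public static void main(String[] args) {
    int[] flushes = {4, 4, 4, 4}; // four flushes of 4 bytes each
    System.out.println(copyAndRewriteBytes(flushes)); // 4 + 8 + 12 + 16 = 40
    System.out.println(multipartBytes(flushes));      // 4 + 4 + 4 + 4 = 16
  }
}
```

In this model the copy-and-rewrite cost grows quadratically with the number of flushes, while the multipart cost stays equal to the data size.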
Does this PR introduce any user-facing change?
How was this patch tested?
I conducted an experiment with a 1GB input dataset to compare the performance of Celeborn using only S3 storage versus using SSD storage. The results showed that Celeborn with SSD storage was approximately three times faster than with only S3 storage.
The screenshot above shows the second test I ran, with 5000 mappers and reducers.