Use multipart uploads for large JARs #17

Closed
charlesconnell wants to merge 2 commits into master from multipart-upload

@charlesconnell

I recently tried to build something that depends on com.amazonaws:aws-java-sdk-bundle and had issues with SlimFast. When uploading that dependency JAR to S3, I got this error:

[ERROR] Failed to execute goal com.hubspot.maven.plugins:slimfast-plugin:0.19:upload (upload-dependency-jars) on project hbase-tasks-jobs: Error uploading file /usr/share/hubspot/build/.m2/repository/com/amazonaws/aws-java-sdk-bundle/1.11.628/aws-java-sdk-bundle-1.11.628.jar: Unable to execute HTTP request: Request did not complete before the request timeout configuration. Connection or outbound has been closed -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.hubspot.maven.plugins:slimfast-plugin:0.19:upload (upload-dependency-jars) on project hbase-tasks-jobs: Error uploading file /usr/share/hubspot/build/.m2/repository/com/amazonaws/aws-java-sdk-bundle/1.11.628/aws-java-sdk-bundle-1.11.628.jar
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
.....
Caused by: org.apache.maven.plugin.MojoFailureException: Error uploading file /usr/share/hubspot/build/.m2/repository/com/amazonaws/aws-java-sdk-bundle/1.11.628/aws-java-sdk-bundle-1.11.628.jar
    at com.hubspot.maven.plugins.slimfast.DefaultFileUploader.doUpload (DefaultFileUploader.java:35)
    at com.hubspot.maven.plugins.slimfast.BaseFileUploader.upload (BaseFileUploader.java:57)
    at com.hubspot.slimfast.LoggingJarUploader.upload (LoggingJarUploader.java:42)
    at com.hubspot.maven.plugins.slimfast.UploadJarsMojo$1.call (UploadJarsMojo.java:95)
....
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Request did not complete before the request timeout configuration.
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException (AmazonHttpClient.java:1163)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper (AmazonHttpClient.java:1109)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute (AmazonHttpClient.java:758)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer (AmazonHttpClient.java:732)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute (AmazonHttpClient.java:714)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500 (AmazonHttpClient.java:674)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute (AmazonHttpClient.java:656)
    at com.amazonaws.http.AmazonHttpClient.execute (AmazonHttpClient.java:520)
    at com.amazonaws.services.s3.AmazonS3Client.invoke (AmazonS3Client.java:4705)
    at com.amazonaws.services.s3.AmazonS3Client.invoke (AmazonS3Client.java:4652)
    at com.amazonaws.services.s3.AmazonS3Client.putObject (AmazonS3Client.java:1807)
    at com.amazonaws.services.s3.AmazonS3Client.putObject (AmazonS3Client.java:1658)
    at com.hubspot.maven.plugins.slimfast.DefaultFileUploader.doUpload (DefaultFileUploader.java:32)
    at com.hubspot.maven.plugins.slimfast.BaseFileUploader.upload (BaseFileUploader.java:57)
    at com.hubspot.slimfast.LoggingJarUploader.upload (LoggingJarUploader.java:42)
    at com.hubspot.maven.plugins.slimfast.UploadJarsMojo$1.call (UploadJarsMojo.java:95)
.....
Caused by: com.amazonaws.http.exception.HttpRequestTimeoutException: Request did not complete before the request timeout configuration.
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest (AmazonHttpClient.java:1299)
.....
    at com.amazonaws.services.s3.AmazonS3Client.putObject (AmazonS3Client.java:1807)
    at com.amazonaws.services.s3.AmazonS3Client.putObject (AmazonS3Client.java:1658)
    at com.hubspot.maven.plugins.slimfast.DefaultFileUploader.doUpload (DefaultFileUploader.java:32)
    at com.hubspot.maven.plugins.slimfast.BaseFileUploader.upload (BaseFileUploader.java:57)
    at com.hubspot.slimfast.LoggingJarUploader.upload (LoggingJarUploader.java:42)
    at com.hubspot.maven.plugins.slimfast.UploadJarsMojo$1.call (UploadJarsMojo.java:95)
.....
Caused by: javax.net.ssl.SSLException: Connection or outbound has been closed
    at sun.security.ssl.Alert.createSSLException (Alert.java:127)
    at sun.security.ssl.TransportContext.fatal (TransportContext.java:320)
    at sun.security.ssl.TransportContext.fatal (TransportContext.java:263)
    at sun.security.ssl.TransportContext.fatal (TransportContext.java:258)
    at sun.security.ssl.SSLSocketImpl$AppOutputStream.write (SSLSocketImpl.java:988)
......   
Caused by: java.net.SocketException: Connection or outbound has been closed
    at sun.security.ssl.SSLSocketOutputRecord.deliver (SSLSocketOutputRecord.java:267)
    at sun.security.ssl.SSLSocketImpl$AppOutputStream.write (SSLSocketImpl.java:983)
    at org.apache.http.impl.io.SessionOutputBufferImpl.streamWrite (SessionOutputBufferImpl.java:124)
    at org.apache.http.impl.io.SessionOutputBufferImpl.flushBuffer (SessionOutputBufferImpl.java:136)
    at org.apache.http.impl.io.SessionOutputBufferImpl.write (SessionOutputBufferImpl.java:167)
    at org.apache.http.impl.io.ContentLengthOutputStream.write (ContentLengthOutputStream.java:113)
.......

The aws-java-sdk-bundle-1.11.628.jar file is particularly large (134 MiB), and I think that's the problem. SlimFast sets a 5-second request timeout, and this big PUT request is probably exceeding it. Amazon supports multipart uploads for just this reason. This change uses Amazon's TransferManager to automatically split the upload into multiple HTTP requests. The TransferManager will automatically shut down its thread pool when it is garbage collected.
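
To make the splitting idea concrete, here is a minimal sketch of how a large file could be divided into parts that are each sent as their own HTTP request. It does not call the AWS SDK and is not the code in this PR; `PartPlanner`, `Part`, and `plan` are hypothetical names, and the 5 MiB part size is an assumption chosen for illustration (S3 requires at least 5 MiB for every part except the last).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: plans the byte ranges a multipart upload could send as separate requests. */
final class PartPlanner {
    static final long MIB = 1024L * 1024L;

    /** One planned part: 1-based part number, starting byte offset, and length in bytes. */
    static final class Part {
        final int number;
        final long offset;
        final long length;

        Part(int number, long offset, long length) {
            this.number = number;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Splits fileSize bytes into consecutive parts of at most partSize bytes each. */
    static List<Part> plan(long fileSize, long partSize) {
        if (fileSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and partSize must be > 0");
        }
        List<Part> parts = new ArrayList<>();
        int number = 1;
        for (long offset = 0; offset < fileSize; offset += partSize) {
            // The final part carries whatever is left and may be smaller than partSize.
            long length = Math.min(partSize, fileSize - offset);
            parts.add(new Part(number++, offset, length));
        }
        return parts;
    }

    public static void main(String[] args) {
        // 134 MiB is the JAR size quoted above; 5 MiB is an assumed part size.
        List<Part> parts = plan(134 * MIB, 5 * MIB);
        Part last = parts.get(parts.size() - 1);
        System.out.println(parts.size() + " parts, last part is " + (last.length / MIB) + " MiB");
    }
}
```

Under these assumptions the 134 MiB JAR becomes 27 requests, each far smaller than the single PUT that timed out, so each one has a much better chance of finishing within a short per-request limit.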

I've tested this on an internal HubSpot build by setting dep.plugin.slimfast-plugin.version and it did successfully upload aws-java-sdk-bundle-1.11.628.jar.

@charlesconnell
Author

I'm not waiting on this because it looks like I'm moving away from using aws-java-sdk-bundle anyway, but it may be a nice change for everyone to have going forward.

@jhaber
Member

jhaber commented Apr 28, 2020

Thanks for the PR, using multipart uploads definitely makes sense for large JARs.

I wonder if we should also update S3Factory to replace the request timeout with a socket timeout. That way as long as you're making progress the request won't time out. But if you don't make any progress for 5 seconds then it fails.
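
The difference between the two timeouts can be sketched with a small model. This is a simplification under stated assumptions and not SlimFast or AWS SDK code: `TimeoutModel` and its methods are hypothetical, a "request timeout" is modeled as a deadline on the whole transfer, and a "socket timeout" is modeled as a limit on the gap between two consecutive moments of progress. How the real SDK settings behave during an upload may differ from this model.

```java
/** Illustrative model of an overall request deadline versus an idle (no-progress) timeout. */
final class TimeoutModel {

    /**
     * Whole-request deadline: fires if the last progress event (taken here as the
     * completion time) happens later than limitMs after the start.
     */
    static boolean requestTimeoutFires(long[] progressAtMs, long limitMs) {
        return progressAtMs.length > 0 && progressAtMs[progressAtMs.length - 1] > limitMs;
    }

    /**
     * Idle timeout: fires only if more than limitMs passes between two consecutive
     * progress events (the start of the transfer counts as time 0).
     */
    static boolean idleTimeoutFires(long[] progressAtMs, long limitMs) {
        long previous = 0;
        for (long t : progressAtMs) {
            if (t - previous > limitMs) {
                return true;
            }
            previous = t;
        }
        return false;
    }

    public static void main(String[] args) {
        // A slow but steady upload: some progress every second for 30 seconds.
        long[] steady = new long[30];
        for (int i = 0; i < steady.length; i++) {
            steady[i] = (i + 1) * 1000L;
        }
        // An upload that stalls for 7 seconds in the middle.
        long[] stalled = {1000L, 2000L, 9000L};

        System.out.println("steady, request timeout fires: " + requestTimeoutFires(steady, 5000L));
        System.out.println("steady, idle timeout fires:    " + idleTimeoutFires(steady, 5000L));
        System.out.println("stalled, idle timeout fires:   " + idleTimeoutFires(stalled, 5000L));
    }
}
```

In this model the steady 30-second upload fails under a 5-second whole-request deadline but passes under a 5-second idle limit, while the stalled upload fails under both, which is the behavior the suggestion is after.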

My only concern with this change is how it affects performance for small file uploads. I'm not sure if TransferManager performs poorly for that use case, or if it internally has heuristics based on file size.

@charlesconnell
Author

charlesconnell commented May 14, 2020

Yeah I'm happy to make that change in S3Factory.

I do know that TransferManager only uses multipart uploads for files greater than 16 MiB. However, even for files under that limit, it still uses the thread pool for those simple upload requests. That does theoretically add some overhead.
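
As a rough illustration of that behavior, the sketch below picks an upload mode from the file size and still hands every upload to a thread pool, including the small ones. It is not TransferManager's implementation: `UploadStrategy`, `Mode`, `choose`, and `submit` are hypothetical names, the 16 MiB threshold is taken from the comment above, and the pool size of 2 is arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: size-based choice between a single request and a multipart upload. */
final class UploadStrategy {
    static final long MIB = 1024L * 1024L;
    // Threshold quoted in the comment above; treated here as a fixed assumption.
    static final long THRESHOLD = 16 * MIB;

    enum Mode { SINGLE_REQUEST, MULTIPART }

    /** Multipart is used only for files strictly larger than the threshold. */
    static Mode choose(long fileSize, long threshold) {
        return fileSize > threshold ? Mode.MULTIPART : Mode.SINGLE_REQUEST;
    }

    /**
     * Every upload goes through the pool, even a small single-request one.
     * The task hand-off is the overhead mentioned above.
     */
    static Mode submit(ExecutorService pool, long fileSize) throws Exception {
        Future<Mode> result = pool.submit(() -> choose(fileSize, THRESHOLD));
        return result.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("1 MiB file:   " + submit(pool, 1 * MIB));
            System.out.println("134 MiB file: " + submit(pool, 134 * MIB));
        } finally {
            pool.shutdown();
        }
    }
}
```

For a small JAR the extra cost in this sketch is one task submission and one wait on a `Future`, which is likely small next to the network round trip, though that would need measuring against a real bucket to confirm.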

@jaredstehler
Contributor

This is superseded by #37.

@jaredstehler jaredstehler deleted the multipart-upload branch November 5, 2024 00:18