
No retries on network timeouts S3 InputStream #856

Closed
phraktle opened this issue Sep 26, 2016 · 12 comments

Labels: feature-request A feature should be added or improved.
@phraktle

It appears that no retries are attempted when a network timeout occurs on the underlying HTTP connection while reading the InputStream from S3Object#getObjectContent. It should instead transparently reconnect (per the retry policy) and resume from the last successfully read byte.

Stack trace

Caused by: java.net.SocketTimeoutException: Read timed out 
at java.net.SocketInputStream.socketRead0(Native Method) 
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) 
at java.net.SocketInputStream.read(SocketInputStream.java:170) 
at java.net.SocketInputStream.read(SocketInputStream.java:141) 
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) 
at sun.security.ssl.InputRecord.read(InputRecord.java:503) 
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973) 
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930) 
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) 
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139) 
at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:200) 
at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:178) 
at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137) 
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72) 
at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:151) 
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72) 
at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:155) 
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72) 
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72) 
at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:151) 
at java.security.DigestInputStream.read(DigestInputStream.java:161) 
at com.amazonaws.services.s3.internal.DigestValidationInputStream.read(DigestValidationInputStream.java:59) 
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72) 
at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:155) 
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238) 
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158) 
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117) 
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) 
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) 
at java.io.BufferedInputStream.read(BufferedInputStream.java:345) 
...

shorea commented Sep 26, 2016

This is probably not something we'd consider taking on until the next major version bump, as it's a big departure from what we do today. The retry policy does not apply while reading the content of a streaming operation, because by that point control has already been passed back to the caller. Presumably we could retry transparently by capturing a reference to the client in a special input stream and, on calls to read, catching the IO exception and making another ranged GET starting from the last successful byte.

The TransferManager utility has more robust retry and resume behavior; would that meet your needs for now?
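For illustration, here is a minimal sketch of the ranged-GET idea described above, written against the v1 client. The ResumingS3InputStream class is hypothetical (not SDK API), it assumes GetObjectRequest#withRange(long start), and a real version would honor the retry policy's attempt limits and backoff rather than reconnecting unconditionally:

import java.io.IOException;
import java.io.InputStream;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3ObjectInputStream;

// Hypothetical wrapper: tracks the read position and resumes with a
// ranged GET when a read fails mid-stream.
class ResumingS3InputStream extends InputStream {
    private final AmazonS3 s3;
    private final String bucket;
    private final String key;
    private long position;
    private S3ObjectInputStream in;

    ResumingS3InputStream(AmazonS3 s3, String bucket, String key) {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
        reopen();
    }

    private void reopen() {
        // Open-ended range: from the last successfully read byte to EOF.
        in = s3.getObject(new GetObjectRequest(bucket, key).withRange(position))
               .getObjectContent();
    }

    @Override
    public int read() throws IOException {
        try {
            int b = in.read();
            if (b != -1) {
                position++;
            }
            return b;
        } catch (IOException e) {
            in.abort();   // drop the broken connection without draining it
            reopen();     // naive: reconnects on every failure, no backoff
            return read();
        }
    }
}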

@phraktle

Hi @shorea,

TransferManager does not allow streaming and requires a temporary file, which is not desirable in our use case. Since there's already an S3ObjectInputStream wrapping the stream, it doesn't seem like a stretch for it to reconnect internally.

Regards,
Viktor


shorea commented Sep 27, 2016

Yeah, I think it's definitely possible and makes a lot of sense to honor the retry policy even for streaming operations, but I don't think we can add it to the SDK without a major version bump due to the performance implications.

@dagnir added the feature-request label and removed the waiting-reply label Sep 28, 2016

stevematyas commented Nov 4, 2016

Using final S3Object s3Object = s3Client.getObject(bucketName, keyName); I believe I hit the same error.

Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
    at sun.security.ssl.InputRecord.read(InputRecord.java:532)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:198)
    at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176)
    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
    at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:151)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
    at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:155)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
    at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:151)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
    at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:108)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
    at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:155)
    at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:147)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)

Using aws-java-sdk-s3:1.11.18 here.

Introducing a new method that accepts an OutputStream instead of a File would be great, since streaming is a much-desired use case -- e.g. TransferManager::download(GetObjectRequest, OutputStream): Download or TransferManager::download(GetObjectRequest, InputStream): Download.

Also, @shorea, it'd be great if some retry examples or a PR existed before the official rollout within the SDK -- #893! All of my objects are stored using multipart upload (5 MB or greater part size).
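In the meantime, a minimal sketch of the no-temp-file part of this, using the plain getObject call plus commons-io (which the stack trace above already shows in use). This streams the body directly to a caller-supplied outputStream, but does nothing about mid-stream retries:

try (S3Object s3Object = s3Client.getObject(bucketName, keyName);
     InputStream in = s3Object.getObjectContent()) {
    // Copy the object body straight to the destination stream;
    // no temporary file is involved.
    IOUtils.copyLarge(in, outputStream);
}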

@stevematyas

@phraktle : Did you come up with a work-around?

@OrigamiMarie

Is this lack of retries the cause of the error I have been getting very frequently while streaming data from S3 to an EC2 instance in a VPC? I really don't want to download these files (I don't want to deal with the disk at all -- and streaming seems like it ought to work). But the error rate when downloading files is increasing dramatically, and it's a big operational pain. The failure happens at random places in the files (when I retry, the same file will often fail again, but at a different place).

Stack trace:
com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=122569353; expectedLength=664918217; includeSkipped=true; in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0
at com.amazonaws.util.LengthCheckInputStream.checkLength(LengthCheckInputStream.java:152) ~[file-dapi-importer.jar!/:0.0.1]
at com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:110) ~[file-dapi-importer.jar!/:0.0.1]
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72) ~[file-dapi-importer.jar!/:0.0.1]
at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:155) ~[file-dapi-importer.jar!/:0.0.1]
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238) ~[?:1.8.0_66]
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158) ~[?:1.8.0_66]
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117) ~[?:1.8.0_66]
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) ~[?:1.8.0_66]
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) ~[?:1.8.0_66]
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) ~[?:1.8.0_66]
at java.io.InputStreamReader.read(InputStreamReader.java:184) ~[?:1.8.0_66]
at java.io.BufferedReader.fill(BufferedReader.java:161) ~[?:1.8.0_66]
at java.io.BufferedReader.readLine(BufferedReader.java:324) ~[?:1.8.0_66]
at java.io.BufferedReader.readLine(BufferedReader.java:389) ~[?:1.8.0_66]


dagnir commented Jan 25, 2017

Hi @OrigamiMarie, sorry to hear you're having issues. We do have #893 in our backlog, which is to allow downloading to an InputStream using TransferManager. We are actively looking at ways to support it and hope to deliver it soon!

@steveloughran

If anyone ever does add transparent retries to failures in input stream reads, can I, as a representative of the Hadoop team who maintain the S3A connector, have a way to turn this off? Because we do our own reconnect logic and think we've got it under control (now), and having something underneath trying to be helpful might be a regression. Happy to discuss what could be done here, including what exceptions should be treated as recoverable...

@coolpistachio

There's a similar problem when the underlying S3 client fails the getObject call in retryableS3DownloadTask.getS3ObjectStream(). The failed call is not retried, and the whole parallel download fails.

https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/internal/ServiceUtils.java#L397

@dagnir Is there any workaround other than catching the exceptions from the TransferManager and retrying the whole download?


millems commented Mar 31, 2020

V2 supports retrying streaming operations using the retry policy, if the proper API is used. We do not intend to make this change in 1.11.x.

@millems closed this as completed Mar 31, 2020
@electrum

@millems Can you give an example or link to some documentation on how to do this properly with V2? The InputStream returned from S3Client.getObject() doesn't seem to handle retry for socket exceptions during read operations.


millems commented Mar 22, 2024

@electrum In 2.x, you can use the response transformer abstraction to allow retrying failures that occur while reading the response:

import java.io.IOException;
import software.amazon.awssdk.core.exception.RetryableException;
import software.amazon.awssdk.utils.IoUtils;

// Throwing a RetryableException from the transformer tells the SDK to
// retry the whole GET under the client's retry policy.
s3.getObject(r -> r.bucket("bucket").key("key"), (response, inputStream) -> {
    try {
        // Do something with the stream.
        IoUtils.copy(inputStream, System.out);
        return null;
    } catch (IOException e) {
        throw RetryableException.create("Failed to read from input stream.", e);
    }
});

Note that the response transformer can be called multiple times, once for each retry. It's a new input stream each time, so it will start back at the beginning of the object.
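If buffering the whole object is acceptable, the built-in transformers should get the same retry behavior with less code; a sketch, assuming the object fits in memory:

import software.amazon.awssdk.core.ResponseBytes;
import software.amazon.awssdk.core.sync.ResponseTransformer;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

// The SDK controls the stream for the entire read, so failures while
// downloading the body can be retried under the client's retry policy.
ResponseBytes<GetObjectResponse> objectBytes =
        s3.getObject(r -> r.bucket("bucket").key("key"), ResponseTransformer.toBytes());
byte[] content = objectBytes.asByteArray();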
